User:Joannac/Spam

From Geohashing

The problem

As frequent users have noticed, we are getting a non-trivial amount of spam (typical of anything on the internet, really). It falls into three categories:

  1. Crappy username spam like these
  2. Spam edits like this
  3. Odd spam like this, which I'm pretty sure is spam but could just be anonymous users showing their appreciation...

--> This is indeed not spam. It's just a friend of mine, who's not a geohasher and can't speak English. She's doing her best! ;-) -- Pierre 17:43, 7 July 2009 (GMT+1)

Suggestions

  1. We stay as we are. I don't actually mind protecting pages, deleting spam pages, and blocking spam IPs/users. Maybe I'll change my mind later, but for the moment, I don't care. But several different people have asked about more spam fighting measures, so I'm opening it up for wider discussion.
  2. We ban all anonymous editing. This will fix problems 2 and 3, but not so much 1. It will also annoy people logging in from public computers (e.g. an internet cafe on holiday), and people who want to make a contribution but don't want the hassle of making an account.
  3. We install CAPTCHAs like this extension. This still allows anyone to edit: unregistered users must solve a CAPTCHA to edit, everyone must solve one to register, and anyone signed in gets NO CAPTCHA. This is a minor annoyance to people who regularly edit without logging in.
  4. Some kind of script like Wenslayer suggested. I think the biggest problem with this is getting it past the wiki server admin.
  5. We delete the whole wiki. (Just kidding).

Comments

Comment please! Add opinions, agreement/disagreement, other suggestions, random comments. Please also add a ranking for the options (if you have an opinion)

What helped extremely well against any kind of wiki spam in one MediaWiki I administer is a silent word blacklist. I.e. you put common spam words (which can be brand names, certain otherwise uncommon HTML tag parameters, or certain link parts) on a blacklist. Whenever one of these words is used, the page is not saved when the save button is pressed, but the wiki behaves as if it had been. I run my blacklist on fewer than ten keywords and haven't seen either spam or user complaints in years. That requires tinkering with the wiki PHP code though. It wouldn't work if it were a common extension.
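The silent blacklist described above can be sketched roughly as follows. This is only an illustration in Python, not the PHP change relet actually made: the blacklist entries, the `save_page` function, and the dict standing in for the page store are all hypothetical.

```python
import re

# Hypothetical blacklist entries, for illustration only.
BLACKLIST = ["viagra", "cheap-pills.example", "display:none"]

# One case-insensitive pattern that matches any blacklisted substring.
_PATTERN = re.compile("|".join(re.escape(word) for word in BLACKLIST), re.IGNORECASE)


def is_spam(text):
    """Return True if the text contains any blacklisted substring."""
    return _PATTERN.search(text) is not None


def save_page(store, title, text):
    """Save the page unless it matches the blacklist.

    The return value is the same either way, so the submitter
    cannot tell that a blacklisted edit was silently dropped.
    """
    if not is_spam(text):
        store[title] = text
    return "saved"


pages = {}
save_page(pages, "2009-07-07 52 13", "Nice hash today, got there by bike.")
save_page(pages, "Spam page", "Buy VIAGRA now!")
print(sorted(pages))  # ['2009-07-07 52 13']
```

A plain substring match like this would also swallow a legitimate report that happens to mention a listed word, which is why the list has to stay short and specific.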

I'm fine with any text-based (i.e. accessible) captchas otherwise. I don't like banning anonymous. -- relet 09:05, 7 July 2009 (UTC)

Will it be smart enough to recognize an expedition report where the hashpoint is at Pfizer headquarters? ;) --ilpadre 09:24, 7 July 2009 (UTC)

The ReCAPTCHA seems like a pretty useful project even beyond serving as spam protection - I'd favor that solution. I wouldn't mind disabling anonymous editing, but since we have registered spammers too, that's probably unnecessary. --dawidi 09:49, 7 July 2009 (UTC)

1. I know you say you don't care, but that may just be because you have gotten used to it. My opinion is that you would really appreciate not having to do it anymore. Though you would have to find a new outlet for all your singing.
2. I haven't ever logged onto the wiki from a cyber cafe, and doubt I ever will, so disabling anonymous editing wouldn't be any skin off of my teeth. However, I do know that other people do so from time to time, so this isn't my favorite option.
3. I'm in favor of the ReCAPTCHA setup. Captchas in general tend to do a good job at discouraging spam, and the one used by that extension seems pretty good.
4. The problem with using a script to prevent certain types of user names is that it must necessarily be strict and focused, otherwise user names will always have to be confirmed by someone with authority. The problem with a strict and focused approach is that spammers, though annoying, are clever. When they see that their current approach is no longer working, they will just tweak it a bit, and we will have to bother the wiki server admin again to update our script.
5. I know you said you were joking, but this is the only surefire way to get rid of spam on the wiki. =P
So, given my opinions, my ranking is this: 3>2>1>4>5
And I suggest we use the Beatpath Winner (can be calculated here) based on people's rankings. Relet suggested it for a vote I proposed earlier, and it is quite appropriate for this vote as well, since we have multiple options to choose from. --aperfectring 12:27, 7 July 2009 (UTC)
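For anyone curious how a Beatpath (Schulze) winner is worked out from rankings like these, here is a minimal sketch in Python. It is not the linked calculator; the `parse_ballot` and `beatpath_winners` names are made up for this example, and it assumes every ballot ranks all five options (ties written with `=`). The ballots are rankings posted on this page.

```python
from itertools import permutations


def parse_ballot(text):
    """Turn '3>1>2=4=5' into {3: 0, 1: 1, 2: 2, 4: 2, 5: 2} (lower is better)."""
    ranks = {}
    for level, group in enumerate(text.split(">")):
        for option in group.split("="):
            ranks[int(option)] = level
    return ranks


def beatpath_winners(ballots, options):
    """Return the options that no other option beats via a stronger path."""
    # d[a][b]: number of voters who rank a strictly above b.
    d = {a: {b: 0 for b in options} for a in options}
    for ballot in ballots:
        for a, b in permutations(options, 2):
            if ballot[a] < ballot[b]:
                d[a][b] += 1

    # p[a][b]: strength of the strongest path from a to b.
    # A direct link only counts if a beats b head-to-head.
    p = {a: {b: d[a][b] if d[a][b] > d[b][a] else 0 for b in options}
         for a in options}
    for k in options:  # k is the intermediate option on the path
        for i in options:
            if i == k:
                continue
            for j in options:
                if j in (i, k):
                    continue
                p[i][j] = max(p[i][j], min(p[i][k], p[k][j]))

    return [a for a in options
            if all(p[a][b] >= p[b][a] for b in options if b != a)]


ballots = [parse_ballot(b) for b in (
    "3>2>1>4>5",    # aperfectring
    "3>1>2=4=5",    # relet
    "3>2>4>1>5",    # Danatar
    "3>2>1>4>5",    # Wenslayer
    "3>1>2>4>5",    # Jiml
    "3>4>1>2>5",    # HiroProtagonist
)]
print(beatpath_winners(ballots, [1, 2, 3, 4, 5]))  # [3]
```

Every one of these ballots puts option 3 first, so it wins every head-to-head comparison and the path computation has nothing left to decide. The method only earns its keep when the pairwise results form a cycle.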

I vote 3>1>2=4=5 -- relet 13:25, 7 July 2009 (UTC)

CAPTCHAs FTW! Banning anonymous editing really has the internet cafe problem and spammers would only create accounts. Staying as we are and manually deleting spam might not annoy you (but I'm sure it does or will do, whatever you say) but it annoys ME (and that's what counts most IMNSHO). relet's blacklist approach sounds good, but might be difficult to implement. 3>2>4>1>5 - Danatar 13:33, 7 July 2009 (UTC)

I agree with CAPTCHAs as well, for account creation and anonymous users only. For those worried about accessibility, ReCAPTCHA also offers an audio test for people using screenreaders. It also helps digitize books. Otherwise, we might use the ConfirmEdit extension, whose challenge is given in plain text. Even though bots should be able to circumvent this, most spambots don't seem to do so. At least all the wikis I know that use ConfirmEdit are spam-free. I would oppose disabling anonymous editing, as this would kind of break the original idea of a wiki, and it would also hurt people on public computers and people having problems logging in (e.g. from a cellphone). I also oppose leaving everything as it is. Spam sucks, and even though spammers don't even profit from doing this (the wiki software renders external links as e.g. <a href="http://freeviagra.lol" rel="nofollow">Free Viagra</a>, which tells search engine bots not to follow these links), it still wastes valuable server resources and I hate it. Koepfel talk 21:01, 7 July 2009 (UTC)

I'd like to see the CAPTCHAs too; my vote: 3>2>1>4>5 --Wenslayer 01:14, 8 July 2009 (UTC)

I'd say banning anonymous edits would hurt more of us than the spammers at this point. CAPTCHAs seem like a reasonable approach. I'd stick with 1 until JoannaC is fed up with deleting stuff or the rest of us are fed up with flagging stuff. 1>3>2>4>5 Jiml 06:45, 8 July 2009 (UTC) Now they spammed one of my expeditions, so 3>1>2>4>5 :-) Jiml 23:01, 10 July 2009 (UTC)

I really don't like banning anonymous, so I guess for me it's 3>4>1>2>5 -HiroProtagonist 10:59, 25 August 2009 (UTC)

myka comments

captchas could work. They'd be occasionally annoying, but generally not a nuisance (for all their evilness). note that 1 and 2 in my list are somewhat interchangeable. I really haven't decided as yet. Then again, reducing joanna's workload can only be a good thing.

  1. captchas. (ReCaptcha is a decent thing in any case)
  2. leave as is (as long as joanna remains happy with the situation)
  3. ban anonymous editing.
  4. cleaning script.

NWoodruff comments

I've been asked to weigh in on this topic because I do a lot of anonymous editing. I only do anonymous editing because I am too lazy to log in every time.

My thought is that if anonymous editing is causing too much of a spam problem, then we should ban anonymous editing. It isn't that much more effort to force me to log in every time.

But, you say, it will cause the spam bots to create phony user IDs. That will make it much easier to run a script to rid the wiki of spam comments, since each comment is tagged to a user. Just my thought.

If that is not the route to go, then I say whitelist IP addresses. I only post from two IP addresses: the 65.12 one is my home IP address, which is static and doesn't change, and the 72.243 one is my work IP address, which doesn't change either.

I wouldn't mind either one to get a handle on spam.
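The IP whitelist idea could look something like the sketch below, using Python's standard `ipaddress` module. The addresses are documentation-reserved placeholders rather than anyone's real IPs, and the `may_edit` function is hypothetical. It assumes the whitelist is only consulted for users who are not logged in.

```python
import ipaddress

# Hypothetical whitelist: documentation-reserved ranges, not real user addresses.
WHITELIST = [
    ipaddress.ip_network("192.0.2.10/32"),    # a single static address
    ipaddress.ip_network("198.51.100.0/24"),  # a whole office range
]


def may_edit(ip, logged_in):
    """Logged-in users may always edit; anonymous users only from whitelisted IPs."""
    if logged_in:
        return True
    address = ipaddress.ip_address(ip)
    return any(address in network for network in WHITELIST)


print(may_edit("192.0.2.10", logged_in=False))   # True
print(may_edit("203.0.113.5", logged_in=False))  # False
print(may_edit("203.0.113.5", logged_in=True))   # True
```

Because logged-in users skip the check, someone who connects from many different networks is only affected when editing anonymously.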

Support the consensus building of 3>2>other. - Wmcduff 22:31, 10 July 2009 (UTC)

Robyn Comments

Me: 3 > 2 > 1

1. Everyone works on this, and I think many hands make lighter work, but if we didn't, what cool improvements could we make instead?
2. It's a start, and I like NWoodruff's point that if spammers have a registration, it's easier to revert all their edits.
3. I see no downside to this one.
4. Not knowledgeable enough about impact to comment.
5. Noooo!

The IP whitelist would be a real problem for me, as I am a wireless slut, connecting to systems all over the continent, whatever will have me. Unless it only applies to not-logged-in users, then almost no impact.

And thank you to the various people who have been fighting spam on my talk page while I've been away.

Yvh11a Comments

3 >= 2 > 1

  1. I'm opposed to this - just because Joannac is a Hero of the Internet doesn't mean that it's right to allow spam to continue. That which is worth having is worth fighting for (this means the rest of us)
  2. I'm personally in favor of this, but only because I'm always on my home machine (whether locally or via ssh). I understand that removing anonymous access entirely might be an unbearable difficulty for some nomadic users, and it does violate the spirit of a wiki, so if everyone else doesn't like this idea, I won't complain.
  3. I'm in favor of this. If Google does it, it must be a good idea, right?
  4.  ? (no knowledge of)
  5. Actually, I changed my mind. Let's do this one!

davidc's comments

It seems a lot (but not all) of the spam is on anonymous user talk pages, by anonymous users. Is it possible to disable anonymous users from commenting on anonymous talk pages? If not, is it possible to disable anonymous user talk pages entirely? (I can see in a bigger wiki this would be a useful place to keep notes about an IP address, but I can't see any use for it here?)

Also for the record, I wouldn't like to see CAPTCHAs for anonymous users unless (a) they were accessible (easy to use on mobile phones etc) and (b) you only had to do them ONCE and then could edit without limits. --davidc 20:43, 2 September 2009 (UTC)