User:Joannac/Spam

Revision as of 15:44, 7 July 2009 by imported>Pierre.blondelle (The problem)

The problem

As frequent users have noticed, we are getting a non-trivial amount of spam (typical of anything on the internet, really). It falls into three categories:

  1. Crappy username spam like these
  2. Spam edits like this
  3. Odd spam like this, which I'm pretty sure is spam but could just be anonymous users showing their appreciation...

--> This is indeed not spam. It's just a friend of mine, who's not a geohasher and can't speak English. She's doing her best! ;-) -- Pierre 17:43, 7 July 2009 (GMT+1)

Suggestions

  1. We stay as we are. I don't actually mind protecting pages, deleting spam pages, and blocking spam IPs/users. Maybe I'll change my mind later, but for the moment, I don't care. But several different people have asked about more spam fighting measures, so I'm opening it up for wider discussion.
  2. We ban all anonymous editing. This will fix problems 2 and 3, but not so much 1. It will also annoy people logging in from public computers (e.g. an internet cafe on holiday), and people who want to make a contribution but don't want the hassle of making an account.
  3. We install CAPTCHAs like this extension. This allows anyone to edit, but requires unregistered users to solve a captcha to edit and everyone to solve one to register, with NO CAPTCHA for anyone signed in. This is a minor annoyance to people who regularly edit without logging in.
  4. Some kind of script like Wenslayer suggested. I think the biggest problem with this is getting it past the wiki server admin.
  5. We delete the whole wiki. (Just kidding).

Comments

Comment please! Add opinions, agreement/disagreement, other suggestions, random comments. Please also add a ranking for the options (if you have an opinion)

What helped extremely well against any kind of wiki spam in one mediawiki I administer is a silent word blacklist, i.e. you put common spam words (which can be brand names, certain otherwise uncommon html tag parameters, or certain link parts) on a blacklist. Whenever one of these words is used, the page is not saved when pressing the save button, but the wiki behaves as if it had been. I run my blacklist on fewer than ten keywords and haven't seen either spam or user complaints in years. That requires tinkering with the wiki php code though. It wouldn't work if it were a common extension.

I'm fine with any text-based (i.e. accessible) captchas otherwise. I don't like banning anonymous. -- relet 09:05, 7 July 2009 (UTC)
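
The silent blacklist relet describes could look roughly like the sketch below. It is only an illustration in Python, not the PHP change relet actually made; the term list, the `pages` dictionary and the function names are all made up for the example.

```python
import re

# Hypothetical blacklist; relet's real keyword list is not given in this discussion.
SPAM_TERMS = ["viagra", "casino"]


def contains_spam(text, terms=SPAM_TERMS):
    """Return True if any blacklisted term appears as a whole word (case-insensitive)."""
    pattern = r"\b(?:" + "|".join(re.escape(term) for term in terms) + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def save_page(pages, title, text, terms=SPAM_TERMS):
    """Store the edit unless it matches the blacklist, but always report success.

    `pages` is a plain dict standing in for the wiki's page store (an assumption).
    """
    if not contains_spam(text, terms):
        pages[title] = text
    # The same message is returned either way, so a spambot cannot tell
    # that its edit was silently dropped.
    return "Page saved."
```

The point of returning the same message in both cases is that the spammer gets no feedback to adapt to. The cost is that a legitimate edit containing a listed word is also dropped without warning, so the list has to stay short and specific.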

Will it be smart enough to recognize an expedition report where the hashpoint is at Pfizer headquarters? ;) --ilpadre 09:24, 7 July 2009 (UTC)

The ReCAPTCHA seems like a pretty useful project even beyond serving as a spam protection - I'd favor that solution. I wouldn't mind disabling anonymous editing, but since we have registered spammers too, that's probably unnecessary. --dawidi 09:49, 7 July 2009 (UTC)

1. I know you say you don't care, but that may just be because you have gotten used to it. My opinion is that you would really appreciate not having to do it anymore. Though you would have to find a new outlet for all your singing.
2. I haven't ever logged onto the wiki from a cyber cafe, and doubt I ever will, so disabling anonymous editing wouldn't be any skin off of my teeth. However, I do know that other people do so from time to time, so this isn't my favorite option.
3. I'm in favor of the ReCAPTCHA setup. Captchas in general tend to do a good job at discouraging spam, and the one used by that extension seems pretty good.
4. The problem with using a script to prevent certain types of user names is that it must necessarily be strict and focused, otherwise user names will always have to be confirmed by someone with authority. The problem with a strict and focused approach is that spammers, though annoying, are clever. When they see that their current approach is no longer working, they will just tweak it a bit, and we will have to bother the wiki server admin again to update our script.
5. I know you said you were joking, but this is the only surefire way to get rid of spam on the wiki. =P
So, given my opinions, my ranking is this: 3>2>1>4>5
And I suggest we use the Beatpath Winner (can be calculated here) based on people's rankings. Relet suggested it for a vote I proposed earlier, and it is quite appropriate for this vote as well, since we have multiple options to choose from. --aperfectring 12:27, 7 July 2009 (UTC)

I vote 3>1>2=4=5 -- relet 13:25, 7 July 2009 (UTC)

CAPTCHAs FTW! Banning anonymous editing really has the internet cafe problem and spammers would only create accounts. Staying as we are and manually deleting spam might not annoy you (but I'm sure it does or will do, whatever you say) but it annoys ME (and that's what counts most IMNSHO). relet's blacklist approach sounds good, but might be difficult to implement. 3>2>4>1>5 - Danatar 13:33, 7 July 2009 (UTC)

I agree to CAPTCHAs, as well, for account creation and anonymous users only. For those worried about accessibility, the ReCAPTCHA also offers an audio test for those using screenreaders. It also helps digitize books. Otherwise, we might also use the ConfirmEdit extension, which is given in plain text. Even though bots should be able to circumvent this, most spambots don't seem to do so. At least all wikis I know using ConfirmEdit are spam-free. I would oppose disabling anonymous editing, as this would kind-of break the original idea of a wiki. It would also hurt people on public computers and people having problems logging in (e.g. from a cellphone). I also oppose leaving everything as it is. Spam sucks, and even though spammers wouldn't even profit from doing this (the wiki software parses external links as e.g. <a href="http://freeviagra.lol" rel="nofollow">Free Viagra</a>, preventing search engine bots from following these links), it still wastes valuable server resources and I hate it. Koepfel talk 21:01, 7 July 2009 (UTC)

I'd like to see the CAPTCHAs too; my vote: 3>2>1>4>5 --Wenslayer 01:14, 8 July 2009 (UTC)

I'd say banning anonymous edits would hurt more of us than the spammers at this point. CAPTCHAs seem like a reasonable approach. I'd stick with 1 until JoannaC is fed up with deleting stuff or the rest of us are fed up with flagging stuff. 1>3>2>4>5 Jiml 06:45, 8 July 2009 (UTC) Now they spammed one of my expeditions, so 3>1>2>4>5 :-) Jiml 23:01, 10 July 2009 (UTC)

I really don't like banning anonymous, so I guess for me it's 3>4>1>2>5 -HiroProtagonist 10:59, 25 August 2009 (UTC)

myka comments

captchas could work. They'd be occasionally annoying, but generally not a nuisance (for all their evilness). note that 1 and 2 in my list are somewhat interchangeable. I really haven't decided as yet.

  1. leave as is (as long as joanna remains happy with the situation)
  2. captchas. (ReCaptcha is a decent thing in any case)
  3. ban anonymous editing.
  4. cleaning script.

NWoodruff comments

I've been asked to weigh in on this topic because I do a lot of anonymous editing. I only do anonymous editing because I am too lazy to log in every time.

My thought is that if anonymous editing is causing too much of a spam problem, then we should ban anonymous editing. It isn't that much more effort to force me to log in every time.

But, you say, it will cause the spam bots to create phony user IDs. It will make it much easier to run a script to rid the wiki of spam comments if each comment is tagged to a user. Just my thought.

If that is not the route to go, then I say whitelist IP addresses. I only post from two IP addresses: the 65.12 one is my home IP address, which is static and doesn't change. The 72.243 one is my work IP address, and it doesn't change either.

I wouldn't mind either one to get a handle on spam.
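
The IP whitelist idea could be combined with the captcha option roughly as sketched below, in Python. Everything here is an assumption made for illustration: the addresses are reserved documentation ranges rather than anyone's real IPs, and the function names are invented.

```python
import ipaddress

# Hypothetical allowlist. These are reserved documentation ranges,
# not the addresses mentioned in the comment above.
TRUSTED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),      # e.g. a whole office range
    ipaddress.ip_network("198.51.100.17/32"),  # e.g. a single static home address
]


def is_trusted(ip, networks=TRUSTED_NETWORKS):
    """Return True if the address falls inside any allowlisted network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


def needs_captcha(ip, logged_in):
    """Anonymous editors get a captcha unless they post from an allowlisted address."""
    return not logged_in and not is_trusted(ip)
```

This only helps regular anonymous editors with static addresses, and someone would still have to maintain the list by hand.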