Difference between revisions of "User talk:Aperfectring/Random image page"

Revision as of 17:57, 21 February 2010

I like the idea! But when I think about some of the images, there are cases where voting is just tiresome:

Some of the pictures I have taken (and one of the two pictures I saw at my first try of your page) are blurry...
... or they are really really boring like this one ...
... also there is and similar pictures.

While it is possible (and probable) that they will be voted down each time they show up, it will take very long for them to end up at the lower end of the list. It would speed things up if there was a "This picture really sucks, I want to give it -10 points" button. That's just my 2 cents because I'm lazy and I don't want to see the boring pictures often, I won't mind if you ignore this proposal. Yeah, ignoring is probably best. Why am I still typing? Ah, deleting this post is too much effort. I think I'll find something to occupy myself now. Bye. - Danatar 17:41, 17 February 2010 (UTC)

Yeah, that will happen, but there are also a lot of hidden gems that people wouldn't see otherwise. My tendency is to err on the side of caution, so that we can see all the images!
Same as point #1
That file will be filtered out, because it is not in [[Category:Meetup in LAT LON]].

There has been talk of adding a blacklist, but I think the best way is to just have the probability weight the choice of images based on their current rating. Unfortunately that means there will be some boring images you have to slog through, but as time goes on, it should get better and better. --aperfectring 17:51, 17 February 2010 (UTC)
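A minimal sketch of what "have the probability weight the choice of images based on their current rating" could look like. The function name `pick_image`, the `floor` parameter, and the linear shift of ratings into weights are all assumptions made for illustration; nothing here is taken from the actual page code.

```python
import random

def pick_image(ratings, floor=1.0, rng=random):
    """Pick one image name, favouring higher-rated images.

    ratings: dict mapping image name -> current rating (any real number).
    floor:   weight given to the lowest-rated image (assumed > 0), so that
             even the worst image still shows up occasionally.
    rng:     the random module or a random.Random instance.
    """
    names = list(ratings)
    lowest = min(ratings.values())
    # Shift all ratings so the worst image gets weight `floor`
    # and better images get proportionally more.
    weights = [ratings[name] - lowest + floor for name in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

With a small `floor`, badly rated images become rare but never vanish completely, which matches the wish to keep hidden gems reachable.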

Schema suggestion

`image` table

Column name	Type
imageID	CHAR(64)
imageName	VARCHAR(65535)
votesUp	INT() UNSIGNED
votesDown	INT() UNSIGNED
isCached	BOOL

`category` table

Column name	Type
catID	INT() UNSIGNED
catName	CHAR(64)

`wikiCatted` and `reportCatted` tables

Column name	Type
catID	INT() UNSIGNED
imageID	CHAR(64)

`wikiCatted` is emptied and repopulated on each sweep of the wiki. `reportCatted` is added to when people use the app. (Perhaps that table should have IP address and timestamp as well?)

Usage

INSERT values into the `category` table ("GPS", "XKCD Marker", "Wildlife") and link them to images using the `wikiCatted` table.
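The schema and usage above could be tried out roughly as follows. This is only a sketch: it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for whatever database the page would really use, so the column types are SQLite's rather than the ones in the tables above. The primary keys, the `UNIQUE` constraint, and the sample image `abc123` / `Example.jpg` are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE image (
    imageID   TEXT PRIMARY KEY,
    imageName TEXT NOT NULL,
    votesUp   INTEGER NOT NULL DEFAULT 0 CHECK (votesUp >= 0),
    votesDown INTEGER NOT NULL DEFAULT 0 CHECK (votesDown >= 0),
    isCached  INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE category (
    catID   INTEGER PRIMARY KEY,
    catName TEXT NOT NULL UNIQUE
);
CREATE TABLE wikiCatted (
    catID   INTEGER NOT NULL REFERENCES category(catID),
    imageID TEXT NOT NULL REFERENCES image(imageID),
    PRIMARY KEY (catID, imageID)
);
CREATE TABLE reportCatted (
    catID   INTEGER NOT NULL REFERENCES category(catID),
    imageID TEXT NOT NULL REFERENCES image(imageID)
);
""")

# Fill the category table with the example values from the text.
conn.executemany("INSERT INTO category (catName) VALUES (?)",
                 [("GPS",), ("XKCD Marker",), ("Wildlife",)])

# Hypothetical image, linked to the "GPS" category via wikiCatted.
conn.execute("INSERT INTO image (imageID, imageName) VALUES (?, ?)",
             ("abc123", "Example.jpg"))
conn.execute(
    "INSERT INTO wikiCatted (catID, imageID) "
    "SELECT catID, ? FROM category WHERE catName = ?",
    ("abc123", "GPS"))

# List every image together with its wiki-derived categories.
rows = conn.execute(
    "SELECT i.imageName, c.catName FROM wikiCatted w "
    "JOIN image i ON i.imageID = w.imageID "
    "JOIN category c ON c.catID = w.catID").fetchall()
```

Emptying `wikiCatted` on a sweep would then be a plain `DELETE FROM wikiCatted` followed by fresh inserts, while `reportCatted` is only ever appended to.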

Comments

I do like this better than my idea. However, the wiki will likely not be swept on a regular basis, as gathering the categories of each of the images would probably be fairly wiki-intensive and time-consuming. I will look into that, but more likely it will be updated once and then randomly refreshed. --aperfectring 21:37, 17 February 2010 (UTC)

Voting / ranking algorithm

This is mostly a summary of discussions on IRC, with only a few new thoughts, but I think it should go on the wiki, so I wrote it down.

I am very much looking forward to the voting feature myself, and would actually love to see it implemented first. --Ekorren 17:57, 21 February 2010 (UTC)

General thoughts

The voting should be about "favourite". We thought about several categories, e.g. "more interesting" vs. "more beautiful", since those need not be the same (imagine a blurred picture of a ridiculous no trespassing sign vs. a technically perfect picture of a tree), but for most pairs of pictures such a split doesn't really work.

Given the low number of comparisons a single picture will get, and the huge number of purely documentary pictures, a simple count of won and lost votes will not do. There's too much noise in that.

Beauty is in the eye of the beholder. There is no absolute value, and there will be lots of contradictory results.

If picture 1 won over picture 2, and 2 won over picture 3, you can assume that 1 would also win over 3, but can't be sure.

An algorithm that nicely includes transitive ratings will be desirable, but difficult to implement. We should use something easier first and just collect and store the actual votes for later evaluation.

An easy version without database digging

There was an idea to use a rated count, similar to the systems some sports use. The most prominent of these is the Elo system (which is quite complicated).

Sports rating systems assume that the absolute strength is a dynamic value evolving over time. A picture rating system doesn't need or even shouldn't assume that. There is no reason why the worth of old points should decline.

Sports rating systems are used to find worthy opponents of similar strength. This should not be done with a picture rating system. The bad ones shouldn't be given bad opponents to have a chance of winning, but maintain their absolute bad score (while still being able to compare their relative badness).

However, the general idea behind Elo wouldn't work too badly for us: winning over someone with a much lower rating doesn't earn you much, while winning over someone with a much higher rating moves you up.

An easy variant of that would be to award between 0 and 20 points for winning and to deduct the same amount for losing, with the actual amount derived from the difference between the two current ratings, scaled by some global factor.

Winning against a pic with the same rating would always earn you 10 points; losing would cost you 10.

The best-rated picture winning against the worst-rated one wouldn't change their ratings at all. "We knew that already."

Everything in between would scale into the 0 to 20 range.
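The three cases above pin down a simple linear rule, which could be sketched like this. The function names and the choice to take the global factor as "best rating minus worst rating" are assumptions for illustration; the discussion leaves the exact scaling open.

```python
def points_for_win(winner_rating, loser_rating, scale):
    """Points the winner gains (and the loser gives up) for one vote.

    scale: the global factor, assumed here to be the spread between
           the best and the worst current rating.
    """
    if scale <= 0:
        return 10.0  # no spread yet: treat both pictures as equal
    # Normalised rating difference, clamped to [-1, 1].
    diff = (winner_rating - loser_rating) / scale
    diff = max(-1.0, min(1.0, diff))
    # Equal ratings give 10, an expected win gives 0, an upset gives 20.
    return 10.0 * (1.0 - diff)

def apply_vote(ratings, winner, loser):
    """Update the ratings dict in place and return the points moved."""
    scale = max(ratings.values()) - min(ratings.values())
    pts = points_for_win(ratings[winner], ratings[loser], scale)
    ratings[winner] += pts
    ratings[loser] -= pts
    return pts
```

For example, with ratings of 0, 50 and 100, the middle picture beating the top one is half an upset and moves 15 points.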

Now there's the issue with the global factor. Since we don't really know how the absolute ratings will develop (actually, they will develop rather slowly), the scaling factor should use both absolute and dynamic limits. The difference between the overall best and the overall worst score might provide a good scale. However, it has two issues:

(1) In the beginning, it's too low, because it will take quite a long time until a number of pictures have a significant number of votes. This scales early votes too high compared to later ones.

(2) In the long term, it might reach astronomical heights, which means that most actual scores will be very close together relative to the overall range.

(1) could be addressed by setting an initial minimum value, (2) by using a non-linear scale (arctan comes to mind).

The overall scaling might need adjustment.
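One possible reading of the two fixes, as a rough sketch. The names `effective_scale` and `points_for_win_arctan`, the default `min_scale` of 100, and the `steepness` factor are all made-up values that would need the adjustment mentioned above; the arctan could equally be applied elsewhere, for instance to the range itself.

```python
import math

def effective_scale(best, worst, min_scale=100.0):
    """Fix for (1): never let the scale drop below an initial minimum."""
    return max(best - worst, min_scale)

def points_for_win_arctan(winner_rating, loser_rating, scale, steepness=3.0):
    """Fix for (2): squash the normalised difference with arctan.

    Equal ratings still give 10 points. Because arctan saturates, the
    result stays strictly between 0 and 20 however large the range gets,
    while small differences near zero are spread out by `steepness`.
    """
    diff = (winner_rating - loser_rating) / scale
    return 10.0 * (1.0 - (2.0 / math.pi) * math.atan(steepness * diff))
```

The rule stays symmetric: the points for A beating B and for B beating A always add up to 20.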

@@ Line 56: / Line 56: @@
 ===Comments===
 I do like this better than my idea.  However, the wiki will likely not be swept on a regular basis, as gathering the categories of each of the images would probably be fairly wiki-intensive and time-consuming.  I will look into that, but more likely it will be updated once and then randomly refreshed.  --[[User:Aperfectring|aperfectring]] 21:37, 17 February 2010 (UTC)
+== Voting / ranking algorithm ==
+This is mostly a summary of discussions on IRC, with only a few new thoughts, but I think it should go on the wiki, so I wrote it down.
+I am very much looking forward to the voting feature myself, and would actually love to see it implemented first. --[[User:Ekorren|Ekorren]] 17:57, 21 February 2010 (UTC)
+=== General thoughts ===
+* The voting should be about "favourite". We thought about several categories, e.g. "more interesting" vs. "more beautiful", since those need not be the same (imagine a blurred picture of a ridiculous no trespassing sign vs. a technically perfect picture of a tree), but for most pairs of pictures such a split doesn't really work.
+* Given the low number of comparisons a single picture will get, and the huge number of purely documentary pictures, a simple count of won and lost votes will not do. There's too much noise in that.
+* Beauty is in the eye of the beholder. There is no absolute value, and there will be lots of contradictory results.
+* If picture 1 won over picture 2, and 2 won over picture 3, you can assume that 1 would also win over 3, but can't be sure.
+* An algorithm that nicely includes transitive ratings will be desirable, but difficult to implement. We should use something easier first and just collect and store the actual votes for later evaluation.
+=== An easy version without database digging ===
+* There was an idea to use a rated count, similar to the systems some sports use. The most prominent of these is the Elo system (which is quite complicated).
+* Sports rating systems assume that the absolute strength is a dynamic value evolving over time. A picture rating system doesn't need or even shouldn't assume that. There is no reason why the worth of old points should decline.
+* Sports rating systems are used to find worthy opponents of similar strength. This should not be done with a picture rating system. The bad ones shouldn't be given bad opponents to have a chance of winning, but maintain their absolute bad score (while still being able to compare their relative badness).
+* However, the general idea behind Elo wouldn't work too badly for us: winning over someone with a much lower rating doesn't earn you much, while winning over someone with a much higher rating moves you up.
+An easy variant of that would be to award between 0 and 20 points for winning and to deduct the same amount for losing, with the actual amount derived from the difference between the two current ratings, scaled by some global factor.
+* Winning against a pic with the same rating would always earn you 10 points; losing would cost you 10.
+* The best-rated picture winning against the worst-rated one wouldn't change their ratings at all. "We knew that already."
+* Everything in between would scale into the 0 to 20 range.
+Now there's the issue with the global factor. Since we don't really know how the absolute ratings will develop (actually, they will develop rather slowly), the scaling factor should use both absolute and dynamic limits. The difference between the overall best and the overall worst score might provide a good scale. However, it has two issues:
+(1) In the beginning, it's too low, because it will take quite a long time until a number of pictures have a significant number of votes. This scales early votes too high compared to later ones.
+(2) In the long term, it might reach astronomical heights, which means that most actual scores will be very close together relative to the overall range.
+(1) could be addressed by setting an initial minimum value, (2) by using a non-linear scale (arctan comes to mind).
+The overall scaling might need adjustment.