General Discussions about LUGNET : 4819



Subject:	Article scoring
Newsgroups:	lugnet.admin.general
Date:	Sat, 4 Mar 2000 01:17:18 GMT

Last night I made up a little prototype test thingy for collaborative ranking (or scoring) of news articles in the system as being positive, neutral, or negative. (Nothing fancy, just about 20 lines of DB tracking code.) Its "underneath" implementation is pretty straightforward, but there's no "on top" user interface yet (but there could be, if the idea sounds good to people).

Here are my thoughts:

Each article in the system begins life, upon being posted, with a "neutral" score of 0. Members who happen to be reading through the web interface (or who can write themselves an extension to their newsreader to cast their votes) can mark each article (if they want) with a +1, 0, or -1. That "vote" then gets tallied, and the overall score for the article they just scored changes accordingly.

Obviously, the voting process must be quick and easy. But even so, most of the time, most articles will be passed by very quickly without many votes, because most articles tend to be more or less neutral, and no matter how simple voting is, it'll still require some amount of effort. Thus the following criteria:

1. Voting must be as quick and easy as possible, and completely voluntary.

2. A voter should not feel pressure to vote on every single article. It's perfectly OK not to vote.

3. Not voting on an article mustn't hurt that article's score.

4. A large number of people casting +1's or -1's should produce a higher magnitude (absolute value) score than a small number of people casting +1's or -1's, because, in practice, people will tend to vote only when they feel strongly one way or the other -- i.e., when they feel it's worth it to vote.

Number 4 is especially important, and it means that taking a simple average (sigma over n) is no good. To capture true excitement or true disgust, something other than an average must be used.

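
As a small illustration of why a simple average falls short of criterion 4, here is a sketch in Python. The function name and the sample vote lists are made up for this example and are not taken from the prototype described in the post; it assumes each vote is stored as +1, 0, or -1 and that people who don't vote are simply absent from the list.

```python
def simple_average(votes):
    """Plain mean (sigma over n) of the votes actually cast."""
    if not votes:
        return 0.0  # an article with no votes keeps its neutral score of 0
    return sum(votes) / len(votes)


# Hypothetical tallies: a couple of fans versus a crowd of them.
few_fans = [+1] * 2
many_fans = [+1] * 200

print(simple_average(few_fans))
print(simple_average(many_fans))
```

Both calls return 1.0, so the average cannot tell two enthusiastic voters from two hundred.
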
It looks like what would work wonderfully is a "square-rooted average" (sigma over sqrt(n))...for example:

   +1 +1 +1 +1 +1 +1 +1 +1 +1 +1  ->  +10 / sqrt(10) = +3.16
   +1 +1 +1 +1 +1 +1 +1 +1 +1  0  ->   +9 / sqrt(10) = +2.85
   +1 +1 +1 +1 +1 +1 +1 +1  0  0  ->   +8 / sqrt(10) = +2.53

   +1 +1 +1 +1 +1                 ->   +5 / sqrt(5)  = +2.24
   +1 +1 +1 +1  0                 ->   +4 / sqrt(5)  = +1.79
   +1 +1 +1  0  0                 ->   +3 / sqrt(5)  = +1.34

   +1 +1                          ->   +2 / sqrt(2)  = +1.41
   +1  0                          ->   +1 / sqrt(2)  = +0.71
   +1 -1                          ->    0 / sqrt(2)  =  0.00

   -1 -1 -1                       ->   -3 / sqrt(3)  = -1.73
   -1 -1 -1 -1 -1 -1 -1           ->   -7 / sqrt(7)  = -2.65

Another way of looking at this is that it's simply a straight sum, but one which has been run through a finalizer function to keep the sums from flying away to positive or negative infinity (or, in practice, silly-big numbers) too quickly. That is, 1000 positive votes shouldn't really result in a score 10 times as high as "only" 100 positive votes. (Mutatis mutandis for negative scores.)

So, in other words, the summation part makes it so that one person can't have too large an effect on the whole when there are small numbers of votes (this is a big problem with regular straight averaging), and the sqrt() part makes it so that a bazillion people can't have too large an effect on the whole either (when compared to a tenth-of-a-bazillion people). log() of some base (say, 2) could just as well be used in place of sqrt(), but in practice I think logs trail off too quickly.

The sqrt() part is also very important as a "visibility raiser" enabler. If there are 20 +1's and 10 0's, that comes out to a simple average of 2/3. And if there are 200 +1's and 100 0's, then that also comes out to a simple average of 2/3.

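
The "square-rooted average" can be sketched in a few lines of Python. This is only an illustration of the sigma over sqrt(n) formula as stated in the post, not the prototype's actual code; the name `sqrt_score` is invented here, and n is taken to be the number of votes cast (explicit 0 votes included, non-voters excluded).

```python
import math


def sqrt_score(votes):
    """Sum of the votes divided by the square root of the number of votes cast."""
    n = len(votes)
    if n == 0:
        return 0.0  # no votes yet: the article stays at its neutral score of 0
    return sum(votes) / math.sqrt(n)


# A few of the vote patterns worked through in the post.
examples = [
    [+1] * 10,
    [+1] * 9 + [0],
    [+1, -1],
    [-1] * 7,
]

for votes in examples:
    print(f"{sum(votes):+d} / sqrt({len(votes)}) = {sqrt_score(votes):+.2f}")
```

For an all-positive vote the score works out to sqrt(n), so 1000 positive votes score about 3.16 times as high as 100 positive votes, rather than 10 times.
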
Dividing by sqrt(n) instead of by n, however, the two come out differently:

   Votes               |  Sigma / n             Sigma / sqrt(n)
   --------------------+------------------------------------------------
   +1 x 20,   0 x 10   |  +20 / 30   = +0.67    +20 / sqrt(30)   = +3.65
   +1 x 200,  0 x 100  |  +200 / 300 = +0.67    +200 / sqrt(300) = +11.55

The method on the right yields the result that the bottom one is approximately 3 times more significant than the top one, whereas the method on the left (simple averaging) yields the result that the two are the same. But of course when voting is voluntary and not mandatory, 200 people casting a positive vote is far more significant than only 20 people casting a positive vote.

* * *

OK, that (above) was the internal math side of things. That part would all be (thank god :) completely transparent to users, who'll simply want to be able to click a button and see the current score. And maybe later, also be able to rank things by score.

This could help a lot in getting visibility to things -- tied with the Spotlight and the Channels stuff. Even if only a handful of people actively end up choosing to vote.

--Todd

Message has 5 Replies:

		Re: Article scoring
(...) ^^^ (...) Er, I meant a literal "and" there, not an "and then." That is, people ought IMHO to be able to see the current score without first voting -- that's an important thing, especially if they're reviewing their vote later, etc. --Todd (24 years ago, 4-Mar-00, to lugnet.admin.general)

		Re: Article scoring
(...) Sounds funky! What would be nice is a Top (X) messages of the hour/day/week/month/etc page. Although, would that bias then be an unfair one? I.e., as traffic gets busier, more and more people might just decide to view the day's most popular (...) (24 years ago, 4-Mar-00, to lugnet.admin.general)

		Re: Article scoring
In lugnet.admin.general, Todd Lehman writes: <interesting (to me) yet long techie details snipped> Great! I like this and agree to the reasoning, 'specially the sqrt part of more people voting influencing more. (...) ;-) (...) I agree, cool! (...) (...) (24 years ago, 4-Mar-00, to lugnet.admin.general)

		Re: Article scoring
(...) The only reason I read Slashdot is because of the article scoring. Not a bad system, that. But I'm enough of a narcissist that I'd be prone to artificially boost the score on my own messages. And I suspect I'm not alone. (You know who you are. (...) (24 years ago, 4-Mar-00, to lugnet.admin.general)

		Re: Article scoring
On Sat, 4 Mar 2000, Todd Lehman (<FqvI8u.LzE@lugnet.com>) wrote at 01:17:18 (...) So what's it for? A surrogate for me-too posts? Speaking as an NNTP person, I judge articles by the number of replies they get, and I wouldn't see any of this at all. (...) (24 years ago, 4-Mar-00, to lugnet.admin.general)

52 Messages in This Thread:
