Detecting voting rings with HyperLogLog

(written by lawrence krubner, however indented passages are often quotes). You can contact lawrence at: lawrence@krubner.com, or follow me on Twitter.

While this seems like a clever trick, I typically want a lot of metrics regarding voting, so throwing away the metadata doesn’t seem like an option to me. As an aggregate tool whose only purpose is finding voting rings, maybe this useful maybe?

But consider what would happen if we created a HyperLogLog counter for every user on Reddit, and any time that a user receives an upvote, we update the corresponding HyperLogLog counter with the id of the user who did the upvoting. Given this setup, here’s how we detect the voting ring. First, for a given user, retrieve the estimated count of unique upvotes from that user’s HyperLogLog counter and let’s label this quantity as estimated_unique_upvotes. Next, tally up the total number of actual upvotes…

(these things)

across all of that user’s posts. Let’s call this quantity as global_total_upvotes. The probability that a given user is involved in a voting ring will correlate highly with the ratio of these numbers. In other words we can define

voting_honesty = estimated_unique_upvotes / global_total_upvotes

Think about it. If a user is not in a voting ring, then their upvotes are going to be organically generated from anyone one the world-wide-web that thinks their links are interesting. Because the internet is a big place, this user’s estimated_unique_upvotes and global_total_upvotes will tend toward the same number and their voting_honesty “probability” will tend toward 1. On the other hand, a user wrapped up in seedy underworld of Reddit voting rings will have many more global_total_upvotes across all posts than they have estimated_unique_upvotes. For this user, the voting_honesty will tend toward 0.

Post external references

1
http://opensourceconnections.com/blog/2013/10/14/detecting-reddit-voting-rings-using-hyperloglog-counters/

Source

RECENT COMMENTS

January 11, 2022 4:33 am

From Essie on Docker is the dangerous gamble which we will regret

"Once in 1990s, there are popular high performance solution called HPC software, many commercial softwares are ..."

December 17, 2021 7:32 pm

From John Carston on The ethics of being a high level tech consultant (a Fractional CTO)

"It helped when you mentioned that it is important to have a real connection with your consumer. My cousin ment..."

September 2, 2021 7:47 pm

From Mojavedfo on Where PHP regex fails

"55 thousand Greek, 30 thousand Armenian..."

August 7, 2021 9:53 am

From Colin Steele on The ethics of being a high level tech consultant (a Fractional CTO)

"Fantastic essay. Thoughtful, well-constructed, timely and applicable. I think every part-timer in the tech f..."

August 5, 2021 3:02 pm

From Rachiovwn on Where PHP regex fails

"consists of the book itself..."

October 19, 2019 3:08 am

From Bernd Schatz on Object Oriented Programming is an expensive disaster which must end

"I really enjoyed your article. But i can't understand the example with the interface. The example is reall..."

October 4, 2019 8:44 pm

From lawrence on My final post regarding the flaws of Docker / Kubernetes and their eco-system

"Gorgi Kosev, I am working to clean up some of my Packer/Terraform code so I can release it on Github, and then..."

October 4, 2019 5:14 pm

From Gorgi Kosev on My final post regarding the flaws of Docker / Kubernetes and their eco-system

"> Packer, sometimes with some Ansible. The combination of Packer and Terraform typically gives me what I ne..."

October 4, 2019 12:40 pm

From lawrence on My final post regarding the flaws of Docker / Kubernetes and their eco-system

"Gorgi Kosev, about this: "I would love if you could point out which VM based system makes it simpler and..."

October 4, 2019 7:31 am

From Gorgi Kosev on My final post regarding the flaws of Docker / Kubernetes and their eco-system

"I won't list anything concrete that you missed, because that will just give you ammunition to build the next a..."

October 4, 2019 1:39 am

From lawrence on My final post regarding the flaws of Docker / Kubernetes and their eco-system

"Gorgi Kosev, also, I don't think you understand what a "straw man argument" is. This is a definition from Wiki..."

Detecting voting rings with HyperLogLog

Post external references

RECENT COMMENTS

NO COMMENTS

Leave a Reply Cancel reply