First of all, I searched as best I could and read all SO questions that seem relevant, but nothing specifically answered this. This is not a duplicate, afaik.
Obviously if anonymous voting on a website is allowed, there is no fool proof way to prevent someone voting more than once.
However, I am wondering if someone with experience can aide me in coming up with a reasonably reliable way of tracking absolutely unique visitors and recording votes against those credentials.
Currently I am ensuring that only one vote per item/session combo is allowed, however this is easily circumvented by restarting browser, changing browsers/computers, or clearing your session data.
Recording against IP seems the next reasonable solution but I wonder if this will get false positives too often (multiple people on same LAN behind a NAT will have same external IP, etc).
Is there a middle ground to be had here or some other method/combination I am overlooking?
I'd collect as much data about the session as possible without asking any questions directly (browser, OS, installed plugins, all with versions numbers, IP address etc) and hash it.
Record the hash and increment a counter if you want multiple votes to be allowed. Include a timestamp (daily, hourly etc) in the salt to make votes time sensitive, say 5 votes per day.
The Chinese have to share one IPv4 address with hundreds of others; Hp/Compaq/DEC has almost 50 million addresses. IPv6 doesn't help as everyone get addresses by the billion. A person just is not the same as an IP address, and that notion is becoming ever more false.
There are just no proper ways to do this on the Internet. Persons are simply a concept unknown on the Internet, and any idea to introduce the concept is unlikely to succeed. (Too many governments would not want this to happen, for instance.)
Of course, you can relate the amount of votes per IP to the amounf of repeat page visits from that IP, especially in combination with cookie tracking. This works best if you estimate that number before you start the voting period. If the top 5% popular articles are typically read 10 times from a single IP, it's likely 10 people share that IP and they should get 10 votes. Cookies can be used to prevent them from stealing each others vote, but on the whole they can't skew your poll. (Note: this fails in small communities where a large group of voters come from a small number of IPs, in particular this happens around universities).
The simplest answer is to use a cookie. Obviously it's vulnerable to people clearing their cookies, but anonymous voting is inherently approximate anyway.
In practice, unless the topic being voted on is in some way controversial or inflammatory, people aren't going to have a motive behind rigging the vote anyway.
IP is more 'reliable' but will produce an unacceptably high level of collisions due to NATs.
How about a more unique identifier composed of IP + user-agent (maybe a hash)? That effectively means for each IP, each exact OS/browser version pair gets 1 vote, which is a lot closer to 1 vote per person. Most browsers provide detailed version information in the user-agent -- I'm not sure, but my gut feel is that this would prevent the majority of collisions caused by NATs.
The only place that would still produce lots of collisions is a corporate environment with a standardised network, where everyone is using an identical machine.
If you're not looking at authenticating voters, then you're going to be getting some duplicate votes no matter what you use. I'd use a cookie, and have done with it for the anonymous users.
UserVoice allows both anonymous voting and voting when logged in, but then allows the admin to filter out anonymous votes - a nice solution to this problem.
Use a persistent cookie to allow only one vote per item
and record the IP, if there are more than 100 (1,000? 10,000?) requests in less than X mins then "soft block" the IP
The "soft block": dont show a page saying "your IP has been blocked" but show your "thank you for your vote" page and DONT record the vote in your DB. You even can increase the counter for that IP only. You want to prevent them to know that you are blocking their IP.
Anything based on IP addresses isn't an option - the case of NAT has been mentioned, but this seems to only be in the case of home users. There are many larger installations that use NAT - some corporations can have thousands of users pooled behind a single IP address. There are also ISP's that use proxy servers for their users - another case where you can have many thousands of users appear to your application as a single address. Adding unique UA combinations to this won't help, as there isn't enough variation.
A persistent cookie is going to be your best bet - and you'll have to live with the fact that it is easy to game. At least when the cookie is persistent (as opposed to session based) you'll catch the majority of users who run a single browser.
If you really want to rely on the results, you are going to have to add some form of identification in the process (like e-mail validation, which is still gameable).
At the end of the day any internet survey is going to have flaws (like: http://www.time.com/time/arts/article/0,8599,1894028,00.html), and you'll have to live with this.
Two ideas not mentioned yet are:
Obviously the former can be circumvented with disposable email addresses and so on, but gives you an audit trail, and provides a significant hurdle to casual/bot vote-stuffing. A good captcha likewise severely limits vote-stuffing, but with all the usual caveats surrounding their use.
I have the same problem, and here's what I am planning on doing...
Set a persistent cookie. Check the cookie to decide whether a particular vote could be cast. Additionally store some data about the vote request in the form of a combination of IP address + User Agent. And then use this value to limit the no. of votes to, say, 10 per day.
What is the best way of going about creating this hash (IP + UA String)?