January 23, 2012

Statistical analysis for “You Wouldn’t!”

I launched You Wouldn’t in September last year, and the response has been really interesting on two levels: first as a platform for people to have fun with, and second as a source of statistics that I have been having fun playing with. You can check them out here.

On the first day it was publicly announced, it was linked on Reddit and ended up getting 1,600 visitors who collectively viewed 28,000 pages. By the end of the night, though, the system that had been used to manage it had been overcome by a malicious script, and the flow of the experience had been destroyed. Over the next weekend my friend Aidan and I rebuilt a tighter system using a database. That let us limit how fast people could post, and it also meant we could tell who had posted what. From there we decided that we could also let people vote on which posts they liked.
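
For the curious, here is a rough sketch of how a database can be used to limit posting speed. This is not the code Aidan and I wrote; the table layout, the `try_post` function, the 30-second limit, and the use of SQLite are all assumptions made for illustration.

```python
import sqlite3
import time

# Hypothetical limit; the real site's value is not stated in this post.
MIN_SECONDS_BETWEEN_POSTS = 30


def open_db(path=":memory:"):
    """Open the database and create the (assumed) posts table if needed."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        " id INTEGER PRIMARY KEY,"
        " author TEXT NOT NULL,"
        " body TEXT NOT NULL,"
        " created_at REAL NOT NULL)"
    )
    return db


def try_post(db, author, body, now=None):
    """Store the post and return True, or return False if the
    same author posted less than the minimum interval ago."""
    now = time.time() if now is None else now
    last = db.execute(
        "SELECT MAX(created_at) FROM posts WHERE author = ?", (author,)
    ).fetchone()[0]
    if last is not None and now - last < MIN_SECONDS_BETWEEN_POSTS:
        return False  # too soon: reject without storing anything
    db.execute(
        "INSERT INTO posts (author, body, created_at) VALUES (?, ?, ?)",
        (author, body, now),
    )
    db.commit()
    return True
```

Because every stored post carries an author and a timestamp, the same table that enforces the limit also records who posted what, which is what makes voting and later analysis possible.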

Although the exploit cost the site its initial momentum, the improved system captured people’s attention. In the four months it has been online, it has had 3,600 visits and shown 55,000 pages, and the average visitor stays on the site for five minutes. By comparison the average visit on a website is seven seconds – it’s hard to overstate how distracted the average internet user is.

The data that I needed to collect to make the system work also makes for intriguing analysis. With individual ratings you can see which posts are the most popular and which are the most contentious. You can compare the score distribution of all the posts with that of the subset containing a specific keyword. At present the corpus of data is relatively small, so it’s hard to determine clear trends. I’m hoping to expand the system to other places to see how they vary. It’s going to be interesting!
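
As a rough sketch of what that analysis could look like: the code below treats a post's mean rating as its popularity and the spread (standard deviation) of its ratings as how contentious it is. Those two definitions are my assumptions here, one reasonable choice among several, and the function names and sample posts are made up for illustration rather than taken from the site.

```python
from statistics import mean, pstdev


def summarize(ratings_by_post):
    """Map each post's text to (mean rating, spread of ratings).
    Posts with no ratings are skipped."""
    return {
        post: (mean(ratings), pstdev(ratings))
        for post, ratings in ratings_by_post.items()
        if ratings
    }


def most_popular(summary):
    """Post with the highest mean rating (assumed popularity measure)."""
    return max(summary, key=lambda post: summary[post][0])


def most_contentious(summary):
    """Post whose ratings disagree the most (assumed contentiousness measure)."""
    return max(summary, key=lambda post: summary[post][1])


def keyword_scores(summary, keyword):
    """Mean ratings of only those posts whose text contains the keyword."""
    needle = keyword.lower()
    return [m for post, (m, _) in summary.items() if needle in post.lower()]


# Invented sample data, not real posts or ratings from the site.
sample = {
    "steal a car": [5, 5, 4],
    "download a bear": [1, 5, 1, 5],
    "steal a handbag": [2, 3, 2],
}
summary = summarize(sample)
all_scores = [m for m, _ in summary.values()]
steal_scores = keyword_scores(summary, "steal")
```

Comparing `all_scores` with `steal_scores` (for example by their means, or with a histogram of each) is the keyword comparison described above. With a corpus this small, any difference should be read with caution.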

