There is too much information available to us these days. Some neat examples:
- There are at least 80 million videos on YouTube [wikipedia], and if the average length is at least one minute, then it would take more than 150 years of round-the-clock viewing to watch them all (with no waiting for loading).
- There are at least 1.5 million CCTV cameras deployed in public places in the Greater London area [also wikipedia], meaning that even if one person could monitor 16 cameras at once, you’d still need almost 100,000 people, or around one out of every seventy people living in London employed to watch over the other 69. (If we broke it into reasonable chunks, you’d want 4 full-time shifts of people, pulling the number up to 400k and the ratio down to about 1:18!)
- Project Gutenberg, wonderfully, has over 100,000 books on offer for download. If we assume that the average book is 100,000 words (a typical novel runs roughly 80,000 to 100,000; the English translation of Les Miserables is 650,000) then it would take about 80 years of constant reading at a pace of 240wpm to get through it all.
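For anyone who wants to check these back-of-the-envelope numbers, here is a minimal Python sketch of the arithmetic. The function names are made up for illustration, and the inputs are just the rough figures quoted above (one minute per video, 16 cameras per person, 100,000 words per book, 240wpm), so treat the results as order-of-magnitude estimates rather than measurements.

```python
import math

# Non-stop viewing or reading: 24 hours a day, 365 days a year.
MINUTES_PER_YEAR = 60 * 24 * 365


def years_to_watch(videos, minutes_each=1.0):
    """Years of continuous viewing needed to watch every video once."""
    return videos * minutes_each / MINUTES_PER_YEAR


def watchers_needed(cameras, cameras_per_person=16, shifts=1):
    """People needed to monitor all cameras, multiplied by the number of shifts."""
    return math.ceil(cameras / cameras_per_person) * shifts


def years_to_read(books, words_per_book, words_per_minute=240):
    """Years of continuous reading needed to finish every book once."""
    return books * words_per_book / words_per_minute / MINUTES_PER_YEAR


if __name__ == "__main__":
    print(f"Videos: {years_to_watch(80_000_000):.0f} years")
    print(f"CCTV, one shift: {watchers_needed(1_500_000):,} people")
    print(f"CCTV, four shifts: {watchers_needed(1_500_000, shifts=4):,} people")
    print(f"Books: {years_to_read(100_000, 100_000):.0f} years")
```

With the unrounded inputs this works out to roughly 152 years of video, 93,750 camera watchers per shift (375,000 across four shifts) and about 79 years of reading. Changing any one assumption, say a longer average video or a slower reader, only makes the totals bigger.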
There are probably hundreds (or thousands, or millions) more examples where the quantity of information at our disposal is so far beyond what any one person could take in that it becomes effectively unusable. This is why we need to find a better way to work with large quantities of information – a way of “Data Mining”, if you will pardon the phrase.
It’s true, data mining is becoming big business, and all of the big IT companies are doing it – but when was the last time that you did it? Unless you work at one of those big IT companies, probably never.
That’s something I want to change. One of the programs I’ve been making recently, the Youtuber, lets individuals do some basic analysis on one of the most frivolous data sources out there – YouTube comments. It is my hope that if we make better data analysis tools for everyday users, they will learn something new and interesting about the world.
Not to mention what we might find when we turn this analysis back on ourselves! But that’s another story.