<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Zachernuk.com &#187; analytics</title>
	<atom:link href="http://www.zachernuk.com/category/analytics/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.zachernuk.com</link>
	<description>The desk of Brandel Zachernuk</description>
	<lastBuildDate>Tue, 27 Jul 2010 01:39:26 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Prelude to the Youtuber</title>
		<link>http://www.zachernuk.com/2008/10/28/prelude-to-the-youtuber/</link>
		<comments>http://www.zachernuk.com/2008/10/28/prelude-to-the-youtuber/#comments</comments>
		<pubDate>Tue, 28 Oct 2008 11:31:36 +0000</pubDate>
		<dc:creator>Brandel Zachernuk</dc:creator>
				<category><![CDATA[analytics]]></category>
		<category><![CDATA[youtuber]]></category>
		<category><![CDATA[comments]]></category>
		<category><![CDATA[social]]></category>
		<category><![CDATA[stats]]></category>
		<category><![CDATA[web 2.0]]></category>
		<category><![CDATA[youtube]]></category>

		<guid isPermaLink="false">http://www.zachernuk.com/?p=83</guid>
		<description><![CDATA[People talk a lot of trash about YouTube comments. People often accuse them of any or all of the following three things:
They&#8217;re stupid,
They&#8217;re repetitive,
There are too many of them.
The first two accusations are reasonable &#8211; there are a lot of stupid comments out there, and they are very repetitive.  They are so hated by some [...]]]></description>
			<content:encoded><![CDATA[<p>People talk a lot of trash about YouTube comments. People often accuse them of any or all of the following three things:</p>
<ul>
<li>They&#8217;re stupid,</li>
<li>They&#8217;re repetitive,</li>
<li>There are too many of them.</li>
</ul>
<p>The first two accusations are reasonable &#8211; there are a <em>lot</em> of stupid comments out there, and they are very repetitive. Some people hate them so much that someone has developed a Firefox Add-on that will <a title="Comment Snob" href="https://addons.mozilla.org/en-US/firefox/addon/7115" target="_blank">strip comments out of the page before you can even be offended by them</a>. On the other hand, though, <em>people</em> are very stupid and repetitive. There&#8217;s often a surprising honesty to the statements that people make on YouTube. It has been <a href="http://andrewchen.typepad.com/andrew_chens_blog/2007/12/public-and-priv.html" target="_blank">suggested</a> that the monumental scale of the YouTube &#8216;community&#8217; means that individuals are effectively anonymous, which liberates users from the self-censorship that fear of shame or other punishment would otherwise impose. It&#8217;s not all honesty, since people are more likely to be offensive if they know they can get away with it, but it&#8217;s a refreshingly different place to look for views and opinions.</p>
<p>So while they may be stupid, they&#8217;re still worth looking at. Quantity is more of a practical problem, though. For example, I keep tabs on a video called <a href="http://www.youtube.com/watch?v=ysTmUTQ5wZE" target="_blank">&#8220;The Most Pathetic Baby Panda Ever&#8221;</a>. It has almost 5 million views and 15,500 comments. Even if you dedicated only 5 seconds to each comment, you would have to spend over 21 hours studying them. The Evolution Of Dance has almost 250,000 comments &#8211; about two weeks&#8217; worth of study at the same pace. While they&#8217;re interesting, they&#8217;re not <em>that</em> interesting.</p>
<p>Conveniently, though, this is exactly the kind of thing that the study of data mining is supposed to deal with. In addition, the fact that the comments appear to be stupid (or simple) and highly repetitive works in our favour. The first question I want to ask of these comments is &#8220;what&#8217;s a typical comment?&#8221;, and toward that end I have created the <a href="http://www.zachernuk.com/Youtuber" target="_blank">Youtuber</a>.</p>
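<p>The reading-time estimates above are easy to check. What follows is only a minimal sketch in Python: the function name <code>reading_time_seconds</code> is hypothetical, and the five-second pace is the same rough assumption used above.</p>
<pre>
```python
def reading_time_seconds(comment_count, seconds_per_comment=5):
    """Total seconds needed to read every comment at a fixed pace."""
    return comment_count * seconds_per_comment


SECONDS_PER_HOUR = 60 * 60
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR

# Comment counts are the ones quoted in the post; the pace is an assumption.
panda_hours = reading_time_seconds(15_500) / SECONDS_PER_HOUR
dance_days = reading_time_seconds(250_000) / SECONDS_PER_DAY

print(f"Baby panda video: {panda_hours:.1f} hours")
print(f"Evolution of Dance: {dance_days:.1f} days")
```
</pre>
<p>Changing <code>seconds_per_comment</code> shows how quickly the total grows with a more careful reading pace.</p>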
]]></content:encoded>
			<wfw:commentRss>http://www.zachernuk.com/2008/10/28/prelude-to-the-youtuber/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Your Mine</title>
		<link>http://www.zachernuk.com/2008/10/16/your-mine/</link>
		<comments>http://www.zachernuk.com/2008/10/16/your-mine/#comments</comments>
		<pubDate>Thu, 16 Oct 2008 10:28:37 +0000</pubDate>
		<dc:creator>Brandel Zachernuk</dc:creator>
				<category><![CDATA[analytics]]></category>
		<category><![CDATA[analysis]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[datamining]]></category>
		<category><![CDATA[forest]]></category>
		<category><![CDATA[pizza]]></category>
		<category><![CDATA[youtuber]]></category>

		<guid isPermaLink="false">http://www.zachernuk.com/wp/?p=46</guid>
		<description><![CDATA[In keeping with my last post about datamining, and the overwhelming amount of information available out there, I thought I would tell you a little bit about myself.
I have 630GB of HDD space, with around 450GB currently in use.
In 30GB of that alone, I have 180,000 files
I have written at least 550 emails in the [...]]]></description>
			<content:encoded><![CDATA[<p>In keeping with my last post about datamining, and the overwhelming amount of information available out there, I thought I would tell you a little bit about myself.</p>
<ul>
<li>I have 630GB of HDD space, with around 450GB currently in use.</li>
<li>In 30GB of that alone, I have 180,000 files.</li>
<li>I have written at least 550 emails in the past 3 years, not including a work address that would add another 200 or so, and I&#8217;m not a prolific emailer.</li>
<li>I have made 5500 electronic transactions (EFTPOS is big in New Zealand) since 1999.</li>
<li>My computer has been up for 650 hours without restarting (though with probably about 30 hibernate/restores).</li>
<li>Last week I visited 8,000 web pages.</li>
<li>Over the course of my life, I have spent $1500 at Pizza Hut.</li>
</ul>
<p>Most of these factoids are interesting only to me, and sometimes not even to me. Being able to dig into my financial transactions gives me the opportunity to do a number of things, though &#8211; I can construct an &#8220;average&#8221; day. If I place my expenditures on a city map, I can draw my route over a day (or a week, or a month), see when I buy my morning coffee, and work out the average radius of my lunchtime wanderings. I can even use it to retrace my steps, and find the name of that great Japanese restaurant I went to in Auckland last year.</p>
<p>You might still think that these things are unimportant, but that view ignores two points. First, we don&#8217;t know what&#8217;s insignificant until we can see it. There may be important trends in my buying habits &#8211; if I buy a coffee in the morning, I may end up working late because I forget to go home. Second, it&#8217;s important to somebody &#8211; it might not be front-page news if I go to a restaurant, but it&#8217;s nice to have the name on hand so I can tell a friend whether to give it a miss or not.</p>
<p>Most of our interaction with computers (as digital cameras, cellphones or point-of-sale devices) is being recorded. The sum total of this recorded data is referred to as our <em>digital footprint</em>. While many people find the presence &#8211; or even the idea &#8211; of this record threatening, I think it gives us an opportunity to answer important questions about who we are, and what we do.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.zachernuk.com/2008/10/16/your-mine/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>TMI</title>
		<link>http://www.zachernuk.com/2008/10/16/tmi/</link>
		<comments>http://www.zachernuk.com/2008/10/16/tmi/#comments</comments>
		<pubDate>Thu, 16 Oct 2008 05:30:19 +0000</pubDate>
		<dc:creator>Brandel Zachernuk</dc:creator>
				<category><![CDATA[analytics]]></category>
		<category><![CDATA[youtuber]]></category>
		<category><![CDATA[analysis]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[datamining]]></category>

		<guid isPermaLink="false">http://www.zachernuk.com/wp/?p=41</guid>
		<description><![CDATA[There is too much information available to us these days. Some neat examples:
 There are at least 80 million videos on Youtube [wikipedia], and if the average length is at least one minute, then it would take about 160 years to watch them all (with no waiting for loading).
There are at least 1.5 million CCTV [...]]]></description>
			<content:encoded><![CDATA[<p>There is too much information available to us these days. Some neat examples:</p>
<ul>
<li>There are at least 80 million videos on YouTube [<a href="http://en.wikipedia.org/wiki/Youtube">wikipedia</a>], and if the average length is at least one minute, then it would take about 160 years to watch them all (with no waiting for loading).</li>
<li>There are at least 1.5 million CCTV cameras deployed in public places in the Greater London area [also <a href="http://en.wikipedia.org/wiki/CCTV#Crime_prevention_.2F_evidence">wikipedia</a>], meaning that even if one person could monitor 16 cameras at once, you&#8217;d still need almost 100,000 people, or around one out of every seventy people living in London, employed to watch over the other 69. (If we broke it into reasonable chunks, you&#8217;d want 4 full-time shifts of people, pulling the number up to 400k and the ratio to about 1:18!)</li>
<li><a href="http://www.gutenberg.org">Project Gutenberg</a>, wonderfully, has over 100,000 books on offer for download. If we assume that the average book is 100,000 words (a novel is generally supposed to be about 230,000 or so; the English translation of <em>Les Miserables</em> is 650,000) then it would take about 80 years (of constant reading at a pace of 240wpm) to get through it all.</li>
</ul>
<p>There are probably hundreds (or thousands, or millions) more examples of where the quantity of information we have at our disposal is so far beyond our perceivable reach that it becomes effectively unusable.  This is why we need to find a better way to work with large quantities of information &#8211; a way of &#8220;Data Mining&#8221;, if you will pardon the phrase.</p>
<p>It&#8217;s true, datamining is becoming big business, and all of the big IT companies are doing it &#8211; but when was the last time that <em>you</em> did it?  Unless you work at one of those big IT companies, probably never.</p>
<p>That&#8217;s something I want to change. One of the programs I&#8217;ve been making recently, the <a href="http://www.zachernuk.com/Youtuber">Youtuber</a>, lets individuals do some basic analysis on one of the most frivolous data sources out there &#8211; YouTube comments. It is my hope that if we make better data analysis tools for everyday users, they will learn something new and interesting about the world.</p>
<p>Not to mention what we might find when we turn this analysis back on <em>ourselves</em>! But that&#8217;s another story.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.zachernuk.com/2008/10/16/tmi/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
