The Future
Punctuation Stripping
I want to write a future version of the youtuber in ActionScript 3. It is faster, which allows more processing, and it will let me use regular expressions to remove punctuation from the keywords where desirable. Where people use exclamation marks, for example, most use one, some use two and a few use three, but sometimes the majority seems to think that two is a more appropriate baseline.
Scraper
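The punctuation stripping proposed in the previous section could look roughly like the sketch below. The tool itself is planned in ActionScript 3; Python is used here purely for illustration, and every function name is hypothetical rather than part of the youtuber. It shows two options (collapsing repeated marks, or removing punctuation outright) plus a small helper that finds the most common number of exclamation marks in a set of comments, which is one possible reading of the "baseline" mentioned there.

```python
import re
from collections import Counter

# A run of the same mark repeated, e.g. "!!!" or "??" (assumed set of marks).
REPEATED_PUNCT = re.compile(r"([!?.])\1+")
# Any character that is neither a word character nor whitespace.
ANY_PUNCT = re.compile(r"[^\w\s]")


def collapse_repeats(text):
    """Collapse a run of one repeated mark to a single mark."""
    return REPEATED_PUNCT.sub(r"\1", text)


def strip_punctuation(text):
    """Remove punctuation entirely, keeping words and whitespace."""
    return ANY_PUNCT.sub("", text)


def exclamation_baseline(comments):
    """Return the most common length of '!' runs, or 0 if there are none."""
    runs = Counter(
        len(run) for comment in comments for run in re.findall(r"!+", comment)
    )
    return runs.most_common(1)[0][0] if runs else 0


if __name__ == "__main__":
    print(collapse_repeats("so cute!!!"))   # so cute!
    print(strip_punctuation("so cute!!!"))  # so cute
```

Whether to collapse or strip probably depends on the keyword: stripping makes "cute!" and "cute!!!" count as the same word, while collapsing keeps the fact that an exclamation mark was used at all.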
The RegEx utilities available in AS3 will also make it easier to write a screen scraper, if necessary, to retrieve a larger set of comments for a video. Some popular videos have as many as 190,000 comments attached, and I would like to see more than the 1,000 that YouTube currently allows.
Stronger statistics
My working statistical knowledge comes from a shaky understanding from almost 10 years ago. In future I'd like to either learn more about statistics, or partner with someone who has a better understanding, to figure out what we can extract from this data. I hope that these kinds of analysis would be feasible to run over the data, for example:
Better graphics
This is really a brief experiment that got out of hand. I've furnished the youtuber with visual design in some places, but not everything has been thought out from a usability perspective, and of the things that have, very few have actually been integrated. One thing I'm looking forward to next time is doing this with the text:
It'll be features like this that will make this tool more entertaining to use. I look forward to it.
Wider input support
Currently the youtuber only takes YouTube video comments as input. This is because they struck me as the most obvious source of commentary on a simple topic, limited to a short response. Other places are almost as repetitive, though. I would like to either write extensions for these places, or assist a user in isolating the set of responses so the youtuber can run over them.
A shift from automatic to computer-assisted processing
There are things that computers do well, and things that computers do poorly. Currently, the youtuber tries to do both kinds of task. In future I would like to give users the option of undertaking the more human-suited tasks themselves, or of guiding the computer towards the best mode of execution. Hopefully this will allow us to do things like determine the multiple classes of comments, find representative comments in each one, and show how such a comment is typically answered.
Coupling with other data sources
A low-priority but interesting exercise lies in coupling this data, and its dates, with news events, search trends and other important external sources of context. A surprising event will often starkly change the context in which a video is viewed - for example, showing the date of Steve Irwin's death as a turning point for comments on a video about stingrays.
Using YouTube comment data for other things
I'm not sure how usable it would be, but it would be entertaining to use YouTube data as the source of information for an artificial intelligence agent on facts about the world. Such an agent would likely discover that "Cats" are "sooo cute", "Parents" are "horrible" and flashing lights in a video are "rofl", without understanding any broader context. It would be awesome!
Attempting to automate trailers for YouTube videos
YouTube videos are up to 10 minutes long - some older ones are even longer! This is a very long time for the modern Internet user to commit to without knowing anything about the video. Using some basic computer vision, and making assumptions about the content of the video from the keywords and the comments, it could be possible (and also extremely entertaining) to generate movie-style trailers for each one.
© 2008 Brandel Zachernuk, Brandy@TMbG.org