Every moment of television news since 2009, now online and fully searchable
[quote]Inspired by a pillar of antiquity, the Library of Alexandria, Brewster Kahle has a grand vision for the Internet Archive, the giant aggregator and digitizer of data, which he founded and leads.
“We want to collect all the books, music and video that has ever been produced by humans,” Mr. Kahle said.
As of Tuesday, the archive’s online collection will include every morsel of news produced in the last three years by 20 different channels, encompassing more than 1,000 news series that have generated more than 350,000 separate programs devoted to news.
The latest ambitious effort by the archive, which has already digitized millions of books and tried to collect everything published on every Web page for the last 15 years (that adds up to more than 150 billion Web pages), is intended not only for researchers, Mr. Kahle said, but also for average citizens who make up some of the site’s estimated two million visitors each day. “The focus is to help the American voter to better be able to examine candidates and issues,” Mr. Kahle said. “If you want to know exactly what Mitt Romney said about health care in 2009, you’ll be able to find it.”
Of course, if you want to discredit or satirize a politician based on a clip showing some reversal of a position, that will be made easier as well. Or, as Mr. Kahle put it, “Let a thousand Jon Stewarts bloom.”
Many conventional news outlets will be available, including CNN, Fox News, NBC News, PBS, and every purveyor of eyewitness news on local television stations. And Mr. Stewart’s program, “The Daily Show” is one of those 1,000 series that is part of the new news archive.
“Absolutely,” Mr. Kahle said. “We think of it as news.”
The Internet Archive has been quietly recording the news material from all these outlets, which means, Mr. Kahle said, capturing not only every edition of “60 Minutes” on CBS but also every minute of every day on CNN.
All of this will be available, free, to those willing to dive into the archive starting Tuesday. Mr. Kahle said the method for the search for information would be the closed-captioned words that have accompanied the news programs. The user simply plugs in the words of the search, along with some kind of time frame, and matches of news clips will appear.
Mr. Kahle predicted there would often be hundreds of matches, but he said the system had an interface that would make it easy to browse quickly through 30-second clips in search of the right one. If a researcher wants a copy of the entire program, a DVD will be sent on loan.
The inspiration of the Library of Alexandria, the archive of the knowledge in ancient world in Egypt, was not frivolous. Mr. Kahle said that early effort to assemble the collected works of civilization was in his mind when he conceived the idea to use the almost infinite capacity of the Web to pursue the modern equivalent.
“You could turn all the books in the Library of Congress into a stack of disks that would fit in one shopping cart in Best Buy,” Mr. Kahle said. He estimates that the Internet Archive now contains about 9,000 terabytes of data; by contrast, the digital collection of the Library of Congress is a little more than 300 terabytes, according to an estimate earlier this year.
Mr. Kahle calls himself a technologist and says he moved to the archive project after previously founding and selling off two data-mining companies, one to AOL, the other to Amazon.
The television news project, like his other archive projects, is financed mainly through outside grants, though Mr. Kahle did put up some of his own money to start. He said grants from the National Archives, the Library of Congress and other government agencies and foundations made up the bulk of the financing for the project. He set the annual budget at $12 million, and said about 150 people were working on the project.
The act of copying all this news material is protected under a federal copyright agreement signed in 1976. That was in reaction to a challenge to a news assembly project started by Vanderbilt University in 1968.
The archive has no intention of replacing or competing with the Web outlets owned by the news organizations. Mr. Kahle said new material would not be added until 24 hours after it was first broadcast. “We don’t expect this to replace CNN.com,” he said.
As enormous as the news collection is, it is only the beginning, Mr. Kahle said. The plan is to “go back” year by year, and slowly add news video going back to the start of television. That will require some new and perhaps more challenging methodology because the common use of closed-captioning only started around 2002.
Mr. Kahle said some new technique, perhaps involving word recognition, would be necessary. “We need some interface that is good enough and doesn’t interrupt commerce enough that they get upset with us.”
But the goals for the news service remain as ambitious as all the other services the Internet Archive has embarked upon.
“Yes, we want eventually to be able to make coverage of, say, the 1956 political conventions available,” Mr. Kahle said.
This article has been revised to reflect the following correction:
Correction: September 17, 2012
An earlier version of the headline with this article misstated the status of Internet Archive. It was founded in 1996 and is not a start-up.
A version of this article appeared in print on September 18, 2012, on page B1 of the New York edition with the headline: All the TV News Since 2009, on One Web Site.[/quote]
[url=http://www.nytimes.com/2012/09/18/business/media/internet-archive-amasses-all-tv-news-since-2009.html]Source: The New York Times[/url]
[url=http://archive.org/details/tv]A link to the archive[/url]
I've been messing around with the archive for a few hours. If you search for things like "ABC news special report" or "NBC news special report," you can watch every major breaking news story of the past three years as it happened.
I wonder how many hits "murder" will pull up
Edit: About 100,204
Quite cool. Hopefully they'll be able to expand to even older episodes of shows and more programs.
Anyone else getting videos that cut out a third of the way into it? Because that keeps happening to me. Pretty annoying.
[QUOTE=Ericson666;37727530]I wonder how many hits "murder" will pull up
Edit: About 100,204[/QUOTE]
3,426 results for "socialist" AND "Obama." Guess which network mentions it the most?
[editline]19th September 2012[/editline]
[QUOTE=Pat4ever;37727682]Anyone else getting videos that cut out a third of the way into it? Because that keeps happening to me. Pretty annoying.[/QUOTE]
Click "More/Borrow." Most shows are cut into 30-second clips for some reason (probably to discourage scraping).
I think this could be politically useful.
-snip-
Apparently it was autofill from the school computer
this is brilliant
[QUOTE=Mr. Smartass;37728319][IMG]http://i.imgur.com/ry0lE.png[/IMG]
Who did it[/QUOTE]
No one. That's your browser's autofill.
[IMG]http://i.imgur.com/LyIhy.png[/IMG]
obscure as fuck
[QUOTE=SPESSMEHREN;37728343]No one. That's your browser's autofill.[/QUOTE]
That's completely bizarre because this is a school computer that I have never used before
[QUOTE=Mr. Smartass;37728382]That's completely bizarre because this is a school computer that I have never used before[/QUOTE]
looks like another facepuncher goes to your school then. :v:
im an idiot that was just a snipped bit at the beginning of a show time
Reminds me of the floating brains from Futurama.
What is a square peg?
[url]http://archive.org/details/WRC_20100228_150000_The_Chris_Matthews_Show#start/59/end/89[/url]
[QUOTE=Mr. Smartass;37728382]That's completely bizarre because this is a school computer that I have never used before[/QUOTE]
are you logged into chrome? It syncs your autofill and stuff.
[QUOTE=BlkDucky;37728535]are you logged into chrome? It syncs your autofill and stuff.[/QUOTE]
Nope, someone else here probably uses Facepunch.
Anyway, back to the article:
The idea of immortalizing all of man's work is truly something amazing.
i typed in shooting
i got this:
[url]http://archive.org/details/WBFF_20090710_093000_Fox_45_Early_Edition#start/1781/end/1811[/url]
"normally when i'm led away by 12 guys dressed as policemen it's like a birthday treat, this was very different"
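For anyone who wants to pull these clip links apart: going only by the two links posted in this thread, the URLs look like station, air date, air time and show title joined by underscores, with a `#start/N/end/M` fragment giving the clip window in seconds. That layout is a guess from those two examples, not a documented archive.org format, and `parse_clip_url` is just a name made up for this sketch.

```python
import re
from datetime import datetime

# Pattern guessed from the two links in this thread (an assumption,
# not a documented archive.org URL scheme).
CLIP_RE = re.compile(
    r"archive\.org/details/"
    r"(?P<station>[A-Z0-9]+)_(?P<date>\d{8})_(?P<time>\d{6})_(?P<title>[^#]+)"
    r"#start/(?P<start>\d+)/end/(?P<end>\d+)"
)


def parse_clip_url(url):
    """Split a TV archive clip link into its parts, or return None if it doesn't fit."""
    m = CLIP_RE.search(url)
    if m is None:
        return None
    start, end = int(m["start"]), int(m["end"])
    return {
        "station": m["station"],
        # The time zone of this timestamp is unknown, so it is left naive.
        "aired": datetime.strptime(m["date"] + m["time"], "%Y%m%d%H%M%S"),
        "title": m["title"].replace("_", " "),
        "start": start,            # seconds into the program
        "end": end,
        "length": end - start,     # clip length in seconds
    }


if __name__ == "__main__":
    link = ("http://archive.org/details/"
            "WBFF_20090710_093000_Fox_45_Early_Edition#start/1781/end/1811")
    print(parse_clip_url(link))
```

Both links in the thread come out as 30-second windows, which fits the clip length mentioned earlier.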