When it comes to finding a printed or recorded resource for educational, pleasure or business purposes, nothing beats a trip to the good old neighborhood library—that vast, utilitarian warehouse of lendable published works just waiting to be discovered.
But in the 21st century, when time-strapped citizens have come to rely on instant access to information on an increasing array of high-tech devices, that quest for knowledge and entertainment doesn’t have to involve getting in the car, searching through the stacks, and waiting in the checkout line. Fortunately, there’s a digital library you can visit, loaded with priceless materials that can save you a trip to your favorite brick-and-mortar media depot and possibly save you money otherwise spent at Amazon, Netflix and iTunes, too.
Founded in 1996 by Brewster Kahle, the San Francisco-based nonprofit Internet Archive began life with a bold goal: to offer the masses gratis access to a wealth of digital materials and collections and serve as the online athenaeum of choice to the world, much as Wikipedia functions as the definitive complimentary electronic encyclopedia.
Shipping containers inside Internet Archive’s Physical Archive building. The containers are used for high density storage of physical media after it has been digitized. The Physical Archive contains more than a million books, plus tens of thousands of reels of film, LP records, VHS tapes, and other types of physical media.
The Internet Archive has something for everyone
Today, visitors from across the globe retrieve countless digitized books, films, TV clips, websites, software, music and audio files, photos, games, maps, court/legal documents and more through the no-charge Internet Archive.
- Researchers, journalists, students and nostalgia buffs alike tap into the site’s Wayback Machine—a repository of 472 billion retired and indexed web pages—and its TV News Archive.
- Over 400 organizations use Internet Archive’s Archive-It subscription web archiving service to gather, build and preserve digital content collections.
- Avid readers tap into the Open Library e-book lending program.
- The Live Music Archive, LibriVox Free Audiobook Collection, and Old Time Radio show compilation entertain ears everywhere.
- Computer nerds gravitate to the Software Collection, the planet’s largest vintage and historical software library boasting millions of CD-ROM images, programs and even antique video games like Oregon Trail, played more than 2 million times on Internet Archive’s site.
- Curious consumers indulge in old exploitation films like Reefer Madness, campy public service announcements from the 1950s, and even breathtaking interstellar images.
- And the brand-new Political TV Ad Archive attracts innumerable political news junkies.
Among its impressive accomplishments over the past 20 years, consider that the Internet Archive:
- has amassed approximately 25 petabytes of data, which includes 470 billion web captures, 8 million ebooks and texts, 2.5 million audio items (including 150,000 live concerts), 2.2 million movies and videos, 1 million images, 1 million TV news broadcasts, and 100,000 software items
- digitizes approximately 1,000 physical books per day in 30 centers on 5 continents
- archives roughly 1 billion web captures each week
- enjoys 2 to 3 million visitors per day
“The Internet Archive started as more of a warehouse for digital materials that researchers could draw on, but we have evolved to have a web presence as well,” says Kahle, whose objective is to collect a copy of every book ever published with the help of Internet Archive’s Physical Archive, established in 2011. “We are now in the top 300 websites, according to Alexa Internet.”
A robust repository from the inside out
Alexis Rossi, Internet Archive’s director of Media and Access, says storing 25 petabytes’ worth of digital goods on its own servers can be downright difficult.
“It’s actually more than 50 petabytes, since all our media is stored at least twice in different physical locations,” says Rossi, who notes that the Internet Archive maintains data centers at its San Francisco headquarters building and Physical Archive buildings in Richmond, California. In addition, there are partial copies of archive data in Amsterdam and at the Library of Alexandria in Egypt. “We have tens of thousands of hard drives, so there is a constant flow of drives failing that need to be replaced quickly. We also do audits of the files to make sure we aren’t suffering from bit rot.”
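For readers curious what a file audit of this kind might look like, here is a minimal Python sketch. It is an illustration only, not the Internet Archive’s actual tooling: the function names, the two-directory layout and the manifest of previously recorded SHA-256 digests are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large media files never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_replicas(primary: Path, replica: Path, expected: dict[str, str]) -> list[str]:
    """Return "<location>/<file>" for every copy that is missing or altered.

    `expected` is a hypothetical manifest mapping each file name to the
    SHA-256 digest recorded when the file was first stored.
    """
    damaged = []
    for name, recorded in sorted(expected.items()):
        for root in (primary, replica):
            copy = root / name
            # A copy fails the audit if it is gone or its bytes have changed.
            if not copy.is_file() or sha256_of(copy) != recorded:
                damaged.append(f"{root.name}/{name}")
    return damaged
```

Because every file lives in two places, a copy that fails the check could in principle be restored from its intact twin, which is the practical payoff of storing everything at least twice.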
Racks of hard drives in Internet Archive’s headquarters data center.
But the biggest challenge of them all? “Keeping media accessible to the public. When new browsers, tablets or phones come on the market, file formats can go out of date quickly, so they need to be updated for the latest devices,” Rossi says.
None of this would be possible, of course, without the tireless efforts of Internet Archive’s dedicated team of 150 employees and countless volunteers, all serving a noble cause, Rossi insists.
“We believe it’s crucial to provide free access to information. Our society evolves because of information, and everything we learn or invent or create is built upon the work of others,” adds Rossi. “In a digital age, when everything is expected to be online, we need to make sure the best resources are available. The human race has centuries of valuable information stored in physical libraries and personal collections, but we need to ensure that all of it is online in some form.”
While people can locate media from many different important sources on the Net—YouTube, Spotify, Flickr, etcetera—“these are not libraries dedicated to keeping knowledge safe and accessible for the future,” says Rossi. “We have seen many commercial resources die off over time, including Yahoo Video, Posterous, and MobileMe. Unfortunately, when a company goes out of business, or simply decides that it’s no longer in their interest to provide a service, media disappears. But the knowledge contained in the Internet Archive will not disappear.”
A librarian’s paradise
Jessamyn West, a library consultant and community liaison for the Open Library project, says she’s proud to be among Internet Archive’s legion of contributing librarians and Good Samaritans.
“The Internet Archive is deeply committed to free culture and sharing as much as possible,” says West. “We wrangle big grants, maintain tons of servers, get loads of volunteers and people working on important projects, and are making the content they archive as findable and discoverable as possible by using metadata such as MARC (machine-readable cataloging) records.”
Many private and commercial entities that have quality content “want to lock it up and sell you access to it,” she adds. “We make it available for free, and that’s especially important to the underprivileged and to people in other countries who may not have free access to information. This kind of access has great value, because knowledge is power.”
An Internet Archive Scribe book digitization machine in use. Photo by David Rinehart. Copyright Internet Archive. Used with permission.
John Wiggins, director of Library Services and Quality Improvement at Drexel University, marvels at Internet Archive’s scope and influence. He says the site’s vision to preserve and provide access to selected pages from the ephemeral, continually changing web has created a historic timeline and reference source invaluable to those seeking simple answers as well as those looking to understand the changes in culture over time.
“As an information professional and a researcher at an academic university with a strong civic mission, the high value and cost of access to authoritative resources via vendor gateways—selected and managed for the academic community by the library to support the work of faculty as teachers and researchers/knowledge creators, and students as learners—contrasts sharply with the amount and quality of information available on the Internet to the average citizen,” says Wiggins. “The Internet Archive helps address this gap with free access to local history and other pages describing recent and more distant events, and broad coverage of changing web resources worldwide.”
And that’s important, Wiggins continues, because as our population ages along with the Internet, “the senior citizen of the future or the student in middle school can expect to access the Internet Archive as a memory book of personally significant events in a way that has never before been available.”
Much as the Public Broadcasting Service and National Public Radio rely on community contributions, the Internet Archive depends on public kindness to keep its hard drive platters spinning and the electricity turned on.
Interested in contributing? Volunteer your services or make a financial donation. You can also donate digital media collections, such as ebooks, movies or audio, that you own and want to share with the public: just register for a free account and hit “upload.”