{% extends "base.html" %} {% block overhead %}{% endblock %} {% block path %}{% endblock %} {% block pagetitle %}{% endblock %} {% block postnav %}
Internet Archive is a non-profit library holding millions of books, websites, recordings, and other digitized works.
Since 1996 our mission has been to ensure Universal Access to All Knowledge.
300 Funston Ave, Inner Richmond
San Francisco, CA, USA
From the Blog
Twitter (inevitably)
Subscribe to Newsletter:
We built our own scanning hardware and digital workflow, and have dozens of workers in centers worldwide. We work with partners and have XYZ books so far.
News is important. Who said what? We have been recording since 2009 and have a lot of it. Search by captions, see what you can find, and hold everybody accountable! Check out these examples.
The web is flux-y and always disappearing, yet it is a big part of 20th/21st century cultural heritage. We use Heritrix to crawl the web a whole bunch, collecting big numbers of pages and petabytes of data.
Some partners submit data or pay, some use Archive-It.
All results available via Wayback Machine.
Archive a page now:
We are pretty into storing data for a very long time. We own our own real estate, hardware, etc. instead of using cloud storage. That saves money, and the waste heat warms the building. We have a couple of sites and lots of disks. Great good, some details.
Physical artifacts are cool as well, so we have a bunch of those over in the East Bay. XYZ tons! Neato!
We can't decide everything to crawl ourselves, so we partner with universities, libraries, and other organizations, who provide funding and lists of what to crawl. Then we go and do it, archive it forever, and usually provide it as part of the Wayback Machine.
Over a petabyte so far, thousands of partners.
We have a great and unique relationship with the radical, self-organized Archive Team group, which goes out and saves the web. Hooray! E.g., 301works, URLteam.
Sometimes other archives (like Prelinger, George Blood, etc.) work with us to do digitization and physical archiving. Great content, big win!
We go way back (get it?) with the Alexa search engine. In the early days they provided most of our content, and they still provide a bunch. Thanks!
We're in the USA and have done some crawling with the Library of Congress and the National Archives and Records Administration (NARA).
We also work with national libraries and governments from around the world!
Here is where we name-drop the big private foundations that often fund feature development and high-impact projects. Thanks!
Want to use our huge set of data in a research context? Great. Contact us.
It's great when we find online communities that have already organized content. Sometimes it's a big collection, sometimes it's user-generated, like Stack Overflow or Reddit.
"Lock it open"!
Funded by viewers like you!
We also accept some equipment donations, and you can volunteer.
You can volunteer, or just start uploading random furry convention photos from your laptop right now! Or download and make copies, use our APIs, whatever!
Please read our terms. Don't get us sued or raided, and don't melt our servers or consume thousands of dollars of disk without asking first.