From 21f57d5ba87b32c6278322fab7db197c876520e8 Mon Sep 17 00:00:00 2001
From: Bryan Newbold
Date: Tue, 21 Nov 2017 15:57:23 -0800
Subject: things i've learned at archive.org

---
 archive_org.page | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
 create mode 100644 archive_org.page

(limited to 'archive_org.page')

diff --git a/archive_org.page b/archive_org.page
new file mode 100644
index 0000000..b2a5323
--- /dev/null
+++ b/archive_org.page
@@ -0,0 +1,14 @@
+
+### Secret archive.org Tricks
+
+- wayback datetime tricks: `*`, `0`, `2`, year, etc.
+- wayback site overview
+- `id_` suffix to allow indexing into a zip archive
+- s3 interface (fewer access controls, etc)
+- .pages.archivelab.org
+- IIIF interface
+- `ia` command-line tool
+- iaminer
+- public CDX API
+- raw WARC files (for some crawls)
+- getting raw/original wayback file (can't remember...)
-- 
cgit v1.2.3
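
The datetime and CDX items in the list added by this patch are terse, so here is a minimal sketch of what those URLs might look like. It only builds strings and makes no network requests. The helper names `wayback_url` and `cdx_query_url` are hypothetical, and the URL layouts are assumptions based on publicly visible Wayback Machine behaviour rather than anything stated in the patch: a partial timestamp such as a year is assumed to resolve to the nearest capture, `0` to the earliest, `2` to the most recent, and `*` to a capture listing. The `id_` modifier is shown here in the role usually attributed to it (returning the capture without Wayback's link rewriting), which may be what the last list item is trying to recall; the patch itself describes `id_` differently.

```python
from urllib.parse import urlencode

# Assumed public endpoints; not taken from the patch itself.
WAYBACK_PREFIX = "https://web.archive.org/web"
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def wayback_url(target, timestamp="2", modifier=""):
    """Build a Wayback URL for `target` (hypothetical helper).

    `timestamp` may be a full or partial YYYYMMDDhhmmss value, or one of
    the shorthand values from the list ("*", "0", "2").
    `modifier` is an optional suffix appended to the timestamp, e.g. "id_".
    """
    return f"{WAYBACK_PREFIX}/{timestamp}{modifier}/{target}"


def cdx_query_url(target, limit=5):
    """Build a CDX index query URL asking for JSON rows (hypothetical helper)."""
    params = {"url": target, "output": "json", "limit": limit}
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    page = "http://example.com/"
    print(wayback_url(page, "2017"))             # capture nearest to 2017
    print(wayback_url(page, "0"))                # assumed: earliest capture
    print(wayback_url(page, "2"))                # assumed: latest capture
    print(wayback_url(page, "*"))                # assumed: capture listing
    print(wayback_url(page, "20171121", "id_"))  # assumed: unrewritten capture
    print(cdx_query_url("example.com"))
```

Keeping the helpers as pure string builders makes the shorthand easy to check by hand in a browser before wiring it into anything that actually fetches pages.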