author Bryan Newbold <bnewbold@archive.org> 2019-05-02 14:32:48 -0700
committer Bryan Newbold <bnewbold@archive.org> 2019-05-02 14:32:48 -0700
commit ab48873debef25fefac9a22b7a80d2aa742d0d96 (patch)
tree a61deaa5662656288170883cc9908e09b5953e80 /notes.md
init with notes
Diffstat (limited to 'notes.md')
1 files changed, 76 insertions, 0 deletions
diff --git a/notes.md b/notes.md
new file mode 100644
index 0000000..0b94fbc
--- /dev/null
+++ b/notes.md
@@ -0,0 +1,76 @@
+Bryan's notes on setting up an "official" Tor onion service (formerly known as
+"hidden service") gateway for the Internet Archive.
+Goal is to have one (or more) onion service gateways running on IA hardware
+within the IA network to provide access to archive.org content and the wayback
+machine. Rough order of feature importance:
+- browse and download files from archive.org
+- browse wayback machine
+- read the blog
+- request wayback captures via SPN
+- create account and login to archive.org; upload files
+- access the S3 upload endpoint, e.g. via the `ia` tool
+- browse/borrow/use openlibrary.org
+- fatcat.wiki (Bryan's paper project; can use as a demo/example)
+## Current Situation (May 2019)
+An external, unofficial volunteer runs an onion service:
+- <http://archivecrfip2lpi.onion/>
+- <http://archivebyd3rzt3ehjpm4c3bjkyxv3hjleiytnvxcn7x32psn2kxcuid.onion>
+- <https://www.hackerfactor.com/blog/index.php?/archives/750-Freedom-of-Information.html>
+- <https://www.hackerfactor.com/blog/index.php?/archives/762-Attacked-Over-Tor.html>
+- <https://www.hackerfactor.com/blog/index.php?/archives/763-The-Continuing-Tor-Attack.html>
+This is done via a custom PHP script:
+- <https://www.hackerfactor.com/src/iaproxy.php.txt>
+archive.is runs an onion service.
+## Onion Service Resources / Docs
+Riseup guide: <https://riseup.net/en/security/network-security/tor/onionservices-best-practices>
+EOTK ("Enterprise Onion ToolKit"): <https://github.com/alecmuffett/eotk>
+Tor Project onion service overview: <https://2019.www.torproject.org/docs/onion-services.html.en>
+Tor Project onion service v3 announcement (November 2017): <https://blog.torproject.org/tors-fall-harvest-next-generation-onion-services>
+## IA-specific notes
+Running from the "office" network instead of "cluster" network might be best:
+local routing, but doesn't become a way to bypass our IP-range cluster
+- get a proper SSL EV certificate... wildcard? for the onion address
+- monitoring/alerting
+EOTK notes:
+- upgrading to the new onion services ("v3", "Prop 224", longer keys, better
+ crypto, etc) seems non-trivial: EOTK depends on onionbalance which depends on
+ stem. [EOTK issue](https://github.com/alecmuffett/eotk/issues/23),
+ [onionbalance issue](https://github.com/DonnchaC/onionbalance/issues/69), [onionbalance notes](https://onionbalance.readthedocs.io/en/latest/design.html#next-generation-onion-services-prop-224-compatibility)
+- need to enumerate all "sub-domain stems" (like us.archive.org), but
+ apparently not every host name
+- multi-machine configs are nice, though realistically in 2019 if any
+ datacenter is down then probably all of our services are down
+ ("anti-high-availability"), so it is not much help if the onion service
+ stays up. The "hardmap 2" setup, where the second device might not even be
+ live/active but is ready in case of hardware issues with the first, might
+ be easiest
+### archive.org
+Lots of host names!
+### web.archive.org
+Multiple layers of re-write!
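The host-name mapping that the last two sections hint at can be sketched roughly as below. This is a minimal Python illustration under stated assumptions, not how EOTK does its rewriting: `ONION_DOMAIN` is a made-up placeholder (a real v3 onion address is 56 base32 characters), and `to_onion_host` is a hypothetical helper name.

```python
# Assumption: placeholder onion domain for illustration only, not a real address.
ONION_DOMAIN = "archiveexample.onion"
CLEARNET_DOMAIN = "archive.org"


def to_onion_host(host: str) -> str:
    """Map archive.org and any *.archive.org host onto the onion domain.

    Hosts outside archive.org are returned unchanged (apart from being
    lower-cased and stripped of a trailing dot).
    """
    host = host.lower().rstrip(".")
    if host == CLEARNET_DOMAIN:
        return ONION_DOMAIN
    suffix = "." + CLEARNET_DOMAIN
    if host.endswith(suffix):
        # Keep the sub-domain labels (e.g. "web", "us") and swap the base domain.
        return host[: -len(suffix)] + "." + ONION_DOMAIN
    return host
```

A suffix match like this covers every sub-domain without listing each host name, which is roughly why only the sub-domain stems need enumerating. Rewriting URLs embedded in page bodies, as web.archive.org would need, is a separate and harder problem that this sketch does not attempt.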