Bryan's notes on setting up an "official" Tor onion service (formerly known as
"hidden service") gateway for the Internet Archive.

The goal is to have one (or more) onion service gateways running on IA hardware
within the IA network to provide access to archive.org content and the Wayback
Machine. Rough order of feature importance:

- browse and download files from archive.org
- browse wayback machine
- read the blog
- request wayback captures via SPN
- create account and login to archive.org; upload files
- access the S3 upload endpoint, e.g. via the `ia` tool
- browse/borrow/use openlibrary.org
- fatcat.wiki (Bryan's paper project; can use as a demo/example)

## Current Situation (May 2019)

An external, unofficial volunteer runs an onion service:

- <http://archivecrfip2lpi.onion/>
- <http://archivebyd3rzt3ehjpm4c3bjkyxv3hjleiytnvxcn7x32psn2kxcuid.onion>
- <https://www.hackerfactor.com/blog/index.php?/archives/750-Freedom-of-Information.html>
- <https://www.hackerfactor.com/blog/index.php?/archives/762-Attacked-Over-Tor.html>
- <https://www.hackerfactor.com/blog/index.php?/archives/763-The-Continuing-Tor-Attack.html>

This is done via a custom PHP script:

- <https://www.hackerfactor.com/src/iaproxy.php.txt>

archive.is also runs an onion service.

## Onion Service Resources / Docs

Riseup guide: <https://riseup.net/en/security/network-security/tor/onionservices-best-practices>

EOTK ("Enterprise Onion ToolKit"): <https://github.com/alecmuffett/eotk>

Tor Project onion service overview: <https://2019.www.torproject.org/docs/onion-services.html.en>

Tor Project onion service v3 announcement (November 2017): <https://blog.torproject.org/tors-fall-harvest-next-generation-onion-services>

## IA-specific notes

Running from the "office" network instead of the "cluster" network might be
best: routing stays local, but the gateway doesn't become a way to bypass our
IP-range cluster firewalls.

Tasks:

- get a proper SSL EV certificate (wildcard?) for the onion address
- monitoring/alerting

EOTK notes:

- upgrading to the new onion services ("v3", "Prop 224", longer keys, better
crypto, etc.) seems non-trivial: EOTK depends on onionbalance, which depends on
stem. [EOTK issue](https://github.com/alecmuffett/eotk/issues/23),
[onionbalance issue](https://github.com/DonnchaC/onionbalance/issues/69), [onionbalance notes](https://onionbalance.readthedocs.io/en/latest/design.html#next-generation-onion-services-prop-224-compatibility)
- need to enumerate all "sub-domain stems" (like us.archive.org), but
apparently not every host name
- multi-machine configs are nice, though realistically in 2019 if any
datacenter is down then probably all of our services are down too
("anti-high-availability"), so keeping the onion service up would not help
much. The "hardmap 2" setup, where the second device might not even be
live/active but is ready in case of hardware issues with the first, might be
easiest
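
To make the "sub-domain stems" point concrete, here is a minimal Python sketch of suffix-based host mapping. It is not how EOTK implements this, only an illustration of why listing stems can be enough without listing every host. The onion address is a hypothetical placeholder, the host `ia800100` is made up, and only the `us` stem comes from these notes; `web` and `blog` are assumptions.

```python
from typing import Optional

CLEARNET_BASE = "archive.org"
# Hypothetical placeholder, not a real onion address.
ONION_BASE = "exampleonionaddress.onion"
# Assumed stems: only "us" is named in these notes; "web" and "blog" are guesses.
STEMS = {"us", "web", "blog"}


def to_onion_host(host: str) -> Optional[str]:
    """Map a clearnet host name to an onion host name, or None if not covered."""
    host = host.lower().rstrip(".")
    if host == CLEARNET_BASE:
        return ONION_BASE
    suffix = "." + CLEARNET_BASE
    if not host.endswith(suffix):
        return None
    prefix = host[: -len(suffix)]      # e.g. "ia800100.us"
    stem = prefix.rsplit(".", 1)[-1]   # right-most label, e.g. "us"
    if stem not in STEMS:
        return None
    # Any host under a known stem maps without being listed individually.
    return prefix + "." + ONION_BASE
```

With this shape, a new host such as `ia800100.us.archive.org` is covered as soon as `us` is in the stem list, while a host under an unlisted stem is left unmapped.
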
### archive.org
Lots of host names!
### web.archive.org
Multiple layers of re-write!
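
One of those layers, sketched in Python under stated assumptions: Wayback URLs embed the originally captured URL in their path, so a gateway rewrite probably wants to swap only the outer host and leave the embedded URL alone. The onion host names below are hypothetical placeholders and the function name is made up for illustration; this is not taken from any existing gateway code.

```python
from urllib.parse import urlsplit

# Hypothetical placeholders, not real onion addresses.
HOST_MAP = {
    "archive.org": "exampleonionaddress.onion",
    "web.archive.org": "web.exampleonionaddress.onion",
}


def rewrite_outer_host(url: str) -> str:
    """Swap only the outer host; the captured URL inside the path is untouched."""
    parts = urlsplit(url)
    onion_host = HOST_MAP.get((parts.hostname or "").lower())
    if onion_host is None:
        return url  # not one of our hosts, leave as-is
    # Note: replacing netloc drops any explicit port or userinfo.
    return parts._replace(netloc=onion_host).geturl()
```

For example, `https://web.archive.org/web/20190501000000/https://archive.org/about` keeps its embedded `https://archive.org/about` while the outer host becomes the onion one. A naive global search-and-replace on `archive.org` would rewrite both.
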