ideas/ai_risk_mitigations.txt



Artificial Intelligence Risk Mitigation
============================================

My main threat model is fast market agents in the hands of authoritarian or
sociopathic/robber-baron hostile parties who attack economies and societies. We
already have these in the form of high-speed trading corporations and
disinformation campaigns by secret police, organized crime, and nation-states.
But these could be carried out far faster and more effectively with another
generation of machine learning / deep learning / whatever, without requiring
new tech or even that many resources. Really more of an augmentation thing than
an AGI thing (though who knows how it could emerge).

A lot of people seem to care about/fear the "superintelligence" thing. I think
this is a bogeyman or red herring in most cases, but I'll also mention some
things that could hedge against it, as much to see whether people actually care
about addressing this risk, or are more interested in discussing how cool it
would be, or are using it as a distraction from higher-expected-value risks
(like the above, or bio/chem/nuke weapons, or climate/ecological/resource
collapse).

## Intensive Compute

Intensive compute currently requires intensive energy consumption, and weird
compute requires custom silicon. Both resources can be tracked.

Silicon fabs are scarce; a neutral international body could review all output
of high-end fabs looking for AI-specific devices. I think there are only like
5 regions/institutions in the world that do sub-20nm fabrication.

Difficulties:

- wouldn't it look a lot like bitcoin mining? (custom ASICs, huge power
  consumption)
- AFAIK, with current tech the heavy effort is only around training, not actual
  deployment of neural net techniques. Training can be async and distributed,
  with execution on small generic hardware? But one could still monitor
  "efforts"/research.
- intelligence agencies do a lot of sketchy monitoring using custom silicon and
  probably don't want to be monitored, even by a "neutral" body. Note that,
  unlike the nuclear weapons industry, intelligence agencies are probably
  committing actual illegal/unaccountable acts, while weapons work was only
  secret to control spread of knowledge to the "enemy" and had civilian
  oversight

Refs:

Cory Doctorow (?) short story about an international monitoring service looking
for waste heat from rogue/unlicensed "big data" operations, using satellite
infrared cameras.

## Civic Institutional Resiliency

If we consider AGI/superintelligence as a potentially threatening power, but
only in abstract/informational ways to start, it seems obvious that general
civic and infrastructural strength is a good hedge.

Eg, core infrastructure air gapped from the net, defense-in-depth for networked
devices, strong prevalent crypto (for things like government announcements,
journalism, social media), robust voting systems, etc. Basically, look at the
CIA/Putin handbook for disrupting other countries, and make sure we are more
robust against those sorts of "dirty" campaigns and manipulation.

There's also technical resiliency: I think it's pretty widely acknowledged that
the current state of software, and "security engineering" in particular, is a
general shitshow: almost everything has 0-days floating around, etc. Doing a
Bell Labs-like effort to reset the norms, culture, and standards of the field
(combined with clear guidelines and tools) could make software much more robust
and secure (in my opinion). Bell Labs was rare/expensive, but not *that*
rare/expensive in the big picture (eg, compared to defense spending and gonzo
secret projects).

## Slow Down Feedback Loops

A commonly cited fear about superintelligence is that it could operate "really
fast". There are a number of places in society that we could rate-limit and
bring the tempo down to a human pace:

- markets (trading)
- changes to internet infrastructure, like BGP (largely in place already, I
  think)
- almost all forms of bureaucracy or API could have sane rate limits

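As a rough illustration of what a "sane rate limit" on an API or market
endpoint could look like, here is a minimal token-bucket sketch. The class
name, parameters, and injectable clock are all made up for this example; it is
one common way to cap tempo, not a claim about how any real exchange or
registry does it.

```python
import time


class TokenBucket:
    """Allow `rate` actions per second on average, with bursts up to `capacity`.

    Hypothetical sketch: names and defaults are illustrative assumptions.
    """

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)          # tokens refilled per second
        self.capacity = float(capacity)  # maximum burst size
        self.clock = clock               # injectable so it can be tested
        self.tokens = float(capacity)    # start with a full bucket
        self.last = clock()

    def allow(self, cost=1.0):
        """Return True and spend `cost` tokens if the action may proceed."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Setting `rate` to something a human could plausibly keep up with (say, a few
actions per minute per account) is the point: bursts are tolerated, but the
sustained tempo is bounded.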
Our legal/governance systems often have this baked in because those systems are
already skeptical of "mobs" and disinformation. Checks and balances are a form
of containment.

A broader analysis of "power in the world", with an early warning that any one
entity (company, government, whatever) was gaining a controlling influence in
any resource, would be interesting as a general feedback safety thing.
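A toy version of that early warning, assuming we somehow had a table of who
holds how much of a resource: compute each entity's share, a
Herfindahl-Hirschman-style concentration index (sum of squared shares), and
flag anyone above a threshold. The function names, the input format, and the
50% threshold are all assumptions for illustration; getting trustworthy data is
the actual hard part.

```python
def concentration_index(holdings):
    """Sum of squared shares: near 0 when spread out, 1.0 for a monopoly."""
    total = sum(holdings.values())
    if total <= 0:
        raise ValueError("holdings must sum to a positive amount")
    return sum((amount / total) ** 2 for amount in holdings.values())


def early_warnings(holdings, share_threshold=0.5):
    """Names of entities whose share is at or above `share_threshold`.

    The default threshold is an arbitrary illustrative choice.
    """
    total = sum(holdings.values())
    if total <= 0:
        raise ValueError("holdings must sum to a positive amount")
    return sorted(name for name, amount in holdings.items()
                  if amount / total >= share_threshold)


# Hypothetical example data: units of some resource held per entity.
example = {"entity_a": 50, "entity_b": 25, "entity_c": 25}
flagged = early_warnings(example)
index = concentration_index(example)
```

Tracking how the index and the flagged list change over time, rather than a
single snapshot, is what would make this a feedback signal.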

## Culture

I'm pretty confused about OpenAI, because it nominally is trying to de-risk AI,
but it's basically just trying to advance the field (but be in the thick of
it).

It could instead visit labs around the world and issue reports, publish
something like the "N minutes to midnight" (Bulletin of the Atomic Scientists),
hold ethics debates and conferences, develop a code of ethics and get
researchers to sign on, start student chapters at universities, lobby and
consult with governments, draft regulations, call out "red flags" and
tripwires, etc. Achieving a broad cultural shift is hard, but way more
leveraged than trying to get 100 people in a building to "solve the problem" or
whatever.

It's probably just the case that the organization's goal is not what its
publicly stated goal is (whether it knows that or not).

Existence proofs of this strategy working are, I think, human cloning (pretty
broad taboo; only a tiny fraction of the people who could work on this are
doing so, AFAIK), chemical weapons, and to a large degree nuclear weapons (hard
to recruit).

## Legibility

Require automated systems controlling core infrastructure and markets to be
human-meaningful: no black boxes. No neural nets with direct control over grid
power pricing.

This isn't directly out of fear that these systems would be "superintelligent"
on their own, but that we wouldn't be able to debug and figure out if they had
been manipulated, tampered with, or remotely reverse engineered in an
info-crisis situation. Eg, a superintelligence is more likely to be able to
understand and manipulate "black boxes" than we are.

## Comparison to Nuclear Regulation

Monitoring and regulation of nuclear technologies seem to have been largely
successful at tracking and observing (if not necessarily at really containing
proliferation or, most importantly, at *reducing the risk* of war as opposed to
*preventing the growth of risk*).

Things that maybe worked well then:

- detection of test detonations in a variety of ways; almost impossible to
  hide?
- advanced, specific machinery as a bottleneck
- control/monitoring of physical materials
- huge power and/or radiation required for refinement process
- "delivery systems" monitored/controlled in parallel (and don't have
  difficulty)

Refs:
- The Making of the Atomic Bomb, Dark Sun
- The Curve of Binding Energy (out of date but interesting as a snapshot in
  time)
- Ellsberg's The Doomsday Machine
- Bertrand Russell book (TODO)