Title: Communication and Reuse of Mathematical Models
Author: bnewbold
Date: 2020-06-28
Tags: modelthing
Status: draft

This post describes the potential I see for collaborative infrastructure to
augment group research and understanding of mathematical models. This type of
model, consisting of symbolic equations that can be manipulated and computed by
both humans and machines, has historically been surprisingly effective at
describing the natural world. A prototype exploring some of these ideas is
running at [modelthing.org](https://modelthing.org).

After explaining why this work is interesting and important to me personally, I
will sketch a vision of what augmentation systems might look like, survey some
existing tools, and finally propose some specific tools to build and research
questions to answer.

Outline

* personal backstory
    => technologist essay
    => my previous work
* what would be better?
* existing ecosystem
    => latex, mathml
    => modelica
    => SBML
* proposed system and research questions
    => modelthing.org
* reference list

## Personal Backstory

*Feel free to skip this section*

Much of my university (undergraduate) time studying physics was spent exploring
computational packages and computer algebra systems to automate math.  These
included general purpose computer algebra or numerical computation systems like
Mathematica, MATLAB, Numerical Recipes in C, SciPy, and Sage, as well as
real-time data acquisition or simulation systems like LabVIEW, ROOT, Geant4,
and EPICS. I frequently used an online system called Hyperphysics to refresh my
memory of basic physics and make quick calculations of things like Rayleigh
scattering, and often wished I could contribute to and extend that website to
more areas of math and physics.  In some cases these computational resources
made it possible to skip over learning the underlying methods and math. A
symptom of this was submitting problem set solutions typeset on a computer
(with LaTeX), then failing to solve the same problems with pen and paper in
exams.

<div class="sidebar">
<img src="/static/fig/sicm_cover.jpg" width="150px" alt="SICM book cover"><br>
</div>

A particularly influential experience late in my education was taking a course
on classical mechanics using the Scheme programming language, taught by the
authors of "Structure and Interpretation of Classical Mechanics" (SICM). The
pedagogy of this course really struck a chord with me. Instead of learning how
to operate a complex or even proprietary software black box, students learned
to build up these systems almost from scratch. Writing and debugging equations
and simulations in this framework was usually more about correcting our
confusion or misunderstanding of the physics than about computer science. I
came to believe that while teaching another human is the *best* way to
demonstrate deep knowledge of a subject, teaching a *computer* can be a pretty
good start.

<div class="sidebar">
This isn't to say that computers as a pedagogical tool can replace
human mentorship and interaction; the SICM course was also one of the most
instructor-intensive and peer-interactive of any I took. And of course this
learning format will not be best for everybody.
</div>

Some years later, I found myself at a junction in my career and looking for a
larger project to dig into. I think of myself as a narrative-motivated
individual, and was struggling to make a connection between my specific skills
and training and the huge, abstract, world-level struggles and challenges
confronting humanity. Bret Victor's "What Can A Technologist Do About Climate
Change?" essay was full of connections between an inhumanly large and
complicated planet-scale challenge and specific human-scale projects. The essay
also makes the claim that systems modeling languages and tools have been
under-invested in over time, and frames the question "What if there were an
`npm` for scientific models?". The essay of course isn't a review or final word
on this one subject, but it is encouraging to see somebody talking about
similar ideas and finding the same state of research.

Summary: computer math systems can be powerful for learning and understanding,
but it is important that they are open, powerful, and well-designed for open
exploration and unintended uses.

TODO:

* reinventing discovery
    * web-era collaborative projects
* explorable interactive web things
    * really love these, but frustrated that the code/model is hard to get
       out; even more so when creating new models interactively!
    *  eg, nytimes interactive, you can tweak parameters and interpret results
       quickly, but can't tweak the model itself
    *  "kill math"

## Goals and Principles

Core goal: advance the ability of humans to collaborate on large complex symbolic/computational models of natural systems

* scale collaboration to more complex models
* make digestion of knowledge faster/smoother: from primary source to
  secondary/tertiary faster

Some best practices:

* **Free Software workflows**: the entire ecosystem does not need to be free
  and open source software, but it is important that anybody can collaborate
  using only open tools
* **Transferability**: it should be possible to move models from project to
  project, even if using different software platforms
* **Versioning, typing, and forking**: lessons from sustainable distributed
  software development (as opposed to large-scale projects within a single
  organization) are that it must be possible to extend or make corrections to
  individual components with as little disruption to other components as
  possible. This means support for versioning, care about design of namespaces
  (when references are by name), and automation to help detect "breaking
  changes" and manage updates.
* **Permissive licenses for content and metadata** to allow broad re-use.
  More restrictive open licenses (eg, GPL, Non-Commercial, Share-Alike) are
  acceptable (and often desirable) for software tools.
* **Scale up and down**
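
The "breaking changes" automation mentioned in the versioning point could take
many forms. As a minimal sketch, assuming each model component exposes an
interface of named variables with units, a tool might compare two versions like
this. The `classify_change` helper and the example variables are hypothetical
illustrations, not part of modelthing.org or any existing tool.

```python
def classify_change(old_vars, new_vars):
    """Compare two {variable name: unit} interfaces of one model component.

    Hypothetical rule set (an assumption for illustration):
    - removing a variable or changing its unit is "breaking"
    - only adding variables is "compatible"
    - identical interfaces are "unchanged"
    """
    removed = set(old_vars) - set(new_vars)
    added = set(new_vars) - set(old_vars)
    # Variables present in both versions whose unit differs
    retyped = {
        name for name in set(old_vars) & set(new_vars)
        if old_vars[name] != new_vars[name]
    }
    if removed or retyped:
        return "breaking"
    if added:
        return "compatible"
    return "unchanged"


# Example interfaces for an imagined pendulum component
v1 = {"L": "m", "g": "m/s^2", "T": "s"}
v2 = {"L": "cm", "g": "m/s^2", "T": "s"}               # unit of L changed
v3 = {"L": "m", "g": "m/s^2", "T": "s", "theta": "rad"}  # variable added

print(classify_change(v1, v2))
print(classify_change(v1, v3))
print(classify_change(v1, v1))
```

Treating every added variable as compatible is a simplification: a newly
required input would break downstream models, so a real tool would likely need
to distinguish inputs from outputs.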

Examples of applying the core goal:

* "does veganism make sense"
* COVID-19 modeling
* understand equilibrium finances of large companies/institutions, for the
  people inside those institutions ("business model")

## Existing Ecosystem

Similar tools (in doc):

* Modelica
* Wolfram world: Alpha, Mathematica, system modeling
* strong type systems
* MathML
* SBML

## Proposed System and Open Questions

Proposed system to build:

* simple intermediate format for math models
    => limited in scope and semantics; like regex
* transpilers to/from this format to general programming and computer algebra languages
    => sort of like pandoc
* tooling/systems to combine and build large compound models from components
* public wiki-like catalog to collect and edit models
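
To make the first two items concrete, here is a rough sketch of what a
deliberately limited intermediate format and a pair of transpilers might look
like. Everything in it is an assumption for illustration: the dictionary
layout, the tuple-based expression trees, and the `to_python`, `to_latex`, and
`compile_model` helpers are hypothetical and do not describe the format used by
modelthing.org.

```python
import math

BINARY_OPS = {"+", "-", "*", "/"}

# Hypothetical format: an expression is a number, a variable name (str),
# or a tuple of (operator, operand, ...).
PENDULUM_PERIOD = {
    "name": "pendulum_period",
    "inputs": ["L", "g"],
    "output": "T",
    "expr": ("*", ("*", 2, "pi"), ("sqrt", ("/", "L", "g"))),
}


def to_python(expr):
    """Render an expression tree as a Python source expression."""
    if isinstance(expr, (int, float)):
        return repr(expr)
    if isinstance(expr, str):
        return "math.pi" if expr == "pi" else expr
    op, *args = expr
    if op == "sqrt":
        return f"math.sqrt({to_python(args[0])})"
    if op not in BINARY_OPS:
        raise ValueError(f"unsupported operator: {op}")
    left, right = (to_python(a) for a in args)
    return f"({left} {op} {right})"


def to_latex(expr):
    """Render the same expression tree as LaTeX math."""
    if isinstance(expr, (int, float)):
        return repr(expr)
    if isinstance(expr, str):
        return r"\pi" if expr == "pi" else expr
    op, *args = expr
    if op == "sqrt":
        return r"\sqrt{" + to_latex(args[0]) + "}"
    if op not in BINARY_OPS:
        raise ValueError(f"unsupported operator: {op}")
    left, right = (to_latex(a) for a in args)
    if op == "/":
        return r"\frac{" + left + "}{" + right + "}"
    if op == "*":
        return left + " " + right
    # Sums and differences are always parenthesized to keep precedence safe
    return r"\left(" + left + f" {op} " + right + r"\right)"


def compile_model(model):
    """Build a callable from a model. Uses eval, so trusted input only."""
    source = f"lambda {', '.join(model['inputs'])}: {to_python(model['expr'])}"
    return eval(source, {"math": math})


period = compile_model(PENDULUM_PERIOD)
print(to_python(PENDULUM_PERIOD["expr"]))  # ((2 * math.pi) * math.sqrt((L / g)))
print(to_latex(PENDULUM_PERIOD["expr"]))   # 2 \pi \sqrt{\frac{L}{g}}
print(period(1.0, 9.81))                   # period in seconds for L = 1 m
```

Keeping the expression grammar this small is what would make the "like regex"
comparison apply: each new target language only needs one short rendering
function, at the cost of not expressing everything a general-purpose language
can.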

Will mathematics continue to be "unreasonably effective" in the natural
sciences as we try to understand larger and more complex systems?

Will technical and resource limits constrain symbolic analysis of complex
systems? Eg, will there be scaling problems with algorithms when working with
large models?


## References

Structure and Interpretation of Classical Mechanics ("SICM") (book)
[html](https://mitpress.mit.edu/sites/default/files/titles/content/sicm_edition_2/book.html)
[wiki](https://openlibrary.org/works/OL16797774W/Structure_and_Interpretation_of_Classical_Mechanics)

Functional Differential Geometry (book)
[html](https://mitpress.mit.edu/books/functional-differential-geometry)

Reinventing Discovery (book, Michael Nielsen)
[openlibrary](https://openlibrary.org/works/OL15991453W/Reinventing_discovery)

Hyperphysics (website)
[url](http://hyperphysics.phy-astr.gsu.edu/hbase/geoopt/refr.html#c2)

All Watched Over by Machines of Loving Grace (miniseries, Adam Curtis)
[wiki](https://en.wikipedia.org/wiki/All_Watched_Over_by_Machines_of_Loving_Grace_(TV_series))

What Can A Technologist Do About Climate Change? (essay, Bret Victor, Nov 2015)
[html](http://worrydream.com/ClimateChange/)

More is Different (paper, 1972)

Distilling Free-Form Natural Laws from Experimental Data (paper, 2009)
[pdf](https://www.isi.edu/~gil/diw2012/statements/lipson.pdf)

Symbolic Mathematics Finally Yields to Neural Networks (article, 2020)
[html](https://www.quantamagazine.org/symbolic-mathematics-finally-yields-to-neural-networks-20200520/)

The Unreasonable Effectiveness of Mathematics in the Natural Sciences (article, 1960)
[html](http://www.dartmouth.edu/~matc/MathDrama/reading/Wigner.html)
[wiki](https://en.wikipedia.org/wiki/The_Unreasonable_Effectiveness_of_Mathematics_in_the_Natural_Sciences)