author     Bryan Newbold <bnewbold@robocracy.org>   2018-03-20 20:55:43 -0700
committer  Bryan Newbold <bnewbold@robocracy.org>   2018-03-20 20:55:43 -0700
commit     acfb1fadb5ed51ba5fe6c217c9b15def72f9bb02 (patch)
tree       551c6d60aa047fc70325a9cd315f2f469becfaa1
parent     2a1223c721c32b39670809a5eeb361fdc53d2d27 (diff)
download   fatcat-acfb1fadb5ed51ba5fe6c217c9b15def72f9bb02.tar.gz
           fatcat-acfb1fadb5ed51ba5fe6c217c9b15def72f9bb02.zip
docs
-rw-r--r--  README.md    3
-rw-r--r--  plan.txt    24
-rw-r--r--  rfc.md       8
3 files changed, 30 insertions, 5 deletions
diff --git a/README.md b/README.md
index 886443ab..ea03c0a5 100644
--- a/README.md
+++ b/README.md
@@ -13,6 +13,7 @@ This is just a concept for now; see [rfc](./rfc).
 
 ## Python Prototype
 
-Use `pipenv` (which you can install with `pip`):
+Use `pipenv` (which you can install with `pip`).
 
     pipenv shell
+    python3 fatcat/api.py
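The README hunk above adds `python3 fatcat/api.py` as the way to launch the prototype after entering the pipenv shell. Purely as a hedged illustration of what that entry point might look like (the framework choice, route, and data below are assumptions, not code from this commit), a minimal Flask-style `api.py` covering the "get by ID" item from plan.txt below could be:

    # Hypothetical sketch only: the framework (Flask), route, and in-memory
    # data are assumptions for illustration, not the actual prototype code.
    from flask import Flask, jsonify, abort

    app = Flask(__name__)

    # placeholder in-memory store of work entities, keyed by identifier
    WORKS = {
        "w-0001": {"title": "Example Work", "release_ids": []},
    }

    @app.route("/v0/work/<ident>")
    def get_work(ident):
        # look a work up by its fatcat identifier; 404 if unknown
        work = WORKS.get(ident)
        if work is None:
            abort(404)
        return jsonify(work)

    if __name__ == "__main__":
        app.run(debug=True)

Run inside `pipenv shell` with `python3 fatcat/api.py`; for this sketch that starts a local Flask development server on the default port 5000.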
diff --git a/plan.txt b/plan.txt
new file mode 100644
index 00000000..b7f05277
--- /dev/null
+++ b/plan.txt
@@ -0,0 +1,24 @@
+
+backend/api:
+- first-rev schema
+- create work, release, etc
+- get by ID
+
+tooling:
+- query tool: by fc id, doi/issn/etc
+
+importers:
+- crossref
+- pubmed
+- dblp
+- "norwegian" journal list
+- scihub hash list
+- author list?
+
+webface:
+- creators and editors for:
+  works
+  releases
+  files
+  people
+  containers
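The plan above calls for a query tool that can look entities up by fatcat ID or by external identifiers such as DOI or ISSN. A rough sketch of that idea as a small client against the prototype API; the base URL, lookup route, and query parameter are hypothetical placeholders, not an API defined in this commit:

    # Hypothetical sketch of the "query tool" item from plan.txt: resolve an
    # external identifier (a DOI here) to a work record via the prototype API.
    # The base URL, route, and response shape are assumptions for illustration.
    import json
    import sys
    import urllib.error
    import urllib.parse
    import urllib.request

    API_BASE = "http://localhost:5000/v0"  # assumed local prototype instance

    def lookup_by_doi(doi):
        """Return the work record matching a DOI, or None if not found."""
        url = "%s/work/lookup?doi=%s" % (API_BASE, urllib.parse.quote(doi))
        try:
            with urllib.request.urlopen(url) as resp:
                return json.load(resp)
        except urllib.error.HTTPError as err:
            if err.code == 404:
                return None
            raise

    if __name__ == "__main__":
        print(lookup_by_doi(sys.argv[1]))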
diff --git a/rfc.md b/rfc.md
index 1b63a31a..9f807ec2 100644
--- a/rfc.md
+++ b/rfc.md
@@ -244,7 +244,7 @@ are:
<found at> URLs
<held-at> institution <with> accession
- creator
+ contributor
name
<has> aliases
<has> affiliation <for> date span
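The hunk above renames the `creator` entity to `contributor`, keeping a name, aliases, and an affiliation bounded by a date span. As a loose illustration only, that shape could be written as plain data classes; everything beyond the three fields named in the rfc (the class layout, field names, types) is an assumption:

    # Loose illustration of the contributor entity from the rfc hunk above.
    # Only name, aliases, and a date-spanned affiliation come from the rfc
    # text; the class and field layout here is an assumption.
    from dataclasses import dataclass, field
    from datetime import date
    from typing import List, Optional

    @dataclass
    class Affiliation:
        institution: str
        start: Optional[date] = None  # <has> affiliation <for> date span
        end: Optional[date] = None

    @dataclass
    class Contributor:
        name: str
        aliases: List[str] = field(default_factory=list)  # <has> aliases
        affiliations: List[Affiliation] = field(default_factory=list)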
@@ -292,9 +292,9 @@ Should `identifier` and `citation` be their own entities, referencing other
entities by UUID instead of by revision? This could save a ton of database
space and chunder.
-Should creator/author contact information be retained? It could be very useful
-for disambiguation, but we don't want to build a huge database for spammers or
-"innovative" start-up marketing.
+Should contributor/author contact information be retained? It could be very
+useful for disambiguation, but we don't want to build a huge database for
+spammers or "innovative" start-up marketing.
Would general purpose SQL databases like Postgres or MySQL scale well enough
to hold several tables with billions of entries? Right from the start there