-rw-r--r--  README.md   3
-rw-r--r--  plan.txt   24
-rw-r--r--  rfc.md      8
3 files changed, 30 insertions, 5 deletions
diff --git a/README.md b/README.md
--- a/README.md
+++ b/README.md
@@ -13,6 +13,7 @@ This is just a concept for now; see [rfc](./rfc).
 
 ## Python Prototype
 
-Use `pipenv` (which you can install with `pip`):
+Use `pipenv` (which you can install with `pip`).
 
     pipenv shell
+    python3 fatcat/api.py
diff --git a/plan.txt b/plan.txt
new file mode 100644
index 00000000..b7f05277
--- /dev/null
+++ b/plan.txt
@@ -0,0 +1,24 @@
+
+backend/api:
+- first-rev schema
+- create work, release, etc
+- get by ID
+
+tooling:
+- query tool: by fc id, doi/issn/etc
+
+importers:
+- crossref
+- pubmed
+- dblp
+- "norwegian" journal list
+- scihub hash list
+- author list?
+
+webface:
+- creators and editors for:
+    works
+    releases
+    files
+    people
+    containers
diff --git a/rfc.md b/rfc.md
--- a/rfc.md
+++ b/rfc.md
@@ -244,7 +244,7 @@ are:
         <found at> URLs
         <held-at> institution <with> accession
 
-    creator
+    contributor
         name
         <has> aliases
         <has> affiliation <for> date span
@@ -292,9 +292,9 @@ Should `identifier` and `citation` be their own entities, referencing other
 entities by UUID instead of by revision? This could save a ton of database
 space and chunder.
 
-Should creator/author contact information be retained? It could be very useful
-for disambiguation, but we don't want to build a huge database for spammers or
-"innovative" start-up marketing.
+Should contributor/author contact information be retained? It could be very
+useful for disambiguation, but we don't want to build a huge database for
+spammers or "innovative" start-up marketing.
 
 Would general purpose SQL databases like Postgres or MySQL scale well enough
 told hold several tables with billions of entries? Right from the start there