Diffstat (limited to 'notes/plan.mv')
-rw-r--r-- notes/plan.mv 53
1 files changed, 53 insertions, 0 deletions
diff --git a/notes/plan.mv b/notes/plan.mv
new file mode 100644
index 0000000..0f97305
--- /dev/null
+++ b/notes/plan.mv
@@ -0,0 +1,53 @@
+
+layout:
+- pipenv, python3.7, flask, elasticsearch-dsl, semantic-ui
+- python code/libs in sub-directory
+- single-file flask app with all routes, calling helper routines
+
+prototype pipeline:
+- CORD-19 dataset
+- enrich script fetches fatcat metadata, outputs combined .json
+- download + derive manually
+- transform script (based on download) creates ES documents as JSON
+
+pipeline:
+- .json files with basic metadata from each source
+ => CORD-19
+ => fatcat ES queries
+ => manual addition
+- enrich script takes all the above, does fatcat lookups, de-dupes by release ident, dumps json with tags and extra metadata
+
+design:
+- elasticsearch schema
+- i18n URL schema
+- single-page? multi-page?
+- tags/indicators for quality
+
+infra:
+- register dns: covid19.qa.fatcat.wiki, covid19.fatcat.wiki
+
+examples:
+- jupyter notebook
+- observable hq
+
+implement:
+- download GROBID output as well as PDFs
+
+topics:
+- Favipiravir
+- Chloroquine
+
+tasks/research:
+- tracking down every single paper from WHO etc
+- finding interesting older papers
+
+papers:
+- imperial college paper
+- WHO reports and recommendations
+- "hammer and the dance" blog-post
+- korean, chinese, singaporean reports
+- http://subject.med.wanfangdata.com.cn/Channel/7?mark=34
+
+
+tools?
+- vega-lite
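
The "pipeline" notes above describe an enrich script that combines records from several sources and de-dupes them by release ident. A minimal sketch of that de-dupe step might look like the following. The field names `release_ident` and `tags`, and the function name `dedupe_by_release_ident`, are assumptions made for illustration; the notes do not specify the JSON layout, and the fatcat lookup itself is left out.

```python
import json
from typing import Dict, Iterable, List


def dedupe_by_release_ident(records: Iterable[dict]) -> List[dict]:
    """Merge records that share a release ident, keeping first-seen order.

    Assumes (hypothetically) that each record is a dict with an optional
    "release_ident" string and an optional "tags" list.
    """
    merged: Dict[str, dict] = {}
    unmatched: List[dict] = []
    for rec in records:
        ident = rec.get("release_ident")
        if not ident:
            # No ident to de-dupe on: keep the record as-is.
            unmatched.append(dict(rec))
            continue
        if ident not in merged:
            # First time this ident is seen: copy the record and its tags.
            merged[ident] = dict(rec, tags=list(rec.get("tags", [])))
            continue
        # Duplicate ident: fold any new tags into the record already kept.
        kept = merged[ident]
        for tag in rec.get("tags", []):
            if tag not in kept["tags"]:
                kept["tags"].append(tag)
    return list(merged.values()) + unmatched


# Made-up sample rows standing in for CORD-19, fatcat query, and manual input.
rows = [
    {"release_ident": "aaaa", "title": "Paper A", "tags": ["cord19"]},
    {"release_ident": "aaaa", "title": "Paper A", "tags": ["fatcat-query"]},
    {"title": "No match", "tags": ["manual"]},
]
for doc in dedupe_by_release_ident(rows):
    print(json.dumps(doc, sort_keys=True))
```

Keeping the first-seen record and only merging tags is one possible policy; a real enrich script might instead prefer whichever source has the richer metadata.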