diff options
Diffstat (limited to 'notes/plan.mv')
-rw-r--r-- | notes/plan.mv | 53 |
1 files changed, 53 insertions, 0 deletions
diff --git a/notes/plan.mv b/notes/plan.mv new file mode 100644 index 0000000..0f97305 --- /dev/null +++ b/notes/plan.mv @@ -0,0 +1,53 @@ + +layout: +- pipenv, python3.7, flask, elasticsearch-dsl, semantic-ui +- python code/libs in sub-directory +- single-file flask with all routes, call helper routines + +prototype pipeline: +- CORD-19 dataset +- enrich script fetches fatcat metadata, outputs combined .json +- download + derive manually +- transform script (based on download) creates ES documents as JSON + +pipeline: +- .json files with basic metadata from each source + => CORD-19 + => fatcat ES queries + => manual addition +- enrich script takes all the above, does fatcat lookups, de-dupes by release ident, dumps json with tags and extra metadata + +design: +- elasticschema schema +- i18n URL schema +- single-page? multi-page? +- tags/indicators for quality + +infra: +- register dns: covid19.qa.fatcat.wiki, covid19.fatcat.wiki + +examples: +- jupyter notebook +- observable hq + +implement: +- download GROBID as well as PDFs + +topics: +- Favipiravir +- Chloroquine + +tasks/research: +- tracking down every single paper from WHO etc +- finding interesting older papers + +papers: +- imperial college paper +- WHO reports and recommendations +- "hammer and the dance" blog-post +- korean, chinese, singaporean reports +- http://subject.med.wanfangdata.com.cn/Channel/7?mark=34 + + +tools? +- vega-lite |