layout: - pipenv, python3.7, flask, elasticsearch-dsl, semantic-ui - python code/libs in sub-directory - single-file flask with all routes, call helper routines prototype pipeline: - CORD-19 dataset - enrich script fetches fatcat metadata, outputs combined .json - download + derive manually - transform script (based on download) creates ES documents as JSON pipeline: - .json files with basic metadata from each source => CORD-19 => fatcat ES queries => manual addition - enrich script takes all the above, does fatcat lookups, de-dupes by release ident, dumps json with tags and extra metadata design: - elasticschema schema - i18n URL schema - single-page? multi-page? - tags/indicators for quality infra: - register dns: covid19.qa.fatcat.wiki, covid19.fatcat.wiki examples: - jupyter notebook - observable hq implement: - download GROBID as well as PDFs topics: - Favipiravir - Chloroquine tasks/research: - tracking down every single paper from WHO etc - finding interesting older papers papers: - imperial college paper - WHO reports and recommendations - "hammer and the dance" blog-post - korean, chinese, singaporean reports - http://subject.med.wanfangdata.com.cn/Channel/7?mark=34 tools? - vega-lite