author | Bryan Newbold <bnewbold@archive.org> | 2020-04-03 15:15:32 -0700 |
---|---|---|
committer | Bryan Newbold <bnewbold@archive.org> | 2020-04-03 15:15:32 -0700 |
commit | 4cbbdf33ee2a9651f79f96e4bf290d8bc721f69d (patch) | |
tree | a81bf8d2d89f6a19a3e20e4b743f3dfc4c4c8ad0 /rfc.md | |
parent | 2bdda2dbf8204d0dd36a4b5b7460ff89bfcc3b5c (diff) | |
move random files to notes/
Diffstat (limited to 'rfc.md')
-rw-r--r-- | rfc.md | 38 |
1 files changed, 0 insertions, 38 deletions
@@ -1,38 +0,0 @@
-
-Research index and searchable discovery tool of papers and datasets related to
-COVID-19.
-
-Features:
-- fulltext search over papers
-- direct download PDFs
-- find content by search queries + lists of identifiers
-
-## Design
-
-Web interface build on elasticsearch. Guessing on the order of 100k entities.
-
-Batch back-end system aggregates papers of interest, fetches metadata from
-fatcat, fetches fulltext+GROBID, indexes into elasticsearch. Run periodically
-(eg, daily, hourly)
-
-Some light quality tooling to find bad metadata; do cleanups in fatcat itself.
-
-
-## Thoughts / Brainstorm
-
-Tagging? Eg, by type of flu, why paper included
-
-Clearly indicate publication status (pre-prints).
-
-Auto-translation to multiple languages. Translation/i18n of user interface.
-
-Dashboards/graphs of stats?
-
-Faceted search.
-
-
-## Also
-
-Find historical papers of interest, eg the Spanish Flu, feature in blog posts.
-
-Manually add interesting/valuable greylit like notable blog posts, WHO reports.