
Research index and searchable discovery tool for papers and datasets related to
COVID-19.

Features:
- fulltext search over papers
- direct PDF downloads
- find content by search queries and lists of identifiers
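
A minimal sketch of how a search query and identifier lists could be combined into one Elasticsearch query DSL body. The field names (`title`, `abstract`, `fulltext_body`, `doi`, `pmid`) and the function name are assumptions for illustration, not an existing schema.

```python
from typing import Any, Dict, List, Optional


def build_search_body(
    query: Optional[str] = None,
    identifiers: Optional[Dict[str, List[str]]] = None,
    size: int = 25,
) -> Dict[str, Any]:
    """Build a query DSL body: fulltext match plus exact identifier filters."""
    must: List[Dict[str, Any]] = []
    filters: List[Dict[str, Any]] = []

    if query:
        # Hypothetical field names; title is boosted over the other fields.
        must.append({
            "multi_match": {
                "query": query,
                "fields": ["title^3", "abstract", "fulltext_body"],
            }
        })

    # Each identifier list (e.g. {"doi": [...], "pmid": [...]}) becomes a
    # non-scoring `terms` filter.
    for field, values in (identifiers or {}).items():
        if values:
            filters.append({"terms": {field: values}})

    if not must:
        # No text query: return everything that passes the filters.
        must.append({"match_all": {}})

    return {"size": size, "query": {"bool": {"must": must, "filter": filters}}}
```

Multiple identifier fields are ANDed together in this sketch; wrapping them in a `should` clause instead would give OR semantics.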

## Design

Web interface built on Elasticsearch. Rough guess: on the order of 100k entities.

Batch back-end system aggregates papers of interest, fetches metadata from
fatcat, fetches fulltext and GROBID output, and indexes into Elasticsearch. Runs
periodically (e.g., daily or hourly).
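
The batch flow above can be sketched as a single loop. The fetch and index steps are passed in as callables, so nothing here assumes a particular fatcat, GROBID, or Elasticsearch client API. All names, including the `fulltext_body` field, are hypothetical.

```python
from typing import Callable, Dict, Iterable, Optional

Doc = Dict[str, object]


def run_batch(
    paper_ids: Iterable[str],
    fetch_metadata: Callable[[str], Optional[Doc]],
    fetch_fulltext: Callable[[str], Optional[str]],
    index_doc: Callable[[str, Doc], None],
) -> Dict[str, int]:
    """Run one batch pass: metadata -> fulltext -> index, with simple counters."""
    stats = {"indexed": 0, "no_metadata": 0, "no_fulltext": 0}
    for pid in paper_ids:
        meta = fetch_metadata(pid)
        if meta is None:
            # Without metadata there is nothing useful to index.
            stats["no_metadata"] += 1
            continue

        doc = dict(meta)  # copy so the caller's record is not mutated
        fulltext = fetch_fulltext(pid)
        if fulltext is None:
            # Still index the metadata-only record.
            stats["no_fulltext"] += 1
        else:
            doc["fulltext_body"] = fulltext

        index_doc(pid, doc)
        stats["indexed"] += 1
    return stats


if __name__ == "__main__":
    # In-memory stand-ins for the real services.
    metadata = {"a": {"title": "Paper A"}, "b": {"title": "Paper B"}}
    fulltext = {"a": "extracted body text"}
    index: Dict[str, Doc] = {}
    print(run_batch(["a", "b", "c"], metadata.get, fulltext.get, index.__setitem__))
```

Since each pass re-indexes by identifier, running it daily or hourly stays idempotent as long as the index step overwrites existing documents.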

Some light quality tooling to find bad metadata; the cleanups themselves happen in fatcat.
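
As one possible shape for the quality tooling, the sketch below flags suspicious records without modifying them, so fixes can be made upstream. The field names (`title`, `release_year`, `doi`, `pmid`, `pmcid`) and the specific checks are assumptions.

```python
from typing import Dict, List

# Hypothetical identifier fields; at least one should be present.
ID_FIELDS = ("doi", "pmid", "pmcid")


def find_metadata_problems(doc: Dict[str, object], current_year: int) -> List[str]:
    """Return a list of human-readable problems found in one metadata record."""
    problems: List[str] = []

    if not str(doc.get("title") or "").strip():
        problems.append("missing title")

    year = doc.get("release_year")
    if year is None:
        problems.append("missing release_year")
    elif not isinstance(year, int) or not (1800 <= year <= current_year):
        # The 1800 lower bound is arbitrary; adjust for historical papers.
        problems.append("implausible release_year")

    if not any(doc.get(field) for field in ID_FIELDS):
        problems.append("no external identifier")

    return problems
```

The output is report-only by design: a list of problems per record can be reviewed and then corrected at the source.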


## Thoughts / Brainstorm

Tagging? E.g., by type of flu, or by why the paper was included.

Clearly indicate publication status (pre-prints).

Auto-translation to multiple languages. Translation/i18n of user interface.

Dashboards/graphs of stats?

Faceted search.
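
Faceted search could be layered on with Elasticsearch `terms` aggregations. This is a rough sketch: the helper names are made up, and facet fields such as `release_stage` or `language` are assumed to be mapped as keyword-type fields.

```python
from typing import Any, Dict, List, Tuple


def add_facets(body: Dict[str, Any], facet_fields: List[str], size: int = 10) -> Dict[str, Any]:
    """Return a copy of a search body with one terms aggregation per facet field."""
    out = dict(body)  # shallow copy; the original body is left untouched
    out["aggs"] = {
        field: {"terms": {"field": field, "size": size}} for field in facet_fields
    }
    return out


def parse_facets(response: Dict[str, Any]) -> Dict[str, List[Tuple[Any, int]]]:
    """Flatten the aggregations part of a search response into (value, count) pairs."""
    return {
        name: [(bucket["key"], bucket["doc_count"]) for bucket in agg.get("buckets", [])]
        for name, agg in response.get("aggregations", {}).items()
    }
```

A facet on a publication-status field would also give a cheap way to show how many results are pre-prints.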


## Also

Find historical papers of interest, e.g. on the Spanish Flu, and feature them in blog posts.

Manually add interesting/valuable grey literature, like notable blog posts and WHO reports.