author | Bryan Newbold <bnewbold@archive.org> | 2020-04-09 17:48:16 -0700 |
---|---|---|
committer | Bryan Newbold <bnewbold@archive.org> | 2020-04-09 17:48:16 -0700 |
commit | 5b22495057fa9cb40271764c9e80166882ba3f21 | |
tree | 36f09901a62818b1a73135babb06136e20238a46 /notes/should_include.md | |
parent | 660ae5878ab235b3631e293e0f216846e2986c02 | |
inclusion notes and pipeline update
Diffstat (limited to 'notes/should_include.md')
-rw-r--r-- | notes/should_include.md | 51 |
1 files changed, 51 insertions, 0 deletions
diff --git a/notes/should_include.md b/notes/should_include.md
new file mode 100644
index 0000000..6033eb6
--- /dev/null
+++ b/notes/should_include.md
@@ -0,0 +1,51 @@
+
+## Queries
+
+    pandemic influenza
+    epidemic influenza
+    pandemic ventilator
+    SARS
+    sars-cov-2
+    covid-19
+
+## Should not include?
+
+Duplicate releases:
+
+- zenodo versions
+- figshare versions
+    eg "Coronavirus Research on Figshare" (12 versions)
+
+Remove anything researchgate? Quality is low. DOI prefix:
+
+"TOF-SARS" => time of flight physics thing
+
+These should not end up in the corpus:
+
+    "Description of a new Norwegian star-fish"
+    by M. Sars
+    https://fatcat.wiki/release/ngp3qkqf4fccbdlxz2u4h4taoe
+
+## Specific Articles
+
+Expect these to end up in the corpus (they are not already):
+
+    "100 Years of Medical Countermeasures and Pandemic Influenza Preparedness"
+    https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6187768/
+
+
+## Hacks
+
+    10.2210/pdb4njl/pdb
+    no release_type
+    => dataset
+    => published
+
+    no release_type
+    title starts "figure"
+    => graphic/figure, skip it
+
+    journal: "Emerald Expert Briefings"
+    container_id:fnllqvywjbec5eumrbavqipfym
+    => skip it
+
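The three rules under "Hacks" in the notes above could be sketched as a small per-release filter. This is only an illustration, not code from this commit. The function name `apply_hacks`, the plain-dict release shape, and the field names `doi`, `title`, `release_type`, `release_stage`, and `container_id` are assumptions. Generalizing the single example DOI `10.2210/pdb4njl/pdb` to the prefix `10.2210/pdb` is also an assumption, as is reading "=> published" as a release stage.

```python
from typing import Optional

# Container to skip, per the notes: journal "Emerald Expert Briefings".
SKIP_CONTAINER_IDS = {"fnllqvywjbec5eumrbavqipfym"}

# Assumption: the one DOI in the notes (10.2210/pdb4njl/pdb) stands for
# every DOI that starts with this prefix.
PDB_DOI_PREFIX = "10.2210/pdb"


def apply_hacks(release: dict) -> Optional[dict]:
    """Return the release (possibly patched), or None if it should be skipped.

    `release` is a hypothetical plain dict; the keys read here are assumed.
    """
    # Rule 3: skip everything in the listed container.
    if release.get("container_id") in SKIP_CONTAINER_IDS:
        return None

    # Rules 1 and 2 only apply when no release_type is set.
    if not release.get("release_type"):
        title = (release.get("title") or "").strip().lower()
        # Rule 2: an untyped release whose title starts with "figure"
        # is treated as a graphic/figure and skipped.
        if title.startswith("figure"):
            return None

        doi = (release.get("doi") or "").lower()
        # Rule 1: an untyped release with a matching DOI is treated as a
        # published dataset. Patch a copy, leave the input untouched.
        if doi.startswith(PDB_DOI_PREFIX):
            patched = dict(release)
            patched["release_type"] = "dataset"
            patched["release_stage"] = "published"
            return patched

    return release
```

Returning `None` for "skip it" lets a caller drop releases with a single check, while typed releases pass through unchanged even if their title starts with "figure".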