author Martin Czygan <martin@archive.org> 2020-07-10 21:32:41 +0000
committer Martin Czygan <martin@archive.org> 2020-07-10 21:32:41 +0000
commit 3c266e07771271241aa8cff3e3199a45109362af
tree 73fa6aedf1bbfeffeac9c94593f5f9c4f2dd645b /notes/cleanup_tasks.txt
parent fdf1028c19b0623e30b91e49ffa65ed130dcfdc1
parent c9d8550be4bab808c2bad0b0d3642a71075202c0
datacite: resolve formatting issues in tests
Diffstat (limited to 'notes/cleanup_tasks.txt')
-rw-r--r-- notes/cleanup_tasks.txt 18
1 files changed, 18 insertions, 0 deletions
diff --git a/notes/cleanup_tasks.txt b/notes/cleanup_tasks.txt
new file mode 100644
index 00000000..bf418e59
--- /dev/null
+++ b/notes/cleanup_tasks.txt
@@ -0,0 +1,18 @@
+
+Cambridge Chemical Database (NCI)
+
+ doi_prefix:10.3406 release_type:article
+
+ 193,346+ entities
+
+ should be 'dataset' not 'article'
+
+ datacite importer
+
+Frontiers
+
+ Frontiers non-PDF abstracts, which have DOIs like `10.3389/conf.*`. Should
+ crawl these, but `release_type` should be... `abstract`? There are at least
+ 18,743 of these. Should be fixed in crossref-bot first, then followed by a
+ retroactive cleanup.
+
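
The two tasks in the notes above both amount to "match a DOI pattern, then correct `release_type`". A minimal sketch of that rule follows, only to make the intended corrections concrete. The function name `suggested_release_type` is hypothetical and not part of fatcat; the `abstract` value is the tentative choice from the notes, not a confirmed decision; and the glob `10.3389/conf.*` is read here as "any DOI starting with `10.3389/conf.`", which is an assumption.

```python
from typing import Optional


def suggested_release_type(doi: str, release_type: Optional[str]) -> Optional[str]:
    """Return the release_type the cleanup notes suggest for this DOI.

    Hypothetical helper for illustration; it does not touch any catalog API.
    """
    doi = doi.strip().lower()
    prefix = doi.split("/", 1)[0]

    # Notes: doi_prefix:10.3406 with release_type:article should be 'dataset'.
    if prefix == "10.3406" and release_type == "article":
        return "dataset"

    # Notes: Frontiers non-PDF abstracts (`10.3389/conf.*`); 'abstract' is
    # only the tentative value proposed in the notes.
    if doi.startswith("10.3389/conf."):
        return "abstract"

    # Anything else is left unchanged.
    return release_type
```

The same predicate could in principle serve both steps mentioned for Frontiers: applied at import time for new records, and applied over existing records for the retroactive cleanup.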