From d9c065edd127e5719f2694f7d8f68c8079b4e38e Mon Sep 17 00:00:00 2001
From: Bryan Newbold
Date: Tue, 3 Sep 2019 13:49:26 -0700
Subject: update TODO

---
 TODO.md | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/TODO.md b/TODO.md
index cfbc4d5..2d0c7e3 100644
--- a/TODO.md
+++ b/TODO.md
@@ -10,7 +10,14 @@ x wikidata linkage (prep for wikimania)
 - don't list dead URLs in fatcat
 - summary report of some of above
 - update all fatcat (wikidata QID, urls, fixed ISSN-L, etc)
+- when updating fatcat:
+    if title is "blah, Proceedings of the", set type to proceedings and re-write title
+    if title like "Workshop on", set type
 
+source improvements:
+- entrez: "NLM Unique Id"
+- JUFO: finish
+- crossref: empty string identifiers?
 - public scopus list (?)
 - scrape/munge public clarivate dumps
 
@@ -22,13 +29,15 @@ x wikidata linkage (prep for wikimania)
 
 - check that all fields actually getting imported reasonably
 - homepage crawl/status script
+- could poll portal.issn.org like:
+    https://portal.issn.org/resource/ISSN/1561-7645?format=json
+    would require a good deal of munging (eg, MARC region -> ISO)
 - KBART imports (with JSON, so only a single row per slug)
 - imprint/publisher distinction (publisher is big group)
 - summary table should be superset of fatcat table
 - add timestamp columns to enable updates?
 - fatcat export (filters for changes to make, writes out as JSON)
 - update_url_status (needs re-write)
-- index -> directory
 - log out index issues (duplicate ISSN-L, etc) to a file
 - validate against GOLD OA list
 - decide what to do with JURN... match? fuzzy match? create missing fatcat?
-- 
cgit v1.2.3