fatcat - [no description]

	Commit message	Author	Age	Files	Lines
*	crossref importer: skip affiliations lacking 'name'	Bryan Newbold	2021-12-15	1	-0/+3
\| \| \| \|	Relatedly, we should start handling ROR affiliations in contribs soon.
*	refactor importer metadata tables into separate file; move some helpers around	Bryan Newbold	2021-11-10	1	-92/+2
\| \| \| \| \| \| \|	- MAX_ABSTRACT_LENGTH set in a single place (importer common) - merge datacite license slug table into common table, removing some TDM-specific licenses (which do not apply in the context of preserving the full work)
*	importers: refactor imports of clean() and other normalization helpers	Bryan Newbold	2021-11-10	1	-28/+28
\|
*	importers: use clean_doi() in many more (all?) importers	Bryan Newbold	2021-11-09	1	-1/+8
\|
*	remove deprecated extid sqlite3 lookup table feature from importers	Bryan Newbold	2021-11-09	1	-54/+0
\| \| \| \| \| \| \| \|	This was used during initial bulk imports, but is no longer used and could create serious metadata problems if used accidentally. In retrospect, it also made metadata provenance less transparent, and may have done more harm than good overall.
*	more involved type wrangling and fixes for importers	Bryan Newbold	2021-11-03	1	-5/+6
\|
*	typing: relatively simple type check fixes	Bryan Newbold	2021-11-03	1	-3/+1
\| \| \| \| \| \| \|	These mostly add new variable names so that existing variables aren't overwritten with a new type; delay coercing '{}' or '[]' to 'None' until the last minute; add is-not-None checks to conditional clauses; and make similar small changes.
*	typing: initial annotations on importers	Bryan Newbold	2021-11-03	1	-13/+13
\| \| \| \| \|	This commit only adds the type annotations; it does not fix the code to make type checking pass.
*	lint: resolve existing mypy type errors	Bryan Newbold	2021-11-02	1	-15/+11
\| \| \| \| \| \| \| \| \|	Adds annotations and reworks dataflow to satisfy existing mypy issues, without adding any additional type annotations to, e.g., function signatures. There will probably be many more type errors when annotations are all added.
*	fmt (black): fatcat_tools/	Bryan Newbold	2021-11-02	1	-167/+246
\|
*	python: isort everything	Bryan Newbold	2021-11-02	1	-3/+2
\|
*	lint: simple, safe inline lint fixes	Bryan Newbold	2021-11-02	1	-2/+1
\| \| \| \|	'==' vs 'is'; 'not a in b' vs 'a not in b'; etc
*	small python tweaks for annotations, imports	Bryan Newbold	2021-11-02	1	-1/+5
\|
*	try some type annotations	Bryan Newbold	2021-11-02	1	-22/+29
\|
*	crossref+datacite: remove confusing early update bail	Bryan Newbold	2020-11-20	1	-2/+0
\| \| \| \| \|	Easy to miss that we skip updates twice, and with this early bailout we were not updating counts correctly.
*	simple lint (flake8) fixes over python codebase	Bryan Newbold	2020-07-23	1	-7/+7
\| \| \| \| \| \|	These should not have any behavior changes, though a number of exception catches are now more general, and there may be long-tail exceptions getting thrown in these statements.
*	lint (flake8) tool python files	Bryan Newbold	2020-07-01	1	-7/+1
\|
*	add new license mappings	Bryan Newbold	2020-06-30	1	-0/+13
\|
*	Merge pull request #53 from EdwardBetts/spelling	bnewbold	2020-03-27	1	-2/+2
\|\ \| \| \| \|	Correct spelling mistakes
\| *	Correct spelling mistakes	Edward Betts	2020-03-27	1	-2/+2
\| \|
* \|	crossref: skip stub OUP title	Bryan Newbold	2020-03-19	1	-0/+8
\|/ \| \| \| \| \|	It seems like OUP pre-registers DOIs with this place-holder title, then updates the Crossref metadata when the paper is actually published. We should wait until the real title is available before creating an entity.
*	crossref: accurate blank title counts	Bryan Newbold	2019-11-05	1	-0/+1
\|
*	crossref: component type	Bryan Newbold	2019-11-04	1	-1/+3
\|
*	crossref: count why skip happened	Bryan Newbold	2019-11-04	1	-1/+7
\| \| \| \| \| \|	Might skip based on release type (eg container, not a paper/release), or missing title, or other reasons. Over 7 million DOIs are getting skipped, curious why.
*	crossref: don't skip on short/null subtitle	Bryan Newbold	2019-11-04	1	-1/+1
\| \| \| \|	This was a bug. Should only set subtitle blank, not skip the import.
*	refactor all python source for client lib name	Bryan Newbold	2019-09-05	1	-10/+10
\|
*	crossref: allow 'name' fallback (for groups, etc)	Bryan Newbold	2019-06-24	1	-1/+1
\|
*	better crossref container_name handling	Bryan Newbold	2019-05-24	1	-7/+12
\|
*	arxiv license slug shorter; fix test	Bryan Newbold	2019-05-22	1	-1/+1
\|
*	importers: create containers by default	Bryan Newbold	2019-05-21	1	-1/+2
\|
*	arxiv license/slug map	Bryan Newbold	2019-05-21	1	-0/+1
\|
*	python impl	Bryan Newbold	2019-05-14	1	-4/+5
\|
*	python impl	Bryan Newbold	2019-05-14	1	-2/+2
\|
*	importer code updates	Bryan Newbold	2019-05-13	1	-2/+14
\|
*	partial python impl of ext_id and release_stage refactors	Bryan Newbold	2019-05-13	1	-12/+14
\|
*	better/additional crossref license lookups	Bryan Newbold	2019-02-14	1	-20/+58
\|
*	crossref: import subtitle as str, not list[str]	Bryan Newbold	2019-02-14	1	-0/+2
\|
*	add some missing LICENSE_SLUG_MAP	Bryan Newbold	2019-02-05	1	-1/+4
\|
*	crossref import tweaks/fixes	Bryan Newbold	2019-01-29	1	-7/+9
\| \| \| \| \|	- refs: article-title not title; save unstructured; authors not author - save 'language' field (already an ISO code)
*	fix bug in clean() resulting in many consistency check fails	Bryan Newbold	2019-01-29	1	-10/+9
\|
*	fix refs extra ordering bug	Bryan Newbold	2019-01-29	1	-6/+6
\|
*	pass through kwargs (fixes bezerk imports)	Bryan Newbold	2019-01-29	1	-1/+2
\|
*	ensure raw_name is not stub	Bryan Newbold	2019-01-29	1	-1/+4
\|
*	ensure abstracts aren't stubs	Bryan Newbold	2019-01-29	1	-2/+3
\|
*	fix title length checks in crossref	Bryan Newbold	2019-01-28	1	-2/+2
\|
*	filter short/stub original_title	Bryan Newbold	2019-01-28	1	-3/+7
\|
*	enforce title len>1 for release imports	Bryan Newbold	2019-01-28	1	-0/+3
\|
*	tweak crossref import, and update tests	Bryan Newbold	2019-01-24	1	-11/+27
\|
*	allow importing contrib/refs lists	Bryan Newbold	2019-01-24	1	-5/+13
\| \| \| \| \| \|	The motivation here isn't really to support these gigantic lists on principle, but to be able to ingest large corpuses without having to decide whether to filter out or crop such lists.
*	importer bugfixes	Bryan Newbold	2019-01-23	1	-3/+3
\|