fatcat - [no description]

	Commit message	Author	Age	Files	Lines
...
*	datacite: ignore known unknown values in resourceType*	Martin Czygan	2020-01-09	3	-1/+95
\|
*	datacite: abstracts may be strings or list of strings	Martin Czygan	2020-01-09	5	-1/+187
\|
*	datacite: improve license_slug handling	Martin Czygan	2020-01-09	3	-2/+33
\|
*	datacite: add 'Unknown' to blacklist	Martin Czygan	2020-01-09	1	-7/+1
\|
*	datacite: get rid of schemaVersion	Martin Czygan	2020-01-09	17	-32/+14
\|
*	datacite: reformat test cases and use jq . --sort-keys	Martin Czygan	2020-01-08	54	-2299/+2301
\|
*	datacite: factor out contributor handling	Martin Czygan	2020-01-08	5	-2/+107
\|	Use values from:
\|
\|	* attributes.creators[]
\|	* attributes.contributors[]
*	datacite: adjust tests for release_month	Martin Czygan	2020-01-08	12	-12/+12
\|
*	datacite: mark additional files as stub	Martin Czygan	2020-01-08	3	-1/+73
\|
*	datacite: CCDC are entries, mostly	Martin Czygan	2020-01-08	1	-1/+1
\|
*	datacite: adding datacite-specific extra metadata	Martin Czygan	2020-01-07	30	-1468/+1570
\|	* attributes.metadataVersion
\|	* attributes.schemaVersion
\|	* attributes.version (source dependent values, follows suggestions in
\|	  https://schema.datacite.org/meta/kernel-4.3/doc/DataCite-MetadataKernel_v4.3.pdf#page=26,
\|	  but values vary)
\|
\|	Furthermore:
\|
\|	* attributes.types.resourceTypeGeneral
\|	* attributes.types.resourceType
*	datacite: month field should be top-level	Martin Czygan	2020-01-06	11	-14/+14
\|
*	datacite: include month in extra	Martin Czygan	2020-01-06	11	-11/+13
\| \| \| \| \|	> include release_month as a top-level extra field [...] to auto-populate the schema field from that
*	datacite: indicate mismatched file in test	Martin Czygan	2020-01-06	1	-1/+1
\|
*	datacite: clean abstracts, use unknown value tokens	Martin Czygan	2020-01-06	3	-3/+3
\|	Datacite defines placeholders for unknown values:
\|
\|	* https://support.datacite.org/docs/schema-values-unknown-information-v43
\|
\|	Clean abstracts.
*	datacite: always include "datacite" key in extra	Martin Czygan	2020-01-04	14	-26/+26
\| \| \| \| \| \|	> always include extra values for the respective DOI registrars (datacite, crossref, jalc), even if they are empty ({}), to be used as a flag so we know which DOI registrar supplied the metadata.
*	datacite: use normal.clean_doi	Martin Czygan	2020-01-03	1	-4/+0
\|
*	datacite: parse_datacite_dates returns month	Martin Czygan	2020-01-03	1	-7/+16
\| \| \| \|	As [...] we will soon add support for release_month field in the release schema.
*	datacite: prepare release_month (stub)	Martin Czygan	2020-01-03	1	-14/+14
\|
*	datacite: remove --lang-detect flag	Martin Czygan	2020-01-03	5	-10/+15
\| \| \| \|	Estimated time for a single call is on the order of 50 ms.
*	datacite: add another test case	Martin Czygan	2020-01-02	3	-1/+71
\|
*	datacite: open case for editing after creation	Martin Czygan	2020-01-02	1	-0/+2
\|
*	datacite: add helper script to create new test case	Martin Czygan	2020-01-02	1	-0/+14
\|
*	datacite: address raw_name index form comment	Martin Czygan	2020-01-02	20	-112/+128
\|	> The convention for display_name and raw_name is to be how the name would
\|	> normally be printed, not in index form (surname comma given_name). So we
\|	> might need to un-encode names like "Tricart, Pierre".
\|
\|	Use an additional `index_form_to_display_name` function to convert index
\|	form to display form, heuristically.
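\|	A minimal sketch of such a heuristic, for illustration only. The function
\|	name is taken from the commit message, but the body below is an assumption
\|	and not the fatcat implementation: it only handles the single-comma case
\|	and leaves everything else untouched.

```python
def index_form_to_display_name(name: str) -> str:
    """Heuristically turn 'Surname, Given' into 'Given Surname'.

    Assumption (not from the fatcat source): only names with exactly one
    comma are treated as index form; anything else is returned unchanged.
    """
    if name.count(",") != 1:
        # No comma, or several commas (possibly a list of names): do nothing.
        return name
    surname, given = (part.strip() for part in name.split(","))
    if not surname or not given:
        # A dangling comma gives us nothing to reorder.
        return name
    return f"{given} {surname}"
```

\|	Returning the input unchanged in the ambiguous cases keeps the heuristic
\|	conservative: a name that is already in display form is never reordered.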
*	datacite: add conversion fixtures	Martin Czygan	2020-01-02	50	-1/+3949
\|	The `test_datacite_conversions` function will compare an input (datacite)
\|	document to an expected output (release entity as JSON). This way, it should
\|	not be too hard to add more cases by adding: input, output - and by
\|	increasing the counter in the range loop within the test.
\|
\|	To view input and result side by side with vim, change into the test
\|	directory and run:
\|
\|	    tests/files/datacite $ ./caseview.sh 18
*	datacite: adjust tests	Martin Czygan	2019-12-28	1	-2/+1
\|
*	address first round of MR14 comments	Martin Czygan	2019-12-28	1	-2/+176
\|	* add missing langdetect
\|	* use entity_to_dict for json debug output
\|	* factor out code for fields in function and add table driven tests
\|	* update citeproc types
\|	* add author as default role
\|	* add raw_affiliation
\|	* include relations from datacite
\|	* remove url (covered by doi already)
\|
\|	Using yapf for python formatting.
*	improve datacite field mapping and import	Martin Czygan	2019-12-28	3	-17/+92
\|	Current version succeeded in importing a random sample of 100000 records
\|	(0.5%) from datacite.
\|
\|	The --debug (write JSON to stdout) and --insert-log-file (log batch before
\|	committing to db) flags are temporarily added to help debugging.
\|
\|	Add a few unit tests.
\|
\|	Some edge cases:
\|
\|	a) Existing keys without a value require a slightly awkward idiom:
\|
\|	```
\|	titles = attributes.get('titles', []) or []
\|	```
\|
\|	b) There can be 0, 1, or more (first one wins) titles.
\|
\|	c) Date handling is probably not ideal. Datacite has a potentially
\|	fine-grained list of dates.
\|
\|	The test case (tests/files/datacite_sample.jsonl) refers to
\|	https://ssl.fao.org/glis/doi/10.18730/8DYM9, which has date (main
\|	descriptor) 1986. The datacite record contains: 2017 (publicationYear,
\|	probably the year of record creation with reference system), 1978-06-03
\|	(collected, e.g. experimental sample), 1986 ("Accepted"). The online version
\|	of the resource knows even one more date (2019-06-05 10:14:43 by WIEWS
\|	update).
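\|	Edge cases a) and b) can be sketched together as follows. This is an
\|	illustrative sketch, not the importer code: the helper name `first_title`
\|	is hypothetical, and it assumes each entry in `titles` is a dict with a
\|	`title` key.

```python
from typing import Optional


def first_title(attributes: dict) -> Optional[str]:
    """Return the first non-empty title, or None if there is none.

    Hypothetical helper; assumes entries look like {"title": "..."}.
    """
    # A key may exist with a null value, so .get() with a default is not
    # enough on its own: `or []` also covers {"titles": None}.
    titles = attributes.get('titles', []) or []
    for entry in titles:
        title = (entry or {}).get('title')
        if title:
            # First one wins; later titles are ignored.
            return title
    return None
```

\|	The `or []` fallback is what makes iteration safe when the key is present
\|	but its value is null.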
*	datacite: importer skeleton	Martin Czygan	2019-12-28	1	-0/+25
\|	* contributors, title, date, publisher, container, license
\|
\|	Field and value analysis via https://github.com/miku/indigo.
*	datacite: fix harvest test	Martin Czygan	2019-12-27	1	-1/+1
\|	Produced messages should match:
\|
\|	    jq '.data\|length' tests/files/datacite_api.json
*	datacite: add simple test and fixture for datacite api interaction	Martin Czygan	2019-12-27	2	-0/+46
\|
*	add regression test for medlinedate -> year parsing	Bryan Newbold	2019-12-23	2	-0/+102
\|
*	regression test for deleted entity history view	Bryan Newbold	2019-12-09	1	-0/+25
\|
*	add basic test for crossref harvest API call	Bryan Newbold	2019-12-06	2	-0/+46
\|
*	add regression test for upper-case SHA-1 form submit	Bryan Newbold	2019-12-02	1	-0/+10
\|
*	ingest file result importer	Bryan Newbold	2019-11-15	2	-0/+59
\|
*	test for ingest transform	Bryan Newbold	2019-11-15	1	-0/+57
\|
*	add ingest request transform (and test)	Bryan Newbold	2019-11-15	1	-1/+1
\|
*	Merge branch 'martin-search-results-pagination' into 'master'	Martin Czygan	2019-11-15	1	-2/+3
\|\ \| \| \| \| \| \| \| \|	Add basic pagination to search results See merge request webgroup/fatcat!4
\| *	address test issue	Martin Czygan	2019-11-15	1	-2/+3
\| \|
\| *	adjust search test case for new wording	Martin Czygan	2019-11-14	1	-2/+2
\| \| \| \| \| \| \| \|	> "Showing top " -> "Showing first "
* \|	fix crossref component test	Bryan Newbold	2019-11-04	1	-1/+1
\|/
*	commit file cleaner tests	Bryan Newbold	2019-10-08	1	-0/+58
\|
*	redirect direct entity underscore links	Bryan Newbold	2019-10-03	1	-0/+2
\|
*	python webface impl token generation	Bryan Newbold	2019-09-18	1	-0/+8
\|
*	skip test_crossref_importer_huge() by default	Bryan Newbold	2019-09-13	1	-0/+1
\|
*	refactor all python source for client lib name	Bryan Newbold	2019-09-05	26	-74/+74
\|
*	add kbart counts to container stats	Bryan Newbold	2019-07-31	1	-0/+1
\|
*	complete generic entity rev views	Bryan Newbold	2019-06-28	1	-8/+46
\| \| \| \| \| \|	Was getting 500s in production from crawlers. Also expand test coverage.
*	release elasticsearch results: stage not status	Bryan Newbold	2019-06-13	1	-1/+1
\|