fatcat - [no description]

	Commit message	Author	Age	Files	Lines
*	regression test for deleted entity history view	Bryan Newbold	2019-12-09	1	-0/+25
\|
*	add missing underline in deleted entity web view	Bryan Newbold	2019-12-09	1	-1/+1
\|
*	Merge branch 'bnewbold-crossref-harvest-test' into 'master'	Martin Czygan	2019-12-09	5	-22/+82
\|\ \| \| \| \| \| \| \| \|	Basic mocked test for crossref harvester. See merge request webgroup/fatcat!7
\| *	add basic test for crossref harvest API call	Bryan Newbold	2019-12-06	2	-0/+46
\| \|
\| *	refactor kafka producer in crossref harvester	Bryan Newbold	2019-12-06	1	-21/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	producer creation/configuration should be happening in __init__() time, not 'daily' call. This specific refactor motivated by mocking out the producer in unit tests.
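A minimal sketch of the idea in this commit message, assuming a hypothetical `HarvestWorkerSketch` class, `fetch_date` method, and `producer_factory` argument (none of these names come from the fatcat code). It uses the standard library's `unittest.mock` rather than pytest-mock so that it runs on its own. Because the producer is created once in `__init__()`, a test can swap in a fake before any per-day call runs:

```python
from unittest import mock


class HarvestWorkerSketch:
    """Hypothetical harvester, for illustration only (not the fatcat class)."""

    def __init__(self, produce_topic, producer_factory):
        self.produce_topic = produce_topic
        # The producer is created once, at construction time, instead of
        # inside every per-day call. Tests can pass a factory returning a mock.
        self.producer = producer_factory()

    def fetch_date(self, records):
        # 'records' stands in for the results of one day's API call.
        for record in records:
            self.producer.produce(self.produce_topic, record.encode("utf-8"))
        self.producer.flush()


# Usage in a test: inject a mock producer, then inspect what was published.
fake_producer = mock.Mock()
worker = HarvestWorkerSketch("example-topic", lambda: fake_producer)
worker.fetch_date(['{"DOI": "10.123/abc"}', '{"DOI": "10.123/def"}'])
```

With the producer injected this way, the test never needs a running Kafka broker.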
\| *	add pytest-mock helper library to dev deps	Bryan Newbold	2019-12-06	2	-1/+10
\| \|
* \|	Merge branch 'martin-increase-docker-kafka-message-size' into 'master'	bnewbold	2019-12-06	1	-0/+1
\|\ \ \| \|/ \|/\| \| \| \| \|	increase max.message.bytes in container. See merge request webgroup/fatcat!5
\| *	increase max.message.bytes in container	Martin Czygan	2019-12-05	1	-0/+1
\|/ \| \| \| \|	While working on datacite, some messages were larger than the default of 1000012 bytes.
*	improve previous commit (JATS abstract hack)	Bryan Newbold	2019-12-03	1	-4/+6
\|
*	hack: remove enclosing JATS XML tags around abstracts	Bryan Newbold	2019-12-03	1	-1/+7
\| \| \| \| \| \|	The more complete fix is to actually render the JATS to HTML and display that. This is just to fix a nit with the most common case of XML tags in abstracts.
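A rough sketch of the kind of hack described above, assuming the common case is a single enclosing `<jats:p>` element. The function name `strip_enclosing_jats` and the choice of tag are assumptions for illustration, not taken from the fatcat code. Anything that is not a single enclosed paragraph is returned unchanged:

```python
def strip_enclosing_jats(abstract, tag="jats:p"):
    """Remove one enclosing <tag>...</tag> pair, if that is all there is."""
    open_tag = "<%s>" % tag
    close_tag = "</%s>" % tag
    text = abstract.strip()
    if text.startswith(open_tag) and text.endswith(close_tag):
        inner = text[len(open_tag):-len(close_tag)]
        # Only handle the simple case: a single paragraph with no further
        # occurrences of the same opening tag inside it.
        if open_tag not in inner:
            return inner.strip()
    # Multi-paragraph or otherwise structured abstracts are left as-is.
    return abstract


single = strip_enclosing_jats("<jats:p>Some abstract text.</jats:p>")
multi = strip_enclosing_jats("<jats:p>A</jats:p><jats:p>B</jats:p>")
```

As the commit message notes, rendering the JATS to HTML would be the more complete fix. This only tidies the most common case.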
*	tweaks to file ingest importer	Bryan Newbold	2019-12-03	2	-3/+10
\| \| \| \| \|	- allow overriding source filter whitelist (common case for CLI use)
\| \| \| \| \|	- fix editgroup description env variable pass-through
*	crossref is_update isn't what I thought	Bryan Newbold	2019-12-03	1	-6/+2
\| \| \| \| \| \| \| \|	I thought this would filter for metadata updates to an existing DOI, but actually "updates" are a type of DOI (eg, a retraction). TODO: handle 'updates' field. Should both do a lookup and set work_ident appropriately, and store in crossref-specific metadata.
*	bump required rust to 1.36	Bryan Newbold	2019-12-03	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	This isn't a fatcat rust requirement, but instead a diesel requirement, via rust-smallvec, which in v1.0 uses the alloc crate (https://github.com/servo/rust-smallvec/issues/73). I think the reason this came up now is that diesel-cli is an application and doesn't have a Cargo.lock file, and the build was updated. Using some binary mechanism to install these dependencies would be more robust, but feels like a yak shave right now.
*	update gitlab-ci to rust 1.34	Bryan Newbold	2019-12-03	1	-1/+1
\| \| \| \| \|	Apparently the rust:1.34-stretch image is gone from docker hub, and this was causing CI errors.
*	make file edit form hash values case insensitive	Bryan Newbold	2019-12-02	1	-0/+3
\| \| \| \| \| \| \|	Test in previous commit. This fixes a user-reported 500 error when creating a file with SHA1/SHA256/MD5 hashes in upper-case.
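One way to make hash inputs case insensitive is to lower-case them before validation. The sketch below is an assumption about how that could look, not the actual form code. `normalize_hash` and `HASH_LENGTHS` are hypothetical names:

```python
import re

# Expected hex digest lengths for each supported hash type.
HASH_LENGTHS = {"md5": 32, "sha1": 40, "sha256": 64}


def normalize_hash(value, kind):
    """Lower-case a hex digest and check that it has the expected length."""
    cleaned = value.strip().lower()
    pattern = r"[0-9a-f]{%d}" % HASH_LENGTHS[kind]
    if not re.fullmatch(pattern, cleaned):
        raise ValueError("not a valid %s hex digest: %r" % (kind, value))
    return cleaned


# An upper-case SHA-1 (of the empty string) is accepted and lower-cased.
normalized = normalize_hash("DA39A3EE5E6B4B0D3255BFEF95601890AFD80709", "sha1")
```

Normalizing before validation lets upper-case, lower-case, and mixed-case submissions all follow the same code path.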
*	add regression test for upper-case SHA-1 form submit	Bryan Newbold	2019-12-02	1	-0/+10
\|
*	re-order ingest want() for better stats	Bryan Newbold	2019-11-15	1	-7/+10
\|
*	project -> ingest_request_source	Bryan Newbold	2019-11-15	3	-9/+9
\|
*	have ingest-file-results importer operate as crawl-bot	Bryan Newbold	2019-11-15	1	-1/+1
\| \| \| \|	As opposed to sandcrawler-bot
*	fix release.pmcid typo	Bryan Newbold	2019-11-15	1	-2/+2
\|
*	better ingest-file-results import name	Bryan Newbold	2019-11-15	1	-1/+1
\|
*	ingest importer fixes	Bryan Newbold	2019-11-15	1	-3/+4
\|
*	more ingest importer comments and counts	Bryan Newbold	2019-11-15	2	-2/+29
\|
*	crude support for 'sandcrawler' kafka topics	Bryan Newbold	2019-11-15	1	-2/+3
\|
*	ingest file result importer	Bryan Newbold	2019-11-15	5	-2/+228
\|
*	test for ingest transform	Bryan Newbold	2019-11-15	1	-0/+57
\|
*	add ingest request feature to entity_updates worker	Bryan Newbold	2019-11-15	2	-4/+22
\| \| \| \| \| \| \| \| \| \| \| \| \|	Initially was going to create a new worker to consume from the release update channel, but couldn't get the edit context ("is this a new release, or update to an existing") from that context. Currently there is a flag in source code to control whether we only do OA releases or all releases. Starting with OA only to start slow, but should probably default to all, and make this a config flag. Should probably also have a config flag to control this entire feature. Tested locally in dev.
*	add ingest request transform (and test)	Bryan Newbold	2019-11-15	3	-1/+68
\|
*	update next schema tweaks proposal doc	Bryan Newbold	2019-11-15	1	-0/+1
\|
*	Merge branch 'martin-search-results-pagination' into 'master'	Martin Czygan	2019-11-15	6	-20/+82
\|\ \| \| \| \| \| \| \| \|	Add basic pagination to search results. See merge request webgroup/fatcat!4
\| *	address test issue	Martin Czygan	2019-11-15	1	-2/+3
\| \|
\| *	adjust search test case for new wording	Martin Czygan	2019-11-14	1	-2/+2
\| \| \| \| \| \| \| \|	> "Showing top " -> "Showing first "
\| *	gray out inactive navigation links	Martin Czygan	2019-11-14	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As per [this issue](https://github.com/Semantic-Org/Semantic-UI/issues/1885#issuecomment-77619519), text colors are not supported in semantic ui. To not move text too much, gray out inactive links.
\| *	move pagination into macros	Martin Czygan	2019-11-14	3	-43/+51
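\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Two new macros:
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* top_results(found)
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* bottom_results(found)
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	wip: move pagination into macro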
\| *	Add basic pagination to search results	Martin Czygan	2019-11-08	4	-14/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The "deep paging problem" imposes some limit, which currently is a hardcoded default value, `deep_page_limit=2000` in `do_search`. Elasticsearch can be configured, too: > Note that from + size can not be more than the index.max_result_window index setting, which defaults to 10,000. -- https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-body.html#request-body-search-from-size
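The limit described above can be sketched as a small clamp on the requested offset. This is an illustration under assumptions: only `deep_page_limit=2000` and `do_search` appear in the commit message. The helper name `page_window`, the default page size of 30, and the returned flag are hypothetical:

```python
def page_window(offset, limit=30, deep_page_limit=2000):
    """Return (from, size, limit_hit) for an Elasticsearch from/size query."""
    offset = max(0, offset)
    if offset + limit > deep_page_limit:
        # Avoid the deep paging problem: clamp to the last allowed window.
        offset = max(0, deep_page_limit - limit)
        return offset, limit, True
    return offset, limit, False


first_page = page_window(0)
too_deep = page_window(5000)
```

Here `first_page` is `(0, 30, False)` and `too_deep` is `(1970, 30, True)`. A caller could use the flag to tell the user that results past the limit are not shown.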
* \|	web: catch MacaroonInitException	Bryan Newbold	2019-11-12	1	-0/+4
\| \| \| \| \| \| \| \| \| \|	Caught one of these in sentry. Probably due to a crawler? Or typing gibberish in the token form.
* \|	design notes for a larger database	Bryan Newbold	2019-11-12	1	-0/+81
\| \|
* \|	old proposals for 'next' schema update	Bryan Newbold	2019-11-12	1	-0/+38
\| \|
* \|	crossref patch bulk import	Bryan Newbold	2019-11-12	2	-0/+63
\| \|
* \|	Merge branch 'martin-python-readme-es-note' into 'master'	bnewbold	2019-11-08	1	-0/+5
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \|	mention elasticsearch empty index setup. See merge request webgroup/fatcat!3
\| * \|	mention elasticsearch empty index setup	Martin Czygan	2019-11-08	1	-0/+5
\| \|/ \| \| \| \| \| \| \| \| \| \|	When setting up with the defaults, everything works fine, except that the web search will try to access a local elasticsearch. Mention in the README how to create empty indices.
* \|	crossref: accurate blank title counts	Bryan Newbold	2019-11-05	1	-0/+1
\| \|
* \|	fix crossref component test	Bryan Newbold	2019-11-04	1	-1/+1
\| \|
* \|	TODO idea: 'first seen'	Bryan Newbold	2019-11-04	1	-0/+1
\| \|
* \|	crossref: component type	Bryan Newbold	2019-11-04	1	-1/+3
\| \|
* \|	add 'component' as a release_type	Bryan Newbold	2019-11-04	2	-0/+3
\| \|
* \|	crossref: count why skip happened	Bryan Newbold	2019-11-04	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \|	Might skip based on release type (eg container, not a paper/release), or missing title, or other reasons. Over 7 million DOIs are getting skipped, curious why.
* \|	crossref: don't skip on short/null subtitle	Bryan Newbold	2019-11-04	1	-1/+1
\|/ \| \| \|	This was a bug. Should only set subtitle blank, not skip the import.
*	note file fixup pushed in prod	Bryan Newbold	2019-10-09	2	-1/+64
\|
*	move corpus changes to 'notes/bulk_edits'	Bryan Newbold	2019-10-08	3	-0/+285
\|