fatcat - [no description]

	Commit message (Collapse)	Author	Age	Files	Lines
*	have ingest-file-results importer operate as crawl-bot	Bryan Newbold	2019-11-15	1	-1/+1
\| \| \| \|	As opposed to sandcrawler-bot
*	fix release.pmcid typo	Bryan Newbold	2019-11-15	1	-2/+2
\|
*	better ingest-file-results import name	Bryan Newbold	2019-11-15	1	-1/+1
\|
*	ingest importer fixes	Bryan Newbold	2019-11-15	1	-3/+4
\|
*	more ingest importer comments and counts	Bryan Newbold	2019-11-15	2	-2/+29
\|
*	crude support for 'sandcrawler' kafka topics	Bryan Newbold	2019-11-15	1	-2/+3
\|
*	ingest file result importer	Bryan Newbold	2019-11-15	5	-2/+228
\|
*	test for ingest transform	Bryan Newbold	2019-11-15	1	-0/+57
\|
*	add ingest request feature to entity_updates worker	Bryan Newbold	2019-11-15	2	-4/+22
\| \| \| \| \| \| \| \| \| \| \| \| \|	Initially was going to create a new worker to consume from the release update channel, but couldn't get the edit context ("is this a new release, or an update to an existing one?") from that channel. Currently there is a flag in source code to control whether we only do OA releases or all releases. Starting with OA only to ramp up slowly, but should probably default to all, and make this a config flag. Should probably also have a config flag to control this entire feature. Tested locally in dev.
*	add ingest request transform (and test)	Bryan Newbold	2019-11-15	3	-1/+68
\|
*	Merge branch 'martin-search-results-pagination' into 'master'	Martin Czygan	2019-11-15	6	-20/+82
\|\ \| \| \| \| \| \| \| \|	Add basic pagination to search results. See merge request webgroup/fatcat!4
\| *	address test issue	Martin Czygan	2019-11-15	1	-2/+3
\| \|
\| *	adjust search test case for new wording	Martin Czygan	2019-11-14	1	-2/+2
\| \| \| \| \| \| \| \|	> "Showing top " -> "Showing first "
\| *	gray out inactive navigation links	Martin Czygan	2019-11-14	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As per [this issue](https://github.com/Semantic-Org/Semantic-UI/issues/1885#issuecomment-77619519), text colors are not supported in Semantic UI. To avoid moving text too much, gray out inactive links.
\| *	move pagination into macros	Martin Czygan	2019-11-14	3	-43/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Two new macros: top_results(found) and bottom_results(found). wip: move pagination into macro
\| *	Add basic pagination to search results	Martin Czygan	2019-11-08	4	-14/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The "deep paging problem" imposes some limit, which is currently a hardcoded default value, `deep_page_limit=2000` in `do_search`. Elasticsearch can be configured, too: "Note that from + size can not be more than the index.max_result_window index setting, which defaults to 10,000." (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-body.html#request-body-search-from-size)
* \|	web: catch MacaroonInitException	Bryan Newbold	2019-11-12	1	-0/+4
\| \| \| \| \| \| \| \| \| \|	Caught one of these in sentry. Probably due to a crawler? Or typing gibberish in the token form.
* \|	Merge branch 'martin-python-readme-es-note' into 'master'	bnewbold	2019-11-08	1	-0/+5
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \|	mention elasticsearch empty index setup. See merge request webgroup/fatcat!3
\| * \|	mention elasticsearch empty index setup	Martin Czygan	2019-11-08	1	-0/+5
\| \|/ \| \| \| \| \| \| \| \| \| \|	When setting up with the defaults, everything works fine, except that the web search will try to access a local elasticsearch. Mention in the README how to create empty indices.
* \|	crossref: accurate blank title counts	Bryan Newbold	2019-11-05	1	-0/+1
\| \|
* \|	fix crossref component test	Bryan Newbold	2019-11-04	1	-1/+1
\| \|
* \|	crossref: component type	Bryan Newbold	2019-11-04	1	-1/+3
\| \|
* \|	crossref: count why skip happened	Bryan Newbold	2019-11-04	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \|	Might skip based on release type (eg container, not a paper/release), or missing title, or other reasons. Over 7 million DOIs are getting skipped, curious why.
* \|	crossref: don't skip on short/null subtitle	Bryan Newbold	2019-11-04	1	-1/+1
\|/ \| \| \|	This was a bug. Should only set subtitle blank, not skip the import.
*	commit file cleaner tests	Bryan Newbold	2019-10-08	1	-0/+58
\|
*	file cleanup tweaks to actually run	Bryan Newbold	2019-10-08	2	-5/+4
\|
*	refactor duplicated b32_hex function in importers	Bryan Newbold	2019-10-08	3	-21/+11
\|
*	dict wrapper for entity_from_json()	Bryan Newbold	2019-10-08	2	-3/+7
\|
*	new cleanup python tool/framework	Bryan Newbold	2019-10-08	5	-0/+300
\|
*	redirect direct entity underscore links	Bryan Newbold	2019-10-03	2	-0/+30
\|
*	webface: extra <br> in container lookup links	Bryan Newbold	2019-09-21	1	-1/+1
\|
*	remove duplicate style ref in container edit view	Bryan Newbold	2019-09-20	1	-5/+0
\|
*	review/fix all confluent-kafka produce code	Bryan Newbold	2019-09-20	6	-27/+75
\|
*	small fixes to confluent-kafka importers/workers	Bryan Newbold	2019-09-20	8	-26/+69
\| \| \| \| \| \| \| \|	decrease default changelog pipeline to 5.0sec; fix missing KafkaException harvester imports; more confluent-kafka tweaks; updates to kafka consumer configs; bump elastic updates consumergroup (again)
*	update Pipfile.lock after confluent-kafka rebase	Bryan Newbold	2019-09-20	1	-1/+33
\|
*	convert pipeline workers from pykafka to confluent-kafka	Bryan Newbold	2019-09-20	3	-125/+230
\|
*	small kafka tweaks for robustness	Bryan Newbold	2019-09-20	2	-0/+5
\|
*	convert importers to confluent-kafka library	Bryan Newbold	2019-09-20	2	-21/+74
\|
*	bump max message size to ~20 MBytes	Bryan Newbold	2019-09-20	2	-0/+2
\|
*	fixes to confluent-kafka harvesters	Bryan Newbold	2019-09-20	3	-20/+21
\|
*	first draft harvesters using confluent-kafka	Bryan Newbold	2019-09-20	3	-48/+104
\|
*	make default kafka env 'dev', not 'qa'	Bryan Newbold	2019-09-20	2	-4/+4
\|
*	add confluent-kafka library (to replace pykafka)	Bryan Newbold	2019-09-20	1	-0/+1
\|
*	handle more external identifiers in python	Bryan Newbold	2019-09-18	2	-14/+101
\| \| \| \| \|	This makes it possible to, eg, paste an arxiv identifier or SHA-1 hash in the general search box and do a quick lookup.
*	webface: fix duration_seconds parsing	Bryan Newbold	2019-09-18	1	-1/+1
\|
*	add guide editing links to edit forms and signup message	Bryan Newbold	2019-09-18	5	-5/+26
\|
*	python webface impl token generation	Bryan Newbold	2019-09-18	4	-1/+85
\|
*	slightly less annoying 'flash' message header	Bryan Newbold	2019-09-18	1	-1/+1
\|
*	remove '@' from archive.org ident	Bryan Newbold	2019-09-17	1	-1/+1
\|
*	IA auth: use itemname not screenname for username	Bryan Newbold	2019-09-17	1	-1/+1
\| \| \| \| \| \| \|	Have run into several issues with IA screennames being invalid fatcat usernames (eg, containing whitespace). This probably won't catch all such issues, but hopefully most of them.
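
The "Add basic pagination to search results" commit in the log above mentions a hardcoded `deep_page_limit=2000` in `do_search`, chosen to stay under the Elasticsearch `index.max_result_window` default of 10,000. A minimal sketch of how such a cap might be applied to a requested offset and page size follows. The function name `clamp_pagination` and the default page size of 25 are assumptions made for illustration; this is not the actual fatcat implementation.

```python
def clamp_pagination(offset, limit=25, deep_page_limit=2000):
    """Clamp (offset, limit) so that offset + limit <= deep_page_limit.

    Hypothetical helper: `deep_page_limit` mirrors the value named in the
    commit message; the default `limit` of 25 is an assumption.
    """
    # Normalize obviously invalid input first.
    limit = max(1, min(int(limit), deep_page_limit))
    offset = max(0, int(offset))
    # Elasticsearch rejects requests where from + size exceeds
    # index.max_result_window, so pull the offset back to the last
    # window that still fits under the (lower) application limit.
    if offset + limit > deep_page_limit:
        offset = deep_page_limit - limit
    return offset, limit
```

For example, `clamp_pagination(5000, 25)` would return the last allowed window, `(1975, 25)`, rather than sending Elasticsearch a request past the cap.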