fatcat-scholar

	Commit message	Author	Age	Files	Lines
*	make fmt	Bryan Newbold	2021-05-17	1	-1/+4
\|
*	iterate on PDF redirect links	Bryan Newbold	2021-05-17	1	-1/+1
\|
*	web: don't clobber user input query when parsing	Bryan Newbold	2021-04-30	1	-3/+4
\| \| \| \| \| \| \|	This is intended to be a UX improvement, to avoid adding double quotes around the query a user has pasted in. This does make the "parsing" behavior less transparent.
*	iterate on access redirects and landing page implementation	Bryan Newbold	2021-04-27	1	-4/+7
\| \| \| \|	Small code refactors and minimal test coverage
*	web: initial implementation of work landing page and citation_pdf_url access redirect	Bryan Newbold	2021-04-23	1	-1/+37
\| \| \| \| \| \| \| \| \| \| \| \|	The initial intent is to have something that can be used by indexing services to pull the citation_pdf_url meta tag and bounce to a direct IA PDF access URL. For now the landing page stubs are just formatted as SERP results. Presumably these will get re-styled at some point and include citation graph links, etc.
*	search: more aggressively skip fuzzy match exceptions	Bryan Newbold	2021-04-12	1	-5/+5
\|
*	health check: use /<index>/_count endpoint; verify shards	Bryan Newbold	2021-04-06	1	-7/+12
\| \| \| \| \|	In actual production verification, the /_mapping endpoint didn't seem to work.
*	change health check from .exists(index) to .mapping(index)	Bryan Newbold	2021-04-06	1	-4/+13
\| \| \| \| \| \| \| \| \| \| \| \|	In cases where the cluster leader node is unavailable, the health check was returning false even when the local node had full shard replicas and could return requests. A refinement of this change would be to use the /<index>/_count API endpoint to ensure that the "failed" and "skipped" shard numbers are 0 (aka, "successful == total"). However, not sure where that endpoint is exposed in the elasticsearch-py API. The CatClient method doesn't seem right.
*	make fmt	Bryan Newbold	2021-03-29	1	-0/+1
\|
*	web and API health check endpoint	Bryan Newbold	2021-03-29	1	-0/+14
\| \| \| \| \| \|	Because scholar is primarily a search service, the endpoint does a pass-through health check to the elasticsearch backend (aka, es-public-proxy).
*	Revert undesirable changes	Christian Clauss	2021-02-23	1	-1/+1
\|
*	Modernize Python syntax with pyupgrade --py38-plus */.py	Christian Clauss	2021-02-23	1	-2/+2
\|
*	refactor ES configuration setting names	Bryan Newbold	2021-01-25	1	-2/+2
\|
*	add permalink icon/link	Bryan Newbold	2021-01-21	1	-0/+2
\|
*	add citation query feature (disabled by default)	Bryan Newbold	2021-01-19	1	-14/+69
\| \| \| \| \| \|	This is operationally complex (queries hit 3x backend services!), so not enabled by default. Will need more testing; possibly circuit-breaking. Though haproxy should provide some of that automatically at this point.
*	lint: fix small bugs and type annotations	Bryan Newbold	2021-01-18	1	-1/+1
\|
*	search: parse and embed a copy of ScholarDoc object in results	Bryan Newbold	2021-01-14	1	-1/+6
\| \| \| \|	Maybe should refactor this to simply replace the object? Hrm.
*	search: show fewer, shorter highlights. sort by score.	Bryan Newbold	2021-01-14	1	-1/+2
\|
*	work around mypy complaint about exception union type	Bryan Newbold	2020-12-22	1	-1/+2
\|
*	remove minor unused imports	Bryan Newbold	2020-10-22	1	-1/+0
\|
*	improve search logging and exception chaining	Bryan Newbold	2020-10-21	1	-5/+6
\|
*	refactor do_fulltext_search into smaller methods	Bryan Newbold	2020-10-16	1	-52/+70
\|
*	Upgrade Dynaconf to 3+	Bruno Rocha	2020-10-05	1	-1/+1
\| \| \| \| \| \|	In Dynaconf 3+ it is no longer recommended to use `from dynaconf import settings`; the recommendation now is to create your own instance of the settings object based on the Dynaconf class.
*	search: handle direct DOI and PMCID queries	Bryan Newbold	2020-09-17	1	-9/+16
\| \| \| \| \| \|	If the query is a single token which looks like a valid PMCID or DOI, with no surrounding quotes, then expand scope and filter to that single external identifier.
*	use container_name, not container_ident, in boost	Bryan Newbold	2020-08-12	1	-1/+1
\| \| \| \| \|	This should result in SIM page fulltext matches not getting pushed down as much, as well as things like biorxiv (*rxiv) results.
*	fmt/lint tweaks	Bryan Newbold	2020-08-12	1	-5/+2
\|
*	search: include 'article' in papers filter	Bryan Newbold	2020-08-12	1	-1/+1
\|
*	search: use simplified query for highlighting	Bryan Newbold	2020-08-12	1	-1/+8
\| \| \| \| \| \| \| \|	This fixes broken phrase query highlighting. I found this issue, but it may be unrelated: https://github.com/elastic/elasticsearch/issues/40227
*	re-use ES sync API client	Bryan Newbold	2020-08-06	1	-3/+4
\|
*	report ES API query time as server-timing header	Bryan Newbold	2020-08-06	1	-0/+4
\|
*	add debug mode flag (to control json tag/link)	Bryan Newbold	2020-08-06	1	-0/+1
\|
*	make fmt	Bryan Newbold	2020-08-06	1	-14/+14
\|
*	microfilm access filter; broader access matching	Bryan Newbold	2020-08-06	1	-3/+6
\|
*	fix acknowledgement highlighting (typo)	Bryan Newbold	2020-08-06	1	-1/+1
\|
*	reduce title boost; use only base query for highlighting	Bryan Newbold	2020-08-06	1	-1/+2
\|
*	special case '*' queries	Bryan Newbold	2020-08-06	1	-6/+16
\| \| \| \| \|	More/better query parsing in the client could detect if this was a "filter only" query and do the same kind of optimization.
*	remove 'title' from poor metadata scoring	Bryan Newbold	2020-08-06	1	-1/+0
\|
*	better time ranges (don't search future)	Bryan Newbold	2020-08-06	1	-4/+7
\|
*	add title back to match query	Bryan Newbold	2020-08-06	1	-0/+1
\|
*	query fewer fields; highlight all fulltext fields regardless of match	Bryan Newbold	2020-08-06	1	-3/+1
\|
*	search tweaks to be forwards-compatible with ES 7.x	Bryan Newbold	2020-08-06	1	-2/+10
\| \| \| \| \| \|	When we fully commit to ES 7.x we should upgrade the client library correspondingly, and then can remove these work-arounds. But for now we have one instance of ES 6.x and one ES 7.x.
*	extend ES client timeout to 25 seconds	Bryan Newbold	2020-08-06	1	-1/+1
\|
*	Revert "remove duplicate fulltext search from query"	Bryan Newbold	2020-07-30	1	-0/+1
\| \| \| \| \| \|	This reverts commit 0d3fd83493c7307a2b9593c7add90b8b6f4b4152. Seems like we do need to query on this field for highlighting to work.
*	include container_ident in metadata completeness boost	Bryan Newbold	2020-07-28	1	-0/+1
\|
*	search: smaller default result set	Bryan Newbold	2020-07-27	1	-1/+1
\|
*	remove duplicate fulltext search from query	Bryan Newbold	2020-07-27	1	-1/+0
\| \| \| \| \| \|	May also remove the 'title' and 'abstracts' searches, though they currently help with boosting; will want to measure the actual performance difference before making that change.
*	search: tweak 'past week' date range to not include future	Bryan Newbold	2020-07-27	1	-2/+4
\|
*	include fulltext acknowledgements in highlighting	Bryan Newbold	2020-07-21	1	-0/+1
\|
*	fix search filter bug (papers is default)	Bryan Newbold	2020-06-29	1	-2/+2
\|
*	make fmt	Bryan Newbold	2020-06-29	1	-3/+3
\|
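
The two health check commits from 2021-04-06 describe treating the index as healthy only when a /<index>/_count request reports no failed or skipped shards ("successful == total"). A minimal sketch of that check follows. It is not the code from the repository: the function name count_response_is_healthy is hypothetical, and it assumes the response body has already been parsed into a dict with the "_shards" object that the Elasticsearch count API returns. How the request itself is made is left out.

```python
from typing import Any, Mapping


def count_response_is_healthy(resp: Mapping[str, Any]) -> bool:
    """Return True only if every shard answered the _count request.

    Hypothetical helper, not taken from fatcat-scholar. `resp` is assumed
    to be the parsed JSON body of a GET /<index>/_count call.
    """
    shards = resp.get("_shards")
    if not isinstance(shards, Mapping):
        # No shard report at all: treat as unhealthy rather than guess.
        return False
    total = shards.get("total", 0)
    successful = shards.get("successful", 0)
    failed = shards.get("failed", 0)
    skipped = shards.get("skipped", 0)
    # "successful == total", with no failed or skipped shards.
    return total > 0 and successful == total and failed == 0 and skipped == 0


if __name__ == "__main__":
    ok = {
        "count": 3,
        "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
    }
    degraded = {
        "count": 3,
        "_shards": {"total": 5, "successful": 4, "skipped": 0, "failed": 1},
    }
    print(count_response_is_healthy(ok))
    print(count_response_is_healthy(degraded))
```

Keeping the shard check separate from the HTTP call means it can be exercised with canned responses, without a running cluster.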