Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | iterate on access redirects and landing page implementation | Bryan Newbold | 2021-04-27 | 1 | -4/+7 |
| | | | | Small code refactors and minimal test coverage | ||||
* | web: initial implementation of work landing page and citation_pdf_url access redirect | Bryan Newbold | 2021-04-23 | 1 | -1/+37 |
| | | | | | | | | | | | | The initial intent is to have something that can be used by indexing services to pull the citation_pdf_url meta tag and bounce to a direct IA PDF access URL. For now the landing page stubs are just formatted as SERP results. Presumably these will get re-styled at some point and include citation graph links, etc. | ||||
* | search: more aggressively skip fuzzy match exceptions | Bryan Newbold | 2021-04-12 | 1 | -5/+5 |
| | |||||
* | health check: use /<index>/_count endpoint; verify shards | Bryan Newbold | 2021-04-06 | 1 | -7/+12 |
| | | | | | In actual production verification, the /_mapping endpoint didn't seem to work. | ||||
* | change health check from .exists(index) to .mapping(index) | Bryan Newbold | 2021-04-06 | 1 | -4/+13 |
| | | | | | | | | | | | | In cases where the cluster leader node is unavailable, the health check was returning false even when the local node had full shard replicas and could serve requests. A refinement of this change would be to use the /<index>/_count API endpoint to ensure that the "failed" and "skipped" shard numbers are 0 (aka, "successful == total"). However, not sure where that endpoint is exposed in the elasticsearch-py API. The CatClient method doesn't seem right. | ||||
* | make fmt | Bryan Newbold | 2021-03-29 | 1 | -0/+1 |
| | |||||
* | web and API health check endpoint | Bryan Newbold | 2021-03-29 | 1 | -0/+14 |
| | | | | | | Because scholar is primarily a search service, the endpoint does a pass-through health check to the elasticsearch backend (aka, es-public-proxy). | ||||
* | Revert undesirable changes | Christian Clauss | 2021-02-23 | 1 | -1/+1 |
| | |||||
* | Modernize Python syntax with pyupgrade --py38-plus **/*.py | Christian Clauss | 2021-02-23 | 1 | -2/+2 |
| | |||||
* | refactor ES configuration setting names | Bryan Newbold | 2021-01-25 | 1 | -2/+2 |
| | |||||
* | add permalink icon/link | Bryan Newbold | 2021-01-21 | 1 | -0/+2 |
| | |||||
* | add citation query feature (disabled by default) | Bryan Newbold | 2021-01-19 | 1 | -14/+69 |
| | | | | | | This is operationally complex (queries hit 3x backend services!), so not enabled by default. Will need more testing; possibly circuit-breaking. Though haproxy should provide some of that automatically at this point. | ||||
* | lint: fix small bugs and type annotations | Bryan Newbold | 2021-01-18 | 1 | -1/+1 |
| | |||||
* | search: parse and embed a copy of ScholarDoc object in results | Bryan Newbold | 2021-01-14 | 1 | -1/+6 |
| | | | | Maybe should refactor this to simply replace the object? Hrm. | ||||
* | search: show fewer, shorter highlights. sort by score. | Bryan Newbold | 2021-01-14 | 1 | -1/+2 |
| | |||||
* | work around mypy complaint about exception union type | Bryan Newbold | 2020-12-22 | 1 | -1/+2 |
| | |||||
* | remove minor unused imports | Bryan Newbold | 2020-10-22 | 1 | -1/+0 |
| | |||||
* | improve search logging and exception chaining | Bryan Newbold | 2020-10-21 | 1 | -5/+6 |
| | |||||
* | refactor do_fulltext_search into smaller methods | Bryan Newbold | 2020-10-16 | 1 | -52/+70 |
| | |||||
* | Upgrade Dynaconf to 3+ | Bruno Rocha | 2020-10-05 | 1 | -1/+1 |
| | | | | | | In Dynaconf 3+, using `from dynaconf import settings` is no longer recommended; the recommendation now is to create your own instance of the settings object based on the `Dynaconf` class. | ||||
* | search: handle direct DOI and PMCID queries | Bryan Newbold | 2020-09-17 | 1 | -9/+16 |
| | | | | | | If query is a single token which looks like a valid PMCID or DOI, with no surrounding quotes, then expand scope and filter to that single external identifier. | ||||
* | use container_name, not container_ident, in boost | Bryan Newbold | 2020-08-12 | 1 | -1/+1 |
| | | | | | This should result in SIM page fulltext matches not getting pushed down as much, as well as things like biorxiv (*rxiv) results. | ||||
* | fmt/lint tweaks | Bryan Newbold | 2020-08-12 | 1 | -5/+2 |
| | |||||
* | search: include 'article' in papers filter | Bryan Newbold | 2020-08-12 | 1 | -1/+1 |
| | |||||
* | search: use simplified query for highlighting | Bryan Newbold | 2020-08-12 | 1 | -1/+8 |
| | | | | | | | | This fixes broken phrase query highlighting. I found this issue, but it may be unrelated: https://github.com/elastic/elasticsearch/issues/40227 | ||||
* | re-use ES sync API client | Bryan Newbold | 2020-08-06 | 1 | -3/+4 |
| | |||||
* | report ES API query time as server-timing header | Bryan Newbold | 2020-08-06 | 1 | -0/+4 |
| | |||||
* | add debug mode flag (to control json tag/link) | Bryan Newbold | 2020-08-06 | 1 | -0/+1 |
| | |||||
* | make fmt | Bryan Newbold | 2020-08-06 | 1 | -14/+14 |
| | |||||
* | microfilm access filter; broader access matching | Bryan Newbold | 2020-08-06 | 1 | -3/+6 |
| | |||||
* | fix acknowledgement highlighting (typo) | Bryan Newbold | 2020-08-06 | 1 | -1/+1 |
| | |||||
* | reduce title boost; use only base query for highlighting | Bryan Newbold | 2020-08-06 | 1 | -1/+2 |
| | |||||
* | special case '*' queries | Bryan Newbold | 2020-08-06 | 1 | -6/+16 |
| | | | | | More/better query parsing in the client could detect if this was a "filter only" query and do the same kind of optimization. | ||||
* | remove 'title' from poor metadata scoring | Bryan Newbold | 2020-08-06 | 1 | -1/+0 |
| | |||||
* | better time ranges (don't search future) | Bryan Newbold | 2020-08-06 | 1 | -4/+7 |
| | |||||
* | add title back to match query | Bryan Newbold | 2020-08-06 | 1 | -0/+1 |
| | |||||
* | query fewer fields; highlight all fulltext fields regardless of match | Bryan Newbold | 2020-08-06 | 1 | -3/+1 |
| | |||||
* | search tweaks to be forwards-compatible with ES 7.x | Bryan Newbold | 2020-08-06 | 1 | -2/+10 |
| | | | | | | When we fully commit to ES 7.x we should upgrade the client library correspondingly, and then can remove these work-arounds. But for now we have one instance of ES 6.x and one ES 7.x. | ||||
* | extend ES client timeout to 25 seconds | Bryan Newbold | 2020-08-06 | 1 | -1/+1 |
| | |||||
* | Revert "remove duplicate fulltext search from query" | Bryan Newbold | 2020-07-30 | 1 | -0/+1 |
| | | | | | | This reverts commit 0d3fd83493c7307a2b9593c7add90b8b6f4b4152. Seems like we do need to query on this field for highlighting to work. | ||||
* | include container_ident in metadata completeness boost | Bryan Newbold | 2020-07-28 | 1 | -0/+1 |
| | |||||
* | search: smaller default result set | Bryan Newbold | 2020-07-27 | 1 | -1/+1 |
| | |||||
* | remove duplicate fulltext search from query | Bryan Newbold | 2020-07-27 | 1 | -1/+0 |
| | | | | | | May also remove the 'title' and 'abstracts' searches, though they currently help with boosting; will want to measure the actual performance difference before making that change. | ||||
* | search: tweak 'past week' date range to not include future | Bryan Newbold | 2020-07-27 | 1 | -2/+4 |
| | |||||
* | include fulltext acknowledgements in highlighting | Bryan Newbold | 2020-07-21 | 1 | -0/+1 |
| | |||||
* | fix search filter bug (papers is default) | Bryan Newbold | 2020-06-29 | 1 | -2/+2 |
| | |||||
* | make fmt | Bryan Newbold | 2020-06-29 | 1 | -3/+3 |
| | |||||
* | note about highlight encoding in ES 7.x | Bryan Newbold | 2020-06-29 | 1 | -0/+2 |
| | |||||
* | un-collapse only to same issue, not uncollapse-all-hits | Bryan Newbold | 2020-06-29 | 1 | -9/+15 |
| | | | | | This matches user expectation, and was a lingering TODO from the initial implementation. | ||||
* | fix search order default label | Bryan Newbold | 2020-06-29 | 1 | -1/+1 |
| | | | | Thanks for the catch Alexis R! |
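
The 2021-04-06 health check commits describe verifying shards through the /<index>/_count endpoint ("successful == total", with no failed or skipped shards). The following is a minimal sketch of that check, not the code from this repository. It assumes the response body has the usual Elasticsearch `_count` shape, with a `_shards` object holding `total`, `successful`, `skipped`, and `failed`; the function name `count_response_is_healthy` is hypothetical.

```python
from typing import Any, Mapping


def count_response_is_healthy(resp: Mapping[str, Any]) -> bool:
    """Return True only if every shard answered the _count request.

    `resp` is assumed to be the parsed JSON body of a
    GET /<index>/_count call (hypothetical helper, for illustration).
    """
    shards = resp.get("_shards")
    if not isinstance(shards, Mapping):
        # No shard information at all: treat as unhealthy.
        return False
    total = shards.get("total", 0)
    return (
        total > 0
        and shards.get("successful", 0) == total
        and shards.get("failed", 0) == 0
        and shards.get("skipped", 0) == 0
    )


if __name__ == "__main__":
    ok = {"count": 10, "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0}}
    degraded = {"count": 7, "_shards": {"total": 5, "successful": 4, "skipped": 0, "failed": 1}}
    print(count_response_is_healthy(ok))
    print(count_response_is_healthy(degraded))
```

Keeping the check as a pure function over the parsed response means it can be tested without a live cluster; how the response is fetched (for example through the client's `count()` call) is left out here on purpose.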