| | Commit message | Author | Age | Files | Lines |
---|---|---|---|---|---|
* | search: update 'Metadata' availability to 'All Records' | Bryan Newbold | 2022-04-06 | 1 | -1/+1 |
| | |||||
* | bugfix: elasticsearch per-request timeout for _health (arg name) | Bryan Newbold | 2022-02-14 | 1 | -1/+1 |
| | |||||
* | increase ES default timeout to 50sec, and _health specifically to 90sec | Bryan Newbold | 2022-02-14 | 1 | -2/+4 |
| | This is because we are getting lots of alert chunder on the health check. It might be better to revisit which endpoint is being checked... 'count' is usually fast, but might be slow during bulk indexing. | | | | |
* | fix before_1927 query filter typo | Bryan Newbold | 2022-01-18 | 1 | -1/+1 |
| | |||||
* | elasticsearch: bump query timeout to 40 seconds (from 25) | Bryan Newbold | 2022-01-10 | 1 | -1/+1 |
| | |||||
* | move public domain wall to 1926 ('before 1927') | Bryan Newbold | 2022-01-05 | 1 | -3/+4 |
| | |||||
* | lint: small cleanups, mostly E711 and E713 | Bryan Newbold | 2021-10-27 | 1 | -1/+1 |
| | |||||
* | lint: remove all 'import *' uses | Bryan Newbold | 2021-10-27 | 1 | -1/+1 |
| | |||||
* | make fmt (black 21.9b0) | Bryan Newbold | 2021-10-27 | 1 | -5/+27 |
| | |||||
* | re-style imports (isort) on all core python files | Bryan Newbold | 2021-10-27 | 1 | -9/+10 |
| | |||||
* | ES: add 'preference' query param; default to '_local' in prod | Bryan Newbold | 2021-08-03 | 1 | -0/+3 |
| | |||||
* | update access redirect URL endpoints | Bryan Newbold | 2021-06-11 | 1 | -24/+1 |
| | |||||
* | make fmt | Bryan Newbold | 2021-05-17 | 1 | -1/+4 |
| | |||||
* | iterate on PDF redirect links | Bryan Newbold | 2021-05-17 | 1 | -1/+1 |
| | |||||
* | web: don't clobber user input query when parsing | Bryan Newbold | 2021-04-30 | 1 | -3/+4 |
| | This is intended to be a UX improvement, to avoid adding double quotes around the query a user has pasted in. This does make the "parsing" behavior less transparent. | | | | |
* | iterate on access redirects and landing page implementation | Bryan Newbold | 2021-04-27 | 1 | -4/+7 |
| | Small code refactors and minimal test coverage | | | | |
* | web: initial implementation of work landing page and citation_pdf_url access redirect | Bryan Newbold | 2021-04-23 | 1 | -1/+37 |
| | The initial intent is to have something that can be used by indexing services to pull the citation_pdf_url meta tag and bounce to a direct IA PDF access URL. For now the landing page stubs are just formatted as SERP results. Presumably these will get re-styled at some point and include citation graph links, etc. | | | | |
* | search: more aggressively skip fuzzy match exceptions | Bryan Newbold | 2021-04-12 | 1 | -5/+5 |
| | |||||
* | health check: use /<index>/_count endpoint; verify shards | Bryan Newbold | 2021-04-06 | 1 | -7/+12 |
| | In actual production verification, the /_mapping endpoint didn't seem to work. | | | | |
* | change health check from .exists(index) to .mapping(index) | Bryan Newbold | 2021-04-06 | 1 | -4/+13 |
| | In cases where the cluster leader node is unavailable, the health check was returning false even when the local node had full shard replicas and could return requests. A refinement of this change would be to use the /<index>/_count API endpoint to ensure that the "failed" and "skipped" shard numbers are 0 (aka, "successful == total"). However, not sure where that endpoint is exposed in the elasticsearch-py API. The CatClient method doesn't seem right. | | | | |
* | make fmt | Bryan Newbold | 2021-03-29 | 1 | -0/+1 |
| | |||||
* | web and API health check endpoint | Bryan Newbold | 2021-03-29 | 1 | -0/+14 |
| | Because scholar is primarily a search service, the endpoint does a pass-through health check to the elasticsearch backend (aka, es-public-proxy). | | | | |
* | Revert undesirable changes | Christian Clauss | 2021-02-23 | 1 | -1/+1 |
| | |||||
* | Modernize Python syntax with pyupgrade --py38-plus **/*.py | Christian Clauss | 2021-02-23 | 1 | -2/+2 |
| | |||||
* | refactor ES configuration setting names | Bryan Newbold | 2021-01-25 | 1 | -2/+2 |
| | |||||
* | add permalink icon/link | Bryan Newbold | 2021-01-21 | 1 | -0/+2 |
| | |||||
* | add citation query feature (disabled by default) | Bryan Newbold | 2021-01-19 | 1 | -14/+69 |
| | This is operationally complex (queries hit 3x backend services!), so not enabled by default. Will need more testing; possibly circuit-breaking. Though haproxy should provide some of that automatically at this point. | | | | |
* | lint: fix small bugs and type annotations | Bryan Newbold | 2021-01-18 | 1 | -1/+1 |
| | |||||
* | search: parse and embed a copy of ScholarDoc object in results | Bryan Newbold | 2021-01-14 | 1 | -1/+6 |
| | Maybe should refactor this to simply replace the object? Hrm. | | | | |
* | search: show fewer, shorter highlights. sort by score. | Bryan Newbold | 2021-01-14 | 1 | -1/+2 |
| | |||||
* | work around mypy complaint about exception union type | Bryan Newbold | 2020-12-22 | 1 | -1/+2 |
| | |||||
* | remove minor unused imports | Bryan Newbold | 2020-10-22 | 1 | -1/+0 |
| | |||||
* | improve search logging and exception chaining | Bryan Newbold | 2020-10-21 | 1 | -5/+6 |
| | |||||
* | refactor do_fulltext_search into smaller methods | Bryan Newbold | 2020-10-16 | 1 | -52/+70 |
| | |||||
* | Upgrade Dynaconf to 3+ | Bruno Rocha | 2020-10-05 | 1 | -1/+1 |
| | In dynaconf 3+ it is no longer recommended to use `from dynaconf import settings`; the recommendation now is to create your own instance of the settings object based on the Dynaconf class. | | | | |
* | search: handle direct DOI and PMCID queries | Bryan Newbold | 2020-09-17 | 1 | -9/+16 |
| | If query is a single token which looks like a valid PMCID or DOI, with no surrounding quotes, then expand scope and filter to that single external identifier. | | | | |
* | use container_name, not container_ident, in boost | Bryan Newbold | 2020-08-12 | 1 | -1/+1 |
| | This should result in SIM page fulltext matches not getting pushed down as much, as well as things like biorxiv (*rxiv) results. | | | | |
* | fmt/lint tweaks | Bryan Newbold | 2020-08-12 | 1 | -5/+2 |
| | |||||
* | search: include 'article' in papers filter | Bryan Newbold | 2020-08-12 | 1 | -1/+1 |
| | |||||
* | search: use simplified query for highlighting | Bryan Newbold | 2020-08-12 | 1 | -1/+8 |
| | This fixes broken phrase query highlighting. I found this issue, but it may have been unrelated: https://github.com/elastic/elasticsearch/issues/40227 | | | | |
* | re-use ES sync API client | Bryan Newbold | 2020-08-06 | 1 | -3/+4 |
| | |||||
* | report ES API query time as server-timing header | Bryan Newbold | 2020-08-06 | 1 | -0/+4 |
| | |||||
* | add debug mode flag (to control json tag/link) | Bryan Newbold | 2020-08-06 | 1 | -0/+1 |
| | |||||
* | make fmt | Bryan Newbold | 2020-08-06 | 1 | -14/+14 |
| | |||||
* | microfilm access filter; broader access matching | Bryan Newbold | 2020-08-06 | 1 | -3/+6 |
| | |||||
* | fix acknowledgement highlighting (typo) | Bryan Newbold | 2020-08-06 | 1 | -1/+1 |
| | |||||
* | reduce title boost; use only base query for highlighting | Bryan Newbold | 2020-08-06 | 1 | -1/+2 |
| | |||||
* | special case '*' queries | Bryan Newbold | 2020-08-06 | 1 | -6/+16 |
| | More/better query parsing in the client could detect if this was a "filter only" query and do the same kind of optimization. | | | | |
* | remove 'title' from poor metadata scoring | Bryan Newbold | 2020-08-06 | 1 | -1/+0 |
| | |||||
* | better time ranges (don't search future) | Bryan Newbold | 2020-08-06 | 1 | -4/+7 |
| |
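
The 2021-04-06 health check commits describe using the /<index>/_count endpoint and verifying that the "failed" and "skipped" shard numbers are 0 (aka, "successful == total"). A minimal sketch of that check follows, assuming the response carries the usual `_shards` summary with `total`, `successful`, `skipped`, and `failed` counts. The function name `shards_healthy` is hypothetical and not taken from the repository, and treating a zero-shard response as unhealthy is an added assumption.

```python
from typing import Any, Mapping


def shards_healthy(count_response: Mapping[str, Any]) -> bool:
    """Return True only when every shard answered the _count request.

    Hypothetical helper: assumes the response includes a `_shards`
    mapping with total/successful/skipped/failed integer counts.
    """
    shards = count_response.get("_shards")
    if not isinstance(shards, Mapping):
        # No shard summary at all: treat as unhealthy rather than guess.
        return False

    total = shards.get("total", 0)
    successful = shards.get("successful", 0)
    skipped = shards.get("skipped", 0)
    failed = shards.get("failed", 0)

    # Assumption: an index reporting zero shards is not considered healthy.
    if total <= 0:
        return False

    # "successful == total", with no failed or skipped shards.
    return successful == total and failed == 0 and skipped == 0


if __name__ == "__main__":
    example = {
        "count": 1234,
        "_shards": {"total": 6, "successful": 6, "skipped": 0, "failed": 0},
    }
    print(shards_healthy(example))
```

With elasticsearch-py, the response would typically come from the client's `count(index=...)` method. That wiring is left out here so the sketch runs without a cluster.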