path: root/fatcat_scholar/search.py
Commit message [Author, Age, Files, Lines]
* add citation query feature (disabled by default) [Bryan Newbold, 2021-01-19, files: 1, lines: -14/+69]
| | | | | | This is operationally complex (queries hit 3x backend services!), so not enabled by default. Will need more testing; possibly circuit-breaking. Though haproxy should provide some of that automatically at this point.
* lint: fix small bugs and type annotations [Bryan Newbold, 2021-01-18, files: 1, lines: -1/+1]
|
* search: parse and embed a copy of ScholarDoc object in results [Bryan Newbold, 2021-01-14, files: 1, lines: -1/+6]
| | | | Maybe should refactor this to simply replace the object? Hrm.
* search: show fewer, shorter highlights. sort by score. [Bryan Newbold, 2021-01-14, files: 1, lines: -1/+2]
|
* work around mypy complaint about exception union type [Bryan Newbold, 2020-12-22, files: 1, lines: -1/+2]
|
* remove minor unused imports [Bryan Newbold, 2020-10-22, files: 1, lines: -1/+0]
|
* improve search logging and exception chaining [Bryan Newbold, 2020-10-21, files: 1, lines: -5/+6]
|
* refactor do_fulltext_search into smaller methods [Bryan Newbold, 2020-10-16, files: 1, lines: -52/+70]
|
* Upgrade Dynaconf to 3+ [Bruno Rocha, 2020-10-05, files: 1, lines: -1/+1]
| | | | | | In Dynaconf 3+, using `from dynaconf import settings` is no longer recommended; the recommendation now is to create your own instance of the settings object based on the Dynaconf class.
* search: handle direct DOI and PMCID queries [Bryan Newbold, 2020-09-17, files: 1, lines: -9/+16]
| | | | | | If query is a single token which looks like a valid PMCID or DOI, with no surrounding quotes, then expand scope and filter to that single external identifier.
* use container_name, not container_ident, in boost [Bryan Newbold, 2020-08-12, files: 1, lines: -1/+1]
| | | | | This should result in SIM page fulltext matches not getting pushed down as much, as well as things like biorxiv (*rxiv) results.
* fmt/lint tweaks [Bryan Newbold, 2020-08-12, files: 1, lines: -5/+2]
|
* search: include 'article' in papers filter [Bryan Newbold, 2020-08-12, files: 1, lines: -1/+1]
|
* search: use simplified query for highlighting [Bryan Newbold, 2020-08-12, files: 1, lines: -1/+8]
| | | | | | | | This fixes broken phrase query highlighting. I found this issue, but it may be unrelated: https://github.com/elastic/elasticsearch/issues/40227
* re-use ES sync API client [Bryan Newbold, 2020-08-06, files: 1, lines: -3/+4]
|
* report ES API query time as server-timing header [Bryan Newbold, 2020-08-06, files: 1, lines: -0/+4]
|
* add debug mode flag (to control json tag/link) [Bryan Newbold, 2020-08-06, files: 1, lines: -0/+1]
|
* make fmt [Bryan Newbold, 2020-08-06, files: 1, lines: -14/+14]
|
* microfilm access filter; broader access matching [Bryan Newbold, 2020-08-06, files: 1, lines: -3/+6]
|
* fix acknowledgement highlighting (typo) [Bryan Newbold, 2020-08-06, files: 1, lines: -1/+1]
|
* reduce title boost; use only base query for highlighting [Bryan Newbold, 2020-08-06, files: 1, lines: -1/+2]
|
* special case '*' queries [Bryan Newbold, 2020-08-06, files: 1, lines: -6/+16]
| | | | | More/better query parsing in the client could detect if this was a "filter only" query and do the same kind of optimization.
* remove 'title' from poor metadata scoring [Bryan Newbold, 2020-08-06, files: 1, lines: -1/+0]
|
* better time ranges (don't search future) [Bryan Newbold, 2020-08-06, files: 1, lines: -4/+7]
|
* add title back to match query [Bryan Newbold, 2020-08-06, files: 1, lines: -0/+1]
|
* query fewer fields; highlight all fulltext fields regardless of match [Bryan Newbold, 2020-08-06, files: 1, lines: -3/+1]
|
* search tweaks to be forwards-compatible with ES 7.x [Bryan Newbold, 2020-08-06, files: 1, lines: -2/+10]
| | | | | | When we fully commit to ES 7.x we should upgrade the client library correspondingly, and then can remove these work-arounds. But for now we have one instance of ES 6.x and one ES 7.x.
* extend ES client timeout to 25 seconds [Bryan Newbold, 2020-08-06, files: 1, lines: -1/+1]
|
* Revert "remove duplicate fulltext search from query" [Bryan Newbold, 2020-07-30, files: 1, lines: -0/+1]
| | | | | | This reverts commit 0d3fd83493c7307a2b9593c7add90b8b6f4b4152. Seems like we do need to query on this field for highlighting to work.
* include container_ident in metadata completeness boost [Bryan Newbold, 2020-07-28, files: 1, lines: -0/+1]
|
* search: smaller default result set [Bryan Newbold, 2020-07-27, files: 1, lines: -1/+1]
|
* remove duplicate fulltext search from query [Bryan Newbold, 2020-07-27, files: 1, lines: -1/+0]
| | | | | | may also remove the 'title' and 'abstracts' searches, though they currently help with boosting, and will want to measure actual performance difference before that change
* search: tweak 'past week' date range to not include future [Bryan Newbold, 2020-07-27, files: 1, lines: -2/+4]
|
* include fulltext acknowledgements in highlighting [Bryan Newbold, 2020-07-21, files: 1, lines: -0/+1]
|
* fix search filter bug (papers is default) [Bryan Newbold, 2020-06-29, files: 1, lines: -2/+2]
|
* make fmt [Bryan Newbold, 2020-06-29, files: 1, lines: -3/+3]
|
* note about highlight encoding in ES 7.x [Bryan Newbold, 2020-06-29, files: 1, lines: -0/+2]
|
* un-collapse only to same issue, not uncollapse-all-hits [Bryan Newbold, 2020-06-29, files: 1, lines: -9/+15]
| | | | | This is the user expectation, and was a lingering TODO from the initial implementation.
* fix search order default label [Bryan Newbold, 2020-06-29, files: 1, lines: -1/+1]
| | | | Thanks for the catch Alexis R!
* fix OA filter [Bryan Newbold, 2020-06-04, files: 1, lines: -1/+1]
|
* collapse pages by SIM issue [Bryan Newbold, 2020-06-04, files: 1, lines: -3/+25]
|
* fmt [Bryan Newbold, 2020-06-04, files: 1, lines: -0/+2]
|
* start some annotation fixes for pytype [Bryan Newbold, 2020-06-03, files: 1, lines: -3/+6]
|
* flake8-annotation linting [Bryan Newbold, 2020-06-03, files: 1, lines: -1/+1]
| | | | Added some new annotations; need to finish more.
* flake8 fixes (partial) [Bryan Newbold, 2020-06-03, files: 1, lines: -8/+6]
|
* reformat python code with black [Bryan Newbold, 2020-06-03, files: 1, lines: -30/+40]
|
* compute and use tags [Bryan Newbold, 2020-06-03, files: 1, lines: -2/+1]
|
* change availability filter phrasing; default to fulltext [Bryan Newbold, 2020-06-03, files: 1, lines: -6/+6]
|
* make mypy happy [Bryan Newbold, 2020-05-21, files: 1, lines: -1/+1]
|
* implement crude availability filter [Bryan Newbold, 2020-05-21, files: 1, lines: -0/+11]
|
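
Illustrative note on the 2020-09-17 commit "search: handle direct DOI and PMCID queries": the rule it describes (a single unquoted token that looks like a DOI or PMCID becomes a filter on that one external identifier) could be sketched roughly as below. This is not the code from search.py; the function name `detect_identifier`, the return shape, and both regular expressions are assumptions made for illustration, and the real checks may be stricter or looser.

```python
import re
from typing import Optional, Tuple

# Hypothetical patterns, chosen for illustration only.
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
PMCID_RE = re.compile(r"^PMC\d+$")


def detect_identifier(query: str) -> Optional[Tuple[str, str]]:
    """Return (identifier_type, value) if the query is a bare DOI or PMCID."""
    token = query.strip()
    # Only single-token queries qualify; anything with whitespace is a
    # normal fulltext search.
    if not token or len(token.split()) != 1:
        return None
    # Surrounding quotes mean the user asked for a literal phrase search.
    if token[0] in "\"'" or token[-1] in "\"'":
        return None
    if DOI_RE.match(token):
        return ("doi", token)
    if PMCID_RE.match(token):
        return ("pmcid", token)
    return None
```

A caller could use the returned pair to replace the fulltext query with an exact-match filter on the corresponding identifier field, and fall back to the ordinary search path when the function returns `None`.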