path: root/fatcat_scholar
Commit message | Author | Age | Files | Lines
* fix display of papers missing fulltext | Bryan Newbold | 2020-08-06 | 1 | -1/+1
|   I think the bug happened now that we do not serialize the pydantic structures with empty values.
|   A better solution might be to deserialize search hits into pydantic objects before rendering.
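The "better solution" mentioned above could look roughly like the sketch below: parse each raw search hit into a typed object whose fields all have defaults, so that keys omitted from the serialized JSON no longer break rendering. The commit talks about pydantic; this sketch uses stdlib dataclasses as a stand-in, and the class names, field names, and `hit_from_dict` helper are hypothetical, not taken from the repository.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional


@dataclass
class HitFulltext:
    # Hypothetical subset of fields; each one has a default so a missing
    # key in the search hit simply becomes None.
    thumbnail_url: Optional[str] = None
    access_url: Optional[str] = None


@dataclass
class SearchHit:
    # Hypothetical top-level hit structure.
    title: Optional[str] = None
    fulltext: Optional[HitFulltext] = None


def hit_from_dict(raw: Dict[str, Any]) -> SearchHit:
    """Build a SearchHit, ignoring unknown keys and defaulting missing ones."""
    known = {f.name for f in fields(SearchHit)}
    data = {k: v for k, v in raw.items() if k in known}
    if isinstance(data.get("fulltext"), dict):
        ft_known = {f.name for f in fields(HitFulltext)}
        data["fulltext"] = HitFulltext(
            **{k: v for k, v in data["fulltext"].items() if k in ft_known}
        )
    return SearchHit(**data)
```

A template can then test `hit.fulltext is None` instead of probing for dictionary keys that may have been dropped during serialization.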
* Revert "remove duplicate fulltext search from query" | Bryan Newbold | 2020-07-30 | 1 | -0/+1
|   This reverts commit 0d3fd83493c7307a2b9593c7add90b8b6f4b4152.
|   Seems like we do need to query on this field for highlighting to work.
* transform: catch more cases of null extra | Bryan Newbold | 2020-07-30 | 1 | -10/+10
|   Also correctly pull issne/issnp from container.extra, not release.extra.
* include container_ident in metadata completeness boost | Bryan Newbold | 2020-07-28 | 1 | -0/+1
|
* search: smaller default result set | Bryan Newbold | 2020-07-27 | 1 | -1/+1
|
* pipeline: skip grobid/pdftext lookups when no URL; prefer GROBID to pdftext | Bryan Newbold | 2020-07-27 | 1 | -1/+3
|
* remove duplicate fulltext search from query | Bryan Newbold | 2020-07-27 | 1 | -1/+0
|   May also remove the 'title' and 'abstracts' searches, though they currently help with boosting; will want to measure the actual performance difference before making that change.
* json: exclude None in output, and sort keys | Bryan Newbold | 2020-07-27 | 3 | -4/+4
|   These are both size/performance enhancements.
|   Not including 'None' values will reduce document sizes on-disk and over network, particularly for intermediate objects.
|   Sorting by key should improve compression ratios across multiple documents, both on-disk (gzip) and in elasticsearch itself:
|   https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-disk-usage.html#_put_fields_in_the_same_order_in_documents
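As an illustration of the two techniques named above (dropping `None` values and emitting keys in sorted order), here is a minimal stdlib-only sketch. It is not the code from this commit; the `drop_none` and `compact_json` helpers are hypothetical names, and the sketch deliberately keeps `None` entries inside lists so that list positions are preserved.

```python
import json
from typing import Any


def drop_none(value: Any) -> Any:
    """Recursively remove dict entries whose value is None."""
    if isinstance(value, dict):
        return {k: drop_none(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        # List items are kept as-is (including None) to preserve positions.
        return [drop_none(v) for v in value]
    return value


def compact_json(doc: dict) -> str:
    """Serialize without None-valued keys and with a stable key order."""
    return json.dumps(drop_none(doc), sort_keys=True)
```

A stable key order means that similar documents produce similar byte sequences, which is what gives compressors such as gzip more repetition to work with.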
* search: tweak 'past week' date range to not include future | Bryan Newbold | 2020-07-27 | 1 | -2/+4
|
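One way to keep a "past week" filter from matching future-dated records is to bound the range on both sides. The sketch below builds an Elasticsearch-style `range` clause as a plain dict; the `date` field name and the `past_week_range` helper are assumptions for illustration, and the actual change in this commit may differ.

```python
import datetime
from typing import Dict, Optional


def past_week_range(today: Optional[datetime.date] = None) -> Dict[str, dict]:
    """Return a range filter covering the last seven days, capped at today."""
    today = today or datetime.date.today()
    week_ago = today - datetime.timedelta(days=7)
    return {
        "range": {
            # "date" is a hypothetical field name
            "date": {
                "gte": week_ago.isoformat(),
                # the upper bound keeps future-dated records out
                "lte": today.isoformat(),
            }
        }
    }
```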
* abstracts: more prefixes to ignore | Bryan Newbold | 2020-07-27 | 1 | -0/+3
|
* more careful watermark removal | Bryan Newbold | 2020-07-22 | 2 | -0/+0
|
* hide overflow link domain text (for mobile SERPs) | Bryan Newbold | 2020-07-21 | 1 | -1/+1
|
* gaudy placeholder vaporwave logo | Bryan Newbold | 2020-07-21 | 4 | -12/+11
|
* differentiate SERP card size from other card divs | Bryan Newbold | 2020-07-21 | 2 | -2/+2
|
* include fulltext acknowledgements in highlighting | Bryan Newbold | 2020-07-21 | 1 | -0/+1
|
* ensure SIM release date parses before assigning | Bryan Newbold | 2020-07-21 | 1 | -1/+6
|
* strip <em> tags explicitly | Bryan Newbold | 2020-07-21 | 1 | -0/+1
|
* display Szczepanski as an OA quality label | Bryan Newbold | 2020-07-21 | 1 | -1/+1
|
* load issue rows: handle empty metadata | Bryan Newbold | 2020-07-21 | 1 | -0/+2
|
* skip partial/stub issue items | Bryan Newbold | 2020-07-01 | 1 | -0/+2
|
* tweak CSS of last commit so it works | Bryan Newbold | 2020-06-29 | 1 | -1/+1
|
* at full screen width, show full thumbnails | Bryan Newbold | 2020-06-29 | 1 | -0/+3
|
* fix search filter bug (papers is default) | Bryan Newbold | 2020-06-29 | 1 | -2/+2
|
* handle large/bad 'first_page' metadata | Bryan Newbold | 2020-06-29 | 1 | -0/+3
|   This was causing elasticsearch indexing errors
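A defensive cleanup for values like this might look like the sketch below, which accepts only short, plain-digit page values and discards everything else before indexing. The commit does not show its actual check, so the `clean_first_page` helper, the digit-count cap, and the assumption that the target field is numeric are all hypothetical.

```python
from typing import Any, Optional

# Assumed cap: values longer than this are treated as bad metadata.
MAX_FIRST_PAGE_DIGITS = 9


def clean_first_page(raw: Any) -> Optional[int]:
    """Return first_page as an int, or None if it is missing, non-numeric, or huge."""
    if raw is None:
        return None
    text = str(raw).strip()
    # Only plain ASCII digits are accepted ("e1234", "iv", "12-15" are dropped).
    if not (text.isascii() and text.isdigit()):
        return None
    if len(text) > MAX_FIRST_PAGE_DIGITS:
        return None
    return int(text)
```

Returning `None` rather than raising lets the rest of the document be indexed even when this one field is unusable.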
* more conservative container_original_name | Bryan Newbold | 2020-06-29 | 1 | -0/+2
|
* fix lint errors (and some small bugs) | Bryan Newbold | 2020-06-29 | 5 | -27/+28
|
* seaweedfs for S3 API; pull config from dynaconf | Bryan Newbold | 2020-06-29 | 1 | -11/+2
|
* make fmt | Bryan Newbold | 2020-06-29 | 4 | -13/+22
|
* fixes to schema parsing from prod | Bryan Newbold | 2020-06-29 | 1 | -9/+13
|
* include GROBID-extracted abstracts in search documents | Bryan Newbold | 2020-06-29 | 2 | -10/+23
|
* Search Inside -> Search | Bryan Newbold | 2020-06-29 | 1 | -1/+1
|
* fix SIM highlight HTML escapes | Bryan Newbold | 2020-06-29 | 1 | -3/+7
|   Thanks to Merlijn for finding the broken examples in QA.
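A common pattern for rendering highlight snippets safely is to escape the whole snippet and then restore only the highlight tags. The sketch below assumes the snippets arrive unescaped with `<em>`/`</em>` markers; that assumption, and the `escape_highlight` helper name, are illustrative only and may not match the fix in this commit.

```python
import html


def escape_highlight(snippet: str) -> str:
    """Escape HTML in a highlight snippet, keeping only <em> markers live."""
    escaped = html.escape(snippet, quote=False)
    # Restore the highlight markers; every other tag stays escaped.
    return escaped.replace("&lt;em&gt;", "<em>").replace("&lt;/em&gt;", "</em>")
```

One limitation: a literal `<em>` in the source text is indistinguishable from a highlight marker under this approach, so it would be rendered as emphasis too.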
* recommend search filter changes on no hits page | Bryan Newbold | 2020-06-29 | 1 | -0/+18
|
* note about highlight encoding in ES 7.x | Bryan Newbold | 2020-06-29 | 1 | -0/+2
|
* OA logo SVG file (small) (unused) | Bryan Newbold | 2020-06-29 | 1 | -0/+19
|   via wikimedia commons. Public Domain.
* small improvements to SIM metadata maps | Bryan Newbold | 2020-06-29 | 1 | -6/+11
|
* update stage and withdrawn display; tweak other result styles | Bryan Newbold | 2020-06-29 | 1 | -10/+12
|
* remove confusing unlock logo from OA tag | Bryan Newbold | 2020-06-29 | 1 | -1/+1
|
* alt/hover text for json tag link | Bryan Newbold | 2020-06-29 | 1 | -1/+1
|
* remove 'metadata' tag/link now that title goes to fatcat landing | Bryan Newbold | 2020-06-29 | 1 | -5/+0
|
* un-collapse only to same issue, not uncollapse-all-hits | Bryan Newbold | 2020-06-29 | 2 | -10/+16
|   This is the user expectation, and was a lingering TODO from the initial implementation.
* search titles link to fatcat.wiki landing pages | Bryan Newbold | 2020-06-29 | 1 | -13/+1
|   Not entirely settled on this decision, but trying it for now.
* fixes to pagination display | Bryan Newbold | 2020-06-29 | 1 | -1/+5
|
* fix search order default label | Bryan Newbold | 2020-06-29 | 1 | -1/+1
|   Thanks for the catch Alexis R!
* fixes for pdf_meta dict | Bryan Newbold | 2020-06-29 | 1 | -1/+2
|
* remove old COVID19 thumbnail hack | Bryan Newbold | 2020-06-29 | 2 | -2/+2
|
* fetch pdftotext and pdf_meta from blobs, postgrest | Bryan Newbold | 2020-06-29 | 4 | -43/+72
|   This replaces the temporary COVID-19 content hack with production content (text, thumbnail URLs) stored in postgrest and seaweedfs.
* update translation files | Bryan Newbold | 2020-06-04 | 4 | -58/+76
|
* commit production work-around (temporarily) | Bryan Newbold | 2020-06-04 | 1 | -1/+2
|
* first iteration of an API query helper | Bryan Newbold | 2020-06-04 | 1 | -0/+117
|   If fatcat-cli was ready, could just use that instead. Oh well!