path: root/fatcat_scholar/sandcrawler.py
Commit message | Author | Date | Files | Lines
* lint fixes, and run fmt | Bryan Newbold | 2021-06-02 | 1 | -3/+1
* add 'crossref' hydration to work pipeline | Bryan Newbold | 2021-06-02 | 1 | -0/+11
  The immediate motivation is to include recent crossref refs in citation graph transforms. May also be valuable for researchers to have authoritative/publisher metadata in the bundle dumps.
* Modernize Python syntax with pyupgrade --py38-plus **/*.py | Christian Clauss | 2021-02-23 | 1 | -1/+1
* add basic html fulltext support to fetch pipeline | Bryan Newbold | 2020-11-18 | 1 | -0/+11
* make fmt | Bryan Newbold | 2020-06-29 | 1 | -1/+3
* fetch pdftotext and pdf_meta from blobs, postgrest | Bryan Newbold | 2020-06-29 | 1 | -0/+9
  This replaces the temporary COVID-19 content hack with production content (text, thumbnail URLs) stored in postgrest and seaweedfs.
* fmt | Bryan Newbold | 2020-06-04 | 1 | -1/+8
* more type annotations and fixes | Bryan Newbold | 2020-06-04 | 1 | -2/+2
* flake8 fixes (partial) | Bryan Newbold | 2020-06-03 | 1 | -1/+0
* reformat python code with black | Bryan Newbold | 2020-06-03 | 1 | -21/+14
* WIP on release-to-sim fetching | Bryan Newbold | 2020-05-19 | 1 | -0/+75