path: root/fatcat_covid19/transform.py
Commit message | Author | Age | Files | Lines
* handle ext_ids without _id in release schema | Bryan Newbold | 2020-04-09 | 1 | -4/+7
* attempt somewhat more robust abstract cleaning | Bryan Newbold | 2020-04-09 | 1 | -7/+4
    Note: there is still a security and robustness issue here in that highlights are marked "safe". Should come up with a better mechanism for escaping/safing.
* transform: remove more tags from abstracts | Bryan Newbold | 2020-04-09 | 1 | -1/+1
* transform hacks for new fatcat documents | Bryan Newbold | 2020-04-09 | 1 | -1/+16
* small search tweaks and fixes | Bryan Newbold | 2020-04-08 | 1 | -1/+1
* special-case arxiv/medrxiv/biorxiv container names | Bryan Newbold | 2020-04-08 | 1 | -0/+11
* transform: try to cleanup abstracts | Bryan Newbold | 2020-04-08 | 1 | -3/+31
* include ia_pdf_url when available | Bryan Newbold | 2020-04-03 | 1 | -0/+4
* fixes from prod | Bryan Newbold | 2020-04-03 | 1 | -2/+3
* refactor elastic transform into CLI tool | Bryan Newbold | 2020-04-03 | 1 | -0/+204
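
The note on the abstract-cleaning commit says search highlights are currently marked "safe" and that a better escaping mechanism is needed. One common approach, sketched below, is to escape the whole highlighted fragment and then restore only the highlight markers. This is an illustration and not the code in transform.py: the function name `escape_highlight` is hypothetical, and the `<em>` / `</em>` markers are assumed (they are the Elasticsearch highlighter defaults, but the project may configure others).

```python
import html

# Assumed highlight markers (Elasticsearch's default pre/post tags).
HIGHLIGHT_OPEN = "<em>"
HIGHLIGHT_CLOSE = "</em>"


def escape_highlight(fragment: str) -> str:
    """Escape all HTML in a highlight fragment, keeping only the highlight tags.

    Hypothetical helper: everything is escaped first, then the escaped form
    of the two known markers is turned back into real tags.
    """
    escaped = html.escape(fragment, quote=True)
    escaped = escaped.replace(html.escape(HIGHLIGHT_OPEN), HIGHLIGHT_OPEN)
    escaped = escaped.replace(html.escape(HIGHLIGHT_CLOSE), HIGHLIGHT_CLOSE)
    return escaped


if __name__ == "__main__":
    raw = "<script>alert(1)</script> <em>covid</em> & friends"
    print(escape_highlight(raw))
```

With this shape, only the two known markers survive as markup, so the result could be marked safe in a template with less risk. A literal `<em>` appearing in the source abstract would also be rendered as emphasis, which is cosmetic rather than a security problem.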