path: root/tests
Commit message                                      Author         Age         Files  Lines
* reformat python code with black                   Bryan Newbold  2020-06-03  3      -13/+19
* improve text scrubbing                            Bryan Newbold  2020-06-03  1      -0/+15

    Was going to use textpipe, but dependency was too large and failed to
    install with halfway modern GCC (due to CLD2 issue):

      https://github.com/GregBowyer/cld2-cffi/issues/12

    So instead basically pulled out the clean_text function, which is
    quite short.

* first pass transform from pipelines to ES schema  Bryan Newbold  2020-05-20  1      -1/+1
* initial progress on work pipeline                 Bryan Newbold  2020-05-16  1      -2/+2
* crude djvu XML parsing                            Bryan Newbold  2020-05-16  2      -0/+5158
* basic biblio converter                            Bryan Newbold  2020-05-16  1      -1/+10
* start implementing ES transform helpers           Bryan Newbold  2020-05-14  2      -0/+20