| | Commit message | Author | Age | Files | Lines |
---|---|---|---|---|---|
* | make fmt | Bryan Newbold | 2021-07-26 | 1 | -5/+13 |
* | fix failing test after clean_doi() | Bryan Newbold | 2021-07-26 | 1 | -1/+1 |
* | refs transform: many fixes | Bryan Newbold | 2021-07-25 | 2 | -1/+274 |
| | include year correctly (many cases); test coverage for Crossref transform; pass-through 'edition' as 'version'; series-title parsed into title or container as appropriate; missing release stage; fix 0-index vs. 1-index ref index field | | | | |
* | refs transform: 1-index refs.index, not 0-index | Bryan Newbold | 2021-07-25 | 1 | -1/+1 |
| | This was not matching expectations/schema of downstream refs pipeline (cgraph), and wasn't matching documented schema. Note care required when checking if the index is set, to distinguish between '0' and 'None' values. | | | | |
* | refs: include (source) release_stage in output | Bryan Newbold | 2021-06-30 | 1 | -9/+18 |
* | commit missing elastic get example JSON files | Bryan Newbold | 2021-06-11 | 2 | -0/+174 |
* | update citation_pdf_url HTML meta tag to new access URL style | Bryan Newbold | 2021-06-11 | 1 | -0/+1 |
* | update access redirect URL endpoints | Bryan Newbold | 2021-06-11 | 1 | -19/+20 |
* | lint fixes, and run fmt | Bryan Newbold | 2021-06-02 | 1 | -4/+1 |
* | add 'crossref' hydration to work pipeline | Bryan Newbold | 2021-06-02 | 1 | -0/+16 |
| | The immediate motivation is to include recent crossref refs in citation graph transforms. May also be valuable for researchers to have authoritative/publisher metadata in the bundle dumps. | | | | |
* | web: fixes to access redirect endpoints | Bryan Newbold | 2021-05-19 | 1 | -0/+11 |
* | iterate on PDF redirect links | Bryan Newbold | 2021-05-17 | 1 | -3/+41 |
* | iterate on access redirects and landing page implementation | Bryan Newbold | 2021-04-27 | 2 | -0/+123 |
| | Small code refactors and minimal test coverage | | | | |
* | Revert undesirable changes | Christian Clauss | 2021-02-23 | 6 | -11/+11 |
* | Modernize Python syntax with pyupgrade --py38-plus **/*.py | Christian Clauss | 2021-02-23 | 6 | -11/+11 |
* | api: handle null 'q' parameter on search endpoint | Bryan Newbold | 2021-02-11 | 1 | -1/+5 |
* | refactor ES configuration setting names | Bryan Newbold | 2021-01-25 | 1 | -1/+1 |
* | api: fix /search test, and mypy error on implementation | Bryan Newbold | 2021-01-15 | 1 | -1/+11 |
* | add mocks to work pipeline test | Bryan Newbold | 2021-01-14 | 1 | -1/+63 |
* | add regression test for uvloop+httptools uvicorn problem | Bryan Newbold | 2021-01-05 | 1 | -0/+11 |
* | improve Accept-Language header parsing | Bryan Newbold | 2020-12-02 | 1 | -0/+4 |
* | fmt | Bryan Newbold | 2020-10-28 | 1 | -1/+0 |
* | fixes to issue_db tests | Bryan Newbold | 2020-10-23 | 1 | -6/+3 |
* | basic web search test | Bryan Newbold | 2020-10-23 | 2 | -1/+1701 |
* | basic test for issue-db pipeline | Bryan Newbold | 2020-10-23 | 3 | -0/+30 |
* | start test coverage for web interface | Bryan Newbold | 2020-10-22 | 2 | -0/+68 |
* | improve test coverage | Bryan Newbold | 2020-10-22 | 5 | -0/+72 |
* | minimum viable tests for GROBID XML parsing and refs transform | Bryan Newbold | 2020-09-14 | 3 | -0/+535 |
* | another clean_str() test case | Bryan Newbold | 2020-08-12 | 1 | -0/+4 |
* | transform: more string cleaning | Bryan Newbold | 2020-08-12 | 1 | -1/+19 |
* | scrub_text: single-token strings skipped | Bryan Newbold | 2020-08-06 | 1 | -1/+1 |
* | start some annotation fixes for pytype | Bryan Newbold | 2020-06-03 | 1 | -1/+1 |
* | flake8-annotation linting | Bryan Newbold | 2020-06-03 | 3 | -4/+4 |
| | Added some new annotations; need to finish more. | | | | |
* | flake8 fixes (partial) | Bryan Newbold | 2020-06-03 | 2 | -3/+0 |
* | reformat python code with black | Bryan Newbold | 2020-06-03 | 3 | -13/+19 |
* | improve text scrubbing | Bryan Newbold | 2020-06-03 | 1 | -0/+15 |
| | Was going to use textpipe, but the dependency was too large and failed to install with a halfway-modern GCC (due to a CLD2 issue: https://github.com/GregBowyer/cld2-cffi/issues/12 ). So instead basically pulled out the clean_text function, which is quite short. | | | | |
* | first pass transform from pipelines to ES schema | Bryan Newbold | 2020-05-20 | 1 | -1/+1 |
* | initial progress on work pipeline | Bryan Newbold | 2020-05-16 | 1 | -2/+2 |
* | crude djvu XML parsing | Bryan Newbold | 2020-05-16 | 2 | -0/+5158 |
* | basic biblio converter | Bryan Newbold | 2020-05-16 | 1 | -1/+10 |
* | start implementing ES transform helpers | Bryan Newbold | 2020-05-14 | 2 | -0/+20 |
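
The 2021-07-25 commit "refs transform: 1-index refs.index, not 0-index" notes that care is required when checking whether the index is set, because '0' and 'None' must be told apart. A minimal sketch of that pitfall follows; the function name `to_one_indexed` is hypothetical and is not taken from the repository.

```python
from typing import Optional


def to_one_indexed(zero_based: Optional[int]) -> Optional[int]:
    """Convert a 0-based position to a 1-based ref index, preserving None.

    Hypothetical helper for illustration only.
    """
    # A plain truthiness check (`if zero_based:`) would be wrong here:
    # 0 is falsy in Python, so the first reference would be treated as
    # having no index at all. Compare against None explicitly instead.
    if zero_based is None:
        return None
    return zero_based + 1
```

With an explicit `is None` comparison, the first reference (position 0) still receives index 1, while a genuinely missing position stays `None`.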