| Commit message | Author | Age | Files | Lines |
* | normalizer: test for un-versioned arxiv_id | Bryan Newbold | 2020-12-24 | 1 | -0/+4 |
* | wikidata QID normalize helper | Bryan Newbold | 2020-12-17 | 1 | -2/+24 |
* | HACK: squash intermittent failure of detect_text_lang() test | Bryan Newbold | 2020-12-11 | 1 | -1/+2 |
* | langdetect: more text for 'zh' test case | Bryan Newbold | 2020-11-20 | 1 | -1/+1 |
* | clean DOI: ban all non-ASCII characters | Bryan Newbold | 2020-11-19 | 1 | -1/+4 |
* | normal: handle langdetect of 'zh-cn' (not len=2) | Bryan Newbold | 2020-11-19 | 1 | -0/+3 |
* | handle more non-ASCII DOI cases | Bryan Newbold | 2020-11-19 | 1 | -1/+3 |
* | more python normalizers, and move from importer common | Bryan Newbold | 2020-11-19 | 1 | -0/+322 |
* | normalizer: filter out a specific non-ASCII character in DOI | Bryan Newbold | 2020-11-04 | 1 | -1/+3 |
* | lint (flake8) tool python files | Bryan Newbold | 2020-07-01 | 1 | -1/+0 |
* | disallow a specific unicode character from DOIs | Bryan Newbold | 2020-06-26 | 1 | -0/+6 |
* | consistently use raw string prefix for regex | Bryan Newbold | 2020-04-17 | 1 | -5/+5 |
* | normal: DOI corner-case from pubmed import | Bryan Newbold | 2020-01-19 | 1 | -0/+9 |
* | do not normalize "en dash" in DOI | Martin Czygan | 2020-01-17 | 1 | -2/+5 |
* | doi parsing fixes | Bryan Newbold | 2019-12-23 | 1 | -0/+7 |
* | normalizers: clean_pmid(), and handle nulls in all other cleaners | Bryan Newbold | 2019-12-23 | 1 | -0/+31 |
* | handle more external identifiers in python | Bryan Newbold | 2019-09-18 | 1 | -14/+97 |
* | start work on 'generic' search box | Bryan Newbold | 2019-06-13 | 1 | -0/+95 |