index: fatcat
path: root/python/fatcat_tools/normal.py
Commit message | Author | Age | Files | Lines
ftfy 'fix_entities' argument has been renamed | Bryan Newbold | 2021-11-02 | 1 | -4/+4
try some type annotations | Bryan Newbold | 2021-11-02 | 1 | -9/+10
python: normalization/validation support for handle identifiers (hdl) | Bryan Newbold | 2021-10-13 | 1 | -0/+33
clean_doi() should lower-case returned DOI | Bryan Newbold | 2021-06-07 | 1 | -1/+4
normalizer: test for un-versioned arxiv_id | Bryan Newbold | 2020-12-24 | 1 | -0/+4
wikidata QID normalize helper | Bryan Newbold | 2020-12-17 | 1 | -2/+24
HACK: squash intermitent failure of detect_text_lang() test | Bryan Newbold | 2020-12-11 | 1 | -1/+2
langdetect: more text for 'zh' test case | Bryan Newbold | 2020-11-20 | 1 | -1/+1
clean DOI: ban all non-ASCII characters | Bryan Newbold | 2020-11-19 | 1 | -1/+4
normal: handle langdetect of 'zh-cn' (not len=2) | Bryan Newbold | 2020-11-19 | 1 | -0/+3
handle more non-ASCII DOI cases | Bryan Newbold | 2020-11-19 | 1 | -1/+3
more python normalizers, and move from importer common | Bryan Newbold | 2020-11-19 | 1 | -0/+322
normalizer: filter out a specific non-ASCII character in DOI | Bryan Newbold | 2020-11-04 | 1 | -1/+3
lint (flake8) tool python files | Bryan Newbold | 2020-07-01 | 1 | -1/+0
disallow a specific unicode character from DOIs | Bryan Newbold | 2020-06-26 | 1 | -0/+6
consistently use raw string prefix for regex | Bryan Newbold | 2020-04-17 | 1 | -5/+5
normal: DOI corner-case from pubmed import | Bryan Newbold | 2020-01-19 | 1 | -0/+9
do not normalize "en dash" in DOI | Martin Czygan | 2020-01-17 | 1 | -2/+5
doi parsing fixes | Bryan Newbold | 2019-12-23 | 1 | -0/+7
normalizers: clean_pmid(), and handle nulls in all other cleaners | Bryan Newbold | 2019-12-23 | 1 | -0/+31
handle more external identifiers in python | Bryan Newbold | 2019-09-18 | 1 | -14/+97
start work on 'generic' search box | Bryan Newbold | 2019-06-13 | 1 | -0/+95