Commit message | Author | Age | Files | Lines
* improve export_fatcat | Bryan Newbold | 2019-08-28 | 1 | -5/+22
* python script to fix fatcat ISSN-Ls | Bryan Newbold | 2019-08-27 | 1 | -0/+75
* hand-coded corrections to invalid fatcat ISSN-Ls | Bryan Newbold | 2019-08-27 | 1 | -88/+88
* current invalid fatcat ISSN-Ls | Bryan Newbold | 2019-08-27 | 1 | -0/+118
    AKA, list of fatcat containers with an ISSN-L that isn't a valid ISSN (based on checksum)
* only fatcat_export 'valid' (syntax) ISSN-Ls | Bryan Newbold | 2019-08-27 | 1 | -1/+1
* include Szczepanski in everything command (oops) | Bryan Newbold | 2019-08-27 | 1 | -0/+1
* updated crossref title file; ISSN-L file link | Bryan Newbold | 2019-08-27 | 3 | -3/+3
* update IA_CRAWL_FILE | Bryan Newbold | 2019-07-31 | 1 | -1/+1
* commit TODO list | Bryan Newbold | 2019-07-31 | 1 | -0/+37
* update fetch.sh with url_status files | Bryan Newbold | 2019-07-31 | 1 | -0/+3
* webarchive_urls separate from regular URLs | Bryan Newbold | 2019-07-31 | 1 | -1/+21
* don't return 'error' for bad CDX lookups | Bryan Newbold | 2019-07-31 | 1 | -1/+3
* add 'export_fatcat' | Bryan Newbold | 2019-07-31 | 1 | -1/+51
* README update | Bryan Newbold | 2019-07-31 | 1 | -21/+35
* more check_issn_urls corner-cases | Bryan Newbold | 2019-07-31 | 1 | -1/+5
* handle 'ttp://' URL prefix corner case | Bryan Newbold | 2019-07-31 | 1 | -0/+2
* broader top-level gitignore | Bryan Newbold | 2019-07-31 | 1 | -0/+25
* remove python 3.5 constraint | Bryan Newbold | 2019-07-31 | 2 | -6/+4
* pipenv: datasette | Bryan Newbold | 2019-07-31 | 2 | -1/+145
* add wikidata SPARQL query | Bryan Newbold | 2019-07-31 | 1 | -0/+35
* sqlite-notebook template for basic chocula stats | Bryan Newbold | 2019-07-31 | 2 | -0/+186
* iterate on homepage url import/stats | Bryan Newbold | 2019-07-31 | 2 | -21/+43
* more issn URL checker fixes | Bryan Newbold | 2019-07-31 | 2 | -11/+27
* major improvements to ISSN URL checker | Bryan Newbold | 2019-07-30 | 1 | -20/+121
* import vanilla ISSN url checker script | Bryan Newbold | 2019-07-30 | 1 | -0/+52
* chocula: sherpa_color in summary; cleanups | Bryan Newbold | 2019-07-30 | 3 | -6/+12
* chocula: openapc | Bryan Newbold | 2019-07-30 | 1 | -1/+31
* chocula: json export | Bryan Newbold | 2019-07-30 | 1 | -0/+17
* chocula: fix wikidata_qid inclusion | Bryan Newbold | 2019-07-30 | 1 | -2/+3
* chocula: fix wikidata_qid inclusion | Bryan Newbold | 2019-07-30 | 2 | -1/+3
* chocula: better ISSN-L handling | Bryan Newbold | 2019-07-30 | 4 | -24/+41
* chocula: updated fetches, new ISSN-L and DOAJ files | Bryan Newbold | 2019-07-30 | 2 | -7/+10
* chocula: wikidata indexing | Bryan Newbold | 2019-07-30 | 1 | -4/+48
* chocula: crude publisher type bucketing; field cleanup | Bryan Newbold | 2019-07-30 | 2 | -40/+194
* shorter/simpler table names | Bryan Newbold | 2019-07-26 | 2 | -9/+17
* chocula: more host/domain fixes | Bryan Newbold | 2019-07-26 | 1 | -3/+8
* GOLD OA parsing | Bryan Newbold | 2019-07-26 | 1 | -40/+54
* chocula: fix domain parsing | Bryan Newbold | 2019-07-26 | 1 | -10/+47
* pipenv: pytest for journal_metadata | Bryan Newbold | 2019-07-26 | 2 | -4/+83
* chocula README | Bryan Newbold | 2019-07-14 | 1 | -0/+7
* chocula: fetch SZ json | Bryan Newbold | 2019-07-14 | 1 | -0/+2
* more chocula progress | Bryan Newbold | 2019-07-14 | 2 | -61/+183
* EZB and szczepanski indexers | Bryan Newbold | 2019-07-11 | 1 | -45/+146
* chocula early work | Bryan Newbold | 2019-07-10 | 4 | -0/+1009
    (non-functional)
* fix parse_merge_metadata.py merge_spans() | Bryan Newbold | 2019-05-30 | 1 | -4/+8
* better KBART merging | Bryan Newbold | 2019-05-30 | 1 | -4/+5
* initial code to handle multiple KBART spans better | Bryan Newbold | 2019-05-30 | 1 | -2/+64
* update ISSN-L file | Bryan Newbold | 2019-02-20 | 2 | -2/+6
* update to newer ISSN-L mapping | Bryan Newbold | 2019-01-29 | 2 | -2/+2
* improved journal metadata munger | Bryan Newbold | 2019-01-25 | 2 | -100/+325
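
Several of the 2019-08-27 commits concern fatcat ISSN-Ls that are not valid ISSNs "based on checksum". As a rough illustration of what such a check involves, here is a minimal sketch of the standard ISSN check-digit rule. It is not taken from the repository's own script; the names `issn_check_char` and `is_valid_issn` are hypothetical and chosen only for this example.

```python
import re

# Syntax check only: four digits, a hyphen, three digits, then a digit or X.
ISSN_SYNTAX = re.compile(r"[0-9]{4}-[0-9]{3}[0-9X]")


def issn_check_char(first_seven: str) -> str:
    """Compute the ISSN check character for the first seven digits.

    Hypothetical helper: the digits are weighted 8 down to 2, summed,
    and the check value is (11 - sum mod 11) mod 11, with 10 written as 'X'.
    """
    total = sum(int(d) * w for d, w in zip(first_seven, range(8, 1, -1)))
    value = (11 - total % 11) % 11
    return "X" if value == 10 else str(value)


def is_valid_issn(issn: str) -> bool:
    """Return True if the string has ISSN syntax and a matching check character."""
    issn = issn.strip().upper()
    if not ISSN_SYNTAX.fullmatch(issn):
        return False
    digits = issn.replace("-", "")
    return issn_check_char(digits[:7]) == digits[7]


if __name__ == "__main__":
    for candidate in ["0378-5955", "2434-561X", "0378-5954"]:
        print(candidate, is_valid_issn(candidate))
```

Keeping the syntax test separate from the checksum test mirrors the distinction the log itself draws between 'valid' (syntax) ISSN-Ls and ISSN-Ls that fail the checksum.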