| | Commit message | Author | Age | Files | Lines |
---|---|---|---|---|---|
* | update IA_CRAWL_FILE | Bryan Newbold | 2019-07-31 | 1 | -1/+1 |
* | commit TODO list | Bryan Newbold | 2019-07-31 | 1 | -0/+37 |
* | update fetch.sh with url_status files | Bryan Newbold | 2019-07-31 | 1 | -0/+3 |
* | webarchive_urls separate from regular URLs | Bryan Newbold | 2019-07-31 | 1 | -1/+21 |
* | don't return 'error' for bad CDX lookups | Bryan Newbold | 2019-07-31 | 1 | -1/+3 |
* | add 'export_fatcat' | Bryan Newbold | 2019-07-31 | 1 | -1/+51 |
* | README update | Bryan Newbold | 2019-07-31 | 1 | -21/+35 |
* | more check_issn_urls corner-cases | Bryan Newbold | 2019-07-31 | 1 | -1/+5 |
* | handle 'ttp://' URL prefix corner case | Bryan Newbold | 2019-07-31 | 1 | -0/+2 |
* | broader top-level gitignore | Bryan Newbold | 2019-07-31 | 1 | -0/+25 |
* | remove python 3.5 constraint | Bryan Newbold | 2019-07-31 | 2 | -6/+4 |
* | pipenv: datasette | Bryan Newbold | 2019-07-31 | 2 | -1/+145 |
* | add wikidata SPARQL query | Bryan Newbold | 2019-07-31 | 1 | -0/+35 |
* | sqlite-notebook template for basic chocula stats | Bryan Newbold | 2019-07-31 | 2 | -0/+186 |
* | iterate on homepage url import/stats | Bryan Newbold | 2019-07-31 | 2 | -21/+43 |
* | more issn URL checker fixes | Bryan Newbold | 2019-07-31 | 2 | -11/+27 |
* | major improvements to ISSN URL checker | Bryan Newbold | 2019-07-30 | 1 | -20/+121 |
* | import vanilla ISSN url checker script | Bryan Newbold | 2019-07-30 | 1 | -0/+52 |
* | chocula: sherpa_color in summary; cleanups | Bryan Newbold | 2019-07-30 | 3 | -6/+12 |
* | chocula: openapc | Bryan Newbold | 2019-07-30 | 1 | -1/+31 |
* | chocula: json export | Bryan Newbold | 2019-07-30 | 1 | -0/+17 |
* | chocula: fix wikidata_qid inclusion | Bryan Newbold | 2019-07-30 | 1 | -2/+3 |
* | chocula: fix wikidata_qid inclusion | Bryan Newbold | 2019-07-30 | 2 | -1/+3 |
* | chocula: better ISSN-L handling | Bryan Newbold | 2019-07-30 | 4 | -24/+41 |
* | chocula: updated fetches, new ISSN-L and DOAJ files | Bryan Newbold | 2019-07-30 | 2 | -7/+10 |
* | chocula: wikidata indexing | Bryan Newbold | 2019-07-30 | 1 | -4/+48 |
* | chocula: crude publisher type bucketing; field cleanup | Bryan Newbold | 2019-07-30 | 2 | -40/+194 |
* | shorter/simpler table names | Bryan Newbold | 2019-07-26 | 2 | -9/+17 |
* | chocula: more host/domain fixes | Bryan Newbold | 2019-07-26 | 1 | -3/+8 |
* | GOLD OA parsing | Bryan Newbold | 2019-07-26 | 1 | -40/+54 |
* | chocula: fix domain parsing | Bryan Newbold | 2019-07-26 | 1 | -10/+47 |
* | pipenv: pytest for journal_metadata | Bryan Newbold | 2019-07-26 | 2 | -4/+83 |
* | chocula README | Bryan Newbold | 2019-07-14 | 1 | -0/+7 |
* | chocula: fetch SZ json | Bryan Newbold | 2019-07-14 | 1 | -0/+2 |
* | more chocula progress | Bryan Newbold | 2019-07-14 | 2 | -61/+183 |
* | EZB and szczepanski indexers | Bryan Newbold | 2019-07-11 | 1 | -45/+146 |
* | chocula early work | Bryan Newbold | 2019-07-10 | 4 | -0/+1009 |
* | fix parse_merge_metadata.py merge_spans() | Bryan Newbold | 2019-05-30 | 1 | -4/+8 |
* | better KBART merging | Bryan Newbold | 2019-05-30 | 1 | -4/+5 |
* | initial code to handle multiple KBART spans better | Bryan Newbold | 2019-05-30 | 1 | -2/+64 |
* | update ISSN-L file | Bryan Newbold | 2019-02-20 | 2 | -2/+6 |
* | update to newer ISSN-L mapping | Bryan Newbold | 2019-01-29 | 2 | -2/+2 |
* | improved journal metadata munger | Bryan Newbold | 2019-01-25 | 2 | -100/+325 |
* | first-pass journal metadata munger | Bryan Newbold | 2019-01-24 | 5 | -0/+512 |