Commit message | Author | Age | Files | Lines | ||
---|---|---|---|---|---|---|
... | ||||||
* | update URL crawl status snapshot | Bryan Newbold | 2019-12-26 | 2 | -5/+2 | |
| | ||||||
* | add check to container stat fetch to ensure valid JSON returned | Bryan Newbold | 2019-12-26 | 1 | -1/+1 | |
| | ||||||
* | add stats and URL crawl status files | Bryan Newbold | 2019-12-24 | 2 | -2/+6 | |
| | ||||||
* | count chocula logo (yay) | Bryan Newbold | 2019-12-24 | 1 | -0/+0 | |
| | ||||||
* | example queries to run on sqlite | Bryan Newbold | 2019-12-24 | 2 | -0/+64 | |
| | ||||||
* | update README with better directions | Bryan Newbold | 2019-12-24 | 2 | -16/+48 | |
| | ||||||
* | move old scripts into subdirectory | Bryan Newbold | 2019-12-23 | 3 | -0/+0 | |
| | ||||||
* | update chocula usage of argparse | Bryan Newbold | 2019-12-23 | 1 | -14/+22 | |
| | ||||||
* | update norwegian CSV importer schema | Bryan Newbold | 2019-12-23 | 1 | -2/+4 | |
| | ||||||
* | update chocula input data files | Bryan Newbold | 2019-12-23 | 3 | -38/+35 | |
| | | | | | Including updating fetch script, README links, and chocula.py path references. | |||||
* | use newer fatcat container dump | Bryan Newbold | 2019-09-06 | 2 | -1/+3 | |
| | ||||||
* | filter out bad ISSN{e,p} | Bryan Newbold | 2019-09-06 | 1 | -0/+5 | |
| | | | | | Unfortunately a few hundred of these got pushed into fatcat already; will probably fix with a new fixer bot tool. | |||||
* | last name/publisher cleanups | Bryan Newbold | 2019-09-03 | 1 | -2/+6 | |
| | ||||||
* | update TODO | Bryan Newbold | 2019-09-03 | 1 | -1/+10 | |
| | ||||||
* | don't include doaj.org or NCBI homepage URLs | Bryan Newbold | 2019-09-03 | 1 | -0/+4 | |
| | ||||||
* | improve fatcat_export metadata quality | Bryan Newbold | 2019-09-03 | 1 | -3/+12 | |
| | ||||||
* | fix SZCEPANSKI typo | Bryan Newbold | 2019-09-03 | 1 | -2/+2 | |
| | ||||||
* | improve export_fatcat | Bryan Newbold | 2019-08-28 | 1 | -5/+22 | |
| | ||||||
* | python script to fix fatcat ISSN-Ls | Bryan Newbold | 2019-08-27 | 1 | -0/+75 | |
| | ||||||
* | hand-coded corrections to invalid fatcat ISSN-Ls | Bryan Newbold | 2019-08-27 | 1 | -88/+88 | |
| | ||||||
* | current invalid fatcat ISSN-Ls | Bryan Newbold | 2019-08-27 | 1 | -0/+118 | |
| | | | | | AKA, list of fatcat containers with an ISSN-L that isn't a valid ISSN (based on checksum) | |||||
* | only fatcat_export 'valid' (syntax) ISSN-Ls | Bryan Newbold | 2019-08-27 | 1 | -1/+1 | |
| | ||||||
* | include Szczepanski in everything command (oops) | Bryan Newbold | 2019-08-27 | 1 | -0/+1 | |
| | ||||||
* | updated crossref title file; ISSN-L file link | Bryan Newbold | 2019-08-27 | 3 | -3/+3 | |
| | ||||||
* | update IA_CRAWL_FILE | Bryan Newbold | 2019-07-31 | 1 | -1/+1 | |
| | ||||||
* | commit TODO list | Bryan Newbold | 2019-07-31 | 1 | -0/+37 | |
| | ||||||
* | update fetch.sh with url_status files | Bryan Newbold | 2019-07-31 | 1 | -0/+3 | |
| | ||||||
* | webarchive_urls separate from regular URLs | Bryan Newbold | 2019-07-31 | 1 | -1/+21 | |
| | ||||||
* | don't return 'error' for bad CDX lookups | Bryan Newbold | 2019-07-31 | 1 | -1/+3 | |
| | ||||||
* | add 'export_fatcat' | Bryan Newbold | 2019-07-31 | 1 | -1/+51 | |
| | ||||||
* | README update | Bryan Newbold | 2019-07-31 | 1 | -21/+35 | |
| | ||||||
* | more check_issn_urls corner-cases | Bryan Newbold | 2019-07-31 | 1 | -1/+5 | |
| | ||||||
* | handle 'ttp://' URL prefix corner case | Bryan Newbold | 2019-07-31 | 1 | -0/+2 | |
| | ||||||
* | broader top-level gitignore | Bryan Newbold | 2019-07-31 | 1 | -0/+25 | |
| | ||||||
* | remove python 3.5 constraint | Bryan Newbold | 2019-07-31 | 2 | -6/+4 | |
| | ||||||
* | pipenv: datasette | Bryan Newbold | 2019-07-31 | 2 | -1/+145 | |
| | ||||||
* | add wikidata SPARQL query | Bryan Newbold | 2019-07-31 | 1 | -0/+35 | |
| | ||||||
* | sqlite-notebook template for basic chocula stats | Bryan Newbold | 2019-07-31 | 2 | -0/+186 | |
| | ||||||
* | iterate on homepage url import/stats | Bryan Newbold | 2019-07-31 | 2 | -21/+43 | |
| | ||||||
* | more issn URL checker fixes | Bryan Newbold | 2019-07-31 | 2 | -11/+27 | |
| | ||||||
* | major improvements to ISSN URL checker | Bryan Newbold | 2019-07-30 | 1 | -20/+121 | |
| | ||||||
* | import vanilla ISSN url checker script | Bryan Newbold | 2019-07-30 | 1 | -0/+52 | |
| | ||||||
* | chocula: sherpa_color in summary; cleanups | Bryan Newbold | 2019-07-30 | 3 | -6/+12 | |
| | ||||||
* | chocula: openapc | Bryan Newbold | 2019-07-30 | 1 | -1/+31 | |
| | ||||||
* | chocula: json export | Bryan Newbold | 2019-07-30 | 1 | -0/+17 | |
| | ||||||
* | chocula: fix wikidata_qid inclusion | Bryan Newbold | 2019-07-30 | 1 | -2/+3 | |
| | ||||||
* | chocula: fix wikidata_qid inclusion | Bryan Newbold | 2019-07-30 | 2 | -1/+3 | |
| | ||||||
* | chocula: better ISSN-L handling | Bryan Newbold | 2019-07-30 | 4 | -24/+41 | |
| | ||||||
* | chocula: updated fetches, new ISSN-L and DOAJ files | Bryan Newbold | 2019-07-30 | 2 | -7/+10 | |
| | ||||||
* | chocula: wikidata indexing | Bryan Newbold | 2019-07-30 | 1 | -4/+48 | |
| |