| | Commit message | Author | Age | Files | Lines |
---|---|---|---|---|---|
* | update chocula usage of argparse | Bryan Newbold | 2019-12-23 | 1 | -14/+22 |
| | |||||
* | update norwegian CSV importer schema | Bryan Newbold | 2019-12-23 | 1 | -2/+4 |
| | |||||
* | update chocula input data files | Bryan Newbold | 2019-12-23 | 1 | -10/+10 |
| | Including updating fetch script, README links, and chocula.py path references. | | | | |
* | use newer fatcat container dump | Bryan Newbold | 2019-09-06 | 1 | -1/+1 |
| | |||||
* | filter out bad ISSN{e,p} | Bryan Newbold | 2019-09-06 | 1 | -0/+5 |
| | Unfortunately a few hundred of these got pushed into fatcat already; will probably fix with a new fixer bot tool. | | | | |
* | last name/publisher cleanups | Bryan Newbold | 2019-09-03 | 1 | -2/+6 |
| | |||||
* | don't include doaj.org or NCBI homepage URLs | Bryan Newbold | 2019-09-03 | 1 | -0/+4 |
| | |||||
* | improve fatcat_export metadata quality | Bryan Newbold | 2019-09-03 | 1 | -3/+12 |
| | |||||
* | fix SZCEPANSKI typo | Bryan Newbold | 2019-09-03 | 1 | -2/+2 |
| | |||||
* | improve export_fatcat | Bryan Newbold | 2019-08-28 | 1 | -5/+22 |
| | |||||
* | only fatcat_export 'valid' (syntax) ISSN-Ls | Bryan Newbold | 2019-08-27 | 1 | -1/+1 |
| | |||||
* | include Szczepanski in everything command (oops) | Bryan Newbold | 2019-08-27 | 1 | -0/+1 |
| | |||||
* | updated crossref title file; ISSN-L file link | Bryan Newbold | 2019-08-27 | 1 | -1/+1 |
| | |||||
* | update IA_CRAWL_FILE | Bryan Newbold | 2019-07-31 | 1 | -1/+1 |
| | |||||
* | webarchive_urls separate from regular URLs | Bryan Newbold | 2019-07-31 | 1 | -1/+21 |
| | |||||
* | add 'export_fatcat' | Bryan Newbold | 2019-07-31 | 1 | -1/+51 |
| | |||||
* | handle 'ttp://' URL prefix corner case | Bryan Newbold | 2019-07-31 | 1 | -0/+2 |
| | |||||
* | iterate on homepage url import/stats | Bryan Newbold | 2019-07-31 | 1 | -18/+40 |
| | |||||
* | chocula: sherpa_color in summary; cleanups | Bryan Newbold | 2019-07-30 | 1 | -5/+9 |
| | |||||
* | chocula: openapc | Bryan Newbold | 2019-07-30 | 1 | -1/+31 |
| | |||||
* | chocula: json export | Bryan Newbold | 2019-07-30 | 1 | -0/+17 |
| | |||||
* | chocula: fix wikidata_qid inclusion | Bryan Newbold | 2019-07-30 | 1 | -2/+3 |
| | |||||
* | chocula: fix wikidata_qid inclusion | Bryan Newbold | 2019-07-30 | 1 | -0/+2 |
| | |||||
* | chocula: better ISSN-L handling | Bryan Newbold | 2019-07-30 | 1 | -11/+16 |
| | |||||
* | chocula: updated fetches, new ISSN-L and DOAJ files | Bryan Newbold | 2019-07-30 | 1 | -3/+3 |
| | |||||
* | chocula: wikidata indexing | Bryan Newbold | 2019-07-30 | 1 | -4/+48 |
| | |||||
* | chocula: crude publisher type bucketing; field cleanup | Bryan Newbold | 2019-07-30 | 1 | -20/+164 |
| | |||||
* | shorter/simpler table names | Bryan Newbold | 2019-07-26 | 1 | -7/+15 |
| | |||||
* | chocula: more host/domain fixes | Bryan Newbold | 2019-07-26 | 1 | -3/+8 |
| | |||||
* | GOLD OA parsing | Bryan Newbold | 2019-07-26 | 1 | -40/+54 |
| | |||||
* | chocula: fix domain parsing | Bryan Newbold | 2019-07-26 | 1 | -10/+47 |
| | |||||
* | more chocula progress | Bryan Newbold | 2019-07-14 | 1 | -57/+171 |
| | |||||
* | EZB and szczepanski indexers | Bryan Newbold | 2019-07-11 | 1 | -45/+146 |
| | |||||
* | chocula early work | Bryan Newbold | 2019-07-10 | 1 | -0/+798 |
| | (non-functional) | | | | |