Commit message  Author  Age  Files  Lines
...
* make: entrez.txt, not entrez.csv  Bryan Newbold  2020-10-08  1  -2/+2
|
* vanished_inactive: more tolerant handling of unicode BOM  Bryan Newbold  2020-10-08  1  -1/+2
|
* basic ONIX XML-to-JSON converter  Bryan Newbold  2020-10-08  1  -0/+151
|
* fix typo in sources  Bryan Newbold  2020-10-08  1  -1/+1
|
* util: parse ISSN format with extra spaces  Bryan Newbold  2020-09-13  1  -0/+2
|
* update vanished journal importer for 2020-09-03 dataset  Bryan Newbold  2020-09-13  6  -82/+92
|
* update notes and explore  Bryan Newbold  2020-09-03  2  -1/+26
|
* notes on hathitrust importer  Bryan Newbold  2020-09-02  1  -0/+58
|
* update sources (dates)  Bryan Newbold  2020-09-02  1  -4/+4
|
* do not create hathitrust-only journal rows  Bryan Newbold  2020-09-02  1  -1/+2
|
* hathitrust KBART-style importer  Bryan Newbold  2020-09-02  7  -2/+152
|
* commit notes on size/scale of OJS ecosystem  Bryan Newbold  2020-08-31  1  -0/+8
|
* include pkp_pln as a kbart directory in summarization/export/etc  Bryan Newbold  2020-08-31  1  -1/+1
|
* notes on PKP PLN addition  Bryan Newbold  2020-08-31  1  -0/+13
|
* fmt  Bryan Newbold  2020-08-31  3  -12/+29
|
* add makefile/sources support for PKP PLN  Bryan Newbold  2020-08-31  2  -2/+11
|   Also more accurate JSTOR URL in sources.toml
* add support for PKP PLN (KBART-like)  Bryan Newbold  2020-08-31  5  -1/+139
|
* fix img typo  Bryan Newbold  2020-08-19  1  -1/+1
|
* bump sources date  Bryan Newbold  2020-08-03  1  -2/+2
|
* fatcat export improvements  Bryan Newbold  2020-08-03  1  -9/+28
|
* more blocked URLs and domains  Bryan Newbold  2020-08-03  1  -0/+29
|
* directories: all extra metadata in top-level dict  Bryan Newbold  2020-08-03  4  -13/+9
|   Had been using slug-specific sub-objects, but this was too confusing.
* sim: some flag fields as boolean  Bryan Newbold  2020-08-03  1  -2/+12
|
* doaj bug: wasn't setting extra directory metadata  Bryan Newbold  2020-08-03  1  -9/+8
|
* brief not on how many remaining missing longtail homepages  Bryan Newbold  2020-07-08  1  -0/+3
|
* sources: automated updates, plus container+homepage stats/status  Bryan Newbold  2020-07-08  1  -4/+4
|
* update reports  Bryan Newbold  2020-07-08  2  -6/+1245
|
* remove trailing whitespace from comment  Bryan Newbold  2020-06-25  1  -7/+7
|
* small improvements to check URL script  Bryan Newbold  2020-06-25  1  -2/+2
|
* improvements to Makefile stats/status commands  Bryan Newbold  2020-06-25  1  -2/+2
|
* update TODO  Bryan Newbold  2020-06-23  1  -21/+15
|
* update notes about longtail homepage URLs  Bryan Newbold  2020-06-23  2  -3/+112
|
* updated report HTML  Bryan Newbold  2020-06-23  1  -0/+1172
|
* add MAG importer; reorder directory class listing  Bryan Newbold  2020-06-23  5  -10/+110
|
* block some meta strings  Bryan Newbold  2020-06-23  1  -0/+3
|
* skip umi.com in addition to www.umi.com  Bryan Newbold  2020-06-23  1  -0/+1
|
* commit notes and issnl_prefix.py helper script  Bryan Newbold  2020-06-23  4  -0/+157
|
* road: proper language parsing  Bryan Newbold  2020-06-23  1  -2/+6
|
* ensure lang is len()==2; prep for original_name column  Bryan Newbold  2020-06-23  1  -0/+5
|
* make fmt  Bryan Newbold  2020-06-23  1  -34/+39
|
* update sources snapshot  Bryan Newbold  2020-06-23  1  -2/+2
|
* flake8: ignore comment w/o space  Bryan Newbold  2020-06-23  1  -1/+1
|
* expand test coverage to kbart, summarize  Bryan Newbold  2020-06-23  5  -49/+102
|
* tests and fixes for parse_lang(), parse_country()  Bryan Newbold  2020-06-23  1  -19/+78
|   These were basically entirely broken. Oof!
* block/skip more homepage patterns  Bryan Newbold  2020-06-23  1  -0/+9
|
* fix langs inclusion in summarization; remove unused/duplicate fields  Bryan Newbold  2020-06-23  1  -2/+2
|
* strip control characters from titles (issn_meta)  Bryan Newbold  2020-06-23  1  -0/+4
|
* fix issn_meta country detection  Bryan Newbold  2020-06-23  1  -5/+8
|
* improve lang parsing  Bryan Newbold  2020-06-23  5  -7/+11
|
* issn_meta: mainTitle can be an array  Bryan Newbold  2020-06-23  1  -1/+4
|