Commit message | Author | Age | Files | Lines
* add support for PKP PLN (KBART-like) | Bryan Newbold | 2020-08-31 | 5 | -1/+139
* fix img typo | Bryan Newbold | 2020-08-19 | 1 | -1/+1
* bump sources date | Bryan Newbold | 2020-08-03 | 1 | -2/+2
* fatcat export improvements | Bryan Newbold | 2020-08-03 | 1 | -9/+28
* more blocked URLs and domains | Bryan Newbold | 2020-08-03 | 1 | -0/+29
* directories: all extra metadata in top-level dict | Bryan Newbold | 2020-08-03 | 4 | -13/+9
* sim: some flag fields as boolean | Bryan Newbold | 2020-08-03 | 1 | -2/+12
* doaj bug: wasn't setting extra directory metadata | Bryan Newbold | 2020-08-03 | 1 | -9/+8
* brief not on how many remaining missing longtail homepages | Bryan Newbold | 2020-07-08 | 1 | -0/+3
* sources: automated updates, plus container+homepage stats/status | Bryan Newbold | 2020-07-08 | 1 | -4/+4
* update reports | Bryan Newbold | 2020-07-08 | 2 | -6/+1245
* remove trailing whitespace from comment | Bryan Newbold | 2020-06-25 | 1 | -7/+7
* small improvements to check URL script | Bryan Newbold | 2020-06-25 | 1 | -2/+2
* improvements to Makefile stats/status commands | Bryan Newbold | 2020-06-25 | 1 | -2/+2
* update TODO | Bryan Newbold | 2020-06-23 | 1 | -21/+15
* update notes about longtail homepage URLs | Bryan Newbold | 2020-06-23 | 2 | -3/+112
* updated report HTML | Bryan Newbold | 2020-06-23 | 1 | -0/+1172
* add MAG importer; reorder directory class listing | Bryan Newbold | 2020-06-23 | 5 | -10/+110
* block some meta strings | Bryan Newbold | 2020-06-23 | 1 | -0/+3
* skip umi.com in addition to www.umi.com | Bryan Newbold | 2020-06-23 | 1 | -0/+1
* commit notes and issnl_prefix.py helper script | Bryan Newbold | 2020-06-23 | 4 | -0/+157
* road: proper language parsing | Bryan Newbold | 2020-06-23 | 1 | -2/+6
* ensure lang is len()==2; prep for original_name column | Bryan Newbold | 2020-06-23 | 1 | -0/+5
* make fmt | Bryan Newbold | 2020-06-23 | 1 | -34/+39
* update sources snapshot | Bryan Newbold | 2020-06-23 | 1 | -2/+2
* flake8: ignore comment w/o space | Bryan Newbold | 2020-06-23 | 1 | -1/+1
* expand test coverage to kbart, summarize | Bryan Newbold | 2020-06-23 | 5 | -49/+102
* tests and fixes for parse_lang(), parse_country() | Bryan Newbold | 2020-06-23 | 1 | -19/+78
* block/skip more homepage patterns | Bryan Newbold | 2020-06-23 | 1 | -0/+9
* fix langs inclusion in summarization; remove unused/duplicate fields | Bryan Newbold | 2020-06-23 | 1 | -2/+2
* strip control characters from titles (issn_meta) | Bryan Newbold | 2020-06-23 | 1 | -0/+4
* fix issn_meta country detection | Bryan Newbold | 2020-06-23 | 1 | -5/+8
* improve lang parsing | Bryan Newbold | 2020-06-23 | 5 | -7/+11
* issn_meta: mainTitle can be an array | Bryan Newbold | 2020-06-23 | 1 | -1/+4
* set is_active flag based on directories | Bryan Newbold | 2020-06-23 | 1 | -0/+5
* sources, ISSN-L test mappings, __init__ for recent importers | Bryan Newbold | 2020-06-23 | 3 | -0/+87
* ZDB homepage (FIZE) scrape importer | Bryan Newbold | 2020-06-23 | 2 | -0/+59
* australian ERA journal list importer | Bryan Newbold | 2020-06-23 | 2 | -0/+79
* vanished journal metadata importer | Bryan Newbold | 2020-06-23 | 4 | -0/+163
* ISSN portal metadata directory importer | Bryan Newbold | 2020-06-23 | 2 | -0/+86
* AWOL directory importer | Bryan Newbold | 2020-06-23 | 2 | -0/+101
* new sources: issn_meta, zdb_fize | Bryan Newbold | 2020-06-23 | 2 | -0/+83
* filter out more meta/index URL hosts | Bryan Newbold | 2020-06-23 | 1 | -1/+15
* Revert "EZB color not a good proxy for OA status" | Bryan Newbold | 2020-06-23 | 1 | -0/+2
* new manual homepage source | Bryan Newbold | 2020-06-23 | 5 | -0/+93
* be more careful with sherpa/romeo color summarization | Bryan Newbold | 2020-06-22 | 1 | -3/+4
* EZB color not a good proxy for OA status | Bryan Newbold | 2020-06-22 | 1 | -2/+0
* additional small flake8 fixes | Bryan Newbold | 2020-06-22 | 2 | -2/+2
* flake8: don't do annotation warnings by default | Bryan Newbold | 2020-06-22 | 1 | -1/+2
* makefile: coverage target | Bryan Newbold | 2020-06-22 | 1 | -0/+4