path: root/chocula/database.py
Commit message | Author | Age | Files | Lines
--- | --- | --- | --- | ---
database: work around annoying ISSN-L column issue | Bryan Newbold | 2022-07-29 | 1 | -1/+1
more publisher_type pattern matching | Bryan Newbold | 2022-07-21 | 1 | -2/+2
more homepage domains to ignore (and resort) | Bryan Newbold | 2022-07-21 | 1 | -28/+33
in fatcat exports, skip 'UNKNOWN_TITLE' | Bryan Newbold | 2021-11-30 | 1 | -0/+5
handle homepage check with no status (skip, etc) | Bryan Newbold | 2021-11-30 | 1 | -1/+1
simplify homepage URL handling code a bit | Bryan Newbold | 2021-11-30 | 1 | -12/+14
improve homepage URL filtering | Bryan Newbold | 2021-11-30 | 1 | -14/+28
more HomepageUrl filtering | Bryan Newbold | 2021-11-24 | 1 | -0/+3
make fmt | Bryan Newbold | 2021-04-23 | 1 | -1/+5
database support for scholarsportal and cariniana preservation holdings | Bryan Newbold | 2020-10-08 | 1 | -0/+2
do not create hathitrust-only journal rows | Bryan Newbold | 2020-09-02 | 1 | -1/+2
hathitrust KBART-style importer | Bryan Newbold | 2020-09-02 | 1 | -1/+8
include pkp_pln as a kbart directory in summarization/export/etc | Bryan Newbold | 2020-08-31 | 1 | -1/+1
fmt | Bryan Newbold | 2020-08-31 | 1 | -8/+21
fatcat export improvements | Bryan Newbold | 2020-08-03 | 1 | -9/+28
more blocked URLs and domains | Bryan Newbold | 2020-08-03 | 1 | -0/+29
directories: all extra metadata in top-level dict | Bryan Newbold | 2020-08-03 | 1 | -7/+3
skip umi.com in addition to www.umi.com | Bryan Newbold | 2020-06-23 | 1 | -0/+1
ensure lang is len()==2; prep for original_name column | Bryan Newbold | 2020-06-23 | 1 | -0/+5
block/skip more homepage patterns | Bryan Newbold | 2020-06-23 | 1 | -0/+9
fix langs inclusion in summarization; remove unused/duplicate fields | Bryan Newbold | 2020-06-23 | 1 | -2/+2
set is_active flag based on directories | Bryan Newbold | 2020-06-23 | 1 | -0/+5
filter out more meta/index URL hosts | Bryan Newbold | 2020-06-23 | 1 | -1/+15
Revert "EZB color not a good proxy for OA status" | Bryan Newbold | 2020-06-23 | 1 | -0/+2
be more careful with sherpa/romeo color summarization | Bryan Newbold | 2020-06-22 | 1 | -3/+4
EZB color not a good proxy for OA status | Bryan Newbold | 2020-06-22 | 1 | -2/+0
flake8 cleanups | Bryan Newbold | 2020-06-22 | 1 | -3/+1
fmt (black) | Bryan Newbold | 2020-06-22 | 1 | -248/+356
remove un-necessary list() in iteration | Bryan Newbold | 2020-06-22 | 1 | -1/+1
use and pass-through 'platform' extra metadata | Bryan Newbold | 2020-06-11 | 1 | -4/+7
add KBART parsing/importing | Bryan Newbold | 2020-06-02 | 1 | -51/+9
fix tests and type annotations | Bryan Newbold | 2020-06-01 | 1 | -22/+21
'everything' at least partially working | Bryan Newbold | 2020-06-01 | 1 | -107/+35
update code to work with new config structure | Bryan Newbold | 2020-05-07 | 1 | -2/+2
start a Makefile | Bryan Newbold | 2020-05-07 | 1 | -499/+254
rename chocula.database | Bryan Newbold | 2020-05-06 | 1 | -0/+1015