path: root/chocula/database.py
Commit message | Author | Date | Files | Lines (-/+)
* make fmt | Bryan Newbold | 2021-04-23 | 1 | -1/+5
|
* database support for scholarsportal and cariniana preservation holdings | Bryan Newbold | 2020-10-08 | 1 | -0/+2
|
* do not create hathitrust-only journal rows | Bryan Newbold | 2020-09-02 | 1 | -1/+2
|
* hathitrust KBART-style importer | Bryan Newbold | 2020-09-02 | 1 | -1/+8
|
* include pkp_pln as a kbart directory in summarization/export/etc | Bryan Newbold | 2020-08-31 | 1 | -1/+1
|
* fmt | Bryan Newbold | 2020-08-31 | 1 | -8/+21
|
* fatcat export improvements | Bryan Newbold | 2020-08-03 | 1 | -9/+28
|
* more blocked URLs and domains | Bryan Newbold | 2020-08-03 | 1 | -0/+29
|
* directories: all extra metadata in top-level dict | Bryan Newbold | 2020-08-03 | 1 | -7/+3
|   Had been using slug-specific sub-objects, but this was too confusing.
* skip umi.com in addition to www.umi.com | Bryan Newbold | 2020-06-23 | 1 | -0/+1
|
* ensure lang is len()==2; prep for original_name column | Bryan Newbold | 2020-06-23 | 1 | -0/+5
|
* block/skip more homepage patterns | Bryan Newbold | 2020-06-23 | 1 | -0/+9
|
* fix langs inclusion in summarization; remove unused/duplicate fields | Bryan Newbold | 2020-06-23 | 1 | -2/+2
|
* set is_active flag based on directories | Bryan Newbold | 2020-06-23 | 1 | -0/+5
|
* filter out more meta/index URL hosts | Bryan Newbold | 2020-06-23 | 1 | -1/+15
|
* Revert "EZB color not a good proxy for OA status" | Bryan Newbold | 2020-06-23 | 1 | -0/+2
|   I think this actually is OK in the context of identifying longtail journals. We don't set the `is_oa` flag in release metadata based on this chocula flag.
|
|   This reverts commit 9ba5b2e307c7f61f60304ba104bf3cc8424b7163.
* be more careful with sherpa/romeo color summarization | Bryan Newbold | 2020-06-22 | 1 | -3/+4
|
* EZB color not a good proxy for OA status | Bryan Newbold | 2020-06-22 | 1 | -2/+0
|
* flake8 cleanups | Bryan Newbold | 2020-06-22 | 1 | -3/+1
|
* fmt (black) | Bryan Newbold | 2020-06-22 | 1 | -248/+356
|
* remove un-necessary list() in iteration | Bryan Newbold | 2020-06-22 | 1 | -1/+1
|
* use and pass-through 'platform' extra metadata | Bryan Newbold | 2020-06-11 | 1 | -4/+7
|
* add KBART parsing/importing | Bryan Newbold | 2020-06-02 | 1 | -51/+9
|
* fix tests and type annotations | Bryan Newbold | 2020-06-01 | 1 | -22/+21
|
* 'everything' at least partially working | Bryan Newbold | 2020-06-01 | 1 | -107/+35
|
* update code to work with new config structure | Bryan Newbold | 2020-05-07 | 1 | -2/+2
|
* start a Makefile | Bryan Newbold | 2020-05-07 | 1 | -499/+254
|   Move all "index" functions into classes, each in a separate file. Add lots of type annotations. Use dataclass objects to hold database rows. This aspect will need further refactoring to remove "extra" usage, probably by adding database rows to align with DatabaseInfo more closely.
* rename chocula.database | Bryan Newbold | 2020-05-06 | 1 | -0/+1015