Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | add openalex directory source | Bryan Newbold | 2021-11-22 | 2 | -0/+70 |
| | Always run as day-specific ("TODAY") commands. Add timeouts so command actually completes reasonably. | ||||
* | make fmt | Bryan Newbold | 2021-04-23 | 8 | -11/+28 |
| | |||||
* | doaj: updates for new file format; removed some fields/metadata | Bryan Newbold | 2021-04-23 | 1 | -55/+43 |
| | |||||
* | SIM: cap maximum year of coverage | Bryan Newbold | 2020-12-07 | 1 | -0/+3 |
| | |||||
* | database support for scholarsportal and cariniana preservation holdings | Bryan Newbold | 2020-10-08 | 3 | -1/+72 |
| | |||||
* | vanished_inactive: more tolerant handling of unicode BOM | Bryan Newbold | 2020-10-08 | 1 | -1/+2 |
| | |||||
* | util: parse ISSN format with extra spaces | Bryan Newbold | 2020-09-13 | 1 | -0/+2 |
| | |||||
* | update vanished journal importer for 2020-09-03 dataset | Bryan Newbold | 2020-09-13 | 2 | -30/+18 |
| | |||||
* | do not create hathitrust-only journal rows | Bryan Newbold | 2020-09-02 | 1 | -1/+2 |
| | |||||
* | hathitrust KBART-style importer | Bryan Newbold | 2020-09-02 | 4 | -2/+106 |
| | |||||
* | include pkp_pln as a kbart directory in summarization/export/etc | Bryan Newbold | 2020-08-31 | 1 | -1/+1 |
| | |||||
* | fmt | Bryan Newbold | 2020-08-31 | 3 | -12/+29 |
| | |||||
* | add support for PKP PLN (KBART-like) | Bryan Newbold | 2020-08-31 | 3 | -1/+57 |
| | |||||
* | fatcat export improvements | Bryan Newbold | 2020-08-03 | 1 | -9/+28 |
| | |||||
* | more blocked URLs and domains | Bryan Newbold | 2020-08-03 | 1 | -0/+29 |
| | |||||
* | directories: all extra metadata in top-level dict | Bryan Newbold | 2020-08-03 | 4 | -13/+9 |
| | Had been using slug-specific sub-objects, but this was too confusing. | ||||
* | sim: some flag fields as boolean | Bryan Newbold | 2020-08-03 | 1 | -2/+12 |
| | |||||
* | doaj bug: wasn't setting extra directory metadata | Bryan Newbold | 2020-08-03 | 1 | -9/+8 |
| | |||||
* | remove trailing whitespace from comment | Bryan Newbold | 2020-06-25 | 1 | -7/+7 |
| | |||||
* | add MAG importer; reorder directory class listing | Bryan Newbold | 2020-06-23 | 2 | -10/+73 |
| | |||||
* | block some meta strings | Bryan Newbold | 2020-06-23 | 1 | -0/+3 |
| | |||||
* | skip umi.com in addition to www.umi.com | Bryan Newbold | 2020-06-23 | 1 | -0/+1 |
| | |||||
* | road: proper language parsing | Bryan Newbold | 2020-06-23 | 1 | -2/+6 |
| | |||||
* | ensure lang is len()==2; prep for original_name column | Bryan Newbold | 2020-06-23 | 1 | -0/+5 |
| | |||||
* | make fmt | Bryan Newbold | 2020-06-23 | 1 | -34/+39 |
| | |||||
* | tests and fixes for parse_lang(), parse_country() | Bryan Newbold | 2020-06-23 | 1 | -19/+78 |
| | These were basically entirely broken. Oof! | ||||
* | block/skip more homepage patterns | Bryan Newbold | 2020-06-23 | 1 | -0/+9 |
| | |||||
* | fix langs inclusion in summarization; remove unused/duplicate fields | Bryan Newbold | 2020-06-23 | 1 | -2/+2 |
| | |||||
* | strip control characters from titles (issn_meta) | Bryan Newbold | 2020-06-23 | 1 | -0/+4 |
| | |||||
* | fix issn_meta country detection | Bryan Newbold | 2020-06-23 | 1 | -5/+8 |
| | |||||
* | improve lang parsing | Bryan Newbold | 2020-06-23 | 5 | -7/+11 |
| | |||||
* | issn_meta: mainTitle can be an array | Bryan Newbold | 2020-06-23 | 1 | -1/+4 |
| | |||||
* | set is_active flag based on directories | Bryan Newbold | 2020-06-23 | 1 | -0/+5 |
| | |||||
* | sources, ISSN-L test mappings, __init__ for recent importers | Bryan Newbold | 2020-06-23 | 1 | -0/+12 |
| | |||||
* | ZDB homepage (FIZE) scrape importer | Bryan Newbold | 2020-06-23 | 1 | -0/+34 |
| | |||||
* | australian ERA journal list importer | Bryan Newbold | 2020-06-23 | 1 | -0/+54 |
| | |||||
* | vanished journal metadata importer | Bryan Newbold | 2020-06-23 | 2 | -0/+113 |
| | |||||
* | ISSN portal metadata directory importer | Bryan Newbold | 2020-06-23 | 1 | -0/+61 |
| | |||||
* | AWOL directory importer | Bryan Newbold | 2020-06-23 | 1 | -0/+76 |
| | |||||
* | filter out more meta/index URL hosts | Bryan Newbold | 2020-06-23 | 1 | -1/+15 |
| | |||||
* | Revert "EZB color not a good proxy for OA status" | Bryan Newbold | 2020-06-23 | 1 | -0/+2 |
| | I think this actually is Ok in the context of identifying longtail journals. We don't set the `is_oa` flag in release metadata based on this chocula flag. This reverts commit 9ba5b2e307c7f61f60304ba104bf3cc8424b7163. | ||||
* | new manual homepage source | Bryan Newbold | 2020-06-23 | 2 | -0/+49 |
| | |||||
* | be more careful with sherpa/romeo color summarization | Bryan Newbold | 2020-06-22 | 1 | -3/+4 |
| | |||||
* | EZB color not a good proxy for OA status | Bryan Newbold | 2020-06-22 | 1 | -2/+0 |
| | |||||
* | additional small flake8 fixes | Bryan Newbold | 2020-06-22 | 2 | -2/+2 |
| | |||||
* | flake8 cleanups | Bryan Newbold | 2020-06-22 | 7 | -17/+9 |
| | |||||
* | norwegian: fixes from bugs flake8 helped find | Bryan Newbold | 2020-06-22 | 1 | -3/+2 |
| | |||||
* | fmt (black) | Bryan Newbold | 2020-06-22 | 21 | -613/+766 |
| | |||||
* | remove un-necessary list() in iteration | Bryan Newbold | 2020-06-22 | 1 | -1/+1 |
| | |||||
* | additional OJS platform names | Bryan Newbold | 2020-06-11 | 1 | -0/+2 |