Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | add support for PKP PLN (KBART-like) | Bryan Newbold | 2020-08-31 | 5 | -1/+139 |
| | |||||
* | fix img typo | Bryan Newbold | 2020-08-19 | 1 | -1/+1 |
| | |||||
* | bump sources date | Bryan Newbold | 2020-08-03 | 1 | -2/+2 |
| | |||||
* | fatcat export improvements | Bryan Newbold | 2020-08-03 | 1 | -9/+28 |
| | |||||
* | more blocked URLs and domains | Bryan Newbold | 2020-08-03 | 1 | -0/+29 |
| | |||||
* | directories: all extra metadata in top-level dict | Bryan Newbold | 2020-08-03 | 4 | -13/+9 |
| | | | | Had been using slug-specific sub-objects, but this was too confusing. | ||||
* | sim: some flag fields as boolean | Bryan Newbold | 2020-08-03 | 1 | -2/+12 |
| | |||||
* | doaj bug: wasn't setting extra directory metadata | Bryan Newbold | 2020-08-03 | 1 | -9/+8 |
| | |||||
* | brief note on how many remaining missing longtail homepages | Bryan Newbold | 2020-07-08 | 1 | -0/+3 |
| | |||||
* | sources: automated updates, plus container+homepage stats/status | Bryan Newbold | 2020-07-08 | 1 | -4/+4 |
| | |||||
* | update reports | Bryan Newbold | 2020-07-08 | 2 | -6/+1245 |
| | |||||
* | remove trailing whitespace from comment | Bryan Newbold | 2020-06-25 | 1 | -7/+7 |
| | |||||
* | small improvements to check URL script | Bryan Newbold | 2020-06-25 | 1 | -2/+2 |
| | |||||
* | improvements to Makefile stats/status commands | Bryan Newbold | 2020-06-25 | 1 | -2/+2 |
| | |||||
* | update TODO | Bryan Newbold | 2020-06-23 | 1 | -21/+15 |
| | |||||
* | update notes about longtail homepage URLs | Bryan Newbold | 2020-06-23 | 2 | -3/+112 |
| | |||||
* | updated report HTML | Bryan Newbold | 2020-06-23 | 1 | -0/+1172 |
| | |||||
* | add MAG importer; reorder directory class listing | Bryan Newbold | 2020-06-23 | 5 | -10/+110 |
| | |||||
* | block some meta strings | Bryan Newbold | 2020-06-23 | 1 | -0/+3 |
| | |||||
* | skip umi.com in addition to www.umi.com | Bryan Newbold | 2020-06-23 | 1 | -0/+1 |
| | |||||
* | commit notes and issnl_prefix.py helper script | Bryan Newbold | 2020-06-23 | 4 | -0/+157 |
| | |||||
* | road: proper language parsing | Bryan Newbold | 2020-06-23 | 1 | -2/+6 |
| | |||||
* | ensure lang is len()==2; prep for original_name column | Bryan Newbold | 2020-06-23 | 1 | -0/+5 |
| | |||||
* | make fmt | Bryan Newbold | 2020-06-23 | 1 | -34/+39 |
| | |||||
* | update sources snapshot | Bryan Newbold | 2020-06-23 | 1 | -2/+2 |
| | |||||
* | flake8: ignore comment w/o space | Bryan Newbold | 2020-06-23 | 1 | -1/+1 |
| | |||||
* | expand test coverage to kbart, summarize | Bryan Newbold | 2020-06-23 | 5 | -49/+102 |
| | |||||
* | tests and fixes for parse_lang(), parse_country() | Bryan Newbold | 2020-06-23 | 1 | -19/+78 |
| | | | | These were basically entirely broken. Oof! | ||||
* | block/skip more homepage patterns | Bryan Newbold | 2020-06-23 | 1 | -0/+9 |
| | |||||
* | fix langs inclusion in summarization; remove unused/duplicate fields | Bryan Newbold | 2020-06-23 | 1 | -2/+2 |
| | |||||
* | strip control characters from titles (issn_meta) | Bryan Newbold | 2020-06-23 | 1 | -0/+4 |
| | |||||
* | fix issn_meta country detection | Bryan Newbold | 2020-06-23 | 1 | -5/+8 |
| | |||||
* | improve lang parsing | Bryan Newbold | 2020-06-23 | 5 | -7/+11 |
| | |||||
* | issn_meta: mainTitle can be an array | Bryan Newbold | 2020-06-23 | 1 | -1/+4 |
| | |||||
* | set is_active flag based on directories | Bryan Newbold | 2020-06-23 | 1 | -0/+5 |
| | |||||
* | sources, ISSN-L test mappings, __init__ for recent importers | Bryan Newbold | 2020-06-23 | 3 | -0/+87 |
| | |||||
* | ZDB homepage (FIZE) scrape importer | Bryan Newbold | 2020-06-23 | 2 | -0/+59 |
| | |||||
* | Australian ERA journal list importer | Bryan Newbold | 2020-06-23 | 2 | -0/+79 |
| | |||||
* | vanished journal metadata importer | Bryan Newbold | 2020-06-23 | 4 | -0/+163 |
| | |||||
* | ISSN portal metadata directory importer | Bryan Newbold | 2020-06-23 | 2 | -0/+86 |
| | |||||
* | AWOL directory importer | Bryan Newbold | 2020-06-23 | 2 | -0/+101 |
| | |||||
* | new sources: issn_meta, zdb_fize | Bryan Newbold | 2020-06-23 | 2 | -0/+83 |
| | |||||
* | filter out more meta/index URL hosts | Bryan Newbold | 2020-06-23 | 1 | -1/+15 |
| | |||||
* | Revert "EZB color not a good proxy for OA status" | Bryan Newbold | 2020-06-23 | 1 | -0/+2 |
| | | | | I think this is actually OK in the context of identifying longtail journals. We don't set the `is_oa` flag in release metadata based on this chocula flag. This reverts commit 9ba5b2e307c7f61f60304ba104bf3cc8424b7163. | ||||
* | new manual homepage source | Bryan Newbold | 2020-06-23 | 5 | -0/+93 |
| | |||||
* | be more careful with sherpa/romeo color summarization | Bryan Newbold | 2020-06-22 | 1 | -3/+4 |
| | |||||
* | EZB color not a good proxy for OA status | Bryan Newbold | 2020-06-22 | 1 | -2/+0 |
| | |||||
* | additional small flake8 fixes | Bryan Newbold | 2020-06-22 | 2 | -2/+2 |
| | |||||
* | flake8: don't do annotation warnings by default | Bryan Newbold | 2020-06-22 | 1 | -1/+2 |
| | |||||
* | makefile: coverage target | Bryan Newbold | 2020-06-22 | 1 | -0/+4 |
| |