Commit message | Author | Age | Files | Lines | ||
---|---|---|---|---|---|---|
... | ||||||
* | | updates to Makefile | Bryan Newbold | 2020-07-01 | 3 | -6/+33 | |
| | | ||||||
* | | reviewer: fix bugs in common code found by mypy | Bryan Newbold | 2020-07-01 | 1 | -2/+3 | |
| | | ||||||
* | | update TODO with some old examples | Bryan Newbold | 2020-07-01 | 1 | -0/+10 | |
| | | ||||||
* | | commit old example notes | Bryan Newbold | 2020-07-01 | 3 | -0/+65 | |
| | | ||||||
* | | JALC bulk edit notes from 2020-03-23 | Bryan Newbold | 2020-07-01 | 1 | -0/+23 | |
| | | ||||||
* | | commit example of an elasticsearch SQL query | Bryan Newbold | 2020-07-01 | 1 | -0/+8 | |
| | | ||||||
* | | commit old README about bulk downloads | Bryan Newbold | 2020-07-01 | 1 | -0/+40 | |
|/ | ||||||
* | CLI proposal | Bryan Newbold | 2020-06-30 | 1 | -0/+124 | |
| | ||||||
* | add new license mappings | Bryan Newbold | 2020-06-30 | 2 | -0/+27 | |
| | ||||||
* | datacite: improve license mapping | Martin Czygan | 2020-06-30 | 2 | -9/+29 | |
| | | | | via "missed potential license", refs #58 | |||||
* | Merge branch 'martin-datacite-fix-strptime-36559' into 'master' | bnewbold | 2020-06-29 | 2 | -1/+2 | |
|\ | | | | | | | | | datacite: hard cast possible date value to string See merge request webgroup/fatcat!59 | |||||
| * | datacite: hard cast possible date value to string | Martin Czygan | 2020-06-29 | 2 | -1/+2 | |
|/ | ||||||
* | remove accidentally-commited lines from rust Makefile | Bryan Newbold | 2020-06-26 | 1 | -3/+0 | |
| | ||||||
* | disallow a specific unicode character from DOIs | Bryan Newbold | 2020-06-26 | 1 | -0/+6 | |
| | ||||||
* | Merge branch 'martin-fulltext-checkbox-label' into 'master' | bnewbold | 2020-06-17 | 1 | -2/+2 | |
|\ | | | | | | | | | make fulltext-only label clickable See merge request webgroup/fatcat!58 | |||||
| * | make fulltext-only label clickable | Martin Czygan | 2020-06-16 | 1 | -2/+2 | |
|/ | ||||||
* | Merge branch 'bnewbold-better-button-links' into 'master' | Martin Czygan | 2020-06-05 | 5 | -4/+19 | |
|\ | | | | | | | | | better download button links See merge request webgroup/fatcat!57 | |||||
| * | use ES 'best_url' in file download pages | Bryan Newbold | 2020-06-04 | 2 | -2/+4 | |
| | | | | | | | | Similar to recent change for release download pages. | |||||
| * | ES schema: add best_url to file schema | Bryan Newbold | 2020-06-04 | 2 | -0/+13 | |
| | | | | | | | | | | | | | | | | | | This will increase index size (URLs are often long in our corpus, and we have many file entities), but seems worth it. Initially added `ia_url` as a second field, guaranteed to always be an *.archive.org URL, but `best_url` defaults to that anyways so didn't seem worthwhile. | |||||
| * | re-use 'best pdf url' for release green button | Bryan Newbold | 2020-06-04 | 1 | -2/+2 | |
| | | | | | | | | | | | | | | I thought this was the existing behavior, but it looks like we were just taking the first link from the first file. In the future may refactor this out even further. | |||||
* | | fix 'dev' target in python makefile | Bryan Newbold | 2020-06-04 | 1 | -1/+1 | |
|/ | ||||||
* | Merge remote-tracking branch 'origin/martin-harvest-fail-on-400' | Bryan Newbold | 2020-05-29 | 1 | -4/+0 | |
|\ | | | | | | | | | | | Manually resolved conflicts: python/fatcat_tools/harvest/doi_registrars.py | |||||
| * | harvest: fail on HTTP 400 | Martin Czygan | 2020-05-29 | 1 | -4/+0 | |
| | | | | | | | | | | | | | | | | | | In the past harvest of datacite resulted in occasional HTTP 400. Meanwhile, various API bugs have been fixed (most recently: https://github.com/datacite/lupo/pull/537, https://github.com/datacite/datacite/issues/1038). Downside of ignoring this error was that state lives in kafka, which has limited support for deletion of arbitrary messages from a topic. | |||||
* | | Merge branch 'martin-datacite-harvest-log-output' into 'master' | Martin Czygan | 2020-05-29 | 1 | -1/+1 | |
|\ \ | | | | | | | | | | | | | harvest: log the failed url See merge request webgroup/fatcat!55 | |||||
| * | | harvest: log the failed url | Martin Czygan | 2020-05-29 | 1 | -1/+1 | |
| |/ | ||||||
* | | Merge branch 'martin-datacite-harvest-test-docs' into 'master' | Martin Czygan | 2020-05-29 | 1 | -3/+3 | |
|\ \ | |/ |/| | | | | | datacite: fix test docs See merge request webgroup/fatcat!54 | |||||
| * | datacite: fix test docs | Martin Czygan | 2020-05-29 | 1 | -3/+3 | |
|/ | ||||||
* | Merge branch 'bnewbold-ingest-stage' into 'master' | Martin Czygan | 2020-05-28 | 3 | -7/+46 | |
|\ | | | | | | | | | verify release_stage in ingest importer See merge request webgroup/fatcat!52 | |||||
| * | ingest importer: check that stage is consistent with release | Bryan Newbold | 2020-05-26 | 1 | -0/+5 | |
| | | ||||||
| * | regression test for release_stage mismatch with ingest request | Bryan Newbold | 2020-05-26 | 2 | -7/+41 | |
| | | ||||||
* | | Merge branch 'bnewbold-harvest-state-next-span' into 'master' | Martin Czygan | 2020-05-27 | 5 | -7/+7 | |
|\ \ | |/ |/| | | | | | rename HarvestState.next() to HarvestState.next_span() See merge request webgroup/fatcat!53 | |||||
| * | rename HarvestState.next() to HarvestState.next_span() | Bryan Newbold | 2020-05-26 | 5 | -7/+7 | |
|/ | | | | | | | | | "span" short for "timespan" to harvest; there may be a better name to use. Motivation for this is to work around a pylint error that .next() was not callable. This might be a bug with pylint, but .next() is also a very generic name. | |||||
* | add work-in-progress Rust makefile | Bryan Newbold | 2020-05-26 | 2 | -2/+29 | |
| | ||||||
* | add a work-in-progress python Makefile | Bryan Newbold | 2020-05-26 | 1 | -0/+24 | |
| | ||||||
* | pylintrc: skip many spurious WTForm no-member errors | Bryan Newbold | 2020-05-26 | 1 | -0/+2 | |
| | ||||||
* | HACK: try to squelch pylint in CI | Bryan Newbold | 2020-05-26 | 1 | -2/+2 | |
| | | | | | | | | | | | | | | | | | Gitlab CI is showing lint errors like: =================================== FAILURES =================================== _______________________ [pylint] tests/harvest_state.py ________________________ E: 19,11: hs.next is not callable (not-callable) E: 33,11: hs.next is not callable (not-callable) E: 19,11: hs.next is not callable (not-callable) [...] This is confusing, as we use pipenv with a lock, so I should see the exact same errors locally. This commit is a hack to try and fix this and unbreak builds until we can debug further. | |||||
* | sql: really don't double-dump requests | Bryan Newbold | 2020-05-26 | 1 | -1/+0 | |
| | | | | | | I guess we were dumping 3 times originally; already had an earlier commit that removed one row from this README (that I copypaste to CLI every time) | |||||
* | 2020-05-26 prod database size and stats | Bryan Newbold | 2020-05-26 | 2 | -0/+48 | |
| | ||||||
* | HACK: skip pylint errors on lines that seem to be fine | Bryan Newbold | 2020-05-22 | 3 | -3/+3 | |
| | | | | | It seems to be an inadvertently upgraded version of pylint saying that these lines are not-callable. | |||||
* | run flake8 in CI | Bryan Newbold | 2020-05-22 | 1 | -0/+1 | |
| | ||||||
* | pipenv: add flake8 | Bryan Newbold | 2020-05-22 | 2 | -183/+213 | |
| | ||||||
* | Merge remote-tracking branch 'github/master' | Bryan Newbold | 2020-05-22 | 3 | -11/+11 | |
|\ | ||||||
| * | Merge pull request #55 from cclauss/patch-1 | bnewbold | 2020-05-22 | 3 | -11/+11 | |
| |\ | | | | | | | Travis CI: Lint Python code for syntax errors and undefined names | |||||
| | * | LICENSE.md: Properly capitalize brand names | Christian Clauss | 2020-05-14 | 1 | -4/+4 | |
| | | | ||||||
| | * | Delete .travis.yml | Christian Clauss | 2020-05-14 | 1 | -6/+0 | |
| | | | ||||||
| | * | Indentity is not the same this as equality in Python | Christian Clauss | 2020-05-14 | 1 | -2/+2 | |
| | | | ||||||
| | * | Indentity is not the same this as equality in Python | Christian Clauss | 2020-05-14 | 1 | -5/+5 | |
| | | | ||||||
| | * | python: 3.8 | Christian Clauss | 2020-05-13 | 1 | -0/+2 | |
| | | | ||||||
| | * | Travis CI: Lint Python code for syntax errors and undefined names | Christian Clauss | 2020-05-13 | 1 | -0/+4 | |
| |/ | ||||||
* | | importers: clarify handling of ApiException | Bryan Newbold | 2020-05-22 | 3 | -4/+10 | |
| | | | | | | | | | | | | | | | | One of these (in ingest importer pipeline) is an actual bug, the others are just changing the syntax to be more explicit/conservative. The ingest importer bug seems to have resulted in some bad file match imports; scale of impact is unknown. |