Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | wip: note on memory | Martin Czygan | 2020-11-23 | 1 | -0/+3 |
| | large clusters might halt program (as they are currently kept in memory) | | | | |
* | tests: find a simple format | Martin Czygan | 2020-11-21 | 6 | -104/+166 |
| | |||||
* | get rid of static folder | Martin Czygan | 2020-11-21 | 3 | -0/+0 |
| | |||||
* | remove title log for now | Martin Czygan | 2020-11-21 | 1 | -6/+4 |
| | |||||
* | remove tmp file | Martin Czygan | 2020-11-21 | 1 | -1/+0 |
| | |||||
* | update notes | Martin Czygan | 2020-11-21 | 2 | -15/+16 |
| | |||||
* | wip: handle empty lists | Martin Czygan | 2020-11-21 | 1 | -6/+9 |
| | |||||
* | wip: datacite, figshare versions | Martin Czygan | 2020-11-21 | 1 | -6/+35 |
| | |||||
* | wip: another contrib comparison | Martin Czygan | 2020-11-20 | 1 | -14/+92 |
| | |||||
* | cleanup list | Martin Czygan | 2020-11-20 | 1 | -1/+0 |
| | |||||
* | update notes | Martin Czygan | 2020-11-20 | 1 | -15/+26 |
| | |||||
* | update notes | Martin Czygan | 2020-11-20 | 1 | -0/+111 |
| | |||||
* | verify: ignore certain types of release types for now | Martin Czygan | 2020-11-19 | 1 | -2/+4 |
| | |||||
* | update notes | Martin Czygan | 2020-11-19 | 1 | -1/+5 |
| | |||||
* | update stats | Martin Czygan | 2020-11-19 | 2 | -25/+30 |
| | |||||
* | verify: ignore ids like solv-int/9606010v1 for now | Martin Czygan | 2020-11-19 | 1 | -4/+8 |
| | |||||
* | verify: allow a larger gap | Martin Czygan | 2020-11-19 | 1 | -1/+6 |
| | |||||
* | verify: account for article/article-journal | Martin Czygan | 2020-11-19 | 1 | -1/+4 |
| | |||||
* | update verification case list | Martin Czygan | 2020-11-19 | 2 | -9/+20 |
| | |||||
* | update notes | Martin Czygan | 2020-11-19 | 2 | -12/+29 |
| | |||||
* | update notes | Martin Czygan | 2020-11-19 | 1 | -34/+43 |
| | |||||
* | ignore sample files | Martin Czygan | 2020-11-19 | 1 | -0/+3 |
| | |||||
* | update README | Martin Czygan | 2020-11-18 | 1 | -0/+58 |
| | |||||
* | verify: fix a None | Martin Czygan | 2020-11-18 | 1 | -2/+2 |
| | |||||
* | cluster: log progress | Martin Czygan | 2020-11-17 | 1 | -1/+3 |
| | |||||
* | cleanup sql stuff for now | Martin Czygan | 2020-11-17 | 1 | -13/+0 |
| | |||||
* | move blacklist to the end | Martin Czygan | 2020-11-17 | 1 | -227/+666 |
| | |||||
* | cleanup blacklist | Martin Czygan | 2020-11-17 | 1 | -1524/+1531 |
| | |||||
* | update stats | Martin Czygan | 2020-11-17 | 1 | -245/+1561 |
| | |||||
* | fix subtitle check | Martin Czygan | 2020-11-17 | 1 | -2/+11 |
| | |||||
* | extend title blacklist | Martin Czygan | 2020-11-17 | 1 | -34/+1293 |
| | |||||
* | update stats | Martin Czygan | 2020-11-17 | 1 | -9/+9 |
| | |||||
* | update blacklist | Martin Czygan | 2020-11-17 | 1 | -8/+65 |
| | |||||
* | update blacklist | Martin Czygan | 2020-11-17 | 1 | -4/+16 |
| | |||||
* | update stats | Martin Czygan | 2020-11-17 | 1 | -5/+7 |
| | |||||
* | update blacklist | Martin Czygan | 2020-11-17 | 1 | -12/+15 |
| | |||||
* | update notes | Martin Czygan | 2020-11-17 | 1 | -14/+52 |
| | |||||
* | update docs and blacklist | Martin Czygan | 2020-11-17 | 1 | -0/+28 |
| | |||||
* | update blacklists | Martin Czygan | 2020-11-17 | 1 | -2/+22 |
| | |||||
* | be less fine grained with datasets | Martin Czygan | 2020-11-17 | 1 | -1/+11 |
| | |||||
* | handle newline in titles | Martin Czygan | 2020-11-17 | 1 | -14/+10 |
| | |||||
* | update blacklist | Martin Czygan | 2020-11-17 | 1 | -1/+1 |
| | |||||
* | update blacklist | Martin Czygan | 2020-11-16 | 1 | -8/+39 |
| | |||||
* | add more blacklists | Martin Czygan | 2020-11-16 | 1 | -15/+32 |
| | |||||
* | wip: author_slug | Martin Czygan | 2020-11-15 | 1 | -2/+26 |
| | |||||
* | update title blacklist | Martin Czygan | 2020-11-14 | 1 | -0/+1 |
| | |||||
* | wip: verification and tests | Martin Czygan | 2020-11-14 | 3 | -48/+236 |
| | |||||
* | update Pipfile | Martin Czygan | 2020-11-14 | 2 | -50/+69 |
| | |||||
* | fix tests | Martin Czygan | 2020-11-13 | 4 | -55/+4 |
| | |||||
* | wip: verification | Martin Czygan | 2020-11-13 | 3 | -17/+181 |
| | Output currently (1m sample): {"unique": 916075, "too_large": 575, "dummy": 10307, "contrib_miss": 27215, "short_title": 1379, "arxiv_v": 8943} | | | | |