| | Commit message | Author | Age | Files | Lines |
---|---|---|---|---|---|
* | html extract: protocols.io, fix americanarchivist | Bryan Newbold | 2020-01-10 | 1 | -1/+7 |
* | more ingest HTML extraction hacks | Bryan Newbold | 2020-01-10 | 1 | -6/+46 |
* | many publisher-specific ingest improvements | Bryan Newbold | 2020-01-10 | 1 | -4/+96 |
* | fill in more html extraction techniques | Bryan Newbold | 2020-01-09 | 1 | -7/+6 |
* | refactor: use print(..., file=sys.stderr) | Bryan Newbold | 2019-12-18 | 1 | -1/+1 |
| | Should use logging soon, but this seems more idiomatic in the meanwhile. | | | | |
* | start of hrmars.com ingest support | Bryan Newbold | 2019-11-14 | 1 | -0/+2 |
* | citation_pdf_url with host-relative URLs | Bryan Newbold | 2019-11-13 | 1 | -1/+3 |
* | more progress on file ingest | Bryan Newbold | 2019-11-13 | 1 | -0/+19 |
* | much progress on file ingest path | Bryan Newbold | 2019-10-22 | 1 | -0/+73 |
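
The 2019-11-13 entry "citation_pdf_url with host-relative URLs" concerns `citation_pdf_url` meta tag values such as `/article/123/download.pdf` that omit the scheme and host. Below is a minimal Python sketch of one way to handle them, assuming the value is resolved against the URL of the page it was found on using the standard library's `urljoin`. The helper name `resolve_pdf_url` and the example URLs are hypothetical and are not taken from the commit itself.

```python
from urllib.parse import urljoin


def resolve_pdf_url(page_url: str, citation_pdf_url: str) -> str:
    """Resolve a citation_pdf_url value against the page it came from.

    Hypothetical helper for illustration; not the repository's actual code.
    """
    # urljoin leaves absolute URLs untouched and resolves host-relative
    # ("/path/file.pdf") or relative values against page_url.
    return urljoin(page_url, citation_pdf_url.strip())


if __name__ == "__main__":
    # Host-relative value: scheme and host are taken from the page URL.
    print(resolve_pdf_url("https://example.com/article/123",
                          "/article/123/download.pdf"))
    # Absolute value: returned unchanged.
    print(resolve_pdf_url("https://example.com/article/123",
                          "https://cdn.example.org/a.pdf"))
```

Using `urljoin` rather than string concatenation means absolute, host-relative, and path-relative values all go through one code path.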