<feed xmlns='http://www.w3.org/2005/Atom'>
<title>sandcrawler/python/tests, branch bnewbold-persist-grobid-errors</title>
<subtitle>[no description]</subtitle>
<id>https://git.bnewbold.net/sandcrawler/atom?h=bnewbold-persist-grobid-errors</id>
<link rel='self' href='https://git.bnewbold.net/sandcrawler/atom?h=bnewbold-persist-grobid-errors'/>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/'/>
<updated>2020-01-17T20:22:34+00:00</updated>
<entry>
<title>ingest: add URL blocklist feature</title>
<updated>2020-01-17T20:22:34+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2020-01-17T20:12:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=58f744e97c8f3f1a3472aa821f4518d7d139e850'/>
<id>urn:sha1:58f744e97c8f3f1a3472aa821f4518d7d139e850</id>
<content type='text'>
And, temporarily, block zenodo and figshare.
</content>
</entry>
<entry>
<title>clarify ingest result schema and semantics</title>
<updated>2020-01-15T21:54:02+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2020-01-15T21:54:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=d06fd45e3c86cb080ad7724f3fc7575750a9cd69'/>
<id>urn:sha1:d06fd45e3c86cb080ad7724f3fc7575750a9cd69</id>
<content type='text'>
</content>
</entry>
<entry>
<title>add postgrest checks to test mocks</title>
<updated>2020-01-15T01:15:22+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2020-01-15T01:15:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=6a56b922ced9013c6d09027d771dc8c4fc80421e'/>
<id>urn:sha1:6a56b922ced9013c6d09027d771dc8c4fc80421e</id>
<content type='text'>
</content>
</entry>
<entry>
<title>tests: don't use localhost as a responses mock host</title>
<updated>2020-01-15T01:15:01+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2020-01-15T01:15:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=af2ffc0cea2d03a86f3ecf6c8dd0bd106a19b851'/>
<id>urn:sha1:af2ffc0cea2d03a86f3ecf6c8dd0bd106a19b851</id>
<content type='text'>
</content>
</entry>
<entry>
<title>SPNv2 doesn't support FTP; add a live test for non-revisit FTP</title>
<updated>2020-01-15T00:06:19+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2020-01-15T00:06:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=4bb341270907f91b0475a7cdb00a7d280a80c06c'/>
<id>urn:sha1:4bb341270907f91b0475a7cdb00a7d280a80c06c</id>
<content type='text'>
</content>
</entry>
<entry>
<title>more ftp status 226 support</title>
<updated>2020-01-15T00:05:41+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2020-01-15T00:05:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=ba6f16a02cfde0e4acb499c00b456b42472c0b00'/>
<id>urn:sha1:ba6f16a02cfde0e4acb499c00b456b42472c0b00</id>
<content type='text'>
</content>
</entry>
<entry>
<title>add live tests for ftp, revisits</title>
<updated>2020-01-14T23:53:00+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2020-01-14T23:53:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=21599839802b8ef3a84ffe90855f7bceaaa12a0d'/>
<id>urn:sha1:21599839802b8ef3a84ffe90855f7bceaaa12a0d</id>
<content type='text'>
</content>
</entry>
<entry>
<title>more live tests (for regressions)</title>
<updated>2020-01-11T00:04:13+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2020-01-11T00:04:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=89abcd4da267665d363e558ab54ec3272d67c6e4'/>
<id>urn:sha1:89abcd4da267665d363e558ab54ec3272d67c6e4</id>
<content type='text'>
</content>
</entry>
<entry>
<title>refactor ingest to a loop, allowing multiple hops</title>
<updated>2020-01-10T01:31:08+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2020-01-10T01:31:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=24185837a47f305757a5c783b95ca25b709f66e3'/>
<id>urn:sha1:24185837a47f305757a5c783b95ca25b709f66e3</id>
<content type='text'>
</content>
</entry>
<entry>
<title>add (skipped) live tests for wayback services</title>
<updated>2020-01-10T00:51:59+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2020-01-10T00:51:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=00cf33a1c230c8ce5dcda41aba5dcc6a88264d46'/>
<id>urn:sha1:00cf33a1c230c8ce5dcda41aba5dcc6a88264d46</id>
<content type='text'>
</content>
</entry>
</feed>
