<feed xmlns='http://www.w3.org/2005/Atom'>
<title>sandcrawler, branch bnewbold-refactor-loggging</title>
<subtitle>[no description]</subtitle>
<id>https://git.bnewbold.net/sandcrawler/atom?h=bnewbold-refactor-loggging</id>
<link rel='self' href='https://git.bnewbold.net/sandcrawler/atom?h=bnewbold-refactor-loggging'/>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/'/>
<updated>2022-07-12T22:03:29Z</updated>
<entry>
<title>WIP: refactor logging calls in ingest pipelines</title>
<updated>2022-07-12T22:03:29Z</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2022-07-12T22:03:29Z</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=c15432c0ce52c48efabcd7e3221a5d625ef3e9d0'/>
<id>urn:sha1:c15432c0ce52c48efabcd7e3221a5d625ef3e9d0</id>
<content type='text'>
</content>
</entry>
<entry>
<title>ingest: targeted 2022-04 notes</title>
<updated>2022-07-07T20:19:40Z</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2022-07-07T20:19:40Z</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=8f85ab294eae50e31efa9e31bb0bca1bca76cf8b'/>
<id>urn:sha1:8f85ab294eae50e31efa9e31bb0bca1bca76cf8b</id>
<content type='text'>
</content>
</entry>
<entry>
<title>stats: May 2022 ingest-by-domain stats</title>
<updated>2022-07-07T20:19:12Z</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2022-07-07T20:19:12Z</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=bf1826f8e8d203f732cbdda008e0c5944cbdae60'/>
<id>urn:sha1:bf1826f8e8d203f732cbdda008e0c5944cbdae60</id>
<content type='text'>
</content>
</entry>
<entry>
<title>ingest: IEEE domain is blocking us</title>
<updated>2022-07-07T20:17:49Z</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2022-07-07T20:17:49Z</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=695a80a64f02f4c23bb938ecfffeef146344841f'/>
<id>urn:sha1:695a80a64f02f4c23bb938ecfffeef146344841f</id>
<content type='text'>
</content>
</entry>
<entry>
<title>ingest: catch more ConnectionErrors (SPN, replay fetch, GROBID)</title>
<updated>2022-05-16T22:02:02Z</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2022-05-16T22:02:02Z</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=fcc5a1648d2e49e7002ca569ed668d3318a75584'/>
<id>urn:sha1:fcc5a1648d2e49e7002ca569ed668d3318a75584</id>
<content type='text'>
</content>
</entry>
<entry>
<title>ingest: skip arxiv.org DOIs, we already direct-ingest</title>
<updated>2022-05-11T19:19:48Z</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2022-05-11T19:19:48Z</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=1534ff4d05c6fca460e82b5707fe3fbdc3504e50'/>
<id>urn:sha1:1534ff4d05c6fca460e82b5707fe3fbdc3504e50</id>
<content type='text'>
</content>
</entry>
<entry>
<title>python make fmt</title>
<updated>2022-05-05T18:21:35Z</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2022-05-05T18:21:35Z</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=a0214959c10a5ecb794d78b189a767ac01c0af48'/>
<id>urn:sha1:a0214959c10a5ecb794d78b189a767ac01c0af48</id>
<content type='text'>
</content>
</entry>
<entry>
<title>ingest spn2: fix tests</title>
<updated>2022-05-05T18:21:29Z</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2022-05-05T18:21:29Z</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=21ad5cd9942044939c8203dd076ea080b6d55a61'/>
<id>urn:sha1:21ad5cd9942044939c8203dd076ea080b6d55a61</id>
<content type='text'>
</content>
</entry>
<entry>
<title>ingest: more loginwall patterns</title>
<updated>2022-05-05T18:08:52Z</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2022-05-05T18:08:52Z</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=1f9ca570bd168154a72adcd2454b992dbc7e8d0a'/>
<id>urn:sha1:1f9ca570bd168154a72adcd2454b992dbc7e8d0a</id>
<content type='text'>
</content>
</entry>
<entry>
<title>ingest_tool: fix arg parsing</title>
<updated>2022-05-04T00:35:52Z</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2022-05-04T00:35:52Z</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=1ec661af75f37b3ae5031851f6c452039e08503c'/>
<id>urn:sha1:1ec661af75f37b3ae5031851f6c452039e08503c</id>
<content type='text'>
</content>
</entry>
</feed>
