<feed xmlns='http://www.w3.org/2005/Atom'>
<title>sandcrawler/python, branch trawler</title>
<subtitle>[no description]</subtitle>
<id>https://git.bnewbold.net/sandcrawler/atom?h=trawler</id>
<link rel='self' href='https://git.bnewbold.net/sandcrawler/atom?h=trawler'/>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/'/>
<updated>2021-12-08T03:44:53+00:00</updated>
<entry>
<title>grobid: set a maximum file size (256 MByte)</title>
<updated>2021-12-08T03:44:53+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2021-12-08T03:44:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=89b5f51e57d3a0cc043640262e396e28297e7c00'/>
<id>urn:sha1:89b5f51e57d3a0cc043640262e396e28297e7c00</id>
<content type='text'>
</content>
</entry>
<entry>
<title>worker: add kafka_group_suffix option</title>
<updated>2021-12-08T03:10:23+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2021-12-08T03:09:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=833f9bb5181419ca9f5af0f9ba0e2e047ee164d4'/>
<id>urn:sha1:833f9bb5181419ca9f5af0f9ba0e2e047ee164d4</id>
<content type='text'>
</content>
</entry>
<entry>
<title>ingest tool: allow configuration of GROBID endpoint</title>
<updated>2021-12-08T03:10:23+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2021-12-04T00:38:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=5c82ee1b965e1f3901294c752d8b2d24c6bdc974'/>
<id>urn:sha1:5c82ee1b965e1f3901294c752d8b2d24c6bdc974</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Revert "pipenv: update deps"</title>
<updated>2021-12-02T01:37:19+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2021-12-02T01:37:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=6777004f20f742134105c18d6bae06d0ce362d50'/>
<id>urn:sha1:6777004f20f742134105c18d6bae06d0ce362d50</id>
<content type='text'>
This reverts commit 7a5b203dbb37958a452eb1be3bd1bf8ed94cbbce.

There is a problem with `internetarchive` 2.2.0, so reverting for now.
</content>
</entry>
<entry>
<title>pipenv: update deps</title>
<updated>2021-12-02T01:14:20+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2021-12-02T01:14:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=7a5b203dbb37958a452eb1be3bd1bf8ed94cbbce'/>
<id>urn:sha1:7a5b203dbb37958a452eb1be3bd1bf8ed94cbbce</id>
<content type='text'>
</content>
</entry>
<entry>
<title>add CDX sha1hex lookup/fetch helper script</title>
<updated>2021-11-30T23:29:41+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2021-11-30T23:29:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=0328598e3b643edd0a2033ca97c607f596dfb092'/>
<id>urn:sha1:0328598e3b643edd0a2033ca97c607f596dfb092</id>
<content type='text'>
</content>
</entry>
<entry>
<title>codespell typos in python (comments)</title>
<updated>2021-11-25T00:05:24+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2021-11-25T00:05:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=dfd13be5a7ac87b8b6c186986624f97da02b8923'/>
<id>urn:sha1:dfd13be5a7ac87b8b6c186986624f97da02b8923</id>
<content type='text'>
</content>
</entry>
<entry>
<title>html_meta: actual typo in code (CSS selector) caught by codespell</title>
<updated>2021-11-25T00:04:43+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2021-11-25T00:04:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=a6cfb01063da8a5172d38d2da190a25e7d070993'/>
<id>urn:sha1:a6cfb01063da8a5172d38d2da190a25e7d070993</id>
<content type='text'>
</content>
</entry>
<entry>
<title>ingest tool: new backfill mode</title>
<updated>2021-11-17T00:16:14+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2021-11-17T00:16:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=b4ca684c83d77a9fc6e7844ea8c45dfcb72aacb4'/>
<id>urn:sha1:b4ca684c83d77a9fc6e7844ea8c45dfcb72aacb4</id>
<content type='text'>
</content>
</entry>
<entry>
<title>make fmt</title>
<updated>2021-11-17T00:10:08+00:00</updated>
<author>
<name>Bryan Newbold</name>
<email>bnewbold@archive.org</email>
</author>
<published>2021-11-17T00:10:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.bnewbold.net/sandcrawler/commit/?id=9d8da0ec55a3d901bf7ffad6d86fd8cc08f89b6f'/>
<id>urn:sha1:9d8da0ec55a3d901bf7ffad6d86fd8cc08f89b6f</id>
<content type='text'>
</content>
</entry>
</feed>
