author | Bryan Newbold <bnewbold@archive.org> | 2020-03-02 16:37:08 -0800 |
---|---|---|
committer | Bryan Newbold <bnewbold@archive.org> | 2020-03-02 16:37:08 -0800 |
commit | b45e1ac6638edb9d634269a343d05eff90daa31e (patch) | |
tree | 0c9e6bcedec7c782e2bbd54347a4c614077fd22f /proposals | |
parent | 6d41261ac417c61a61d0c794fa07639f454bcd52 (diff) | |
ingest: add force_recrawl flag to skip historical wayback lookup
Diffstat (limited to 'proposals')
-rw-r--r-- | proposals/2019_ingest.md | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/proposals/2019_ingest.md b/proposals/2019_ingest.md
index 196dbea..c649809 100644
--- a/proposals/2019_ingest.md
+++ b/proposals/2019_ingest.md
@@ -98,6 +98,7 @@ HTML? Or both? Let's just recrawl.
 `savepapernow-web`
 - `release_stage`: optional. indicates the release stage of fulltext expected to be found at this URL
 - `rel`: optional. indicates the link type
+ - `force_recrawl`: optional. if true, will always SPNv2 (won't check wayback)
 - `oa_status`: optional. unpaywall schema
 - `edit_extra`: additional metadata to be included in any eventual fatcat commits.
 - `fatcat`
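
The behavior the new `force_recrawl` field describes can be sketched roughly as follows. This is an illustration only, not sandcrawler's actual code: the function name `resolve_capture`, the two injected callables, and the `base_url` request key are all assumptions made for the example. The only thing taken from the proposal text is that a true `force_recrawl` skips the wayback check and goes straight to SPNv2.

```python
from typing import Any, Callable, Dict, Optional

Capture = Dict[str, Any]


def resolve_capture(
    request: Dict[str, Any],
    lookup_wayback: Callable[[str], Optional[Capture]],
    save_page_now: Callable[[str], Capture],
) -> Capture:
    """Pick a capture for an ingest request (hypothetical helper).

    `lookup_wayback` and `save_page_now` are stand-ins for whatever
    clients the real worker uses; they are passed in so the sketch
    stays self-contained.
    """
    # `base_url` is an assumed field name for the URL to ingest.
    url = request["base_url"]

    # The flag is optional, so a missing value is treated as false.
    force_recrawl = bool(request.get("force_recrawl", False))

    if not force_recrawl:
        # Default path: reuse a historical capture when one exists.
        existing = lookup_wayback(url)
        if existing is not None:
            return existing

    # Either the flag was set or no historical capture was found:
    # request a fresh crawl (SPNv2 in the proposal's terms).
    return save_page_now(url)
```

Passing the two clients in as arguments keeps the branch on `force_recrawl` easy to exercise with stubs; a real worker would presumably hold them as attributes instead.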