path: root/proposals/2019_ingest.md
author: Bryan Newbold <bnewbold@archive.org>, 2020-03-02 16:37:08 -0800
committer: Bryan Newbold <bnewbold@archive.org>, 2020-03-02 16:37:08 -0800
commit: b45e1ac6638edb9d634269a343d05eff90daa31e
tree: 0c9e6bcedec7c782e2bbd54347a4c614077fd22f /proposals/2019_ingest.md
parent: 6d41261ac417c61a61d0c794fa07639f454bcd52
ingest: add force_recrawl flag to skip historical wayback lookup
Diffstat (limited to 'proposals/2019_ingest.md')
-rw-r--r-- proposals/2019_ingest.md | 1
1 file changed, 1 insertion, 0 deletions
diff --git a/proposals/2019_ingest.md b/proposals/2019_ingest.md
index 196dbea..c649809 100644
--- a/proposals/2019_ingest.md
+++ b/proposals/2019_ingest.md
@@ -98,6 +98,7 @@ HTML? Or both? Let's just recrawl.
`savepapernow-web`
- `release_stage`: optional. indicates the release stage of fulltext expected to be found at this URL
- `rel`: optional. indicates the link type
+ - `force_recrawl`: optional. if true, will always SPNv2 (won't check wayback)
- `oa_status`: optional. unpaywall schema
- `edit_extra`: additional metadata to be included in any eventual fatcat commits.
- `fatcat`
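
The behavior described for `force_recrawl` can be sketched roughly as follows. This is a minimal illustration and not the sandcrawler implementation: the function name `find_resource`, the `base_url` request field, and the two callables `lookup_wayback` and `save_page_now` are hypothetical stand-ins for the historical wayback lookup and the SPNv2 capture step.

```python
from typing import Any, Callable, Dict, Optional


def find_resource(
    request: Dict[str, Any],
    lookup_wayback: Callable[[str], Optional[Dict[str, Any]]],
    save_page_now: Callable[[str], Dict[str, Any]],
) -> Dict[str, Any]:
    """Resolve an ingest request to a captured resource.

    All names here are assumptions made for illustration:
    - request["base_url"] is assumed to hold the URL to ingest
    - lookup_wayback(url) returns an existing capture, or None
    - save_page_now(url) triggers a fresh SPNv2 capture
    """
    url = request["base_url"]

    # `force_recrawl` is optional, so a missing key means "not forced".
    if not request.get("force_recrawl", False):
        existing = lookup_wayback(url)
        if existing is not None:
            # A historical capture exists; reuse it instead of recrawling.
            return existing

    # Either the flag was set or no historical capture was found.
    return save_page_now(url)
```

With the flag set, the wayback lookup callable is never invoked, which matches the "won't check wayback" wording in the added line.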