path: root/proposals/2019_ingest.md
author Bryan Newbold <bnewbold@archive.org> 2020-02-18 16:42:36 -0800
committer Bryan Newbold <bnewbold@archive.org> 2020-02-18 16:42:36 -0800
commit 832a9e42bc068c1b1656526b4a2cb7108c9b8334
tree 6fb1e0a0d7403659667265175a7e9dbbf6a30ac2 /proposals/2019_ingest.md
parent f613f69a40fcc9a445f21cadd35d7c36c8061db8
include rel and oa_status in ingest request 'extra'
Diffstat (limited to 'proposals/2019_ingest.md')
-rw-r--r-- proposals/2019_ingest.md 4
1 files changed, 4 insertions, 0 deletions
diff --git a/proposals/2019_ingest.md b/proposals/2019_ingest.md
index 0b569b0..7c73ee3 100644
--- a/proposals/2019_ingest.md
+++ b/proposals/2019_ingest.md
@@ -97,6 +97,8 @@ HTML? Or both? Let's just recrawl.
user who submitted request. eg, `fatcat-changelog`, `editor_<ident>`,
`savepapernow-web`
- `release_stage`: optional. indicates the release stage of fulltext expected to be found at this URL
+ - `rel`: optional. indicates the link type
+ - `oa_status`: optional. unpaywall schema
- `fatcat`
- `release_ident`: optional. if provided, indicates that ingest is expected
to be fulltext copy of this release (though may be a sibling release
@@ -186,6 +188,8 @@ Proposing two tables:
-- ext_ids (source/source_id sometimes enough)
-- release_ident (if ext_ids and source/source_id not specific enough; eg SPN)
-- edit_extra
+ -- rel
+ -- oa_status
-- ingest_request_source TEXT NOT NULL CHECK (octet_length(ingest_request_source) >= 1),
PRIMARY KEY (ingest_type, base_url, link_source, link_source_id)
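
To illustrate the two new optional fields, here is a minimal sketch of how a request carrying `rel` and `oa_status` might be assembled. The helper name `build_ingest_request`, the example URL, and the `rel` value `"publisher"` are assumptions for illustration, not part of the proposal. The fields are placed alongside `release_stage`, as in the field list above; the commit message refers to the request 'extra', so the exact nesting is also an assumption.

```python
import json


def build_ingest_request(base_url, ingest_type="pdf", rel=None, oa_status=None):
    """Hypothetical helper: assemble an ingest request as a plain dict."""
    request = {"ingest_type": ingest_type, "base_url": base_url}
    # Both fields are optional, so they are only included when supplied.
    if rel is not None:
        request["rel"] = rel  # link type; "publisher" below is an assumed value
    if oa_status is not None:
        request["oa_status"] = oa_status  # unpaywall schema, e.g. "gold"
    return request


example = build_ingest_request(
    "https://example.com/paper.pdf",  # placeholder URL
    rel="publisher",
    oa_status="gold",
)
print(json.dumps(example, sort_keys=True))
```

Leaving the keys out when no value is supplied keeps requests from older sources unchanged, which fits the fields being marked optional.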