author: Bryan Newbold <bnewbold@robocracy.org> 2021-11-22 16:12:01 -0800
committer: Bryan Newbold <bnewbold@robocracy.org> 2021-11-22 16:12:01 -0800
commit: 5c7f50b2f497692493bfa54ad4741fdc573352ae
tree: c20cce1884076fffe210ba28e1a569f93ed22827 /guide/src/entity_webcapture.md
parent: f3bd82c0308948a63645538bdd9511a503625499
parent: dd00cec4164c1a1c31c8d9cffb92deb2e30b2211
Merge branch 'bnewbold-content-scope'
Diffstat (limited to 'guide/src/entity_webcapture.md')
-rw-r--r-- guide/src/entity_webcapture.md | 6
1 file changed, 6 insertions, 0 deletions
diff --git a/guide/src/entity_webcapture.md b/guide/src/entity_webcapture.md
index 8c5615fb..1b3cac55 100644
--- a/guide/src/entity_webcapture.md
+++ b/guide/src/entity_webcapture.md
@@ -29,4 +29,10 @@ Warning: This schema is not yet stable.
- `timestamp` (string, datetime): same format as CDX line timestamp (UTC, etc).
Corresponds to the overall capture timestamp. Can be the earliest of CDX
timestamps if that makes sense
+- `content_scope` (string): for situations where the webcapture does not simply
+ contain the full representation of a work (eg, HTML fulltext, for an
+ `article-journal` release), describes what the scope of coverage is. Eg,
+ `landing-page` if it doesn't contain the full content. Landing pages are
+ out-of-scope for fatcat, but any imported by accident should be marked as
+ such so they aren't re-imported. Uses the same vocabulary as the File entity.
- `release_ids` (array of string identifiers): references to `release` entities
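
As a rough illustration of the `content_scope` field added in this diff, here is a minimal Python sketch of how a client might use it. The dict layout and both helper functions are hypothetical and not part of fatcat. The sketch also assumes that a missing `content_scope` means the capture holds the full work, which the text implies but does not state outright. Only the field names and the `landing-page` value come from the schema text.

```python
from typing import Optional

# Hypothetical, trimmed-down webcapture record. Only the field names
# `content_scope` and `release_ids` come from the schema text; a real
# entity carries more fields.
landing_capture = {
    "content_scope": "landing-page",
    "release_ids": [],
}

# A capture with no content_scope set (assumed here to mean "full work").
full_capture = {
    "release_ids": [],
}


def is_landing_page(entity: dict) -> bool:
    """Return True if the capture is explicitly marked as a landing page."""
    return entity.get("content_scope") == "landing-page"


def is_full_representation(entity: dict) -> bool:
    """Assumption: an unset content_scope means the capture is the full work."""
    scope: Optional[str] = entity.get("content_scope")
    return scope is None


# An importer could skip captures already marked as landing pages, so
# that out-of-scope content is not imported a second time.
captures = [landing_capture, full_capture]
to_keep = [c for c in captures if not is_landing_page(c)]
```

Marking an accidental import rather than deleting it leaves a record for later import runs to check, which is the re-import protection the field description mentions.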