author: Bryan Newbold <bnewbold@robocracy.org> 2023-01-04 19:55:30 -0800
committer: Bryan Newbold <bnewbold@robocracy.org> 2023-01-04 20:18:25 -0800
commit: 276ac2aa24166660bc6ffe7601cee44b5d848dae
tree: 8a35ce06e7ab9e6755b24abc41dee1115cf62788 /proposals/2020_ir_importer.spn
parent: ee46c33544941a5104182a2e221e841a32cbbf78
proposals: update status; add some old ones; consistent file names
Diffstat (limited to 'proposals/2020_ir_importer.spn')
-rw-r--r-- proposals/2020_ir_importer.spn | 25
1 files changed, 25 insertions, 0 deletions
diff --git a/proposals/2020_ir_importer.spn b/proposals/2020_ir_importer.spn
new file mode 100644
index 00000000..ad561d7b
--- /dev/null
+++ b/proposals/2020_ir_importer.spn
@@ -0,0 +1,25 @@
+
+status: brainstorm
+
+Institutional Repository Importer
+=================================
+
+Want to import content from IRs. Same general workflow for CORE, SHARE, BASE,
+other aggregators.
+
+Filter input to only works with known/ingested fulltext.
+
+Look up file by hash. If found, skip for now. In future, might do
+mapping/matching.
+
+Look up by primary id (e.g., CORE ident). If it exists, can skip if it has a
+file, or add file/location directly.
+
+Two indirect lookups: by external ident (DOI, PMID), or fuzzy search match. If
+we get either of these, want to do release/work grouping correctly.
+
+1. If we are certain of the IR copy stage, compare with the existing release,
+   and/or look up the entire work for releases with the same stage. Update the
+   release or add a new release under the same work.
+2. If not sure of the IR copy stage, guess the stage from SHERPA/RoMEO color
+   and proceed to insert/update.
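
The lookup order described in the proposal can be sketched roughly as follows. This is only an illustration under stated assumptions, not the fatcat importer: `Release`, `Catalog`, `import_record`, the record keys, and the returned action strings are all hypothetical, the catalog is a plain in-memory list, fuzzy matching is left out, and the color-to-stage table holds placeholder values, since the proposal does not define that mapping.

```python
from dataclasses import dataclass, field
from typing import Optional

# Placeholder mapping from SHERPA/RoMEO color to a guessed release stage.
# ASSUMPTION: these values are illustrative only; the proposal does not
# specify how a color translates to a stage.
ROMEO_COLOR_TO_STAGE = {
    "green": "accepted",
    "blue": "accepted",
    "yellow": "submitted",
}


@dataclass
class Release:
    """Hypothetical stand-in for a catalog release entity."""
    ident: str
    work_id: str
    stage: Optional[str] = None
    ext_ids: dict = field(default_factory=dict)
    file_hashes: list = field(default_factory=list)


@dataclass
class Catalog:
    """Hypothetical in-memory catalog; a real importer would call an API."""
    releases: list = field(default_factory=list)

    def by_file_hash(self, sha1):
        return next((r for r in self.releases if sha1 in r.file_hashes), None)

    def by_ext_id(self, key, value):
        if value is None:
            return None
        return next((r for r in self.releases if r.ext_ids.get(key) == value), None)


def import_record(catalog, record):
    """Apply the proposal's lookup order to one aggregator record (a dict)."""
    sha1 = record.get("sha1")
    # Filter: only records with known/ingested fulltext are considered.
    if not sha1:
        return "skip-no-fulltext"
    # Lookup by file hash: if found, skip for now.
    if catalog.by_file_hash(sha1):
        return "skip-file-exists"
    # Lookup by primary id (e.g. CORE ident): skip if it has a file, else add.
    primary = catalog.by_ext_id("core", record.get("core_id"))
    if primary is not None:
        if primary.file_hashes:
            return "skip-has-file"
        primary.file_hashes.append(sha1)
        return "added-file"
    # Indirect lookup by external ident; fuzzy search is omitted in this sketch.
    match = None
    for key in ("doi", "pmid"):
        match = catalog.by_ext_id(key, record.get(key))
        if match is not None:
            break
    if match is None:
        return "no-match"  # the proposal leaves this case open
    # Case 1: stage is known. Case 2: guess it from the SHERPA/RoMEO color.
    stage = record.get("stage") or ROMEO_COLOR_TO_STAGE.get(record.get("romeo_color"))
    if stage is None:
        return "unknown-stage"
    # Look through the whole work for a release with the same stage.
    same_stage = next(
        (r for r in catalog.releases
         if r.work_id == match.work_id and r.stage == stage),
        None,
    )
    if same_stage is not None:
        same_stage.file_hashes.append(sha1)
        return "updated-release"
    # No release at this stage yet: add a new one under the same work.
    catalog.releases.append(
        Release(f"new-{len(catalog.releases)}", match.work_id, stage,
                file_hashes=[sha1])
    )
    return "added-release"
```

Returning an action label instead of writing directly keeps the decision logic separate from any catalog writes, which makes it easy to dry-run over an aggregator dump before inserting anything.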