-rw-r--r-- notes/basic_ui.txt 19
-rw-r--r-- notes/initial_sources.txt 28
-rw-r--r-- notes/libs.txt 7
-rw-r--r-- rfc.md 18
4 files changed, 58 insertions, 14 deletions
diff --git a/notes/basic_ui.txt b/notes/basic_ui.txt
index 38f194d3..12ad1ecc 100644
--- a/notes/basic_ui.txt
+++ b/notes/basic_ui.txt
@@ -27,9 +27,26 @@ Issue / Volume / Pages / Chapter
Known file / copy / url
Citations (outbound)
+# Queries / Searches / Views
-# Other Workflows
+Views: work, release, creator, container, publisher
+Lookup by identifier
+
+# Other Workflows/Editors
+
+Single-creator-oriented helper to find works and disambiguate authorship
+
+Bulk author disambiguation helper (find other unresolved authors with same
+alias text and select; drag works between columns)
+
+Bulk query-then-edit UI: search results in a table, edit like a spreadsheet, up
+to... dozens? Query and then apply delta (eg, set topic)? Eg, author edits
+basic metadata for all their citations all at once.
+
+Release editor
+
+Merge containers (and all related releases)
Merge entities (works, releases, etc)
Move release between works
Split entities (works, authors, etc), including linked stuff
diff --git a/notes/initial_sources.txt b/notes/initial_sources.txt
index d2e18ff6..a68fb982 100644
--- a/notes/initial_sources.txt
+++ b/notes/initial_sources.txt
@@ -1,11 +1,19 @@
-dblp
-crossref
-arxiv
-opencitations
-CORE
-oaDOI
-medline
-openlibrary
-archive.org paper/url manifest
-semantic scholar
+Probably start with:
+
+ crossref (including citations)
+ arxiv
+ medline
+
+then merge in:
+
+ dblp
+ CORE
+ oaDOI
+ archive.org paper/url manifest
+ semantic scholar
+
+and later:
+
+ opencitations
+ openlibrary
diff --git a/notes/libs.txt b/notes/libs.txt
index 8a789641..ede10bdf 100644
--- a/notes/libs.txt
+++ b/notes/libs.txt
@@ -6,3 +6,10 @@ Python
Golang
sqlx (or xorm, or something to match rows to structs)
go-swagger
+
+Rust
+ diesel
+
+UI
+ tachyons
+ choo?
diff --git a/rfc.md b/rfc.md
index fd9397ad..1b63a31a 100644
--- a/rfc.md
+++ b/rfc.md
@@ -254,12 +254,15 @@ are:
name
open-access policy
peer-review policy
+ <has> aliases, acronyms
+ <about> subject/category
<has> identifier
<published in> container
<published-by> publisher
publisher
name
+ <has> aliases, acronyms
<has> identifier
## Controlled Vocabularies
@@ -271,8 +274,10 @@ have more controlled editing workflow... perhaps versioned in the codebase:
- identifier namespaces (DOI, ISBN, ISSN, ORCID, etc)
- subject categorization
- license and open access status
-- work types
+- work "types" (article vs. book chapter vs. proceeding, etc)
- contributor types (author, translator, illustrator, etc)
+- human languages
+- file mimetypes
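
A rough sketch of what "versioned in the codebase" could look like for two of the vocabularies above, assuming Python enums. The class names, the member values, and the `parse_term` helper are illustrative assumptions, not part of the RFC:

```python
from enum import Enum


class ContributorType(Enum):
    # Terms taken from the list above; the string values are assumed.
    AUTHOR = "author"
    TRANSLATOR = "translator"
    ILLUSTRATOR = "illustrator"


class WorkType(Enum):
    ARTICLE = "article"
    BOOK_CHAPTER = "book-chapter"
    PROCEEDING = "proceeding"


def parse_term(vocab, raw):
    """Return the member of `vocab` matching `raw`, or raise ValueError."""
    return vocab(raw.strip().lower())
```

Because the terms live in code, adding or renaming one goes through normal code review rather than through the open editing workflow.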
## Unresolved Questions
@@ -309,12 +314,19 @@ I see a tension between focus and scope creep. If a central database like
fatcat doesn't support enough fields and metadata, then it will not be possible
to completely import other corpuses, and this becomes "yet another" partial
bibliographic database. On the other hand, accepting arbitrary data leads to
-other problems:
+other problems: sparseness increases (we have more "partial" data), potential
+for redundancy is high, humans will start editing content that might be
+bulk-replaced, etc.
+
+There might be a need to support "stub" references between entities. Eg, when
+adding citations from PDF extraction, the cited works are likely to be
+ambiguous. Could create "stub" works to be merged/resolved later, or could
+leave the citation hanging. Same with authors, containers (journals), etc.
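
One way the "stub" option might work, sketched in Python under assumptions: the `Work` and `Catalog` names, the `is_stub` flag, and the redirect-on-merge behavior are all hypothetical and not specified by the RFC.

```python
from dataclasses import dataclass, field
from typing import Optional
import uuid


@dataclass
class Work:
    title: str
    is_stub: bool = False
    # Set when this record has been merged into another work.
    redirect_to: Optional[str] = None
    ident: str = field(default_factory=lambda: uuid.uuid4().hex)


class Catalog:
    def __init__(self):
        self.works = {}

    def add(self, work):
        self.works[work.ident] = work
        return work

    def cite_unresolved(self, raw_citation):
        """Create a stub work for a citation that could not be matched."""
        return self.add(Work(title=raw_citation, is_stub=True))

    def merge(self, stub_id, target_id):
        """Resolve a stub by pointing it at the real work."""
        stub = self.works[stub_id]
        if not stub.is_stub:
            raise ValueError("only stubs can be merged in this sketch")
        stub.redirect_to = target_id

    def resolve(self, ident):
        """Follow redirects until reaching a work that is not redirected."""
        work = self.works[ident]
        while work.redirect_to is not None:
            work = self.works[work.redirect_to]
        return work
```

Citations keep pointing at the stub's identifier, so a later merge does not require rewriting them. The alternative of leaving the citation hanging avoids creating stubs at all but defers the matching work to read time.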
## References and Previous Work
The closest overall analog of fatcat is [MusicBrainz][mb], a collaboratively
-edited music database. [Open Library][] is a very similar existing service,
+edited music database. [Open Library][ol] is a very similar existing service,
which exclusively contains book metadata.
[Wikidata][wd] seems to be the most successful and actively edited/developed