-rw-r--r--  notes/basic_ui.txt         19
-rw-r--r--  notes/initial_sources.txt  28
-rw-r--r--  notes/libs.txt              7
-rw-r--r--  rfc.md                     18
4 files changed, 58 insertions, 14 deletions
diff --git a/notes/basic_ui.txt b/notes/basic_ui.txt
index 38f194d3..12ad1ecc 100644
--- a/notes/basic_ui.txt
+++ b/notes/basic_ui.txt
@@ -27,9 +27,26 @@ Issue / Volume / Pages / Chapter
 Known file / copy / url
 Citations (outbound)
 
+# Queries / Searches / Views
 
-# Other Workflows
 
+Views: work, release, creator, container, publisher
+Lookup by identifier
+
+# Other Workflows/Editors
+
+Single-creator-oriented helper to find works and disambiguate authorship
+
+Bulk author disambiguation helper (find other unresolved authors with same
+alias text and select; drag works between columns)
+
+Bulk query-then-edit UI: search results in a table, edit like a spreadsheet, up
+to... dozens? Query and then apply delta (eg, set topic)? Eg, author edits
+basic metadata for all their citations all at once.
+
+Release editor
+
+Merge containers (and all related releases)
 Merge entities (works, releases, etc)
 Move release between works
 Split entities (works, authors, etc), including linked stuff
diff --git a/notes/initial_sources.txt b/notes/initial_sources.txt
index d2e18ff6..a68fb982 100644
--- a/notes/initial_sources.txt
+++ b/notes/initial_sources.txt
@@ -1,11 +1,19 @@
-dblp
-crossref
-arxiv
-opencitations
-CORE
-oaDOI
-medline
-openlibrary
-archive.org paper/url manifest
-semantic scholar
+Probably start with:
+
+    crossref (including citations)
+    arxiv
+    medline
+
+then merge in:
+
+    dblp
+    CORE
+    oaDOI
+    archive.org paper/url manifest
+    semantic scholar
+
+and later:
+
+    opencitations
+    openlibrary
 
diff --git a/notes/libs.txt b/notes/libs.txt
index 8a789641..ede10bdf 100644
--- a/notes/libs.txt
+++ b/notes/libs.txt
@@ -6,3 +6,10 @@ Python
 Golang
     sqlx (or xorm, or something to match rows to structs)
     go-swagger
+
+Rust
+    diesel
+
+UI
+    tachyons
+    choo?
diff --git a/rfc.md b/rfc.md
--- a/rfc.md
+++ b/rfc.md
@@ -254,12 +254,15 @@ are:
         name
         open-access policy
         peer-review policy
+            <has> aliases, acronyms
+            <about> subject/category
             <has> identifier
             <published in> container
             <published-by> publisher
 
     publisher
         name
+            <has> aliases, acronyms
             <has> identifier
 
 ## Controlled Vocabularies
@@ -271,8 +274,10 @@ have more controlled editing workflow... perhaps versioned in the codebase:
 - identifier namespaces (DOI, ISBN, ISSN, ORCID, etc)
 - subject categorization
 - license and open access status
-- work types
+- work "types" (article vs. book chapter vs. proceeding, etc)
 - contributor types (author, translator, illustrator, etc)
+- human languages
+- file mimetypes
 
 ## Unresolved Questions
 
@@ -309,12 +314,19 @@ I see a tension between focus and scope creep. If a central database like
 fatcat doesn't support enough fields and metadata, then it will not be possible
 to completely import other corpuses, and this becomes "yet another" partial
 bibliographic database. On the other hand, accepting arbitrary data leads to
-other problems:
+other problems: sparseness increases (we have more "partial" data), potential
+for redundancy is high, humans will start editing content that might be
+bulk-replaced, etc.
+
+There might be a need to support "stub" references between entities. Eg, when
+adding citations from PDF extraction, the cited works are likely to be
+ambiguous. Could create "stub" works to be merged/resolved later, or could
+leave the citation hanging. Same with authors, containers (journals), etc.
 
 ## References and Previous Work
 
 The closest overall analog of fatcat is [MusicBrainz][mb], a collaboratively
-edited music database. [Open Library][] is a very similar existing service,
+edited music database. [Open Library][ol] is a very similar existing service,
 which exclusively contains book metadata.
 
 [Wikidata][wd] seems to be the most successful and actively edited/developed