guide: add 'for publishers' and 'for authors' sections

author: Bryan Newbold <bnewbold@robocracy.org> 2022-11-16 20:26:52 -0800
committer: Bryan Newbold <bnewbold@robocracy.org> 2022-11-16 20:26:52 -0800
commit: 79db40766a9ef467f38a1ac7cc344a938f25a599 (patch)
tree: b363ddbb4fc604316ca77a70a80811954180a3d0
parent: 41c7e9ac6c67da8fbf66ed5b026d23657ed2ffd9 (diff)
download: fatcat-79db40766a9ef467f38a1ac7cc344a938f25a599.tar.gz
fatcat-79db40766a9ef467f38a1ac7cc344a938f25a599.zip
3 files changed, 190 insertions, 0 deletions
diff --git a/guide/src/SUMMARY.md b/guide/src/SUMMARY.md
index 7f41d25b..37c20b1f 100644
--- a/guide/src/SUMMARY.md
+++ b/guide/src/SUMMARY.md
@@ -29,6 +29,10 @@
     - [Code of Conduct](./code_of_conduct.md)
     - [Privacy](./privacy_policy.md)
 
+[For Publishers](./publishers.md)
+
+[For Authors](./authors.md)
+
 [Further Reading](./bibliography.md)
 
 [About This Guide](./guide.md)
diff --git a/guide/src/authors.md b/guide/src/authors.md
new file mode 100644
index 00000000..3072db9a
--- /dev/null
+++ b/guide/src/authors.md
@@ -0,0 +1,86 @@
+For Authors
+===============
+
+This page addresses common questions and concerns from individual authors of
+works indexed in Fatcat, as well as the Internet Archive Scholar service built
+on top of it.
+
+For help in exceptional cases, contact Internet Archive through our usual
+support channels.
+
+
+## Updating Works
+
+A frequent request from authors is to remove outdated versions of works.
+
+The philosophy of the catalog is to go beyond "the version of record" and
+instead collect "the record of versions". This means that drafts, manuscripts,
+working papers, and other alternative versions of works can be fully included
+and differentiated using metadata in the catalog. Even in the case of
+retractions, expressions of concern, or other serious issues with earlier
+versions, it is valuable to keep out-of-date versions in the catalog. Corrected
+or updated versions will generally be preferred and linked to publicly, for
+example on scholar.archive.org. Outright removing content reduces context and
+can result in additional confusion for readers and librarians.
+
+Because of this, it is strongly preferred to add new updated content instead of
+requesting the removal of old out-of-date content. Depending on the situation,
+this could involve creating a new post-publication `release` entity with the
+date of update, with status `updated` or `retracted`; or a new pre-publication
+`release`; or crawling an updated PDF and adding to an existing `release`
+entity.
+
+
+## Correcting Metadata
+
+Sometimes the bibliographic metadata in fatcat is incorrect, incomplete, or out
+of date. This is a particularly sensitive subject when it comes to representing
+information about individuals. While we aspire to automate metadata updates
+and improvements as much as possible, often a human touch is best.
+
+Any person can contribute to the catalog directly by creating an account and
+submitting changes for review. This includes, but is not limited to, authors or
+a person acting on their behalf submitting corrections. The [editing
+quickstart](./quickstart.md) is a good place to start. Please remember that
+corrections are considered part of the public record of the catalog and will be
+preserved even if a contributor later deletes their account. Editor *usernames*
+can be changed at any time.
+
+Fatcat is in some sense a non-authoritative catalog, which means that it is
+usually best to make corrections in "upstream" sources before (or at the same
+time as) making them in fatcat. For example, update metadata in publisher
+databases, repositories, or ORCiD in addition to fatcat.
+
+
+### Name Changes
+
+The preferred workflow for author name changes depends on the author's
+sensitivity to having prior names accessible and searchable.
+
+If "also known as" behavior is desirable, contributor names on the release
+record should remain unchanged (matching what the publication at the time
+indicated), and a linked `creator` entity should include the
+currently-preferred name for display.
+
+If "also known as" is not acceptable, and the work has already been updated in
+authoritative publication catalogs, then the contributor name can be updated on
+`release` records as well.
+
+See also the [`creator` style guide](./entity_creator.md).
+
+
+### Author Relation Completeness
+
+`creator` records are not always generated when importing `release` records;
+the current practice is to create and/or link them if there is ORCiD metadata
+linking specific authors to a published work.
+
+This means that author/work metadata is often very incomplete or non-existent.
+At this time we would recommend using other services like dblp.org or
+openalex.org for more complete (but possibly less accurate) author/work metadata.
+
+
+## Resolving Publication Disputes
+
+Authorship and publication ethics disputes should generally be resolved with
+the original publisher first, then updated in fatcat.
diff --git a/guide/src/publishers.md b/guide/src/publishers.md
new file mode 100644
index 00000000..1d567afa
--- /dev/null
+++ b/guide/src/publishers.md
@@ -0,0 +1,100 @@
+For Publishers
+===================
+
+This page addresses common questions and concerns from publishers of research
+works indexed in Fatcat, as well as the Internet Archive Scholar service built
+on top of it. The [for authors](./authors.md) page has some information on
+updates and metadata corrections that is also relevant to publishers.
+
+For help in exceptional cases, contact Internet Archive through our usual
+support channels.
+
+
+## Metadata Indexing
+
+Many publishers will find that metadata records are already included in fatcat
+if they register persistent identifiers for their research works. This pipeline
+is based on our automated harvesting of DOI, PubMed, dblp, DOAJ, and other
+metadata catalogs. This process can take some time (e.g., days from
+registration), does not (yet) cover all persistent identifiers, and will only
+cover those works which get identifiers.
+
+For publishers who find that they are not getting indexed in fatcat, our
+primary advice is to register ISSNs for venues (journals, repositories,
+conferences, etc), and to register DOIs for all current and back-catalog works.
+DOIs are the most common and integrated identifier in the scholarly ecosystem,
+and will result in automatic indexing in many other aggregators in addition to
+fatcat/scholar. There may be funding or resources available for smaller
+publishers to cover the cost of DOI registration, and ISSN registration is
+usually no-cost or affordable through national institutions.
+
+We *do not* recommend that journal or conference publishers use general-purpose
+repositories like Zenodo to obtain no-cost DOIs for journal articles. These
+platforms are a great place for pre-publication versions, datasets, software,
+and other artifacts, but not for primary publication-version works (in our
+opinion).
+
+If DOI registration is not possible, one good alternative is to get included in
+the Directory of Open Access Journals and deposit article metadata there. This
+process may take some time, but is a good basic indicator of publication
+quality. DOAJ article metadata is periodically harvested and indexed in fatcat,
+after a de-duplication process.
+
+Fatcat does not yet support OAI-PMH as an identifier and mechanism for
+automated journal ingest, but we likely will in the future. This would
+particularly help publishers using Open Journal Systems (OJS). Fatcat also
+does not yet support crawling journal sites and extracting bibliographic
+metadata from HTML tags.
+
+Lastly, publishers could use the fatcat catalog web interface or API to push
+metadata records about their works programmatically. We don't know of any
+publishers actually doing this today.
+
+
+## Improving Automatic Preservation
+
+In alignment with its mission, Internet Archive makes basic automated attempts
+to capture and preserve all open access research publications on the public
+web, at no cost. This effort comes with no guarantees around completeness,
+timeliness, or support communications.
+
+Preservation coverage can be monitored through the journal-specific dashboards
+or via the coverage search interface.
+
+There are a few technical things publishers can do to increase their
+preservation coverage, in addition to the metadata indexing tips above:
+
+- use the `citation_pdf_url` HTML meta tag, when appropriate, to link directly
+  from article landing pages to PDF URLs
+- use simple HTML to represent landing pages and article content, and do not
+  require JavaScript to render page content or links
+- ensure that hosting server `robots.txt` rules are not preventing or overly
+  restricting automated crawling
+- use simple, accessible PDF access links. Do not use time-limited or
+  IP-limited URLs, require specific referrer headers, or use cookies to
+  authenticate access to OA PDFs
+- minimize the number of HTTP redirects and HTML hops between DOI and fulltext
+  content
+- avoid paywalls, loginwalls, geofencing, and anti-bot measures, all of which
+  are antithetical to open crawling and indexing
+
+Publishers are also free to submit "Save Paper Now" requests, or edit the
+catalog itself either manually or in bulk through the API. If an individual
+work persistently fails to ingest, try running a "Save Page Now" request first
+from web.archive.org and verify that the content is available through Wayback
+replay, then submit the "Save Paper Now" request again.
+
+
+## Official Preservation
+
+Internet Archive is developing preservation services for scholarly content on
+the web. Contact us at webservices@archive.org for details.
+
+Existing web archiving services offered to universities, national libraries,
+and other institutions may already be appropriate for some publications. Check
+if your affiliated institutions already have an
+[Archive-It](https://archive-it.org) account or other existing relationship
+with Internet Archive.
+
+Small publishers using Open Journal Systems (OJS) should be aware of the PKP
+preservation project.
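
Two of the crawlability tips in `guide/src/publishers.md` above (the `citation_pdf_url` meta tag and permissive `robots.txt` rules) can be self-checked offline. The following is a minimal sketch using only the Python standard library. The function names, the `example-bot` user agent, and the `journal.example.org` URLs are illustrative assumptions; they are not part of fatcat or of any Internet Archive tooling, and a real crawler may apply different rules.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class CitationPdfFinder(HTMLParser):
    """Collect the content of every <meta name="citation_pdf_url"> tag."""

    def __init__(self):
        super().__init__()
        self.pdf_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = (attr_map.get("name") or "").lower()
        content = attr_map.get("content")
        if name == "citation_pdf_url" and content:
            self.pdf_urls.append(content)


def find_citation_pdf_urls(html_text):
    """Return all PDF URLs advertised by a landing page's meta tags."""
    finder = CitationPdfFinder()
    finder.feed(html_text)
    return finder.pdf_urls


def crawler_allowed(robots_txt, user_agent, url):
    """Check whether the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical landing page and robots.txt, for illustration only.
SAMPLE_HTML = """
<html><head>
  <meta name="citation_title" content="An Example Article">
  <meta name="citation_pdf_url"
        content="https://journal.example.org/article/1.pdf">
</head><body>Landing page</body></html>
"""

SAMPLE_ROBOTS = """
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for pdf_url in find_citation_pdf_urls(SAMPLE_HTML):
        allowed = crawler_allowed(SAMPLE_ROBOTS, "example-bot", pdf_url)
        print(pdf_url, "crawlable" if allowed else "blocked by robots.txt")
```

A publisher could feed the saved HTML of a real landing page and the site's actual `robots.txt` into the same two functions. An empty result from `find_citation_pdf_urls` suggests the meta tag is missing or is only injected by JavaScript.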