Diffstat (limited to 'fatcat_covid19/templates/sources.html')
-rw-r--r-- | fatcat_covid19/templates/sources.html | 119 |
1 files changed, 119 insertions, 0 deletions
diff --git a/fatcat_covid19/templates/sources.html b/fatcat_covid19/templates/sources.html
new file mode 100644
index 0000000..17b0818
--- /dev/null
+++ b/fatcat_covid19/templates/sources.html
@@ -0,0 +1,119 @@
+{% extends "base.html" %}
+
+{% block title %}About{% endblock %}
+
+{% block body %}
+
+{# <img class="ui fluid bordered image" src="/static/fatcat.jpg" title="CC0 photo of an oversized feline" alt=""> #}
+
+<h1></h1>
+
+<p>Fatcat is a versioned, publicly-editable catalog of research publications:
+journal articles, conference proceedings, pre-prints, blog posts, and so forth.
+The goal is to improve the state of preservation and access to these works by
+providing a manifest of full-text content versions and locations.
+
+<p>This service does not directly contain full-text content itself, but
+provides basic access for human and machine readers through links to copies in
+web archives, repositories, and the public web.
+
+<p>Significantly more context and background information can be found in <a
+href="https://guide.{{ config.FATCAT_DOMAIN }}/">The Guide</a>.
+
+<p>Feedback and queries can be directed to
+<b><a href="mailto:webservices@archive.org">webservices@archive.org</a></b>.
+
+<h3>Goals and Features</h3>
+
+<p>A few things set Fatcat apart from similar indexing and discovery services:
+
+<ul>
+  <li>inclusion of archival, <b>file-level metadata (hashes)</b> in addition
+  to URLs, which allows automated verification ("do I have the right copy"),
+  reveals content-drift over time, and enables efficient distribution of
+  content through the ecosystem
+  <li>native support for "post-PDF" digital media, including <b>archival web
+  captures and datasets</b>, as well as content stored on the distributed web
+  <li>data model that captures the <b>work/edition distinction</b>,
+  grouping pre-print, post-review, published, re-published, and updated
+  versions of a work together
+  <li><b>public editing</b> interface, allowing metadata corrections and improvements
+  from individuals and bots in addition to automated imports from authoritative
+  sources
+  <li>focus on providing a stable API and corpus (making integration with
+  diverse user-facing applications simple), while enabling full replication and
+  mirroring of the corpus to <b>reduce the risks of centralized control</b>
+</ul>
+
+<p>This service aspires to be a piece of sustainable, long-term, non-profit,
+free-software, collaborative, open digital infrastructure. It is primarily
+designed to support the <i>archival</i> and <i>dissemination</i> roles of
+scholarly communication. It may also support the <i>registration</i> role
+(establishing precedence and authorship), but explicitly does not aid with
+<i>certification</i> of content, and is not intended to be used for
+<i>evaluation</i> of individuals, institutions, or venues. This service is
+"universal", not curated, and happily includes retracted and "predatory"
+content.
+
+<h3>Sources of Metadata</h3>
+
+The source of all bibliographic information is recorded in edit history
+metadata, which allows the provenance of all records to be reconstructed. A few
+major sources are worth highlighting here:
+
+<ul>
+  <li>Release metadata from <b>Crossref</b>, via their public
+  <a href="https://github.com/CrossRef/rest-api-doc">REST API</a>
+  <li>Release metadata and linked full-text content from NIH <b>PubMed</b> and <b><a href="https://arxiv.org">arXiv.org</a></b>
+  <li>Release metadata and linked public domain full-text content from the <b>JSTOR</b> Early Journal Content collection
+  <li>Creator names and de-duplication from <b>ORCID</b>, via their annual public data releases
+  <li>Journal title metadata from <b>DOAJ</b>, <b>ISSN ROAD</b>, and <b>SHERPA/RoMEO</b>
+  <li>Full-text URL lists from <b><a href="https://core.ac.uk">CORE</a></b>,
+  <b><a href="http://unpaywall.org">Unpaywall</a></b>,
+  <b><a href="https://www.semanticscholar.org">Semantic Scholar</a></b>,
+  <b><a href="https://citeseerx.ist.psu.edu">CiteSeerX</a></b>,
+  and <b><a href="https://www.microsoft.com/en-us/research/project/academic">Microsoft Academic Graph</a></b>.
+  <li><a href="https://guide.{{ config.FATCAT_DOMAIN }}/sources.html">The Guide</a> lists more major sources
+</ul>
+
+Many thanks for the hard work of all these projects, institutions, and
+individuals!
+
+
+<h3>Support and Acknowledgments</h3>
+
+<p>Fatcat is a project of the <b><a href="https://archive.org">Internet Archive</a></b>,
+a US-based non-profit digital library, well known for its
+<a href="https://web.archive.org">Wayback Machine</a> web archive and
+<a href="https://openlibrary.org">Open Library</a> book digitization and
+lending service. All Fatcat databases and services run on Internet Archive
+servers in California, and a copy of most full-text content is stored in the
+Archive's collections and/or web archives.
+
+<p>Development of Fatcat and related web harvesting, indexing, and preservation
+efforts at the Archive have been partially funded (for the 2018-2019 period) by
+a generous grant from the <b>Mellon Foundation</b>
+(<a href="https://blog.archive.org/2018/03/05/andrew-w-mellon-foundation-awards-grant-to-the-internet-archive-for-long-tail-journal-preservation/">"Long-tail Open Access Journal Preservation"</a>).
+Fatcat supports this work by both tracking which open access works are in known
+archives and providing minimum-viable indexing and access mechanisms for
+long-tail works which otherwise would lack them.
+
+<p>The service would not technically be possible without hundreds of Free
+Software components and the efforts of their individual and organizational
+maintainers, more than can be listed here (please see the source code for full
+lists). A few major components include the PostgreSQL database, Elasticsearch
+search engine, Flask Python web framework, Rust programming language, Diesel
+database library, Swagger/OpenAPI code generators, Kafka distributed log,
+Ansible configuration management tool, and Ubuntu GNU/Linux operating system
+distribution.
+
+<p>The front-page photo of a large feline with a cup of coffee is by
+<a href="http://www.kampschroer.com/photography.html">Quinn Kampschroer</a>,
+under a CC0 license. The name "Fatcat" can be interpreted as short for "large
+catalog", as the service aspires to be a <i>complete</i> catalog of the digital
+scholarly record.
+
+<p>A list of technical contributors, including volunteers, is maintained in the
+source code repository (<code>CONTRIBUTORS.md</code>). Thanks everybody!
+
+{% endblock %}
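The template text in this patch says that file-level hashes allow automated verification ("do I have the right copy"). A minimal sketch of what such a check could look like follows. It is not taken from the Fatcat code base: the helper names `file_digests` and `matches_catalog`, and the `sha1`/`sha256` keys of the catalog record, are assumptions made for illustration only.

```python
import hashlib


def file_digests(path, chunk_size=1 << 20):
    """Return SHA-1 and SHA-256 hex digests of a local file, read in chunks."""
    hashers = {"sha1": hashlib.sha1(), "sha256": hashlib.sha256()}
    with open(path, "rb") as handle:
        # Stream the file so large PDFs or datasets need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def matches_catalog(path, catalog_hashes):
    """Check a local copy against hashes from a (hypothetical) catalog record.

    `catalog_hashes` is assumed to be a dict such as
    {"sha1": "...", "sha256": "..."}; keys other than sha1/sha256 are ignored.
    Returns False when the record carries no usable hash at all.
    """
    local = file_digests(path)
    known = {k: v for k, v in catalog_hashes.items() if k in local and v}
    if not known:
        return False
    return all(local[name] == value.lower() for name, value in known.items())
```

A caller would download a copy from one of the listed URLs and pass its path together with the hashes from the catalog record. A `False` result suggests either a wrong copy or content drift at that location.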