author Bryan Newbold <bnewbold@robocracy.org> 2020-10-02 00:12:57 -0700
committer Bryan Newbold <bnewbold@robocracy.org> 2020-10-02 00:12:57 -0700
commit c5fdad74622350e7445a962f96975348a964018d
tree 596cffd9b2af9421e81da0d3348924188b76b5b2 /guide
parent 95bcef522ba3cdb32fc60078caec38855506c814
update 'contributing' page in guide
Diffstat (limited to 'guide')
-rw-r--r-- guide/src/SUMMARY.md | 4
-rw-r--r-- guide/src/bibliography.md | 33
-rw-r--r-- guide/src/contributing.md | 137
-rw-r--r-- guide/src/sw_contribute.md | 14
4 files changed, 172 insertions, 16 deletions
diff --git a/guide/src/SUMMARY.md b/guide/src/SUMMARY.md
index c8e867c2..ffc80ac2 100644
--- a/guide/src/SUMMARY.md
+++ b/guide/src/SUMMARY.md
@@ -23,11 +23,11 @@
- [Public API](./http_api.md)
- [Bulk Exports](./bulk_exports.md)
- [Cookbook](./cookbook.md)
-- [Software Contributions](./sw_contribute.md)
+- [Contributing](./contributing.md)
- [Policies](./policies.md)
- [Code of Conduct](./code_of_conduct.md)
- [Privacy](./privacy_policy.md)
-[Bibliography](./bibliography.md)
+[Further Reading](./bibliography.md)
[About This Guide](./guide.md)
diff --git a/guide/src/bibliography.md b/guide/src/bibliography.md
index f94f1b79..d38c4b04 100644
--- a/guide/src/bibliography.md
+++ b/guide/src/bibliography.md
@@ -1,4 +1,37 @@
+# Presentations
+
+2020 Workshop On Open Citations And Open Scholarly Metadata 2020 - Fatcat ([video on archive.org](https://archive.org/details/fatcat_workshop_open_citations_open_scholarly_metadata_2020))
+
+2019-10-25 FORCE2019 - Perpetual Access Machines: Archiving Web-Published Scholarship at Scale ([video on youtube.com](https://www.youtube.com/watch?v=PARqfbYIdXQ))
+
+
+# Blog Posts And Press
+
+2020-09-17 blog.dshr.org - [Don't Say We Didn't Warn You](https://blog.dshr.org/2020/09/dont-say-we-didnt-warn-you.html)
+
+2020-09-15: blog.archive.org - [How the Internet Archive is Ensuring Permanent Access to Open Access Journal Articles](http://blog.archive.org/2020/09/15/how-the-internet-archive-is-ensuring-permanent-access-to-open-access-journal-articles/)
+
+2020-02-18 blog.dshr.org - [The Scholarly Record At The Internet Archive](https://blog.dshr.org/2020/02/the-scholarly-record-at-internet-archive.html)
+
+2019-04-18 blog.dshr.org - [Personal Pods and Fatcat](https://blog.dshr.org/2019/04/personal-pods-and-fatcat.html)
+
+2018-10-03 blog.dshr.org - [Brief Talk At Internet Archive Event](https://blog.dshr.org/2018/10/brief-talk-at-internet-archive-event.html)
+
+2018-03-05 blog.archive.org - [Andrew W. Mellon Foundation Awards Grant to the Internet Archive for Long Tail Journal Preservation](https://blog.archive.org/2018/03/05/andrew-w-mellon-foundation-awards-grant-to-the-internet-archive-for-long-tail-journal-preservation/)
+
+
+# Background
+
+<!-- TODO: move these to bibliography instead? -->
+
+2020-09-08 sciencemag.org: [Dozens of scientific journals have vanished from the internet, and no one preserved them](https://www.sciencemag.org/news/2020/09/dozens-scientific-journals-have-vanished-internet-and-no-one-preserved-them)
+
+2020-09-10 nature.com: [More than 100 scientific journals have disappeared from the Internet](https://www.nature.com/articles/d41586-020-02610-z)
+
+2020-08-27 arxiv.org [Open is not forever: a study of vanished open access journals](https://arxiv.org/abs/2008.11933)
+
+
# Bibliography
<!-- On zbib.org: https://zbib.org/f53f7e0032ff4268a9a5f3e13aff13b9 -->
diff --git a/guide/src/contributing.md b/guide/src/contributing.md
new file mode 100644
index 00000000..2b43f15a
--- /dev/null
+++ b/guide/src/contributing.md
@@ -0,0 +1,137 @@
+
+# Contributing
+
+Our aspiration is for this to be an open, collaborative project, with
+individuals and organizations of all sizes able to participate. There is not
+much structure or documentation on how volunteers can get started or be most
+helpful, but perhaps we can work together on that as well!
+
+The best place to organize and coordinate right now is the
+[gitter chatroom](https://gitter.im/internetarchive/fatcat). Gitter is
+described as "for developers", but we use it for everybody, and you don't need
+an invitation.
+
+Want to help out? Below are a few example roles you could play.
+
+
+#### Anybody: Find Bugs, Suggest Improvements
+
+The user sign-up and editing workflow on fatcat.wiki is currently pretty poor.
+How could this experience be improved and better documented? Specific ideas,
+suggestions and diagrams would be very helpful. You don't need to know how to
+program or about web technologies to contribute; hand drawings and example text
+can be sufficient.
+
+
+#### Community Organizer: Partner and Volunteer Organizing
+
+Are you passionate about Open Access and want to help build a community around
+preservation and universal access to knowledge? We could use help structuring
+an editing community, and communicating with partner projects like Wikidata to
+ensure we are not duplicating efforts.
+
+A good example of a project to organize would be improving journal-level
+metadata in Wikidata, including journal homepages, and linking to fatcat
+"container" entities.
+
+
+#### Research Librarian: Identify Missing Content
+
+If you have an interest in a specific scholarly field, you could give us
+feedback on how well fatcat is preserving at-risk open access content. We know
+we have a lot of work to do, but specific examples of missing publications, as
+well as broader patterns and gaps, are helpful to know about. Some missing
+content we know we don't have, but there are surely entire categories of
+in-scope content that we do not even know are missing!
+
+
+#### Metadata Librarian: Schema Improvements
+
+Are you an experienced wrangler of BibFrame, MARC, BibTeX, RDF, OAI-PMH, and
+Citation Style Language? Our data model and entity schemas are bespoke (sorry!)
+and designed to evolve over time. There might be related efforts and new
+controlled vocabularies we could adopt or align with, or small changes to the
+schema might enable new use cases. It could be as simple as identifying and
+prioritizing new external identifiers (PIDs) to allow. Let us know what we got
+right and what needs improvement!
+
+
+#### Power Editor: Better Interfaces
+
+Are you super experienced with data entry, editing, and corrections? Do you
+have ideas on how our interface could be improved, or what kinds of new
+interfaces and tools could be built to support effective editing? Our open API
+allows third-party interfaces to make edits on individuals' behalf, meaning new
+tools can be built for specific patterns of editing or user contribution.
+
+
+#### Data Scientist: Wrangling and Visualization
+
+We have hundreds of gigabytes of metadata to transform and normalize before
+importing, and already have a rich open dataset with millions of linked
+entities. Our elasticsearch analytics database has an open read-only endpoint
+(<https://search.fatcat.wiki>), which is used to power our [coverage
+interface](https://fatcat.wiki/coverage/search). What other interactive
+visualizations could be built? What tools should we be using to wrangle
+bibliographic metadata better and faster?
+
+
+#### Author: Verify Metadata
+
+Do you publish research documents, and want to ensure they are accessible to the
+broadest audience today and in the future? As with many academic search engines,
+you can add papers and link an author profile to specific publications. Unlike
+others, you can also ensure uploaded pre-prints and other open versions of your
+research are found and linked using the "save paper now" feature, and you can
+correct any errors made by publishers and bots.
+
+
+#### Translation and Accessibility Advocate
+
+Some of our web interfaces have existing internationalization infrastructure,
+and translations can be
+[contributed directly](https://hosted.weblate.org/projects/internetarchive/).
+
+Other projects need help getting translation infrastructure in place, and all
+of our projects could use review and recommendations for improvement by experts
+in web accessibility. For example, if you use a screen reader, feedback on
+which parts of our services are most difficult to use is very helpful.
+
+
+#### Software Developer: Bot Wrangling
+
+Fatcat is structured such that all changes to the catalog go through an open
+API. This includes human edits through the web interface, but the large
+majority of edits are made by bots. You could write a new bot to help...
+
+- review human edits (from the "reviewable" queue) to "lint" for typos, missing
+ fields, or other problems, and then leave an annotation
+- harvest, transform, and import metadata from additional subject- and
+ region-specific sources
+- find and clean up patterns of poor or incorrect metadata already in the
+ catalog
+
+
+#### SQL Expert: Database Scaling
+
+We have a large (500+ GByte) PostgreSQL database backing the catalog. This is
+working great so far, but we have concerns about how the catalog will scale
+further, especially if bots start making multiple updates per entity. You could
+review our SQL schema and recommend improvements, or give feedback and advice
+on how to switch to a distributed primary datastore.
+
+
+#### Financial Supporter
+
+Short on time? As a US 501(c)(3) non-profit, the Internet Archive always
+appreciates and makes good use of [donations](https://archive.org/donate/).
+
+
+## Software Contributions
+
+Bugs and patches can be filed on GitHub at: <https://github.com/internetarchive/fatcat>
+
+When considering making a non-trivial contribution, it can save review time and
+duplicated work to post an issue with your intentions and plan. New code and
+features must include unit tests before being merged, though we can help with
+writing them.
diff --git a/guide/src/sw_contribute.md b/guide/src/sw_contribute.md
deleted file mode 100644
index d408ef4b..00000000
--- a/guide/src/sw_contribute.md
+++ /dev/null
@@ -1,14 +0,0 @@
-# Software Contributions
-
-For now, issues and patches can be filed at <https://github.com/internetarchive/fatcat>.
-
-The back-end (`fatcatd`, in Rust), web interface (`fatcat-web`, in Python),
-bots, and this guide are all versioned in the same git repository.
-
-See the `rust/README.md` and `rust/HACKING.md` documents for some common tasks
-and gotchas when working with the rust backend.
-
-When considering making a non-trivial contribution, it can save review time and
-duplicated work to post an issue with your intentions and plan. New code and
-features must include unit tests before being merged, though we can help with
-writing them.