author | Bryan Newbold <bnewbold@robocracy.org> | 2020-10-02 00:12:57 -0700 |
---|---|---|
committer | Bryan Newbold <bnewbold@robocracy.org> | 2020-10-02 00:12:57 -0700 |
commit | c5fdad74622350e7445a962f96975348a964018d (patch) | |
tree | 596cffd9b2af9421e81da0d3348924188b76b5b2 /guide/src | |
parent | 95bcef522ba3cdb32fc60078caec38855506c814 (diff) | |
download | fatcat-c5fdad74622350e7445a962f96975348a964018d.tar.gz fatcat-c5fdad74622350e7445a962f96975348a964018d.zip |
update 'contributing' page in guide
Diffstat (limited to 'guide/src')
-rw-r--r-- | guide/src/SUMMARY.md | 4 |
-rw-r--r-- | guide/src/bibliography.md | 33 |
-rw-r--r-- | guide/src/contributing.md | 137 |
-rw-r--r-- | guide/src/sw_contribute.md | 14 |
4 files changed, 172 insertions, 16 deletions
diff --git a/guide/src/SUMMARY.md b/guide/src/SUMMARY.md
index c8e867c2..ffc80ac2 100644
--- a/guide/src/SUMMARY.md
+++ b/guide/src/SUMMARY.md
@@ -23,11 +23,11 @@
 - [Public API](./http_api.md)
 - [Bulk Exports](./bulk_exports.md)
 - [Cookbook](./cookbook.md)
-- [Software Contributions](./sw_contribute.md)
+- [Contributing](./contributing.md)
 - [Policies](./policies.md)
 - [Code of Conduct](./code_of_conduct.md)
 - [Privacy](./privacy_policy.md)
-[Bibliography](./bibliography.md)
+[Further Reading](./bibliography.md)
 [About This Guide](./guide.md)
diff --git a/guide/src/bibliography.md b/guide/src/bibliography.md
index f94f1b79..d38c4b04 100644
--- a/guide/src/bibliography.md
+++ b/guide/src/bibliography.md
@@ -1,4 +1,37 @@
+# Presentations
+
+2020 Workshop On Open Citations And Open Scholarly Metadata 2020 - Fatcat ([video on archive.org](https://archive.org/details/fatcat_workshop_open_citations_open_scholarly_metadata_2020))
+
+2019-10-25 FORCE2019 - Perpetual Access Machines: Archiving Web-Published Scholarship at Scale ([video on youtube.com](https://www.youtube.com/watch?v=PARqfbYIdXQ))
+
+
+# Blog Posts And Press
+
+2020-09-17 blog.dshr.org - [Don't Say We Didn't Warn You](https://blog.dshr.org/2020/09/dont-say-we-didnt-warn-you.html)
+
+2020-09-15: blog.archive.org - [How the Internet Archive is Ensuring Permanent Access to Open Access Journal Articles](http://blog.archive.org/2020/09/15/how-the-internet-archive-is-ensuring-permanent-access-to-open-access-journal-articles/)
+
+2020-02-18 blog.dshr.org - [The Scholarly Record At The Internet Archive](https://blog.dshr.org/2020/02/the-scholarly-record-at-internet-archive.html)
+
+2019-04-18 blog.dshr.org - [Personal Pods and Fatcat](https://blog.dshr.org/2019/04/personal-pods-and-fatcat.html)
+
+2018-10-03 blog.dshr.org - [Brief Talk At Internet Archive Event](https://blog.dshr.org/2018/10/brief-talk-at-internet-archive-event.html)
+
+2018-03-05 blog.archive.org - [Andrew W. Mellon Foundation Awards Grant to the Internet Archive for Long Tail Journal Preservation](https://blog.archive.org/2018/03/05/andrew-w-mellon-foundation-awards-grant-to-the-internet-archive-for-long-tail-journal-preservation/)
+
+
+# Background
+
+<!-- TODO: move these to bibliography instead? -->
+
+2020-09-08 sciencemag.org: [Dozens of scientific journals have vanished from the internet, and no one preserved them](https://www.sciencemag.org/news/2020/09/dozens-scientific-journals-have-vanished-internet-and-no-one-preserved-them)
+
+2020-09-10 nature.com: [More than 100 scientific journals have disappeared from the Internet](https://www.nature.com/articles/d41586-020-02610-z)
+
+2020-08-27 arxiv.org [Open is not forever: a study of vanished open access journals](https://arxiv.org/abs/2008.11933)
+
+
 # Bibliography
 <!-- On zbib.org: https://zbib.org/f53f7e0032ff4268a9a5f3e13aff13b9 -->
diff --git a/guide/src/contributing.md b/guide/src/contributing.md
new file mode 100644
index 00000000..2b43f15a
--- /dev/null
+++ b/guide/src/contributing.md
@@ -0,0 +1,137 @@
+
+# Contributing
+
+Our aspiration is for this to be an open, collaborative project, with
+individuals and organizations of all sizes able to participate. There is not
+much structure or documentation on how volunteers can get started or be most
+helpful, but perhaps we can work together on that as well!
+
+The best place to organize and coordinate right now is the
+[gitter chatroom](https://gitter.im/internetarchive/fatcat). Gitter is
+described as "for developers", but we use it for everybody, and you don't need
+an invitation.
+
+Want to help out? Below are a few example roles you could play.
+
+
+#### Anybody: Find Bugs, Suggest Improvements
+
+The user sign-up and editing workflow on fatcat.wiki is currently pretty poor.
+How could this experience be improved and better documented? Specific ideas,
+suggestions and diagrams would be very helpful. You don't need to know how to
+program or about web technologies to contribute; hand drawings and example text
+can be sufficient.
+
+
+#### Community Organizer: Partner and Volunteer Organizing
+
+Are you passionate about Open Access and want to help build a community around
+preservation and universal access to knowledge? We could use help structuring
+an editing community, and communicating with partner projects like Wikidata to
+ensure we are not duplicating efforts.
+
+A good example of a project to organize would be improving journal-level
+metadata in wikidata, including journal homepages, and linking to fatcat
+"container" entities.
+
+
+#### Research Librarian: Identify Missing Content
+
+If you have an interest in a specific scholarly field, you could give us
+feedback on how good a job fatcat is doing preserving at-risk open access
+content. We know we have a lot of work to do, but both specific examples of
+missing publications and broader patterns and gaps are helpful to know about.
+Some missing content we know we don't have, but there are surely
+entire categories of in-scope content that we do not even know are missing!
+
+
+#### Metadata Librarian: Schema Improvements
+
+Are you an experienced wrangler of BibFrame, MARC, BibTeX, RDF, OAI-PMH, and
+Citation Style Language? Our data model and entity schemas are bespoke (sorry!)
+and designed to evolve over time. There might be related efforts and new
+controlled vocabularies we could adopt or align with, or small changes to the
+schema might enable new use cases. It could be as simple as identifying and
+prioritizing new external identifiers (PIDs) to allow. Let us know what we got
+right and what needs improvement!
+
+
+#### Power Editor: Better Interfaces
+
+Are you super experienced with data entry, editing, and corrections? Do you
+have ideas on how our interface could be improved, or what kinds of new
+interfaces and tools could be built to support effective editing? Our open API
+allows third-party interfaces to make edits on individuals' behalf, meaning new
+tools can be built for specific patterns of editing or user contribution.
+
+
+#### Data Scientist: Wrangling and Visualization
+
+We have hundreds of gigabytes of metadata to transform and normalize before
+importing, and already have a rich open dataset with millions of linked
+entities. Our elasticsearch analytics database has an open read-only endpoint
+(<https://search.fatcat.wiki>), which is used to power our [coverage
+interface](https://fatcat.wiki/coverage/search). What other interactive
+visualizations could be built? What tools should we be using to wrangle
+bibliographic metadata better and faster?
+
+
+#### Author: Verify Metadata
+
+Do you publish research documents, and want to ensure they are accessible to the
+broadest audience today and in the future? As with many academic search engines,
+you can add papers and link an author profile to specific publications. Unlike
+others, you can also ensure uploaded pre-prints and other open versions of your
+research are found and linked using the "save paper now" feature, and you can
+correct any errors made by publishers and bots.
+
+
+#### Translation and Accessibility Advocate
+
+Some of our web interfaces have existing internationalization infrastructure,
+and translations can be
+[contributed directly](https://hosted.weblate.org/projects/internetarchive/).
+
+Other projects need help getting translation infrastructure in place, and all
+of our projects could use review and recommendations for improvement by experts
+in web accessibility. For example, if you use a screen reader, feedback on
+which parts of our services are most difficult to use is very helpful.
+
+
+#### Software Developer: Bot Wrangling
+
+Fatcat is structured such that all changes to the catalog go through an open
+API. This includes human edits through the web interface, but the large
+majority of edits are made by bots. You could write a new bot to help...
+
+- review human edits (from the "reviewable" queue) to "lint" for typos, missing
+  fields, or other problems, and then leave an annotation
+- harvest, transform, and import metadata from additional subject- and
+  region-specific sources
+- find and clean up patterns of poor or incorrect metadata already in the
+  catalog
+
+
+#### SQL Expert: Database Scaling
+
+We have a large (500+ GByte) PostgreSQL database backing the catalog. This is
+working great so far, but we have concerns about how the catalog will scale
+further, especially if bots start making multiple updates per entity. You could
+review our SQL schema and recommend improvements, or give feedback and advice
+on how to switch to a distributed primary datastore.
+
+
+#### Financial Supporter
+
+Short on time? As a US 501(c)(3) non-profit, the Internet Archive always
+appreciates and makes good use of [donations](https://archive.org/donate/).
+
+
+## Software Contributions
+
+Bugs and patches can be filed on GitHub at: <https://github.com/internetarchive/fatcat>
+
+When considering making a non-trivial contribution, it can save review time and
+duplicated work to post an issue with your intentions and plan. New code and
+features must include unit tests before being merged, though we can help with
+writing them.
diff --git a/guide/src/sw_contribute.md b/guide/src/sw_contribute.md
deleted file mode 100644
index d408ef4b..00000000
--- a/guide/src/sw_contribute.md
+++ /dev/null
@@ -1,14 +0,0 @@
-# Software Contributions
-
-For now, issues and patches can be filed at <https://github.com/internetarchive/fatcat>.
-
-The back-end (`fatcatd`, in Rust), web interface (`fatcat-web`, in Python),
-bots, and this guide are all versioned in the same git repository.
-
-See the `rust/README.md` and `rust/HACKING.md` documents for some common tasks
-and gotchas when working with the rust backend.
-
-When considering making a non-trivial contribution, it can save review time and
-duplicated work to post an issue with your intentions and plan. New code and
-features must include unit tests before being merged, though we can help with
-writing them.
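The "Software Developer: Bot Wrangling" role in the new `contributing.md` mentions bots that "lint" human edits for missing fields or other problems. The following is a minimal sketch of what the checking step of such a bot might look like. It works on a plain dictionary and makes no API calls. The function name `lint_release`, the `REQUIRED_FIELDS` list, and the specific checks are assumptions for illustration, not part of fatcat itself; a real bot would fetch entities from the "reviewable" queue and post its findings as an annotation through the API.

```python
from typing import Dict, List

# Hypothetical set of fields this sketch treats as required;
# the real fatcat release schema may require different ones.
REQUIRED_FIELDS = ("title", "release_type", "release_year")


def lint_release(release: Dict[str, object]) -> List[str]:
    """Return human-readable problems found in one release-like dict."""
    problems = []
    # Flag fields that are absent or empty.
    for field in REQUIRED_FIELDS:
        value = release.get(field)
        if value is None or value == "":
            problems.append(f"missing field: {field}")
    # Flag simple whitespace mistakes in the title.
    title = release.get("title")
    if isinstance(title, str):
        if title != title.strip():
            problems.append("title has leading or trailing whitespace")
        if "  " in title:
            problems.append("title contains repeated spaces")
    return problems


if __name__ == "__main__":
    edit = {"title": " Bad  title", "release_year": 2020}
    for problem in lint_release(edit):
        print(problem)
```

The returned list could be joined into the text of an annotation. Keeping the checks in a pure function like this also makes it straightforward to cover them with the unit tests that the contribution guidelines ask for.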