# Technical Report: Refcat

* 2021-08-08

To be uploaded to [arXiv](https://arxiv.org/) soon.

> As part of its scholarly data efforts, the Internet Archive
releases a first version of a citation graph dataset, named refcat, derived
from scholarly publications and additional data sources. It is composed of data
gathered by the fatcat cataloging project, related web-scale crawls targeting
primary and secondary scholarly outputs, as well as metadata from the Open
Library project and Wikipedia. This first version of the graph consists of
1,323,423,672 citations. We release this dataset under a CC0 Public Domain
Dedication, accessible through an archive item. All code used in the
derivation process is released under an MIT license.