# Refcat update
* new refs export, about 10% more (2.7B)
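* new fatcat export

New Wikipedia extraction; lines mentioning a DOI went from 1442189 (2020-07-14 dataset) to 1932578 (enwiki-20211201), about 34% more: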
```
martin@ia601101:/magna/data/wikipedia_citations_2020-07-14 $ LC_ALL=C grep ID_list minimal_dataset.json | grep -c DOI
1442189
$ jq -rc '.refs[] | select(.ID_list != null) | {"URL": .URL, "Title": .title, "ID_list": .ID_list}' enwiki-20211201-pages-articles.citations.json | pv -l > minimal.json
$ grep -c DOI minimal.json
1932578
```
Next: convert the new extraction to the existing minimal format, as used by the "BrefZipWikiDOI" task.
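
For reference, a rough Python sketch of what the jq filter in the transcript does: keep only refs with a non-null `ID_list`, emit `URL`, `Title` and `ID_list` as compact JSON lines, then count lines containing "DOI" the way `grep -c DOI` does. The function names and the sample record (including the `ID_list` value) are made up for illustration, and the sketch assumes one JSON document per input line; it is not the code used by the task.

```python
import json
from typing import Iterable, Iterator, List


def minimal_records(line: str) -> Iterator[dict]:
    """Yield one minimal record per ref that has a non-null ID_list."""
    doc = json.loads(line)
    # Deviation from the jq filter: a missing or null "refs" is skipped
    # here, whereas `.refs[]` in jq would raise an error.
    for ref in doc.get("refs") or []:
        if ref.get("ID_list") is None:
            continue
        yield {
            "URL": ref.get("URL"),
            "Title": ref.get("title"),  # renamed key, as in the jq filter
            "ID_list": ref.get("ID_list"),
        }


def convert(lines: Iterable[str]) -> List[str]:
    """Return compact JSON lines, comparable to `jq -c` output."""
    out = []
    for line in lines:
        if not line.strip():
            continue
        for rec in minimal_records(line):
            out.append(json.dumps(rec, ensure_ascii=False, separators=(",", ":")))
    return out


# Hypothetical sample input; real records come from the citations dump.
sample = [
    json.dumps({"refs": [
        {"URL": "https://example.org/a", "title": "A", "ID_list": "{DOI=10.1000/xyz}"},
        {"URL": "https://example.org/b", "title": "B", "ID_list": None},
        {"URL": "https://example.org/c", "title": "C", "ID_list": "{ISBN=0-00-000000-0}"},
    ]}),
]

converted = convert(sample)
# Same idea as `grep -c DOI`: count output lines containing "DOI" anywhere.
doi_lines = sum("DOI" in rec for rec in converted)
print(len(converted), doi_lines)
```

Note that, like `grep -c DOI`, the count matches the substring anywhere in the line, so a title or URL containing "DOI" would be counted too.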