This report is auto-generated from a SQLite database file, which should be available alongside it.
| datetime('now') |
|---|
| 2019-08-01 03:55:43 |
QUERY: SELECT datetime('now');
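As a minimal sketch of how each QUERY line in this report could be re-run, the snippet below uses Python's standard `sqlite3` module. The in-memory connection and the `run_query` helper are assumptions for illustration only; the actual report generator and the database filename are not shown in this report.

```python
import sqlite3


def run_query(conn, sql):
    """Run one report query and return (column_names, rows)."""
    cur = conn.execute(sql)
    columns = [d[0] for d in cur.description]
    return columns, cur.fetchall()


# Assumption: an in-memory database stands in for the real sqlite file.
conn = sqlite3.connect(":memory:")
columns, rows = run_query(conn, "SELECT datetime('now');")
print(columns)
print(rows)
```

Pointing `sqlite3.connect` at the real database file instead of `":memory:"` would let the other queries in this report be run the same way.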
Note that nearly all of the fatcat release stats are counted per release, not per work, so there may be over-counting. Also, as of July 2019 there were over 1.5 million OA longtail releases that are not linked to a container (journal).
Top countries by journal count (and fatcat release counts):
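| country | journal_count | sum(release_count) |
|---|---|---|
| | 91931 | 34853365 |
| us | 6838 | 20812424 |
| gb | 5967 | 12238711 |
| nl | 2343 | 7763639 |
| de | 1841 | 4176386 |
| id | 1562 | 112525 |
| br | 1501 | 614272 |
| es | 1012 | 275328 |
| pl | 807 | 256632 |
| it | 803 | 304793 |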
QUERY: SELECT country, COUNT(*) AS journal_count, sum(release_count) from journal group by country order by count(*) desc limit 10;
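The row with no country code most likely collects journals whose `country` is NULL, since `GROUP BY` puts all NULL values into a single group. This is an assumption (the column could also hold empty strings). A small sketch with made-up rows and placeholder ISSN-Ls shows the behaviour:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE journal (issnl TEXT, country TEXT, release_count INTEGER)"
)
# Hypothetical toy rows; none of these values come from the real database.
conn.executemany(
    "INSERT INTO journal VALUES (?, ?, ?)",
    [
        ("0000-0001", "us", 10),
        ("0000-0002", "us", 5),
        ("0000-0003", None, 7),
        ("0000-0004", None, 1),
        ("0000-0005", None, 2),
        ("0000-0006", "gb", 3),
    ],
)
by_country = conn.execute(
    "SELECT country, COUNT(*) AS journal_count, sum(release_count) "
    "FROM journal GROUP BY country ORDER BY count(*) DESC LIMIT 10"
).fetchall()
for row in by_country:
    print(row)
```

The three NULL-country journals come back as one row whose first field is `None`, which a report renderer would typically print as an empty cell.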
Top languages by journal count (and fatcat release counts):
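| lang | journal_count | release_count |
|---|---|---|
| | 96856 | 39729766 |
| en | 25584 | 46389136 |
| es | 738 | 105717 |
| id | 587 | 35909 |
| pt | 560 | 99100 |
| de | 504 | 1050664 |
| fr | 420 | 314582 |
| ja | 330 | 589020 |
| ru | 245 | 150367 |
| it | 202 | 97561 |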
QUERY: SELECT lang, COUNT(*) as journal_count, sum(release_count) as release_count FROM journal GROUP BY lang ORDER BY COUNT(*) DESC LIMIT 10;
Aggregate fatcat fulltext release coverage by OA status:
| is_oa | journal_count | SUM(release_count) | SUM(ia_count) | total_ia_frac |
|---|---|---|---|---|
| 0 | 74985 | 42533886 | 5678063 | 0.13 |
| 1 | 51850 | 46171246 | 10357705 | 0.22 |
QUERY: SELECT is_oa, COUNT(*) AS journal_count, SUM(release_count), SUM(ia_count), ROUND(1. * SUM(ia_count) / SUM(release_count), 2) as total_ia_frac FROM journal GROUP BY is_oa;
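The `total_ia_frac` column is simply `SUM(ia_count) / SUM(release_count)` rounded to two places. As a quick arithmetic check, using only the sums reported above:

```python
# (is_oa, SUM(release_count), SUM(ia_count)) copied from the table above.
rows = [
    (0, 42533886, 5678063),
    (1, 46171246, 10357705),
]

# Recompute the fraction of releases with IA fulltext for each group.
fracs = {is_oa: round(ia / rel, 2) for is_oa, rel, ia in rows}
print(fracs)  # {0: 0.13, 1: 0.22}
```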
Big publishers by journal count:
| publisher | journal_count | SUM(release_count) |
|---|---|---|
| | 47203 | 6504661 |
| Elsevier | 4009 | 16206074 |
| Informa UK (Taylor & Francis) | 3332 | 3921600 |
| Springer-Verlag | 2875 | 5638303 |
| SAGE Publications | 1372 | 2344281 |
| Peter Lang International Academic Publishers | 1360 | 252 |
| Wiley (Blackwell Publishing) | 1173 | 3640989 |
| Wiley (John Wiley & Sons) | 1039 | 4456867 |
| Walter de Gruyter GmbH | 624 | 435616 |
| Springer (Biomed Central Ltd.) | 558 | 450187 |
| Cambridge University Press | 553 | 1519555 |
| Hindawi Limited | 521 | 194707 |
| Georg Thieme Verlag KG | 512 | 688731 |
| OMICS Publishing Group | 502 | 93785 |
| JSTOR | 495 | 738890 |
QUERY: SELECT publisher, COUNT(*) AS journal_count, SUM(release_count) from journal GROUP BY publisher ORDER BY COUNT(*) DESC LIMIT 15;
Number of publishers with 3 or fewer journals:
| COUNT(*) |
|---|
| 18307 |
QUERY: SELECT COUNT(*) FROM (SELECT publisher, COUNT(*) as journal_count FROM journal GROUP BY publisher) WHERE journal_count <= 3;
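The query above works in two steps: the inner `SELECT` counts journals per publisher, and the outer `SELECT` counts how many of those per-publisher rows have a count of 3 or less. A sketch on a toy table (the publisher names "Big", "Small", and "Tiny" and the ISSN-Ls are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE journal (issnl TEXT, publisher TEXT)")

# Hypothetical publishers with 4, 2, and 1 journals respectively.
toy_publishers = [("Big", 4), ("Small", 2), ("Tiny", 1)]
n = 0
for publisher, journal_count in toy_publishers:
    for _ in range(journal_count):
        n += 1
        conn.execute(
            "INSERT INTO journal VALUES (?, ?)", (f"0000-{n:04d}", publisher)
        )

small_publishers = conn.execute(
    "SELECT COUNT(*) FROM ("
    "  SELECT publisher, COUNT(*) AS journal_count"
    "  FROM journal GROUP BY publisher"
    ") WHERE journal_count <= 3"
).fetchone()[0]
print(small_publishers)  # 2
```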
Fulltext coverage by publisher type:
| publisher_type | ia_total_frac | preserved_total_frac | journal_count | paper_count |
|---|---|---|---|---|
| big5 | 0.12 | 0.89 | 15362 | 39334593 |
| society | 0.25 | 0.71 | 8545 | 17499721 |
| | 0.15 | 0.28 | 59716 | 13550129 |
| commercial | 0.12 | 0.84 | 6608 | 6041845 |
| unipress | 0.24 | 0.84 | 6017 | 5876070 |
| longtail | 0.48 | 0.54 | 25523 | 2216976 |
| oa | 0.76 | 0.84 | 2476 | 1180835 |
| repository | 0.13 | 0.3 | 646 | 925092 |
| other | 0.08 | 0.88 | 927 | 861701 |
| archive | 0.26 | 0.98 | 604 | 792273 |
| scielo | 0.8 | 0.81 | 411 | 425897 |
QUERY: SELECT publisher_type, ROUND(1.0 * SUM(ia_count) / SUM(release_count), 2) as ia_total_frac, ROUND(1.0 * SUM(preserved_count) / SUM(release_count), 2) as preserved_total_frac, count(*) as journal_count, sum(release_count) as paper_count from journal group by publisher_type order by sum(release_count) desc;
Fulltext coverage by publisher type (NOTE: averaging fractions without weighting by release count, intentionally):
| publisher_type | avg_ia_frac | avg_preserved_frac | journal_count | paper_count |
|---|---|---|---|---|
| big5 | 0.15 | 0.81 | 15362 | 39334593 |
| society | 0.32 | 0.53 | 8545 | 17499721 |
| | 0.24 | 0.31 | 59716 | 13550129 |
| commercial | 0.26 | 0.76 | 6608 | 6041845 |
| unipress | 0.42 | 0.69 | 6017 | 5876070 |
| longtail | 0.55 | 0.58 | 25523 | 2216976 |
| oa | 0.63 | 0.8 | 2476 | 1180835 |
| repository | 0.04 | 0.19 | 646 | 925092 |
| other | 0.15 | 0.65 | 927 | 861701 |
| archive | 0.31 | 0.98 | 604 | 792273 |
| scielo | 0.83 | 0.85 | 411 | 425897 |
QUERY: SELECT publisher_type, ROUND(1.0 * AVG(ia_frac), 2) as avg_ia_frac, ROUND(1.0 * AVG(preserved_frac), 2) as avg_preserved_frac, count(*) as journal_count, sum(release_count) as paper_count from journal group by publisher_type order by sum(release_count) DESC;
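The two publisher-type tables differ because `SUM(ia_count) / SUM(release_count)` weights every release equally, while `AVG(ia_frac)` weights every journal equally. A toy sketch with two hypothetical journals (one large with low coverage, one small with high coverage) shows how far apart the two numbers can land; the table layout here is an assumed subset of the real `journal` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE journal (publisher_type TEXT, release_count INTEGER, "
    "ia_count INTEGER, ia_frac REAL)"
)
# Hypothetical rows: a 1000-release journal at 10% coverage and a
# 10-release journal at 90% coverage.
conn.executemany(
    "INSERT INTO journal VALUES (?, ?, ?, ?)",
    [("toy", 1000, 100, 0.1), ("toy", 10, 9, 0.9)],
)

# Release-weighted fraction: 109 / 1010, dominated by the large journal.
weighted = conn.execute(
    "SELECT ROUND(1.0 * SUM(ia_count) / SUM(release_count), 2) "
    "FROM journal GROUP BY publisher_type"
).fetchone()[0]

# Per-journal average: (0.1 + 0.9) / 2, both journals count the same.
unweighted = conn.execute(
    "SELECT ROUND(1.0 * AVG(ia_frac), 2) "
    "FROM journal GROUP BY publisher_type"
).fetchone()[0]

print(weighted, unweighted)  # 0.11 0.5
```

The same effect is visible in the real tables, for example `unipress` at 0.24 (release-weighted) versus 0.42 (per-journal average).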
Number of journals with no releases (metadata or fulltext) in fatcat:
| publisher_type | journals_with_no_releases |
|---|---|
| | 21195 |
| longtail | 13194 |
| society | 1770 |
| commercial | 1626 |
| unipress | 1521 |
| big5 | 363 |
| oa | 85 |
| archive | 45 |
| other | 17 |
| repository | 10 |
QUERY: SELECT publisher_type, COUNT(*) AS journals_with_no_releases FROM journal WHERE release_count = 0 GROUP BY publisher_type ORDER BY COUNT(*) DESC;
Coverage by sherpa color:
| sherpa_color | ia_fulltext_count | release_count | total_ia_frac |
|---|---|---|---|
| | 5076210 | 26342117 | 0.19 |
| blue | 799381 | 2876891 | 0.28 |
| green | 7873174 | 41516320 | 0.19 |
| white | 483434 | 4632715 | 0.1 |
| yellow | 1803569 | 13337089 | 0.14 |
QUERY: SELECT sherpa_color, SUM(ia_count) as ia_fulltext_count, SUM(release_count) as release_count, ROUND(1.0 * SUM(ia_count) / SUM(release_count), 2) as total_ia_frac FROM journal GROUP BY sherpa_color;
Top publishers with very little IA coverage (NOTE: averaging fractions without weighting by journal size):
| publisher | journal_count | ROUND(avg(ia_frac),3) |
|---|---|---|
| | 13226 | 0.001 |
| Informa UK (Taylor & Francis) | 2081 | 0.019 |
| Elsevier | 2053 | 0.015 |
| SAGE Publications | 764 | 0.018 |
| Springer-Verlag | 761 | 0.018 |
| Wiley (Blackwell Publishing) | 641 | 0.02 |
| Wiley (John Wiley & Sons) | 593 | 0.017 |
| JSTOR | 295 | 0.005 |
| CAIRN | 280 | 0.012 |
| Medknow Publications | 280 | 0.008 |
QUERY: SELECT publisher, count(*) as journal_count, ROUND(avg(ia_frac),3) from journal where ia_frac < 0.05 group by publisher order by count(*) desc limit 10;
Journal counts by homepage status:
| any_homepage | any_live_homepage | any_gwb_homepage | COUNT(*) | frac |
|---|---|---|---|---|
| 0 | 0 | 0 | 65614 | 0.52 |
| 1 | 0 | 0 | 5434 | 0.04 |
| 1 | 0 | 1 | 4843 | 0.04 |
| 1 | 1 | 0 | 3624 | 0.03 |
| 1 | 1 | 1 | 47320 | 0.37 |
QUERY: SELECT any_homepage, any_live_homepage, any_gwb_homepage, COUNT(*), ROUND(1.0 * COUNT(*) / (SELECT COUNT(*) FROM journal), 2) AS frac FROM journal GROUP BY any_homepage, any_live_homepage, any_gwb_homepage;
Number of unique journals that have a homepage pointing to wayback or archive.org:
| COUNT(DISTINCT issnl) |
|---|
| 154 |
QUERY: SELECT COUNT(DISTINCT issnl) FROM homepage WHERE domain = 'archive.org';
Top publishers that have journals in wayback:
| publisher | COUNT(*) |
|---|---|
| | 63 |
| EDP Sciences | 11 |
| PERSEE Program | 3 |
| CAIRN | 2 |
| Fabula | 2 |
| Institut du monde et du développement pour la bonne gouvernance publique | 2 |
| ANPAD | 1 |
| Ad hoc (Rennes) | 1 |
| Asociación Revista Venezolana de Ciencia y Tecnología de Alimentos | 1 |
| Association Epiga | 1 |
QUERY: SELECT publisher, COUNT(*) FROM journal LEFT JOIN homepage ON journal.issnl = homepage.issnl WHERE homepage.domain = 'archive.org' GROUP BY journal.publisher ORDER BY COUNT(*) DESC LIMIT 10;
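Two details of the query above may be worth keeping in mind when reading the counts. The `WHERE homepage.domain = 'archive.org'` filter discards the unmatched rows a `LEFT JOIN` would otherwise keep, so the result is the same as with a plain `JOIN`. Also, `COUNT(*)` counts joined rows, so a journal with more than one archive.org homepage row would be counted more than once. The sketch below uses a hypothetical toy schema and rows to show both points; whether such duplicates exist in the real data is not stated in this report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE journal (issnl TEXT, publisher TEXT)")
conn.execute("CREATE TABLE homepage (issnl TEXT, domain TEXT)")
# Hypothetical data: J1 has two archive.org homepages, J3 has none.
conn.executemany(
    "INSERT INTO journal VALUES (?, ?)",
    [("J1", "pubA"), ("J2", "pubA"), ("J3", "pubB")],
)
conn.executemany(
    "INSERT INTO homepage VALUES (?, ?)",
    [("J1", "archive.org"), ("J1", "archive.org"), ("J2", "example.org")],
)

template = (
    "SELECT publisher, {count} FROM journal {join} homepage "
    "ON journal.issnl = homepage.issnl "
    "WHERE homepage.domain = 'archive.org' "
    "GROUP BY journal.publisher ORDER BY 2 DESC LIMIT 10"
)
left_rows = conn.execute(
    template.format(count="COUNT(*)", join="LEFT JOIN")
).fetchall()
inner_rows = conn.execute(
    template.format(count="COUNT(*)", join="JOIN")
).fetchall()
distinct_rows = conn.execute(
    template.format(count="COUNT(DISTINCT journal.issnl)", join="JOIN")
).fetchall()

print(left_rows)      # [('pubA', 2)]
print(inner_rows)     # [('pubA', 2)]
print(distinct_rows)  # [('pubA', 1)]
```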
Homepage URL counts:
| rows | issnls | surts |
|---|---|---|
| 83909 | 61221 | 82678 |
QUERY: SELECT COUNT(*) as rows, COUNT(DISTINCT issnl) as issnls, COUNT(DISTINCT surt) as surts FROM homepage;
Journals with most unique SURTs:
| issnl | COUNT(*) |
|---|---|
| 0717-3458 | 6 |
| 1406-4243 | 6 |
| 2190-5991 | 6 |
| 0011-6793 | 5 |
| 0022-9830 | 5 |
| 0091-6765 | 5 |
| 0102-7638 | 5 |
| 0144-8463 | 5 |
| 0212-6567 | 5 |
| 0350-154X | 5 |
QUERY: SELECT issnl, COUNT(*) from homepage GROUP BY issnl ORDER BY COUNT(*) DESC LIMIT 10;
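The heading refers to unique SURTs, but the query above counts homepage rows per ISSN-L. The two agree only if no ISSN-L has the same SURT stored twice, which this report does not state either way. A sketch on a hypothetical toy table shows how `COUNT(DISTINCT surt)` would differ if duplicates were present:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE homepage (issnl TEXT, surt TEXT)")
# Hypothetical rows: journal "A" lists the same SURT twice.
conn.executemany(
    "INSERT INTO homepage VALUES (?, ?)",
    [
        ("A", "org,example)/a"),
        ("A", "org,example)/a"),
        ("A", "org,example)/b"),
        ("B", "org,example)/c"),
    ],
)

surt_counts = conn.execute(
    "SELECT issnl, COUNT(*), COUNT(DISTINCT surt) FROM homepage "
    "GROUP BY issnl ORDER BY COUNT(*) DESC LIMIT 10"
).fetchall()
print(surt_counts)  # [('A', 3, 2), ('B', 1, 1)]
```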
Blocked domains:
| domain | count(*) | sum(blocked) |
|---|---|---|
| jstor.org | 3235 | 3234 |
| brill.nl | 216 | 161 |
| wiley.com | 2372 | 152 |
| bentham.org | 146 | 146 |
| tandfonline.com | 2919 | 84 |
| cairn.info | 52 | 49 |
| emeraldgrouppublishing.com | 49 | 49 |
| emeraldinsight.com | 390 | 16 |
| rodopi.nl | 19 | 15 |
| sagepub.com | 1863 | 9 |
| vsppub.com | 9 | 9 |
| iaster.com | 7 | 7 |
| mohr.de | 15 | 7 |
| scienceq.org | 7 | 7 |
| uctjournals.com | 7 | 7 |
| elsevier.com | 2746 | 6 |
| esaunggul.ac.id | 6 | 6 |
| bloomsbury.com | 4 | 4 |
| gov.hu | 8 | 4 |
| inap.es | 4 | 4 |
QUERY: SELECT domain, count(*), sum(blocked) from homepage group by domain order by sum(blocked) desc limit 20;
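Summing `blocked` yields a count of blocked homepages only if the column holds 0 or 1 per row, which is assumed here rather than confirmed by this report. Under that assumption, a toy sketch of the same aggregation (domains reused as labels, values made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE homepage (domain TEXT, blocked INTEGER)")
# Hypothetical rows; blocked is assumed to be a 0/1 flag.
conn.executemany(
    "INSERT INTO homepage VALUES (?, ?)",
    [
        ("jstor.org", 1),
        ("jstor.org", 1),
        ("jstor.org", 0),
        ("wiley.com", 0),
        ("wiley.com", 1),
        ("example.org", 0),
    ],
)

blocked_by_domain = conn.execute(
    "SELECT domain, count(*), sum(blocked) FROM homepage "
    "GROUP BY domain ORDER BY sum(blocked) DESC LIMIT 20"
).fetchall()
for row in blocked_by_domain:
    print(row)
```

Comparing the second and third columns gives the share of a domain's homepages that were blocked, as with jstor.org (3234 of 3235) versus elsevier.com (6 of 2746) in the table above.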
Top duplicated URLs and SURTs:
QUERY: SELECT url, COUNT(*) FROM homepage GROUP BY url ORDER BY COUNT(*) DESC LIMIT 10;
Top terminal URLs catch cases where many URLs redirect to a single page:
QUERY: SELECT terminal_url, COUNT(DISTINCT issnl) FROM homepage WHERE terminal_url IS NOT NULL GROUP BY terminal_url ORDER BY COUNT(DISTINCT issnl) DESC LIMIT 20;
| surt | COUNT(*) |
|---|---|
| org,rsc,pubs)/en/ebooks | 47 |
| com,benjamins)/ | 27 |
| ro,ubbcluj,studia)/serii/index_en.html | 22 |
| id,ac,unimed,jurnal)/ | 12 |
| org,ecorfan)/bolivia/research_journals.php | 9 |
| it,minervamedica)/index2.t | 8 |
| ch,gesundheitsfoerderung)/ueber-uns/downloads.html | 6 |
| com,inderscience)/browse/index.php | 6 |
| nl,iospress)/ | 6 |
| com,inderscience)/ | 5 |
QUERY: SELECT surt, COUNT(*) FROM homepage GROUP BY surt ORDER BY COUNT(*) DESC LIMIT 10;