Diffstat (limited to 'python/notes/version_4.md')
-rw-r--r-- python/notes/version_4.md 33
1 files changed, 33 insertions, 0 deletions
diff --git a/python/notes/version_4.md b/python/notes/version_4.md
index b669a58..97811e7 100644
--- a/python/notes/version_4.md
+++ b/python/notes/version_4.md
@@ -850,3 +850,36 @@ igyewr6er5epfozhk7dyfqa5tu igyewr6er5epfozhk7dyfqa5tu exact doi
* 740,248,530 unique edges
+----
+
+# Stats
+
+```
+553414112 exact doi
+75738037 strong jaccardauthors
+66257136 exact pmid
+19646986 strong slugtitleauthormatch
+17202451 strong tokenizedauthors
+3730080 exact arxiv
+2798816 exact titleauthormatch
+482811 strong versioneddoi
+303336 strong pmiddoipair
+279212 exact isbn
+240405 exact workid
+52678 strong customieeearxiv
+43797 strong dataciterelatedid
+29027 strong arxivversion
+27150 exact pmcid
+1652 strong figshareversion
+832 strong titleartifact
+10 strong custombsiundated
+2 strong custombsisubdoc
+```
+
+* total unique edges: 740248530
+* matches by id: 623707690
+* matches through title/author (fuzzy) matching: 116540840
+* scholarly resources:
+* linked open library titles:
+* URLs extracted from corpus:
+* sample ratio IA/URL from corpus:
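
The three totals in the added list can be re-derived from the per-reason table. The sketch below is not part of the patch. It assumes that "matches by id" means the reasons `doi`, `pmid`, `arxiv`, `isbn` and `pmcid`; that grouping is inferred only from the fact that those five rows sum to the stated figure, and the names `STATS`, `ID_REASONS` and `summarize` are hypothetical.

```python
from collections import Counter

# Rows copied verbatim from the stats block in the diff: count, status, reason.
STATS = """\
553414112 exact doi
75738037 strong jaccardauthors
66257136 exact pmid
19646986 strong slugtitleauthormatch
17202451 strong tokenizedauthors
3730080 exact arxiv
2798816 exact titleauthormatch
482811 strong versioneddoi
303336 strong pmiddoipair
279212 exact isbn
240405 exact workid
52678 strong customieeearxiv
43797 strong dataciterelatedid
29027 strong arxivversion
27150 exact pmcid
1652 strong figshareversion
832 strong titleartifact
10 strong custombsiundated
2 strong custombsisubdoc
"""

# Assumption: these reasons make up the "matches by id" bucket.
ID_REASONS = {"doi", "pmid", "arxiv", "isbn", "pmcid"}


def summarize(text):
    """Sum edge counts into 'id', 'fuzzy' and 'total' buckets."""
    totals = Counter()
    for line in text.splitlines():
        count, _status, reason = line.split()
        bucket = "id" if reason in ID_REASONS else "fuzzy"
        totals[bucket] += int(count)
        totals["total"] += int(count)
    return totals


summary = summarize(STATS)
print(summary["total"], summary["id"], summary["fuzzy"])
```

Under that assumption the buckets come out to 740248530, 623707690 and 116540840, matching the list. Note that `workid`, although marked `exact`, has to fall into the title/author bucket for the figures to agree.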