author: Martin Czygan <martin.czygan@gmail.com> 2021-06-25 16:34:13 +0200
committer: Martin Czygan <martin.czygan@gmail.com> 2021-06-25 16:34:13 +0200
commit: f4442f600b3f66704063ac91ce2769fa250751c9 (patch)
tree: 6c02783f6daa1299f36e1436cb1ea6621b8d7489
parent: 76fe5414ccc51a2ab7f02f7b55c3251760f5f934 (diff)
docs: add stats
-rw-r--r-- python/notes/version_4.md | 33
1 file changed, 33 insertions, 0 deletions
diff --git a/python/notes/version_4.md b/python/notes/version_4.md
index b669a58..97811e7 100644
--- a/python/notes/version_4.md
+++ b/python/notes/version_4.md
@@ -850,3 +850,36 @@ igyewr6er5epfozhk7dyfqa5tu igyewr6er5epfozhk7dyfqa5tu exact doi
* 740,248,530 unique edges
+----
+
+# Stats
+
+```
+553414112 exact doi
+75738037 strong jaccardauthors
+66257136 exact pmid
+19646986 strong slugtitleauthormatch
+17202451 strong tokenizedauthors
+3730080 exact arxiv
+2798816 exact titleauthormatch
+482811 strong versioneddoi
+303336 strong pmiddoipair
+279212 exact isbn
+240405 exact workid
+52678 strong customieeearxiv
+43797 strong dataciterelatedid
+29027 strong arxivversion
+27150 exact pmcid
+1652 strong figshareversion
+832 strong titleartifact
+10 strong custombsiundated
+2 strong custombsisubdoc
+```
+
+* total unique edges: 740248530
+* matches by id: 623707690
+* matches through title/author (fuzzy) matching: 116540840
+* scholarly resources:
+* linked open library titles:
+* URLs extracted from corpus:
+* sample ratio IA/URL from corpus:
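
The summary figures added in this patch can be cross-checked against the per-reason counts. The following is a minimal sketch, not part of the commit: the names `RAW`, `ID_REASONS`, and `parse` are hypothetical, and the choice of which reasons count as "matches by id" (doi, pmid, arxiv, isbn, pmcid) is an assumption inferred from the numbers, not something the notes state.

```python
# Counts copied verbatim from the "Stats" block added in this patch.
RAW = """\
553414112 exact doi
75738037 strong jaccardauthors
66257136 exact pmid
19646986 strong slugtitleauthormatch
17202451 strong tokenizedauthors
3730080 exact arxiv
2798816 exact titleauthormatch
482811 strong versioneddoi
303336 strong pmiddoipair
279212 exact isbn
240405 exact workid
52678 strong customieeearxiv
43797 strong dataciterelatedid
29027 strong arxivversion
27150 exact pmcid
1652 strong figshareversion
832 strong titleartifact
10 strong custombsiundated
2 strong custombsisubdoc
"""

# Assumption: these five reasons make up "matches by id".
ID_REASONS = {"doi", "pmid", "arxiv", "isbn", "pmcid"}


def parse(raw):
    """Turn 'count status reason' lines into (int, str, str) tuples."""
    rows = []
    for line in raw.splitlines():
        count, status, reason = line.split()
        rows.append((int(count), status, reason))
    return rows


rows = parse(RAW)
total = sum(count for count, _, _ in rows)
by_id = sum(count for count, _, reason in rows if reason in ID_REASONS)
fuzzy = total - by_id  # everything not matched by one of the id reasons

print(f"total unique edges: {total}")
print(f"matches by id:      {by_id}")
print(f"fuzzy matches:      {fuzzy}")
```

Under that grouping, the three sums agree with the totals listed in the bullet points; `workid` and `versioneddoi` fall on the fuzzy side, which may or may not reflect how the figures were originally derived.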