author Martin Czygan <martin.czygan@gmail.com> 2021-04-22 23:05:28 +0200
committer Martin Czygan <martin.czygan@gmail.com> 2021-04-22 23:05:28 +0200
commit 5739027a0ee1982474fdbc7ff00d5ee52a497caa
tree 57dc306b4682dbad8948d25815644cb8f71d74f8 /extra/cdx/README.md
parent 1dd4b22d20d466ff896ad28bc57e824a9eb10c15
add 10k cdx sample
Diffstat (limited to 'extra/cdx/README.md')
-rw-r--r-- extra/cdx/README.md 6
1 files changed, 6 insertions, 0 deletions
diff --git a/extra/cdx/README.md b/extra/cdx/README.md
new file mode 100644
index 0000000..8c95021
--- /dev/null
+++ b/extra/cdx/README.md
@@ -0,0 +1,6 @@
+# Sample CDX Links
+
+Given a sample of outbound web links from publications, determine the number
+of URLs we may have. We currently find about 44368911 URLs in the refs.
+
+Limit to 10000 links.
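
The commit does not say how the 10000 links were drawn from the roughly 44 million URLs. One possible approach, shown here only as a sketch, is reservoir sampling (Algorithm R) over a stream with one URL per line. The function name `sample_lines` and its parameters `k` and `seed` are hypothetical and not part of refcat.

```python
import random
import sys
from typing import Iterable, List, Optional


def sample_lines(
    lines: Iterable[str], k: int = 10000, seed: Optional[int] = None
) -> List[str]:
    """Return up to k items drawn uniformly at random from lines.

    Hypothetical helper, not taken from the repository. It uses
    Algorithm R, so the input is read once and only k items are
    held in memory at any time.
    """
    rng = random.Random(seed)
    reservoir: List[str] = []
    for i, line in enumerate(lines):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(line)
        else:
            # Pick a slot in [0, i]; item i survives with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = line
    return reservoir


if __name__ == "__main__":
    # Assumed input: one URL per line on stdin.
    for url in sample_lines(line.rstrip("\n") for line in sys.stdin):
        print(url)
```

Because the input is consumed as a stream, the full URL list never has to fit in memory. If the input has fewer than `k` lines, all of them are returned in their original order.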