author Bryan Newbold <bnewbold@robocracy.org> 2021-11-10 12:33:36 -0800
committer Bryan Newbold <bnewbold@robocracy.org> 2021-11-10 17:08:07 -0800
commit b6d228b7171252c8f9f70194c09aba0ed0c55567
tree 2e3e73b531b29858556ed3b51d3034d99288c212
parent 0a36276cc201ca7d4b3d2f491648c71255de21e3
update crawlability docs
-rw-r--r-- proposals/2021-04-02_crawlability.md | 10
1 files changed, 9 insertions, 1 deletions
diff --git a/proposals/2021-04-02_crawlability.md b/proposals/2021-04-02_crawlability.md
index 6b9ef66c..ee9f3c5b 100644
--- a/proposals/2021-04-02_crawlability.md
+++ b/proposals/2021-04-02_crawlability.md
@@ -1,9 +1,16 @@
-status: wip
+status: not-implemented
Crawlability Improvements
--------------------------
+NOTE: After some back and forth on this topic, we have decided for now to focus
+on having scholar.archive.org indexed, not fatcat.wiki. This proposal document
+is being kept as documentation of that decision.
+
+
+## Original Intro
+
We are interested in making the fatcat corpus more crawlable/indexable by
aggregators and academic search engines. For example, CiteseerX, Google
Scholar, or Microsoft Academic (which themselves get used by other projects).
@@ -13,6 +20,7 @@ Some open questions:
- is the web.archive.org iframe for PDFs ok, or should we redirect to PDFs with `id_` in the datetime?
+
## Redirect URLs and `citation_pdf_url`
We suspect that some crawlers do not like that fatcat.wiki landing pages have