author: Bryan Newbold <bnewbold@archive.org> 2021-06-22 15:37:46 -0700
committer: Bryan Newbold <bnewbold@archive.org> 2021-06-22 15:37:46 -0700
commit: 0d2d8ae4a020c6d34c65bbc5de8513f04b5d722a
tree: d809661cd3d3aeef972155ba6b356341a58fcde1 /extra/sitemap/README.md
parent: d1c79a4abdfbbec8d84501c8bed8b5084e1285f6
sitemaps: change filters; only primary release fulltext (via jq); scp to replica
Diffstat (limited to 'extra/sitemap/README.md')
-rw-r--r-- extra/sitemap/README.md | 4
1 files changed, 4 insertions, 0 deletions
diff --git a/extra/sitemap/README.md b/extra/sitemap/README.md
index 53f9518..61ade0f 100644
--- a/extra/sitemap/README.md
+++ b/extra/sitemap/README.md
@@ -9,6 +9,10 @@ installed. Run these commands on a production machine.
/srv/fatcat_scholar/src/extra/sitemap/access_urls_query.sh
/srv/fatcat_scholar/src/extra/sitemap/generate_sitemap_indices.py
+Then copy to alternate/replica machine:
+
+    scp *.txt *.xml $SCHOLARREPLICAHOST:/srv/fatcat_scholar/sitemap
+
## Background
Google has a limit of 50k lines / 10 MByte for text sitemap files, and 50K