blob: 8f87d5c6ea40be17b72c1a146bb176b0b5d17c04 (
plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
|
## 2018-06-27 Measurements
fatcat_prod=# select count(*) from release_ident; 20983019
fatcat_prod=# select count(*) from work_ident; 20988140
fatcat_prod=# select count(*) from file_ident; 1482335
fatcat_prod=# select count(*) from creator_ident; 4167419
fatcat_prod=# select count(*) from container_ident; 61793
select count(*) from release_contrib; 59798133
bnewbold@wbgrp-svc500$ sudo du -sh /var/lib/postgresql/
43G
running import-crossref with 20 threads, and manifest importer with one (at 33%
complete). had already imported ~7million works+releases previously.
PostgreSQL 10.4 - wbgrp-svc500.us.archive.org - postgres@localhost:5432/postgre
Size: 41.38G - 323.40K/s | TPS: 885
Mem.: 50.80% - 23.86G/49.14G | IO Max: 79539/s
Swap: 0.80% - 408.89M/50.00G | Read : 67.04K/s - 16/s
Load: 6.69 7.41 7.69 | Write: 1.93M/s - 493/s
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read writ| recv send| in out | int csw
32 6 62 0 0 0| 296k 3880k| 334k 3144B| 0 0 | 21k 65k
31 6 62 0 0 0| 0 3072k| 391k 318B| 0 0 | 51k 141k
31 6 63 0 0 0| 16k 1080k| 344k 1988B| 0 0 | 35k 104k
29 6 65 0 0 0| 136k 2608k| 175k 332B| 0 0 |9835 15k
28 5 67 0 0 0| 408k 4368k| 285k 832B| 0 0 | 14k 17k
33 5 62 0 0 0| 56k 3256k| 219k 99B| 0 0 | 22k 49k
31 6 63 0 0 0| 188k 5120k| 158k 318B| 0 0 | 17k 29k
30 6 64 0 0 0| 200k 6984k| 239k 988B| 0 0 | 16k 24k
30 6 64 0 0 0| 168k 5504k| 159k 152B| 0 0 | 14k 20k
28 7 65 0 0 0| 440k 12M| 236k 420B| 0 0 | 15k 18k
29 6 65 0 0 0| 428k 6968k| 352k 310B| 0 0 | 19k 31k
32 6 62 0 0 0| 64k 3480k| 288k 318B| 0 0 | 18k 55k
32 6 62 0 0 0| 32k 2080k| 155k 318B| 0 0 | 20k 52k
bnewbold@wbgrp-svc500$ uptime
22:00:42 up 28 days, 22:31, 6 users, load average: 7.94, 7.56, 7.72
2018-06-27 21:57:36.102 UTC [401] LOG: checkpoints are occurring too frequently (13 seconds apart)
2018-06-27 21:57:36.102 UTC [401] HINT: Consider increasing the configuration parameter "max_wal_size".
relname | too_much_seq | case | rel_size | seq_scan | idx_scan
-----------------+--------------+----------------+-------------+----------+-----------
changelog | 1274670 | Missing Index? | 39411712 | 1274670 | 0
file_edit | 612386 | Missing Index? | 108298240 | 612386 | 0
creator_edit | 612386 | Missing Index? | 285540352 | 612386 | 0
container_edit | 612386 | Missing Index? | 4784128 | 612386 | 0
release_edit | 612386 | Missing Index? | 1454489600 | 612386 | 0
work_edit | 612386 | Missing Index? | 1454415872 | 612386 | 0
release_contrib | 296675 | Missing Index? | 4725645312 | 296675 | 0
release_ref | 296663 | Missing Index? | 8999837696 | 296663 | 0
file_release | -113 | OK | 13918208 | 110 | 223
container_rev | -979326 | OK | 16285696 | 63 | 979389
file_ident | -3671516 | OK | 109002752 | 362 | 3671878
file_rev | -3944155 | OK | 302940160 | 95 | 3944250
creator_rev | -8420205 | OK | 318283776 | 1226 | 8421431
creator_ident | -9525338 | OK | 309141504 | 52330 | 9577668
container_ident | -20581876 | OK | 4833280 | 272457 | 20854333
release_ident | -40548858 | OK | 1440948224 | 4160919 | 44709777
work_rev | -42534913 | OK | 1124671488 | 1161 | 42536074
editgroup | -48864662 | OK | 34136064 | 1 | 48864663
work_ident | -65008911 | OK | 1503313920 | 1239 | 65010150
release_rev | -185735794 | OK | 13649428480 | 128 | 185735922
## Setup
Add to postgres.conf:
shared_preload_libraries = 'auto_explain,pg_stat_statements'
# Increase the max size of the query strings Postgres records
track_activity_query_size = 2048
# Track statements generated by stored procedures as well
pg_stat_statements.track = all
Also:
track_counts (already default)
autovacuum (already default?)
log_min_error = warning
log_min_duration_statement = 5000
Then from shell:
create extension pg_stat_statements;
Regularly want to run:
VACUUM ANALYZE
## Tuning Values
postgres config:
max_connections = 100 (default)
shared_buffers = 128MB -> 10GB (while elastic on same machine; later 16 or more)
effective_cache_size = 4GB -> 24GB (while elastic on same machine)
work_mem = 4MB -> 128MB # relatively few connections/box
fsync = on
commit_delay = ??? (and siblings)
random_page_cost = 1 (for SSD)
default_statistics_target = 100 -> 200
maintenance_work_mem = 64MB -> 8GB
synchronous_commit = off (during dev only! switch to on for production!)
wal_sync_method (keep default)
max_wal_size = 64 -> 128 (based on above HINT message)
# didn't mess with commit_delay/commit_siblings
system:
sysctl -w vm.overcommit_memory=2
TODO: ulimit -n 65536
TODO: ulimit -p 800
LimitNOFILE
/lib/systemd/system/postgresql.service
## Resources
https://www.geekytidbits.com/performance-tuning-postgres/
Could try pgbadger to handle auto_explain type output.
https://www.postgresql.org/docs/10/static/runtime-config-wal.html
IA-specific resources:
https://git.archive.org/ia/mint/blob/master/postgres/postgres_9_2.yml
https://git.archive.org/ia/mint/blob/master/postgres/put_datadir_on_ssd.sh
https://git.archive.org/ia/mint/blob/master/postgres/templates/postgresql.conf.j2
|