Installing PostgreSQL on Kubernetes

This article covers deploying PostgreSQL with a Deployment, Service, PV, and PVC. The single-instance setup is for learning; for production, a cluster deployment with high availability is recommended.

Single instance

The single-instance deployment is for learning and practice only; avoid it in production.

This deployment uses a PV for persistent storage and injects configuration through a ConfigMap.

cm.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-config
  labels:
    app: postgres
data:
  POSTGRES_DB: postgresdb
  POSTGRES_USER: postgresadmin
  POSTGRES_PASSWORD: admin12345

pv.yaml

kind: PersistentVolume
apiVersion: v1
metadata:
  name: postgres-pv-volume
  labels:
    type: local
    app: postgres
spec:
  storageClassName: manual
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  hostPath:
    path: "/mnt/data"
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: postgres-pv-claim
  labels:
    app: postgres
spec:
  storageClassName: manual
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi

deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres-deployment
spec:
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: postgres
  replicas: 1
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:11.7
          imagePullPolicy: "IfNotPresent"
          ports:
            - containerPort: 5432
          envFrom:
            - configMapRef:
                name: postgres-config
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: postgredb
      volumes:
        - name: postgredb
          persistentVolumeClaim:
            claimName: postgres-pv-claim

svc.yaml

apiVersion: v1
kind: Service
metadata:
  name: postgres-service
  labels:
    app: postgres
spec:
  type: NodePort
  ports:
    - port: 5432
      targetPort: 5432
      protocol: TCP
  selector:
    app: postgres
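Because no nodePort is specified, Kubernetes assigns a random port from the NodePort range (the GORM connection string later in this article uses the port that happened to be assigned, 32759). If you want a stable port, you can pin it explicitly; a sketch, where 32432 is a hypothetical value inside the default 30000-32767 NodePort range:

```yaml
# svc.yaml (excerpt): pin the NodePort instead of letting Kubernetes choose one.
spec:
  type: NodePort
  ports:
    - port: 5432
      targetPort: 5432
      nodePort: 32432   # hypothetical fixed port; must be within the NodePort range
      protocol: TCP
```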

Startup

kubectl apply -f cm.yaml
kubectl apply -f pv.yaml
kubectl apply -f deployment.yaml
kubectl apply -f svc.yaml

Connecting from Go with GORM

Official documentation

Imports

"gorm.io/driver/postgres"
"gorm.io/gorm"

Connection

dsn := "host=192.168.41.249 user=postgresadmin password=admin12345 dbname=gorm port=32759 sslmode=disable TimeZone=Asia/Shanghai"
db, err := gorm.Open(postgres.Open(dsn), &gorm.Config{})

High availability

Logical replication

Starting with PostgreSQL 10 (hereafter PG), PG supports logical replication, which can replicate only selected tables or selected databases on a PG server. A major advantage of logical replication is that it works across versions and does not require the primary and standby to share the same operating system or hardware architecture. For example, you can replicate between PG 11 on a Linux server and PG 10 on a Windows server, and logical replication also makes it possible to upgrade the database version without downtime.
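As a sketch of what this looks like in practice (table, names, hosts, and credentials below are hypothetical, and the publisher must run with wal_level = logical):

```sql
-- On the publisher (source) server: publish a single table.
CREATE PUBLICATION my_pub FOR TABLE my_table;

-- On the subscriber (target) server: the table must already exist there;
-- the subscription then copies the initial data and streams changes.
CREATE SUBSCRIPTION my_sub
    CONNECTION 'host=publisher-host port=5432 dbname=mydb user=repluser password=secret'
    PUBLICATION my_pub;
```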

Cluster

There are many cluster high-availability options; here we use stolon on Kubernetes.

Official documentation: https://github.com/sorintlab/stolon

Features

  • Leverages PostgreSQL streaming replication.
  • Resilient to any kind of partitioning. It prefers consistency over availability while trying to keep the maximum availability possible.
  • Kubernetes integration for PostgreSQL high availability.
  • Uses a cluster store such as etcd, consul, or the Kubernetes API server for high-availability data storage and leader election.
  • Asynchronous (default) and synchronous replication.
  • Full cluster setup in minutes.
  • Easy and simple cluster administration.
  • Can integrate with your preferred backup/restore tool for point-in-time recovery.
  • Standby cluster (for multi-site replication and near-zero-downtime migration).
  • Automatic service discovery and dynamic reconfiguration (handles postgres and stolon processes changing their addresses).
  • Can use pg_rewind for fast instance resynchronization with the current master.

Stolon architecture

[Figure: Stolon architecture]

Stolon is made up of 3 main components:

  • keeper: manages a PostgreSQL instance and converges it to the clusterview computed by the sentinel(s).
  • sentinel: discovers and monitors keepers and computes the optimal clusterview.
  • proxy: the client access point. It forces connections to the current PostgreSQL master and fences off connections to non-elected masters.

Stolon typically uses etcd or consul as its main cluster state store (in this article the Kubernetes API server is used instead).

Deployment guide:

https://github.com/sorintlab/stolon/blob/master/examples/kubernetes/README.md

Here we run 2 keeper nodes, 1 proxy node, and 1 sentinel node:

base-stolon-keeper-0                    1/1     Running     1          19h
base-stolon-keeper-1                    1/1     Running     0          11m
base-stolon-proxy-797755db58-kglzf      1/1     Running     0          19h
base-stolon-sentinel-5c969874d4-shsmm   1/1     Running     0          19h

Since the keepers persist data on local storage, the high-availability tests focus on whether data remains accessible when a keeper's local data becomes unreachable. On Kubernetes, the sentinel and proxy are stateless services: if their node goes down, Kubernetes reschedules them onto other nodes, so the HA tests center on the keepers. In larger clusters, running multiple proxies and sentinels improves access performance and cluster-management performance.

The store backend is the Kubernetes API server:

stolonctl --cluster-name stolon-cluster --store-backend=kubernetes --kube-resource-kind=configmap init

Exec into the proxy pod and connect to the database from the command line:

psql -h 127.0.0.1 -d postgres -U datatom

Create a test database, a test table, and test data:

-- Create the database
create database mitaka;
-- Create the table
create table people
(
    id        integer,
    name      varchar(32),
    age       integer,
    grade     numeric(4, 2),
    birthday  date,
    logintime timestamp
);

-- Insert 1,000,000 rows
insert into people
select generate_series(1, 1000000) as id,
       md5(random()::text) as name,
       (random() * 100)::integer as age,
       (random() * 99)::numeric(4, 2) as grade,
       now() - ((random() * 1000)::integer || ' day')::interval as birthday,
       clock_timestamp() as logintime;

-- Query the data
SELECT * FROM people LIMIT 100;
SELECT COUNT(*) FROM people;

Inspecting the primary

pg_stat_replication is a view for monitoring PG streaming replication. Each row in this system view represents exactly one standby, so you can see who is connected and what it is doing. pg_stat_replication is also a good way to check whether a standby is connected at all.

mitaka=# -- turn on expanded display
mitaka=# \x
Expanded display is on.
mitaka=# -- on the primary, inspect the WAL send status
select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 3554       -- process ID of the wal_sender process serving this streaming connection, e.g. "postgres: walsender repluser 10.0.0.69(44922) streaming 60/85928000"
usesysid         | 16384      -- every internal user has a unique number, much like on UNIX; usesysid is the unique identifier of the (PostgreSQL) user connected to the system
usename          | repluser   -- (not "username"; note the missing r) the name behind usesysid, i.e. what the client put in the connection string
application_name | stolon_88a40937 -- the usual setting for synchronous replication; it can be passed to the master via the connection string
client_addr      | 10.0.0.249 -- where the streaming connection comes from: the client's IP address
client_hostname  |            -- the client can also be identified by hostname instead of IP; enable reverse DNS lookups via log_hostname in postgresql.conf on the master
client_port      | 32952      -- the TCP port the client uses to talk to the wal sender; -1 if a local UNIX socket is used
backend_start    | 2022-11-03 02:59:40.635655+00 -- when the standby created the streaming connection
backend_xmin     |
state            | streaming  -- the state of this connection; if things go as planned it should say "streaming"
sent_lsn         | 0/2F2EF898 -- last transaction log position sent on this connection: how much WAL has gone over the network?
write_lsn        | 0/2F2EF898 -- last transaction log position written to disk on the standby: how much WAL has been handed to the OS? (not yet flushed)
flush_lsn        | 0/2F2EF898 -- last position flushed on the standby (note the difference between write and flush: a write does not imply a flush): how much WAL has been flushed to disk?
replay_lsn       | 0/2F2EF898 -- last transaction log position replayed on the standby: how much WAL has been replayed and is therefore visible to queries?
write_lag        |
flush_lag        |
replay_lag       |
sync_priority    | 0          -- only relevant for synchronous replication; each sync candidate is assigned a priority, and sync_priority tells you which one it got
sync_state       | async      -- finally, the standby's state: async, sync, or potential; PostgreSQL marks a standby "potential" when a sync standby with higher priority exists

People often say the pg_stat_replication view belongs on the primary side only; that is not correct. The view exposes information about wal sender processes. In other words: in a cascading replication setup, a secondary that replicates to further standbys will also show entries in its own pg_stat_replication.

Connections are only possible through the proxy, and only to the master.

On the master, the standby-side WAL receive status view is empty:

mitaka=# select * from pg_stat_wal_receiver;
(0 rows)

Getting the cluster status from the command line

# stolonctl --cluster-name=base-stolon --store-backend=kubernetes --kube-resource-kind=configmap status
=== Active sentinels ===

ID        LEADER
f17b4ca7  true

=== Active proxies ===

ID
56b6fe5f

=== Keepers ===

UID      HEALTHY  PG LISTENADDRESS  PG HEALTHY  PG WANTEDGENERATION  PG CURRENTGENERATION
keeper0  true     10.0.0.172:5432   true        5                    5    # keeper0 healthy
keeper1  true     10.0.0.249:5432   true        2                    2    # keeper1 healthy

=== Cluster Info ===

Master Keeper: keeper0    # keeper0 is the master

===== Keepers/DB tree =====

keeper0 (master)
└─keeper1    # keeper1 is a standby

Cluster data

stolonctl --cluster-name=base-stolon --store-backend=kubernetes --kube-resource-kind=configmap clusterdata read
{
  "formatVersion":1,
  "changeTime":"2022-11-03T04:02:39.93375702Z",
  "cluster":{
    "uid":"ed2b4c84",
    "generation":1,
    "changeTime":"2022-11-03T02:50:39.625209782Z",
    "spec":{
      "additionalWalSenders":null,
      "additionalMasterReplicationSlots":null,
      "initMode":"new",
      "pgParameters":{
        "datestyle":"iso, mdy",
        "default_text_search_config":"pg_catalog.english",
        "dynamic_shared_memory_type":"posix",
        "lc_messages":"en_US.utf8",
        "lc_monetary":"en_US.utf8",
        "lc_numeric":"en_US.utf8",
        "lc_time":"en_US.utf8",
        "log_timezone":"Etc/UTC",
        "max_connections":"2000",
        "max_wal_size":"1GB",
        "min_wal_size":"80MB",
        "shared_buffers":"128MB",
        "ssl":"off",
        "timezone":"Etc/UTC",
        "wal_level":"replica"
      },
      "pgHBA":null,
      "automaticPgRestart":null
    },
    "status":{
      "phase":"normal",
      "master":"45d5ba5a"
    }
  },
  "keepers":{
    "keeper0":{
      "uid":"keeper0",
      "generation":1,
      "changeTime":"2022-11-03T04:02:39.933914846Z",
      "spec":{},
      "status":{
        "healthy":true,
        "lastHealthyTime":"2022-11-03T04:02:39.929794259Z",
        "bootUUID":"830de9a1-88a3-49ca-8f6d-ef30fc1de515",
        "postgresBinaryVersion":{
          "Maj":11,
          "Min":9
        }
      }
    },
    "keeper1":{
      "uid":"keeper1",
      "generation":1,
      "changeTime":"2022-11-03T04:02:39.933918792Z",
      "spec":{},
      "status":{
        "healthy":true,
        "lastHealthyTime":"2022-11-03T04:02:39.929794842Z",
        "bootUUID":"aebf6d0a-4087-403a-827b-e13516092982",
        "postgresBinaryVersion":{
          "Maj":11,
          "Min":9
        }
      }
    }
  },
  "dbs":{
    "45d5ba5a":{
      "uid":"45d5ba5a",
      "generation":5,
      "changeTime":"2022-11-03T04:02:39.933855426Z",
      "spec":{
        "keeperUID":"keeper0",
        "requestTimeout":"10s",
        "maxStandbys":20,
        "additionalWalSenders":5,
        "additionalReplicationSlots":null,
        "initMode":"none",
        "pgParameters":{
          "datestyle":"iso, mdy",
          "default_text_search_config":"pg_catalog.english",
          "dynamic_shared_memory_type":"posix",
          "lc_messages":"en_US.utf8",
          "lc_monetary":"en_US.utf8",
          "lc_numeric":"en_US.utf8",
          "lc_time":"en_US.utf8",
          "log_timezone":"Etc/UTC",
          "max_connections":"2000",
          "max_wal_size":"1GB",
          "min_wal_size":"80MB",
          "shared_buffers":"128MB",
          "ssl":"off",
          "timezone":"Etc/UTC",
          "wal_level":"replica"
        },
        "pgHBA":null,
        "role":"master",
        "followers":[
          "88a40937"
        ],
        "synchronousStandbys":null,
        "externalSynchronousStandbys":null
      },
      "status":{
        "healthy":true,
        "currentGeneration":5,
        "listenAddress":"10.0.0.172",
        "port":"5432",
        "systemdID":"7161314937790898225",
        "timelineID":2,
        "xLogPos":791607672,
        "timelinesHistory":[
          {
            "timelineID":1,
            "switchPoint":651337000,
            "reason":"no recovery target specified"
          }
        ],
        "pgParameters":{
          "datestyle":"iso, mdy",
          "default_text_search_config":"pg_catalog.english",
          "dynamic_shared_memory_type":"posix",
          "lc_messages":"en_US.utf8",
          "lc_monetary":"en_US.utf8",
          "lc_numeric":"en_US.utf8",
          "lc_time":"en_US.utf8",
          "log_timezone":"Etc/UTC",
          "max_connections":"2000",
          "max_wal_size":"1GB",
          "min_wal_size":"80MB",
          "shared_buffers":"128MB",
          "ssl":"off",
          "timezone":"Etc/UTC",
          "wal_level":"replica"
        },
        "synchronousStandbys":null,
        "olderWalFile":"000000020000000000000027"
      }
    },
    "88a40937":{
      "uid":"88a40937",
      "generation":2,
      "changeTime":"2022-11-03T04:02:39.933910733Z",
      "spec":{
        "keeperUID":"keeper1",
        "requestTimeout":"10s",
        "maxStandbys":20,
        "additionalWalSenders":5,
        "additionalReplicationSlots":null,
        "initMode":"none",
        "pgParameters":{
          "datestyle":"iso, mdy",
          "default_text_search_config":"pg_catalog.english",
          "dynamic_shared_memory_type":"posix",
          "lc_messages":"en_US.utf8",
          "lc_monetary":"en_US.utf8",
          "lc_numeric":"en_US.utf8",
          "lc_time":"en_US.utf8",
          "log_timezone":"Etc/UTC",
          "max_connections":"2000",
          "max_wal_size":"1GB",
          "min_wal_size":"80MB",
          "shared_buffers":"128MB",
          "ssl":"off",
          "timezone":"Etc/UTC",
          "wal_level":"replica"
        },
        "pgHBA":null,
        "role":"standby",
        "followConfig":{
          "type":"internal",
          "dbuid":"45d5ba5a"
        },
        "followers":[],
        "synchronousStandbys":null,
        "externalSynchronousStandbys":null
      },
      "status":{
        "healthy":true,
        "currentGeneration":2,
        "listenAddress":"10.0.0.249",
        "port":"5432",
        "systemdID":"7161314937790898225",
        "timelineID":2,
        "xLogPos":791607672,
        "timelinesHistory":[
          {
            "timelineID":1,
            "switchPoint":651337000,
            "reason":"no recovery target specified"
          }
        ],
        "pgParameters":{
          "datestyle":"iso, mdy",
          "default_text_search_config":"pg_catalog.english",
          "dynamic_shared_memory_type":"posix",
          "lc_messages":"en_US.utf8",
          "lc_monetary":"en_US.utf8",
          "lc_numeric":"en_US.utf8",
          "lc_time":"en_US.utf8",
          "log_timezone":"Etc/UTC",
          "max_connections":"2000",
          "max_wal_size":"1GB",
          "min_wal_size":"80MB",
          "shared_buffers":"128MB",
          "ssl":"off",
          "timezone":"Etc/UTC",
          "wal_level":"replica"
        },
        "synchronousStandbys":null,
        "olderWalFile":"000000020000000000000027"
      }
    }
  },
  "proxy":{
    "generation":5,
    "changeTime":"2022-11-03T02:50:44.661274173Z",
    "spec":{
      "masterDbUid":"45d5ba5a",
      "enabledProxies":[
        "56b6fe5f"
      ]
    },
    "status":{}
  }
}

Restart keeper0

kubectl delete pod base-stolon-keeper-0

or

# get the container ID
kubectl describe pod base-stolon-keeper-0 | grep "Container ID:"
# stop that container
docker stop ef79a3eed

Check the cluster status:

=== Active sentinels ===

ID        LEADER
f17b4ca7  true

=== Active proxies ===

ID
56b6fe5f

=== Keepers ===

UID      HEALTHY  PG LISTENADDRESS  PG HEALTHY  PG WANTEDGENERATION  PG CURRENTGENERATION
keeper0  true     10.0.0.10:5432    true        2                    2
keeper1  true     10.0.0.249:5432   true        4                    4

=== Cluster Info ===

Master Keeper: keeper1    # the master has changed

===== Keepers/DB tree =====

keeper1 (master)
└─keeper0

Insert another 1,000,000 rows

insert into people
select generate_series(1, 1000000) as id,
       md5(random()::text) as name,
       (random() * 100)::integer as age,
       (random() * 99)::numeric(4, 2) as grade,
       now() - ((random() * 1000)::integer || ' day')::interval as birthday,
       clock_timestamp() as logintime;

Scale keeper1 down to simulate shutting keeper1 off:

kubectl scale --replicas=1 statefulset.apps/base-stolon-keeper
NAME                                    READY   STATUS      RESTARTS   AGE
base-stolon-create-cluster-f55nw        0/1     Completed   0          23h
base-stolon-keeper-0                    1/1     Running     2          10m
base-stolon-proxy-797755db58-kglzf      1/1     Running     0          23h
base-stolon-sentinel-5c969874d4-shsmm   1/1     Running     0          23h
base-stolon-update-cluster-spec-wdl94   0/1     Completed   0          3h50m

Check the cluster status:

=== Active sentinels ===

ID        LEADER
f17b4ca7  true

=== Active proxies ===

ID
56b6fe5f

=== Keepers ===

UID      HEALTHY  PG LISTENADDRESS   PG HEALTHY  PG WANTEDGENERATION  PG CURRENTGENERATION
keeper0  true     10.0.0.10:5432     true        3                    3    # keeper0 online
keeper1  false    (no db assigned)   false       0                    0    # keeper1 offline

=== Cluster Info ===

Master Keeper: keeper0    # keeper0 is the master

===== Keepers/DB tree =====

keeper0 (master)

Query the data again. The first execution returns an error, showing that the proxy switched over during this process; running the query again returns results.

select * from pg_stat_replication;

Write more data, then bring keeper1 back up to check whether split-brain occurs:

kubectl scale --replicas=2 statefulset.apps/base-stolon-keeper
=== Active sentinels ===

ID        LEADER
f17b4ca7  true

=== Active proxies ===

ID
56b6fe5f

=== Keepers ===

UID      HEALTHY  PG LISTENADDRESS  PG HEALTHY  PG WANTEDGENERATION  PG CURRENTGENERATION
keeper0  true     10.0.0.10:5432    true        4                    4
keeper1  true     10.0.0.12:5432    true        1                    0    # keeper1 back to normal

=== Cluster Info ===

Master Keeper: keeper0    # keeper0 is the master

===== Keepers/DB tree =====

keeper0 (master)
└─keeper1

The data is intact.

Recommended reading:

《PostgreSQL从入门到不后悔》

PostgreSQL高可用stolon