Building a Kubernetes cluster with kubeadm

Preparation

OS          Kernel                 Docker    IP               Hostname  Spec
CentOS 7.4  4.14.179-1.el7.x86_64  20.10.16  192.168.111.128  master    2 cores, 4 GB
CentOS 7.4  4.14.179-1.el7.x86_64  20.10.16  192.168.111.129  node01    2 cores, 4 GB
CentOS 7.4  4.14.179-1.el7.x86_64  20.10.16  192.168.111.130  node02    2 cores, 4 GB

All machines are VMware virtual machines with their network in NAT mode. The master has a second NIC in bridged mode, with IP 192.168.41.249.

Run the following steps on every node.

Disable the firewall

systemctl stop firewalld
systemctl disable firewalld

Set the kernel parameters Kubernetes needs

cat <<EOF >  /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system
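
The two net.bridge.* keys only exist once the br_netfilter kernel module is loaded, so sysctl --system cannot apply them on a host where the module is missing. As a minimal sketch (check_sysctl_conf is a made-up helper name, not part of any tool), the following confirms that a drop-in file sets both keys to 1 before it is applied:

```shell
# Hypothetical helper: succeed only if both bridge keys are set to 1 in the given file.
check_sysctl_conf() {
    conf_file="$1"
    status=0
    for key in net.bridge.bridge-nf-call-ip6tables net.bridge.bridge-nf-call-iptables; do
        # Allow optional whitespace around "=" and at the line ends.
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*1[[:space:]]*\$" "$conf_file"; then
            echo "not set to 1: ${key}"
            status=1
        fi
    done
    return "$status"
}

# Typical use on a node (needs root):
#   modprobe br_netfilter
#   check_sysctl_conf /etc/sysctl.d/k8s.conf && sysctl --system
```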

Enable IP forwarding

echo 1 > /proc/sys/net/ipv4/ip_forward
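
Writing to /proc only lasts until the next reboot. Below is a minimal sketch of making the setting persistent, assuming a sysctl drop-in file is acceptable on these hosts. The function name and the file path in the usage lines are illustrative choices, not part of the original setup:

```shell
# Hypothetical helper: append net.ipv4.ip_forward = 1 to a sysctl file
# unless the file already contains a line for that key.
persist_ip_forward() {
    conf_file="$1"
    if ! grep -Eq '^[[:space:]]*net\.ipv4\.ip_forward[[:space:]]*=' "$conf_file" 2>/dev/null; then
        echo 'net.ipv4.ip_forward = 1' >> "$conf_file"
    fi
}

# Typical use on a node (needs root):
#   persist_ip_forward /etc/sysctl.d/k8s.conf && sysctl --system
```

Calling the helper twice leaves a single line in the file, so it is safe to rerun.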

Disable swap

swapoff -a

# Comment out this line in /etc/fstab so swap stays off after a reboot:
/dev/mapper/centos-swap
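
Editing /etc/fstab by hand on three nodes is easy to get wrong, so here is a rough sketch of doing it with sed and then confirming that no swap device is active. Both function names are made up for illustration, and the sed pattern assumes the swap entry has "swap" as a whitespace-separated field, as in the line above:

```shell
# Hypothetical helper: comment out every uncommented swap entry in an fstab-style file.
# A backup copy is kept next to the file with a .bak suffix.
comment_out_swap() {
    fstab_file="$1"
    sed -i.bak -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab_file"
}

# Hypothetical helper: succeed if a /proc/swaps-style file lists any swap device.
# The first line of /proc/swaps is a header, so only later lines count.
swap_active() {
    swaps_file="${1:-/proc/swaps}"
    awk 'NR > 1 && NF > 0 { found = 1 } END { exit !found }' "$swaps_file"
}

# Typical use on a node (needs root):
#   swapoff -a && comment_out_swap /etc/fstab
#   swap_active && echo "swap is still on"
```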

Disable SELinux

sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
setenforce 0
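
setenforce 0 only switches the running system to permissive mode; the config file decides what happens after the next reboot. As a quick sanity check, sketched with a made-up helper name, the persistent setting can be read back like this:

```shell
# Hypothetical helper: print the value of the SELINUX= line in an SELinux config file.
# SELINUXTYPE= is not matched because the pattern requires "=" right after SELINUX.
selinux_config_mode() {
    sed -n 's/^SELINUX=//p' "$1"
}

# Typical use on a node:
#   selinux_config_mode /etc/selinux/config   # mode that applies after a reboot
#   getenforce                                 # mode of the running system
```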

Enable the time synchronization service

systemctl enable chronyd
systemctl start chronyd

Do this only if the nodes can actually reach a time source to synchronize with.

Install Docker

Docker must be installed on every node.

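yum install -y yum-utils
# Add the Docker CE repository (skip this if it is already configured)
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin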
mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
systemctl restart containerd
systemctl start docker

Edit the Docker configuration file

cat <<EOF | sudo tee /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF
systemctl enable docker
systemctl daemon-reload
systemctl restart docker
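
From v1.22 onward kubeadm defaults the kubelet to the systemd cgroup driver, so Docker has to use the same one; that is what the exec-opts line is for. A minimal sketch of checking this, where requests_systemd_cgroup is a made-up helper name:

```shell
# Hypothetical helper: succeed if the given daemon.json requests the systemd cgroup driver.
requests_systemd_cgroup() {
    grep -q 'native\.cgroupdriver=systemd' "$1"
}

# Typical use on a node:
#   requests_systemd_cgroup /etc/docker/daemon.json || echo "daemon.json does not request systemd"
#   docker info --format '{{.CgroupDriver}}'   # driver the running daemon actually uses
```

The file check only shows what was requested. The docker info line is the one that confirms the daemon picked it up after the restart.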

Install kubeadm, kubelet and kubectl

These must be configured and installed on every node.

Configure the yum repository

cat /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
       http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg

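Install version 1.23.7 rather than the latest release. The latest release may have pitfalls: for example, the image repository cannot be overridden and the images fail to download.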

yum install kubeadm-1.23.7-0  kubelet-1.23.7-0 kubectl-1.23.7-0 -y

Deploy Kubernetes

Initialize the control plane on the master node

kubeadm init --kubernetes-version=1.23.7 \
--apiserver-advertise-address=192.168.111.128 \
--image-repository registry.aliyuncs.com/google_containers \
--service-cidr=10.1.0.0/16 \
--pod-network-cidr=10.244.0.0/16 --v=7

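kubernetes-version: the Kubernetes version; it must match the installed packages

image-repository: the image registry to pull the control-plane images from

apiserver-advertise-address: the address the API server listens on and advertises (the IP must belong to this machine)

service-cidr: the address range for Services

pod-network-cidr: the address range for Pods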
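
Since the advertise address has to be an IP of the local machine, it is worth checking before running kubeadm init. Here is a small sketch that reads the output of ip -4 -o addr show; addr_is_local is a made-up helper name:

```shell
# Hypothetical helper: read `ip -4 -o addr show` output on stdin and succeed
# if the given IPv4 address is assigned to some interface.
# The trailing "/" stops 192.168.111.12 from matching 192.168.111.128.
addr_is_local() {
    grep -Fq "inet $1/"
}

# Typical use on the master before kubeadm init:
#   ip -4 -o addr show | addr_is_local 192.168.111.128 || echo "address is not bound on this host"
```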

Normally the deployment succeeds and ends with the output below. Failure cases are collected in the Pitfalls section.

I0706 16:46:45.150346   12488 loader.go:372] Config loaded from file:  /etc/kubernetes/admin.conf

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.111.128:6443 --token nks44r.gkygece3delpvvfv \
--discovery-token-ca-cert-hash sha256:c75ee26ea8d28e9703affae319d0479a5a3ece6c82448cbae527821c96271a66

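The kubeadm join command at the very bottom is needed later, when the other nodes join the cluster. If you did not save it, that is fine: it can be regenerated with the following command.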

# kubeadm token create --print-join-command
kubeadm join 192.168.111.128:6443 --token i95sh4.acf1p80elxaku7y0 --discovery-token-ca-cert-hash sha256:c75ee26ea8d28e9703affae319d0479a5a3ece6c82448cbae527821c96271a66
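
Bootstrap tokens created by kubeadm expire after 24 hours by default, which is why a join attempted days later needs a freshly generated command. As a small sketch (is_bootstrap_token is a made-up helper name), this checks that a string has the documented token shape: six characters, a dot, then sixteen characters, all lowercase letters or digits.

```shell
# Hypothetical helper: succeed if the argument looks like a kubeadm bootstrap token.
is_bootstrap_token() {
    printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Typical use:
#   is_bootstrap_token "i95sh4.acf1p80elxaku7y0" && echo "token looks well formed"
#   kubeadm token list    # on the master: tokens that still exist and when they expire
```

A well-formed token can still be expired, so the shape check only catches copy and paste damage.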

Configure kubectl

mkdir -p $HOME/.kube
cp -a /etc/kubernetes/admin.conf $HOME/.kube/config

# kubectl get nodes
NAME     STATUS     ROLES                  AGE     VERSION
master   NotReady   control-plane,master   4m14s   v1.23.7

Install flannel

mkdir k8s
cd k8s/
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

Edit the YAML so that Network matches the pod network CIDR passed to kubeadm init

  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
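
A mismatch between this value and the pod-network-cidr given to kubeadm init is easy to miss. Below is a rough sketch of reading the value back from the manifest, assuming the file contains a single "Network" entry; flannel_network is a made-up helper name:

```shell
# Hypothetical helper: print the "Network" value found in a flannel manifest.
flannel_network() {
    sed -n 's/.*"Network":[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Typical use, with the CIDR that was passed to kubeadm init:
#   [ "$(flannel_network kube-flannel.yml)" = "10.244.0.0/16" ] || echo "pod CIDR mismatch"
```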

Apply the flannel manifest

kubectl apply -f kube-flannel.yml
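  • The image pull may fail, leaving the pods unable to start. In that case, pull the image manually with docker pull.

Once flannel is installed, the node status changes to Ready.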

# kubectl get nodes
NAME     STATUS   ROLES                  AGE   VERSION
master   Ready    control-plane,master   39m   v1.23.7

Command completion

yum install -y bash-completion

source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
source ~/.bashrc

Join the other nodes to the cluster

Run the kubeadm join command from above on each worker node

kubeadm join 192.168.111.128:6443 --token i95sh4.acf1p80elxaku7y0 --discovery-token-ca-cert-hash sha256:c75ee26ea8d28e9703affae319d0479a5a3ece6c82448cbae527821c96271a66

Cluster status

# kubectl get nodes
NAME     STATUS   ROLES                  AGE   VERSION
master   Ready    control-plane,master   10m   v1.23.7
node01   Ready    <none>                 10m   v1.23.7
node02   Ready    <none>                 10m   v1.23.7
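
With more nodes it gets tedious to scan this table by eye. A minimal sketch that filters the output down to the nodes that are not ready yet, where not_ready_nodes is a made-up helper name:

```shell
# Hypothetical helper: read `kubectl get nodes` output on stdin and print the
# names of nodes whose STATUS column is not exactly "Ready".
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Typical use on the master:
#   kubectl get nodes | not_ready_nodes
```

The comparison is deliberately strict, so a cordoned node reported as Ready,SchedulingDisabled is listed too.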

Enable kubelet at boot on every node

systemctl enable kubelet

Verify functionality

kubectl create deployment my-dep --image=nginx --port=80 --replicas=10
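kubectl expose deployment my-dep --port=8080 --target-port=80

Then curl a pod IP directly, followed by the Service cluster IP on the exposed port. The addresses below are from this cluster and will differ on yours.

curl 10.244.1.3:80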
kubectl get svc
curl 10.1.163.249:8080

Shut down the VMs and take a snapshot.

Pitfalls

  • kubelet fails to start during kubeadm init

    journalctl -u kubelet

    The most likely cause is that swap has not been turned off.

  • Containers fail to start during kubeadm init

    docker ps -a
    docker logs <container-id>    # e.g. the kube-apiserver container

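    One possible cause is that the image repository was changed but the images still could not be pulled.

    Another is that the advertised IP address is not bound to any local interface, which prevents etcd from starting.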

  • x509 error

    E0706 07:40:27.771562       1 authentication.go:63] "Unable to authenticate the request" err="[x509: certificate signed by unknown authority, verifying certificate SN=388146623273092387, SKID=, AKID=76:0F:D3:22:AE:F3:23:23:9D:81:DD:C5:90:82:E8:1E:89:D7:63:98 failed: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kubernetes\")]"
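
These notes do not record what caused the x509 error. One situation that commonly produces it, stated here as an assumption rather than a diagnosis, is a kubeconfig left over from an earlier kubeadm init whose CA no longer matches the current cluster. A minimal sketch for comparing the CA embedded in two kubeconfig files, where ca_data and same_ca are made-up helper names:

```shell
# Hypothetical helper: print the base64 CA blob stored in a kubeconfig file.
ca_data() {
    sed -n 's/^[[:space:]]*certificate-authority-data:[[:space:]]*//p' "$1"
}

# Hypothetical helper: succeed if both kubeconfig files embed the same, non-empty CA.
same_ca() {
    [ -n "$(ca_data "$1")" ] && [ "$(ca_data "$1")" = "$(ca_data "$2")" ]
}

# Typical use on the master:
#   same_ca /etc/kubernetes/admin.conf "$HOME/.kube/config" || echo "kubeconfig does not match this cluster"
```

If the two differ, copying admin.conf over the old kubeconfig again is the usual next step to try.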