- Official documentation: K3s docs (Chinese)
What is K3s
K3s is a lightweight Kubernetes distribution, highly optimized for edge computing, IoT, and similar scenarios. K3s ships with the following enhancements:
- Packaged as a single binary.
- Uses a lightweight sqlite3-based storage backend as the default storage mechanism; etcd3, MySQL, and PostgreSQL are also supported.
- Wrapped in a simple launcher that handles much of the TLS and option complexity.
- Secure by default, with sensible defaults for lightweight environments.
- Adds simple but powerful "batteries-included" features, such as a local storage provider, a service load balancer, a Helm controller, and the Traefik ingress controller.
- All Kubernetes control-plane components run inside a single binary and process, which lets K3s automate and manage complex cluster operations, including certificate distribution.
- Minimized external dependencies; K3s only needs a modern kernel and cgroup mounts. The dependencies bundled in the K3s package include:
    - containerd
    - Flannel
    - CoreDNS
    - CNI
    - Host utilities (iptables, socat, etc.)
    - Ingress controller (Traefik)
    - Embedded service load balancer
    - Embedded network policy controller
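For contrast with the multi-node HA deployment below, the default installation really is a single command. A minimal sketch using the official upstream install script (quick-start only, not part of this deployment):

```bash
# Upstream quick start: downloads the single K3s binary and registers
# a systemd service. Server and agent roles use the same binary.
curl -sfL https://get.k3s.io | sh -

# K3s bundles kubectl; confirm the node registered itself.
k3s kubectl get nodes
```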
Deployment Environment
This installation uses 5 Linux servers running CentOS 7.9: 3 server nodes and 2 agent nodes, all with identical specifications.
Host | IP address | OS version | Role |
---|---|---|---|
k3s-server01 | 10.1.40.61 | CentOS 7.9 | server, etcd |
k3s-server02 | 10.1.40.62 | CentOS 7.9 | server, etcd |
k3s-server03 | 10.1.40.63 | CentOS 7.9 | server, etcd |
k3s-agent01 | 10.1.40.64 | CentOS 7.9 | agent |
k3s-agent02 | 10.1.40.65 | CentOS 7.9 | agent |
- VIP address: 10.1.40.60
System Initialization
The nodes communicate with each other by hostname, which scales better than hard-coded IP addresses. Configure /etc/hosts on all nodes as follows:
```bash
cat > /etc/hosts <<EOF
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.1.40.61 k3s-server01
10.1.40.62 k3s-server02
10.1.40.63 k3s-server03
10.1.40.64 k3s-agent01
10.1.40.65 k3s-agent02
EOF
```
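Because the nodes address one another by hostname, each machine's own hostname should also match the table above. A minimal sketch (the set-hostname line is run once per node with that node's own name):

```bash
# On 10.1.40.61, for example:
hostnamectl set-hostname k3s-server01

# From any node, confirm every peer resolves and answers
for i in k3s-server01 k3s-server02 k3s-server03 k3s-agent01 k3s-agent02; do
    ping -c1 -W1 $i >/dev/null && echo "$i ok" || echo "$i unreachable"
done
```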
On all nodes, disable the firewall, SELinux, dnsmasq, and swap. (Keeping the firewall enabled requires opening all the relevant ports, which makes the configuration considerably more complex.) When deploying in the cloud, this can instead be handled with security group rules.

```bash
# Stop and disable firewalld, dnsmasq, and NetworkManager
systemctl disable --now firewalld
systemctl disable --now dnsmasq
systemctl disable --now NetworkManager   # not required on public cloud
# Disable SELinux for the current boot
setenforce 0
# Disable SELinux permanently
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
# Turn off swap and comment out its fstab entry
swapoff -a && sysctl -w vm.swappiness=0
sed -ri 's/.*swap.*/#&/' /etc/fstab
```
Time synchronization is mandatory on all nodes, and it must be added both to the boot sequence and to a scheduled task. If node clocks drift apart, etcd (the key-value store holding Kubernetes state) will fail to replicate data correctly, and certificates can become invalid. Configure time synchronization as follows:
```bash
# Install the ntpdate time-sync tool
yum install -y ntpdate
# Set the time zone
ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
echo 'Asia/Shanghai' > /etc/timezone
# Sync the time once, manually
ntpdate time2.aliyun.com
# Add a cron job, and also run the sync at boot
echo '*/5 * * * * /usr/sbin/ntpdate time2.aliyun.com >/dev/null' >> /var/spool/cron/root
echo '/usr/sbin/ntpdate time2.aliyun.com' >> /etc/rc.local
```
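Two details worth checking here: ntpdate can query the offset without stepping the clock, and on CentOS 7 the /etc/rc.local entry only runs at boot if the underlying file is executable:

```bash
# Query-only mode: shows the clock offset without changing the time
ntpdate -q time2.aliyun.com
# Confirm the cron entry was registered for root
crontab -l -u root
# On CentOS 7, rc.local is ignored at boot unless it is executable
chmod +x /etc/rc.d/rc.local
```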
Configure resource limits on all nodes:

```bash
# Set limits for the current session
ulimit -SHn 65535
# Set limits permanently (hard limits must not be lower than soft limits)
sed -i '/^# End/i\* soft nofile 655350' /etc/security/limits.conf
sed -i '/^# End/i\* hard nofile 655350' /etc/security/limits.conf
sed -i '/^# End/i\* soft nproc 655350' /etc/security/limits.conf
sed -i '/^# End/i\* hard nproc 655350' /etc/security/limits.conf
sed -i '/^# End/i\* soft memlock unlimited' /etc/security/limits.conf
sed -i '/^# End/i\* hard memlock unlimited' /etc/security/limits.conf
```
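The limits.conf changes apply to new login sessions only; after reconnecting, they can be verified with ulimit:

```bash
# Expect 655350 for both after a fresh login
ulimit -n   # max open files
ulimit -u   # max user processes
```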
Configure passwordless SSH from k3s-server01 to the other nodes. The configuration files and certificates generated during installation are created on k3s-server01, and cluster administration is also performed from this node:

```bash
ssh-keygen -t rsa
for i in k3s-server01 k3s-server02 k3s-server03 k3s-agent01 k3s-agent02;do
    ssh-copy-id -i .ssh/id_rsa.pub root@$i;
done
```
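A quick test that key-based login works everywhere (BatchMode fails instead of prompting if a key is missing):

```bash
# Each iteration should print the remote hostname without a password prompt
for i in k3s-server01 k3s-server02 k3s-server03 k3s-agent01 k3s-agent02; do
    ssh -o BatchMode=yes root@$i hostname;
done
```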
Configure all nodes to use a domestic (China) yum mirror:

```bash
# Make sure wget or curl is installed first
cd /etc/yum.repos.d && mkdir bak && mv *.repo bak && cd ~/
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
sed -i -e '/mirrors.cloud.aliyuncs.com/d' -e '/mirrors.aliyuncs.com/d' /etc/yum.repos.d/CentOS-Base.repo
yum clean all && yum makecache
# Install required tools
yum install -y jq psmisc wget vim net-tools telnet yum-utils device-mapper-persistent-data lvm2 git
```
Upgrade the system on all nodes and reboot (optional; this step does not upgrade the kernel):
```bash
yum update -y --exclude=kernel* && reboot
```
Upgrading the Kernel
CentOS 7 needs a kernel upgrade to 4.18 or newer; the version used here is 5.19.
Method 1: offline kernel upgrade
```bash
# Download the kernel packages
cd /root
wget http://193.49.22.109/elrepo/kernel/el7/x86_64/RPMS/kernel-ml-5.19.12-1.el7.elrepo.x86_64.rpm
wget http://193.49.22.109/elrepo/kernel/el7/x86_64/RPMS/kernel-ml-devel-5.19.12-1.el7.elrepo.x86_64.rpm
# Copy from k3s-server01 to the other nodes:
for i in k3s-server02 k3s-server03 k3s-agent01 k3s-agent02;do
    scp kernel-ml-* root@$i:~/;
done
# Install the kernel on all nodes
cd /root && yum localinstall -y kernel-ml*
# Change the default boot kernel on all nodes
grub2-set-default 0 && grub2-mkconfig -o /etc/grub2.cfg
grubby --args="user_namespace.enable=1" --update-kernel="$(grubby --default-kernel)"
# Reboot, then confirm the running kernel is 5.19
reboot
uname -a
```
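If uname still reports 3.10 after the reboot, check which kernel grub treats as the default boot entry:

```bash
# Should print the 5.19 kernel, e.g. /boot/vmlinuz-5.19.12-1.el7.elrepo.x86_64
grubby --default-kernel
```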
Installing Docker CE
Install Docker on all worker nodes:
```bash
# Install dependencies
yum install -y yum-utils device-mapper-persistent-data lvm2
# Configure the Docker yum repository
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum makecache fast
# List the available Docker versions
yum list docker-ce --showduplicates | sort -r
# Install a specific Docker version; adjust as needed
yum install -y docker-ce-20.10.*
```
Configure Docker registry mirrors on all worker nodes:
```bash
mkdir -p /etc/docker
cat > /etc/docker/daemon.json << EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "registry-mirrors": [
    "https://b9pmyelo.mirror.aliyuncs.com",
    "https://registry.docker-cn.com",
    "http://hub-mirror.c.163.com",
    "https://docker.mirrors.ustc.edu.cn"
  ],
  "insecure-registries": ["http://10.1.40.69"],
  "data-root": "/data/docker",
  "storage-driver": "overlay2",
  "storage-opts": [
    "overlay2.override_kernel_check=true"
  ],
  "max-concurrent-downloads": 10,
  "max-concurrent-uploads": 5,
  "log-opts": {
    "max-size": "300m",
    "max-file": "2"
  },
  "log-driver": "json-file",
  "live-restore": true
}
EOF
```
Start Docker and enable it at boot:
```bash
systemctl enable --now docker.service
```
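Verify that the daemon picked up the settings from daemon.json:

```bash
# Expect "Cgroup Driver: systemd" and "Docker Root Dir: /data/docker"
docker info | grep -E 'Cgroup Driver|Docker Root Dir'
```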
Installing HAProxy and Keepalived
High availability here uses HAProxy + Keepalived, with both deployed as daemons on all server nodes.
HAProxy and Keepalived can be installed via yum:

```bash
yum install -y haproxy keepalived
```
Configuring HAProxy
Configure HAProxy on all server nodes (the configuration is identical on every server node; see the HAProxy documentation for the details of each option):
```bash
mv /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy_ori.cfg
cat > /etc/haproxy/haproxy.cfg <<EOF
global
    log 127.0.0.1 local0 err
    chroot /var/lib/haproxy
    pidfile /var/run/haproxy.pid
    maxconn 4000
    ulimit-n 16384
    user haproxy
    group haproxy
    stats timeout 30s
    daemon
    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

defaults
    mode http
    log global
    option httplog
    option dontlognull
    timeout http-request 15s
    timeout queue 1m
    timeout connect 5000
    timeout client 50000
    timeout server 50000
    timeout http-keep-alive 15s
    timeout check 15s
    maxconn 3000

frontend monitor-in
    bind *:33305
    mode http
    option httplog
    monitor-uri /monitor

listen stats
    bind *:8006
    mode http
    stats enable
    stats hide-version
    stats uri /stats
    stats refresh 30s
    stats realm Haproxy\ Statistics
    stats auth admin:admin

frontend k3s-server
    bind 0.0.0.0:8443
    bind 127.0.0.1:8443
    mode tcp
    option tcplog
    tcp-request inspect-delay 5s
    default_backend k3s-server

backend static
    balance roundrobin
    server static 127.0.0.1:4331 check

backend k3s-server
    mode tcp
    option tcplog
    option tcp-check
    balance roundrobin
    default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
    server k3s-server01 10.1.40.61:6443 check
    server k3s-server02 10.1.40.62:6443 check
    server k3s-server03 10.1.40.63:6443 check
EOF
```
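Before starting the service, the configuration can be syntax-checked:

```bash
# Prints "Configuration file is valid" on success
haproxy -c -f /etc/haproxy/haproxy.cfg
```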
Configuring Keepalived
Configure Keepalived on all server nodes. Pay attention to the following settings:
- interface (the network interface to bind)
- priority (must differ between nodes)
- mcast_src_ip (the node's own IP address)
See the Keepalived documentation for the details.
Create the Keepalived configuration file:
```bash
mv /etc/keepalived/keepalived.conf /etc/keepalived/keepalived_ori.conf
export INTERFACE=$(ip route show |grep default |cut -d ' ' -f5)
export IPADDR=$(ifconfig |grep -A1 $INTERFACE |grep inet |awk '{print $2}')
cat > /etc/keepalived/keepalived.conf <<EOF
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
}
vrrp_script chk_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 2
    weight -5
    fall 3
    rise 2
}
vrrp_instance VI_1 {
    state MASTER
    interface ${INTERFACE}
    mcast_src_ip ${IPADDR}
    virtual_router_id 60
    priority 100
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass K8SHA_KA_AUTH
    }
    virtual_ipaddress {
        10.1.40.60
    }
    track_script {
        chk_apiserver
    }
}
EOF
# Adjust the priority value on the other server nodes:
#   k3s-server02: priority 99
#   k3s-server03: priority 98
```
If more than one Keepalived cluster runs on the same LAN, each needs a unique virtual_router_id; a convenient convention is the last octet of the VIP, here 60.
Create the Keepalived health-check script:
```bash
cat > /etc/keepalived/check_apiserver.sh <<EOF
#!/bin/bash
err=0
for k in \$(seq 1 3)
do
    check_code=\$(pgrep haproxy)
    if [[ \$check_code == "" ]]; then
        err=\$(expr \$err + 1)
        sleep 1
        continue
    else
        err=0
        break
    fi
done

if [[ \$err != "0" ]]; then
    echo "systemctl stop keepalived"
    /usr/bin/systemctl stop keepalived
    exit 1
else
    exit 0
fi
EOF
chmod +x /etc/keepalived/check_apiserver.sh
```
Start HAProxy and Keepalived:
```bash
systemctl enable --now haproxy.service
systemctl enable --now keepalived.service
```
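On the MASTER node (k3s-server01, priority 100, unless a failover has occurred) the VIP should now be bound, and the HAProxy monitor URI should answer through the VIP from any node:

```bash
# The VIP appears on the MASTER node's interface
ip addr show | grep 10.1.40.60
# The monitor-in frontend answers while HAProxy is alive
curl -I http://10.1.40.60:33305/monitor
```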
Deploying the etcd Cluster
Download the etcd binary release:

```bash
wget https://github.com/etcd-io/etcd/releases/download/v3.3.15/etcd-v3.3.15-linux-amd64.tar.gz
```
Unpack the archive:

```bash
tar zxvf etcd-v3.3.15-linux-amd64.tar.gz
```
Install the binaries:

```bash
mkdir -p /usr/local/etcd/{bin,cfg,ssl}
cp etcd-v3.3.15-linux-amd64/{etcd,etcdctl} /usr/local/etcd/bin/
```
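A quick sanity check that the binaries run on this host:

```bash
# Both should report version 3.3.15
/usr/local/etcd/bin/etcd --version
ETCDCTL_API=3 /usr/local/etcd/bin/etcdctl version
```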
Download the certificate-generation tools:

```bash
# Download the tools
wget https://github.com/cloudflare/cfssl/releases/download/v1.5.0/cfssljson_1.5.0_linux_amd64
wget https://github.com/cloudflare/cfssl/releases/download/v1.5.0/cfssl_1.5.0_linux_amd64
# Make them executable
chmod +x cfssljson_1.5.0_linux_amd64 cfssl_1.5.0_linux_amd64
# Move them onto the executable PATH and rename
mv cfssl_1.5.0_linux_amd64 /usr/local/bin/cfssl
mv cfssljson_1.5.0_linux_amd64 /usr/local/bin/cfssljson
```
Create the etcd certificates:
```bash
mkdir pki && cd pki
# Create the CA config file
cat > ca-config.json <<EOF
{
  "signing": {
    "default": {
      "expiry": "876000h"
    },
    "profiles": {
      "etcd": {
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ],
        "expiry": "876000h"
      }
    }
  }
}
EOF
# Create the CSR files
cat > etcd-ca-csr.json <<EOF
{
  "CN": "etcd",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "Beijing",
      "L": "Beijing",
      "O": "etcd",
      "OU": "Etcd Security"
    }
  ],
  "ca": {
    "expiry": "876000h"
  }
}
EOF
cat > etcd-csr.json <<EOF
{
  "CN": "etcd",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "Beijing",
      "L": "Beijing",
      "O": "etcd",
      "OU": "Etcd Security"
    }
  ]
}
EOF
# Generate the etcd CA certificate and key
cfssl gencert -initca etcd-ca-csr.json | cfssljson -bare /usr/local/etcd/ssl/etcd-ca
# Generate the etcd certificate files
cfssl gencert \
  -ca=/usr/local/etcd/ssl/etcd-ca.pem \
  -ca-key=/usr/local/etcd/ssl/etcd-ca-key.pem \
  -config=ca-config.json \
  -hostname=127.0.0.1,\
10.1.40.61,\
10.1.40.62,\
10.1.40.63 \
  -profile=etcd \
  etcd-csr.json | cfssljson -bare /usr/local/etcd/ssl/etcd
```
Note: -hostname must include every etcd node address; adding a few spare addresses makes later etcd cluster expansion easier. The -profile value must also match the profile name defined in ca-config.json (etcd here).
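The SAN list baked into the certificate can be verified with openssl before distributing it:

```bash
# Should list 127.0.0.1 and all three etcd node IPs
openssl x509 -in /usr/local/etcd/ssl/etcd.pem -noout -text | grep -A1 'Subject Alternative Name'
```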
Create the etcd configuration file. The file below is for k3s-server01; the per-node fields are adjusted on the other nodes later:
```bash
cat > /usr/local/etcd/cfg/etcd.config.yml << EOF
name: 'k3s-server01'
data-dir: /var/lib/etcd
wal-dir: /var/lib/etcd/wal
snapshot-count: 5000
heartbeat-interval: 100
election-timeout: 1000
quota-backend-bytes: 0
listen-peer-urls: 'https://10.1.40.61:2380'
listen-client-urls: 'https://10.1.40.61:2379,http://127.0.0.1:2379'
max-snapshots: 3
max-wals: 5
cors:
initial-advertise-peer-urls: 'https://10.1.40.61:2380'
advertise-client-urls: 'https://10.1.40.61:2379'
discovery:
discovery-fallback: 'proxy'
discovery-proxy:
discovery-srv:
initial-cluster: 'k3s-server01=https://10.1.40.61:2380,k3s-server02=https://10.1.40.62:2380,k3s-server03=https://10.1.40.63:2380'
initial-cluster-token: 'etcd-k8s-cluster'
initial-cluster-state: 'new'
strict-reconfig-check: false
enable-v2: true
enable-pprof: true
proxy: 'off'
proxy-failure-wait: 5000
proxy-refresh-interval: 30000
proxy-dial-timeout: 1000
proxy-write-timeout: 5000
proxy-read-timeout: 0
client-transport-security:
  cert-file: '/usr/local/etcd/ssl/etcd.pem'
  key-file: '/usr/local/etcd/ssl/etcd-key.pem'
  client-cert-auth: true
  trusted-ca-file: '/usr/local/etcd/ssl/etcd-ca.pem'
  auto-tls: true
peer-transport-security:
  cert-file: '/usr/local/etcd/ssl/etcd.pem'
  key-file: '/usr/local/etcd/ssl/etcd-key.pem'
  peer-client-cert-auth: true
  trusted-ca-file: '/usr/local/etcd/ssl/etcd-ca.pem'
  auto-tls: true
debug: false
log-package-levels:
log-outputs: [default]
force-new-cluster: false
EOF
```
Create the etcd systemd service unit (identical on all etcd nodes):
```bash
cat > /usr/lib/systemd/system/etcd.service << EOF
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
ExecStart=/usr/local/etcd/bin/etcd --config-file=/usr/local/etcd/cfg/etcd.config.yml
Restart=on-failure
RestartSec=10
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
Alias=etcd3.service
EOF
```
Copy the etcd files to the other etcd nodes:
```bash
for i in k3s-server02 k3s-server03;do
    scp -r /usr/local/etcd root@$i:/usr/local/etcd;
    scp /usr/lib/systemd/system/etcd.service root@$i:/usr/lib/systemd/system/etcd.service;
done
```
Then edit the configuration file on each of the other nodes. The fields that must change are listed below, followed by a scripted sketch:
- name: the node name, which must be unique within the cluster
- listen-peer-urls: change to the current node's IP address
- listen-client-urls: change to the current node's IP address
- initial-advertise-peer-urls: change to the current node's IP address
- advertise-client-urls: change to the current node's IP address
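A minimal sketch of making those edits with sed over SSH. It assumes the file was copied from k3s-server01 exactly as generated above, so only the name and the node IP need substituting; the substitutions are anchored to specific keys so the initial-cluster line is left untouched:

```bash
# Hypothetical helper: rewrite the per-node fields on the two other members.
for node in "k3s-server02 10.1.40.62" "k3s-server03 10.1.40.63"; do
    set -- $node   # $1 = hostname, $2 = IP
    ssh root@$1 "sed -i \
        -e \"s/^name: 'k3s-server01'/name: '$1'/\" \
        -e '/^listen-peer-urls/s/10.1.40.61/$2/' \
        -e '/^listen-client-urls/s/10.1.40.61/$2/' \
        -e '/^initial-advertise-peer-urls/s/10.1.40.61/$2/' \
        -e '/^advertise-client-urls/s/10.1.40.61/$2/' \
        /usr/local/etcd/cfg/etcd.config.yml"
done
```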
Start etcd and enable it at boot (run on all three etcd nodes; the first member waits until its peers join):

```bash
systemctl daemon-reload && systemctl enable --now etcd
```
Check the etcd cluster status:

```bash
export ETCDCTL_API=3
/usr/local/etcd/bin/etcdctl \
  --cacert=/usr/local/etcd/ssl/etcd-ca.pem \
  --cert=/usr/local/etcd/ssl/etcd.pem \
  --key=/usr/local/etcd/ssl/etcd-key.pem \
  --endpoints="https://10.1.40.61:2379,\
https://10.1.40.62:2379,\
https://10.1.40.63:2379" endpoint status --write-out=table
```
Note: if a node fails to start, its stale state must be cleared before starting it again; see the reference article on recovering a failed etcd cluster node.
- The concrete steps are:

```bash
rm -rf /var/lib/etcd/*
```
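After the member restarts, the same etcdctl flags can confirm that every endpoint reports healthy:

```bash
export ETCDCTL_API=3
/usr/local/etcd/bin/etcdctl \
  --cacert=/usr/local/etcd/ssl/etcd-ca.pem \
  --cert=/usr/local/etcd/ssl/etcd.pem \
  --key=/usr/local/etcd/ssl/etcd-key.pem \
  --endpoints="https://10.1.40.61:2379,https://10.1.40.62:2379,https://10.1.40.63:2379" \
  endpoint health
```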