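1. What Is Rancher
Rancher is a Kubernetes management tool for deploying and running clusters anywhere and on any provider.
Rancher can provision Kubernetes from a hosted provider, provision compute nodes and then install Kubernetes onto them, or import existing Kubernetes clusters running anywhere.
Rancher adds significant value on top of Kubernetes, first by centralizing authentication and role-based access control (RBAC) for all clusters, so that global administrators can control cluster access from one location.
It then provides detailed monitoring and alerting for clusters and their resources, ships logs to external providers, and integrates directly with Helm through the application catalog. If you have an external CI/CD system, you can plug it into Rancher; if you don't, Rancher even includes Fleet to help you automatically deploy and upgrade workloads.
Rancher is a complete container management platform for Kubernetes, giving you the tools to run Kubernetes successfully anywhere.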
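2. Why Choose Rancher
There are Kubernetes management UIs from China such as Kuboard, but they do not address the complexity that Kubernetes introduces, for example CI/CD that integrates with the cluster management platform, multi-cluster management, and account and permission management.
After a round of searching online and comparing options, the field came down to three: Red Hat OpenShift Kubernetes Engine (OpenShift), VMware Tanzu, and SUSE Rancher. Only Rancher can be used for free, so there was really no other choice.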
3. Deploying the Rancher Server
Rancher supports several deployment methods:
- AWS (uses Terraform)
- AWS Marketplace (uses Amazon EKS)
- Azure (uses Terraform)
- DigitalOcean (uses Terraform)
- GCP (uses Terraform)
- Hetzner Cloud (uses Terraform)
- Vagrant
- Equinix Metal
- Outscale (uses Terraform)
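Here we choose manual deployment.
3.1. Installation Requirements
Prepare two servers running an Ubuntu LTS release. Here a k3s cluster hosts the Rancher management nodes. The k3s resource requirements are as follows: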
Deployment size | Managed clusters | Managed nodes | k3s vCPUs | k3s RAM | k3s database requirement |
---|---|---|---|---|---|
Small | Up to 150 | Up to 1,500 | 2 | 8 GB | 2 cores, 4 GB + 1000 IOPS |
Medium | Up to 300 | Up to 3,000 | 4 | 16 GB | 2 cores, 4 GB + 1000 IOPS |
Large | Up to 500 | Up to 5,000 | 8 | 32 GB | 2 cores, 4 GB + 1000 IOPS |
X-Large | Up to 1,000 | Up to 10,000 | 16 | 64 GB | 2 cores, 4 GB + 1000 IOPS |
XX-Large | Up to 2,000 | Up to 20,000 | 32 | 128 GB | 2 cores, 4 GB + 1000 IOPS |
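As a quick sizing aid, the table can be turned into a small lookup. This is only a sketch: the function name `rancher_size_for` is made up for illustration, and it keys on the cluster count alone, so check the node count against the table separately.

```shell
#!/bin/sh
# Map a planned number of managed clusters to the deployment size from the
# table above and print "<size> <k3s vCPUs> <k3s RAM in GB>".
# rancher_size_for is a hypothetical helper, not part of Rancher or k3s.
rancher_size_for() {
    clusters=$1
    if   [ "$clusters" -le 150 ];  then echo "Small 2 8"
    elif [ "$clusters" -le 300 ];  then echo "Medium 4 16"
    elif [ "$clusters" -le 500 ];  then echo "Large 8 32"
    elif [ "$clusters" -le 1000 ]; then echo "X-Large 16 64"
    elif [ "$clusters" -le 2000 ]; then echo "XX-Large 32 128"
    else
        echo "more clusters than the table covers" >&2
        return 1
    fi
}

# 200 clusters falls into the Medium tier.
rancher_size_for 200   # prints: Medium 4 16
```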
3.2. Preparation: Disable the Firewall
systemctl disable ufw
systemctl stop ufw
ufw reset
iptables -P INPUT ACCEPT
iptables -P OUTPUT ACCEPT
iptables -P FORWARD ACCEPT
iptables -F
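3.3. Install k3s
Install k3s on both servers to form the cluster that hosts the Rancher management nodes. Run the command on the first node; once it has finished and the k3s service has started successfully, run it on the second node, which joins automatically and forms the cluster. For the version, you can pick the release just before the latest stable one: I tried the latest stable release and it could not be downloaded through the China mirror.
Also, although k3s 1.27 had already been released at the time of writing, the highest k3s version Rancher supported was still the 1.26 series, so a 1.26 release is used here.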
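The version constraint above can be checked before running the installer. The following is a minimal sketch; the helper names `k3s_minor` and `k3s_supported` are invented for illustration, and the supported minor version (26) is simply the value stated in the text, so confirm it against the Rancher support matrix for your Rancher release.

```shell
#!/bin/sh
# Extract the Kubernetes minor version from a k3s version string such as
# "v1.26.5+k3s1". Both helpers are hypothetical, for illustration only.
k3s_minor() {
    # Drop the leading "v1.", then everything from the next dot onward.
    minor=${1#v1.}
    echo "${minor%%.*}"
}

# $1: k3s version string, $2: highest minor version Rancher supports
k3s_supported() {
    [ "$(k3s_minor "$1")" -le "$2" ]
}

if k3s_supported "v1.26.5+k3s1" 26; then
    echo "v1.26.5+k3s1 is within the supported range"
fi
```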
3.3.1. Install K3s Using the China Mirror Script
curl -sfL https://rancher-mirror.rancher.cn/k3s/k3s-install.sh | INSTALL_K3S_MIRROR=cn INSTALL_K3S_VERSION="v1.26.5+k3s1" K3S_DATASTORE_ENDPOINT='postgres://k3s:[email protected]:5432/k3s?sslmode=disable' sh -s - server --token=k3stoken --tls-san 192.168.6.247 --tls-san 192.168.6.248 --tls-san rancher.myexample.com
The command output is as follows:
[sudo] password for ubuntu:
[INFO] Using v1.26.5+k3s1 as release
[INFO] Downloading hash rancher-mirror.rancher.cn/k3s/v1.26.5-k3s1/sha256sum-amd64.txt
[INFO] Downloading binary rancher-mirror.rancher.cn/k3s/v1.26.5-k3s1/k3s
[INFO] Verifying binary download
[INFO] Installing k3s to /usr/local/bin/k3s
[INFO] Skipping installation of SELinux RPM
[INFO] Creating /usr/local/bin/kubectl symlink to k3s
[INFO] Creating /usr/local/bin/crictl symlink to k3s
[INFO] Creating /usr/local/bin/ctr symlink to k3s
[INFO] Creating killall script /usr/local/bin/k3s-killall.sh
[INFO] Creating uninstall script /usr/local/bin/k3s-uninstall.sh
[INFO] env: Creating environment file /etc/systemd/system/k3s.service.env
[INFO] systemd: Creating service file /etc/systemd/system/k3s.service
[INFO] systemd: Enabling k3s unit
Created symlink /etc/systemd/system/multi-user.target.wants/k3s.service → /etc/systemd/system/k3s.service.
[INFO] systemd: Starting k3s
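The `K3S_DATASTORE_ENDPOINT` value passed to the installer follows the `postgres://user:password@host:port/database` form. Building it from variables keeps the long install command readable. Every value below is a placeholder assumption, not the author's actual configuration, and a password containing special characters would likely need URL-encoding.

```shell
#!/bin/sh
# Hypothetical connection details; replace them with your own PostgreSQL values.
PG_USER="k3s"
PG_PASSWORD="change-me"
PG_HOST="postgres.example.internal"
PG_PORT="5432"
PG_DB="k3s"

# Compose the endpoint in the postgres://user:password@host:port/db form.
K3S_DATASTORE_ENDPOINT="postgres://${PG_USER}:${PG_PASSWORD}@${PG_HOST}:${PG_PORT}/${PG_DB}?sslmode=disable"
export K3S_DATASTORE_ENDPOINT
echo "$K3S_DATASTORE_ENDPOINT"
```

With the variable exported, the install script picks it up from the environment, so it no longer has to be spelled out inline on the command line.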
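3.3.2. Offline Installation of K3s
Because the China mirror lags behind upstream and does not yet carry the latest patch releases, an offline installation method for K3s is also given here.
3.3.2.1. Create a local directory and place the offline image package in it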
sudo mkdir -p /var/lib/rancher/k3s/agent/images/
sudo cp ./k3s-airgap-images-amd64.tar.gz /var/lib/rancher/k3s/agent/images/
3.3.2.2. Set up the k3s executable
Download the k3s binary for the matching version from the https://github.com/k3s-io/k3s/releases page, place it in the /usr/local/bin directory, and make it executable:
chmod +x /usr/local/bin/k3s
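Since the binary is copied over by hand in the offline flow, it is worth checking it against the published checksums first. The sketch below assumes a checksum file with lines of the form `<sha256>  <name>` (the online installer log above refers to one named `sha256sum-amd64.txt`); `verify_download` is a hypothetical helper.

```shell
#!/bin/sh
# verify_download FILE SUMFILE
# Looks up FILE's basename in SUMFILE (lines of "<sha256>  <name>") and
# compares the listed digest with the locally computed SHA-256 digest.
# Hypothetical helper for illustration; it is not shipped with k3s.
verify_download() {
    file=$1
    sumfile=$2
    name=$(basename "$file")
    # sha256sum marks binary-mode entries with a leading "*"; strip it.
    expected=$(awk -v n="$name" '{ f = $2; sub(/^\*/, "", f); if (f == n) print $1 }' "$sumfile")
    actual=$(sha256sum "$file" | awk '{ print $1 }')
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Usage (file locations are assumptions):
#   verify_download /usr/local/bin/k3s ./sha256sum-amd64.txt && echo "k3s binary OK"
```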
3.3.2.3. Set up the install.sh script
Download the install script from get.k3s.io, save it as install.sh, and make it executable:
chmod +x install.sh
3.3.2.4. Run the installation with the script
INSTALL_K3S_SKIP_DOWNLOAD=true INSTALL_K3S_EXEC='server --token=k3stoken --tls-san 192.168.6.247 --tls-san 192.168.6.248 --tls-san rancher.myexample.com' \
K3S_DATASTORE_ENDPOINT='postgres://k3s:[email protected]:5432/k3s?sslmode=disable' \
./install.sh
3.4. Configure one of the nodes to access the k3s cluster
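sudo -s
# Copy the k3s kubeconfig into the ubuntu user's home directory
mkdir -p /home/ubuntu/.kube
cp /etc/rancher/k3s/k3s.yaml /home/ubuntu/.kube/config
chmod 600 /home/ubuntu/.kube/config
# Hand over the whole .kube directory, since mkdir ran as root
chown -R ubuntu:ubuntu /home/ubuntu/.kube
# Use an absolute path so the result does not depend on which home ~ expands to
export KUBECONFIG=/home/ubuntu/.kube/config
echo "export KUBECONFIG=~/.kube/config" >> /home/ubuntu/.bashrc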
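Once kubectl can reach the cluster, a short check confirms that both servers have joined. This is a sketch under the assumption that the second column of `kubectl get nodes --no-headers` is the node status; the function names are invented for illustration.

```shell
#!/bin/sh
# Count nodes whose STATUS column is exactly "Ready".
count_ready_nodes() {
    kubectl get nodes --no-headers | awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Wait until at least $1 nodes are Ready, checking every 5 seconds for up
# to $2 attempts (defaults: 2 nodes, 60 attempts).
wait_for_nodes() {
    want=${1:-2}
    tries=${2:-60}
    while [ "$tries" -gt 0 ]; do
        [ "$(count_ready_nodes)" -ge "$want" ] && return 0
        tries=$((tries - 1))
        sleep 5
    done
    return 1
}

# Usage: wait_for_nodes 2 && kubectl get nodes -o wide
```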
3.5. Install Helm
See https://github.com/helm/helm/releases
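One way to install from a release tarball is sketched below. The version `v3.12.1` is only an example, not one the author names, and the helper names are hypothetical. The download URL pattern and the `linux-<arch>/helm` path inside the archive match how Helm release tarballs are laid out, but verify both against the releases page.

```shell
#!/bin/sh
# Build the download URL for a Helm release tarball.
# $1: version (for example v3.12.1, an assumed example), $2: architecture (default amd64)
helm_tarball_url() {
    echo "https://get.helm.sh/helm-$1-linux-${2:-amd64}.tar.gz"
}

# Download, unpack, and install the helm binary into /usr/local/bin.
install_helm() {
    url=$(helm_tarball_url "$1" "${2:-amd64}")
    tmp=$(mktemp -d)
    curl -fsSL "$url" -o "$tmp/helm.tar.gz" &&
        tar -xzf "$tmp/helm.tar.gz" -C "$tmp" &&
        sudo install -m 0755 "$tmp/linux-${2:-amd64}/helm" /usr/local/bin/helm
    status=$?
    rm -rf "$tmp"
    return $status
}

# Usage: install_helm v3.12.1 && helm version
```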
3.6. Add the Rancher Helm repository
helm repo add rancher-latest https://releases.rancher.com/server-charts/latest
3.7. Create the namespace
kubectl create namespace cattle-system
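3.8. Install the cert-manager CustomResourceDefinitions
If the server has trouble reaching GitHub, download the file, upload it to the server, and replace the https URL with the file's relative path. The version must match the cert-manager version installed with helm below.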
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.12.2/cert-manager.crds.yaml
Output on success:
customresourcedefinition.apiextensions.k8s.io/certificaterequests.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/certificates.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/challenges.acme.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/clusterissuers.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/issuers.cert-manager.io created
customresourcedefinition.apiextensions.k8s.io/orders.acme.cert-manager.io created
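To confirm the CRDs are registered before moving on, they can be counted. This sketch expects six, matching the six `created` lines in the output above; other cert-manager versions may ship a different number. The function names are illustrative.

```shell
#!/bin/sh
# Count installed CRDs whose name ends in "cert-manager.io".
count_cert_manager_crds() {
    kubectl get crd -o name | grep -c 'cert-manager\.io$'
}

# The apply output above lists six CRDs, so six is the expected count here.
check_cert_manager_crds() {
    [ "$(count_cert_manager_crds)" -eq 6 ]
}

# Usage: check_cert_manager_crds && echo "cert-manager CRDs are present"
```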
3.9. Add the cert-manager Helm repository
helm repo add jetstack https://charts.jetstack.io
helm repo update
3.10. Install cert-manager
Pick the latest stable version based on the official documentation or the open-source repository.
helm install cert-manager jetstack/cert-manager \
--namespace cert-manager \
--create-namespace \
--version v1.12.2
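Output of a successful installation (this sample reports v1.11.4 in its NOTES, so it was apparently captured from an earlier run with a different chart version):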
NAME: cert-manager
LAST DEPLOYED: Tue Jul 11 06:32:41 2023
NAMESPACE: cert-manager
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
cert-manager v1.11.4 has been deployed successfully!
In order to begin issuing certificates, you will need to set up a ClusterIssuer
or Issuer resource (for example, by creating a 'letsencrypt-staging' issuer).
More information on the different types of issuers and how to configure them
can be found in our documentation:
https://cert-manager.io/docs/configuration/
For information on how to configure cert-manager to automatically provision
Certificates for Ingress resources, take a look at the `ingress-shim`
documentation:
https://cert-manager.io/docs/usage/ingress/
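Rancher's installation depends on cert-manager being fully up, so it helps to wait for the rollout to finish before continuing. A minimal sketch follows. The helper name is made up, the 180 second timeout is an arbitrary choice, and the deployment names in the usage line are inferred from the pod names this chart creates.

```shell
#!/bin/sh
# Wait for each named deployment in a namespace to finish rolling out.
# $1: namespace, remaining arguments: deployment names
wait_for_deployments() {
    ns=$1
    shift
    for dep in "$@"; do
        kubectl -n "$ns" rollout status "deployment/$dep" --timeout=180s || return 1
    done
}

# Usage:
#   wait_for_deployments cert-manager cert-manager cert-manager-cainjector cert-manager-webhook
```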
3.11. Install Rancher
helm install rancher rancher-latest/rancher \
--namespace cattle-system \
--set hostname=rancher.myexample.com \
--set replicas=1 \
--set bootstrapPassword=admin
Output of a successful installation:
NAME: rancher
LAST DEPLOYED: Mon Apr 10 03:01:09 2023
NAMESPACE: cattle-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Rancher Server has been installed.
NOTE: Rancher may take several minutes to fully initialize. Please standby while Certificates are being issued, Containers are started and the Ingress rule comes up.
Check out our docs at https://rancher.com/docs/
If you provided your own bootstrap password during installation, browse to https://rancher.myexample.com to get started.
If this is the first time you installed Rancher, get started by running this command and clicking the URL it generates:
echo https://rancher.myexample.com/dashboard/?setup=$(kubectl get secret --namespace cattle-system bootstrap-secret -o go-template='{{.data.bootstrapPassword|base64decode}}')
To get just the bootstrap password on its own, run:
kubectl get secret --namespace cattle-system bootstrap-secret -o go-template='{{.data.bootstrapPassword|base64decode}}{{ "\n" }}'
Happy Containering!
After the installation succeeds, wait about 2 to 3 minutes, then open the Rancher service at https://rancher.myexample.com/dashboard/?setup=admin
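Instead of waiting a fixed few minutes, the URL can be polled until it responds. The sketch below is generic: `wait_for_url` is a hypothetical helper, the attempt count and delay are arbitrary defaults, and it assumes the dashboard path answers with a success status once Rancher is ready.

```shell
#!/bin/sh
# Poll a URL until it answers successfully.
# $1: URL, $2: max attempts (default 30), $3: seconds between attempts (default 10)
wait_for_url() {
    url=$1
    tries=${2:-30}
    delay=${3:-10}
    while [ "$tries" -gt 0 ]; do
        # -k skips certificate verification, because the Rancher-generated
        # certificate is not trusted by curl out of the box.
        if curl -ksf -o /dev/null "$url"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}

# Usage: wait_for_url https://rancher.myexample.com/dashboard/ && echo "Rancher is up"
```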
4. Common Service Check Commands
4.1. Check that cert-manager is running
ubuntu@24:~$ kubectl get po -n cert-manager
NAME READY STATUS RESTARTS AGE
cert-manager-cainjector-56bbdd5c47-2gtq5 1/1 Running 0 96s
cert-manager-64f9f45d6f-8qxn6 1/1 Running 0 96s
cert-manager-webhook-d4f4545d7-cxnhf 1/1 Running 0 96s
4.2. View a service's logs
ubuntu@24:~$ kubectl -n cert-manager logs cert-manager-webhook-d4f4545d7-cxnhf
I0711 07:13:13.364269 1 feature_gate.go:249] feature gates: &{map[]}
W0711 07:13:13.364365 1 client_config.go:618] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0711 07:13:13.370407 1 webhook.go:129] cert-manager "msg"="using dynamic certificate generating using CA stored in Secret resource" "secret_name"="cert-manager-webhook-ca" "secret_namespace"="cert-manager"
I0711 07:13:13.370665 1 server.go:133] cert-manager/webhook "msg"="listening for insecure healthz connections" "address"=":6080"
I0711 07:13:13.370731 1 server.go:197] cert-manager/webhook "msg"="listening for secure connections" "address"=":10250"
I0711 07:13:14.376507 1 dynamic_source.go:266] cert-manager/webhook "msg"="Updated cert-manager webhook TLS certificate" "DNSNames"=["cert-manager-webhook","cert-manager-webhook.cert-manager","cert-manager-webhook.cert-manager.svc"]
4.3. View pod status in the cattle-system namespace
ubuntu@24:~$ kubectl get pods --namespace cattle-system
NAME READY STATUS RESTARTS AGE
helm-operation-ggq22 2/2 Running 0 53s
helm-operation-mxqjg 0/2 Completed 0 2m52s
helm-operation-sftw8 0/2 Completed 0 74s
helm-operation-wbd4s 0/2 Completed 0 110s
rancher-7c5dbf46fc-8fb5v 1/1 Running 0 4m51s
rancher-7c5dbf46fc-l92kc 1/1 Running 0 4m50s
rancher-7c5dbf46fc-wmx8h 1/1 Running 0 4m50s
rancher-webhook-577b778f8f-9wzr5 0/1 ContainerCreating 0 9s
View pod status in all namespaces
ubuntu@247:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system local-path-provisioner-76d776f6f9-lql4s 1/1 Running 0 110m
kube-system svclb-traefik-23008fd2-l67lp 2/2 Running 0 109m
kube-system helm-install-traefik-crd-lfjs8 0/1 Completed 0 110m
kube-system helm-install-traefik-brj48 0/1 Completed 1 110m
kube-system coredns-59b4f5bbd5-ddvjp 1/1 Running 0 110m
kube-system traefik-57c84cf78d-fhxlh 1/1 Running 0 109m
kube-system metrics-server-68cf49699b-zmrqr 1/1 Running 0 110m
cert-manager cert-manager-cainjector-7f47598f9b-rvlwj 1/1 Running 0 23m
cert-manager cert-manager-55b858df44-52ls9 1/1 Running 0 23m
cert-manager cert-manager-webhook-7d694cd764-n5vhc 1/1 Running 0 23m
cattle-system rancher-7769775dfb-gcghz 1/1 Running 0 22m
cattle-fleet-system gitjob-85b85d5df8-n74sp 1/1 Running 0 19m
cattle-fleet-system fleet-controller-775cd6657c-zxfq2 1/1 Running 0 19m
cattle-system helm-operation-mrzg8 0/2 Completed 0 19m
cattle-system helm-operation-7nqmg 0/2 Completed 0 18m
cattle-system rancher-webhook-788c48b988-82j77 1/1 Running 0 18m
cattle-system helm-operation-h82mk 0/2 Completed 0 18m
cattle-system helm-operation-sfxnn 0/2 Completed 0 17m
kube-system svclb-traefik-23008fd2-59psn 2/2 Running 0 16m
cattle-fleet-local-system fleet-agent-7f8d499f-4m4fc 1/1 Running 0 9m49s
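With this many pods, it is easier to filter for the ones that need attention. The sketch below prints only pods whose status is neither `Running` nor `Completed`, relying on the column order shown in the listing above; the function name is illustrative.

```shell
#!/bin/sh
# Print pods, across all namespaces, that are neither Running nor Completed.
# Columns of `kubectl get pods --all-namespaces --no-headers`:
#   NAMESPACE NAME READY STATUS RESTARTS AGE
list_unhealthy_pods() {
    kubectl get pods --all-namespaces --no-headers |
        awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2 " " $4 }'
}

# Usage: list_unhealthy_pods
```

An empty result means every pod is either running or has finished, which is the state shown in the listing above.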
4.4. Check the Rancher deployment
ubuntu@24:~$ kubectl -n cattle-system get deploy rancher
NAME READY UP-TO-DATE AVAILABLE AGE
rancher 1/1 1 1 5m42s
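5. Uninstall k3s
5.1. Run the uninstall script (k3s-uninstall.sh exists on server nodes, k3s-agent-uninstall.sh on agent nodes)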
sudo -s
/usr/local/bin/k3s-uninstall.sh
/usr/local/bin/k3s-agent-uninstall.sh
5.2. Delete the corresponding files and directories
rm -rf /etc/ceph /etc/cni /etc/kubernetes /etc/rancher /opt/cni /opt/rke /run/secrets/kubernetes.io /run/calico /run/flannel /var/lib/calico /var/lib/etcd /var/lib/cni /var/lib/kubelet /var/lib/rancher /var/log/containers /var/log/kube-audit /var/log/pods /var/run/calico /var/lib/longhorn
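Because `rm -rf` as root is unforgiving, it can be worth listing which of these paths actually exist on the machine before deleting anything. A small sketch, with `list_existing` as a hypothetical helper:

```shell
#!/bin/sh
# Print only those of the given paths that currently exist, one per line,
# so the list can be reviewed before running rm -rf on it.
list_existing() {
    for p in "$@"; do
        [ -e "$p" ] && echo "$p"
    done
    return 0
}

# Usage: list_existing /etc/rancher /var/lib/rancher /var/lib/kubelet /etc/cni /opt/cni
```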