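kubeadm is a tool that provides kubeadm init and kubeadm join as best-practice "fast paths" for creating Kubernetes clusters. kubeadm performs the actions needed to get a minimum viable cluster up and running. By design, it cares only about bootstrapping, not about provisioning machines. Using kubeadm as the basis of all deployments makes it easier to create consistent clusters.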

Server information

  • CPU: 2 cores, Intel(R) Xeon(R) Platinum 8255C CPU @ 2.50GHz
  • Memory: 4 GB
  • OS: Ubuntu 18.04.1 LTS
  • IP address: 10.206.0.4

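Make sure every machine in the cluster has at least 2 GB of memory; with less, there is little room left for your applications.
The master (control-plane) node needs at least 2 CPU cores.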
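
The two minimums above can be checked before installing anything. The following is a minimal sketch, assuming a Linux host with /proc/meminfo and nproc; the function name meets_minimum is made up for illustration. A machine sold as "2 GB" often reports slightly less than 2048 MB, so the threshold is passed in as an argument and may need lowering.

```shell
# Return 0 when the host meets the given minimums, 1 otherwise.
# Arguments: memory in MB, CPU count, required MB, required CPUs.
meets_minimum() {
    mem_mb=$1; cpus=$2; need_mb=$3; need_cpus=$4
    [ "$mem_mb" -ge "$need_mb" ] && [ "$cpus" -ge "$need_cpus" ]
}

# Read the real values on Linux; they stay empty if unavailable.
host_mem_mb=$(awk '/^MemTotal:/ {print int($2/1024)}' /proc/meminfo 2>/dev/null)
host_cpus=$(nproc 2>/dev/null)

if meets_minimum "${host_mem_mb:-0}" "${host_cpus:-0}" 2048 2; then
    echo "OK: ${host_mem_mb} MB RAM, ${host_cpus} CPUs"
else
    echo "Below minimum: ${host_mem_mb:-?} MB RAM, ${host_cpus:-?} CPUs"
fi
```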

Install Docker

  • Install the dependencies

    sudo apt-get update
    sudo apt-get -y install apt-transport-https ca-certificates curl software-properties-common
    
  • Add the GPG key

    curl -fsSL https://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
    
  • Add the package repository

    sudo add-apt-repository "deb [arch=amd64] https://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable"
    
  • Install Docker

    sudo apt-get update
    sudo apt-get -y install docker-ce
    
  • Start Docker

    sudo systemctl start docker
    
  • Enable Docker at boot

    sudo systemctl enable docker
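
An optional extra step, not part of the original walkthrough: kubeadm releases of this era print a preflight warning when Docker runs with the cgroupfs cgroup driver and recommend systemd instead. The sketch below only prints a candidate /etc/docker/daemon.json; the helper name docker_daemon_json is hypothetical, and the commented commands would overwrite an existing daemon.json, so merge by hand if you already have one.

```shell
# Print a daemon.json that switches Docker to the systemd cgroup driver.
docker_daemon_json() {
    cat <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
}

# Preview the file; nothing is written at this point.
docker_daemon_json

# To apply it (this overwrites any existing /etc/docker/daemon.json):
#   docker_daemon_json | sudo tee /etc/docker/daemon.json
#   sudo systemctl restart docker
#   sudo docker info | grep -i 'cgroup driver'
```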
    

Install kubeadm, kubelet and kubectl

kubeadm is a tool for quickly creating Kubernetes clusters and greatly simplifies their deployment. For more about kubeadm, see the kubeadm documentation.

  • kubeadm: bootstraps the cluster.
  • kubelet: runs on every node in the cluster and starts Pods and containers.
  • kubectl: the command-line tool for talking to the cluster.

  • Add the GPG key

    curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | sudo apt-key add -
    
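  • Add the package repository (the redirection has to run as root, so pipe into sudo tee rather than using sudo echo ... >)

    echo "deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main" | sudo tee /etc/apt/sources.list.d/kubernetes.list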
    
  • Install kubelet, kubeadm and kubectl

    sudo apt-get update
    sudo apt-get install -y kubelet kubeadm kubectl
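
The command above installs whatever version is newest in the repository. To reproduce the v1.17.1 cluster shown later, the packages can be requested by version and then held so a routine upgrade does not move them. This is a sketch: the helper k8s_pkg_specs is hypothetical, and the "-00" revision suffix is an assumption about how this apt repository labels its packages.

```shell
# Build "name=version" package specs for one Kubernetes release.
# The "-00" suffix is an assumption; confirm with: apt-cache madison kubeadm
k8s_pkg_specs() {
    echo "kubelet=$1-00 kubeadm=$1-00 kubectl=$1-00"
}

# Preview the specs for the release used in this article.
k8s_pkg_specs 1.17.1

# To install and pin:
#   sudo apt-get install -y $(k8s_pkg_specs 1.17.1)
#   sudo apt-mark hold kubelet kubeadm kubectl
```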
    
  • Start kubelet

    sudo systemctl start kubelet
    
  • Enable kubelet at boot

    sudo systemctl enable kubelet
    

Initialize the Kubernetes cluster

  • Disable swap

    sudo swapoff -a
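
swapoff -a only lasts until the next reboot. To keep swap off permanently, the swap entries in /etc/fstab also have to be commented out. Below is a minimal sketch that works on a sample file rather than the real fstab; the function name comment_out_swap is made up for illustration.

```shell
# Print FILE with every active swap entry commented out.
# In fstab the third field is the filesystem type.
comment_out_swap() {
    awk '$1 !~ /^#/ && $3 == "swap" { print "#" $0; next } { print }' "$1"
}

# Demonstrate on a sample file instead of the real /etc/fstab.
sample=$(mktemp)
printf '%s\n' \
    'UUID=abcd-1234 / ext4 defaults 0 1' \
    '/swapfile none swap sw 0 0' > "$sample"
comment_out_swap "$sample"
rm -f "$sample"

# To apply for real, keeping a backup:
#   sudo cp /etc/fstab /etc/fstab.bak
#   comment_out_swap /etc/fstab.bak | sudo tee /etc/fstab
```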
    
  • Pull the gcr images

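    The Docker images Kubernetes uses are hosted on gcr.io, which cannot be reached directly from mainland China. You can test connectivity to gcr.io by running sudo kubeadm config images pull. Most likely you will get an error like this: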

    failed to pull image "k8s.gcr.io/kube-apiserver:v1.17.1": output: Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
    , error: exit status 1
    To see the stack trace of this error execute with --v=5 or higher
    

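    In short, the connection timed out and the registry is unreachable. The workaround is to download the images from a domestic mirror and then retag them with their gcr names. I wrote a script that pulls the matching images from the Alibaba Cloud registry and tags them as k8s.gcr.io images; just run: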

    curl -sSL https://get.k8s.devzhou.cn/gcr | sudo sh
    

    List the local images:

    sudo docker images
    

    The output looks like this:

    REPOSITORY                           TAG                 IMAGE ID            CREATED             SIZE
    k8s.gcr.io/kube-controller-manager   v1.17.1             5dd8f24429b4        3 days ago          161MB
    k8s.gcr.io/kube-apiserver            v1.17.1             628f0e52ae53        3 days ago          171MB
    k8s.gcr.io/kube-scheduler            v1.17.1             8d2e2e5a92ac        3 days ago          94.4MB
    k8s.gcr.io/kube-proxy                v1.17.1             87a399dffea6        3 days ago          116MB
    k8s.gcr.io/coredns                   1.6.5               70f311871ae1        2 months ago        41.6MB
    k8s.gcr.io/etcd                      3.4.3-0             303ce5db0e90        2 months ago        288MB
    k8s.gcr.io/pause                     3.1                 da86e6ba6ca1        2 years ago         742kB
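
For readers who would rather not pipe a remote script into sudo sh, the pull-and-retag idea can be sketched by hand. This is not the author's script: the mirror namespace registry.aliyuncs.com/google_containers and the helper names are assumptions, and the loop below only prints the docker commands it would run.

```shell
# Assumed mirror namespace; replace it with whichever registry you trust.
MIRROR=${MIRROR:-registry.aliyuncs.com/google_containers}

# Map k8s.gcr.io/NAME:TAG to MIRROR/NAME:TAG.
mirror_name() {
    echo "$MIRROR/${1##*/}"
}

# Print the docker commands that pull one image from the mirror,
# retag it under its original name, and drop the mirror tag.
retag_commands() {
    src=$(mirror_name "$1")
    echo "docker pull $src"
    echo "docker tag $src $1"
    echo "docker rmi $src"
}

# Dry run for two of the images listed above.
for image in k8s.gcr.io/kube-apiserver:v1.17.1 k8s.gcr.io/pause:3.1; do
    retag_commands "$image"
done

# Real run, over everything kubeadm needs:
#   kubeadm config images list | while read -r image; do
#       retag_commands "$image" | sudo sh
#   done
```

kubeadm also accepts an --image-repository option on both kubeadm config images pull and kubeadm init, which may make the retagging step unnecessary.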
    
  • Initialize the cluster

    sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=10.206.0.4
    

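    Because the flannel network plugin is installed later, the cluster must be initialized with --pod-network-cidr=10.244.0.0/16, the Pod subnet that flannel's default manifest expects.
    --apiserver-advertise-address is the address the API server advertises to the other members of the cluster. It is also the address used to build the kubeadm join command during init. If it is unset (or set to 0.0.0.0), the IP address of the default network interface is used. Here I set it to the server's IP address.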
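
The Pod network should not overlap with the network the hosts themselves live on. Here is a small sketch, in plain shell arithmetic, that checks whether an IPv4 address falls inside a CIDR block; the helper names ip_to_int and in_cidr are made up for illustration, and the values are the ones used in this article.

```shell
# Convert a dotted IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_cidr IP NETWORK/PREFIX: return 0 if IP lies inside the block.
in_cidr() {
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
    [ $(( $ip & $mask )) -eq $(( $net & $mask )) ]
}

# Node address and Pod CIDR from the kubeadm init command above.
if in_cidr 10.206.0.4 10.244.0.0/16; then
    echo "conflict: node address is inside the Pod CIDR"
else
    echo "ok: node address is outside the Pod CIDR"
fi
```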

    When initialization succeeds, kubeadm prints the following:

    Your Kubernetes control-plane has initialized successfully!
    
    To start using your cluster, you need to run the following as a regular user:
    
      mkdir -p $HOME/.kube
      sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
      sudo chown $(id -u):$(id -g) $HOME/.kube/config
    
    You should now deploy a pod network to the cluster.
    Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
      https://kubernetes.io/docs/concepts/cluster-administration/addons/
    
    Then you can join any number of worker nodes by running the following on each as root:
    
    kubeadm join 10.206.0.4:6443 --token 7xh44c.qktgi7gufu0tk4x7 \
        --discovery-token-ca-cert-hash sha256:e718166f485fb8f6b56a3fcdc886b9e67fb4d384f9b425a599f458c6a8e5b8bd
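
The token in the join command is a bootstrap token and expires after 24 hours by default, so a join command saved from this output may stop working. A minimal sketch, assuming the usual token layout of six characters, a dot, and sixteen characters; the function name valid_token is hypothetical.

```shell
# Return 0 if the argument looks like a kubeadm bootstrap token:
# six characters, a dot, then sixteen characters, all from [a-z0-9].
valid_token() {
    echo "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# The token from the output above.
if valid_token 7xh44c.qktgi7gufu0tk4x7; then
    echo "token format looks right"
fi

# On the master node, to list existing tokens or print a fresh join command:
#   sudo kubeadm token list
#   sudo kubeadm token create --print-join-command
```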
    
  • Check the cluster

    sudo kubectl get nodes
    

    If you see The connection to the server localhost:8080 was refused - did you specify the right host or port?, kubectl has not found a kubeconfig file. Set one up as follows:

    mkdir -p $HOME/.kube
    sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config
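
The localhost:8080 error appears because kubectl falls back to that address when it finds no kubeconfig. The sketch below shows, in simplified form, where kubectl looks; the function name kubeconfig_path is made up, and the --kubeconfig flag and multi-file KUBECONFIG lists are deliberately ignored.

```shell
# Print the kubeconfig file kubectl is expected to read: $KUBECONFIG when
# set, otherwise $HOME/.kube/config. (Simplified: KUBECONFIG may also hold
# a colon-separated list, which this sketch does not split.)
kubeconfig_path() {
    echo "${KUBECONFIG:-$HOME/.kube/config}"
}

cfg=$(kubeconfig_path)
if [ -r "$cfg" ]; then
    echo "kubectl will use $cfg"
else
    echo "no readable kubeconfig at $cfg"
fi

# As root, the admin file can be used directly instead of copying it:
#   export KUBECONFIG=/etc/kubernetes/admin.conf
```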
    

    Run sudo kubectl get nodes again. The output is:

    NAME            STATUS     ROLES    AGE     VERSION
    vm-0-4-ubuntu   NotReady   master   6m34s   v1.17.1
    

    The node status is NotReady. Run the following command to check the Pods:

    sudo kubectl get pods -n kube-system -o wide
    

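    The output is shown below. The coredns Pods stay in Pending, which is expected by design: Pod networking has to be set up before CoreDNS can be fully deployed, so the DNS component remains Pending until the network is configured.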

    NAME                                    READY   STATUS    RESTARTS   AGE     IP           NODE            NOMINATED NODE   READINESS GATES
    coredns-6955765f44-bsqmw                0/1     Pending   0          9m49s   <none>       <none>          <none>           <none>
    coredns-6955765f44-sdgws                0/1     Pending   0          9m49s   <none>       <none>          <none>           <none>
    etcd-vm-0-4-ubuntu                      1/1     Running   0          9m44s   10.206.0.4   vm-0-4-ubuntu   <none>           <none>
    kube-apiserver-vm-0-4-ubuntu            1/1     Running   0          9m44s   10.206.0.4   vm-0-4-ubuntu   <none>           <none>
    kube-controller-manager-vm-0-4-ubuntu   1/1     Running   0          9m44s   10.206.0.4   vm-0-4-ubuntu   <none>           <none>
    kube-proxy-lfrq6                        1/1     Running   0          9m49s   10.206.0.4   vm-0-4-ubuntu   <none>           <none>
    kube-scheduler-vm-0-4-ubuntu            1/1     Running   0          9m44s   10.206.0.4   vm-0-4-ubuntu   <none>           <none>
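
On a longer listing it can help to filter for the Pods that are not yet Running. A minimal sketch, assuming the --no-headers output format where the third column is STATUS; the function name not_running is made up for illustration.

```shell
# Read `kubectl get pods --no-headers` output on stdin and print the name
# of every Pod whose STATUS column (field 3) is not "Running".
not_running() {
    awk '$3 != "Running" { print $1 }'
}

# Two rows copied from the listing above.
printf '%s\n' \
    'coredns-6955765f44-bsqmw 0/1 Pending 0 9m49s' \
    'etcd-vm-0-4-ubuntu 1/1 Running 0 9m44s' | not_running

# Against a live cluster:
#   sudo kubectl get pods -n kube-system --no-headers | not_running
```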
    
  • Install a Pod network plugin

    Using flannel as the example:

    sudo kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
    

    For more about flannel, see the flannel project documentation.

    The output is:

    podsecuritypolicy.policy/psp.flannel.unprivileged created
    clusterrole.rbac.authorization.k8s.io/flannel created
    clusterrolebinding.rbac.authorization.k8s.io/flannel created
    serviceaccount/flannel created
    configmap/kube-flannel-cfg created
    daemonset.apps/kube-flannel-ds-amd64 created
    daemonset.apps/kube-flannel-ds-arm64 created
    daemonset.apps/kube-flannel-ds-arm created
    daemonset.apps/kube-flannel-ds-ppc64le created
    daemonset.apps/kube-flannel-ds-s390x created
    

    Run sudo kubectl get pods -n kube-system -o wide again and make sure every Pod is Running:

    NAMESPACE     NAME                                    READY   STATUS    RESTARTS   AGE   IP           NODE            NOMINATED NODE   READINESS GATES
    kube-system   coredns-6955765f44-bsqmw                1/1     Running   0          21m   10.244.0.2   vm-0-4-ubuntu   <none>           <none>
    kube-system   coredns-6955765f44-sdgws                1/1     Running   0          21m   10.244.0.3   vm-0-4-ubuntu   <none>           <none>
    kube-system   etcd-vm-0-4-ubuntu                      1/1     Running   0          21m   10.206.0.4   vm-0-4-ubuntu   <none>           <none>
    kube-system   kube-apiserver-vm-0-4-ubuntu            1/1     Running   0          21m   10.206.0.4   vm-0-4-ubuntu   <none>           <none>
    kube-system   kube-controller-manager-vm-0-4-ubuntu   1/1     Running   0          21m   10.206.0.4   vm-0-4-ubuntu   <none>           <none>
    kube-system   kube-flannel-ds-amd64-vm742             1/1     Running   0          81s   10.206.0.4   vm-0-4-ubuntu   <none>           <none>
    kube-system   kube-proxy-lfrq6                        1/1     Running   0          21m   10.206.0.4   vm-0-4-ubuntu   <none>           <none>
    kube-system   kube-scheduler-vm-0-4-ubuntu            1/1     Running   0          21m   10.206.0.4   vm-0-4-ubuntu   <none>           <none>
    

    Run sudo kubectl get nodes once more to check the cluster:

    NAME            STATUS   ROLES    AGE   VERSION
    vm-0-4-ubuntu   Ready    master   24m   v1.17.1
    

    The node status is now Ready and its role is master. The single-node cluster is up.
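
Instead of re-running kubectl get nodes by hand, the Ready check can be scripted. A minimal sketch, assuming the --no-headers output format where the second column is STATUS; the function name all_nodes_ready is made up for illustration.

```shell
# Succeed only if stdin (from `kubectl get nodes --no-headers`) has at least
# one node and every node's STATUS column (field 2) is exactly "Ready".
all_nodes_ready() {
    awk '$2 != "Ready" { bad = 1 } END { exit ((NR == 0 || bad) ? 1 : 0) }'
}

# The row from the listing above.
if echo 'vm-0-4-ubuntu Ready master 24m v1.17.1' | all_nodes_ready; then
    echo "all nodes Ready"
fi

# Polling a live cluster, giving up after about five minutes:
#   tries=0
#   until sudo kubectl get nodes --no-headers | all_nodes_ready; do
#       tries=$((tries + 1))
#       [ "$tries" -ge 60 ] && break
#       sleep 5
#   done
```

kubectl can also block on this itself with kubectl wait --for=condition=Ready node --all --timeout=300s, which may be simpler than parsing text output.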


Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License
