Friday, July 24, 2020

Kubernetes: Troubleshooting ImagePullBackOff errors

After deploying a new pod, I couldn’t access it. Time to take a look at what was going on:

C:\Users\BaWu\Desktop>kubectl get pods --all-namespaces
NAMESPACE              NAME                                                              READY   STATUS             RESTARTS   AGE
kube-system            addon-http-application-routing-default-http-backend-7fc6fc27bj2   1/1     Running            0          93m
kube-system            addon-http-application-routing-external-dns-6c6465cf6f-hqn2w      1/1     Running            0          93m
kube-system            addon-http-application-routing-nginx-ingress-controller-668m4rb   1/1     Running            0          93m
kube-system            azure-cni-networkmonitor-cn57j                                    1/1     Running            0          10d
kube-system            azure-ip-masq-agent-4sjmw                                         1/1     Running            0          10d
kube-system            coredns-544d979687-5c7rt                                          1/1     Running            0          10d
kube-system            coredns-544d979687-rbbh9                                          1/1     Running            0          10d
kube-system            coredns-autoscaler-78959b4578-jdr24                               1/1     Running            0          10d
kube-system            dashboard-metrics-scraper-5f44bbb8b5-dfw47                        1/1     Running            0          10d
kube-system            kube-proxy-8d5sr                                                  1/1     Running            0          10d
kube-system            kubernetes-dashboard-785654f667-2gcbn                             1/1     Running            0          10d
kube-system            metrics-server-85c57978c6-bzsc2                                   1/1     Running            0          10d
kube-system            omsagent-rs-5f579fcfd-9pqpf                                       0/1     ImagePullBackOff   0          2d10h
kube-system            omsagent-rs-6b6cdf78fc-26mpb                                      1/1     Running            1531       10d
kube-system            omsagent-wfgtp                                                    0/1     ImagePullBackOff   0          2d10h
kube-system            tunnelfront-f7bd7ccb-t7g95                                        2/2     Running            1          6d20h
kubernetes-dashboard   dashboard-metrics-scraper-c79c65bb7-w9thj                         0/1     ImagePullBackOff   0          27m
kubernetes-dashboard   kubernetes-dashboard-56484d4c5-c4cwv                              0/1     ImagePullBackOff   0          27m
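
In a listing this long, the failing pods are easy to overlook. As a minimal sketch (not part of my original troubleshooting), the following Python snippet filters such a listing down to the pods whose STATUS is something other than Running or Completed. The function name failing_pods and the shortened SAMPLE text are my own choices for illustration; the rows in SAMPLE are taken from the output above.

```python
from __future__ import annotations


def failing_pods(kubectl_output: str) -> list[tuple[str, str, str]]:
    """Return (namespace, name, status) for pods that are not Running or Completed.

    Assumes the column layout of `kubectl get pods --all-namespaces`:
    NAMESPACE NAME READY STATUS RESTARTS AGE
    """
    healthy = {"Running", "Completed"}
    failing = []
    for line in kubectl_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and anything that is not a pod row.
        if len(fields) < 6 or fields[0] == "NAMESPACE":
            continue
        namespace, name, _ready, status = fields[:4]
        if status not in healthy:
            failing.append((namespace, name, status))
    return failing


# A shortened excerpt of the listing above, used as sample input.
SAMPLE = """\
NAMESPACE              NAME                                   READY   STATUS             RESTARTS   AGE
kube-system            coredns-544d979687-5c7rt               1/1     Running            0          10d
kube-system            omsagent-wfgtp                         0/1     ImagePullBackOff   0          2d10h
kubernetes-dashboard   kubernetes-dashboard-56484d4c5-c4cwv   0/1     ImagePullBackOff   0          27m
"""

for namespace, name, status in failing_pods(SAMPLE):
    print(f"{namespace}/{name}: {status}")
```

Against a live cluster, the input text could come from subprocess.run(["kubectl", "get", "pods", "--all-namespaces"], capture_output=True, text=True).stdout instead of SAMPLE. Note that this sketch only looks at the STATUS column, so a pod that is Running but not ready would not be reported.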

The pods of the deployment turned out to be stuck in the ImagePullBackOff state. This can have various causes. Let’s figure out which one applies by calling kubectl describe on one of the pods, which returns a lot of extra information:

C:\Users\BaWu\Desktop>kubectl describe pod kubernetes-dashboard-56484d4c5-c4cwv --namespace=kubernetes-dashboard
Name:         kubernetes-dashboard-56484d4c5-c4cwv
Namespace:    kubernetes-dashboard
Priority:     0
Node:         aks-agentpool-27676582-vmss000000/10.9.1.5
Start Time:   Fri, 24 Jul 2020 12:53:52 +0200
Labels:       k8s-app=kubernetes-dashboard
              pod-template-hash=56484d4c5
Annotations:  <none>
Status:       Pending
IP:           10.9.1.21
IPs:
  IP:           10.9.1.21
Controlled By:  ReplicaSet/kubernetes-dashboard-56484d4c5
Containers:
  kubernetes-dashboard:
    Container ID:
    Image:         kubernetesui/dashboard:v2.0.0
    Image ID:
    Port:          8443/TCP
    Host Port:     0/TCP
    Args:
      --auto-generate-certificates
      --namespace=kubernetes-dashboard
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Liveness:       http-get https://:8443/ delay=30s timeout=30s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /certs from kubernetes-dashboard-certs (rw)
      /tmp from tmp-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kubernetes-dashboard-token-bxq7s (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kubernetes-dashboard-certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  kubernetes-dashboard-certs
    Optional:    false
  tmp-volume:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  kubernetes-dashboard-token-bxq7s:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  kubernetes-dashboard-token-bxq7s
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     node-role.kubernetes.io/master:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                  From                                        Message
  ----     ------     ----                 ----                                        -------
  Normal   Scheduled  30m                  default-scheduler                           Successfully assigned kubernetes-dashboard/kubernetes-dashboard-56484d4c5-c4cwv to aks-agentpool-27676582-vmss000000
  Normal   Pulling    28m (x4 over 30m)    kubelet, aks-agentpool-27676582-vmss000000  Pulling image "kubernetesui/dashboard:v2.0.0"
  Warning  Failed     28m (x4 over 30m)    kubelet, aks-agentpool-27676582-vmss000000  Failed to pull image "kubernetesui/dashboard:v2.0.0": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
  Warning  Failed     28m (x4 over 30m)    kubelet, aks-agentpool-27676582-vmss000000  Error: ErrImagePull
  Normal   BackOff    27m (x6 over 30m)    kubelet, aks-agentpool-27676582-vmss000000  Back-off pulling image "kubernetesui/dashboard:v2.0.0"
  Warning  Failed     26s (x117 over 30m)  kubelet, aks-agentpool-27676582-vmss000000  Error: ImagePullBackOff

This indicates that the cluster nodes cannot reach Docker Hub at https://registry-1.docker.io/v2/; the request times out while waiting for a connection. This makes sense, as I had only configured a trust with ACR (Azure Container Registry).
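
Why Docker Hub at all? The image in the describe output, kubernetesui/dashboard:v2.0.0, names no registry, and such references default to Docker Hub, whose pull endpoint is registry-1.docker.io. The following is a minimal sketch of that defaulting rule, not the container runtime's actual code. The function name registry_host is my own, and myregistry.azurecr.io is a hypothetical ACR name, not the registry of the cluster above.

```python
def registry_host(image: str) -> str:
    """Return the registry an image reference points at.

    Simplified version of Docker's defaulting rule: the first path
    component is a registry only if it looks like a host name.
    """
    first, slash, _rest = image.partition("/")
    # A host name contains a dot or a port separator, or is "localhost".
    if slash and ("." in first or ":" in first or first == "localhost"):
        return first
    # Anything else is shorthand for an image on Docker Hub.
    return "docker.io"


# The image from the describe output: no registry prefix, so Docker Hub.
print(registry_host("kubernetesui/dashboard:v2.0.0"))
# docker.io

# Hypothetical reference to a copy of the image in an ACR instance.
print(registry_host("myregistry.azurecr.io/kubernetesui/dashboard:v2.0.0"))
# myregistry.azurecr.io
```

An image reference that starts with the ACR login server would be pulled from ACR, the only registry this cluster is set up to trust, instead of from Docker Hub.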