
ElasticSearch on AKS–Debug a failed rollout

Installing Elasticsearch in a Kubernetes cluster is easy thanks to the available ECK (Elastic Cloud on Kubernetes) operator. Unfortunately that doesn’t guarantee a successful rollout.

Let’s see how we can find out why a deployment failed.
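For context, here is a minimal sketch of what such a setup roughly looks like. The ECK version placeholder and the node count are assumptions; only the cluster name (search), the node set name (default) and the Elasticsearch version (7.11.2) match the labels you’ll see further down in this post. By default ECK gives the elasticsearch container a 2Gi memory request and limit, which you’ll also see back in the pod description.

First install the operator (replace <version> with the ECK release you want to use; older 1.x releases ship a single all-in-one.yaml instead of the two files below):

kubectl create -f https://download.elastic.co/downloads/eck/<version>/crds.yaml

kubectl apply -f https://download.elastic.co/downloads/eck/<version>/operator.yaml

Then apply an Elasticsearch resource:

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: search
spec:
  version: 7.11.2
  nodeSets:
  - name: default
    count: 1

Remark: the operator itself also reports the overall cluster status, so a quick first check after a rollout is kubectl get elasticsearch, which shows the HEALTH and PHASE columns. In our case that doesn’t look healthy, so let’s dig deeper.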

An Elasticsearch cluster is rolled out by the operator as a statefulset. So let’s first have a look at the available statefulsets in our AKS cluster:

PS /home/bart> kubectl get statefulset


NAME                READY   AGE

search-es-default   0/1     13h

OK, we see that the statefulset is not ready. Let’s continue our investigation and check the related pods:

PS /home/bart> kubectl get pods -l elasticsearch.k8s.elastic.co/statefulset-name=search-es-default

NAME                  READY   STATUS    RESTARTS   AGE

search-es-default-0   0/1     Pending   0          13h

OK, our pod is stuck in a Pending state and not ready. Let’s drill down into the details of this pending pod:

PS /home/bart> kubectl describe pod search-es-default-0

Name:           search-es-default-0

Namespace:      default

Priority:       0

Node:           <none>

Labels:         common.k8s.elastic.co/type=elasticsearch

                controller-revision-hash=search-es-default-67676d5954

                elasticsearch.k8s.elastic.co/cluster-name=search

                elasticsearch.k8s.elastic.co/config-hash=1368074956

                elasticsearch.k8s.elastic.co/http-scheme=https

                elasticsearch.k8s.elastic.co/node-data=true

                elasticsearch.k8s.elastic.co/node-ingest=true

                elasticsearch.k8s.elastic.co/node-master=true

                elasticsearch.k8s.elastic.co/node-ml=true

                elasticsearch.k8s.elastic.co/node-remote_cluster_client=true

                elasticsearch.k8s.elastic.co/node-transform=true

                elasticsearch.k8s.elastic.co/node-voting_only=false

                elasticsearch.k8s.elastic.co/statefulset-name=search-es-default

                elasticsearch.k8s.elastic.co/version=7.11.2

                statefulset.kubernetes.io/pod-name=search-es-default-0

Annotations:    co.elastic.logs/module: elasticsearch

Status:         Pending

IP:

IPs:            <none>

Controlled By:  StatefulSet/search-es-default

Init Containers:

  elastic-internal-init-filesystem:

    Image:      docker.elastic.co/elasticsearch/elasticsearch:7.11.2

    Port:       <none>

    Host Port:  <none>

    Command:

      bash

      -c

      /mnt/elastic-internal/scripts/prepare-fs.sh

    Limits:

      cpu:     100m

      memory:  50Mi

    Requests:

      cpu:     100m

      memory:  50Mi

    Environment:

      POD_IP:                  (v1:status.podIP)

      POD_NAME:               search-es-default-0 (v1:metadata.name)

      NODE_NAME:               (v1:spec.nodeName)

      NAMESPACE:              default (v1:metadata.namespace)

      HEADLESS_SERVICE_NAME:  search-es-default

    Mounts:

      /mnt/elastic-internal/downward-api from downward-api (ro)

      /mnt/elastic-internal/elasticsearch-bin-local from elastic-internal-elasticsearch-bin-local (rw)

      /mnt/elastic-internal/elasticsearch-config from elastic-internal-elasticsearch-config (ro)

      /mnt/elastic-internal/elasticsearch-config-local from elastic-internal-elasticsearch-config-local (rw)

      /mnt/elastic-internal/elasticsearch-plugins-local from elastic-internal-elasticsearch-plugins-local (rw)

      /mnt/elastic-internal/probe-user from elastic-internal-probe-user (ro)

      /mnt/elastic-internal/scripts from elastic-internal-scripts (ro)

      /mnt/elastic-internal/transport-certificates from elastic-internal-transport-certificates (ro)

      /mnt/elastic-internal/unicast-hosts from elastic-internal-unicast-hosts (ro)

      /mnt/elastic-internal/xpack-file-realm from elastic-internal-xpack-file-realm (ro)

      /usr/share/elasticsearch/config/http-certs from elastic-internal-http-certificates (ro)

      /usr/share/elasticsearch/config/transport-remote-certs/ from elastic-internal-remote-certificate-authorities (ro)

      /usr/share/elasticsearch/data from elasticsearch-data (rw)

      /usr/share/elasticsearch/logs from elasticsearch-logs (rw)

Containers:

  elasticsearch:

    Image:       docker.elastic.co/elasticsearch/elasticsearch:7.11.2

    Ports:       9200/TCP, 9300/TCP

    Host Ports:  0/TCP, 0/TCP

    Limits:

      memory:  2Gi

    Requests:

      memory:   2Gi

    Readiness:  exec [bash -c /mnt/elastic-internal/scripts/readiness-probe-script.sh] delay=10s timeout=5s period=5s #success=1 #failure=3

    Environment:

      POD_IP:                     (v1:status.podIP)

      POD_NAME:                  search-es-default-0 (v1:metadata.name)

      NODE_NAME:                  (v1:spec.nodeName)

      NAMESPACE:                 default (v1:metadata.namespace)

      PROBE_PASSWORD_PATH:       /mnt/elastic-internal/probe-user/elastic-internal-probe

      PROBE_USERNAME:            elastic-internal-probe

      READINESS_PROBE_PROTOCOL:  https

      HEADLESS_SERVICE_NAME:     search-es-default

      NSS_SDB_USE_CACHE:         no

    Mounts:

      /mnt/elastic-internal/downward-api from downward-api (ro)

      /mnt/elastic-internal/elasticsearch-config from elastic-internal-elasticsearch-config (ro)

      /mnt/elastic-internal/probe-user from elastic-internal-probe-user (ro)

      /mnt/elastic-internal/scripts from elastic-internal-scripts (ro)

      /mnt/elastic-internal/unicast-hosts from elastic-internal-unicast-hosts (ro)

      /mnt/elastic-internal/xpack-file-realm from elastic-internal-xpack-file-realm (ro)

      /usr/share/elasticsearch/bin from elastic-internal-elasticsearch-bin-local (rw)

      /usr/share/elasticsearch/config from elastic-internal-elasticsearch-config-local (rw)

      /usr/share/elasticsearch/config/http-certs from elastic-internal-http-certificates (ro)

      /usr/share/elasticsearch/config/transport-certs from elastic-internal-transport-certificates (ro)

      /usr/share/elasticsearch/config/transport-remote-certs/ from elastic-internal-remote-certificate-authorities (ro)

      /usr/share/elasticsearch/data from elasticsearch-data (rw)

      /usr/share/elasticsearch/logs from elasticsearch-logs (rw)

      /usr/share/elasticsearch/plugins from elastic-internal-elasticsearch-plugins-local (rw)

Conditions:

  Type           Status

  PodScheduled   False

Volumes:

  elasticsearch-data:

    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)

    ClaimName:  elasticsearch-data-search-es-default-0

    ReadOnly:   false

  downward-api:

    Type:  DownwardAPI (a volume populated by information about the pod)

    Items:

      metadata.labels -> labels

  elastic-internal-elasticsearch-bin-local:

    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)

    Medium:

    SizeLimit:  <unset>

  elastic-internal-elasticsearch-config:

    Type:        Secret (a volume populated by a Secret)

    SecretName:  search-es-default-es-config

    Optional:    false

  elastic-internal-elasticsearch-config-local:

    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)

    Medium:

    SizeLimit:  <unset>

  elastic-internal-elasticsearch-plugins-local:

    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)

    Medium:

    SizeLimit:  <unset>

  elastic-internal-http-certificates:

    Type:        Secret (a volume populated by a Secret)

    SecretName:  search-es-http-certs-internal

    Optional:    false

  elastic-internal-probe-user:

    Type:        Secret (a volume populated by a Secret)

    SecretName:  search-es-internal-users

    Optional:    false

  elastic-internal-remote-certificate-authorities:

    Type:        Secret (a volume populated by a Secret)

    SecretName:  search-es-remote-ca

    Optional:    false

  elastic-internal-scripts:

    Type:      ConfigMap (a volume populated by a ConfigMap)

    Name:      search-es-scripts

    Optional:  false

  elastic-internal-transport-certificates:

    Type:        Secret (a volume populated by a Secret)

    SecretName:  search-es-default-es-transport-certs

    Optional:    false

  elastic-internal-unicast-hosts:

    Type:      ConfigMap (a volume populated by a ConfigMap)

    Name:      search-es-unicast-hosts

    Optional:  false

  elastic-internal-xpack-file-realm:

    Type:        Secret (a volume populated by a Secret)

    SecretName:  search-es-xpack-file-realm

    Optional:    false

  elasticsearch-logs:

    Type:        EmptyDir (a temporary directory that shares a pod's lifetime)

    Medium:

    SizeLimit:   <unset>

QoS Class:       Burstable

Node-Selectors:  <none>

Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule op=Exists

                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s

                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s

Events:

  Type     Reason            Age   From               Message

  ----     ------            ----  ----               -------

  Warning  FailedScheduling  13h   default-scheduler  0/3 nodes are available: 1 Insufficient memory, 2 node(s) had volume node affinity conflict.

Wow! That is a lot of information. Luckily the important part is at the very end, in the Events section: 0/3 nodes are available: 1 Insufficient memory, 2 node(s) had volume node affinity conflict. In other words, one node doesn’t have enough free (allocatable) memory left for this pod, and the other two nodes can’t be used because the persistent volume is pinned, through node affinity, to a zone or node they are not part of. So the Kubernetes scheduler has nowhere to place the pod. Higher up in the output you can also see where the memory requirement comes from: the elasticsearch container requests (and is limited to) 2Gi of memory.
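If you want to verify both causes yourself, this is one possible way to do it. The <node-name> and <volume-name> placeholders are of course your own values; the PVC name comes from the pod description above:

kubectl describe node <node-name>

Check the Allocatable section and the Allocated resources section at the bottom: if the memory that is already requested plus the 2Gi our pod asks for exceeds the allocatable memory, the node is rejected with Insufficient memory.

kubectl get pvc elasticsearch-data-search-es-default-0

kubectl describe pv <volume-name>

Check the Node Affinity section of the persistent volume: an Azure disk is typically pinned to one availability zone, so only nodes in that zone can mount it and all others are rejected with a volume node affinity conflict.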

Time to extend your node pool…
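If you manage the node pool through the Azure CLI, scaling it out or adding a bigger pool looks roughly like this; the resource group, cluster and pool names are placeholders for your own environment:

az aks nodepool scale --resource-group <resource-group> --cluster-name <cluster> --name <nodepool> --node-count 4

az aks nodepool add --resource-group <resource-group> --cluster-name <cluster> --name espool --node-count 3 --node-vm-size Standard_D4s_v3 --zones 1 2 3

Picking a larger --node-vm-size fixes the Insufficient memory part, and making sure the (new) nodes run in the same availability zone as the existing data disk avoids the volume node affinity conflict.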
