Troubleshooting CRI-O container runtime issues

About CRI-O container runtime engine

CRI-O is a Kubernetes-native container engine implementation that integrates closely with the operating system to deliver an efficient and optimized Kubernetes experience. The CRI-O container engine runs as a systemd service on each OKD cluster node.

When container runtime issues occur, verify the status of the crio systemd service on each node. Gather CRI-O journald unit logs from nodes that have container runtime issues.

Verifying CRI-O runtime engine status

You can verify CRI-O container runtime engine status on each cluster node.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.

  • You have installed the OpenShift CLI (oc).

Procedure

  1. Review CRI-O status by querying the crio systemd service on a node, within a debug pod.

    1. Start a debug pod for a node:

      1. $ oc debug node/my-node
    2. Set /host as the root directory within the debug shell. The debug pod mounts the host’s root file system in /host within the pod. By changing the root directory to /host, you can run binaries contained in the host’s executable paths:

      1. # chroot /host

      OKD 4.12 cluster nodes running Fedora CoreOS (FCOS) are immutable and rely on Operators to apply cluster changes. Accessing cluster nodes using SSH is not recommended and nodes will be tainted as accessed. However, if the OKD API is not available, or the kubelet is not properly functioning on the target node, oc operations will be impacted. In such situations, it is possible to access nodes using ssh core@<node>.<cluster_name>.<base_domain> instead.

    3. Check whether the crio systemd service is active on the node:

      1. # systemctl is-active crio
    4. Output a more detailed crio.service status summary:

      1. # systemctl status crio.service

Gathering CRI-O journald unit logs

If you experience CRI-O issues, you can obtain CRI-O journald unit logs from a node.

Prerequisites

  • You have access to the cluster as a user with the cluster-admin role.

  • Your API service is still functional.

  • You have installed the OpenShift CLI (oc).

  • You have the fully qualified domain names of the control plane or control plane machines.

Procedure

  1. Gather CRI-O journald unit logs. The following example collects logs from all control plane nodes (within the cluster:

    1. $ oc adm node-logs --role=master -u crio
  2. Gather CRI-O journald unit logs from a specific node:

    1. $ oc adm node-logs <node_name> -u crio
  3. If the API is not functional, review the logs using SSH instead. Replace <node>.<cluster_name>.<base_domain> with appropriate values:

    1. $ ssh core@<node>.<cluster_name>.<base_domain> journalctl -b -f -u crio.service

    OKD 4.12 cluster nodes running Fedora CoreOS (FCOS) are immutable and rely on Operators to apply cluster changes. Accessing cluster nodes using SSH is not recommended and nodes will be tainted as accessed. Before attempting to collect diagnostic data over SSH, review whether the data collected by running oc adm must gather and other oc commands is sufficient instead. However, if the OKD API is not available, or the kubelet is not properly functioning on the target node, oc operations will be impacted. In such situations, it is possible to access nodes using ssh core@<node>.<cluster_name>.<base_domain>.

Cleaning CRI-O storage

You can manually clear the CRI-O ephemeral storage if you experience the following issues:

  • A node cannot run on any pods and this error appears:

    1. Failed to create pod sandbox: rpc error: code = Unknown desc = failed to mount container XXX: error recreating the missing symlinks: error reading name of symlink for XXX: open /var/lib/containers/storage/overlay/XXX/link: no such file or directory
  • You cannot create a new container on a working node and the “can’t stat lower layer” error appears:

    1. can't stat lower layer ... because it does not exist. Going through storage to recreate the missing symlinks.
  • Your node is in the NotReady state after a cluster upgrade or if you attempt to reboot it.

  • The container runtime implementation (crio) is not working properly.

  • You are unable to start a debug shell on the node using oc debug node/<nodename> because the container runtime instance (crio) is not working.

Follow this process to completely wipe the CRI-O storage and resolve the errors.

Prerequisites:

  • You have access to the cluster as a user with the cluster-admin role.

  • You have installed the OpenShift CLI (oc).

Procedure

  1. Use cordon on the node. This is to avoid any workload getting scheduled if the node gets into the Ready status. You will know that scheduling is disabled when SchedulingDisabled is in your Status section:

    1. $ oc adm cordon <nodename>
  2. Drain the node as the cluster-admin user:

    1. $ oc adm drain <nodename> --ignore-daemonsets --delete-emptydir-data

    The terminationGracePeriodSeconds attribute of a pod or pod template controls the graceful termination period. This attribute defaults at 30 seconds, but can be customized per application as necessary. If set to more than 90 seconds, the pod might be marked as SIGKILLed and fail to terminate successfully.

  3. When the node returns, connect back to the node via SSH or Console. Then connect to the root user:

    1. $ ssh core@node1.example.com
    2. $ sudo -i
  4. Manually stop the kubelet:

    1. # systemctl stop kubelet
  5. Stop the containers and pods:

    1. Use the following command to stop the pods that are not in the HostNetwork. They must be removed first because their removal relies on the networking plugin pods, which are in the HostNetwork.

      1. .. for pod in $(crictl pods -q); do if [[ "$(crictl inspectp $pod | jq -r .status.linux.namespaces.options.network)" != "NODE" ]]; then crictl rmp -f $pod; fi; done
    2. Stop all other pods:

      1. # crictl rmp -fa
  6. Manually stop the crio services:

    1. # systemctl stop crio
  7. After you run those commands, you can completely wipe the ephemeral storage:

    1. # crio wipe -f
  8. Start the crio and kubelet service:

    1. # systemctl start crio
    2. # systemctl start kubelet
  9. You will know if the clean up worked if the crio and kubelet services are started, and the node is in the Ready status:

    1. $ oc get nodes

    Example output

    1. NAME STATUS ROLES AGE VERSION
    2. ci-ln-tkbxyft-f76d1-nvwhr-master-1 Ready, SchedulingDisabled master 133m v1.25.0
  10. Mark the node schedulable. You will know that the scheduling is enabled when SchedulingDisabled is no longer in status:

    1. $ oc adm uncordon <nodename>

    Example output

    1. NAME STATUS ROLES AGE VERSION
    2. ci-ln-tkbxyft-f76d1-nvwhr-master-1 Ready master 133m v1.25.0