Verifying the Health of OpenShift Nodes
The following commands display information about the status and health of nodes in an OpenShift cluster:
Display a column with the status of each node. If a node is not Ready, then it cannot communicate with the OpenShift control plane and is effectively unavailable to the cluster.
PS C:\Users\suyash.sambhare> oc get nodes
NAME STATUS ROLES AGE VERSION
ocpmn01.ocpcl.suyi.local Ready control-plane,master,worker 108d v1.25.8+37a9a08
ocpmn02.ocpcl.suyi.local Ready control-plane,master,worker 108d v1.25.8+37a9a08
ocpmn03.ocpcl.suyi.local Ready control-plane,master,worker 108d v1.25.8+37a9a08
ocpwn01.ocpcl.suyi.local Ready app,worker 107d v1.25.8+37a9a08
ocpwn02.ocpcl.suyi.local Ready app,worker 107d v1.25.8+37a9a08
ocpwn03.ocpcl.suyi.local Ready infra,worker 106d v1.25.8+37a9a08
ocpwn04.ocpcl.suyi.local Ready infra,worker 106d v1.25.8+37a9a08
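When a cluster has many nodes, it can help to filter this listing rather than read it by eye. The following is a minimal sketch, assuming the plain-text output of oc get nodes has been captured into a string. The function name not_ready_nodes and the NotReady and SchedulingDisabled rows in the sample are hypothetical and are there only to exercise the check; they are not output from this cluster.

```python
def not_ready_nodes(oc_get_nodes_output):
    """Return the names of nodes whose STATUS column is not exactly 'Ready'."""
    lines = [ln for ln in oc_get_nodes_output.splitlines() if ln.strip()]
    flagged = []
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        name, status = fields[0], fields[1]
        # STATUS may be compound, for example "Ready,SchedulingDisabled"
        # on a cordoned node, so anything other than plain "Ready" is flagged.
        if status != "Ready":
            flagged.append(name)
    return flagged


# Hypothetical sample: the second and third rows are invented for illustration.
sample = """\
NAME STATUS ROLES AGE VERSION
ocpmn01.ocpcl.suyi.local Ready control-plane,master,worker 108d v1.25.8+37a9a08
ocpwn01.ocpcl.suyi.local NotReady app,worker 107d v1.25.8+37a9a08
ocpwn02.ocpcl.suyi.local Ready,SchedulingDisabled app,worker 107d v1.25.8+37a9a08
"""

print(not_ready_nodes(sample))
```

Splitting on whitespace works here because none of the columns in this listing contain spaces.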
Display the current CPU and memory usage of each node. These are measured usage figures, not the resource requests that the OpenShift scheduler uses to decide how much of a node's capacity is available and how much is used.
PS C:\Users\suyash.sambhare> oc adm top nodes
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
ocpmn01.ocpcl.suyi.local 3646m 23% 30679Mi 65%
ocpmn02.ocpcl.suyi.local 4984m 32% 32720Mi 69%
ocpmn03.ocpcl.suyi.local 6117m 39% 36812Mi 78%
ocpwn01.ocpcl.suyi.local 983m 13% 13249Mi 42%
ocpwn02.ocpcl.suyi.local 1786m 23% 16819Mi 54%
ocpwn03.ocpcl.suyi.local 1073m 3% 21016Mi 44%
ocpwn04.ocpcl.suyi.local 1969m 6% 20950Mi 44%
PS C:\Users\suyash.sambhare>
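The usage listing can be checked against a threshold in the same way. This is a rough sketch, assuming the text of oc adm top nodes has been captured into a string. The helper names and the 75% memory threshold are assumptions chosen for illustration, not values recommended by OpenShift. The sample rows are copied from the output above.

```python
def parse_top_nodes(output):
    """Parse `oc adm top nodes` text into {node: (cpu_percent, memory_percent)}."""
    usage = {}
    for line in output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Skip blank rows and rows without metrics (no percentage values).
        if len(fields) != 5 or not fields[2].endswith("%") or not fields[4].endswith("%"):
            continue
        name, _cpu, cpu_pct, _mem, mem_pct = fields
        usage[name] = (int(cpu_pct.rstrip("%")), int(mem_pct.rstrip("%")))
    return usage


def over_memory_threshold(usage, memory_limit=75):
    """Return node names whose memory usage exceeds memory_limit percent."""
    return sorted(name for name, (_cpu, mem) in usage.items() if mem > memory_limit)


top_output = """\
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
ocpmn01.ocpcl.suyi.local 3646m 23% 30679Mi 65%
ocpmn02.ocpcl.suyi.local 4984m 32% 32720Mi 69%
ocpmn03.ocpcl.suyi.local 6117m 39% 36812Mi 78%
ocpwn01.ocpcl.suyi.local 983m 13% 13249Mi 42%
"""

usage = parse_top_nodes(top_output)
print(over_memory_threshold(usage))
```

With the assumed 75% limit, only ocpmn03 is reported, since it is the one node in the sample above that threshold.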
Display the resources available and used from the scheduler's point of view, and other information. Look for the headings "Capacity", "Allocatable", and "Allocated resources" in the output. The heading "Conditions" indicates whether the node is under memory pressure, disk pressure, or some other condition that would prevent the node from starting new containers.
PS C:\Users\suyash.sambhare> oc describe node ocpmn01.ocpcl.suyi.local
Name: ocpmn01.ocpcl.suyi.local
Roles: control-plane,master,worker
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=ocpmn01.ocpcl.suyi.local
kubernetes.io/os=linux
node-role.kubernetes.io/control-plane=
node-role.kubernetes.io/master=
node-role.kubernetes.io/worker=
node.openshift.io/os_id=rhcos
Annotations: csi.volume.kubernetes.io/nodeid: {"csi.vmware.com":"ocpmn01.ocpcl.suyi.local"}
k8s.ovn.org/host-addresses: ["196.0.11.21"]
k8s.ovn.org/l3-gateway-config:
{"default":{"mode":"shared","interface-id":"br-ex_ocpmn01.ocpcl.suyi.local","mac-address":"00:50:56:86:47:57","ip-addresses":["196.0.11....
k8s.ovn.org/node-chassis-id: 5363b9d6-2c92-4f3b-ba91-7d2a4a2ca173
k8s.ovn.org/node-gateway-router-lrp-ifaddr: {"ipv4":"100.64.0.2/16"}
k8s.ovn.org/node-mgmt-port-mac-address: 3e:1d:b5:6b:b8:73
k8s.ovn.org/node-primary-ifaddr: {"ipv4":"196.0.11.21/24"}
k8s.ovn.org/node-subnets: {"default":"10.127.0.0/23"}
machineconfiguration.openshift.io/controlPlaneTopology: HighlyAvailable
machineconfiguration.openshift.io/currentConfig: rendered-master-45d666cdd4fb509173ccaa091b8ca304
machineconfiguration.openshift.io/desiredConfig: rendered-master-45d666cdd4fb509173ccaa091b8ca304
machineconfiguration.openshift.io/desiredDrain: uncordon-rendered-master-45d666cdd4fb509173ccaa091b8ca304
machineconfiguration.openshift.io/lastAppliedDrain: uncordon-rendered-master-45d666cdd4fb509173ccaa091b8ca304
machineconfiguration.openshift.io/reason:
machineconfiguration.openshift.io/ssh: accessed
machineconfiguration.openshift.io/state: Done
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Tue, 25 Jul 2023 11:53:54 +0530
Taints: <none>
Unschedulable: false
Lease:
HolderIdentity: ocpmn01.ocpcl.suyi.local
AcquireTime: <unset>
RenewTime: Fri, 10 Nov 2023 12:38:37 +0530
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
MemoryPressure False Fri, 10 Nov 2023 12:36:02 +0530 Tue, 25 Jul 2023 19:28:04 +0530 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Fri, 10 Nov 2023 12:36:02 +0530 Tue, 25 Jul 2023 19:28:04 +0530 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Fri, 10 Nov 2023 12:36:02 +0530 Tue, 25 Jul 2023 19:28:04 +0530 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Fri, 10 Nov 2023 12:36:02 +0530 Tue, 25 Jul 2023 19:28:14 +0530 KubeletReady kubelet is posting ready status
Addresses:
ExternalIP: 196.0.11.21
InternalIP: 196.0.11.21
Hostname: ocpmn01.ocpcl.suyi.local
Capacity:
cpu: 16
ephemeral-storage: 261608428Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 49430728Ki
pods: 250
Allocatable:
cpu: 15500m
ephemeral-storage: 240024585022
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 48279752Ki
pods: 250
System Info:
Machine ID: afd9cfb931fb4eb1817cfb87581de93e
System UUID: b9dd0642-a9e0-a2ee-6712-a4af714eee87
Boot ID: 17047a4f-1ccc-47aa-9ad1-2ec1429bdc2c
Kernel Version: 4.18.0-372.53.1.el8_6.x86_64
OS Image: Red Hat Enterprise Linux CoreOS 412.86.202305080640-0 (Ootpa)
Operating System: linux
Architecture: amd64
Container Runtime Version: cri-o://1.25.3-2.rhaos4.12.git592efcd.el8
Kubelet Version: v1.25.8+37a9a08
Kube-Proxy Version: v1.25.8+37a9a08
ProviderID: vsphere://4206ddb9-e0a9-eea2-6712-a4af714eee87
Non-terminated Pods: (45 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
stackrox admission-control-748db4d84c-gvbfw 50m (0%) 500m (3%) 100Mi (0%) 500Mi (1%) 17d
stackrox central-db-688b744fb-pdg2z 4 (25%) 8 (51%) 8Gi (17%) 16Gi (34%) 17d
stackrox collector-7x8v9 70m (0%) 2750m (17%) 340Mi (0%) 3572Mi (7%) 17d
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 12989m (83%) 22950m (148%)
memory 42540Mi (90%) 47592Mi (100%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events: <none>
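The percentages under "Allocated resources" can be reproduced from the "Allocatable" values, which shows how the scheduler's view relates to the raw capacity. The sketch below uses the figures from the output above. The helper names are hypothetical, and the truncation to whole percentages is an assumption that happens to match the numbers shown.

```python
def to_millicores(cpu):
    """Convert a CPU quantity such as '16' or '15500m' to millicores."""
    return int(cpu[:-1]) if cpu.endswith("m") else int(float(cpu) * 1000)


def to_mebibytes(mem):
    """Convert a memory quantity with a Ki, Mi, or Gi suffix to MiB."""
    factors = {"Ki": 1 / 1024, "Mi": 1, "Gi": 1024}
    return float(mem[:-2]) * factors[mem[-2:]]


def percent_of(used, allocatable):
    """Percentage of allocatable, truncated to a whole number (assumed rounding)."""
    return int(100 * used / allocatable)


alloc_cpu = to_millicores("15500m")      # Allocatable cpu
alloc_mem = to_mebibytes("48279752Ki")   # Allocatable memory

cpu_requests_pct = percent_of(to_millicores("12989m"), alloc_cpu)
cpu_limits_pct = percent_of(to_millicores("22950m"), alloc_cpu)
mem_requests_pct = percent_of(to_mebibytes("42540Mi"), alloc_mem)

# Capacity minus Allocatable is what the node holds back from pods.
cpu_reserved = to_millicores("16") - alloc_cpu
mem_reserved = to_mebibytes("49430728Ki") - alloc_mem

# CPU the scheduler can still hand out as requests on this node.
cpu_headroom = alloc_cpu - to_millicores("12989m")

print(cpu_requests_pct, cpu_limits_pct, mem_requests_pct)
print(cpu_reserved, mem_reserved, cpu_headroom)
```

The results line up with the listing: CPU requests come to 83% and CPU limits to 148% of allocatable, and memory requests to 90%. Only 2511m of CPU remains for new requests on this node, even though oc adm top nodes reported actual CPU usage of 23%.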
Reviewing the Cluster Version Resource
The OpenShift installer creates an auth directory containing the kubeconfig and kubeadmin-password files.
Run the oc login command to connect to the cluster as the kubeadmin user.
The password of the kubeadmin user is in the kubeadmin-password file. The example below authenticates with an API token instead of the password; the session is still logged in as kube:admin.
[user@host ~]$ oc login --token=sha256~hBL_ZaY9adNxmd9-NuHtu6H0-qyLOct_arrnqdsOW7o --server=https://api.ocpcl.suyi.local:6443
WARNING: Using insecure TLS client config. Setting this option is not supported!
Logged into "https://api.ocpcl.suyi.local:6443" as "kube:admin" using the token provided.
You have access to 12 projects, the list has been suppressed. You can list all projects with 'oc projects'
Using project "default".
Cluster Version
ClusterVersion is a custom resource that holds high-level information about the cluster, such as the update channels, the status of the cluster operators, and the cluster version (for example, 4.10.3). Use this resource to declare the version of the cluster you want to run. Defining a new version for the cluster instructs the cluster-version operator to upgrade the cluster to that version.
You can retrieve the cluster version to verify that it is running the desired version, and also to ensure that the cluster uses the right subscription channel.
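Run oc get clusterversion to retrieve the cluster version. The output lists the current version, including the patch release, whether the cluster is available, whether it is progressing toward a new version, how long it has been in that state, and an overall status message. In the sample below, the cluster is still running 4.12.17 because the update to 4.12.40 is blocked by the unavailable storage cluster operator.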
PS C:\Users\suyash.sambhare> oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.12.17 True True 8d Unable to apply 4.12.40: the cluster operator storage is not available
Run oc describe clusterversion to obtain more detailed information about the cluster status.
PS C:\Users\suyash.sambhare> oc describe clusterversion
Name: version
Namespace:
Labels: <none>
Annotations: <none>
API Version: config.openshift.io/v1
Kind: ClusterVersion
Metadata:
Creation Timestamp: 2023-07-25T06:06:10Z
Generation: 6
Managed Fields:
API Version: config.openshift.io/v1
Fields Type: FieldsV1
fieldsV1:
f:spec:
.:
f:channel:
f:clusterID:
Manager: cluster-bootstrap
Operation: Update
Time: 2023-07-25T06:06:10Z
API Version: config.openshift.io/v1
Fields Type: FieldsV1
fieldsV1:
f:spec:
f:desiredUpdate:
.:
f:image:
f:version:
Manager: Mozilla
Operation: Update
Time: 2023-11-01T08:57:33Z
API Version: config.openshift.io/v1
Fields Type: FieldsV1
fieldsV1:
f:status:
.:
f:availableUpdates:
f:capabilities:
.:
f:enabledCapabilities:
f:knownCapabilities:
f:conditions:
f:desired:
.:
f:image:
f:url:
f:version:
f:history:
f:observedGeneration:
f:versionHash:
Manager: cluster-version-operator
Operation: Update
Subresource: status
Time: 2023-11-10T06:58:06Z
Resource Version: 241224594
UID: 56a916ef-1f62-4b12-b04e-574675d1089c
Spec:
Channel: stable-4.12
Cluster ID: 18bc54
Desired Update:
Image: quay.io/openshift-release-dev/ocp-release@sha256:b0b1aac82f9083d20e7e4269b05dd3679299d277d122fa9d29b772f38d2cacff
Version: 4.12.40
Status:
Available Updates: <nil>
Capabilities:
Enabled Capabilities:
CSISnapshot
Console
Insights
Storage
baremetal
marketplace
openshift-samples
Known Capabilities:
CSISnapshot
Console
Insights
Storage
baremetal
marketplace
openshift-samples
Conditions:
Last Transition Time: 2023-07-25T06:06:14Z
Message: Kubernetes 1.26 and therefore OpenShift 4.13 remove several APIs that require admin consideration. Please see the knowledge article https://access.redhat.com/articles/6958394 for details and instructions.
Reason: AdminAckRequired
Status: False
Type: Upgradeable
Last Transition Time: 2023-07-25T06:06:14Z
Message: Capabilities match configured spec
Reason: AsExpected
Status: False
Type: ImplicitlyEnabledCapabilities
Last Transition Time: 2023-07-25T06:06:14Z
Message: Payload loaded version="4.12.40" image="quay.io/openshift-release-dev/ocp-release@sha256:b0b1aac82f9083d20e7e4269b05dd3679299d277d122fa9d29b772f38d2cacff" architecture="amd64"
Reason: PayloadLoaded
Status: True
Type: ReleaseAccepted
Last Transition Time: 2023-07-25T10:21:40Z
Message: Done applying 4.12.17
Status: True
Type: Available
Last Transition Time: 2023-11-01T09:33:14Z
Message: Cluster operator storage is not available
Reason: ClusterOperatorNotAvailable
Status: True
Type: Failing
Last Transition Time: 2023-11-01T08:58:02Z
Message: Unable to apply 4.12.40: the cluster operator storage is not available
Reason: ClusterOperatorNotAvailable
Status: True
Type: Progressing
Last Transition Time: 2023-11-10T06:58:06Z
Message: Unable to retrieve available updates: Get "https://api.openshift.com/api/upgrades_info/v1/graph?arch=amd64&channel=stable-4.12&id=818bc54&version=4.12.40": dial tcp 34.239.99.247:443: connect: connection timed out
Reason: RemoteFailed
Status: False
Type: RetrievedUpdates
Desired:
Image: quay.io/openshift-release-dev/ocp-release@sha256:b0b1aac82f9083d20e7e4269b05dd3679299d277d122fa9d29b772f38d2cacff
URL: https://access.redhat.com/errata/RHSA-2023:5896
Version: 4.12.40
History:
Completion Time: <nil>
Image: quay.io/openshift-release-dev/ocp-release@sha256:b0b1aac82f9083d20e7e4269b05dd3679299d277d122fa9d29b772f38d2cacff
Started Time: 2023-11-01T08:58:02Z
State: Partial
Verified: true
Version: 4.12.40
Completion Time: 2023-07-25T10:21:40Z
Image: quay.io/openshift-release-dev/ocp-release@sha256:7ca5f8aa44bbc537c5a985a523d87365eab3f6e72abc50b7be4caae741e093f4
Started Time: 2023-07-25T06:06:14Z
State: Completed
Verified: false
Version: 4.12.17
Observed Generation: 6
Version Hash: hGErDPikQok=
Events: <none>
PS C:\Users\suyash.sambhare>
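The "Conditions" list is the part of this output that explains why an update is stuck, and it is easier to process as JSON (for example, from oc get clusterversion version -o json). The sketch below is one possible way to pick out the conditions that need attention. The function name and the choice of which condition types count as a problem are assumptions for illustration. The sample is a trimmed, hand-written subset of the conditions shown above, not the full resource.

```python
import json


def conditions_needing_attention(cluster_version):
    """Return (type, reason) pairs for conditions in an unhealthy state."""
    # Assumed classification: which status value is the unhealthy one per type.
    bad_when_true = {"Failing"}
    bad_when_false = {"Available", "Upgradeable", "RetrievedUpdates"}

    flagged = []
    for cond in cluster_version.get("status", {}).get("conditions", []):
        ctype, status = cond.get("type"), cond.get("status")
        if (ctype in bad_when_true and status == "True") or (
            ctype in bad_when_false and status == "False"
        ):
            flagged.append((ctype, cond.get("reason", "")))
    return flagged


# Trimmed subset of the conditions from the describe output above.
sample_cv = json.loads("""
{"status": {"conditions": [
  {"type": "Upgradeable", "status": "False", "reason": "AdminAckRequired"},
  {"type": "Available", "status": "True"},
  {"type": "Failing", "status": "True", "reason": "ClusterOperatorNotAvailable"},
  {"type": "Progressing", "status": "True", "reason": "ClusterOperatorNotAvailable"},
  {"type": "RetrievedUpdates", "status": "False", "reason": "RemoteFailed"}
]}}
""")

for ctype, reason in conditions_needing_attention(sample_cv):
    print(ctype, reason)
```

For this cluster the sketch reports Upgradeable (AdminAckRequired), Failing (ClusterOperatorNotAvailable), and RetrievedUpdates (RemoteFailed). Progressing is deliberately left out of the classification, because a cluster that is progressing normally through an update also reports it as True.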