Troubleshoot Registry Namespace Pods in ImagePullBackOff State

Updated: August 25, 2022

Document ID: 218090

Introduction

This document describes how to troubleshoot registry pods that are in the ImagePullBackOff state.

Problem

The registry pods in the Cluster Manager (CM) of the Ultra Cloud Core Subscriber Microservices Infrastructure (SMI) are in the ImagePullBackOff state.

cloud-user@lab-deployer-cm-primary:~$ kubectl get pods -A -o wide | grep -v "Running"
NAMESPACE        NAME                                                        READY   STATUS             RESTARTS   AGE    IP               NODE                      NOMINATED NODE   READINESS GATES
registry         charts-cee-2020-02-2-1-1-0                                  0/1     ImagePullBackOff   0          100d   10.10.10.178   lab-deployer-cm-primary   <none>           <none>
registry         charts-cluster-deployer-2020-02-2-35-0                      0/1     ImagePullBackOff   0          100d   10.10.10.180   lab-deployer-cm-primary   <none>           <none>
registry         registry-cee-2020-02-2-1-1-0                                0/1     ImagePullBackOff   0          100d   10.10.10.198   lab-deployer-cm-primary   <none>           <none>
registry         registry-cluster-deployer-2020-02-2-35-0                    0/1     ImagePullBackOff   0          100d   10.10.10.152   lab-deployer-cm-primary   <none>           <none>
registry         software-unpacker-0                                         0/1     ImagePullBackOff   0          100d   10.10.10.160   lab-deployer-cm-primary   <none>           <none>
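
To see at a glance which namespaces are affected, the same listing can be summarized by namespace and status. This is a minimal sketch and not part of the original procedure: the function name summarize_not_running is hypothetical, and it assumes the default column order of kubectl get pods -A (NAMESPACE, NAME, READY, STATUS).

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `kubectl get pods -A` output on stdin and prints
# one line per namespace and status with a pod count, skipping Running pods.
summarize_not_running() {
  awk 'NR > 1 && $4 != "Running" { count[$1 " " $4]++ }
       END { for (k in count) print k, count[k] }' | sort
}

# Usage on the Cluster Manager (assumes kubectl access):
#   kubectl get pods -A -o wide | summarize_not_running
```

For the output above, every affected pod belongs to the registry namespace, so the summary is a single line.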

The Common Execution Environment (CEE) Deployer reports that the system is 0.0 percent ready because a system synchronization is still pending (system synch pending true).

[deployer/cee] cee# show system 
system uuid 012345678-9abc-0123-4567-000011112222
system status deployed true
system status percent-ready 0.0
system ops-center repository https://charts.10.192.1.1.nip.io/cee-2020.02.2.35
system ops-center-debug status false
system synch running true
system synch pending true

When you use Secure Shell (SSH) to connect to the CEE, the error HTTP 404 Not Found is reported.

[deployer/cee] cee# 
Message from confd-api-manager at 2022-05-05 01:01:01...
Helm update is ERROR. Trigger for update is CHANGE. Message is:
WebApplicationException: HTTP 404 Not Found
com.google.common.util.concurrent.UncheckedExecutionException: javax.ws.rs.WebApplicationException: HTTP 404 Not Found
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2052)
at com.google.common.cache.LocalCache.get(LocalCache.java:3943)
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3967)
at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4952)
at com.broadhop.confd.config.proxy.dao.HelmRepositoryDAO.getChartVersion(HelmRepositoryDAO.java:638)
at com.broadhop.confd.config.proxy.dao.HelmRepositoryDAO.installRelease(HelmRepositoryDAO.java:359)
at com.broadhop.confd.config.proxy.dao.HelmRepositoryDAO.sendConfiguration(HelmRepositoryDAO.java:254)
at com.broadhop.confd.config.proxy.service.ConfigurationSynchManager.run(ConfigurationSynchManager.java:233)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: javax.ws.rs.WebApplicationException: HTTP 404 Not Found
at com.broadhop.confd.config.proxy.dao.HelmRepositoryDAO.retrieveHelmIndex(HelmRepositoryDAO.java:620)
at com.broadhop.confd.config.proxy.dao.HelmRepositoryDAO$2.load(HelmRepositoryDAO.java:114)
at com.broadhop.confd.config.proxy.dao.HelmRepositoryDAO$2.load(HelmRepositoryDAO.java:112)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3524)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2273)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2156)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2046)

Analysis

Check the Helm repository configuration in the CEE Deployer.

[deployer/cee] cee# show running-config helm 
helm default-repository base-repos
helm repository base-repos
url https://charts.10.192.1.1.nip.io/cee-2020.02.2.35
exit

Query the index.yaml file at the repository URL from the primary Cluster Manager to confirm that the request returns a 404 response.

cloud-user@deployer-cm-primary:~$ curl -k https://charts.10.192.1.1.nip.io/cee-2020.02.2.35/index.yaml
default backend - 404
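
If the check needs to be repeated, for example after a cluster sync, the HTTP status code alone is easier to compare than the response body. The sketch below is an illustration only: describe_index_status is a hypothetical helper name, and the messages are suggestions rather than output from any Cisco tool. The curl options used in the usage comment (-k, -s, -o, -w with %{http_code}) are standard curl options.

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps the HTTP status code returned for index.yaml
# to a short description.
describe_index_status() {
  case "$1" in
    200) echo "index.yaml is served (HTTP 200)" ;;
    404) echo "index.yaml not found (HTTP 404)" ;;
    000) echo "no HTTP response received" ;;
    *)   echo "unexpected HTTP status $1" ;;
  esac
}

# Usage (URL taken from the helm repository configuration above):
#   code=$(curl -k -s -o /dev/null -w '%{http_code}' \
#     https://charts.10.192.1.1.nip.io/cee-2020.02.2.35/index.yaml)
#   describe_index_status "$code"
```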

List the images of the ops-center pod with the kubectl describe pod command. According to the error in the pod description, the image is not available.

cloud-user@lab-deployer-cm-primary:~$ kubectl describe pod ops-center-cee-labcluster-ops-center-df69975c7-gzszg -n cee-labcluster | grep Image
Image: docker.10.192.1.1.nip.io/cee-2020.02.2.35/smi-apps/cee-ops-center/2020.02.2/confd_init:0.7.0-00001111
Image ID: docker-pullable://docker.10.192.1.1.nip.io/cee-2020.02.2.33/smi-apps/cee-ops-center/2020.02.2/confd_init@sha256:0123456789012345678901234567890123456789012345678901234567890123
Image: docker.10.192.1.1.nip.io/cee-2020.02.2.35/smi-libraries/ops-center/2020.02.2/crd_registry:0.7.1-00002222
Image ID: docker-pullable://docker.10.192.1.1.nip.io/cee-2020.02.2.27/smi-libraries/ops-center/2020.02.2/crd_registry@sha256:0123456789012345678901234567890123456789012345678901234567890123
Image: docker.10.192.1.1.nip.io/cee-2020.02.2.35/smi-libraries/ops-center/2020.02.2/local_storage_init:0.7.1-00003333
Image ID: docker-pullable://docker.10.192.1.1.nip.io/cee-2020.02.2.27/smi-libraries/ops-center/2020.02.2/local_storage_init@sha256:0123456789012345678901234567890123456789012345678901234567890123
Image: docker.10.192.1.1.nip.io/cee-2020.02.2.35/smi-libraries/ops-center/2020.02.2/confd:0.7.1-00004444
Image ID: docker-pullable://docker.10.192.1.1.nip.io/cee-2020.02.2.27/smi-libraries/ops-center/2020.02.2/confd@sha256:0123456789012345678901234567890123456789012345678901234567890123
Image: docker.10.192.1.1.nip.io/cee-2020.02.2.35/smi-libraries/ops-center/2020.02.2/confd_api_bridge:0.7.1-00005555
Image ID: docker-pullable://docker.10.192.1.1.nip.io/cee-2020.02.2.33/smi-libraries/ops-center/2020.02.2/confd_api_bridge@sha256:0123456789012345678901234567890123456789012345678901234567890123
Image: docker.10.192.1.1.nip.io/cee-2020.02.2.35/smi-apps/cee-ops-center/2020.02.2/product_confd_callback:0.7.0-00006666
Image ID: docker-pullable://docker.10.192.1.1.nip.io/cee-2020.02.2.27/smi-apps/cee-ops-center/2020.02.2/product_confd_callback@sha256:0123456789012345678901234567890123456789012345678901234567890123
Image: docker.10.192.1.1.nip.io/cee-2020.02.2.35/smi-libraries/ops-center/2020.02.2/ssh_ui:0.7.1-00007777
Image ID: docker-pullable://docker.10.192.1.1.nip.io/cee-2020.02.2.35/smi-libraries/ops-center/2020.02.2/ssh_ui@sha256:0123456789012345678901234567890123456789012345678901234567890123
Image: docker.10.192.1.1.nip.io/cee-2020.02.2.35/smi-libraries/ops-center/2020.02.2/confd_notifications:0.7.1-00008888
Image ID: docker-pullable://docker.10.192.1.1.nip.io/cee-2020.02.2.27/smi-libraries/ops-center/2020.02.2/confd_notifications@sha256:0123456789012345678901234567890123456789012345678901234567890123
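
In the output above, most Image lines point to the cee-2020.02.2.35 path while the matching Image ID lines point to cee-2020.02.2.27 or cee-2020.02.2.33. The sketch below lists such differences. It is an illustration under stated assumptions: image_release_mismatches is a hypothetical name, and the code assumes that the release is the first path component after the registry host, as it is in this output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `kubectl describe pod ... | grep Image` output on
# stdin and reports containers whose Image and Image ID use different releases.
image_release_mismatches() {
  awk '
    $1 == "Image:" { split($2, p, "/"); want = p[2]; name = $2 }
    $1 == "Image" && $2 == "ID:" && NF >= 3 {
      id = $3
      sub(/^docker-pullable:\/\//, "", id)   # drop the runtime prefix
      split(id, q, "/"); have = q[2]
      if (have != want) print name ": spec " want ", pulled " have
    }'
}

# Usage (pod name and namespace as in the command above):
#   kubectl describe pod ops-center-cee-labcluster-ops-center-df69975c7-gzszg \
#     -n cee-labcluster | grep Image | image_release_mismatches
```

Lines with an empty Image ID, as shown for pods that never pulled their image, are skipped.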

Execute the kubectl describe pod command for the pods in the registry namespace.

Execute the kubectl get pods -A -o wide | grep -v "Running" command to check the state of the pods across all namespaces in the Kubernetes cluster.

cloud-user@lab-deployer-cm-primary:~$ kubectl describe pod charts-cee-2020-02-2-1-1-0 -n registry
Volumes:
charts-volume:
Type: HostPath (bare host directory volume)
Path: /data/software/packages/cee-2020.02.2.1.1/data/charts
HostPathType: DirectoryOrCreate
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal BackOff 9m3s (x104861 over 16d) kubelet Back-off pulling image 
       "dockerhub.cisco.com/smi-fuse-docker-internal/smi-apps/distributed-registry/2020.02.2/apache:0.1.0-abcd123"
Warning Failed 3m59s (x104884 over 16d) kubelet Error: ImagePullBackOff
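
The event message names the image that the kubelet cannot pull. Note that its registry host is dockerhub.cisco.com, not the docker.10.192.1.1.nip.io host seen in the ops-center images. The sketch below extracts the host from an image reference so that the two can be compared. It is a minimal example: image_registry_host is a hypothetical name, and the code assumes the reference includes a registry host before the first slash.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the registry host of an image reference.
# Assumes the reference has the form host/path/name:tag.
image_registry_host() {
  # Everything before the first "/" is treated as the registry host.
  printf '%s\n' "${1%%/*}"
}

# Usage (assumes kubectl access; jsonpath prints the image of each container):
#   img=$(kubectl get pod charts-cee-2020-02-2-1-1-0 -n registry \
#     -o jsonpath='{.spec.containers[*].image}')
#   image_registry_host "$img"
```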

cloud-user@lab-deployer-cm-primary:$ kubectl describe pod charts-cluster-deployer-2020-02-2-35-0 -n registry
Name: charts-cluster-deployer-2020-02-2-35-0
Namespace: registry
Priority: 1000000000
Priority Class Name: infra-critical
Node: lab-deployer-cm-primary/10.192.1.1
Start Time: Thu, 01 Jan 2022 13:05:03 +0000
Labels: chart-app=charts-cluster-deployer-2020-02-2-35
component=charts
controller-revision-hash=charts-cluster-deployer-2020-02-2-35-589fdf57b8
registry=cluster-deployer-2020.02.2.35
statefulset.kubernetes.io/pod-name=charts-cluster-deployer-2020-02-2-35-0
Annotations: cni.projectcalico.org/podIP: 10.10.10.180/32
cni.projectcalico.org/podIPs: 10.10.10.180/32
sidecar.istio.io/inject: false
Status: Pending
IP: 10.10.10.180
IPs:
IP: 10.10.10.180
Controlled By: StatefulSet/charts-cluster-deployer-2020-02-2-35
Containers:
charts:
Container ID: 
Image: dockerhub.cisco.com/smi-fuse-docker-internal/smi-apps/distributed-registry/2020.02.2/apache:0.1.0-abcd123
Image ID: 
Port: 8080/TCP
Host Port: 0/TCP
State: Waiting
Reason: ImagePullBackOff
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-qcmhx (ro)
/var/www/html/cluster-deployer-2020.02.2.35 from charts-volume (rw)
Conditions:
Type Status
Initialized True 
Ready False 
ContainersReady False 
PodScheduled True 
Volumes:
charts-volume:
Type: HostPath (bare host directory volume)
Path: /data/software/packages/cluster-deployer-2020.02.2.35/data/charts
HostPathType: DirectoryOrCreate
default-token-qcmhx:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-qcmhx
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 30s
node.kubernetes.io/unreachable:NoExecute op=Exists for 30s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal BackOff 118s (x104949 over 16d) kubelet Back-off pulling image 
       "dockerhub.cisco.com/smi-fuse-docker-internal/smi-apps/distributed-registry/2020.02.2/apache:0.1.0-abcd123"

cloud-user@lab-deployer-cm-primary:/data/software/packages/cluster-deployer-2020.02.2.35/data/charts$
cloud-user@lab-deployer-cm-primary:$ kubectl get pods -A -o wide | grep -v "Running"
NAMESPACE        NAME                                                        READY   STATUS             RESTARTS   AGE    IP               NODE                      NOMINATED NODE   READINESS GATES
registry         charts-cee-2020-02-2-1-1-0                                  0/1     ImagePullBackOff   0          100d   10.10.10.178   lab-deployer-cm-primary   <none>           <none>
registry         charts-cluster-deployer-2020-02-2-35-0                      0/1     ErrImagePull       0          100d   10.10.10.180   lab-deployer-cm-primary   <none>           <none>
registry         registry-cee-2020-02-2-1-1-0                                0/1     ErrImagePull       0          100d   10.10.10.198   lab-deployer-cm-primary   <none>           <none>
registry         registry-cluster-deployer-2020-02-2-35-0                    0/1     ImagePullBackOff   0          100d   10.10.10.152   lab-deployer-cm-primary   <none>           <none>
registry         software-unpacker-0                                         0/1     ImagePullBackOff   0          100d   10.10.10.160   lab-deployer-cm-primary   <none>           <none>

Confirm the files in the cluster deployer.

cloud-user@lab-deployer-cm-primary:/data/software/packages$ cd cluster-deployer-2020.02.2.35/
cloud-user@lab-deployer-cm-primary:/data/software/packages/cluster-deployer-2020.02.2.35$ ll
total 12
drwxrwxr-x 3 303 303 4096 Jan 1 2021 ./
drwxrwxrwt 5 root root 4096 Mar 1 11:39 ../
drwxrwxr-x 5 303 303 4096 Jan 1 2021 data/
cloud-user@lab-deployer-cm-primary:/data/software/packages/cluster-deployer-2020.02.2.35$ cd data/
cloud-user@lab-deployer-cm-primary:/data/software/packages/cluster-deployer-2020.02.2.35/data$ ll
total 20
drwxrwxr-x 5 303 303 4096 Jan 1 2021 ./
drwxrwxr-x 3 303 303 4096 Jan 1 2021 ../
drwxr-xr-x 2 303 303 4096 Mar 1 12:55 charts/
drwxr-xr-x 4 303 303 4096 Aug 10 2021 deployer-inception/
drwxr-xr-x 3 303 303 4096 Aug 10 2021 docker/
cloud-user@lab-deployer-cm-primary:/data/software/packages/cluster-deployer-2020.02.2.35/data$ cd charts/
cloud-user@lab-deployer-cm-primary:/data/software/packages/cluster-deployer-2020.02.2.35/data/charts$ ll
total 116
drwxr-xr-x 2 303 303 4096 Mar 1 12:55 ./
drwxrwxr-x 5 303 303 4096 Jan 1 2021 ../
-rw-r--r-- 1 303 303 486 Aug 10 2021 index.yaml
-rw-r--r-- 1 303 303 102968 Mar 1 12:55 smi-cluster-deployer-1.1.0-2020-02-2-1144-210826141421-15f3d5b.tgz
cloud-user@lab-deployer-cm-primary:/tmp$ 
cloud-user@lab-deployer-cm-primary:/tmp$ ls /tmp/k8s-* -al
-rw-r--r-- 1 root root 2672 Sep 7 2021 /tmp/k8s-offline.tgz.txt
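
The manual directory walk above can be condensed into one check that the charts directory holds an index.yaml file and at least one chart archive. This is a sketch only: check_chart_dir is a hypothetical helper, and the two conditions are simply what the listing above shows, not a documented requirement.

```shell
#!/usr/bin/env bash
# Hypothetical helper: checks that a charts directory contains index.yaml
# and at least one .tgz chart archive.
check_chart_dir() {
  local dir=$1
  [ -f "$dir/index.yaml" ] || { echo "missing index.yaml in $dir"; return 1; }
  # ls succeeds only if at least one file matches the glob.
  ls "$dir"/*.tgz >/dev/null 2>&1 || { echo "no chart archives in $dir"; return 1; }
  echo "chart directory looks complete: $dir"
}

# Usage (path taken from the pod description above):
#   check_chart_dir /data/software/packages/cluster-deployer-2020.02.2.35/data/charts
```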

Solution

The issue is attributed to a cluster sync failure. To resolve it, run a cluster sync from the Inception Server to the CM High Availability (HA) cluster.

Use SSH to connect to the Inception Server.

Use SSH to connect to the ops center on port 2022.

cloud-user@all-in-one-vm:~$ ssh admin@localhost -p 2022

Verify that the cluster is present on the Inception Server.

[all-in-one-base-vm] SMI Cluster Deployer# show clusters

Verify that the configuration of the cluster is correct. In this example, the cluster name is lab-deployer.
```
[all-in-one-base-vm] SMI Cluster Deployer# show running-config clusters lab-deployer
```

Run the cluster sync.

[all-in-one-base-vm] SMI Cluster Deployer# clusters lab-deployer actions sync run debug

Monitor the sync logs.

[all-in-one-base-vm] SMI Cluster Deployer# monitor sync-logs lab-deployer

Example of the logs from a successful cluster sync:
Wednesday 01 December 2021  01:01:01 +0000 (0:00:00.080)       0:33:08.600 ****
===============================================================================
2021-12-01 01:01:01.230 DEBUG cluster_sync.ca-deployer: Cluster sync successful
2021-12-01 01:01:01.230 DEBUG cluster_sync.ca-deployer: Ansible sync done
2021-12-01 01:01:01.231 INFO cluster_sync.ca-deployer: _sync finished.  Opening lock
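
If the sync logs are saved to a file, the success line can be searched for instead of read by eye. The sketch below is an assumption-laden example: sync_succeeded is a hypothetical helper, sync.log is a hypothetical file name, and the search string is copied from the sample log above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads saved sync logs on stdin and succeeds only if
# the success message from the sample log above is present.
sync_succeeded() {
  grep -q "Cluster sync successful"
}

# Usage (sync.log is a hypothetical file holding the saved monitor output):
#   sync_succeeded < sync.log && echo "sync completed" || echo "sync not confirmed"
```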

Use SSH to connect to the Cluster Manager and make sure that the pods are in the Running state. The command returns only the header line when every pod is Running.
```
cloud-user@lab-deployer-cm-primary:~$ kubectl get pods -A -o wide | grep -v "Running"
```
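
For a scripted pass or fail result, the same listing can be reduced to an exit code. This is a minimal sketch: all_pods_running is a hypothetical helper, it assumes the default column order of kubectl get pods -A, and treating Completed pods as healthy is an assumption made here, not a statement from the procedure.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `kubectl get pods -A` output on stdin and exits
# with status 0 only if every pod is Running or Completed (assumption).
all_pods_running() {
  awk 'NR > 1 && $4 != "Running" && $4 != "Completed" { bad = 1 }
       END { exit bad + 0 }'
}

# Usage on the Cluster Manager (assumes kubectl access):
#   kubectl get pods -A -o wide | all_pods_running && echo "all pods healthy"
```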

Revision History

Revision	Publish Date	Comments
1.0	25-Aug-2022	Initial Release

Contributed by Cisco Engineers

Nebojsa Kosanovic
Cisco TAC Technical Leader
Dennis Lanov
Cisco TAC Technical Leader
