Troubleshooting


Finding Kubernetes Resources

For private clouds, the download link for the Kubeconfig file is available on the last page of the installer UI as displayed in the following screenshot. 

This file is visible on the above screen only for successful installations; if your installation did not succeed, you cannot access the file from the UI. The file is required to issue any command listed in the https://kubernetes.io/docs/reference/kubectl/cheatsheet/ section of the Kubernetes documentation. 

By default, the kubectl command looks for the Kubeconfig file at $HOME/.kube/config.  

  • Successful installation: Copy the downloaded Kubeconfig file to $HOME/.kube/config and then issue any of the kubectl commands listed in the Kubernetes cheatsheet linked above.
  • Stalled installation:
    • Private clouds and most public clouds: SSH into one of the primary server nodes and copy the Kubeconfig file from /etc/kubernetes/admin.conf to /root/.kube/config.  
    • GCP: Log in to GCP, access the Kubernetes Engine, locate your cluster, click Connect to connect to the cluster, and click the copy icon as displayed in the following screenshot. You must have gcloud installed to view this icon.
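The copy steps above can be sketched as a small shell helper. The target path ($HOME/.kube/config) is kubectl's default and the source path /etc/kubernetes/admin.conf comes from this section; the function name itself is illustrative, not part of the product:

```shell
# install_kubeconfig: put a kubeconfig file where kubectl looks by default.
# (Illustrative helper, not a product tool.)
install_kubeconfig() {
  src="$1"                        # e.g. /etc/kubernetes/admin.conf on a primary node
  mkdir -p "$HOME/.kube"          # create the default kubectl config folder
  cp "$src" "$HOME/.kube/config"  # kubectl reads this path by default
  chmod 600 "$HOME/.kube/config"  # kubeconfig contains credentials; restrict access
}

# Successful install: pass the file downloaded from the installer UI.
# Stalled install:    run on a primary node with /etc/kubernetes/admin.conf.
```

After the copy, kubectl commands work without an explicit --kubeconfig flag.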


Error during the Suite Installation Process (Lack of Resources)

At any time, if your installation stalls due to a lack of resources, perform this procedure to analyze the error logs.

To fetch the logs for the failed pod: 

  1. Locate the actual name of the pod by running the following command:

    kubectl get pods --all-namespaces | grep common-framework-suite-prod-mgmt
  2. In case of an installation failure, click the Download Logs link to download the installation logs for the failed service. 

  3. View the logs for the container:  common-framework-suite-prod-mgmt ...

  4. Run the following command to view the error:

    kubectl logs -f common-framework-suite-prod-mgmt-xxxx -n cisco
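Steps 1 and 4 above can be chained so you never type the pod's random name suffix by hand. The awk helper below is illustrative (not part of the product); the cisco namespace and the pod-name prefix come from the commands above:

```shell
# find_pod: read `kubectl get pods --all-namespaces` output on stdin and print
# the first pod name (column 2) that starts with the given prefix.
# (Illustrative helper, not a product tool.)
find_pod() {
  awk -v p="$1" '$2 ~ ("^" p) { print $2; exit }'
}

# Usage, combining steps 1 and 4:
#   pod=$(kubectl get pods --all-namespaces | find_pod common-framework-suite-prod-mgmt)
#   kubectl logs -f "$pod" -n cisco
```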

The Progress bar for a Kubernetes Cluster is Stuck at Launching cluster nodes on the cloud or Configuring the primary cluster

The issue displayed in the following screenshot is typically caused by the cloud environment. Refer to your cloud documentation for possible causes.

Other examples:

  • If the target cloud is vSphere, verify that the cloud account being used has permission to launch a VM and that each launched VM is configured with a valid IPv4 address. 
  • If the cluster nodes are configured to use static IPs, verify that the IP pool used is valid and that each launched node has a unique IP from the pool.


Installation Failed: Failed to copy <script-name.sh> to remote host or any error related to SSH connection failure

This issue can occur when the installer node cannot SSH/SCP into launched cluster nodes. Verify if all the launched nodes have a valid IPv4 address and if the installer network can communicate with the Kubernetes cluster network (if they are on different networks). Also verify that the cluster nodes are able to connect to vSphere.
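A quick first check for this failure is whether the installer node can even reach TCP port 22 (SSH) on each launched node. The helper below is a sketch using bash's /dev/tcp redirection, not a product tool, and the example address is a placeholder:

```shell
# ssh_reachable: succeed if the given host accepts TCP connections on the
# given port (default 22) within 3 seconds. Uses bash's /dev/tcp device.
# (Illustrative helper, not a product tool.)
ssh_reachable() {
  timeout 3 bash -c ">/dev/tcp/$1/${2:-22}" 2>/dev/null
}

# Usage: probe each launched cluster node from the installer node, e.g.
#   ssh_reachable 203.0.113.10 && echo "ssh port open" || echo "ssh port closed"
```

If the port is closed, look at network connectivity and security rules between the installer network and the cluster network before debugging SSH keys.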


A Pod has Unbound PersistentVolumeClaims

The problem displayed in the following screenshot is usually caused when the cloud user does not have permissions to the configured storage. For example, a vSphere user may not have permissions to the selected datastore.

The Kubernetes Cluster Is Installed Successfully, but the Progress Bar for Suite Administration is Stuck at Waiting for product to be ready

This issue indicates a problem with the CloudCenter Suite installation itself. Use the downloaded SSH key to SSH into one of the primary server nodes and check the status of the pods by running kubectl get pods --all-namespaces. If any pod's status does not display Running, run the following commands to debug further:

kubectl describe pod <pod-name> -n cisco

or

kubectl logs -f <pod-name> -n cisco

Use the downloaded SSH key to SSH into each cluster node and check that the system clock is synchronized on all nodes. Even if the NTP servers were initially synchronized, verify that they are still active by using the following command.

ntpdate <ntp_server>
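ntpdate reports a clock offset in seconds, and comparing that offset across nodes shows whether any clock has drifted. The extraction helper below is illustrative (not part of the product) and assumes the standard ntpdate output format:

```shell
# ntp_offset: pull the offset value (in seconds) out of ntpdate output such as
#   "server 10.0.0.1, stratum 2, offset -0.000215, delay 0.02565"
# (Illustrative helper, not a product tool.)
ntp_offset() {
  awk '{ for (i = 1; i < NF; i++)
           if ($i == "offset") { gsub(",", "", $(i+1)); print $(i+1); exit } }'
}

# Usage on each node (a large offset means the clock has drifted):
#   ntpdate -q <ntp_server> | ntp_offset
```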


After Using Suite Admin for a While, Users Cannot Login to Suite Admin if Any Cluster Node is in a NotReady State

To troubleshoot this issue, perform the following checks:

  • Verify that all the cluster nodes are up and running with a valid IP address.

  • If the nodes are running, use the downloaded SSH key to SSH into one of the primary server nodes.

  • Run the following command on the primary server to verify that all the nodes are in the Ready state:

     kubectl get nodes 


If any of the nodes is in the NotReady state, run the following command for that node:

kubectl describe node <node-name>
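The node check above can be narrowed to just the problem nodes by filtering the STATUS column of kubectl get nodes. The not_ready helper below is illustrative, not a product tool:

```shell
# not_ready: read `kubectl get nodes --no-headers` output on stdin and print
# the names (column 1) of nodes whose STATUS (column 2) is not exactly "Ready".
# (Illustrative helper, not a product tool.)
not_ready() {
  awk '$2 != "Ready" { print $1 }'
}

# Usage on a primary node:
#   for n in $(kubectl get nodes --no-headers | not_ready); do
#     kubectl describe node "$n"
#   done
```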

If none of the above methods work, retry the installation or contact your CloudCenter Suite admin.


Error in Creating a Cluster

In case of a failure during the installation process (for example, due to a quota availability issue), an error message similar to the one displayed in the following screenshot appears.

Download Logs

In case of an installation failure, click the Download Logs link to download the installation logs for the failed service. See Monitor Modules > Download Logs for additional information.
