Cloud Experts Documentation

ROSA Break Glass Troubleshooting

This content is authored by Red Hat experts, but has not yet been tested on every supported configuration.

Background

WARNING: this procedure should only be initiated by a member of the Black Belt team or someone incredibly familiar with ROSA as a whole. THIS IS NOT COMMON!!!

This guide shows how to access ROSA instances in the situation that a break glass scenario is required in the account where ROSA is deployed. This procedure should only be performed under unusual circumstances like a failed provision in order to collect logs. This may be necessary if the control plane fails and SRE is unable to connect or do much to assist with troubleshooting.

Locate the Nodes

In order to locate the nodes, you can login to your AWS account and navigate to the EC2 service. In the search bar, you can search for the name of your cluster:

EC2 Console

Connect to the Node in Serial Console

From the above list, you will need to connect to the node which you wish to reset the password for. This should be performed for each node that needs to be connected to (e.g. each control plane node). Once you’ve selected your node, hit the Connect button and navigate to EC2 serial console.

EC2 Serial Console

You should see a terminal (you may have to hit enter to get the prompt to show up):

Terminal

Reboot the Node

Back in the EC2 web console, select Reboot Instance and then quickly navigate back to the serial console from the above step:

Reboot node

You will see the node go through a reboot sequence:

Reboot sequence

Boot into single user mode

WARNING the boot menu pops up VERY quickly so you will need to babysit the serial console a bit otherwise you will have to repeat the above steps to get back to this point.

Once the menu pops up, you will need to quickly hit the e key to edit the boot options. On the line starting with linux, you will need to add single as an option:

Boot Menu

Hit CTRL+X to continue the boot process. Once prompted to boot or continue, hit the Enter key:

Enter to continue

Change your password

Change your password like you would on a normal Linux system to a password that you know:

passwd root
Changing password for user root.
New password: 
Retype new password: 
passwd: all authentication tokens updated successfully.

Reboot and login

Finally, reboot the host and you can login normally as the root user with the password you set in the previous step:

ip-10-10-0-216 login: root
Password: 
Red Hat Enterprise Linux CoreOS 412.86.202309190812-0
  Part of OpenShift 4.12, RHCOS is a Kubernetes native operating system
  managed by the Machine Config Operator (`clusteroperator/machine-config`).

WARNING: Direct SSH access to machines is not recommended; instead,
make configuration changes via `machineconfig` objects:
  https://docs.openshift.com/container-platform/4.12/architecture/architecture-rhcos.html

---
[root@ip-10-10-0-216 ~]# 

NOTE: if SSH login is required, you may have to adjust the settings in /etc/ssh/sshd…/40-rhcos-defaults.conf and sshd_config to allow password authentication, otherwise you should be able to push logs by whichever means you prefer.

Logs

Some helpful log locations:

  • /var/log/pods - location of the container logs running on the logged in node

Interested in contributing to these docs?

Collaboration drives progress. Help improve our documentation The Red Hat Way.

Red Hat logo LinkedIn YouTube Facebook Twitter

Products

Tools

Try, buy & sell

Communicate

About Red Hat

We’re the world’s leading provider of enterprise open source solutions—including Linux, cloud, container, and Kubernetes. We deliver hardened solutions that make it easier for enterprises to work across platforms and environments, from the core datacenter to the network edge.

Subscribe to our newsletter, Red Hat Shares

Sign up now
© 2023 Red Hat, Inc.