Managing bare metal hosts

When you install OKD on a bare metal cluster, you can provision and manage bare metal nodes using machine and machineset custom resources (CRs) for bare metal hosts that exist in the cluster.

About bare metal hosts and nodes

To provision a Fedora CoreOS (FCOS) bare metal host as a node in your cluster, first create a MachineSet custom resource (CR) object that corresponds to the bare metal host hardware. Bare metal host compute machine sets describe infrastructure components specific to your configuration. You apply specific Kubernetes labels to these compute machine sets and then update the infrastructure components to run on only those machines.

Machine CR’s are created automatically when you scale up the relevant MachineSet containing a metal3.io/autoscale-to-hosts annotation. OKD uses Machine CR’s to provision the bare metal node that corresponds to the host as specified in the MachineSet CR.

Maintaining bare metal hosts

You can maintain the details of the bare metal hosts in your cluster from the OKD web console. Navigate to ComputeBare Metal Hosts, and select a task from the Actions drop down menu. Here you can manage items such as BMC details, boot MAC address for the host, enable power management, and so on. You can also review the details of the network interfaces and drives for the host.

You can move a bare metal host into maintenance mode. When you move a host into maintenance mode, the scheduler moves all managed workloads off the corresponding bare metal node. No new workloads are scheduled while in maintenance mode.

You can deprovision a bare metal host in the web console. Deprovisioning a host does the following actions:

  1. Annotates the bare metal host CR with cluster.k8s.io/delete-machine: true

  2. Scales down the related compute machine set

Powering off the host without first moving the daemon set and unmanaged static pods to another node can cause service disruption and loss of data.

Additional resources

Adding a bare metal host to the cluster using the web console

You can add bare metal hosts to the cluster in the web console.

Prerequisites

  • Install an FCOS cluster on bare metal.

  • Log in as a user with cluster-admin privileges.

Procedure

  1. In the web console, navigate to ComputeBare Metal Hosts.

  2. Select Add HostNew with Dialog.

  3. Specify a unique name for the new bare metal host.

  4. Set the Boot MAC address.

  5. Set the Baseboard Management Console (BMC) Address.

  6. Optional: Enable power management for the host. This allows OKD to control the power state of the host.

  7. Enter the user credentials for the host’s baseboard management controller (BMC).

  8. Select to power on the host after creation, and select Create.

  9. Scale up the number of replicas to match the number of available bare metal hosts. Navigate to ComputeMachineSets, and increase the number of machine replicas in the cluster by selecting Edit Machine count from the Actions drop-down menu.

You can also manage the number of bare metal nodes using the oc scale command and the appropriate bare metal compute machine set.

Adding a bare metal host to the cluster using YAML in the web console

You can add bare metal hosts to the cluster in the web console using a YAML file that describes the bare metal host.

Prerequisites

  • Install a FCOS compute machine on bare metal infrastructure for use in the cluster.

  • Log in as a user with cluster-admin privileges.

  • Create a Secret CR for the bare metal host.

Procedure

  1. In the web console, navigate to ComputeBare Metal Hosts.

  2. Select Add HostNew from YAML.

  3. Copy and paste the below YAML, modifying the relevant fields with the details of your host:

    1. apiVersion: metal3.io/v1alpha1
    2. kind: BareMetalHost
    3. metadata:
    4. name: <bare_metal_host_name>
    5. spec:
    6. online: true
    7. bmc:
    8. address: <bmc_address>
    9. credentialsName: <secret_credentials_name> (1)
    10. disableCertificateVerification: True
    11. bootMACAddress: <host_boot_mac_address>
    12. hardwareProfile: unknown
    1credentialsName must reference a valid Secret CR. The baremetal-operator cannot manage the bare metal host without a valid Secret referenced in the credentialsName. For more information about secrets and how to create them, see Understanding secrets.
  4. Select Create to save the YAML and create the new bare metal host.

  5. Scale up the number of replicas to match the number of available bare metal hosts. Navigate to ComputeMachineSets, and increase the number of machines in the cluster by selecting Edit Machine count from the Actions drop-down menu.

    You can also manage the number of bare metal nodes using the oc scale command and the appropriate bare metal compute machine set.

Automatically scaling machines to the number of available bare metal hosts

To automatically create the number of Machine objects that matches the number of available BareMetalHost objects, add a metal3.io/autoscale-to-hosts annotation to the MachineSet object.

Prerequisites

  • Install FCOS bare metal compute machines for use in the cluster, and create corresponding BareMetalHost objects.

  • Install the OKD CLI (oc).

  • Log in as a user with cluster-admin privileges.

Procedure

  1. Annotate the compute machine set that you want to configure for automatic scaling by adding the metal3.io/autoscale-to-hosts annotation. Replace <machineset> with the name of the compute machine set.

    1. $ oc annotate machineset <machineset> -n openshift-machine-api 'metal3.io/autoscale-to-hosts=<any_value>'

    Wait for the new scaled machines to start.

When you use a BareMetalHost object to create a machine in the cluster and labels or selectors are subsequently changed on the BareMetalHost, the BareMetalHost object continues be counted against the MachineSet that the Machine object was created from.

Removing bare metal hosts from the provisioner node

In certain circumstances, you might want to temporarily remove bare metal hosts from the provisioner node. For example, during provisioning when a bare metal host reboot is triggered by using the OKD administration console or as a result of a Machine Config Pool update, OKD logs into the integrated Dell Remote Access Controller (iDrac) and issues a delete of the job queue.

To prevent the management of the number of Machine objects that matches the number of available BareMetalHost objects, add a baremetalhost.metal3.io/detached annotation to the MachineSet object.

This annotation has an effect for only BareMetalHost objects that are in either Provisioned, ExternallyProvisioned or Ready/Available state.

Prerequisites

  • Install FCOS bare metal compute machines for use in the cluster and create corresponding BareMetalHost objects.

  • Install the OKD CLI (oc).

  • Log in as a user with cluster-admin privileges.

Procedure

  1. Annotate the compute machine set that you want to remove from the provisioner node by adding the baremetalhost.metal3.io/detached annotation.

    1. $ oc annotate machineset <machineset> -n openshift-machine-api 'baremetalhost.metal3.io/detached'

    Wait for the new machines to start.

    When you use a BareMetalHost object to create a machine in the cluster and labels or selectors are subsequently changed on the BareMetalHost, the BareMetalHost object continues be counted against the MachineSet that the Machine object was created from.

  2. In the provisioning use case, remove the annotation after the reboot is complete by using the following command:

    1. $ oc annotate machineset <machineset> -n openshift-machine-api 'baremetalhost.metal3.io/detached-'

Additional resources