Bare Metal State Machine
State Machine Diagram
The diagram below shows the provisioning states that an Ironic node goes through during the lifetime of a node. The diagram also depicts the events that transition the node to different states.
Stable states are highlighted with a thicker border. All transitions from stable states are initiated by API requests. There are a few other API-initiated transitions that are possible from non-stable states. The events for these API-initiated transitions are indicated with '(via API)'. Internally, the conductor initiates the other transitions (depicted in gray).
Please click the image above to view the diagram at its full size, because it is scaled down when embedded in the documentation.
Note
There are aliases for some transitions:
- deploy is an alias for active.
- undeploy is an alias for deleted.
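The aliases in the note above can be folded into a small lookup. This is a minimal sketch and not part of ironic itself; the names VERB_ALIASES and canonical_verb are made up for illustration.

```python
# Aliases taken from the note above; the dictionary and function names
# are illustrative and do not come from ironic.
VERB_ALIASES = {"deploy": "active", "undeploy": "deleted"}


def canonical_verb(verb: str) -> str:
    """Return the canonical provision verb for an alias, or the verb unchanged."""
    return VERB_ALIASES.get(verb, verb)
```

Verbs without an alias, such as manage, pass through unchanged.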
Enrollment and Preparation
- enroll (stable state)
-
This is the state that all nodes start off in when created using API version 1.11 or newer. When a node is in the enroll state, the only thing ironic knows about it is that it exists, and ironic cannot take any further action by itself. Once a node has its driver/interfaces and their required information set in node.driver_info, the node can be transitioned to the verifying state by setting the node's provision state using the manage verb.
See /install/enrollment for information on enrolling nodes.
- verifying
-
ironic will validate that it can manage the node using the information given in node.driver_info together with the driver/hardware type and interfaces it has been assigned. This involves going out and confirming that the credentials work to access whatever node control mechanism they talk to.
- manageable (stable state)
-
Once ironic has verified that it can manage the node using the driver/interfaces and credentials passed in at node create time, the node will be transitioned to the manageable state. From manageable, nodes can transition to:
- manageable (through cleaning) by setting the node's provision state using the clean verb.
- manageable (through inspecting) by setting the node's provision state using the inspect verb.
- available (through cleaning if automatic cleaning is enabled) by setting the node's provision state using the provide verb.
- active (through adopting) by setting the node's provision state using the adopt verb.
manageable is the state that a node should be moved into when any updates need to be made to it, such as changes to fields in driver_info and updates to networking information on ironic ports assigned to the node.
manageable is also the only stable state that can be transitioned to from these failure states:
- adopt failed
- clean failed
- inspect failed
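The transitions out of manageable listed above can be written down as a small table. The following is a sketch for illustration only: the names MANAGEABLE_TRANSITIONS, FAILURE_STATES_RETURNING_TO_MANAGEABLE and describe are hypothetical, and the table assumes every step succeeds.

```python
# Transitions out of "manageable", transcribed from the list above.
# verb -> (transient state passed through, resulting state on success)
MANAGEABLE_TRANSITIONS = {
    "clean": ("cleaning", "manageable"),
    "inspect": ("inspecting", "manageable"),
    # Cleaning is only passed through if automatic cleaning is enabled.
    "provide": ("cleaning", "available"),
    "adopt": ("adopting", "active"),
}

# Failure states from which "manageable" is the only reachable stable state.
FAILURE_STATES_RETURNING_TO_MANAGEABLE = {
    "adopt failed",
    "clean failed",
    "inspect failed",
}


def describe(verb: str) -> str:
    """Summarize where a verb takes a node that is currently manageable."""
    try:
        via, target = MANAGEABLE_TRANSITIONS[verb]
    except KeyError:
        raise ValueError(f"verb {verb!r} is not listed for manageable") from None
    return f"manageable -> {via} -> {target}"
```

For example, describe("provide") yields the path through cleaning to available, while a verb that is not in the list raises ValueError.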
- inspecting
-
inspecting will utilize node introspection to update hardware-derived node properties to reflect the current state of the hardware. Typically, the node will transition to manageable if inspection is synchronous, or inspect wait if asynchronous. The node will transition to inspect failed if an error occurs.
See /admin/inspection for information about inspection.
- inspect wait
-
This is the provision state used when an asynchronous inspection is in progress. A successfully inspected node shall transition to the manageable state.
- inspect failed
-
This is the state a node will move into when inspection of the node fails. From here the node can be transitioned to:
- inspecting by setting the node's provision state using the inspect verb.
- manageable by setting the node's provision state using the manage verb.
- cleaning
-
Nodes in the cleaning state are being scrubbed and reprogrammed into a known configuration.
When a node is in the cleaning state it means that the conductor is executing the clean step (for out-of-band clean steps) or preparing the environment (building PXE configuration files, configuring DHCP, etc.) to boot the ramdisk for running in-band clean steps.
- clean wait
-
Just like the cleaning state, the nodes in the clean wait state are being scrubbed and reprogrammed. The difference is that in the clean wait state the conductor is waiting for the ramdisk to boot or the clean step which is running in-band to finish.
The cleaning process of a node in the clean wait state can be interrupted by setting the node's provision state using the abort verb if the task that is running allows it.
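Every API-initiated transition in this section is requested by "setting the node's provision state" with a verb. In the Bare Metal REST API this is a PUT to /v1/nodes/{node_ident}/states/provision with a JSON body naming the target verb. The sketch below only builds such a request and does not send it. The function name, the endpoint URL and the node name are placeholders, authentication is omitted, and the default API version is taken from the enroll description above; some verbs may need a newer version, so check the API reference before relying on it.

```python
import json
import urllib.request


def build_provision_request(base_url: str, node: str, verb: str,
                            api_version: str = "1.11") -> urllib.request.Request:
    """Build (but do not send) a request that sets a node's provision state."""
    url = f"{base_url.rstrip('/')}/v1/nodes/{node}/states/provision"
    body = json.dumps({"target": verb}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "X-OpenStack-Ironic-API-Version": api_version,
        # A real deployment would normally also need an auth token header.
    }
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# Placeholder endpoint and node name: ask for an enrolled node to be managed.
req = build_provision_request("http://ironic.example.com:6385", "node-0", "manage")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would send it; that step is left out here because it needs a reachable ironic endpoint and credentials.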
Deploy and Undeploy
- available (stable state)
-
After nodes have been successfully preconfigured and cleaned, they are moved into the available state and are ready to be provisioned. From available, nodes can transition to:
- active (through deploying) by setting the node's provision state using the active or deploy verbs.
- manageable by setting the node's provision state using the manage verb.
- deploying
-
Nodes in deploying are being prepared to run a workload on them. This consists of running a series of tasks, such as:
- Setting appropriate BIOS configurations.
- Partitioning drives and laying down file systems.
- Creating any additional resources (node-specific network config, a config drive partition, etc.) that may be required by additional subsystems.
See /user/deploy and /admin/node-deployment for information about deploying nodes.
- wait call-back
-
Just like the deploying state, the nodes in wait call-back are being deployed. The difference is that in wait call-back the conductor is waiting for the ramdisk to boot or execute parts of the deployment which need to run in-band on the node (for example, installing the bootloader, or writing the image to the disk).
The deployment of a node in wait call-back can be interrupted by setting the node's provision state using the deleted or undeploy verbs.
- deploy failed
-
This is the state a node will move into when a deployment fails, for example a timeout waiting for the ramdisk to PXE boot. From here the node can be transitioned to:
- active (through deploying) by setting the node's provision state using the active, deploy or rebuild verbs.
- available (through deleting and cleaning) by setting the node's provision state using the deleted or undeploy verbs.
- active (stable state)
-
Nodes in active have a workload running on them. ironic may collect out-of-band sensor information (including power state) on a regular basis. Nodes in active can transition to:
- available (through deleting and cleaning) by setting the node's provision state using the deleted or undeploy verbs.
- active (through deploying) by setting the node's provision state using the rebuild verb.
- rescue (through rescuing) by setting the node's provision state using the rescue verb.
- deleting
-
Nodes in the deleting state are being torn down from running an active workload. In deleting, ironic tears down and removes any configuration and resources it added in deploying or rescuing.
- error (stable state)
-
This is the state a node will move into when deleting an active deployment fails. From error, nodes can transition to:
- available (through deleting and cleaning) by setting the node's provision state using the deleted or undeploy verbs.
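The API-initiated transitions described so far in this section (from available, deploy failed, active and error) can be collected into one lookup that also resolves the deploy and undeploy aliases. This is a sketch under the assumption that every transient state (deploying, deleting, cleaning, rescuing) completes successfully; the names ALIASES, DEPLOY_TRANSITIONS and expected_state are hypothetical.

```python
from typing import Optional

# Verb aliases as documented for this state machine.
ALIASES = {"deploy": "active", "undeploy": "deleted"}

# (current state, canonical verb) -> state reached if all transient
# states on the way complete successfully.
DEPLOY_TRANSITIONS = {
    ("available", "active"): "active",
    ("available", "manage"): "manageable",
    ("deploy failed", "active"): "active",
    ("deploy failed", "rebuild"): "active",
    ("deploy failed", "deleted"): "available",
    ("active", "deleted"): "available",
    ("active", "rebuild"): "active",
    ("active", "rescue"): "rescue",
    ("error", "deleted"): "available",
}


def expected_state(current: str, verb: str) -> Optional[str]:
    """Return the resulting state, or None if the pair is not listed above."""
    canonical = ALIASES.get(verb, verb)
    return DEPLOY_TRANSITIONS.get((current, canonical))
```

A None result only means the transition is not listed in this section; it says nothing about transitions documented elsewhere.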
- adopting
-
This state allows ironic to take over management of a baremetal node with an existing workload on it. Ordinarily when a baremetal node is enrolled and managed by ironic, it must transition through cleaning and deploying to reach the active state. However, baremetal nodes that have an existing workload on them do not need to be deployed or cleaned again, so this transition allows these nodes to move directly from manageable to active.
See /admin/adoption for information about this feature.
Rescue
- rescuing
-
Nodes in rescuing are being prepared to perform rescue operations. This consists of running a series of tasks, such as:
- Setting appropriate BIOS configurations.
- Creating any additional resources (node-specific network config, etc.) that may be required by additional subsystems.
See /admin/rescue for information about this feature.
- rescue wait
-
Just like the rescuing state, the nodes in rescue wait are being rescued. The difference is that in rescue wait the conductor is waiting for the ramdisk to boot or execute parts of the rescue which need to run in-band on the node (for example, setting the password for the user named rescue).
The rescue operation of a node in rescue wait can be aborted by setting the node's provision state using the abort verb.
- rescue failed
-
This is the state a node will move into when a rescue operation fails, for example a timeout waiting for the ramdisk to PXE boot. From here the node can be transitioned to:
- rescue (through rescuing) by setting the node's provision state using the rescue verb.
- active (through unrescuing) by setting the node's provision state using the unrescue verb.
- available (through deleting) by setting the node's provision state using the deleted verb.
- rescue (stable state)
-
Nodes in rescue have a rescue ramdisk running on them. ironic may collect out-of-band sensor information (including power state) on a regular basis. Nodes in rescue can transition to:
- active (through unrescuing) by setting the node's provision state using the unrescue verb.
- available (through deleting) by setting the node's provision state using the deleted verb.
- unrescuing
-
Nodes in unrescuing are being prepared to transition to the active state from the rescue state. This consists of running a series of tasks, such as setting appropriate BIOS configurations (for example, changing the boot device).
- unrescue failed
-
This is the state a node will move into when an unrescue operation fails. From here the node can be transitioned to:
- rescue (through rescuing) by setting the node's provision state using the rescue verb.
- active (through unrescuing) by setting the node's provision state using the unrescue verb.
- available (through deleting) by setting the node's provision state using the deleted verb.
Servicing
- servicing
-
Nodes in the servicing state are nodes that are having service performed on them. This service is similar to cleaning, but is performed on nodes currently in the active state and returns them to the active state when complete.
When a node is in the servicing state it means that the conductor is executing the service step or preparing the environment to execute the step.
See /admin/servicing for more details on node servicing.
- service wait
-
Just like the servicing state, the nodes in the service wait state are being serviced with service steps. The difference is that in the service wait state the conductor is waiting for the ramdisk to boot or the service step which is running in-band to finish.
The servicing of a node in the service wait state can be interrupted by setting the node's provision state using the abort verb if the task that is running allows it.
- service failed
-
This is the state a node will move into when a service operation fails, for example a timeout waiting for the ramdisk to PXE boot. From here the node can be transitioned to:
- active (through servicing) by setting the node's provision state using the service verb.
- rescue (through rescuing) by setting the node's provision state using the rescue verb.
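Several wait states in this document can be interrupted through the API, each with its own verbs. The lookup below gathers only what the text above states and is a sketch rather than ironic's own logic. The names INTERRUPT_VERBS and can_interrupt are hypothetical, inspect wait is left out because no interrupt verb is described for it here, and the "if the running task allows it" condition for clean wait and service wait is not modelled.

```python
# Wait state -> verbs that this document says can interrupt it.
INTERRUPT_VERBS = {
    "clean wait": {"abort"},            # only if the running task allows it
    "wait call-back": {"deleted", "undeploy"},
    "rescue wait": {"abort"},
    "service wait": {"abort"},          # only if the running task allows it
}


def can_interrupt(state: str, verb: str) -> bool:
    """Return True if the document lists this verb as interrupting the state."""
    return verb in INTERRUPT_VERBS.get(state, set())
```

States that are not wait states, such as deploying, simply return False for any verb.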