Merge "docs: add clarifications to move servers"

Jenkins
2015-11-26 07:47:17 +00:00
committed by Gerrit Code Review

@@ -399,15 +399,20 @@ compute host to another including shelve, resize, migrations and
evacuate. The following use cases demonstrate the intention of the
actions and the consequence for operational procedures.
**Shelving**
User doesn't want to be charged when not using a server
-------------------------------------------------------
Sometimes a user does not require a server to be active for a while,
perhaps over a weekend or at certain times of day. This gives
the cloud operator an opportunity to make better use of resources by
freeing resources and rebalancing workloads across the infrastructure.
perhaps over a weekend or at certain times of day.
Ideally they don't want to be billed for those resources.
Just powering down a server does not free up any resources,
but shelving a server does free up resources to be used by other users.
This makes it feasible for a cloud operator to offer a discount when
a server is shelved.
When the user shelves a server the operator can choose to remove it
from the compute hosts. When it is unshelved it is scheduled to a new
from the compute hosts, i.e. the operator can offload the shelved server.
When the user's server is unshelved, it is scheduled to a new
host according to the operator's policies for distributing workloads
across the compute hosts, including taking disabled hosts into account.
This will contribute to increased overall capacity, freeing hosts that
@@ -418,14 +423,30 @@ Shelving a server is not normally a choice that is available to
the cloud operator because it affects the availability of the server
being provided to the user.
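The difference between powering down and shelving can be sketched with a toy model. This is only an illustration of the text above, not Nova code: the `Host` and `Server` classes, the vCPU-only accounting, the state labels, and the "most free vCPUs" policy are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    name: str
    total_vcpus: int
    used_vcpus: int = 0
    enabled: bool = True

    def free_vcpus(self) -> int:
        return self.total_vcpus - self.used_vcpus


@dataclass
class Server:
    name: str
    vcpus: int
    host: Optional[Host] = None
    state: str = "ACTIVE"


def boot(server: Server, host: Host) -> None:
    # Placing a server on a host reserves the host's resources.
    host.used_vcpus += server.vcpus
    server.host = host
    server.state = "ACTIVE"


def power_off(server: Server) -> None:
    # Powering down keeps the reservation: nothing is freed on the host.
    server.state = "SHUTOFF"


def shelve_offload(server: Server) -> None:
    # Shelving and offloading removes the server from its host,
    # so the resources become available to other users.
    if server.host is not None:
        server.host.used_vcpus -= server.vcpus
        server.host = None
    server.state = "SHELVED_OFFLOADED"


def unshelve(server: Server, hosts: List[Host]) -> Host:
    # Hypothetical policy: pick the enabled host with the most free vCPUs.
    # Disabled hosts are never chosen.
    candidates = [h for h in hosts
                  if h.enabled and h.free_vcpus() >= server.vcpus]
    if not candidates:
        raise RuntimeError("no valid host found")
    target = max(candidates, key=lambda h: h.free_vcpus())
    boot(server, target)
    return target
```

In this sketch, `power_off` leaves `used_vcpus` unchanged on the host, while `shelve_offload` returns the capacity. `unshelve` may place the server on a different host than the one it left.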
**Resize**
User resizes server to get more resources
-----------------------------------------
Sometimes a user may want to change the flavor of a server, e.g. change
the quantity of cpus, disk, memory or any other resource. This is done
by rebuilding the server with a new flavor. As the server is being
rebuilt it is normal to reschedule the server to another host
moved, it is normal to reschedule the server to another host
(although resize to the same host is an option for the operator).
Resize involves shutting down the server, finding a host that has
the correct resources for the new flavor size, and moving the current
server (including all storage) to the new host. Once the server
has been given the appropriate resources to match the new flavor,
the server is started again.
After the resize operation, once the user is satisfied that the server is
working correctly, the user calls Confirm Resize.
This deletes the backup server that was kept on the source host.
Alternatively, the user can call Revert Resize to delete the new
resized server and restore the backup that was stored on the source
host. If the user does not manually confirm the resize within a
configured time period, the resize is automatically confirmed to
free up the space the backup is using on the source host.
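The confirm, revert, and automatic-confirm behaviour described above amounts to a small state machine. The following is a minimal sketch under stated assumptions: the `ResizeServer` class, its field names, the status labels, and the `window_seconds` parameter are hypothetical and are not taken from Nova. Time is passed in explicitly so the example stays deterministic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResizeServer:
    flavor: str
    host: str
    status: str = "ACTIVE"
    old_flavor: Optional[str] = None   # flavor of the backup on the source host
    old_host: Optional[str] = None     # source host that holds the backup
    resized_at: Optional[float] = None


def _require_pending(server: ResizeServer) -> None:
    if server.status != "VERIFY_RESIZE":
        raise ValueError("no resize is waiting for confirmation")


def resize(server: ResizeServer, new_flavor: str, target_host: str,
           now: float) -> None:
    if server.status != "ACTIVE":
        raise ValueError("server must be ACTIVE to resize")
    # Keep a record of the backup left on the source host.
    server.old_flavor, server.old_host = server.flavor, server.host
    server.flavor, server.host = new_flavor, target_host
    server.status = "VERIFY_RESIZE"
    server.resized_at = now


def confirm_resize(server: ResizeServer) -> None:
    _require_pending(server)
    # Confirming deletes the backup kept on the source host.
    server.old_flavor = server.old_host = None
    server.resized_at = None
    server.status = "ACTIVE"


def revert_resize(server: ResizeServer) -> None:
    _require_pending(server)
    # Reverting deletes the resized server and restores the backup.
    server.flavor, server.host = server.old_flavor, server.old_host
    server.old_flavor = server.old_host = None
    server.resized_at = None
    server.status = "ACTIVE"


def auto_confirm(server: ResizeServer, now: float,
                 window_seconds: float) -> bool:
    # Confirm automatically once the configured period has elapsed.
    if (server.status == "VERIFY_RESIZE"
            and now - server.resized_at >= window_seconds):
        confirm_resize(server)
        return True
    return False
```

A periodic task could call `auto_confirm` for every server. A resize that is neither confirmed nor reverted inside the window ends up confirmed, which releases the space on the source host.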
As with shelving, resize provides the cloud operator with an
opportunity to redistribute workloads across the cloud according
to the operator's scheduling policy, providing the same benefits as
@@ -435,7 +456,8 @@ Resizing a server is not normally a choice that is available to
the cloud operator because it changes the nature of the server
being provided to the user.
**Migration (including cold and live migration)**
Cloud operator needs to move a server
-------------------------------------
Sometimes a cloud operator may need to redistribute workloads for
operational purposes. For example, the operator may need to remove
@@ -459,7 +481,7 @@ As a result it can take considerably longer than cold migration.
During the action the server is online and accessible, but only
a limited set of management actions are available to the user.
The following are two common patterns for employing migrations in
The following are common patterns for employing migrations in
a cloud:
- **Host maintenance**
@@ -478,8 +500,8 @@ a cloud:
Often it is necessary to perform an update on all compute hosts
that requires them to be rebooted. In this case it is not
strictly necessary to move inactive instances because they
will be available after the reboot. However, active instances would
strictly necessary to move inactive servers because they
will be available after the reboot. However, active servers would
be impacted by the reboot. Live migration will allow them to
continue operation.
@@ -494,12 +516,22 @@ a cloud:
This process can be repeated until the whole cloud has been updated,
usually using a pool of empty hosts instead of just one.
- **Resource Optimization**
To reduce energy usage, some users will try to move servers so
they fit into the minimum number of hosts, allowing the emptied hosts
to be turned off.
Sometimes higher performance might be wanted, so servers are
spread out between the hosts to minimize resource contention.
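The two resource optimization goals above pull in opposite directions, which a small placement sketch can make concrete. This is an illustrative greedy heuristic, not the Nova scheduler: the `place` and `idle_hosts` functions, the single vCPU dimension, and the tie-breaking rules are assumptions chosen for the example.

```python
from typing import Dict, List


def place(servers: Dict[str, int], hosts: Dict[str, int],
          strategy: str) -> Dict[str, str]:
    """Map each server (name -> vCPUs) to a host (name -> vCPU capacity)."""
    if strategy not in ("pack", "spread"):
        raise ValueError(f"unknown strategy: {strategy}")
    free = dict(hosts)
    placement: Dict[str, str] = {}
    # Place the largest servers first; break ties by name for determinism.
    for name, size in sorted(servers.items(), key=lambda kv: (-kv[1], kv[0])):
        candidates = [h for h in free if free[h] >= size]
        if not candidates:
            raise RuntimeError(f"no host can fit {name}")
        if strategy == "pack":
            # Prefer hosts already in use, then the tightest fit.
            chosen = min(candidates,
                         key=lambda h: (free[h] == hosts[h], free[h], h))
        else:
            # Prefer the host with the most free capacity.
            chosen = min(candidates, key=lambda h: (-free[h], h))
        free[chosen] -= size
        placement[name] = chosen
    return placement


def idle_hosts(placement: Dict[str, str], hosts: Dict[str, int]) -> List[str]:
    # Hosts with no servers are candidates for being powered off.
    used = set(placement.values())
    return sorted(h for h in hosts if h not in used)
```

With four 2-vCPU servers and three 8-vCPU hosts, the "pack" strategy in this sketch leaves two hosts idle, while "spread" uses all three hosts to reduce contention.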
Migrating a server is not normally a choice that is available to
the cloud user because the user is not normally aware of compute
hosts. Management of the cloud and how servers are provisioned
in it is the sole responsibility of the cloud operator.
**Evacuate**
Recover from a failed compute host
----------------------------------
Sometimes a compute host may fail. This is a rare occurrence, but when
it happens during normal operation the servers running on the host may
@@ -519,6 +551,15 @@ hosts without worry of the old incarnations reappearing and trying to
access shared resources. It is usual to redistribute the servers
from a failed host by rescheduling them.
Please note, this operation can result in data loss for the user's server.
As there is no access to the original server, if there were any disks stored
on local storage, that data will be lost. Evacuate does the same operation
as a rebuild. It downloads any images from Glance and creates new
blank ephemeral disks. Any disks that were volumes, or on shared storage,
are reconnected. There should be no data loss for those disks.
This is why fencing the host is important, to ensure volumes and shared
storage are not corrupted by two servers writing simultaneously.
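The data-loss rules above can be summarized per disk. The sketch below is a simplified model, not Nova code: the `Disk` class, the `storage` labels, the `from_image` flag, and the outcome strings are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Disk:
    name: str
    storage: str              # "local", "shared" or "volume" (assumed labels)
    from_image: bool = False  # True if the disk is built from an image


def evacuate_disks(disks: List[Disk], host_fenced: bool) -> Dict[str, str]:
    # Without fencing, the old server could still be writing to volumes
    # or shared storage, so this sketch refuses to continue.
    if not host_fenced:
        raise RuntimeError("refusing to evacuate: failed host is not fenced")
    outcome: Dict[str, str] = {}
    for disk in disks:
        if disk.storage in ("volume", "shared"):
            # Data lives off the failed host and is reattached.
            outcome[disk.name] = "reconnected"
        elif disk.storage == "local":
            # Local data is unreachable; the disk is rebuilt as in a rebuild.
            outcome[disk.name] = ("recreated from image" if disk.from_image
                                  else "recreated blank")
        else:
            raise ValueError(f"unknown storage type: {disk.storage}")
    return outcome


def lost_data(disks: List[Disk]) -> List[str]:
    # Only disks on the failed host's local storage lose their contents.
    return [d.name for d in disks if d.storage == "local"]
```

An operator could use a check like `lost_data` to warn affected users before evacuating, since only the locally stored disks are at risk.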
Evacuating a server is solely in the domain of the cloud operator because
it must be performed in coordination with other operational procedures to
be safe. A user is not normally aware of compute hosts but is adversely