Debug School

Suyash Sambhare

Compute Migration

Compute migration between nodes is a crucial aspect of managing virtual machine instances in cloud environments.

  1. Cold Migration (Non-Live Migration):

    • Involves shutting down a running instance before migrating it from the source Compute node to the destination Compute node.
    • Results in some downtime for the instance.
    • The migrated instance maintains access to the same volumes and IP addresses.
  2. Live Migration:

    • Involves moving the instance from the source Compute node to the destination Compute node without shutting it down.
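    • Preserves the instance's memory, CPU, and network state throughout the migration.
    • Causes little or no perceptible downtime; the instance is paused only briefly at the final switchover.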
    • Live migration is particularly useful for workload rebalancing, maintenance, and avoiding hot spots.
  3. Use Cases for Migration:

    • Compute Node Maintenance:
      • Temporarily take a Compute node out of service for hardware maintenance, kernel upgrades, or software updates.
    • Failing Compute Node:
      • Migrate instances from a failing Compute node to a healthy one.
    • Failed Compute Nodes:
      • Evacuate instances from a failed Compute node by rebuilding them on another node.
    • Workload Rebalancing:
      • Distribute instances across Compute nodes to optimize resource utilization.
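
The use cases above can be read as a simple decision rule. The sketch below is illustrative only: the `NodeState` values, the `choose_action` helper, and the action names are hypothetical and do not correspond to any platform's API.

```python
from enum import Enum


class NodeState(Enum):
    HEALTHY = "healthy"  # node is up and behaving normally
    FAILING = "failing"  # node is up but showing signs of trouble
    FAILED = "failed"    # node is down; its instances are no longer running


def choose_action(node_state: NodeState, downtime_acceptable: bool) -> str:
    """Map the use cases above to an action (hypothetical helper)."""
    if node_state is NodeState.FAILED:
        # The source node cannot take part, so instances are rebuilt elsewhere.
        return "evacuate"
    if downtime_acceptable:
        # The instance is shut down, moved, and started again.
        return "cold-migrate"
    # The node is still up, so the running instance can be moved.
    return "live-migrate"


print(choose_action(NodeState.FAILING, downtime_acceptable=False))  # live-migrate
print(choose_action(NodeState.FAILED, downtime_acceptable=False))   # evacuate
```

A real scheduler would also weigh destination capacity and storage layout; this only captures the distinction between a node that is still running and one that has already failed.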

Compute Management

During live migration, managing compute resources (CPU and memory) is essential for a seamless transition. Let's explore how it works:

  1. Memory Transfer:

    • The VM's memory pages are copied from the source node to the destination node.
    • This process occurs in two stages:
      • Pre-Copy: Initially, memory pages are transferred while the VM continues running.
      • Stop-and-Copy: The VM is briefly paused, and remaining memory changes are copied.
  2. CPU State:

    • The VM's CPU state (registers, execution context) is also transferred.
    • The destination node resumes execution from where the source node left off.
  3. Network Connectivity:

    • Network state (MAC address, IP configuration) is maintained during migration.
    • Seamless connectivity ensures uninterrupted communication.
  4. Downtime Minimization:

    • Live migration aims for near-zero downtime by overlapping memory transfer with VM execution.
    • The final stop-and-copy phase has minimal impact on application availability.
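
To see why the final pause is short, compare the time needed to send only the last dirty pages with the time needed to send all memory while the VM is stopped. This is a back-of-the-envelope sketch; the `pause_seconds` helper and the figures (8 GiB of RAM, 64 MiB left to copy, a 1024 MiB/s link) are assumptions chosen for illustration.

```python
def pause_seconds(data_mib: float, bandwidth_mib_per_s: float) -> float:
    """Time the VM stays paused while data_mib is sent over the link."""
    if bandwidth_mib_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return data_mib / bandwidth_mib_per_s


# Hypothetical figures, not measurements.
full_copy = pause_seconds(8192, 1024)  # copy all memory while paused
final_copy = pause_seconds(64, 1024)   # copy only the remaining dirty pages

print(full_copy)   # 8.0
print(final_copy)  # 0.0625
```

With these numbers the pause drops from seconds to tens of milliseconds, which is the effect the pre-copy stage is designed to achieve.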

Live Migration

Memory Management

Memory handling is the most critical part of live migration. Memory is copied as follows.

  1. Memory Pre-Copy:

    • Initially, the source node starts copying memory pages to the destination node.
    • This process continues iteratively until the memory changes are minimal.
    • The instance remains running on the source node during this phase.
  2. Stop-and-Copy:

    • At a certain point, the instance is paused briefly.
    • The remaining memory pages are copied from the source to the destination node.
    • The instance resumes execution on the destination node.
  3. Dirty Page Tracking:

    • To minimize downtime, the hypervisor tracks memory pages that change during pre-copy.
    • Only the modified pages are transferred during the stop-and-copy phase.
  4. Performance Considerations:

    • Live migration impacts performance due to memory copying and network traffic.
    • Balancing between migration speed and downtime is crucial.
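
The iterative pre-copy loop described above can be sketched as a small simulation. It is a simplified model, not a hypervisor's actual algorithm: it assumes a constant dirty rate and constant bandwidth, and the function name, parameters, and figures are all hypothetical.

```python
def simulate_precopy(memory_mib, dirty_rate, bandwidth, threshold_mib, max_rounds=30):
    """Return (rounds, remaining_mib) for a simplified iterative pre-copy.

    dirty_rate and bandwidth are in MiB/s. Each round sends the data that is
    still outstanding; while it is being sent, the running VM dirties more
    pages, which must be sent again in the next round.
    """
    to_send = memory_mib
    rounds = 0
    while to_send > threshold_mib and rounds < max_rounds:
        round_seconds = to_send / bandwidth
        # Dirty page tracking: only pages changed during this round are re-sent.
        to_send = min(memory_mib, dirty_rate * round_seconds)
        rounds += 1
    return rounds, to_send


# Dirty rate well below bandwidth: the remainder shrinks every round.
print(simulate_precopy(4096, 128, 1024, 16))   # (3, 8.0)

# Dirty rate above bandwidth: pre-copy never converges and hits max_rounds.
print(simulate_precopy(4096, 2048, 1024, 16))  # (30, 4096)
```

The second call illustrates the trade-off between migration speed and downtime: when the VM dirties memory faster than the link can carry it, the migration must either accept a longer pause or be slowed down or abandoned.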

Storage Management

During live migration, storage management is crucial to ensuring seamless transitions. Storage is handled in the following manner.

  1. Shared Storage:

    • Live migration typically relies on shared storage that both the source and destination hosts can access.
    • In Hyper-V clusters, these storage resources are often SAN LUNs (Storage Area Network Logical Unit Numbers) managed by the Failover Clustering service.
    • A VM does not need its own dedicated LUN; multiple VMs can share the same storage volume.
  2. Storage Transfer:

    • When a live migration occurs, the VM's memory, storage, and network connectivity are transferred from the original host to the destination.
    • The entire virtual machine, including virtual hard disks (VHDs) and configuration information, can be moved.
  3. Zero Downtime:

    • The beauty of live migration lies in its ability to perform these storage transfers without service interruption or downtime.
    • This ensures continuous operation and data integrity.
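
How much data has to cross the network depends on whether the VM's disks already sit on shared storage. The sketch below gives a rough lower bound under that assumption; the `data_to_transfer_gib` helper and the sizes are hypothetical, and pages or blocks that are re-sent because they changed during the copy are ignored.

```python
def data_to_transfer_gib(memory_gib: float, disk_gib: float, shared_storage: bool) -> float:
    """Rough lower bound on the data moved by one live migration."""
    if shared_storage:
        # Both hosts see the same disks, so only memory needs to move.
        return memory_gib
    # Without shared storage, the virtual disks are copied as well.
    return memory_gib + disk_gib


print(data_to_transfer_gib(8, 100, shared_storage=True))   # 8
print(data_to_transfer_gib(8, 100, shared_storage=False))  # 108
```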

Advantages of Live Migration

  • Live migration ensures seamless transitions while preserving the instance’s state.
  • Live migration simplifies hardware maintenance, disaster recovery, and workload balancing, making it a powerful tool for managing virtualized environments.
  • Live migration optimizes resource utilization, supports maintenance, and enhances workload flexibility in virtualized environments.
  • Live migration offers smoother transitions, better resource utilization, and minimal downtime compared to cold migration.
  1. Downtime Minimization:

    • Live Migration: Minimal to zero downtime.
    • Cold Migration: Requires instance shutdown, resulting in downtime.
  2. Seamless Transition:

    • Live Migration: VM state is preserved during migration.
    • Cold Migration: The instance is shut down and started again on the destination, so its running state is not preserved.
  3. Resource Optimization:

    • Live Migration: Balances workloads across nodes.
    • Cold Migration: Can also rebalance workloads, but only at the cost of downtime.
  4. Maintenance Flexibility:

    • Live Migration: Ideal for hardware maintenance or updates.
    • Cold Migration: Better suited for planned migrations.
  5. Application Continuity:

    • Live Migration: Ensures uninterrupted service.
    • Cold Migration: Service interruption during migration.

VMware vMotion vs Hyper-V Migration

  1. VMware vMotion:

    • Part of VMware vSphere.
    • Supports seamless real-time migration to balance server loads.
    • Moves VMs with zero downtime.
    • Can move eight VMs concurrently.
    • Demonstrated superior speed and stability in tests¹.
  2. Hyper-V Live Migration:

    • Part of Microsoft Hyper-V.
    • Also supports migrations with zero downtime.
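    • Limited to one live migration at a time per host (as source or destination) in the cited comparison¹; newer Hyper-V releases allow a configurable number of simultaneous migrations.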
    • Larger numbers of VMs take longer to move.
    • VMware vMotion is up to 5.4 times faster in certain scenarios¹.
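
The effect of a concurrency limit on draining a host can be approximated with simple arithmetic. This is a deliberately crude sketch: it assumes every migration takes the same time no matter how many run in parallel, which ignores shared network bandwidth, so measured ratios such as the one cited above will differ. The function name and figures are hypothetical.

```python
import math


def batch_migration_seconds(vm_count: int, concurrent_limit: int,
                            seconds_per_migration: float) -> float:
    """Time to move vm_count VMs when at most concurrent_limit run at once."""
    if concurrent_limit < 1:
        raise ValueError("concurrent_limit must be at least 1")
    batches = math.ceil(vm_count / concurrent_limit)
    return batches * seconds_per_migration


# Hypothetical: 16 VMs, 60 seconds per migration.
print(batch_migration_seconds(16, 8, 60))  # 120
print(batch_migration_seconds(16, 1, 60))  # 960
```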

In OpenStack deployments such as Red Hat OpenStack Platform, secure migration is configured by the platform, and shared SSH keys let the Compute nodes communicate with each other during the process.

Ref: https://docs.redhat.com/en/documentation/red_hat_openstack_platform/13/html/instances_and_images_guide/migrating-virtual-machines-between-compute-nodes-osp.
Ref: https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/products/vsphere/vmware-vmotion-verus-live-migration.pdf
