Migrating to Change CSI Add-On

This topic describes how to change the Container Storage Interface (CSI) provisioner add-on in your kURL cluster. It includes information about how to use the kURL installer to automatically migrate data to the new provisioner during upgrade. It also includes prerequisites that you must complete before attempting to change CSI add-ons to reduce the risk of errors during data migration.

Supported CSI Migrations

Important: kURL does not support Longhorn. If you are currently using Longhorn, you must migrate data from Longhorn to either OpenEBS or Rook. kURL v2023.02.17-0 and later supports automatic data migration from Longhorn to OpenEBS or Rook. For more information, see Longhorn Prerequisites below.

This table describes the CSI add-on migration paths that kURL supports:

From | To | Notes
Longhorn | OpenEBS 3.3.0 and later | Migrating from Longhorn to OpenEBS 3.3.0 or later is recommended for single-node installations.
Longhorn | Rook 1.10.8 and later | Migrating from Longhorn to Rook 1.10.8 or later is recommended for clusters with three or more nodes where data replication and availability are requirements. Compared to OpenEBS, Rook requires more resources from your cluster, including a dedicated block device. Single-node installations of Rook are not recommended. Migrating from Longhorn to Rook is not supported for single-node clusters.
Rook | OpenEBS 3.3.0 and later | Migrating from Rook to OpenEBS 3.3.0 or later is strongly recommended for single-node installations, or for applications that do not require data replication. Compared to Rook, OpenEBS requires significantly fewer hardware resources from your cluster.

For more information about how to choose between Rook or OpenEBS, see Choosing a PV Provisioner.

About Changing the CSI Add-on

You can change the CSI provisioner that your kURL cluster uses by updating the CSI add-on in your kURL specification file. Then, when you upgrade a kURL cluster using the new specification, the kURL installer detects the change that you made to the CSI add-on and begins automatically migrating data from the current provisioner to the new one.

kURL supports data migration when you change your CSI provisioner from Rook to OpenEBS, or when you change from Longhorn to Rook or OpenEBS.

The following describes the automatic data migration process when you change the CSI provisioner add-on, where source is the CSI provisioner currently installed in the cluster and target is the desired CSI provisioner:

  1. The kURL installer temporarily shuts down all pods that mount volumes backed by the source provisioner. This ensures that the data being migrated is not in use and can be safely copied to the new storage system.
  2. kURL recreates all PVCs provided by the source provisioner using the target provisioner as backing storage. Data is then copied from the source PVC to the destination PVC.
  3. If you are migrating from Rook or Longhorn to OpenEBS in a cluster that has more than two nodes, then the kURL installer attempts to create local OpenEBS volumes on the same nodes where the original Rook or Longhorn volumes were referenced.
  4. When the data migration is complete, the pods are restarted using the new PVCs.
  5. kURL uninstalls the source provisioner from the cluster.
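
Before an upgrade, it can help to see which PVCs the steps above would touch. The following is a minimal sketch and is not part of kURL: it groups PVCs by storage class so that you can see how many claims each provisioner backs. The function names are illustrative, and the storage class names in your cluster (for example, one named after the source provisioner) are an assumption to confirm with kubectl get storageclass.

```shell
# Count PVCs per storage class.
# Reads "NAMESPACE NAME STORAGECLASS" rows from stdin, one PVC per line.
count_pvcs_by_class() {
  awk 'NF >= 3 { n[$3]++ } END { for (c in n) print c, n[c] }' | sort
}

# Hypothetical helper: feeds the live PVC list into count_pvcs_by_class.
# Uses only standard kubectl output options (custom-columns, --no-headers).
pvc_inventory() {
  kubectl get pvc -A --no-headers \
    -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,STORAGECLASS:.spec.storageClassName \
    | count_pvcs_by_class
}
```

Running pvc_inventory on a primary node prints one line per storage class with its PVC count. The claims listed under the source provisioner's storage class are the ones that the migration recreates.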

Prerequisites

This section includes prerequisites that you must complete before changing the CSI provisioner in your kURL cluster. These prerequisites help you identify and address the most common causes for a data migration failure so that you can reduce the risk of issues.

General Prerequisites

Before you attempt to change the CSI provisioner in your cluster, complete the following prerequisites:

  • Take a snapshot or backup of the relevant data. This helps ensure you can recover your data if the data migration fails.
  • Schedule downtime for the migration. During the automated migration process, there is often a period of time when the application is unavailable. The duration of this downtime depends on the amount of data to migrate. Proper planning and scheduling are necessary to minimize the impact of downtime.
  • Verify that the version of Kubernetes you are upgrading to supports both the current CSI provisioner and the new provisioner that you want to use. Running incompatible versions causes an error during data migration.
  • Ensure that your cluster has adequate hardware resources to run both the current and the new CSI provisioner simultaneously. During the data migration process, the cluster uses twice as much storage capacity as usual due to duplicate volumes, so the Rook dedicated storage device or the OpenEBS volume must have sufficient disk space available to handle this increase.

    After kURL completes the data migration, storage consumption in the cluster returns to normal because the volumes from the previous CSI provisioner are deleted.

    To ensure that your cluster has adequate resources, review the requirements specific to Rook or OpenEBS in Rook Add-on or OpenEBS Add-on.

  • If you are migrating from Longhorn, complete the additional Longhorn Prerequisites below.
  • The KOTS and Registry add-ons require an object storage API (similar to AWS S3) to be available in the cluster. This API can be provided either directly by Rook or by the combination of OpenEBS and MinIO, where OpenEBS provides the storage and MinIO provides the object storage API. If your cluster uses the KOTS or Registry add-ons, the available migration paths are to Rook or to OpenEBS with MinIO.
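
The storage-capacity prerequisite above can be estimated roughly from the capacity that existing PVCs request. The following is a minimal sketch under stated assumptions: the function names are hypothetical, only the binary suffixes Ki, Mi, Gi, and Ti are handled, and requested capacity is an upper bound on the data actually stored.

```shell
# Sum Kubernetes storage quantities (one per line on stdin) and print GiB.
# Only the binary suffixes Ki, Mi, Gi, and Ti are handled in this sketch;
# lines with any other suffix are skipped.
sum_gib() {
  awk '
    function to_gib(q,    n, u) {
      n = q; sub(/[A-Za-z]+$/, "", n)   # numeric part
      u = q; sub(/^[0-9.]+/, "", u)     # unit suffix
      if (u == "Ki") return n / 1048576
      if (u == "Mi") return n / 1024
      if (u == "Gi") return n + 0
      if (u == "Ti") return n * 1024
      return -1
    }
    NF { g = to_gib($1); if (g >= 0) total += g }
    END { printf "%.2f\n", total }
  '
}

# Hypothetical helper: total requested capacity of all PVCs in the cluster.
requested_gib() {
  kubectl get pvc -A --no-headers \
    -o custom-columns=SIZE:.spec.resources.requests.storage | sum_gib
}
```

Because the cluster temporarily holds duplicate volumes, compare roughly twice the reported figure against the free space on the target storage.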

Longhorn Prerequisites

If you are migrating from Longhorn to a different CSI provisioner, you must complete the following prerequisites in addition to the General Prerequisites above:

  • Upgrade your cluster to kURL v2023.02.17-0 or later. Automatic data migration from Longhorn to Rook or OpenEBS is not available in kURL versions earlier than v2023.02.17-0.
  • Upgrade the version of Longhorn installed in your cluster to a 1.2.x release (1.2.0 or later) or a 1.3.x release (1.3.0 or later). Longhorn versions 1.2.x and 1.3.x support Kubernetes versions 1.24 and earlier.
  • Confirm that the Longhorn volumes are in a healthy state. Run the following command to check the status of the volumes:

    kubectl get volumes.longhorn.io -A
    

    If any volumes are reported as not healthy in the Robustness column in the output of this command, resolve the issue before proceeding.

    To learn more about volume health, you can also inspect each volume individually:

    kubectl get volumes.longhorn.io -n longhorn-system <volume name> -o yaml
    

    In many cases, volume health problems are caused by issues with volume replication, specifically when multiple replicas are configured for a volume but not all of them have been scheduled.

    Note: During the data migration process in single-node clusters, the system automatically scales down the number of replicas to 1 in all Longhorn volumes to ensure the volumes are in a healthy state before beginning the data transfer. This is done to minimize the risk of a migration failure.

  • Confirm that Longhorn nodes are in a healthy state. The nodes must be healthy to ensure they are not over-provisioned and can handle scheduled workloads. Run the following command to check the status of the Longhorn nodes:

    kubectl get nodes.longhorn.io -A
    

    If any node is not reported as "Ready" and "Schedulable" in the output of this command, resolve the issue before proceeding.

    To learn more, you can also inspect each node individually and view its "Status" property:

    kubectl get nodes.longhorn.io -n longhorn-system <node name> -o yaml
    
  • (OpenEBS Only) Before you migrate from Longhorn to OpenEBS:

    • Ensure the filesystem on the node has adequate space to accommodate twice the amount of data currently stored by Longhorn. This is important because both OpenEBS and Longhorn use the node's filesystem for data storage.
    • Ensure that there is an additional 2G of memory and 2 CPUs available for OpenEBS. For more information, see What are the minimum requirements and supported container orchestrators? in the OpenEBS documentation.
  • (Rook Only) Before you migrate from Longhorn to Rook, ensure that the dedicated block device for Rook attached to each node has enough space to host all data currently stored in Longhorn.
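
The volume health check above can be narrowed to only the volumes that need attention. The following is a minimal sketch, assuming that the Longhorn Volume resource exposes its health in .status.robustness and that Longhorn runs in the longhorn-system namespace. The function names are hypothetical. Detached volumes may report a value other than healthy, so treat the output as a list to review rather than a list of failures.

```shell
# Print volumes whose robustness is anything other than "healthy".
# Reads "NAME ROBUSTNESS" rows from stdin.
not_healthy() {
  awk 'NF >= 2 && $2 != "healthy" { print $1, $2 }'
}

# Hypothetical helper: lists Longhorn volumes that need attention.
# Assumes the Volume resource exposes .status.robustness.
unhealthy_longhorn_volumes() {
  kubectl get volumes.longhorn.io -n longhorn-system --no-headers \
    -o custom-columns=NAME:.metadata.name,ROBUSTNESS:.status.robustness \
    | not_healthy
}
```

An empty result from unhealthy_longhorn_volumes suggests that every volume reports healthy.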

Change the CSI Add-on in a Cluster

This procedure describes how to update the kURL specification file to use a new CSI provisioner add-on, and then upgrade your kURL cluster to automatically migrate data to the new provisioner.

For more information about the supported migration paths for CSI provisioners, see Supported CSI Migrations above.

Warning: When you change the CSI provisioner in your cluster, the data migration process causes some amount of downtime for the application. It is important to plan accordingly to minimize the impact on users.

To migrate to a new CSI provisioner in a kURL cluster:

  1. Complete the Prerequisites above.
  2. Update the kURL specification to remove the current CSI add-on and add the new CSI add-on that you want to use (either Rook or OpenEBS). For information about the options for the Rook or OpenEBS kURL add-ons, see Rook Add-on or OpenEBS Add-on.

    Example:

    This example shows how to update a kURL specification to change the CSI provisioner add-on from Rook to OpenEBS Local PV.

    Given the following my-current-installer file, which specifies Rook as the CSI provisioner:

    apiVersion: cluster.kurl.sh/v1beta1
    kind: Installer
    metadata:
      name: my-current-installer
    spec:
      kubernetes:
        version: 1.19.12
      docker:
        version: 20.10.5
      flannel:
        version: 0.20.2
      rook:
        version: 1.0.4
    

    You can remove rook and add openebs with isLocalPVEnabled: true to migrate data from Rook to OpenEBS Local PV, as shown in the following my-new-installer file:

    apiVersion: cluster.kurl.sh/v1beta1
    kind: Installer
    metadata:
      name: my-new-installer
    spec:
      kubernetes:
        version: 1.19.12
      docker:
        version: 20.10.5
      flannel:
        version: 0.20.2
      openebs:
        version: 3.3.0
        isLocalPVEnabled: true
        localPVStorageClassName: local
    
  3. Upgrade your kURL cluster to use the updated specification by rerunning the kURL installation script. For more information about how to upgrade a kURL cluster, see Upgrading.

    During the cluster upgrade, the kURL installer detects that the CSI add-on has changed. kURL automatically begins the process of migrating data from the current CSI provisioner to the provisioner in the updated specification. For more information about the data migration process, see About Changing the CSI Add-on above.
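
Before rerunning the installation script, you may want to confirm that the only add-on that differs between the two specifications is the CSI provisioner. The following is a minimal sketch, not a kURL feature: it assumes that both specifications are saved as local YAML files with two-space indentation, and the function names and file names are hypothetical.

```shell
# Print the add-on keys nested directly under "spec:" in a kURL installer file.
# Assumes two-space YAML indentation, as in the examples above.
addon_names() {
  awk '
    /^[^ #]/ { in_spec = ($0 ~ /^spec:/) }
    in_spec && /^  [A-Za-z]/ { key = $1; sub(/:.*$/, "", key); print key }
  ' "$1" | sort
}

# Report add-ons that appear in only one of the two files.
# The body runs in a subshell so its temporary variables do not leak.
addon_diff() (
  old_keys=$(mktemp) && new_keys=$(mktemp) || exit 1
  addon_names "$1" > "$old_keys"
  addon_names "$2" > "$new_keys"
  comm -23 "$old_keys" "$new_keys" | sed 's/^/removed: /'
  comm -13 "$old_keys" "$new_keys" | sed 's/^/added: /'
  rm -f "$old_keys" "$new_keys"
)
```

For the Rook to OpenEBS example in step 2, addon_diff my-current-installer.yaml my-new-installer.yaml would report rook as removed and openebs as added. Version changes within an add-on are not reported by this sketch.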

Automated Local to Distributed Storage Migrations

You can use the minimumNodeCount field to configure kURL to automatically migrate clusters from local (non-HA) storage to distributed (HA) storage when the node count increases to a minimum of three nodes.

When you include the minimumNodeCount field and the cluster meets the minimum node count specified, one of the following must occur for kURL to migrate to distributed storage:

  • The user joins the third or a later node to the cluster using the kURL join script, and accepts a prompt to migrate storage.
  • The user runs the kURL install.sh script on a primary node, and accepts a prompt to migrate storage.
  • The user runs the migrate-multinode-storage command in the kURL tasks.sh script from a primary node.

Implementation

The following example spec uses the minimumNodeCount field to configure kURL to run local storage with OpenEBS until the cluster increases to three nodes. When the cluster increases to three nodes, kURL automatically migrates to distributed storage with Rook:

  rook:
    version: "1.11.x"
    minimumNodeCount: 3
  openebs:
    version: "3.7.x"
    isLocalPVEnabled: true
    localPVStorageClassName: "local"
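
To see whether a cluster has reached the configured threshold, you can compare the current node count with the minimumNodeCount value. The following is a minimal sketch with hypothetical function names. It only reports the comparison and does not trigger a migration.

```shell
# Count non-empty lines on stdin (one line per node).
count_nodes() {
  awk 'NF { n++ } END { print n + 0 }'
}

# Succeeds when the node count has reached the configured minimum.
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# Hypothetical helper: compares the live node count with minimumNodeCount,
# which is passed as the first argument.
check_node_count() {
  nodes=$(kubectl get nodes --no-headers | count_nodes)
  if meets_minimum "$nodes" "$1"; then
    echo "node count $nodes meets minimumNodeCount $1"
  else
    echo "node count $nodes is below minimumNodeCount $1"
  fi
}
```

For the spec above, check_node_count 3 reports whether the cluster has at least three nodes.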

Requirements

The minimumNodeCount field has the following requirements:

  • Distributed storage requires a node count equal to or greater than three. This means that you can set the minimumNodeCount field to 3 or greater.
  • Automated local to distributed storage migrations require the following:

    • Rook 1.11.7 or later
    • OpenEBS 3.6.0 or later
    • Block storage devices for Rook

Limitation

There is downtime while kURL migrates data from local to distributed storage.

Troubleshoot Longhorn Data Migration

This section describes how to troubleshoot known issues in migrating data from Longhorn to Rook or OpenEBS.

Pods stuck in Terminating or Creating state

One of the most common problems during the migration is Pods that get stuck in the Terminating or Creating state. This can happen when kURL scales the Pods down or up but an underlying issue with Longhorn prevents the operation from completing. In this case, restart the kubelet service on all nodes. To do so, open a new session to each node and run the following command:

sudo systemctl restart kubelet
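
To find the affected Pods before and after restarting the kubelet, you can filter the Pod list by status. The following is a minimal sketch with hypothetical function names. It assumes that the stuck Pods show Terminating or ContainerCreating in the STATUS column of the kubectl output; other statuses, such as Pending, are not matched.

```shell
# Print pods whose STATUS column is Terminating or ContainerCreating.
# Reads `kubectl get pods -A --no-headers` rows from stdin, where the
# columns are NAMESPACE NAME READY STATUS RESTARTS AGE.
stuck_pods() {
  awk '$4 == "Terminating" || $4 == "ContainerCreating" { print $1 "/" $2, $4 }'
}

# Hypothetical helper: lists candidate stuck pods in the live cluster.
list_stuck_pods() {
  kubectl get pods -A --no-headers | stuck_pods
}
```

A Pod that appears in this list only briefly is probably progressing normally. Pods that remain listed across repeated runs are the ones to investigate.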

Restore the original number of Volume replicas

When the migration runs on a single-node cluster, kURL scales all Longhorn volumes down to 1 replica. A single replica makes it easier to identify issues during the migration, because additional replicas can mask the underlying problem. If the migration fails, the volumes remain at 1 replica so that you can identify the root cause of the failure. If necessary, you can restore the original number of replicas by running the following command:

kurl longhorn rollback-migration-replicas
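
After the rollback, you may want to confirm which volumes are still configured with a single replica. The following is a minimal sketch, assuming that the Longhorn Volume resource stores the replica count in .spec.numberOfReplicas and that Longhorn runs in the longhorn-system namespace. The function names are hypothetical.

```shell
# Print volumes that are configured with exactly one replica.
# Reads "NAME REPLICAS" rows from stdin.
single_replica_volumes() {
  awk 'NF >= 2 && $2 == 1 { print $1 }'
}

# Hypothetical helper; assumes the Volume spec field is numberOfReplicas.
list_single_replica_volumes() {
  kubectl get volumes.longhorn.io -n longhorn-system --no-headers \
    -o custom-columns=NAME:.metadata.name,REPLICAS:.spec.numberOfReplicas \
    | single_replica_volumes
}
```

Volumes that were originally created with one replica still appear in this list after a rollback, so compare the output with the replica counts you expect.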