
Back Up On-Prem Deployment With Object Storage

Note

Chef Automate 4.10.1, released on 6 September 2023, includes improvements to the deployment and installation experience of Automate HA. Read the blog post to learn more about the key improvements. Refer to the prerequisites page (On-Premises, AWS) and plan your usage with your customer success manager or account manager.

This document shows how to configure, back up, and restore a Chef Automate high availability deployment with object storage.

If you set backup_config = "object_storage" or backup_config = "file_system" in the Automate configuration TOML file during deployment, then backup is already configured and you don't need to configure it again. If a backup wasn't configured during the initial deployment, follow these instructions to configure it manually.
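
For reference, a deployment-time configuration sets backup_config and fills in the object storage template in the config.toml (the same template appears in the Troubleshooting section below). A minimal sketch with placeholder values, assuming an S3-compatible target (for GCS, set location accordingly and provide google_service_account_file instead of the access keys):

backup_config = "object_storage"

[object_storage.config]
  location = "s3"
  bucket_name = "<BUCKET_NAME>"
  access_key = "<ACCESS_KEY>"
  secret_key = "<SECRET_KEY>"
  endpoint = "<OBJECT_STORAGE_URL>"
  region = "<REGION>"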

Chef Automate supports backing up data to the following platforms:

  • S3 (AWS S3, MinIO, non-AWS S3)
  • Google Cloud Storage (GCS)

Configure backup for S3

This section shows how to configure a Chef Automate high availability deployment to back up data to object storage on AWS S3, MinIO, or other S3-compatible storage.

Configure OpenSearch nodes

Add a secret key and access key for your S3 backup provider on every OpenSearch node.

Note

Encrypted S3 buckets are supported only with Amazon S3 managed keys (SSE-S3).
  1. Set the OpenSearch path configuration location.

    export OPENSEARCH_PATH_CONF="/hab/svc/automate-ha-opensearch/config"
    
  2. Add your S3 access and secret keys to the OpenSearch keystore.

    hab pkg exec chef/automate-ha-opensearch opensearch-keystore add s3.client.default.access_key
    hab pkg exec chef/automate-ha-opensearch opensearch-keystore add s3.client.default.secret_key
    
  3. Change ownership of the keystore.

    chown -RL hab:hab /hab/svc/automate-ha-opensearch/config/opensearch.keystore
    
  4. Load the secure settings into the OpenSearch keystore.

    curl -X POST https://localhost:9200/_nodes/reload_secure_settings?pretty --cacert /hab/svc/automate-ha-opensearch/config/certificates/root-ca.pem --key /hab/svc/automate-ha-opensearch/config/certificates/admin-key.pem --cert /hab/svc/automate-ha-opensearch/config/certificates/admin.pem -k
    
  5. Repeat these steps on all OpenSearch nodes until they are all updated.
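
On a healthy three-node cluster, the reload call in step 4 reports every node as successful, similar to the example shown in the GCS section below (node names and IDs will differ):

{
	"_nodes": {
		"total": 3,
		"successful": 3,
		"failed": 0
	},
	"cluster_name": "chef-insights",
	...
}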

OpenSearch health check

Use the following commands on OpenSearch nodes to verify their health status.

  1. Get the OpenSearch cluster status from the bastion host.

    chef-automate status --os
    
  2. Verify that the Habitat service is running.

    hab svc status
    
  3. Check the status of OpenSearch indices.

    curl -k -X GET "https://localhost:9200/_cat/indices/*?v=true&s=index&pretty" -u admin:admin
    
  4. View logs of the Chef Habitat services.

    journalctl -u hab-sup -f | grep 'automate-ha-opensearch'
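
You can also check the overall cluster health with the standard OpenSearch cluster health API (this assumes the same admin credentials as in step 3):

    curl -k -X GET "https://localhost:9200/_cluster/health?pretty" -u admin:admin

A status of green indicates a healthy cluster; yellow typically means some replica shards are unassigned.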
    

Patch the Automate configuration

On the bastion host, update the S3 and OpenSearch configuration.

Before starting, make sure the frontend nodes and OpenSearch nodes have access to the object storage endpoint.

  1. Create a TOML file on the bastion host with the following settings.

    [s3]
      [s3.client.default]
        protocol = "https"
        read_timeout = "60s"
        max_retries = "3"
        use_throttle_retries = true
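        # For MinIO or other S3-compatible storage, set endpoint to the host and
        # port of the service, for example "minio.example.com:9000" (hypothetical
        # host); the scheme comes from the protocol setting above.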
        endpoint = "s3.example.com"
    

    Replace the value of endpoint with the URL of your S3 storage endpoint.

  2. Add the following content to the TOML file to configure OpenSearch.

    [global.v1]
      [global.v1.external.opensearch.backup]
        enable = true
        location = "s3"
    
      [global.v1.external.opensearch.backup.s3]
    
        # bucket (required): The name of the bucket
        bucket = "<BUCKET_NAME>"
    
        # base_path (optional): The path within the bucket where backups should be stored
        # If base_path is not set, backups will be stored at the root of the bucket.
        base_path = "opensearch"
    
        # name of an s3 client configuration you create in your opensearch.yml.
        # See the OpenSearch repository-s3 plugin documentation for full details
        # on configuring client settings on your OpenSearch nodes.
        client = "default"
    
      [global.v1.external.opensearch.backup.s3.settings]
        ## The meaning of these settings is documented in the OpenSearch
        ## repository-s3 plugin documentation.
    
        ## Backup repo settings
        # compress = false
        # server_side_encryption = false
        # buffer_size = "100mb"
        # canned_acl = "private"
        # storage_class = "standard"
        ## Snapshot settings
        # max_snapshot_bytes_per_sec = "40mb"
        # max_restore_bytes_per_sec = "40mb"
        # chunk_size = "null"
        ## S3 client settings
        # read_timeout = "50s"
        # max_retries = 3
        # use_throttle_retries = true
        # protocol = "https"
    
      [global.v1.backups]
        location = "s3"
    
      [global.v1.backups.s3.bucket]
        # name (required): The name of the bucket
        name = "<BUCKET_NAME>"
    
        # endpoint (required): For Automate 3.x, the endpoint for the region the bucket lives in.
        # For Automate 4.x, use https://s3.amazonaws.com for AWS S3 or the URL of your object storage service.
        endpoint = "<OBJECT_STORAGE_URL>"
    
        # base_path (optional): The path within the bucket where backups should be stored
        # If base_path is not set, backups will be stored at the root of the bucket.
        base_path = "automate"
    
      [global.v1.backups.s3.credentials]
        access_key = "<ACCESS_KEY>"
        secret_key = "<SECRET_KEY>"
    
  3. Use the patch subcommand to patch the Automate configuration.

    ./chef-automate config patch --frontend /PATH/TO/FILE_NAME.TOML
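
To confirm that the snapshot repositories picked up the new settings, you can run the same check used in the Troubleshooting section from one of the Automate frontend nodes; the registered repositories should be of type s3 and reference your bucket:

    curl localhost:10144/_snapshot?pretty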
    

Configure backup on Google Cloud Storage

This section shows how to configure a Chef Automate high availability deployment to back up data to object storage on Google Cloud Storage (GCS).

Configure OpenSearch nodes

On every OpenSearch node, add a GCS service account file that has access to the GCS bucket.

  1. Log in to an OpenSearch node and set the OpenSearch path and GCS service account file locations.

    export OPENSEARCH_PATH_CONF="/hab/svc/automate-ha-opensearch/config"
    export GCS_SERVICE_ACCOUNT_JSON_FILE_PATH="/PATH/TO/GOOGLESERVICEACCOUNT.JSON"
    
  2. Change ownership of the GCS service account file.

    chown -RL hab:hab $GCS_SERVICE_ACCOUNT_JSON_FILE_PATH
    
  3. Add the GCS service account file to OpenSearch.

    hab pkg exec chef/automate-ha-opensearch opensearch-keystore add-file --force gcs.client.default.credentials_file $GCS_SERVICE_ACCOUNT_JSON_FILE_PATH
    
  4. Change ownership of the keystore.

    chown -RL hab:hab /hab/svc/automate-ha-opensearch/config/opensearch.keystore
    
  5. Load the secure settings into the OpenSearch keystore.

    curl -X POST https://localhost:9200/_nodes/reload_secure_settings?pretty --cacert /hab/svc/automate-ha-opensearch/config/certificates/root-ca.pem --key /hab/svc/automate-ha-opensearch/config/certificates/admin-key.pem --cert /hab/svc/automate-ha-opensearch/config/certificates/admin.pem -k
    
  6. Repeat these steps on all OpenSearch nodes until they are all updated.

After updating all nodes, the curl command in step 5 returns output similar to the following:

{
	"_nodes": {
		"total": 3,
		"successful": 3,
		"failed": 0
	},
	"cluster_name": "chef-insights",
	"nodes": {
		"lenRTrZ1QS2uv_vJIwL-kQ": {
			"name": "lenRTrZ"
		},
		"Us5iBo4_RoaeojySjWpr9A": {
			"name": "Us5iBo4"
		},
		"qtz7KseqSlGm2lEm0BiUEg": {
			"name": "qtz7Kse"
		}
	}
}

OpenSearch health check

Use the following commands on OpenSearch nodes to verify their health status.

  1. Get the OpenSearch cluster status from the bastion host.

    chef-automate status --os
    
  2. Verify that the Habitat service is running.

    hab svc status
    
  3. Check the status of OpenSearch indices.

    curl -k -X GET "https://localhost:9200/_cat/indices/*?v=true&s=index&pretty" -u admin:admin
    
  4. View logs of the Chef Habitat services.

    journalctl -u hab-sup -f | grep 'automate-ha-opensearch'
    

Patch the Automate configuration

On the bastion host, update the OpenSearch configuration.

Before starting, make sure the frontend nodes and OpenSearch nodes have access to the object storage endpoint.

  1. Create a TOML file on the bastion host with the following settings.

    [global.v1]
      [global.v1.external.opensearch.backup]
        enable = true
        location = "gcs"
    
      [global.v1.external.opensearch.backup.gcs]
    
        # bucket (required): The name of the bucket
        bucket = "bucket-name"
    
        # base_path (optional): The path within the bucket where backups should be stored
        # If base_path is not set, backups will be stored at the root of the bucket.
        base_path = "opensearch"
        client = "default"
    
      [global.v1.backups]
        location = "gcs"
    
      [global.v1.backups.gcs.bucket]
        # name (required): The name of the bucket
        name = "bucket-name"
    
        # endpoint = ""
    
        # base_path (optional): The path within the bucket where backups should be stored
        # If base_path is not set, backups will be stored at the root of the bucket.
        base_path = "automate"
    
      [global.v1.backups.gcs.credentials]
        json = '''{
          "type": "service_account",
          "project_id": "chef-automate-ha",
          "private_key_id": "7b1e77baec247a22a9b3****************f",
          "private_key": "<PRIVATE KEY>",
          "client_email": "myemail@chef.iam.gserviceaccount.com",
          "client_id": "1******************1",
          "auth_uri": "https://accounts.google.com/o/oauth2/auth",
          "token_uri": "https://oauth2.googleapis.com/token",
          "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
          "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/myemail@chef.iam.gserviceaccount.com",
          "universe_domain": "googleapis.com"
        }'''
    
  2. Patch the Automate configuration to apply the changes.

    ./chef-automate config patch --frontend /PATH/TO/FILE_NAME.TOML
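
As with S3, you can verify the snapshot repository registration from one of the Automate frontend nodes; with GCS configured, the registered repositories should be of type gcs:

    curl localhost:10144/_snapshot?pretty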
    

Backup and Restore

Backup

To create a backup, run the backup command from the bastion host.

chef-automate backup create
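
To verify that the backup completed, list the backups from the bastion host:

chef-automate backup list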

Restore

Pre-Restore Validation

Run the restore command with the --verify-restore-config flag to validate the configuration settings before initiating the restore process. To perform the pre-check, run the following command from the bastion host:

chef-automate backup restore s3://bucket_name/path_to_backups/BACKUP_ID --verify-restore-config

The verification process ensures that the backup and restore configurations are correct and identifies potential issues so they can be addressed in advance.

Run Restore

Restore a backup from external object storage.

  1. Check the status of the Automate HA cluster from the bastion host.

    chef-automate status
    
  2. Restore the backup by running the restore command from the bastion host.

    For S3:

    chef-automate backup restore s3://BUCKET_NAME/PATH_TO_BACKUPS/BACKUP_ID --skip-preflight --s3-access-key "ACCESS_KEY" --s3-secret-key "SECRET_KEY"
    

    For GCS:

    chef-automate backup restore gs://BUCKET_NAME/PATH/TO/BACKUPS/BACKUP_ID --gcs-credentials-path "PATH/TO/GOOGLE_SERVICE_ACCOUNT.JSON"
    

    In an airgapped environment:

    chef-automate backup restore <OBJECT-STORAGE-BUCKET-PATH>/BACKUPS/BACKUP_ID --skip-preflight --airgap-bundle </PATH/TO/BUNDLE>
    

Note

  • If you are restoring the backup from an older version, then you need to provide the --airgap-bundle </path/to/current/bundle>.
  • Large Compliance Report is not supported in Automate HA.

Troubleshooting

Follow the steps below if Chef Automate encounters an error during data restoration.

  1. Check the Chef Automate status.

    chef-automate status
    
  2. Check the status of your Habitat service on the Automate node.

    hab svc status
    
  3. If the deployment services are not healthy, reload them.

    hab svc load chef/deployment-service
    
  4. Check the status of the Automate node, and then attempt to run the restore command from the bastion host.

  5. To change the base_path, follow the steps below for your backup type.

    • File System

      • During deployment, backup_mount defaults to /mnt/automate_backups.
      • If you update the backup_mount value in the config.toml file before deployment, the deployment process applies the updated path automatically.
      • If the backup_mount value is changed after deployment (for example, to /bkp/backps), you must manually patch the configuration on all frontend and backend nodes.
      • Update the FE nodes using the template below, and apply it with the command chef-automate config patch fe.toml --fe.
         [global.v1.backups]
            [global.v1.backups.filesystem]
               path = "/bkp/backps"
         [global.v1.external.opensearch.backup]
            [global.v1.external.opensearch.backup.fs]
               path = "/bkp/backps"
      
      • Update the OpenSearch nodes using the template below, and apply it with the chef-automate config patch os.toml --os command.
      [path]
         repo = "/bkp/backps"
      
      • Run the curl request against one of the Automate frontend nodes.

        curl localhost:10144/_snapshot?pretty
        
        • If the response is an empty JSON object {}, no changes are required to the snapshot settings in the OpenSearch cluster.

        • If you see a JSON response similar to the example below, check that the backup_mount setting is correctly configured: each location value in the response should start with the updated path (/bkp/backps in this example).

        {
          "chef-automate-es6-event-feed-service" : {
            "type" : "fs",
            "settings" : {
              "location" : "/mnt/automate_backups/opensearch/automate-elasticsearch-data/chef-automate-es6-event-feed-service"
            }
          },
          "chef-automate-es6-compliance-service" : {
            "type" : "fs",
            "settings" : {
              "location" : "/mnt/automate_backups/opensearch/automate-elasticsearch-data/chef-automate-es6-compliance-service"
            }
          },
          "chef-automate-es6-ingest-service" : {
            "type" : "fs",
            "settings" : {
              "location" : "/mnt/automate_backups/opensearch/automate-elasticsearch-data/chef-automate-es6-ingest-service"
            }
          },
          "chef-automate-es6-automate-cs-oc-erchef" : {
            "type" : "fs",
            "settings" : {
              "location" : "/mnt/automate_backups/opensearch/automate-elasticsearch-data/chef-automate-es6-automate-cs-oc-erchef"
            }
          }
        }
        
        • If the prefix in the location value does not match the backup_mount, the existing snapshots must be deleted. Use the script below to delete the snapshots from one of the Automate frontend nodes.
           # List all snapshot repositories, then delete each repository by name.
           snapshot=$(curl -s -XGET http://localhost:10144/_snapshot?pretty | jq 'keys[]')
           for name in $snapshot; do
             key=$(echo $name | tr -d '"')
             curl -XDELETE localhost:10144/_snapshot/$key?pretty
           done
        
        • The above script requires jq. If jq isn't installed, you can install it from the airgap bundle. To locate the jq package, run the command below on one of the Automate frontend nodes.
        ls -ltrh /hab/cache/artifacts/ | grep jq
        
        -rw-r--r--. 1 ec2-user ec2-user  730K Dec  8 08:53 core-jq-static-1.6-20220312062012-x86_64-linux.hart
        -rw-r--r--. 1 ec2-user ec2-user  730K Dec  8 08:55 core-jq-static-1.6-20190703002933-x86_64-linux.hart
        
        • If multiple versions of jq are available, install the latest one. For example, use the command below to install the newer of the two packages listed above on one of the Automate frontend nodes.
        hab pkg install /hab/cache/artifacts/core-jq-static-1.6-20220312062012-x86_64-linux.hart -bf
        
    • Object Storage

      • During deployment, backup_config should be set to object_storage.
      • To use object_storage, the deployment uses the following template.
         [object_storage.config]
          google_service_account_file = ""
          location = ""
          bucket_name = ""
          access_key = ""
          secret_key = ""
          endpoint = ""
          region = ""
      
      • If you configured it before deployment, then you are all set.
      • If you want to change the bucket or base_path, use the following template for Frontend nodes.
      [global.v1]
        [global.v1.external.opensearch.backup.s3]
          bucket = "<BUCKET_NAME>"
          base_path = "opensearch"
        [global.v1.backups.s3.bucket]
          name = "<BUCKET_NAME>"
          base_path = "automate"
      
      • You can assign any value to the base_path variable. The base_path configuration is required only for the Frontend nodes.

      • Use the command chef-automate config patch frontend.toml --fe to apply the above template and update the configuration.

      • Use the following curl request to validate the configuration.

        curl localhost:10144/_snapshot?pretty
        
      • If the response is an empty JSON object ({}), the configuration is valid.

      • If the response contains JSON output similar to the example below, verify that each base_path value is correct.

        {
            "chef-automate-es6-event-feed-service" : {
              "type" : "s3",
              "settings" : {
                "bucket" : "MY-BUCKET",
                "base_path" : "opensearch/automate-elasticsearch-data/chef-automate-es6-event-feed-service",
                "readonly" : "false",
                "compress" : "false"
              }
            },
            "chef-automate-es6-compliance-service" : {
              "type" : "s3",
              "settings" : {
                "bucket" : "MY-BUCKET",
                "base_path" : "opensearch/automate-elasticsearch-data/chef-automate-es6-compliance-service",
                "readonly" : "false",
                "compress" : "false"
              }
            },
            "chef-automate-es6-ingest-service" : {
              "type" : "s3",
              "settings" : {
                "bucket" : "MY-BUCKET",
                "base_path" : "opensearch/automate-elasticsearch-data/chef-automate-es6-ingest-service",
                "readonly" : "false",
                "compress" : "false"
              }
            },
            "chef-automate-es6-automate-cs-oc-erchef" : {
              "type" : "s3",
              "settings" : {
                "bucket" : "MY-BUCKET",
                "base_path" : "opensearch/automate-elasticsearch-data/chef-automate-es6-automate-cs-oc-erchef",
                "readonly" : "false",
                "compress" : "false"
              }
            }
        }
        
        • If the base_path value does not match, you must delete the existing snapshots. See the File System troubleshooting steps above for guidance.