On-prem data migration to Sourcegraph Cloud
This process describes the current state of how to do a full data migration of an on-prem instance to a Cloud v2 instance. On-prem-to-Cloud data migrations are currently owned by Implementation Engineering, but the process is documented in Cloud as it pertains to Cloud infrastructure.
The on-prem-to-Cloud data migration process described here fully restores (overwrites) the Cloud v2 instance to the state of the customer's on-prem instance. It is intended to be performed immediately after a new Cloud v2 instance is provisioned. If a migration is planned for a newly provisioned Cloud v2 instance, TAs should not hand over access to the Cloud v2 instance to the customer until the migration is complete.
Note: This process is an “all-or-nothing” data migration. There is no way to partially or selectively migrate certain aspects of a customer’s on-prem Sourcegraph instance’s data (e.g., only Batch Changes execution history or certain Code Insights).
Requirements
To qualify for a data migration, the customer must:
- have a Sourcegraph instance on v3.20.0 or later (a limitation of multi-version upgrades)
  - note: where possible, strongly encourage the customer to upgrade their on-prem instance to the latest version of Sourcegraph first
- use databases on Postgres 12 or later (a quick way to check is sketched after this list)
- not have on-disk database encryption enabled
- have the latest release of `src`
- have direct database access
- have a site-admin access token for their instance
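A couple of these prerequisites can be checked up front. A minimal sketch, assuming the customer has `src` configured against their instance (`SRC_ENDPOINT` plus a site-admin `SRC_ACCESS_TOKEN`) and direct `psql` access, where `$DSN` is a placeholder for the primary database connection string:
src api -query 'query { site { productVersion } }'   # on-prem Sourcegraph version
psql "$DSN" -c "SHOW server_version;"                # must report Postgres 12 or later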
An operator must:
- have the latest build of `src` installed
- have the `gcloud` CLI installed
- have the `mi2` CLI installed (a quick verification of these tools is sketched below)
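Before starting, a quick tooling sanity check (a suggestion, not an official step; `sg cloud install` is the same command used later in this page to update `mi2`):
src version        # confirms src-cli is installed
gcloud version     # confirms the gcloud CLI is installed
sg cloud install   # installs or updates the mi2 CLI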
Initial setup
Set up the target Cloud instance
First, the operator must create an instance with the configuration for the desired final Cloud instance.
Create a data migration Cloud Storage bucket
- In the `cloud-data-migrations` repository, copy the `template/` directory, naming it after the customer.
- Fill out all `$CUSTOMER` variables and set all unset variables in `terraform.tfvars` as documented.
- Commit your changes, open a pull request in `cloud-data-migrations`, and merge the changes after review.
- In the `infrastructure` repository, create Terraform Cloud workspaces for the migration resources in the `sourcegraph/infrastructure/terraform-cloud/cloud_migration.tf` file by adding something like the following, replacing `$CUSTOMER` as appropriate:
#### $CUSTOMER
module "cloud-data-migration-project-$CUSTOMER" {
source = "../modules/tfcworkspace"
organization = data.tfe_organization.sourcegraph.name
vcs_oauth_token_id = tfe_oauth_client.github.oauth_token_id
name = "cloud-data-migration-project-$CUSTOMER"
vcs_repo = local.sourcegraph_cloud_data_migrations_repo_name
working_directory = "$CUSTOMER/project"
trigger_patterns = ["$CUSTOMER/project/*"]
tags = ["cloud-tooling"]
terraform_version = "1.3.6"
team_access = local.allow_cloud_team_write_access
}
module "cloud-data-migration-resources-$CUSTOMER" {
source = "../modules/tfcworkspace"
organization = data.tfe_organization.sourcegraph.name
vcs_oauth_token_id = tfe_oauth_client.github.oauth_token_id
name = "cloud-data-migration-resources-$CUSTOMER"
vcs_repo = local.sourcegraph_cloud_data_migrations_repo_name
working_directory = "$CUSTOMER/resources"
trigger_patterns = ["$CUSTOMER/resources/*"]
tags = ["cloud-tooling"]
terraform_version = "1.3.6"
team_access = merge(local.allow_cloud_team_write_access, local.allow_implementation_engineering_team_write_access)
}
- Commit your changes, open a pull request in `infrastructure`, and merge the changes after review.
- Make sure that your Terraform Cloud workspaces are created, then schedule a run for the created `-project` workspace. Once that succeeds, do the same for the created `-resources` workspace.
- Once `resources/` has been applied, you should have outputs for a GCP bucket and a GCP service account with write-only access to it. Create a 1Password share entry with these outputs:
  - `snapshot_bucket_name`
  - `writer_service_account_key`

Outputs can also be retrieved from the Terraform state of `resources/`:
cd resources/
terraform init
# Bucket name
terraform output -json | jq -e -r .snapshot_bucket_name.value
# Credentials, sent to file
terraform output -json | jq -e -r .writer_service_account_key.value > credential.json
Collect snapshot contents from customer’s on-prem instance
Notify users of instance migration
The customer site admin is responsible for communicating the upcoming cloud migration plans to their users. It is recommended that they add a non-dismissible site notice to their on-prem instance in global settings:
{
"notices": [
{
"dismissible": false,
"location": "top",
"message": "🚨 A Sourcegraph instance migration is underway - changes to configuration might not be persisted, and performance may be affected, until the migration is finalized."
}
]
}
Generate database backups
Ensure that the customer has the latest build of `src` installed before proceeding.
The customer should first be asked to create `pg_dump` exports of their Sourcegraph databases. `pg_dump` is designed to be usable while the database is in use:
It makes consistent backups even if the database is being used concurrently. pg_dump does not block other users accessing the database (readers or writers).
Note that we ask the customer to configure a notice to let their users know that any actions taken after the point of the dump will not be persisted to their new Cloud instance.
Template commands for running `pg_dump` can be generated with `src snapshot databases` for various configurations:
$ src snapshot databases --help
'src snapshot databases' generates commands to export Sourcegraph database dumps.
Note that these commands are intended for use as reference - you may need to adjust the commands for your deployment.
USAGE
src [-v] snapshot databases <pg_dump|docker|kubectl> [--targets=<docker|k8s|"targets.yaml">]
TARGETS FILES
Predefined targets are available based on default Sourcegraph configurations ('docker', 'k8s').
Custom targets configuration can be provided in YAML format with '--targets=target.yaml', e.g.
primary:
target: ... # the DSN of the database deployment, e.g. in docker, the name of the database container
dbname: ... # name of database
username: ... # username for database access
password: ... # password for database access - only include password if it is non-sensitive
codeintel:
# same as above
codeinsights:
# same as above
See the pgdump.Targets type for more details.
Each of the generated commands must be run to completion to generate a database dump for each database. The output is as follows:
src-snapshot/primary.sql
src-snapshot/codeintel.sql
src-snapshot/codeinsights.sql
For custom or complex database setups, the operator will decide how best to proceed, in collaboration with IE/CSE/etc - the goal in the end is to generate the above database dumps in a format aligned with the output of `src snapshot databases pg_dump` (the plain `pg_dump` commands).
Generate instance summary
A snapshot summary is used to run acceptance tests post-migration. The customer should create one with `src snapshot summary` - note that a site-admin access token is required:
src login # configure credentials for the instance
src snapshot summary
This will generate a JSON file at `src-snapshot/summary.json`. See `src snapshot summary --help` for more details.
Upload snapshot contents to GCS bucket
If the above steps for creating the `src-snapshot` folder contents were followed correctly, the customer can run `src snapshot upload` with the appropriate bucket and credentials, and `src` will find the snapshot contents and upload them to the configured bucket.
src snapshot upload -bucket=$BUCKET -credentials=$CREDENTIALS_FILE
Once the customer has indicated the upload succeeded, validate the contents of the bucket to ensure everything is there:
primary.sql
codeintel.sql
codeinsights.sql
summary.json
Audit logs are generated for bucket access in the project’s logs, under log entries with `@type: "type.googleapis.com/google.cloud.audit.AuditLog"`.
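For example, assuming your operator account has read access to the migration project (`$MIGRATION_PROJECT` below is a placeholder for the customer's `cloud-data-migrations` project ID), a rough sketch:
# list the uploaded snapshot objects
gcloud storage ls "gs://<snapshot_bucket_name>/"
# recent Cloud Storage audit log entries (the filter may need adjusting)
gcloud logging read 'protoPayload.@type="type.googleapis.com/google.cloud.audit.AuditLog" AND resource.type="gcs_bucket"' \
  --project="$MIGRATION_PROJECT" --freshness=1d --limit=20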
Execute data migration
Disable alerting and scale down cloud instance
Once the database backups and instance summary have been uploaded by the customer, the first step before processing the migration is to disable the instance alerting and scale down the Cloud instance.
- Update mi2 to the latest version:
sg cloud install
- Extract the Cloud instance from the control plane, following the instructions in the “Extract instance from control plane (break glass)” section in the instance-specific dashboard from go/cloud-ops.
- Disable alerting by editing the instance `config.yaml` in `sourcegraph/cloud` as follows. Make sure to deploy the Terraform changes as well.
spec:
debug:
- enableAlerting: true
+ enableAlerting: false
- Commit and submit your changes as a pull request. After merging, apply the monitoring stack changes in Terraform Cloud or via mi2:
mi2 instance tfc deploy -auto-approve -force-ignore-stack-dependencies -target monitoring
- Once monitoring has been disabled, proceed with scaling down the instance:
mi2 instance scale-down
Request GCP infrastructure permission
Visit go/cloud-ops and locate the instance, then request access to Cloud infra via the provided Entitle link.
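With access granted, you can optionally confirm that the instance workloads are scaled down before resetting the databases (a suggested check, reusing the kubeconfig helper and namespace placeholder used later in this page):
# all Deployments and StatefulSets in the instance namespace should report 0 replicas
kubectl --kubeconfig=$(mi2 instance kubeconfig) get deployments,sts -n <instance_namespace>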
Reset databases
In a separate terminal window, set up a connection to the Cloud SQL database:
mi2 instance db proxy -session.timeout 0 -download
Then, connect to the database as the admin user:
# Extract the database admin password
cd terraform/stacks/sql && terraform init && cd -
export INSTANCE_ADMIN_PASSWORD="$(cd terraform/stacks/sql && terraform output -json | jq -r '.sql_crossstackoutputgooglesqlusersqlsqladminuserCE5B87EApassword_209B7378.value')"
# Connect to database
psql postgres://sourcegraph-admin:"$INSTANCE_ADMIN_PASSWORD"@localhost:5433/postgres
Drop and recreate all databases:
DROP DATABASE IF EXISTS "pgsql";
CREATE DATABASE "pgsql";
DROP DATABASE IF EXISTS "codeintel-db";
CREATE DATABASE "codeintel-db";
DROP DATABASE IF EXISTS "codeinsights-db";
CREATE DATABASE "codeinsights-db";
Import databases
Ensure databases have been reset. Then, one by one, import each database from the bucket the customer has uploaded to:
export TARGET_INSTANCE_PROJECT=$(mi2 instance get -jq '.status.gcp.projectId' | tr -d '"')
export TARGET_CLOUD_SQL_INSTANCE=$(mi2 instance get -jq '.status.gcp.cloudSQL[0].name' | tr -d '"')
# see cloud-data-migration-resources-$CUSTOMER terraform outputs
export SOURCE_GCS_BUCKET="..."
gcloud --project $TARGET_INSTANCE_PROJECT sql import sql $TARGET_CLOUD_SQL_INSTANCE gs://$SOURCE_GCS_BUCKET/primary.sql --database=pgsql
gcloud --project $TARGET_INSTANCE_PROJECT sql import sql $TARGET_CLOUD_SQL_INSTANCE gs://$SOURCE_GCS_BUCKET/codeintel.sql --database=codeintel-db
gcloud --project $TARGET_INSTANCE_PROJECT sql import sql $TARGET_CLOUD_SQL_INSTANCE gs://$SOURCE_GCS_BUCKET/codeinsights.sql --database=codeinsights-db
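Each import can take a while. From another terminal, you can watch progress via the Cloud SQL operations list (a suggestion, using the variables exported above):
gcloud --project "$TARGET_INSTANCE_PROJECT" sql operations list --instance="$TARGET_CLOUD_SQL_INSTANCE" --limit=3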
Upgrade databases
If the Sourcegraph version of the imported databases is behind the version running on Cloud, you must run a database migration, where `$FROM_VERSION` is the customer's on-prem Sourcegraph version:
mi2 instance debug migrate-db --from-version="$FROM_VERSION" --auto-approve
Spin up instance
If all upgrades succeed, spin up the instance:
mi2 generate kustomize
# Compare
kustomize build --load-restrictor LoadRestrictionsNone --enable-helm kubernetes/ | kubectl --kubeconfig=$(mi2 instance kubeconfig) diff -f -
# Apply
kustomize build --load-restrictor LoadRestrictionsNone --enable-helm kubernetes/ | kubectl --kubeconfig=$(mi2 instance kubeconfig) apply -f -
Verify that all of the instance’s Deployments and StatefulSets have been scaled up:
kubectl get deployments -n <instance_namespace>
kubectl get sts -n <instance_namespace>
# if necessary, scale up any outstanding Deployments and/or StatefulSets
kubectl scale deployment <deployment_name> -n <instance_namespace> --replicas=1
kubectl scale sts <sts_name> -n <instance_namespace> --replicas=1
Set up SOAP configuration:
mi2 instance check -enforce -force-apply soap
The instance will need `externalURL` set to the instance domain for SOAP to work - follow this guide to directly edit the instance’s site configuration. Additionally, make sure that basic/builtin auth is enabled so that we can configure a password:
{
"externalURL": "https://<instance-display-name>.sourcegraphcloud.com",
"auth.providers": [
// ...
{ "type": "builtin" }
]
}
Visit go/cloud-ops and locate the instance, then follow the instructions from the “Log in to the instance UI” section to log in to the UI. Then create the Sourcegraph service account manually:
- Username: `cloud-admin`
- Email: `managed+<instance-display-name>@sourcegraph.com`
Run `openssl rand -hex 32` in your terminal and use the output as the password. Also save the password to the `SOURCEGRAPH_ADMIN_PASSWORD` GSM secret in the Cloud v2 instance project. Then copy the password reset link from creating the user and open it in an incognito tab to set the new user’s password. If you missed the link, you can recreate it from Site Admin -> Users -> dropdown menu -> “Reset password”.
Then delete the `SOURCEGRAPH_ADMIN_TOKEN` GSM secret in the Cloud v2 instance project, as it is no longer valid.
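A minimal sketch of these two GSM operations with `gcloud`, assuming the `SOURCEGRAPH_ADMIN_PASSWORD` secret already exists in the instance project and `$NEW_PASSWORD` is a hypothetical variable holding the generated password:
# add the new password as a secret version
printf '%s' "$NEW_PASSWORD" | gcloud secrets versions add SOURCEGRAPH_ADMIN_PASSWORD --project="$TARGET_INSTANCE_PROJECT" --data-file=-
# delete the now-invalid admin token secret
gcloud secrets delete SOURCEGRAPH_ADMIN_TOKEN --project="$TARGET_INSTANCE_PROJECT"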
You must also promote the new `cloud-admin` user to Site Admin: find the user in the Users page (`/site-admin/users?searchText=cloud-admin`), and from the overflow menu select Promote to Site Admin.
Enforce all invariants, now that the service account has been set up:
# Enforce invariants that will finalize the service account setup
mi2 instance check -enforce -label service-account
# Make sure all invariants are applied, including inviting the customer admin again
# Note that $CUSTOMER_ADMIN_EMAIL must match the one the Cloud instance was initially created with
mi2 instance check -enforce -customer-admin-email $CUSTOMER_ADMIN_EMAIL
# Verify full invariants suite again
mi2 instance check
Now that the service account has been promoted to a SOAP service account, we should revert any changes to `"auth.providers"` we made earlier.
Run an acceptance test using the downloaded `summary.json` from the snapshot bucket:
export SRC_ACCESS_TOKEN=$(gcloud secrets versions access --project=$TARGET_INSTANCE_PROJECT --secret=SOURCEGRAPH_ADMIN_TOKEN latest)
export SRC_ENDPOINT="..." # set to instance URL
src login # to the instance
src snapshot test -summary-path="gs://$SOURCE_GCS_BUCKET/summary.json"
Final Steps
After the data migration is complete, the site admin should remove the migration notice that was previously added.
Additionally, make sure to re-enable alerting by editing the instance `config.yaml` in `sourcegraph/cloud` as follows:
spec:
debug:
- enableAlerting: false
+ enableAlerting: true
Then regenerate Terraform manifests:
mi2 generate cdktf
Finally, backfill the instance to the control plane following the instructions in the “Backfill instance into control plane” section in the instance-specific dashboard from go/cloud-ops.
Then commit your changes as a pull request. Once it has been merged, confirm the changes have been applied in Terraform Cloud.