2.1 (latest)
Search
K
Links

Migration Guide from Thanos to Kubecost 2.0 (Aggregator)

This tutorial is intended to help our users migrate from the legacy Thanos federation architecture to Kubecost v2.0's Aggregator. There are a few requirements in order to successfully migrate to Kubecost v2.0. This new version of Kubecost includes a new backend Aggregator which handles the ETL data built from source metrics more efficiently. Kubecost v2.0 provides new features, optimizes UI performance, and enhances the user experience. This tutorial is meant to be performed before the user upgrades from an older version of Kubecost to v2.0.
Important notes for the migration process:
  • For best experience, please upgrade to Kubecost v1.108.1 before following this document.
  • Once Aggregator is enabled, all queries hit the Aggregator container and not cost-model via the reverse proxy.
  • The ETL-Utils container only creates additional files in the object store. It does not delete any data/metrics.
  • For larger environments, the StorageClass must have 1GBPS throughput.
  • Having enough storage is critically important and will vary based on environment. Bias towards a larger value for aggregatorDBStorage.storageRequest at the beginning and cut it down if persistent low utilization is observed.

Key changes

  • Assets and Allocations are now paginated using offset/limit parameters
  • New data available for querying every 2 hours (can be adjusted)
  • Substantial query speed improvements even when pagination not in effect
  • Data ingested into and queried from Aggregator component instead of directly from bingen files
  • Distributed tracing integrated into core workflows
  • No more pre-computed "AggStores"; this reduces the memory footprint of Kubecost
    • Request-level caching still in effect

Aggregator architecture

Aggrgeator diagram

Migration process

All of these steps should be performed on Kubecost v1.108.1. Only at the end, will you upgrade to Kubecost 2.0. The goal of this doc is to gradually migrate off Thanos (it is no longer supported in the Kubecost v2.0 Helm chart), towards Federated ETL, then finally Aggregator.

Step 1: Use the existing Thanos object store or create a new dedicated object store

This object store will be where the ETL backups will be pushed to from the primary cluster's cost-model. If you are using a metric Federation tool which does not require an object store, or otherwise do not want to use an existing Thanos object store, you will have to create a new one.
If this is your first time setting up an object store, refer to these docs:

Step 2: Enable ETL backups on the primary cluster only

Enabling ETL backups ensures Kubecost persists historical data in durable storage (outside of Thanos) and stores the data in a format consumable by the ETL Utils container. The ETL Utils container transforms that data and writes it to a separate location in the object store for consumption by Aggregator.
kubecostModel:
etlBucketConfigSecret: <YOUR_SECRET_NAME>

Step 3: Validate that an /etl directory is present in the object store

There should be ETL data present in the following directories. CloudCosts will only have ETL data if at least one cloud integration is enabled.
  • /etl/bingen/allocations
  • /etl/bingen/assets/
  • /cloudcosts/

Step 4: Create a new federated-store secret on the primary cluster

This will point to the existing Thanos object store or the new object store created in Step 1. The secret should be identical to your object-store.yaml, with the exception that this new secret must be named federated-store.yaml.
The name of the .yaml file used to create the secret must be named federated-store.yaml or Aggregator will not start.
kubectl create secret generic federated-store --from-file=federated-store.yaml -n kubecost

Step 5: Enable FederatedETL, ETL-Utils, and Aggregator on the primary cluster

Do not disable Thanos during this step. Thanos will be disabled in a later step.
Enabling FederatedETL will begin pushing your primary cluster's ETL data to the directory /federated/CLUSTER_ID. This setting will remain on once disabling Thanos at a later step.
Enabling ETL-Utils will create the directories /federated/CLUSTER_ID for every primary/secondary cluster based on the full set of data in the /etl directory.
Enabling Aggregator will begin processing the ETL data from the /federated directory. The aggregator will then serve all Kubecost queries.
kubecostModel:
federatedStorageConfigSecret: federated-store
etlUtils:
enabled: true
thanosSourceBucketSecret: kubecost-thanos
fullImageName: gcr.io/kubecost1/cost-model-etl-utils:latest
resources: {}
kubecostAggregator:
enabled: true
replicas: 1

Step 6: Apply the changes and wait for data to populate in the UI

Ensure all pods and containers are running:
kubecost-cost-analyzer-685fd8f677-k652h 4/4 Running 0 3h2m
kubecost-etl-utils-6cdd489596-5dl75 1/1 Running 0 6d20h
This step can take some time depending on how much data the Aggregator must process. A couple importants steps are happening in the background:
  • The ETL Utils image is building the directory structure in the object store needed by Aggregator to pull the ETL data.
  • SQL tables are building.
Ensure all data loads into the Kubecost UI before moving onto Step 7.

Step 7: Upgrade your secondary clusters to build and push ETL data

Using the same federated-store.yaml created in Step 4, create this secret and add it to the values.yaml file for all secondary clusters.
kubectl create secret generic federated-store --from-file=federated-store.yaml -n kubecost
federatedETL:
federatedCluster: true
kubecostModel:
federatedStorageConfigSecret: federated-store
etl: true
containerStatsEnabled: true
warmCache: false
warmSavingsCache: false
Optionally, you can remove the Thanos sidecar running on this secondary cluster. If left on, this secondary cluster will continue to push Prometheus metrics to the object store which can be used as a backup.

Step 8: Remove Thanos configuration from the primary cluster

Remove Thanos configurations from your primary cluster since it is no longer included in the Helm chart in 2.0. Your Kubecost installation is now configured to query the Aggregator for multi-cluster data, instead of Thanos.

Step 9: Upgrade to Kubecost 2.0

You can now upgrade to Kubecost 2.0 using your standard upgrade process. If upgrading via Helm, your upgrade command will look like:
helm upgrade kubecost cost-analyzer --repo https://kubecost.github.io/cost-analyzer/ \
--namespace kubecost \
-f values.yaml

Troubeshooting

To help diagnose problems with Aggregator, check the Aggregator container logs for query failures or SQL table failures. If you have additional questions, contact Kubecost support at [email protected].