1 of 12

Multi-Cluster

Kubecost Free can now be installed on an unlimited number of individual clusters. Larger teams will benefit from using Kubecost Enterprise to better manage many clusters. See pricing for more details.

Primary and secondary clusters

In an Enterprise multi-cluster setup, the UI is accessed through a designated primary cluster. All other clusters in the environment send metrics to a central object-store with a lightweight agent (aka secondary clusters). The primary cluster is designated by setting the Helm flag .Values.federatedETL.primaryCluster=true, which instructs this cluster to read from the combined folder that was processed by the federator. This cluster will consume additional resources to run the Kubecost UI and backend.

As of Kubecost 1.108, agent health is monitored by a diagnostic pod that collects information from the local cluster and sends it to an object-store. This data is then processed by the Primary cluster and accessed via the UI and API.

Because the UI is only accessible through the primary cluster, Helm flags related to UI display are not applied to secondary clusters.

Enterprise Federation

This feature is only supported for Kubecost Enterprise.

There are two primary methods to aggregate all cluster information back to a single Kubecost UI:

Kubecost ETL Federation (preferred)
Thanos Federation

Both methods allow for greater compute efficiency by running the most resource-intensive workloads on a single primary cluster.

For environments that already have a Prometheus instance, ETL Federation may be preferred because only a single Kubecost pod is required.

The below diagrams highlight the two architectures:

Kubecost ETL Federation (Preferred)

Kubecost Thanos Federation

ETL Federation (preferred)

Federated ETL is only officially supported for Kubecost Enterprise plans.

Federated extract, transform, load (ETL) is one of two methods to aggregate all cluster information back to a single display described in our doc. Federated ETL gives teams the benefit of combining multiple Kubecost installations into one view without dependency on Thanos.

There are two primary advantages for using ETL Federation:

For environments that already have a Prometheus instance, Kubecost only requires a single pod per monitored cluster
Many solutions that aggregate Prometheus metrics (like Thanos), are often expensive to scale in large environments

Kubecost ETL Federation diagram

Sample configurations

This guide has specific detail on how ETL Configuration works and deployment options.

Clusters

The federated ETL is composed of three types of clusters.

Federated Clusters: The clusters which are being federated (clusters whose data will be combined and viewable at the end of the federated ETL pipeline). These clusters upload their ETL files after they have built them to Federated Storage.
Federator Clusters: The cluster on which the Federator (see in Other components) is set to run within the core cost-analyzer container. This cluster combines the Federated Cluster data uploaded to federated storage into combined storage.
Primary Cluster: A cluster where you can see the total Federated data that was combined from your Federated Clusters. These clusters read from combined storage.

These cluster designations can overlap, in that some clusters may be several types at once. A cluster that is a Federated Cluster, Federator Cluster, and Primary Cluster will perform the following functions:

As a Federated Cluster, push local cluster cost data to be combined from its local ETL build pipeline.
As a Federator Cluster, run the Federator inside the cost-analyzer, which pulls this local cluster data from S3, combines them, then pushes them back to combined storage.
As a Primary Cluster, pull back this combined data from combined storage to serve it on Kubecost APIs and/or the Kubecost frontend.

Other components

The Storages referred to here are an S3 (or GCP/Azure equivalent) storage bucket which acts as remote storage for the Federated ETL Pipeline.

Federated Storage: A set of folders on paths <bucket>/federated/<cluster id> which are essentially ETL backup data, holding a “copy” of Federated Cluster data. Federated Clusters push this data to Federated Storage to be combined by the Federator. Federated Clusters write this data, and the Federator reads this data.
Combined Storage: A folder on S3 on the path <bucket>/federated/combined which holds one set of ETL data containing all the allocations/assets in all the ETL data from Federated Storage. The Federator takes files from Federated Storage and combines them, adding a single set of combined ETL files to Combined Storage to be read by the Primary Cluster. The Federator writes this data, and the Primary Cluster reads this data.
The Federator: A component of the cost-model which is run on the Federator Cluster, which can be a Federated Cluster, a Primary Cluster, or neither. The Federator takes the ETL binaries from Federated Storage and merges them, adding them to Combined Storage.
Federated ETL: The pipeline containing the above components.

Federated ETL architecture

This diagram shows an example setup of the Federated ETL with:

Three pure Federated Clusters (not classified as any other cluster type): Cluster 1, Cluster 2, and Cluster 3
One Federator Cluster that is also a Federated Cluster: Cluster 4
One Primary Cluster that is also a Federated Cluster: Cluster 5

The result is 5 clusters federated together.

Setup

Step 0: Ensure unique cluster IDs

Ensure each federated cluster has a unique clusterName and cluster_id:

kubecostProductConfigs:
  clusterName: federated-one
prometheus:
  server:
    global:
      external_labels:
        cluster_id: federated-one

Step 1: Storage configuration

For any cluster in the pipeline (Federator, Federated, Primary, or any combination of the three), create a file federated-store.yaml with the same format used for Thanos/S3 backup.
Add a secret using that file: kubectl create secret generic <secret_name> -n kubecost --from-file=federated-store.yaml. Then set .Values.kubecostModel.federatedStorageConfigSecret to the kubernetes secret name.

Using an existing `object-store.yaml`

This method is not recommended, as it would enable the ETL Backup pipeline to run in addition to the the Federated ETL pipeline. If not configured correctly, there may be adverse effects on how ETLs are loaded into your primary.

If you have an existing storage configuration set via .Values.kubecostModel.etlBucketConfigSecret, you can re-use that existing config by setting the following values:

kubecostModel:
  etlBucketConfigSecret: "my-object-store-secret"
federatedETL:
  useExistingS3Config: true
  redirectS3Backup: true

Step 2: Cluster configuration (Federated/Federator)

For all clusters you want to federate together (i.e. see their data on the Primary Cluster), set .Values.federatedETL.federatedCluster to true. This cluster is now a Federated Cluster, and can also be a Federator or Primary Cluster.
For the cluster “hosting” the Federator, set .Values.federatedETL.federator.enabled to true. This cluster is now a Federator Cluster, and can also be a Federated or Primary Cluster.
- Optional: If you have any Federated Clusters pushing to a store that you do not want a Federator Cluster to federate, add the cluster id under the Federator config section .Values.federatedETL.federator.clusters. If this parameter is empty or not set, the Federator will take all ETL files in the /federated directory and federate them automatically.
- Multiple Federators federating from the same source will not break, but it’s not recommended.

Step 3: Cluster configuration (Primary)

In Kubecost, the Primary Cluster serves the UI and API endpoints as well as reconciling cloud billing (cloud-integration).

For the cluster that will be the Primary Cluster, set .Values.federatedETL.primaryCluster to true. This cluster is now a Primary Cluster, and can also be a Federator or Federated Cluster.
Cloud-integration requires .Values.federatedETL.federator.primaryClusterID set to the same value used for .Values.kubecostProductConfigs.clusterName
- Important: If the Primary Cluster is also to be federated, please wait 2-3 hours for data to populate Federated Storage before setting a Federated Cluster to primary (i.e. set .Values.federatedETL.federatedCluster to true, then wait to set .Values.federatedETL.primaryCluster to true). This allows for maximum certainty of data consistency.
- If you do not set this cluster to be federated as well as primary, you will not see local data for this cluster.
- The Primary Cluster’s local ETL will be overwritten with combined federated data.
  - This can be undone by unsetting it as a Primary Cluster and rebuilding ETL.
  - Setting a Primary Cluster may result in a loss of the cluster’s local ETL data, so it is recommended to back up any filestore data that one would want to save to S3 before designating the cluster as primary.
  - Alternatively, a fresh Kubecost install can be used as a consumer of combined federated data by setting it as the Primary but not a Federated Cluster.

Step 4: Verifying successful configuration

The Federated ETL should begin functioning. On any ETL action on a Federated Cluster (Load/Put into local ETL store) the Federated Clusters will add data to Federated Storage. The Federator will run 5 minutes after the Federator Cluster startup, and then every 30 minutes after that. The data is merged into the Combined Storage, where it can be read by the Primary.
- To verify Federated Clusters are uploading their data correctly, check the container logs on a Federated Cluster. It should log federated uploads when ETL build steps run. The S3 bucket can also be checked to see if data is being written to the /federated/<cluster_id> path.
- To verify the Federator is functioning, check the container logs on the Federator Cluster. The S3 bucket can also be checked to verify that data is being written to /federated/combined.
- To verify the entire pipeline is working, either query Allocations/Assets or view the respective views on the frontend. Multi-cluster data should appear after:
  - The Federator has run at least once.
  - There was data in the Federated Storage for the Federator to have combined.

Setup with internal certificate authority

If you are using an internal certificate authority (CA), follow this tutorial instead of the above Setup section.

Begin by creating a ConfigMap with the certificate provided by the CA on every agent, including the Federator and any federated clusters, and name the file kubecost-federator-certs.yaml.

apiVersion: v1
data:
  ca-certificates.crt: |-
    # CA Cert
    -----BEGIN CERTIFICATE-----
    abc . . . . . . . . . . . . . .
    . . . . . . . . . . . . . . . .
    . . . . . . . . . . . . . . . .
    -----END CERTIFICATE-----

    # Root Cert
    -----BEGIN CERTIFICATE-----
    xyz . . . . . . . . . . . . . .
    . . . . . . . . . . . . . . . .
    . . . . . . . . . . . . . . . .
    -----END CERTIFICATE-----

kind: ConfigMap
  name: kubecost-federator-certs
  namespace: kubecost

Now run the following command, making sure you specify the location for the ConfigMap you created:

kubectl create cm kubecost-federator-certs --from-file=/path/to/kubecost-federator-certs.yaml

Mount the certification on the Federator and any federated clusters by passing these Helm flags to your values.yaml/manifest:

extraVolumes:
  - name: kubecost-federator-certs
    configMap:
      name: kubecost-federator-certs
extraVolumeMounts:
  - name: kubecost-federator-certs
    mountPath: /path/to/ca-certificates.crt
    subPath: ca-certificates.crt

federatedETL:
  federator:
    extraVolumes:
      - name: kubecost-federator-certs
        configMap:
          name: kubecost-federator-certs
    extraVolumeMounts:
      - name: kubecost-federator-certs
        mountPath: /path/to/ca-certificates.crt
        subPath: ca-certificates.crt

Create a file federated-store.yaml, which will go on all clusters:

type: S3
config:
  bucket: "kubecost-storage"
  endpoint: <S3 endpoint>
  region: <region>
  aws_sdk_auth: true                                      
  insecure: false
  signature_version2: false
  put_user_metadata:
    "X-Amz-Acl": "bucket-owner-full-control"
  http_config:
    idle_conn_timeout: 90s
    response_header_timeout: 2m
    insecure_skip_verify: false
    tls_config:                                          
      ca_file: "/path/to/ca-certificates.crt"            
      cert_file: "CERT.pem"                              
      key_file: "KEY.PEM"                                 
      insecure_skip_verify: false                         
  trace:
    enable: true
  part_size: 134217728          
  sts_endpoint: <STS endpoint>

Now run the following command (omit kubectl create namespace kubecost if your kubecost namespace already exists, or this command will fail):

kubectl create namespace kubecost
kubectl create secret generic \
  kubecost-object-store -n kubecost \
  --from-file=federated-store.yaml

Kubecost Aggregator

Aggregator is a new backend for Kubecost. It is used in a configuration without Thanos, replacing the component. Aggregator serves a critical subset of Kubecost APIs, but will eventually be the default model for Kubecost and serve all APIs. Currently, Aggregator supports all major monitoring and savings APIs, and also budgets and reporting.

Existing documentation for Kubecost APIs will use endpoints for non-Aggregator environments unless otherwise specified, but will still be compatible after configuring Aggregator.

Aggregator is designed to accommodate queries of large-scale datasets by improving API load times and reducing UI errors. It is not designed to introduce new functionality; it is meant to improve functionality at scale.

Aggregator is currently free for all Enterprise users to configure, and is always able to be rolled back.

Configuring Aggregator

Prerequisites

Aggregator can only be configured in a Federated ETL environment
Must be using v1.107.0 of Kubecost or newer
Your values.yaml file must have set kubecostDeployment.queryServiceReplicas to its default value 0.
You must have your context set to your primary cluster. Kubecost Aggregator cannot be deployed on secondary clusters.

Tutorial

Select from one of the two templates below and save the content as federated-store.yaml. This will be your configuration template required to set up Aggregator.

The name of the .yaml file used to create the secret must be named federated-store.yaml or Aggregator will not start.

Basic configuration:

kubecostAggregator:
  replicas: 1
  enabled: true
  cloudCost:
    enabled: true
federatedETL:
  federatedCluster: true
kubecostModel:
  containerStatsEnabled: true
  cloudCost:
    enabled: false
  federatedStorageConfigSecret: federated-store
kubecostProductConfigs:
  clusterName: YOUR_CLUSTER_NAME
  cloudIntegrationSecret: cloud-integration
  productKey:
    enabled: true
    key: YOUR_KEY
prometheus:
  server:
    global:
      external_labels:
        cluster_id: YOUR_CLUSTER_NAME
# when using managed identity/irsa, set the service account accordingly:
serviceAccount:
  create: false
  name: kubecost-irsa-sa

Advanced configuration (for larger deployments):

kubecostAggregator:
  replicas: 1
  enabled: true
  cloudCost:
    enabled: true
  env:
    # governs parallelism of derivation step
    # more threads speeds derivation, but requires significantly more 
    # log level
    # default: info
    LOG_LEVEL: info
  aggregatorDbStorage:
    # governs storage size of aggregator DB storage
    # !!NOTE!! disk performance is _critically important_ to aggregator performance
    # ensure disk is specd high enough, and check for bottlenecks
    # default: 128Gi
    storageRequest: 512Gi
federatedETL:
  federatedCluster: true
kubecostModel:
  containerStatsEnabled: true
  federatedStorageConfigSecret: federated-store
kubecostProductConfigs:
  clusterName: YOUR_CLUSTER_NAME
  cloudIntegrationSecret: cloud-integration
  productKey:
    enabled: true
    key: YOUR_KEY
prometheus:
  server:
    global:
      external_labels:
        cluster_id: YOUR_CLUSTER_NAME
# when using managed identity/irsa, set the service account accordingly:
serviceAccount:
  create: false
  name: kubecost-irsa-sa

There is no baseline for what is considered a larger deployment, which will be dependent on load times in your Kubecost environment.

Once you’ve configured your federated-store.yaml_, create a secret using the following command:

kubectl create secret generic federated-storage -n kubecost --from-file=federated-store.yaml

kubectl create secret generic cloud-integration -n kubecost --from-file=cloud-integration.json

Finally, upgrade your existing Kubecost installation. This command will install Kubecost if it does not already exist:

Upgrading your existing Kubecost using your configured federated-store.yaml_ file above will reset all existing Helm values configured in your values.yaml. If you wish to preserve any of those changes, append your values.yaml by adding the contents of your federated-store.yaml file into it, then replacing federated-store.yaml with values.yaml in the upgrade command below:

helm upgrade --install "kubecost-primary" \
--namespace kubecost-primary \
--repo https://kubecost.github.io/cost-analyzer/ cost-analyzer \
-f aggregator.yaml

Validating Aggregator pod is running successfully

When first enabled, the aggregator pod will ingest the last three years (if applicable) of ETL data from the federated-store. This may take several hours. Because the combined folder is ignored, the federator pod is not used here, but can still run if needed. You can run kubectl get pods and ensure the aggregator pod is running, but should still wait for all data to be ingested.

Backups and Alerting

Federated ETL Architecture is only officially supported on Kubecost Enterprise plans.

This doc provides recommendations to improve the stability and recoverability of your Kubecost data when deploying in a Federated ETL architecture.

Option 1: Increase Prometheus retention

Kubecost can rebuild its extract, transform, load (ETL) data using Prometheus metrics from each cluster. It is strongly recommended to retain local cluster Prometheus metrics that meet an organization's disaster recovery requirements.

prometheus:
  server:
    retention: 21d
  # Ensure the volume is large enough to hold all metrics
  persistentVolume:
    size: 32Gi
    enabled: true

Option 2: Metrics backup

For long term storage of Prometheus metrics, we recommend setting up a Thanos sidecar container to push Prometheus metrics to a cloud storage bucket.

# This is an abridged example. Full example in link below.
prometheus:
  server:
    extraArgs:
      storage.tsdb.min-block-duration: 2h
      storage.tsdb.max-block-duration: 2h
    extraVolumes:
    - name: object-store-volume
      secret:
        secretName: kubecost-thanos
    sidecarContainers:
    - name: thanos-sidecar
      image: thanosio/thanos:v0.30.2
      args:
        - sidecar
        - --prometheus.url=http://127.0.0.1:9090
        - --objstore.config-file=/etc/config/object-store.yaml
      volumeMounts:
      - name: object-store-volume
        mountPath: /etc/config
      - name: storage-volume
        mountPath: /data
        subPath: ""

Option 3: Bucket versioning

Use your cloud service provider's bucket versioning feature to take frequent snapshots of the bucket holding your Kubecost data (ETL files and Prometheus metrics).

Option 4: Alerting

Thanos Federation

This feature is only officially supported on Kubecost Enterprise plans.

Thanos is a tool to aggregate Prometheus metrics to a central object storage (S3 compatible) bucket. Thanos is implemented as a sidecar on the Prometheus pod on all clusters. Thanos Federation is one of two primary methods to aggregate all cluster information back to a single view as described in our Multi-Cluster article.

The preferred method for multi-cluster is ETL Federation. The configuration guide below is for Kubecost Thanos Federation, which may not scale as well as ETL Federation in large environments.

This guide will cover how to enable Thanos on your primary cluster, and on any additional secondary clusters.

Configuration

Follow steps here to enable all required Thanos components on a Kubecost primary cluster, including the Prometheus sidecar.
For each additional cluster, only the Thanos sidecar is needed.

Consider the following Thanos recommendations for secondaries:

* Reuse your existing storage bucket and access credentials.
* Do not deploy multiple instances of `thanos-compact`.
* Optionally deploy `thanos-bucket` in each additional cluster, but it is not required.
* Optionally disable `thanos.store` and `thanos.query` (Clusters with store/query disabled will only have access to their metrics but will still write to the global bucket.)

Thanos modules can be disabled in [thanos/values.yaml](https://github.com/kubecost/cost-analyzer-helm-chart/blob/master/cost-analyzer/charts/thanos/values.yaml), or in [values-thanos.yaml](https://github.com/kubecost/cost-analyzer-helm-chart/blob/develop/cost-analyzer/values-thanos.yaml) if overriding these values from a values-thanos.yaml file supplied from the command line (`helm upgrade kubecost -f values.yaml -f values-thanos.yaml`), or by passing these parameters directly via Helm install or upgrade as follows:

```
--set thanos.compact.enabled=false --set thanos.bucket.enabled=false
```

You can also optionally disable `thanos.store`, `thanos.query` and `thanos.queryFrontend` with thanos/values.yaml or with these flags:

```
--set thanos.query.enabled=false --set thanos.store.enabled=false --set thanos.queryFrontend.enabled=false
```

Ensure you provide a unique identifier for prometheus.server.global.external_labels.cluster_id to have additional clusters be visible in the Kubecost product, e.g. cluster-two.

cluster_id can be replaced with another label (e.g. cluster) by modifying .Values.kubecostModel.promClusterIDLabel.

Follow the same verification steps available here.

Sample configurations for each cloud provider can be found here.

Architecture diagram

Configuring Thanos

This feature is only officially supported on .

Kubecost leverages Thanos and durable storage for three different purposes:

Centralize metric data for a global multi-cluster view into Kubernetes costs via a Prometheus sidecar
Allow for unlimited data retention
Backup Kubecost

To enable Thanos, follow these steps:

Step 1: Create object-store.yaml

This step creates the object-store.yaml file that contains your durable storage target (e.g. GCS, S3, etc.) configuration and access credentials. The details of this file are documented thoroughly in .

We have guides for using cloud-native storage for the largest cloud providers. Other providers can be similarly configured.

Use the appropriate guide for your cloud provider:

Step 2: Create object-store secret

Create a secret with the .yaml file generated in the previous step:

kubectl create secret generic kubecost-thanos -n kubecost --from-file=./object-store.yaml

Step 3: Unique Cluster ID

Each cluster needs to be labelled with a unique Cluster ID, which is done in two places.

values-clusterName.yaml

kubecostProductConfigs:
  clusterName: kubecostProductConfigs_clusterName
prometheus:
  server:
    global:
      external_labels:
        cluster_id: kubecostProductConfigs_clusterName

Step 4: Deploying Kubecost with Thanos

The Thanos subchart includes thanos-bucket, thanos-query, thanos-store, thanos-compact, and service discovery for thanos-sidecar. These components are recommended when deploying Thanos on the primary cluster.

helm upgrade kubecost kubecost/cost-analyzer \
    --install \
    --namespace kubecost \
    -f https://raw.githubusercontent.com/kubecost/cost-analyzer-helm-chart/v1.108/cost-analyzer/values-thanos.yaml \
    -f values-clusterName.yaml

The thanos-store container is configured to request 2.5GB memory, this may be reduced for smaller deployments. thanos-store is only used on the primary Kubecost cluster.

To verify installation, check to see all Pods are in a READY state. View Pod logs for more detail and see common troubleshooting steps below.

Troubleshooting

Thanos sends data to the bucket every 2 hours. Once 2 hours have passed, logs should indicate if data has been sent successfully or not.

You can monitor the logs with:

kubectl logs --namespace kubecost -l app=prometheus -l component=server --prefix=true --container thanos-sidecar --tail=-1 | grep uploaded

Monitoring logs this way should return results like this:

[pod/kubecost-prometheus-server-xxx/thanos-sidecar] level=debug ts=2022-06-09T13:00:10.084904136Z caller=objstore.go:206 msg="uploaded file" from=/data/thanos/upload/BUCKETID/chunks/000001 dst=BUCKETID/chunks/000001 bucket="tracing: kc-thanos-store"

As an aside, you can validate the Prometheus metrics are all configured with correct cluster names with:

kubectl logs --namespace kubecost -l app=prometheus -l component=server --prefix=true --container thanos-sidecar --tail=-1 | grep external_labels

To troubleshoot the IAM Role Attached to the serviceaccount, you can create a Pod using the same service account used by the thanos-sidecar (default is kubecost-prometheus-server):

s3-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  labels:
    run: s3-pod
  name: s3-pod
spec:
  serviceAccountName: kubecost-prometheus-server
  containers:
  - image: amazon/aws-cli
    name: my-aws-cli
    command: ['sleep', '500']

kubectl apply -f s3-pod.yaml
kubectl exec -i -t s3-pod -- aws s3 ls s3://kc-thanos-store

This should return a list of objects (or at least not give a permission error).

Cluster not writing data to thanos bucket

If a cluster is not successfully writing data to the bucket, review thanos-sidecar logs with the following command:

kubectl logs kubecost-prometheus-server-<your-pod-id> -n kubecost -c thanos-sidecar

Logs in the following format are evidence of a successful bucket write:

level=debug ts=2019-12-20T20:38:32.288251067Z caller=objstore.go:91 msg="uploaded file" from=/data/thanos/upload/BUCKET-ID/meta.json dst=debug/metas/BUCKET-ID.json bucket=kc-thanos

Stores not listed at the `/stores` endpoint

If thanos-query can't connect to both the sidecar and the store, you may want to directly specify the store gRPC service address instead of using DNS discovery (the default). You can quickly test if this is the issue by running:

kubectl edit deployment kubecost-thanos-query -n kubecost

and adding

--store=kubecost-thanos-store-grpc.kubecost:10901

to the container args. This will cause a query restart and you can visit /stores again to see if the store has been added.

If it has, you'll want to use these addresses instead of DNS more permanently by setting .Values.thanos.query.stores in values-thanos.yaml.

...
thanos:
  store:
    enabled: true
    grpcSeriesMaxConcurrency: 20
    blockSyncConcurrency: 20
    extraEnv:
      - name: GOGC
        value: "100"
    resources:
      requests:
        memory: "2.5Gi"
  query:
    enabled: true
    timeout: 3m
    # Maximum number of queries processed concurrently by query node.
    maxConcurrent: 8
    # Maximum number of select requests made concurrently per a query.
    maxConcurrentSelect: 2
    resources:
      requests:
        memory: "2.5Gi"
    autoDownsampling: false
    extraEnv:
      - name: GOGC
        value: "100"
    stores:
      - "kubecost-thanos-store-grpc.kubecost:10901"

Additional Troubleshooting

A common error is as follows, which means you do not have the correct access to the supplied bucket:

thanos-svc-account@project-227514.iam.gserviceaccount.com does not have storage.objects.list access to thanos-bucket., forbidden"

Assuming pods are running, use port forwarding to connect to the thanos-query-http endpoint:

kubectl port-forward svc/kubecost-thanos-query-http 8080:10902 --namespace kubecost

If you navigate to Stores using the top navigation bar, you should be able to see the status of both the thanos-store and thanos-sidecar which accompanied the Prometheus server:

Also note that the sidecar should identify with the unique cluster_id provided in your values.yaml in the previous step. Default value is cluster-one.

The default retention period for when data is moved into the object storage is currently 2h. This configuration is based on Thanos suggested values. By default, it will be 2 hours before data is written to the provided bucket.

Thanos Upgrade

Kubecost v1.67.0+ uses Thanos 0.15.0. If you're upgrading to Kubecost v1.67.0+ from an older version and using Thanos, with AWS S3 as your backing storage for Thanos, you'll need to make a small change to your Thanos Secret in order to bump the Thanos version to 0.15.0 before you upgrade Kubecost.

Thanos 0.15.0 has over 10x performance improvements, so this is recommended.

Your values-thanos.yaml needs to be updated to the new defaults here. The PR bumps the image version, adds the query-frontend component, and increases concurrency.

This is simplified if you're using our default values-thanos.yaml, which has the new configs already.

For the Thanos Secret you're using, the encrypt-sse line needs to be removed. Everything else should stay the same.

For example, view this sample config:

type: S3
config:
  bucket: ${bucket_name}
  endpoint: "s3.amazonaws.com"
  region: ${your_bucket_region}
  access_key: ${your_access_key}
  insecure: false
  signature_version2: false
  #encrypt_sse: false <-- THIS LINE NEEDS TO BE DELETED
  secret_key: ${your_secret_here}
  put_user_metadata:
      "X-Amz-Acl": "bucket-owner-full-control"
  http_config:
    idle_conn_timeout: 90s
    response_header_timeout: 2m
    insecure_skip_verify: false
  trace:
    enable: true
  part_size: 134217728

The easiest way to do this is to delete the existing secret and upload a new one:

kubectl delete secret -n kubecost kubecost-thanos

Update your secret .YAML file as above, and save it as object-store.yaml.

kubectl create secret generic kubecost-thanos -n kubecost --from-file=./object-store.yaml

Once this is done, you're ready to upgrade!

Long-Term Storage Configuration

AWS Multi-Cluster Storage Configuration

AWS/S3 Federation

Kubecost uses a shared storage bucket to store metrics from clusters, known as durable storage, in order to provide a single-pane-of-glass for viewing cost across many clusters. Multi-cluster is an enterprise feature of Kubecost.

There are multiple methods to provide Kubecost access to an S3 bucket. This guide has two examples:

Using a Kubernetes secret
Attaching an AWS Identity and Access Management (IAM) role to the service account used by Prometheus

Both methods require an S3 bucket. Our example bucket is named kc-thanos-store.

This is a simple S3 bucket with all public access blocked. No other bucket configuration changes should be required.

Once created, add an IAM policy to access this bucket. See our AWS Thanos IAM Policy doc for instructions.

Method 1: Kubernetes Secret Method

To use the Kubernetes secret method for allowing access, create a YAML file named object-store.yaml with contents similar to the following example. See region to endpoint mappings here.


type: S3
config:
  bucket: "kc-thanos-store"
  endpoint: "s3.amazonaws.com"
  region: "us-east-1"
  access_key: "<your-access-key>"
  secret_key: "<your-secret-key>"
  insecure: false
  signature_version2: false
  put_user_metadata:
      "X-Amz-Acl": "bucket-owner-full-control"
  http_config:
    idle_conn_timeout: 90s
    response_header_timeout: 2m
    insecure_skip_verify: false
  trace:
    enable: true
  part_size: 134217728

Method 2: Attach IAM role to Service Account Method

Instead of using a secret key in a file, many will want to use this method.

Attach the policy to the Thanos pods service accounts. Your object-store.yaml should follow the format below when using this option, which does not contain the secret_key and access_key fields.

type: S3
config:
  bucket: "kc-thanos-store"
  endpoint: "s3.amazonaws.com"
  region: "us-east-1"
  insecure: false
  signature_version2: false
  put_user_metadata:
      "X-Amz-Acl": "bucket-owner-full-control"
  http_config:
    idle_conn_timeout: 90s
    response_header_timeout: 2m
    insecure_skip_verify: false
  trace:
    enable: true
  part_size: 134217728

Then, follow this AWS guide to enable attaching IAM roles to pods.

You can define the IAM role to associate with a service account in your cluster by creating a service account in the same namespace as Kubecost and adding an annotation to it of the form eks.amazonaws.com/role-arn: arn:aws:iam::<AWS_ACCOUNT_ID>:role/<IAM_ROLE_NAME> as described here.

Once that annotation has been created, configure the following:

.Values.prometheus.serviceAccounts.server.create: false
.Values.prometheus.serviceAccounts.server.name: serviceAccount # to the name of your created service account
.Values.thanos.compact.serviceAccount: serviceAccount
.Values.thanos.store.serviceAccount: serviceAccount

Thanos Encryption With S3 and KMS

You can encrypt the S3 bucket where Kubecost data is stored in AWS via S3 and KMS. However, because Thanos can store potentially millions of objects, it is suggested that you use bucket-level encryption instead of object-level encryption. More details available in these external docs:

Troubleshooting

Visit the Configuring Thanos doc for troubleshooting help.

AWS Thanos IAM Policy

In order to create an AWS IAM policy for use with Thanos:

Navigate to the AWS console and select IAM.
Select Policies in the Navigation menu, then select Create Policy.
Add the following JSON in the policy editor:

Make sure to replace <your-bucket-name> with the name of your newly-created S3 bucket.

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Statement",
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket",
                "s3:GetObject",
                "s3:DeleteObject",
                "s3:PutObject",
                "s3:PutObjectAcl"
            ],
            "Resource": [
                "arn:aws:s3:::<your-bucket-name>/*",
                "arn:aws:s3:::<your-bucket-name>"
            ]
        }
    ]
}
```

4. Select Review policy and name this policy, e.g. kc-thanos-store-policy.

Navigate to Users in IAM control panel, then select Add user.
Provide a username (e.g. kubecost-thanos-service-account) and select Programmatic access.
Select Attach existing policies directly, search for the policy name provided in Step 4, then create the user.
Capture your Access Key ID and secret in the view below:

If you don’t want to use a service account, IAM credentials retrieved from an instance profile are also supported. You must get both access key and secret key from the same method (i.e. both from service or instance profile). More info on retrieving credentials here.

Azure Long-Term Storage

To use Azure Storage as Thanos object store, you need to precreate a storage account from Azure portal or using Azure CLI. Follow the instructions from the Azure Storage Documentation.

Now create a .YAML file named object-store.yaml with the following format:

type: AZURE
config:
  storage_account: ""
  storage_account_key: ""
  container: ""
  # config.endpoint is only needed if primary blob service endpoint domain is not blob.core.windows.net
  # Example: blob.core.chinacloudapi.cn
  endpoint: ""
  max_retries: 0

GCP Long-Term Storage

Start by creating a new Google Cloud Storage bucket. The following example uses a bucket named thanos-bucket. Next, download a service account JSON file from Google's service account manager (steps).

Now create a YAML file named object-store.yaml in the following format, using your bucket name and service account details:

type: GCS
config:
  bucket: "thanos-bucket"
  service_account: |-
    {
      "type": "service_account",
      "project_id": "...",
      "private_key_id": "...",
      "private_key": "...",
      "client_email": "...",
      "client_id": "...",
      "auth_uri": "https://accounts.google.com/o/oauth2/auth",
      "token_uri": "https://oauth2.googleapis.com/token",
      "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
      "client_x509_cert_url": ""
    }

Note: Because this is a YAML file, it requires this specific indention.

Warning: Do not apply a retention policy to your Thanos bucket, as it will prevent Thanos compaction from completing.

Configuring Thanos

This feature is only officially supported on .

Kubecost leverages Thanos and durable storage for three different purposes:

Centralize metric data for a global multi-cluster view into Kubernetes costs via a Prometheus sidecar
Allow for unlimited data retention
Backup Kubecost

To enable Thanos, follow these steps:

Step 1: Create object-store.yaml

We have guides for using cloud-native storage for the largest cloud providers. Other providers can be similarly configured.

Use the appropriate guide for your cloud provider:

Step 2: Create object-store secret

Create a secret with the .yaml file generated in the previous step:

kubectl create secret generic kubecost-thanos -n kubecost --from-file=./object-store.yaml

Step 3: Unique Cluster ID

Each cluster needs to be labelled with a unique Cluster ID, which is done in two places.

values-clusterName.yaml

kubecostProductConfigs:
  clusterName: kubecostProductConfigs_clusterName
prometheus:
  server:
    global:
      external_labels:
        cluster_id: kubecostProductConfigs_clusterName

Step 4: Deploying Kubecost with Thanos

These values can be adjusted under the thanos block in values-thanos.yaml. Available options are here:

helm upgrade kubecost kubecost/cost-analyzer \
    --install \
    --namespace kubecost \
    -f https://raw.githubusercontent.com/kubecost/cost-analyzer-helm-chart/v1.108/cost-analyzer/values-thanos.yaml \
    -f values-clusterName.yaml

The thanos-store container is configured to request 2.5GB memory, this may be reduced for smaller deployments. thanos-store is only used on the primary Kubecost cluster.

To verify installation, check to see all Pods are in a READY state. View Pod logs for more detail and see common troubleshooting steps below.

Troubleshooting

Thanos sends data to the bucket every 2 hours. Once 2 hours have passed, logs should indicate if data has been sent successfully or not.

You can monitor the logs with:

kubectl logs --namespace kubecost -l app=prometheus -l component=server --prefix=true --container thanos-sidecar --tail=-1 | grep uploaded

Monitoring logs this way should return results like this:

[pod/kubecost-prometheus-server-xxx/thanos-sidecar] level=debug ts=2022-06-09T13:00:10.084904136Z caller=objstore.go:206 msg="uploaded file" from=/data/thanos/upload/BUCKETID/chunks/000001 dst=BUCKETID/chunks/000001 bucket="tracing: kc-thanos-store"

As an aside, you can validate the Prometheus metrics are all configured with correct cluster names with:

kubectl logs --namespace kubecost -l app=prometheus -l component=server --prefix=true --container thanos-sidecar --tail=-1 | grep external_labels

To troubleshoot the IAM Role Attached to the serviceaccount, you can create a Pod using the same service account used by the thanos-sidecar (default is kubecost-prometheus-server):

s3-pod.yaml

apiVersion: v1
kind: Pod
metadata:
  labels:
    run: s3-pod
  name: s3-pod
spec:
  serviceAccountName: kubecost-prometheus-server
  containers:
  - image: amazon/aws-cli
    name: my-aws-cli
    command: ['sleep', '500']

kubectl apply -f s3-pod.yaml
kubectl exec -i -t s3-pod -- aws s3 ls s3://kc-thanos-store

This should return a list of objects (or at least not give a permission error).

Cluster not writing data to thanos bucket

If a cluster is not successfully writing data to the bucket, review thanos-sidecar logs with the following command:

kubectl logs kubecost-prometheus-server-<your-pod-id> -n kubecost -c thanos-sidecar

Logs in the following format are evidence of a successful bucket write:

level=debug ts=2019-12-20T20:38:32.288251067Z caller=objstore.go:91 msg="uploaded file" from=/data/thanos/upload/BUCKET-ID/meta.json dst=debug/metas/BUCKET-ID.json bucket=kc-thanos

Stores not listed at the `/stores` endpoint

kubectl edit deployment kubecost-thanos-query -n kubecost

and adding

--store=kubecost-thanos-store-grpc.kubecost:10901

to the container args. This will cause a query restart and you can visit /stores again to see if the store has been added.

If it has, you'll want to use these addresses instead of DNS more permanently by setting .Values.thanos.query.stores in values-thanos.yaml.

...
thanos:
  store:
    enabled: true
    grpcSeriesMaxConcurrency: 20
    blockSyncConcurrency: 20
    extraEnv:
      - name: GOGC
        value: "100"
    resources:
      requests:
        memory: "2.5Gi"
  query:
    enabled: true
    timeout: 3m
    # Maximum number of queries processed concurrently by query node.
    maxConcurrent: 8
    # Maximum number of select requests made concurrently per a query.
    maxConcurrentSelect: 2
    resources:
      requests:
        memory: "2.5Gi"
    autoDownsampling: false
    extraEnv:
      - name: GOGC
        value: "100"
    stores:
      - "kubecost-thanos-store-grpc.kubecost:10901"

Additional Troubleshooting

A common error is as follows, which means you do not have the correct access to the supplied bucket:

thanos-svc-account@project-227514.iam.gserviceaccount.com does not have storage.objects.list access to thanos-bucket., forbidden"

Assuming pods are running, use port forwarding to connect to the thanos-query-http endpoint:

kubectl port-forward svc/kubecost-thanos-query-http 8080:10902 --namespace kubecost

Then navigate to in your browser. This page should look very similar to the Prometheus console.

If you navigate to Stores using the top navigation bar, you should be able to see the status of both the thanos-store and thanos-sidecar which accompanied the Prometheus server:

Also note that the sidecar should identify with the unique cluster_id provided in your values.yaml in the previous step. Default value is cluster-one.

Instead of waiting 2h to ensure that Thanos was configured correctly, the default log level for the Thanos workloads is debug (it's very light logging even on debug). You can get logs for the thanos-sidecar, which is part of the prometheus-server Pod, and thanos-store. The logs should give you a clear indication of whether or not there was a problem consuming the secret and what the issue is. For more on Thanos architecture, view .

ETL Federation (preferred)

Federated ETL is only officially supported for Kubecost Enterprise plans.

There are two primary advantages for using ETL Federation:

For environments that already have a Prometheus instance, Kubecost only requires a single pod per monitored cluster
Many solutions that aggregate Prometheus metrics (like Thanos), are often expensive to scale in large environments

Kubecost ETL Federation diagram

Sample configurations

This guide has specific detail on how ETL Configuration works and deployment options.

Alternatively, the most common configurations can be found in our repo.

Clusters

The federated ETL is composed of three types of clusters.

Federated Clusters: The clusters which are being federated (clusters whose data will be combined and viewable at the end of the federated ETL pipeline). These clusters upload their ETL files after they have built them to Federated Storage.
Federator Clusters: The cluster on which the Federator (see in Other components) is set to run within the core cost-analyzer container. This cluster combines the Federated Cluster data uploaded to federated storage into combined storage.
Primary Cluster: A cluster where you can see the total Federated data that was combined from your Federated Clusters. These clusters read from combined storage.

As a Federated Cluster, push local cluster cost data to be combined from its local ETL build pipeline.
As a Federator Cluster, run the Federator inside the cost-analyzer, which pulls this local cluster data from S3, combines them, then pushes them back to combined storage.
As a Primary Cluster, pull back this combined data from combined storage to serve it on Kubecost APIs and/or the Kubecost frontend.

Other components

The Storages referred to here are an S3 (or GCP/Azure equivalent) storage bucket which acts as remote storage for the Federated ETL Pipeline.

Federated Storage: A set of folders on paths <bucket>/federated/<cluster id> which are essentially ETL backup data, holding a “copy” of Federated Cluster data. Federated Clusters push this data to Federated Storage to be combined by the Federator. Federated Clusters write this data, and the Federator reads this data.
Combined Storage: A folder on S3 on the path <bucket>/federated/combined which holds one set of ETL data containing all the allocations/assets in all the ETL data from Federated Storage. The Federator takes files from Federated Storage and combines them, adding a single set of combined ETL files to Combined Storage to be read by the Primary Cluster. The Federator writes this data, and the Primary Cluster reads this data.
The Federator: A component of the cost-model which is run on the Federator Cluster, which can be a Federated Cluster, a Primary Cluster, or neither. The Federator takes the ETL binaries from Federated Storage and merges them, adding them to Combined Storage.
Federated ETL: The pipeline containing the above components.

Federated ETL architecture

This diagram shows an example setup of the Federated ETL with:

Three pure Federated Clusters (not classified as any other cluster type): Cluster 1, Cluster 2, and Cluster 3
One Federator Cluster that is also a Federated Cluster: Cluster 4
One Primary Cluster that is also a Federated Cluster: Cluster 5

The result is 5 clusters federated together.

Setup

Step 0: Ensure unique cluster IDs

Ensure each federated cluster has a unique clusterName and cluster_id:

kubecostProductConfigs:
  clusterName: federated-one
prometheus:
  server:
    global:
      external_labels:
        cluster_id: federated-one

Step 1: Storage configuration

For any cluster in the pipeline (Federator, Federated, Primary, or any combination of the three), create a file federated-store.yaml with the same format used for Thanos/S3 backup.
Add a secret using that file: kubectl create secret generic <secret_name> -n kubecost --from-file=federated-store.yaml. Then set .Values.kubecostModel.federatedStorageConfigSecret to the kubernetes secret name.

Using an existing `object-store.yaml`

If you have an existing storage configuration set via .Values.kubecostModel.etlBucketConfigSecret, you can re-use that existing config by setting the following values:

kubecostModel:
  etlBucketConfigSecret: "my-object-store-secret"
federatedETL:
  useExistingS3Config: true
  redirectS3Backup: true

Step 2: Cluster configuration (Federated/Federator)

For all clusters you want to federate together (i.e. see their data on the Primary Cluster), set .Values.federatedETL.federatedCluster to true. This cluster is now a Federated Cluster, and can also be a Federator or Primary Cluster.
For the cluster “hosting” the Federator, set .Values.federatedETL.federator.enabled to true. This cluster is now a Federator Cluster, and can also be a Federated or Primary Cluster.
- Optional: If you have any Federated Clusters pushing to a store that you do not want a Federator Cluster to federate, add the cluster id under the Federator config section .Values.federatedETL.federator.clusters. If this parameter is empty or not set, the Federator will take all ETL files in the /federated directory and federate them automatically.
- Multiple Federators federating from the same source will not break, but it’s not recommended.

Step 3: Cluster configuration (Primary)

In Kubecost, the Primary Cluster serves the UI and API endpoints as well as reconciling cloud billing (cloud-integration).

For the cluster that will be the Primary Cluster, set .Values.federatedETL.primaryCluster to true. This cluster is now a Primary Cluster, and can also be a Federator or Federated Cluster.
Cloud-integration requires .Values.federatedETL.federator.primaryClusterID set to the same value used for .Values.kubecostProductConfigs.clusterName
- Important: If the Primary Cluster is also to be federated, please wait 2-3 hours for data to populate Federated Storage before setting a Federated Cluster to primary (i.e. set .Values.federatedETL.federatedCluster to true, then wait to set .Values.federatedETL.primaryCluster to true). This allows for maximum certainty of data consistency.
- If you do not set this cluster to be federated as well as primary, you will not see local data for this cluster.
- The Primary Cluster’s local ETL will be overwritten with combined federated data.
  - This can be undone by unsetting it as a Primary Cluster and rebuilding ETL.
  - Setting a Primary Cluster may result in a loss of the cluster’s local ETL data, so it is recommended to back up any filestore data that one would want to save to S3 before designating the cluster as primary.
  - Alternatively, a fresh Kubecost install can be used as a consumer of combined federated data by setting it as the Primary but not a Federated Cluster.

Step 4: Verifying successful configuration

The Federated ETL should begin functioning. On any ETL action on a Federated Cluster (Load/Put into local ETL store) the Federated Clusters will add data to Federated Storage. The Federator will run 5 minutes after the Federator Cluster startup, and then every 30 minutes after that. The data is merged into the Combined Storage, where it can be read by the Primary.
- To verify Federated Clusters are uploading their data correctly, check the container logs on a Federated Cluster. It should log federated uploads when ETL build steps run. The S3 bucket can also be checked to see if data is being written to the /federated/<cluster_id> path.
- To verify the Federator is functioning, check the container logs on the Federator Cluster. The S3 bucket can also be checked to verify that data is being written to /federated/combined.
- To verify the entire pipeline is working, either query Allocations/Assets or view the respective views on the frontend. Multi-cluster data should appear after:
  - The Federator has run at least once.
  - There was data in the Federated Storage for the Federator to have combined.

Setup with internal certificate authority

If you are using an internal certificate authority (CA), follow this tutorial instead of the above Setup section.

Begin by creating a ConfigMap with the certificate provided by the CA on every agent, including the Federator and any federated clusters, and name the file kubecost-federator-certs.yaml.

apiVersion: v1
data:
  ca-certificates.crt: |-
    # CA Cert
    -----BEGIN CERTIFICATE-----
    abc . . . . . . . . . . . . . .
    . . . . . . . . . . . . . . . .
    . . . . . . . . . . . . . . . .
    -----END CERTIFICATE-----

    # Root Cert
    -----BEGIN CERTIFICATE-----
    xyz . . . . . . . . . . . . . .
    . . . . . . . . . . . . . . . .
    . . . . . . . . . . . . . . . .
    -----END CERTIFICATE-----

kind: ConfigMap
  name: kubecost-federator-certs
  namespace: kubecost

Now run the following command, making sure you specify the location for the ConfigMap you created:

kubectl create cm kubecost-federator-certs --from-file=/path/to/kubecost-federator-certs.yaml

Mount the certification on the Federator and any federated clusters by passing these Helm flags to your values.yaml/manifest:

extraVolumes:
  - name: kubecost-federator-certs
    configMap:
      name: kubecost-federator-certs
extraVolumeMounts:
  - name: kubecost-federator-certs
    mountPath: /path/to/ca-certificates.crt
    subPath: ca-certificates.crt

federatedETL:
  federator:
    extraVolumes:
      - name: kubecost-federator-certs
        configMap:
          name: kubecost-federator-certs
    extraVolumeMounts:
      - name: kubecost-federator-certs
        mountPath: /path/to/ca-certificates.crt
        subPath: ca-certificates.crt

Create a file federated-store.yaml, which will go on all clusters:

type: S3
config:
  bucket: "kubecost-storage"
  endpoint: <S3 endpoint>
  region: <region>
  aws_sdk_auth: true                                      
  insecure: false
  signature_version2: false
  put_user_metadata:
    "X-Amz-Acl": "bucket-owner-full-control"
  http_config:
    idle_conn_timeout: 90s
    response_header_timeout: 2m
    insecure_skip_verify: false
    tls_config:                                          
      ca_file: "/path/to/ca-certificates.crt"            
      cert_file: "CERT.pem"                              
      key_file: "KEY.PEM"                                 
      insecure_skip_verify: false                         
  trace:
    enable: true
  part_size: 134217728          
  sts_endpoint: <STS endpoint>

Now run the following command (omit kubectl create namespace kubecost if your kubecost namespace already exists, or this command will fail):

kubectl create namespace kubecost
kubectl create secret generic \
  kubecost-object-store -n kubecost \
  --from-file=federated-store.yaml

Multi-Cluster

Primary and secondary clusters

Enterprise Federation

ETL Federation (preferred)

Kubecost ETL Federation diagram

Sample configurations

Clusters

Other components

Federated ETL architecture

Setup

Step 0: Ensure unique cluster IDs

Step 1: Storage configuration

Step 2: Cluster configuration (Federated/Federator)

Step 3: Cluster configuration (Primary)

Step 4: Verifying successful configuration

Setup with internal certificate authority

See also

Data recovery

Repairing ETL

Kubecost Aggregator

Configuring Aggregator

Prerequisites

Tutorial

Validating Aggregator pod is running successfully

Backups and Alerting

Option 1: Increase Prometheus retention

Option 2: Metrics backup

Option 3: Bucket versioning

Option 4: Alerting

Thanos Federation

Configuration

Architecture diagram

Configuring Thanos

Step 1: Create object-store.yaml

Step 2: Create object-store secret

Step 3: Unique Cluster ID

Step 4: Deploying Kubecost with Thanos

Troubleshooting

Cluster not writing data to thanos bucket

Stores not listed at the /stores endpoint

Additional Troubleshooting

Thanos Upgrade

Long-Term Storage Configuration

AWS Multi-Cluster Storage Configuration

AWS/S3 Federation

Method 1: Kubernetes Secret Method

Method 2: Attach IAM role to Service Account Method

Thanos Encryption With S3 and KMS

Troubleshooting

AWS Thanos IAM Policy

Azure Long-Term Storage

GCP Long-Term Storage

Thanos Federation

Configuration

Architecture diagram

Backups and Alerting

Option 1: Increase Prometheus retention

Option 2: Metrics backup

Option 3: Bucket versioning

Option 4: Alerting

Multi-Cluster

Primary and secondary clusters

Enterprise Federation

Kubecost Aggregator

Configuring Aggregator

Prerequisites

Tutorial

Validating Aggregator pod is running successfully

Configuring Thanos

Step 1: Create object-store.yaml

Step 2: Create object-store secret

Step 3: Unique Cluster ID

Step 4: Deploying Kubecost with Thanos

Troubleshooting

Cluster not writing data to thanos bucket

Stores not listed at the /stores endpoint

Additional Troubleshooting

ETL Federation (preferred)

Kubecost ETL Federation diagram

Sample configurations

Stores not listed at the `/stores` endpoint

Stores not listed at the `/stores` endpoint