Kubecost leverages the open-source Prometheus project as a time series database and post-processes the data in Prometheus to perform cost allocation calculations and provide optimization insights for your Kubernetes clusters, such as Amazon Elastic Kubernetes Service (Amazon EKS). Prometheus runs as a single, statically resourced container on one machine, so as your cluster grows or scales out, the load can exceed the scraping capabilities of a single Prometheus server. In collaboration with Amazon Web Services (AWS), Kubecost integrates with Amazon Managed Service for Prometheus (AMP), a managed Prometheus-compatible monitoring service, so that customers can easily monitor Kubernetes costs at scale.
The architecture of this integration is similar to Amazon EKS cost monitoring with Kubecost, which is described in the previous blog post, with some enhancements as follows:
In this integration, an AWS SigV4 container is added to the cost-analyzer pod. It acts as a proxy that signs queries to Amazon Managed Service for Prometheus using the AWS SigV4 signing process, enabling passwordless authentication and reducing the risk of exposing your AWS credentials.
When the Amazon Managed Service for Prometheus integration is enabled, the bundled Prometheus server in the Kubecost Helm chart is configured in remote_write mode. The bundled Prometheus server sends the collected metrics to Amazon Managed Service for Prometheus using the AWS SigV4 signing process. All metrics and data are stored in Amazon Managed Service for Prometheus, and Kubecost queries the metrics directly from Amazon Managed Service for Prometheus instead of from the bundled Prometheus. This frees customers from maintaining and scaling the local Prometheus instance.
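For illustration, the remote_write configuration the Helm chart generates for the bundled Prometheus is conceptually similar to the following minimal sketch (the workspace URL and region are placeholders; the chart produces the real values for you):

# prometheus.yml (excerpt) - conceptual sketch only, generated by the Helm chart
remote_write:
  - url: https://aps-workspaces.us-east-1.amazonaws.com/workspaces/ws-EXAMPLE/api/v1/remote_write
    sigv4:
      region: us-east-1   # AWS region of your AMP workspace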
There are two architectures you can deploy:
The Quick-Start architecture supports a small multi-cluster setup of up to 100 clusters.
The Federated architecture supports a large multi-cluster setup for over 100 clusters.
Quick-Start architecture
The infrastructure can manage up to 100 clusters. The following architecture diagram illustrates the small-scale infrastructure setup:
Federated architecture
To support large-scale infrastructure of over 100 clusters, Kubecost leverages a Federated ETL architecture. In addition to the Amazon Managed Service for Prometheus workspace, Kubecost stores its extract, transform, and load (ETL) data in a central S3 bucket. Kubecost's ETL data is a computed cache based on Prometheus metrics, from which users can perform all possible Kubecost queries. By storing the ETL data in an S3 bucket, this integration makes your cost allocation data resilient, improves performance, and enables a highly available architecture for your Kubecost setup.
The following architecture diagram illustrates the large-scale infrastructure setup:
Instructions
Prerequisites
You have an existing AWS account.
You have IAM credentials to create Amazon Managed Service for Prometheus workspaces and IAM roles programmatically.
You have an existing Amazon EKS cluster with OIDC enabled.
Your Amazon EKS cluster has the Amazon EBS CSI driver installed.
Create Amazon Managed Service for Prometheus workspace:
Step 1: Run the following command to get the information of your current EKS cluster:
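The original command is not preserved here; a reasonable equivalent with the AWS CLI is shown below, together with the workspace creation itself (the workspace alias kubecost-amp and the --query fields are illustrative choices, not fixed names):

# confirm the cluster exists and has an OIDC issuer (required for IRSA)
aws eks describe-cluster --name <YOUR_EKS_CLUSTER_NAME> \
  --query "cluster.{name: name, version: version, oidc: identity.oidc.issuer}" \
  --output table

# create the Amazon Managed Service for Prometheus workspace and capture its ID
export AMP_WORKSPACE_ID=$(aws amp create-workspace \
  --alias kubecost-amp --region ${AWS_REGION} \
  --query workspaceId --output text)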
Step 1: Set environment variables for integrating Kubecost with Amazon Managed Service for Prometheus
Run the following command to set environment variables for integrating Kubecost with Amazon Managed Service for Prometheus:
export RELEASE="kubecost"
export YOUR_CLUSTER_NAME=<YOUR_EKS_CLUSTER_NAME>
export AWS_REGION=${AWS_REGION}
export VERSION="{X.XXX.X}"
export KC_BUCKET="kubecost-etl-metrics" # Remove this line if you want to set up small-scale infrastructure
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
export REMOTEWRITEURL="https://aps-workspaces.${AWS_REGION}.amazonaws.com/workspaces/${AMP_WORKSPACE_ID}/api/v1/remote_write"
export QUERYURL="http://localhost:8005/workspaces/${AMP_WORKSPACE_ID}"
Step 2: Set up an S3 bucket, IAM policy, and Kubernetes secret for storing Kubecost ETL files
Note: You can ignore Step 2 for the small-scale infrastructure setup.
a. Create an S3 bucket to store Kubecost ETL metrics. Run the following command in your workspace:
aws s3 mb s3://${KC_BUCKET}
b. Create an IAM policy to grant access to the S3 bucket. The following policy is for demo purposes only. You may need to consult your security team and make appropriate changes depending on your organization's requirements.
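A minimal policy along these lines (an illustrative sketch; the resource ARNs assume the bucket name set in KC_BUCKET above):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "KubecostETLObjectStore",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::kubecost-etl-metrics",
        "arn:aws:s3:::kubecost-etl-metrics/*"
      ]
    }
  ]
}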
c. Create a Kubernetes secret to allow Kubecost to write ETL files to the S3 bucket. Run the following command in your workspace:
# create manifest file for the secret
cat << EOF > federated-store.yaml
type: S3
config:
  bucket: "${KC_BUCKET}"
  endpoint: "s3.amazonaws.com"
  region: "${AWS_REGION}"
  insecure: false
  signature_version2: false
  put_user_metadata:
    "X-Amz-Acl": "bucket-owner-full-control"
  http_config:
    idle_conn_timeout: 90s
    response_header_timeout: 2m
    insecure_skip_verify: false
  trace:
    enable: true
  part_size: 134217728
EOF
# create Kubecost namespace and the secret from the manifest file
kubectl create namespace ${RELEASE}
kubectl create secret generic \
  kubecost-object-store -n ${RELEASE} \
  --from-file federated-store.yaml
Step 3: Set up IRSA to allow Kubecost and Prometheus to read and write metrics from Amazon Managed Service for Prometheus
The commands sketched after this list automate the following tasks:
Create an IAM role with the AWS-managed IAM policies and a trust policy for the following service accounts: kubecost-cost-analyzer-amp and kubecost-prometheus-server-amp.
Annotate the existing Kubernetes service accounts to attach the new IAM role.
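A sketch of those commands using eksctl (AmazonPrometheusQueryAccess and AmazonPrometheusRemoteWriteAccess are AWS-managed policies; the service account names follow the Helm release name used in this post):

# IRSA for the Kubecost cost-analyzer (queries metrics from AMP)
eksctl create iamserviceaccount \
  --name kubecost-cost-analyzer-amp \
  --namespace ${RELEASE} \
  --cluster ${YOUR_CLUSTER_NAME} --region ${AWS_REGION} \
  --attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusQueryAccess \
  --attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess \
  --override-existing-serviceaccounts \
  --approve

# IRSA for the bundled Prometheus server (writes metrics to AMP)
eksctl create iamserviceaccount \
  --name kubecost-prometheus-server-amp \
  --namespace ${RELEASE} \
  --cluster ${YOUR_CLUSTER_NAME} --region ${AWS_REGION} \
  --attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusQueryAccess \
  --attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess \
  --override-existing-serviceaccounts \
  --approve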
Integrating Kubecost with Amazon Managed Service for Prometheus
Preparing the configuration file
Run the following command to create a file called config-values.yaml, which contains the defaults that Kubecost will use for connecting to your Amazon Managed Service for Prometheus workspace.
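A sketch of that command, wiring in the environment variables from Step 1 (the value keys shown are assumptions that can differ between Kubecost chart versions; verify them against the chart's values reference):

cat << EOF > config-values.yaml
global:
  amp:
    enabled: true                              # query AMP instead of the bundled Prometheus
    prometheusServerEndpoint: ${QUERYURL}      # queries go through the local SigV4 proxy
    remoteWriteService: ${REMOTEWRITEURL}      # bundled Prometheus remote-writes to AMP
    sigv4:
      region: ${AWS_REGION}

sigV4Proxy:
  region: ${AWS_REGION}
  host: aps-workspaces.${AWS_REGION}.amazonaws.com
EOF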
These installation steps are similar to those for a primary cluster setup, except that you do not need to follow the steps in the section "Create Amazon Managed Service for Prometheus workspace", and you need to update the environment variables below to match your additional clusters. Note that AMP_WORKSPACE_ID and KC_BUCKET are the same as for the primary cluster.
You can add recording rules to improve performance. Recording rules allow you to precompute frequently needed or computationally expensive expressions and save their results as a new set of time series. Querying the precomputed result is often much faster than running the original expression every time it is needed. Follow these instructions to add the following rules:
groups:
  - name: CPU
    rules:
      - expr: sum(rate(container_cpu_usage_seconds_total{container_name!=""}[5m]))
        record: cluster:cpu_usage:rate5m
      - expr: rate(container_cpu_usage_seconds_total{container_name!=""}[5m])
        record: cluster:cpu_usage_nosum:rate5m
      - expr: avg(irate(container_cpu_usage_seconds_total{container_name!="POD", container_name!=""}[5m])) by (container_name,pod_name,namespace)
        record: kubecost_container_cpu_usage_irate
      - expr: sum(container_memory_working_set_bytes{container_name!="POD",container_name!=""}) by (container_name,pod_name,namespace)
        record: kubecost_container_memory_working_set_bytes
      - expr: sum(container_memory_working_set_bytes{container_name!="POD",container_name!=""})
        record: kubecost_cluster_memory_working_set_bytes
  - name: Savings
    rules:
      - expr: sum(avg(kube_pod_owner{owner_kind!="DaemonSet"}) by (pod) * sum(container_cpu_allocation) by (pod))
        record: kubecost_savings_cpu_allocation
        labels:
          daemonset: "false"
      - expr: sum(avg(kube_pod_owner{owner_kind="DaemonSet"}) by (pod) * sum(container_cpu_allocation) by (pod)) / sum(kube_node_info)
        record: kubecost_savings_cpu_allocation
        labels:
          daemonset: "true"
      - expr: sum(avg(kube_pod_owner{owner_kind!="DaemonSet"}) by (pod) * sum(container_memory_allocation_bytes) by (pod))
        record: kubecost_savings_memory_allocation_bytes
        labels:
          daemonset: "false"
      - expr: sum(avg(kube_pod_owner{owner_kind="DaemonSet"}) by (pod) * sum(container_memory_allocation_bytes) by (pod)) / sum(kube_node_info)
        record: kubecost_savings_memory_allocation_bytes
        labels:
          daemonset: "true"
Troubleshooting
The queries below must return data for Kubecost to calculate costs correctly.
For the queries below to work, set the environment variables:
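For example (these names are assumptions based on the Helm release used earlier; adjust them to your deployment):

export KUBECOST_NAMESPACE=${RELEASE}
export KUBECOST_DEPLOYMENT=${RELEASE}-cost-analyzer
export CLUSTER_ID=${YOUR_CLUSTER_NAME}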
Verify connection to AMP and that the metric for container_memory_working_set_bytes is available:
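One way to run that check is to query AMP through Kubecost's query pass-through from inside the cost-analyzer pod (a sketch assuming the variables above and Kubecost's default cluster_id label; container and endpoint names can vary by chart version):

kubectl exec -it -n ${KUBECOST_NAMESPACE} \
  deployments/${KUBECOST_DEPLOYMENT} -c cost-analyzer-frontend -- \
  curl -G "http://localhost:9090/model/prometheusQuery" \
    --data-urlencode "query=container_memory_working_set_bytes{cluster_id=\"${CLUSTER_ID}\"}"

A non-empty result set confirms that metrics are flowing into AMP and that Kubecost can read them back.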
If you have set kubecostModel.promClusterIDLabel, you will need to change the query (CLUSTER_ID) to match the label (typically cluster or alpha_eksctl_io_cluster_name).
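For example, with kubecostModel.promClusterIDLabel set to cluster, the selector in the query above becomes:

container_memory_working_set_bytes{cluster="${CLUSTER_ID}"}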