If interested in filtering or aggregating by Kubernetes Annotations when using the Allocation API, you will need to enable annotation emission. This will configure your Kubecost installation to generate the kube_pod_annotations and kube_namespace_annotations metrics as listed in our Kubecost Metrics doc.
You can enable it in your values.yaml:
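A minimal values.yaml sketch, assuming the kubecostMetrics.emitPodAnnotations and kubecostMetrics.emitNamespaceAnnotations keys (verify the exact key names against your chart version):

```yaml
kubecostMetrics:
  # emits kube_pod_annotations
  emitPodAnnotations: true
  # emits kube_namespace_annotations
  emitNamespaceAnnotations: true
```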
You can also enable it via your helm install or helm upgrade command:
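For example (a sketch; the release name, chart reference, and namespace are assumptions):

```sh
helm upgrade kubecost kubecost/cost-analyzer --namespace kubecost \
  --set kubecostMetrics.emitPodAnnotations=true \
  --set kubecostMetrics.emitNamespaceAnnotations=true
```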
These flags can be set independently. If you set one to true and the other to false, only the enabled metric will be emitted.
SSO and RBAC are only officially supported on Kubecost Enterprise plans.
Kubecost supports single sign-on (SSO) and role-based access control (RBAC) with SAML 2.0. Kubecost works with most identity providers including Okta, Auth0, Microsoft Entra ID (formerly Azure AD), PingID, and KeyCloak.
User authentication (.Values.saml): SSO provides a simple mechanism to restrict application access internally and externally
Pre-defined user roles (.Values.saml.rbac):
admin: Full control with permissions to manage users, configure model inputs, and application settings.
readonly: User role with read-only permission.
editor: Can create and modify alerts and reports, but cannot edit application settings; otherwise functions as read-only.
Custom access roles (filters.json): Limit users based on attributes or group membership to view a set of namespaces, clusters, or other aggregations
All SAML 2.0 providers also work. The above guides can be used as templates for what is required.
When SAML SSO is enabled in Kubecost, ports 9090 and 9003 of service/kubecost-cost-analyzer will require authentication. Therefore, user API requests will need to be authenticated with a token. The token can be obtained by logging into the Kubecost UI and copying the token from the browser's local storage. Alternatively, a long-term token can be issued to users from your identity provider.
For admins, Kubecost additionally exposes an unauthenticated API on port 9004 of service/kubecost-cost-analyzer.
You will be able to view your current SAML Group in the Kubecost UI by selecting Settings from the left navigation, then scrolling to 'SAML Group'. Your access level will be displayed in the 'Current SAML Group' box.
Disable SAML and confirm that the cost-analyzer pod starts.
If step 1 is successful, but the pod is crashing or never enters the ready state when SAML is added, it is likely that there is a panic while loading or parsing SAML data.
kubectl logs deployment/kubecost-cost-analyzer -c cost-model -n kubecost
If you're supplying the SAML metadata from the address of an Identity Provider server, curl the SAML metadata endpoint from within the Kubecost pod and ensure that a valid XML EntityDescriptor is being returned and downloaded. The response should be in this format:
The URL returns a 404 error or returns HTML
Contact your SAML admin to find the URL on your identity provider that serves the raw XML file.
Returning an EntitiesDescriptor instead of an EntityDescriptor
Certain metadata URLs could potentially return an EntitiesDescriptor, instead of an EntityDescriptor. While Kubecost does not currently support using an EntitiesDescriptor, you can instead copy the EntityDescriptor into a new file you create called metadata.xml:
Download the XML from the metadata URL into a file called metadata.xml
Copy all attributes from the EntitiesDescriptor to the EntityDescriptor that are not already present.
Remove the <EntitiesDescriptor> tag from the beginning.
Remove the </EntitiesDescriptor> tag from the end of the XML file.
You are left with data in a similar format to the example below:
Then, you can upload the EntityDescriptor to a secret in the same namespace as kubecost and use that directly.
kubectl create secret generic metadata-secret --from-file=./metadata.xml --namespace kubecost
To use this secret, in your helm values set metadataSecretName to the name of the secret created above, and set idpMetadataURL to the empty string:
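A sketch of the corresponding values.yaml fragment, assuming these keys live under the saml block (verify against your chart version):

```yaml
saml:
  metadataSecretName: "metadata-secret"
  idpMetadataURL: ""
```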
Invalid NameID format
On Keycloak, if you receive an “Invalid NameID format” error, you should set the option “force nameid format” in Keycloak. See Keycloak docs for more details.
Users of CSI driver for storing SAML secret
For users who want to use CSI driver for storing SAML secret, we suggest this guide.
InvalidNameIDPolicy format
From a PingIdentity article:
An alternative solution is to add an attribute called "SAML_SP_NAME_QUALIFIER" to the connection's attribute contract with a TEXT value of the requested SPNameQualifier. When you do this, select the following for attribute name format:
urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified
On the PingID side: specify an attribute contract "SAML_SP_NAME_QUALIFIER" with the format urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified.
On the Kubecost side: in your Helm values, set saml.nameIDFormat to the same format set by PingID:
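For example, the corresponding values.yaml fragment:

```yaml
saml:
  nameIDFormat: "urn:oasis:names:tc:SAML:1.1:nameid-format:unspecified"
```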
Make sure audienceURI and appRootURL match the entityID configured within PingFed.
OIDC and RBAC are only officially supported on Kubecost Enterprise plans.
The OIDC integration in Kubecost is fulfilled via the .Values.oidc configuration parameters in the Helm chart.
authURL may require additional request parameters depending on the provider. Some commonly required parameters are client_id=*** and response_type=code. Please check the provider documentation for more information.
Please refer to the following references to find out more about how to configure the Helm parameters to suit each OIDC identity provider integration.
Auth0 does not support Introspection; therefore, we can only validate the access token by calling /userinfo within our current remote token validation flow. This will cause the Kubecost UI to not function under an Auth0 integration, as it makes a large number of continuous calls to load the various components on the page and the Auth0 /userinfo endpoint is rate limited. Independent calls against Kubecost endpoints (e.g. via cURL or Postman) should still be supported.
Once the Kubecost application has been successfully integrated with OIDC, we will expect requests to Kubecost endpoints to contain the JWT access token, either:
As a cookie named token
As a cookie named id_token (set .Values.oidc.useIDToken = true)
Or as part of the Authorization header (Bearer token)
The token is then validated remotely in one of two ways:
A POST request to the Introspect URL configured by the identity provider
If no Introspect URL is configured, a GET request to the /userinfo endpoint configured by the identity provider
If skipOnlineTokenValidation is set to true, Kubecost will skip accessing the OIDC introspection endpoint for online token validation and will instead attempt to locally validate the JWT claims.
Setting skipOnlineTokenValidation to true will prevent tokens from being manually revoked.
This parameter is only supported if using the Google OAuth 2.0 identity provider.
If the hostedDomain parameter is configured in the Helm chart, the application will deny access to users whose identified domain is not equal to the specified domain. The domain is read from the hd claim in the ID token commonly returned alongside the access token.
If the domain is configured alongside the access token, then requests should contain the JWT ID token, either:
As a cookie named id_token
As part of an Identification header
The JWT ID token must contain a field (claim) named hd with the desired domain value. We verify that the token has been properly signed (using provider certificates) and has not expired before processing the claim.
To remove a previously set Helm value, you will need to set the value to an empty string: .Values.oidc.hostedDomain = "". To validate that the config has been removed, you can check the /var/configs/oidc/oidc.json file inside the cost-model container.
Kubecost's OIDC supports read-only mode. This leverages OIDC for authentication, then assigns all authenticated users as read-only users.
Use your browser's devtools to observe network requests made between you, your Identity Provider, and Kubecost. Pay close attention to cookies and headers.
Search for oidc in your logs to follow events
Pay attention to any WRN related to OIDC
Search for Token Response, and try decoding both the access_token and id_token to ensure they are well formed (https://jwt.io/)
Code reference for the below example can be found here.
For further assistance, reach out to support@kubecost.com and provide both logs and a HAR file.
You can apply your product key at any time within the product UI or during an install or upgrade process. More details on both options are provided below.
If you have a multi-cluster setup, you only need to apply your product key on the Kubecost primary cluster, and not on any of the Kubecost secondary clusters.
kubecostToken is a different concept from your product key and is used for managing trial access.
Many Kubecost product configuration options can be specified at install-time, including your product key.
To create a secret you will need to create a JSON file called productkey.json with the following format. Be sure to replace <YOUR_PRODUCT_KEY> with your Kubecost product key.
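A sketch of productkey.json; the single key field shown here is an assumption, so confirm the expected format against the current Kubecost docs:

```json
{
  "key": "<YOUR_PRODUCT_KEY>"
}
```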
Run the following command to create the secret. Replace <SECRET_NAME> with a name for the secret (example: productkeysecret):
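For example (a sketch using standard kubectl flags; the kubecost namespace is assumed):

```sh
kubectl create secret generic <SECRET_NAME> \
  --from-file=./productkey.json --namespace kubecost
```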
Update your values.yaml to enable the product key and specify the secret name:
kubecostProductConfigs.productKey.enabled=true
kubecostProductConfigs.productKey.secretname=<SECRET_NAME>
Run a helm upgrade command to start using your product key.
This specific parameter can be configured under kubecostProductConfigs.productKey.key in your values.yaml.
You must also set kubecostProductConfigs.productKey.enabled=true when using this option. Note that this will leave your secrets unencrypted in values.yaml. Use a Kubernetes secret as in the previous method to avoid this.
To apply your license key within the Kubecost UI, visit the Overview page, then select Upgrade in the page header.
Next, select Add Key in the dialog menu shown below.
You can then supply your Kubecost provided license key in the input box that is now visible.
To verify that your key has been applied successfully, visit Settings to confirm the final digits are as expected:
SSO and RBAC are only officially supported on Kubecost Enterprise plans.
This guide will show you how to configure Kubecost integrations for SSO and RBAC with Okta.
To enable SSO for Kubecost, this tutorial will show you how to create an application in Okta.
Go to the Okta admin dashboard (https://[your-subdomain]okta.com/admin/dashboard) and select Applications from the left navigation. On the Applications page, select Create App Integration > SAML 2.0 > Next.
On the 'Create SAML Integration' page, provide a name for your app. Feel free to also use this official Kubecost logo for the App logo field. Then, select Next.
Your SSO URL should be your application root URL followed by '/saml/acs', like: https://[your-kubecost-address].com/saml/acs
Your Audience URI (SP Entity ID) should be set to your application root without a trailing slash: https://[your-kubecost-address].com
(Optional) If you intend to use RBAC: under Group Attribute Statements, enter a name (ex: kubecost_group) and a filter based on your group naming standards (example: Starts with kubecost_). Then, select Next.
Provide any feedback as needed, then select Finish.
Return to the Applications page, select your newly-created app, then select the Sign On tab. Copy the URL for Identity Provider metadata, and add that value to .Values.saml.idpMetadataURL in this values-saml.yaml file.
To fully configure SAML 2.0, select View Setup Instructions, download the X.509 certificate, and name the file myservice.cert.
Create a secret using the certificate with the following command:
kubectl create secret generic kubecost-okta --from-file myservice.cert --namespace kubecost
For configuring single app logout, read Okta's documentation on the subject. Then, update the values.saml:redirectURL value in your values.yaml file.
Use this Okta document to assign individuals or groups access to your Kubecost application.
Finally, add -f values-saml.yaml to your Kubecost Helm upgrade command:
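For example (release name, chart reference, and namespace are assumptions):

```sh
helm upgrade kubecost kubecost/cost-analyzer --namespace kubecost \
  -f values.yaml -f values-saml.yaml
```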
At this point, test your SSO to ensure it is working properly before moving on to the next section.
The simplest form of RBAC in Kubecost is to have two groups: admin and readonly. If your goal is to simply have these two groups, you do not need to configure filters. This will result in the log message file corruption: '%!s(MISSING)', but this is expected.
The values-saml.yaml file contains the admin and readonly groups in the RBAC section:
The assertionName: "kubecost_group" value needs to match the name given in Step 5 of the Okta SSO Configuration section.
Filters are used to give visibility to a subset of objects in Kubecost. Examples of the various filters available are in filters.json and filters-examples.json. RBAC filtering is capable of all the same types of filtering features as that of the Allocation API.
It's possible to combine filtering with admin/readonly rights
These filters can be configured using groups or user attributes in your Okta directory. It is also possible to assign filters to specific users. The example below is using groups.
Filtering is configured very similarly to the admin/readonly roles above. The same group pattern match (kubecost_group) can be used for both, as is the case in this example:
The array of groups obtained during the authorization request will be matched to the subject key in the filters.json:
As an example, we will configure the following:
Admins will have full access to the Kubecost UI and have visibility to all resources
Kubecost users, by default, will not have visibility to any namespace and will be readonly. If a group doesn't have access to any resources, the Kubecost UI may appear to be broken.
The dev-namespaces group will have read-only access to the Kubecost UI and only have visibility to namespaces that are prefixed with dev- or are exactly nginx-ingress.
Go to the Okta admin dashboard (https://[your-subdomain]okta.com/admin/dashboard) and select Directory > Groups from the left navigation. On the Groups page, select Add group.
Create groups for kubecost_users, kubecost_admin and kubecost_dev-namespaces by providing each value as the name with an optional description, then select Save. You will need to perform this step three times, one for each group.
Select each group, then select Assign people and add the appropriate users for testing. Select Done to confirm edits to a group. Kubecost admins will be part of both the read only kubecost_users and kubecost_admin groups. Kubecost will assign the most rights if there are conflicts.
Return to the Groups page. Select kubecost_users, then in the Applications tab, assign the Kubecost application. You do not need to assign the other kubecost_ groups to the Kubecost application because all users already have access in the kubecost_users group.
Modify filters.json as depicted above.
Create the ConfigMap using the following command:
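A sketch of the command; the ConfigMap name group-filters is an assumption, so use the name your chart expects:

```sh
kubectl create configmap group-filters --from-file filters.json -n kubecost
```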
You can modify the ConfigMap without restarting any pods.
Generate an X509 certificate and private key. Below is an example using OpenSSL:
openssl genpkey -algorithm RSA -out saml-encryption-key.pem -pkeyopt rsa_keygen_bits:2048
Generate a certificate signing request (CSR)
openssl req -new -key saml-encryption-key.pem -out request.csr
Request your organization's domain owner to sign the certificate, or generate a self-signed certificate:
openssl x509 -req -days 365 -in request.csr -signkey saml-encryption-key.pem -out saml-encryption-cert.cer
Go to your application, then under the General tab, edit the following SAML Settings:
Assertion Encryption: Encrypted
In the Encryption Algorithm box that appears, select AES256-CBC.
Select Browse Files in the Encryption Certificate field and upload your certificate file.
Create a secret with the certificate. The file name must be saml-encryption-cert.cer.
kubectl create secret generic kubecost-saml-cert --from-file saml-encryption-cert.cer --namespace kubecost
Create a secret with the private key. The file name must be saml-encryption-key.pem.
kubectl create secret generic kubecost-saml-decryption-key --from-file saml-encryption-key.pem --namespace kubecost
Pass the following values via Helm into your values.yaml:
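A hedged sketch of the values.yaml fragment; the key names below are assumptions to verify against your chart version:

```yaml
saml:
  encryptionCertSecret: "kubecost-saml-cert"
  decryptionKeySecret: "kubecost-saml-decryption-key"
```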
You can view the logs on the cost-model container. In this example, the assumption is that the prefix for Kubecost groups is kubecost_. This command is currently a work in progress.
kubectl logs deployment/kubecost-cost-analyzer -c cost-model --follow |grep -v -E 'resourceGroup|prometheus-server'|grep -i -E 'group|xmlname|saml|login|audience|kubecost_'
When the group has been matched, you will see:
This is what you should expect to see:
The network costs DaemonSet is an optional utility that gives Kubecost more detail to attribute costs to the correct pods.
When networkCost is enabled, Kubecost gathers pod-level network traffic metrics to allocate network transfer costs to the pod responsible for the traffic.
See this doc for more detail on network cost allocation methodology.
The network costs metrics are collected using a DaemonSet (one pod per node) that uses source and destination detail to determine egress and ingress data transfers by pod, classified as internet, cross-region, and cross-zone.
With the network costs DaemonSet enabled, the Network column on the Allocations page will reflect the portion of network transfer costs based on the chart-level aggregation.
When using Kubecost version 1.99 and above: greater detail can be accessed through the Allocations UI only when aggregating by namespace and selecting the link on that namespace. This opens the namespace detail page, where there is a network costs detail card at the bottom.
A Grafana dashboard is included with the Kubecost installation, but you can also find it in our cost-analyzer-helm-chart repository.
To enable this feature, set the following parameter in values.yaml during or after Helm installation:
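For example:

```yaml
networkCosts:
  enabled: true
```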
You can view a list of common config options in this values.yaml template.
If using the Kubecost-bundled Prometheus instance, the scrape is automatically configured.
If you are integrating with an existing Prometheus, you can set networkCosts.prometheusScrape=true and the network costs service should be auto-discovered. Alternatively, a ServiceMonitor is also available.
You can adjust log level using the extraArgs config:
The levels range from 0 to 5, with 0 being the least verbose (only showing panics) and 5 being the most verbose (showing trace-level information).
Ref: sig-instrumentation
Service tagging allows Kubecost to identify network activity between the pods and various cloud services (e.g. AWS S3, EC2, RDS, Azure Storage, Google Cloud Storage).
To enable this, set the following Helm values:
In order to reduce resource usage, Kubecost recommends setting a CPU limit on the network costs DaemonSet. This will cause a few seconds of delay during peak usage and does not affect overall accuracy. This is done by default in Kubecost 1.99+.
For existing deployments, these are the recommended values:
The network-simulator was used to simulate real-time updates of ConnTrack entries while simultaneously running a simulated-cluster network costs instance. To profile the heap, after a warmup of roughly five minutes, a heap profile of 1,000,000 ConnTrack entries was gathered and examined.
Each ConnTrack entry is equivalent to two transport directions, so every ConnTrack entry is two map entries (connections).
After modifications were made to the network costs to parallelize the delta and dispatch, large map comparisons were significantly lighter in memory. The same tests were performed against simulated data with the following footprint results.
The primary source of network metrics is a DaemonSet Pod hosted on each of the nodes in a cluster. Each DaemonSet pod uses hostNetwork: true such that it can leverage an underlying kernel module to capture network data. Network traffic data is gathered and the destination of any outbound networking is labeled as:
Internet Egress: Network target destination was not identified within the cluster.
Cross Region Egress: Network target destination was identified, but not in the same provider region.
Cross Zone Egress: Network target destination was identified, and was part of the same region but not the same zone.
These classifications are important because they correlate with network costing models for most cloud providers. To see more detail on these metric classifications, you can view pod logs with the following command:
This will show you the top source and destination IP addresses and bytes transferred on the node where this Pod is running. To disable logs, you can set the Helm value networkCosts.trafficLogging to false.
For traffic routed to addresses outside of your cluster but inside your VPC, Kubecost supports the ability to directly classify network traffic to a particular IP address or CIDR block. This feature can be configured in values.yaml under networkCosts.config. Classifications are defined as follows:
As of Kubecost 1.101, LoadBalancers that proxy traffic to the Internet (ingresses and gateways) can be specifically classified.
In-zone: A list of destination addresses/ranges that will be classified as in-zone traffic, which is free for most providers.
In-region: A list of addresses/ranges that will be classified as the same region between source and destinations but different zones.
Cross-region: A list of addresses/ranges that will be classified as different regions from the source regions.
Internet: By design, all IP addresses not in a specific list are considered internet. This list can include IPs that would otherwise be "in-zone" or local to be classified as Internet traffic.
The network costs DaemonSet requires a privileged spec.containers[*].securityContext and hostNetwork: true in order to leverage an underlying kernel module to capture network data.
Additionally, the network costs DaemonSet mounts the following directories on the host filesystem. It needs both read and write access. The network costs DaemonSet will only write to the filesystem to enable conntrack (docs ref):
/proc/net/
/proc/sys/net/netfilter
To verify this feature is functioning properly, you can complete the following steps:
Confirm the kubecost-network-costs pods are Running. If these Pods are not in a Running state, run kubectl describe on them and/or view their logs for errors.
Ensure the kubecost-networking target is Up in your Prometheus Targets list. View any visible errors if this target is not Up. You can further verify data is being scraped by the presence of the kubecost_pod_network_egress_bytes_total metric in Prometheus.
Verify Network Costs are available in your Kubecost Allocation view. View your browser's Developer Console on this page for any access/permissions errors if costs are not shown.
Failed to locate network pods: This error message is displayed when the Kubecost app is unable to locate the network pods, which are searched for by a label that includes the release name. In particular, Kubecost depends on the label app=<release-name>-network-costs to locate the pods. If the app has a blank release name, this issue may happen.
Resource usage is a function of unique src and dest IP/port combinations. Most deployments use a small fraction of a CPU and it is also ok to have this Pod CPU throttled. Throttling should increase parse times but should not have other impacts. The following Prometheus metrics are available in v15.3 for determining the scale and the impact of throttling:
kubecost_network_costs_parsed_entries is the last number of ConnTrack entries parsed
kubecost_network_costs_parse_time is the last recorded parse time
Today this feature is supported on Unix-based images with ConnTrack
Actively tested against GCP, AWS, and Azure
Pods that use hostNetwork share the host IP address
SSO and RBAC are only officially supported on Kubecost Enterprise plans.
This guide will show you how to configure Kubecost integrations for SAML and RBAC with Microsoft Entra ID.
In the Azure Portal, go to the Microsoft Entra ID Overview page and select Enterprise applications in the left navigation underneath Manage.
On the Enterprise applications page, select New application.
On the Browse Microsoft Entra ID Gallery page, select Create your own application and select Create. The 'Create your own application window' opens.
Provide a custom name for your app. Then, select Integrate any other application you don't find in the gallery. Select Create.
Return to the Enterprise applications page from Step 1.2. Find and select your Enterprise application from the table.
Select Properties in the left navigation under Manage to begin editing the application. Start by updating the logo, then select Save. Feel free to use an official Kubecost logo.
Select Users and groups in the left navigation. Assign any users or groups you want to have access to Kubecost, then select Assign.
Select Single sign-on from the left navigation. In the 'Basic SAML Configuration' box, select Edit. Populate both the Identifier and Reply URL with the URL of your Kubecost environment without a trailing slash (ex: http://localhost:9090), then select Save. If your application is using OpenId Connect and OAuth, most of the SSO configuration will have already been completed.
(Optional) If you intend to use RBAC, you also need to add a group claim. Without leaving the SAML-based Sign-on page, select Edit next to Attributes & Claims. Select Add a group claim. Configure your group association, then select Save. The claim name will be used as the assertionName value in the values-saml.yaml file.
On the SAML-based Sign-on page, in the SAML Certificates box, copy the 'App Federation Metadata Url' and add it to your values-saml.yaml as the value of idpMetadataURL.
In the SAML Certificates box, select the Download link next to Certificate (Base64) to download the X.509 cert. Name the file myservice.cert.
Create a secret using the cert with the following command:
With your existing Helm install command, append -f values-saml.yaml to the end.
At this point, test your SSO configuration to make sure it works before moving on to the next section. There is a Troubleshooting section at the end of this doc for help if you are experiencing problems.
The simplest form of RBAC in Kubecost is to have two groups: admin and read only. If your goal is to simply have these two groups, you do not need to configure filters. If you do not configure filters, this message in the logs is expected: file corruption: '%!s(MISSING)'
The values-saml.yaml file contains the admin and readonly groups in the RBAC section:
Remember, the value of assertionName needs to match the claim name given in Step 2.5 above.
Filters are used to give visibility to a subset of objects in Kubecost. RBAC filtering supports the same filtering features as the Allocation API. Examples of the various filters available are in these files:
These filters can be configured using groups or user attributes in your Entra ID directory. It is also possible to assign filters to specific users. The example below is using groups.
You can combine filtering with admin/read only rights, and it can be configured the same way. The same assertionName and values will be used, as is the case in this example.
The values-saml.yaml file contains this customGroups section for filtering:
The array of groups obtained during the authentication request will be matched to the subject key in filters.json. See the example filters.json (linked above) to understand how your created groups will be formatted:
As an example, we will configure the following:
Admins will have full access to the Kubecost UI and have visibility to all resources
Kubecost users, by default, will not have visibility to any namespace and will be read only. If a group doesn't have access to any resources, the Kubecost UI may appear to be broken.
The dev-namespaces group will have read only access to the Kubecost UI and only have visibility to namespaces that are prefixed with dev- or are exactly nginx-ingress.
In the Entra ID left navigation, select Groups. Select New group to create a new group.
For Group type, select Security. Enter a name for your group. For this demonstration, create groups for kubecost_users, kubecost_admin, and kubecost_dev-namespaces. By selecting No members selected, Azure will pull up a list of all users in your organization for you to add (you can also add or remove members after creating the group). Add all users to the kubecost_users group, and the appropriate users to each of the other groups for testing. Kubecost admins will be part of both the read only kubecost_users and kubecost_admin groups. Kubecost will assign the most rights/least restrictions when there are conflicts.
When you are done, select Create at the bottom of the page. Repeat Steps 1-2 as needed for all groups.
Return to your created Enterprise application and select Users and groups from the left navigation. Select Add user/group. Select and add all relevant groups you created. Then select Assign at the bottom of the page to confirm.
Modify filters.json as depicted above.
Replace {group-object-id-a} with the Object Id for kubecost_admin
Replace {group-object-id-b} with the Object Id for kubecost_users
Replace {group-object-id-c} with the Object Id for kubecost_dev-namespaces
Create the ConfigMap:
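As in the Okta guide above, a sketch of the command (the group-filters ConfigMap name is an assumption):

```sh
kubectl create configmap group-filters --from-file filters.json -n kubecost
```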
You can modify the ConfigMap without restarting any pods.
You can look at the logs on the cost-model container. This script is currently a work in progress.
When the group has been matched, you will see:
This is what a normal output looks like:
Gluu is an open-source Identity and Access Management (IAM) platform that can be used to authenticate and authorize users for applications and services. It can be configured to use the OpenID Connect (OIDC) protocol, which is an authentication layer built on top of OAuth 2.0 that allows applications to verify the identity of users and obtain basic profile information about them.
To configure a Gluu server with OIDC, you will need to install and set up the Gluu server software on a suitable host machine. This will typically involve performing the following steps:
Install the necessary dependencies and packages.
Download and extract the Gluu server software package.
Run the installation script to set up the Gluu server.
Configure the Gluu server by modifying the /etc/gluu/conf/gluu.properties file and setting the values for various properties, such as the hostname, LDAP bind password, and OAuth keys.
Start the Gluu server by running the /etc/init.d/gluu-serverd start command.
You can read the Gluu documentation for more detailed help with these steps.
Note: Later versions of Gluu Server also support deployment to Kubernetes environments. You can read more about their Kubernetes support in the Gluu documentation.
Once the Gluu server is up and running, you can connect it to a Kubecost cluster by performing the following steps:
Obtain the OIDC client ID and client secret for the Gluu server. These can be found in the /etc/gluu/conf/gluu.properties file under the oxAuthClientId and oxAuthClientPassword properties, respectively.
In the Kubecost cluster, create a new OIDC identity provider by running the kubectl apply -f oidc-provider.yaml command, where oidc-provider.yaml is a configuration file that specifies the OIDC client ID and client secret, as well as the issuer URL and authorization and token endpoints for the Gluu server.
In this file, you will need to replace the following placeholders with the appropriate values:
<OIDC_CLIENT_ID>: The OIDC client ID for the Gluu server. This can be found in the /etc/gluu/conf/gluu.properties file under the oxAuthClientId property.
<OIDC_CLIENT_SECRET>: The OIDC client secret for the Gluu server. This can be found in the /etc/gluu/conf/gluu.properties file under the oxAuthClientPassword property.
<GLUU_SERVER_HOSTNAME>: The hostname of the Gluu server.
<BASE64_ENCODED_OIDC_CLIENT_ID>: The OIDC client ID, encoded in base64.
<BASE64_ENCODED_OIDC_CLIENT_SECRET>: The OIDC client secret, encoded in base64.
Set up a Kubernetes service account and bind it to the OIDC identity provider. This can be done by running the kubectl apply -f service-account.yaml
command, where service-account.yaml is a configuration file that specifies the name of the service account and the OIDC identity provider.
In this file, you will need to replace the following placeholders with the appropriate values:
<SERVICE_ACCOUNT_NAME>: The name of the service account. This can be any name that you choose.
<GLUU_SERVER_HOSTNAME>: The hostname of the Gluu server.
<OIDC_CLIENT_ID>: The OIDC client ID for the Gluu server. This can be found in the /etc/gluu/conf/gluu.properties file under the oxAuthClientId property.
Note: You should also ensure that the kubernetes.io/oidc-issuer-url, kubernetes.io/oidc-client-id, kubernetes.io/oidc-username-claim, and kubernetes.io/oidc-groups-claim annotations are set to the correct values for your Gluu server and configuration. These annotations specify the issuer URL and client ID for the OIDC identity provider, as well as the claims to use for the username and group membership of authenticated users.
Once these steps are completed, the Gluu server should be configured to use OIDC and connected to the Kubecost cluster, allowing users to authenticate and authorize themselves using their Gluu credentials.
OIDC is only officially supported on Kubecost Enterprise plans.
This guide will take you through configuring OIDC for Kubecost using a Microsoft Entra ID (formerly Azure AD) integration for SSO and RBAC.
Before following this guide, ensure that:
Kubecost is already installed
Kubecost is accessible via a TLS-enabled ingress
You have an appropriate admin role in your Microsoft account. Insufficient permissions may prevent you from accessing certain features required in this tutorial.
In the Azure Portal, select Microsoft Entra ID (Azure AD).
In the left navigation, select Applications > App registrations. Then, on the App registrations page, select New registration.
Select an appropriate name, and provide supported account types for your app.
To configure Redirect URI, select Web from the dropdown, then provide the URI as https://{your-kubecost-address}/model/oidc/authorize.
Select Register at the bottom of the page to finalize your changes.
After creating your application, you should be taken directly to the app's Overview page. If not, return to the App registrations page, then select the application you just created.
On the Overview page for your application, obtain the Application (client) ID and the Directory (tenant) ID. These will be needed in a later step.
Next to 'Client credentials', select Add a certificate or secret. The 'Certificates & secrets' page opens.
Select New client secret. Provide a description and expiration time, then select Add.
Obtain the value created with your secret.
Add the three saved values, as well as any other values required relating to your Kubecost/Microsoft account details, into the following values.yaml template:
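A hedged values.yaml sketch; the oidc key names and Microsoft endpoint formats below are assumptions to confirm against your chart version and tenant:

```yaml
oidc:
  enabled: true
  clientID: "<Application (client) ID>"
  clientSecret: "<client secret value>"
  authURL: "https://login.microsoftonline.com/<Directory (tenant) ID>/oauth2/v2.0/authorize"
  loginRedirectURL: "https://{your-kubecost-address}/model/oidc/authorize"
  discoveryURL: "https://login.microsoftonline.com/<Directory (tenant) ID>/v2.0/.well-known/openid-configuration"
```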
Return to the Overview page for the application you created in Step 1.
Select App roles > Create app role. Provide the following values:
Display name: admin
Allowed member types: Users/Groups
Value: admin
Description: Admins have read/write permissions via the Kubecost frontend (or provide a custom description as needed)
Do you want to enable this app role?: Select the checkbox
Select Apply.
Then, you need to attach the role you just created to users and groups.
In the Azure AD left navigation, select Applications > Enterprise applications. Select the application you created in Step 1.
Select Users & groups.
Select Add user/group. Select the desired group. Select the admin role you created, or another relevant role. Then, select Assign to finalize changes.
Update your existing values.yaml with this template:
Run the following command:
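For example (release name, chart reference, and namespace are assumptions):

```sh
helm upgrade kubecost kubecost/cost-analyzer --namespace kubecost -f values.yaml
```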
Staging builds for the Kubecost Helm Chart are produced at least daily before changes are moved to production. To upgrade an existing Kubecost Helm Chart deployment to the latest staging build, follow these quick steps:
Add the Kubecost staging Helm repository with the following command:
Upgrade Kubecost to use the staging repo:
Create a new Keycloak realm.
Navigate to Realm Settings > General > Endpoints > OpenID Endpoint Configuration > Clients.
Select Create to add Kubecost to the list of clients. Define a clientID. Ensure the Client Protocol is set to openid-connect.
Select your newly created client, then go to Settings.
Set Access Type to confidential.
Set Valid Redirect URIs to http://YOUR_KUBECOST_ADDRESS/model/oidc/authorize.
Set Base URL to http://YOUR_KUBECOST_ADDRESS.
The OIDC Helm values for Keycloak should be as follows:
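A hedged sketch, assuming a recent Keycloak release (realm endpoints without the legacy /auth prefix) and the same oidc keys used elsewhere in this doc; substitute your Keycloak address, realm, and client credentials:

```yaml
oidc:
  enabled: true
  clientID: "<your clientID>"
  clientSecret: "<client secret from the client's Credentials tab>"
  authURL: "http://YOUR_KEYCLOAK_ADDRESS/realms/YOUR_REALM/protocol/openid-connect/auth"
  loginRedirectURL: "http://YOUR_KUBECOST_ADDRESS/model/oidc/authorize"
  discoveryURL: "http://YOUR_KEYCLOAK_ADDRESS/realms/YOUR_REALM/.well-known/openid-configuration"
```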
Kubecost can run on clusters with thousands of nodes when resource consumption is properly tuned. Here's a chart with some of the steps you can take to tune Kubecost, along with descriptions of each.
Cloud cost metrics for all accounts can be pulled in on your primary cluster by pointing Kubecost to one or more management accounts. Therefore, you can disable CloudCost on secondary clusters by setting the following Helm value:
--set cloudCost.enabled=false
This method is only available for AWS cloud billing integrations. Kubecost is capable of tracking each individual cloud billing line item; however, on certain accounts this data can be quite large. If provider IDs are excluded, Kubecost won't cache granular data. Instead, Kubecost caches aggregate data and makes ad-hoc queries to the AWS Cost and Usage Report to get granular data, resulting in slower load times but less memory consumption.
--set kubecostModel.maxQueryConcurrency=1
--set kubecostModel.maxPrometheusQueryDurationMinutes=300
Lowering query resolution will reduce memory consumption, but will cause short-running pods to be sampled and rounded to the nearest interval for their runtime. The default value is 300s. This can be tuned with the Helm value:
--set kubecostModel.etlResolutionSeconds=600
--set prometheus.server.nodeExporter.enabled=false
--set prometheus.serviceAccounts.nodeExporter.create=false
Optionally, enabling impactful memory thresholds can ensure the Go runtime garbage collector throttles at more aggressive frequencies at or approaching the soft limit. There is no one-size-fits-all value here, and users looking to tune this parameter should be aware that setting the value too low may reduce overall performance. If you set the resources.requests memory values appropriately, using the same value for softMemoryLimit will instruct the Go runtime to keep its heap acquisition and release within the same bounds as the expected pod memory use. This can be tuned with the Helm value:
--set kubecostModel.softMemoryLimit=<Units><B, KiB, MiB, GiB>
The Cluster Controller is currently in beta. Please read the documentation carefully.
Kubecost's Cluster Controller allows you to access additional Savings features through automated processes. To function, the Cluster Controller requires write permission to certain resources on your cluster, and for this reason, the Cluster Controller is disabled by default.
The Cluster Controller enables features like:
The Cluster Controller can be enabled on any cluster type, but certain functionality will only be enabled based on the cloud service provider (CSP) of the cluster and its type:
The Cluster Controller can only be enabled on your primary cluster.
The Controller itself and container RRS are available for all cluster types and configurations.
Cluster turndown, cluster right-sizing, and Kubecost Actions are only available for GKE, EKS, and Kops-on-AWS clusters, after setting up a provider service key.
Therefore, the 'Provider service key setup' section below is optional depending on your cluster environment, but will limit functionality if you choose to skip it. Read the caution banner in the below section for more details.
If you are enabling the Cluster Controller for a GKE/EKS/Kops-on-AWS cluster, follow the specialized instructions for your CSP(s) below. If you aren't using a GKE/EKS/Kops-on-AWS cluster, skip ahead to the section below.
You can now enable the Cluster Controller in the Helm chart by finding the clusterController Helm flag and setting enabled: true.
You may also enable via --set when running Helm install:
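For example (release name and chart reference are assumptions):

```sh
helm upgrade kubecost kubecost/cost-analyzer --namespace kubecost \
  --set clusterController.enabled=true
```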
You can verify that the Cluster Controller is running by issuing the following:
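A sketch (the exact pod name depends on your release name):

```sh
kubectl get pods -n kubecost | grep cluster-controller
```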
Once the Cluster Controller has been enabled successfully, you should automatically have access to the listed Savings features.
If you are using one Entra ID app to authenticate multiple Kubecost endpoints, you must pass an additional redirect_uri parameter in your authURL, which includes the URI you configured in Step 1.4. Otherwise, Entra ID may redirect to an incorrect endpoint. You can read more about this in Microsoft Entra ID's documentation. View the example below to see how you should format your URI:
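A hypothetical illustration of the format, assuming the Microsoft identity platform v2.0 authorize endpoint; note the redirect_uri value is URL-encoded:

```
https://login.microsoftonline.com/<Directory (tenant) ID>/oauth2/v2.0/authorize?redirect_uri=https%3A%2F%2F{your-kubecost-address}%2Fmodel%2Foidc%2Fauthorize
```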
First, you need to configure an admin role for your app. For more information on this step, see Microsoft's documentation on adding app roles.
Use your browser's devtools to observe network requests made between you, your Identity Provider, and Kubecost. Pay close attention to cookies and headers.
Search for oidc in your logs to follow events. Pay attention to any WRN related to OIDC. Search for Token Response, and try decoding both the access_token and id_token to ensure they are well formed (https://jwt.io/).
You can find more details on these flags in Kubecost's values.yaml.
Cloud Costs allows Kubecost to pull in spend data from your integrated cloud service providers.
Secondary clusters can be configured strictly as metric emitters to save memory. Learn more about how to best configure them in our secondary clusters guide.
Lowering query concurrency for the Kubecost ETL build will mean ETL takes longer to build, but consumes less memory. The default value is 5. This can be adjusted with the Helm flag:
Lowering query duration results in Kubecost querying for smaller windows when building ETL data. This can lead to slower ETL build times, but lower memory peaks because of the smaller datasets. The default value is 1440. This can be tuned with the Helm flag:
Fewer data points scraped from Prometheus means less data to collect and store, at the cost of Kubecost making estimations that possibly miss spikes of usage or short-running pods. The default value is 60s. This can be tuned in our values.yaml for the Prometheus scrape job.
Node-exporter is optional. Some health alerts will be disabled if node-exporter is disabled, but savings recommendations and core cost allocation will function normally. This can be disabled with the following Helm flags:
More info on this environment variable can be found in the related Kubecost documentation.
The following command performs the steps required to set up a service account.
To use the script, provide the following required parameters:
For EKS cluster provisioning, if using eksctl, make sure that you use the --managed option when creating the cluster. Unmanaged node groups should be upgraded to managed node groups.
Cluster turndown is currently in beta. Please read the documentation carefully.
Cluster turndown is an automated scale down and scale up of a Kubernetes cluster's backing nodes based on a custom schedule and turndown criteria. This feature can be used to reduce spend during down hours and/or reduce surface area for security reasons. The most common use case is to scale non-production environments (e.g. development clusters) to zero during off hours.
If you are upgrading from a pre-1.94 version of the Kubecost Helm chart, you will have to migrate your custom resources. turndownschedules.kubecost.k8s.io has been changed to turndownschedules.kubecost.com and finalizers.kubecost.k8s.io has been changed to finalizers.kubecost.com. See the TurndownSchedule Migration Guide for an explanation.
Cluster turndown is only available for clusters on GKE, EKS, or Kops-on-AWS.
Enable the Cluster Controller
You will receive full turndown functionality once the Cluster Controller is enabled via a provider service key setup and Helm upgrade. Review the Cluster Controller doc linked above under Prerequisites for more information, then return here when you've confirmed the Cluster Controller is running.
You can verify that the cluster-turndown pod is running with the following command:
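A sketch (the exact pod name depends on your release):

```sh
kubectl get pods -n kubecost | grep cluster-turndown
```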
Turndown uses a Kubernetes Custom Resource Definition to create schedules. Here is an example resource located at artifacts/example-schedule.yaml:
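A sketch of such a resource, based on the turndownschedules.kubecost.com CRD and finalizers.kubecost.com finalizer named above; the apiVersion and field layout should be confirmed against artifacts/example-schedule.yaml:

```yaml
apiVersion: kubecost.com/v1alpha1
kind: TurndownSchedule
metadata:
  name: example-schedule
  finalizers:
    - finalizers.kubecost.com
spec:
  start: 2024-06-01T00:00:00Z   # RFC3339, offset to UTC
  end: 2024-06-01T12:00:00Z
  repeat: daily
```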
This definition will create a schedule that starts by turning down at the designated start date-time and turning back up at the designated end date-time. Both the start and end times should be in RFC3339 format, i.e. times based on offsets to UTC. There are three possible values for repeat:
none: Single schedule turndown and turnup.
daily: Start and end times will reschedule every 24 hours.
weekly: Start and end times will reschedule every 7 days.
To create this schedule, you may modify example-schedule.yaml to your desired schedule and run:
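For example:

```sh
kubectl apply -f artifacts/example-schedule.yaml
```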
Currently, updating a resource is not supported, so if the scheduling of the example-schedule.yaml fails, you will need to delete the resource via:
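For example:

```sh
kubectl delete -f artifacts/example-schedule.yaml
```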
Then make the modifications to the schedule and re-apply.
The turndownschedule resource can be listed via kubectl as well:
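For example:

```sh
kubectl get turndownschedules
```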
or using the shorthand:
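For example:

```sh
kubectl get tds
```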
Details regarding the status of the turndown schedule can be found by outputting as a JSON or YAML:
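For example (substitute your schedule's name):

```sh
kubectl get tds example-schedule -o yaml
```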
The status field displays the current status of the schedule, including next schedule times, specific schedule identifiers, and the overall state of the schedule.
state: The state of the turndown schedule. This can be:
ScheduleSuccess: The schedule has been set and is waiting to run.
ScheduleFailed: The scheduling failed due to a schedule already existing or scheduling for a date-time in the past.
ScheduleCompleted: For schedules with repeat: none, the schedule will move to a completed state after turn up.
current: The next action to run.
lastUpdated: The last time the status was updated on the schedule.
nextScaleDownTime: The next time a turndown will be executed.
nextScaleUpTime: The next time a turn up will be executed.
scaleDownId: Specific identifier assigned by the internal scheduler for turndown.
scaleUpId: Specific identifier assigned by the internal scheduler for turn up.
scaleDownMetadata: Metadata attached to the scaledown job, assigned by the turndown scheduler.
scaleUpMetadata: Metadata attached to the scale up job, assigned by the turndown scheduler.
A turndown can be canceled before turndown actually happens or after. This is performed by deleting the resource:
Canceling while turndown is currently scaling down or scaling up will result in a delayed cancellation, as the schedule must complete its operation before processing the deletion/cancellation.
If the turndown schedule is canceled between a turndown and turn up, the turn up will occur automatically upon cancellation.
Cluster turndown has limited functionality via the Kubecost UI. To access cluster turndown in the UI, you must first enable Kubecost Actions. Once this is completed, you will be able to create and delete turndown schedules instantaneously for your supported clusters. Read more about turndown's UI functionality in this section of the above Kubecost Actions doc. Review the entire doc for more information on Kubecost Actions functionality and limitations.
The internal scheduler only allows one schedule at a time to be used. Any additional schedule resources created will fail (kubectl get tds -o yaml will display the status).
Do not attempt to kubectl edit a turndown schedule. This is currently not supported. The recommended approach for modifying a schedule is to delete it and then create a new one.
There is a 20-minute minimum time window between start and end of turndown schedule.
High availability mode is only officially supported on Kubecost Enterprise plans.
Running Kubecost in high availability (HA) mode is a feature that relies on multiple Kubecost replica pods implementing the ETL Bucket Backup feature, combined with a leader/follower implementation which ensures that there always exists exactly one leader across all replicas.
The leader/follower implementation leverages a coordination.k8s.io/v1 Lease resource to manage the election of a leader when necessary. To control access to the backup from the ETL pipelines, a RWStorageController is implemented to ensure the following:
Followers block on all backup reads, and poll bucket storage for any backup reads every 30 seconds.
Followers no-op on any backup writes.
Followers who receive queries against a backup store will not stack on pending reads, preventing external queries from blocking.
Followers promoted to Leader will drop all locks and receive write privileges.
Leaders behave identically to a single Kubecost install.
In order to enable the leader/follower and HA features, the following must also be configured:
Replicas are set to a value greater than 1
ETL FileStore is Enabled (enabled by default)
ETL Bucket Backup is configured
For example, using our Helm chart, the following is an acceptable configuration:
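A hedged sketch using --set flags; the replica and backup-secret value names are assumptions to verify against your chart version:

```sh
helm upgrade kubecost kubecost/cost-analyzer --namespace kubecost \
  --set kubecostDeployment.replicas=3 \
  --set kubecostModel.etlBucketConfigSecret=kubecost-etl-backup-bucket
```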
This can also be done in the values.yaml file within the chart:
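The equivalent values.yaml sketch (same assumptions as above):

```yaml
kubecostDeployment:
  replicas: 3
kubecostModel:
  etlBucketConfigSecret: kubecost-etl-backup-bucket
```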
This feature is currently in alpha. Please read the documentation carefully.
Kubecost's Kubescaler implements continuous request right-sizing: the automatic application of Kubecost's high-fidelity recommendations to your containers' resource requests. This provides an easy way to automatically improve your allocation of cluster resources by improving efficiency.
Kubescaler can be enabled and configured on a per-workload basis so that only the workloads you want edited will be edited.
Kubescaler is part of Cluster Controller, and should be configured after the Cluster Controller is enabled.
Kubescaler is configured on a workload-by-workload basis via annotations. Currently, only deployment workloads are supported.
Annotation | Description | Example(s) |
---|---|---|
Notable Helm values:
Helm value | Description | Example(s) |
---|---|---|
Kubescaler supports:
apps/v1 Deployments
apps/v1 DaemonSets
batch/v1 CronJobs (K8s v1.21+). No attempt will be made to autoscale a CronJob until it has run at least once.
Kubescaler cannot support:
"Uncontrolled" Pods. Learn more here.
Kubescaler will take care of the rest. It will apply the best-available recommended requests to the annotated controller every 11 hours. If the recommended requests exceed the current limits, the update is currently configured to set the request to the current limit.
To check current requests for your Deployments, use the following command:
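A sketch using standard kubectl JSONPath output (names are placeholders):

```sh
kubectl get deployment <deployment-name> -n <namespace> \
  -o jsonpath='{.spec.template.spec.containers[*].resources.requests}'
```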
This feature is only officially supported on Kubecost Enterprise plans.
The following steps allow Kubecost to use custom prices with a CSV pipeline. This feature allows for individual assets (e.g. nodes) to be supplied at unique prices. Common uses are for on-premise clusters, service-providers, or for external enterprise discounts.
Create a CSV file in this format (also in the below table). CSV changes are picked up hourly by default.
EndTimeStamp: currently unused
InstanceID: identifier used to match the asset
Region: filter match based on topology.kubernetes.io/region
AssetClass: node, pv, and gpu are supported
InstanceIDField: field in spec or metadata that will contain the relevant InstanceID. For nodes, often spec.providerID; for PVs, often metadata.name
InstanceType: optional field to define the asset type, e.g. m5.12xlarge
MarketPriceHourly: hourly price to charge this asset
Version: field for schema version, currently unused
If the node label topology.kubernetes.io/region is present, it must also be in the Region column.
This section is only required for nodes with GPUs.
The node the GPU is attached to must be matched by a CSV node price. Typically this will be matched on instance type (node.kubernetes.io/instance-type)
Supported GPU labels are currently:
gpu.nvidia.com/class
nvidia.com/gpu_type
Verification:
Connect to the Kubecost Prometheus: kubectl port-forward --namespace kubecost services/kubecost-cost-analyzer 9090:9090
Run the following query: curl localhost:9090/model/prometheusQuery?query=node_gpu_hourly_cost
You should see output similar to this: {instance="ip-192-168-34-166.us-east-2.compute.internal",instance_type="test.xlarge",node="ip-192-168-34-166.us-east-2.compute.internal",provider_id="aws:///us-east-2b/i-055274d3576800444",region="us-east-2"} 10 | YOUR_HOURLY_COST
Provide a file path for your CSV pricing data in your values.yaml. This path can reference a local PV or an S3 bucket.
Alternatively, mount a ConfigMap with the CSV:
Then set the following Helm values:
For S3 locations, provide file access. Required IAM permissions:
There are two options for adding the credentials to the Kubecost pod:
Service key: Create an S3 service key with the permissions above, then add its ID and access key as a K8s secret:
kubectl create secret generic pricing-schema-access-secret -n kubecost --from-literal=AWS_ACCESS_KEY_ID=id --from-literal=AWS_SECRET_ACCESS_KEY=key
The name of this secret should be the same as csvAccessCredentials in values.yaml above
AWS IAM (IRSA) service account annotation
Negotiated discounts are applied after cost metrics are written to Prometheus. Discounts will apply to all node pricing data, including pricing data read directly from the custom provider CSV pipeline. Additionally, all discounts can be updated at any time and changes are applied retroactively.
The following logic is used to match node prices accurately:
First, search for an exact match in the CSV pipeline
If an exact match is not available, search for an existing CSV data point that matches region, instanceType, and AssetClass
If neither is available, fall back to pricing estimates
You can check a summary of the number of nodes that have matched with the CSV by visiting /model/pricingSourceCounts. The response is a JSON object of the form:
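To query the endpoint, for example (reusing the port-forward from the verification steps above):

```sh
kubectl port-forward --namespace kubecost services/kubecost-cost-analyzer 9090:9090
curl localhost:9090/model/pricingSourceCounts
```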
Cloud provider service keys can be used in various aspects of the Kubecost installation. This includes configuring cloud billing integrations, multi-cluster object storage, and ETL backups. While automated IAM authentication via a Kubernetes service account like AWS IRSA is recommended, there are some scenarios where key-based authentication is preferred. When this method is used, rotating the keys at a pre-defined interval is a security best practice. Combinations of these features can be used, and therefore you may need to follow one or more of the below steps.
There are multiple methods for adding cloud provider keys to Kubecost when configuring a cloud integration. This article will cover all three procedures. Be sure to use the same method that was used during the initial installation of Kubecost when rotating keys. See the cloud integration doc for additional details.
The preferred and most common is via the multi-cloud cloud-integration.json Kubernetes secret.
The second method is to define the appropriate secret in Kubecost's values.yaml.
The final method to configure keys is via the Kubecost Settings page.
The primary sequence for setting up your key is:
Modify the appropriate Kubernetes secret, Helm value, or update via the Settings page.
Restart the Kubecost cost-analyzer pod.
Verify the new key is working correctly. Any authentication errors should be present early in the cost-model container logs from the cost-analyzer pod. Additionally, you can check the status of the cloud integration in the Kubecost UI via Settings > View Full Diagnostics.
There are two methods for enabling multi-clustering in Kubecost: Federated ETL and Thanos Federation.
Depending on which method you are using, the key rotation process differs.
With Federated ETL objects, storage keys can be provided in two ways. The preferred method is using the secret defined by the Helm value .Values.kubecostModel.federatedStorageConfigSecret. The alternate method is to re-use the ETL backup secret defined with the .Values.kubecostModel.etlBucketConfigSecret Helm value.
Update the appropriate Kubernetes secret with the new key on each cluster.
Restart the Kubecost cost-analyzer pod.
Restart the Kubecost federator pod.
Verify the new key is working correctly by checking the cost-model container logs from the cost-analyzer pod for any object storage authentication errors. Additionally, verify there are no object storage errors in the federator pod logs.
Update the kubecost-thanos Kubernetes secret with the new key on each cluster.
Restart the prometheus server pod installed with Kubecost on all clusters (including the primary cluster) that write data to the Thanos object store. This will ensure the Thanos sidecar has the new key.
On the primary Kubecost cluster, restart the thanos-store pod.
Verify the new key is working correctly by checking the thanos-sidecar logs in the prometheus server pods for authentication errors, to ensure they are able to write new block data to the object storage.
Verify the new key is working correctly by checking thanos-store
pod logs on the primary cluster for authentication errors to ensure it is able to read block data from the object storage.
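A sketch for Thanos federation, assuming the default kubecost namespace and Helm release name; the thanos-store workload name and the secret's file key vary by install, so treat them as placeholders:

```bash
# On every cluster writing to the Thanos object store:
kubectl -n kubecost create secret generic kubecost-thanos \
  --from-file=object-store.yaml --dry-run=client -o yaml | kubectl apply -f -
kubectl -n kubecost rollout restart deployment/kubecost-prometheus-server

# On the primary cluster only: restart the thanos-store pod (name is illustrative).
kubectl -n kubecost delete pod <thanos-store-pod-name>

# Verify: check the sidecar and store logs for authentication errors.
kubectl -n kubecost logs deployment/kubecost-prometheus-server -c thanos-sidecar | grep -iE "auth|denied|err"
kubectl -n kubecost logs <thanos-store-pod-name> | grep -iE "auth|denied|err"
```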
Modify the appropriate Kubernetes secret.
Restart the Kubecost cost-analyzer pod.
Verify the backups are still being written to the object storage.
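A sketch for ETL backups, assuming the default kubecost namespace and a secret named kubecost-object-store (substitute the name referenced by .Values.kubecostModel.etlBucketConfigSecret in your install):

```bash
# Update the ETL backup bucket secret with the rotated key (names are assumptions).
kubectl -n kubecost create secret generic kubecost-object-store \
  --from-file=object-store.yaml --dry-run=client -o yaml | kubectl apply -f -

# Restart the cost-analyzer pod, then confirm new backup files keep appearing in the bucket.
kubectl -n kubecost rollout restart deployment/kubecost-cost-analyzer
```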
The metrics listed below are emitted by Kubecost and scraped by Prometheus to help monitor the status of Kubecost data pipelines:
kubecost_allocation_data_status, which presents the time series status of the active allocation data
kubecost_asset_data_status, which presents the time series status of the active asset data
These metrics expose data status through Prometheus so you can proactively alert on and analyze the allocation and asset data at a point in time.
The metrics below depict the status of active allocation data at a point in time. The resolution is either daily or hourly, aligning one-to-one with the data status of the allocation daily and hourly stores. Each hourly and daily store has four status types:
Empty: Depicts the total number of empty allocationSets in each store (hourly or daily) at a point in time.
Error: Depicts the total number of errors in the allocationSets in each store (hourly or daily) at a point in time.
Success: Depicts the total number of successful allocationSets in each store (hourly or daily) at a point in time.
Warning: Depicts the total number of warnings across allocationSets in each store (hourly or daily) at a point in time.
The metrics below depict the status of active asset data at a point in time. The resolution is either daily or hourly, aligning one-to-one with the data status of the asset daily and hourly stores. Each hourly and daily store has four status types:
Empty: Depicts the total number of empty assetSets in each store (hourly or daily) at a point in time.
Error: Depicts the total number of errors in the assetSets in each store (hourly or daily) at a point in time.
Success: Depicts the total number of successful assetSets in each store (hourly or daily) at a point in time.
Warning: Depicts the total number of warnings across assetSets in each store (hourly or daily) at a point in time.
kubecost_asset_data_status is written to Prometheus during the assetSet and assetLoad events.
kubecost_allocation_data_status is written to Prometheus during the allocationSet and allocationLoad events.
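As a quick way to inspect these series, you can query the bundled Prometheus directly (the service name and port below assume a default Helm install; label names on the series vary by Kubecost version):

```bash
# Port-forward the bundled Prometheus (default service name assumed) and query
# the data status metrics.
kubectl -n kubecost port-forward svc/kubecost-prometheus-server 9080:80 &
curl -s 'http://localhost:9080/api/v1/query?query=kubecost_allocation_data_status'
curl -s 'http://localhost:9080/api/v1/query?query=kubecost_asset_data_status'
```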
During the cleanup operation, the corresponding entries for each allocation and asset are deleted so that the metrics remain in parity with the respective allocation and asset stores.
Availability Tiers impact capacity recommendations, health ratings, and more in the Kubecost product. As an example, production jobs receive higher resource request recommendations than dev workloads. Another example: health scores for high-availability workloads are heavily penalized when multiple replicas are not available.
Today our product supports the following tiers:
| Tier | Priority | Default |
|---|---|---|
| | 0 | If true, recommendations and health scores heavily prioritize availability. This is the default tier if none is supplied. |
| | 1 | Intended for production jobs that are not necessarily mission-critical. |
| | 2 | Meant for experimental or development resources. Redundancy or availability is not a high priority. |
To apply a namespace tier, add a tier namespace label to reflect the desired value.
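For example, an illustrative label command (the namespace and tier value shown are placeholders; use a tier value supported by your Kubecost version):

```bash
# Label a namespace with the desired availability tier.
kubectl label namespace my-namespace tier=production --overwrite
```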
Kubecost can run on clusters with mixed Linux and Windows nodes. The Kubecost pods must run on a Linux node.
When using a Helm install, this can be done simply with:
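A sketch of such an install; the chart repo URL and flag syntax shown are typical for the cost-analyzer chart, but verify them against your chart version:

```bash
helm upgrade --install kubecost cost-analyzer \
  --repo https://kubecost.github.io/cost-analyzer/ \
  --namespace kubecost --create-namespace \
  --set nodeSelector."kubernetes\.io/os"=linux
```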
The cluster must have at least one Linux node for the Kubecost pods to run on:
Use a nodeSelector for all Kubecost deployments:
For DaemonSets, set the affinity to only allow scheduling on Linux nodes:
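The following sketch writes illustrative YAML fragments for both cases using standard Kubernetes scheduling fields; the exact Helm value paths for each Kubecost Deployment and DaemonSet depend on your chart version:

```bash
# Illustrative scheduling fragments only; merge the nodeSelector into the values
# for Kubecost Deployments and the nodeAffinity into the values for any Kubecost
# DaemonSets before running your Helm upgrade.
cat > kubecost-linux-scheduling.yaml <<'EOF'
nodeSelector:
  kubernetes.io/os: linux
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/os
              operator: In
              values:
                - linux
EOF
```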
Collecting data about Windows nodes is supported by Kubecost as of v1.93.0.
Accurate node and pod data exists by default, since they come from the Kubernetes API.
Kubecost requires cAdvisor for pod utilization data to determine costs at the container level.
Currently, pods on Windows nodes are billed based on request size.
In v1.94 of Kubecost, the turndownschedules.kubecost.k8s.io/v1alpha1 Custom Resource Definition (CRD) was renamed to turndownschedules.kubecost.com/v1alpha1 to adhere to Kubernetes API group naming conventions. This is a breaking change for users of Cluster Controller's turndown functionality. Please follow this guide for a successful migration of your turndown schedule resources.
Note: As part of this change, the CRD was updated to use apiextensions.k8s.io/v1 because v1beta1 was removed in K8s v1.22. If using Kubecost v1.94+, Cluster Controller's turndown functionality will not work on K8s versions before the introduction of apiextensions.k8s.io/v1.
In this situation, you've deployed Kubecost's Cluster Controller at some point using --set clusterController.enabled=true, but you don't use the turndown functionality.
That means that this command should return one line:
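For example (an illustrative check; the grep simply narrows the output to the old CRD):

```bash
# The old CRD should exist if Cluster Controller was ever enabled.
kubectl get crd | grep turndownschedules.kubecost.k8s.io
```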
And this command should return no resources:
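For example (illustrative):

```bash
# No old-API turndown schedules should exist in this scenario.
kubectl get turndownschedules.kubecost.k8s.io
```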
This situation is easy! You can do nothing, and turndown should continue to behave correctly because kubectl get turndownschedule and related commands will correctly default to the new turndownschedules.kubecost.com/v1alpha1 CRD after you upgrade to Kubecost v1.94 or higher.
If you would like to be fastidious and clean up the old CRD, simply run kubectl delete crd turndownschedules.kubecost.k8s.io after upgrading Kubecost to v1.94 or higher.
In this situation, you've deployed Kubecost's Cluster Controller at some point using --set clusterController.enabled=true and you have at least one turndownschedule.kubecost.k8s.io resource currently present in your cluster.
That means that this command should return one line:
And this command should return at least one resource:
We have a few steps to perform if you want Cluster Controller's turndown functionality to continue to behave according to your already-defined turndown schedules.
Upgrade Kubecost to v1.94 or higher with --set clusterController.enabled=true
Make sure the new CRD has been defined after your Kubecost upgrade
This command should return a line:
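For example (illustrative):

```bash
# The new CRD should be present after upgrading to v1.94 or higher.
kubectl get crd | grep turndownschedules.kubecost.com
```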
Copy your existing turndownschedules.kubecost.k8s.io resources into the new CRD
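A minimal sketch of the copy, assuming you review the exported YAML (and strip server-populated metadata such as resourceVersion and uid) before applying:

```bash
# Export the old-API resources, rewrite the API group, and re-create them under the new CRD.
kubectl get turndownschedules.kubecost.k8s.io -o yaml > old-turndownschedules.yaml
sed 's|kubecost.k8s.io/v1alpha1|kubecost.com/v1alpha1|g' old-turndownschedules.yaml > new-turndownschedules.yaml
kubectl apply -f new-turndownschedules.yaml
```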
(optional) Delete the old turndownschedules.kubecost.k8s.io CRD
Note: The following command may be unnecessary because Helm should automatically remove the turndownschedules.kubecost.k8s.io resource during the upgrade. The removal will remain in a pending state until the finalizer patch is applied (see the note on finalizers below).
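If you do want to remove it manually, the command (from the cleanup step above) is:

```bash
# Delete the old CRD; if this hangs, see the note on finalizers below.
kubectl delete crd turndownschedules.kubecost.k8s.io
```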
Thanos federation makes use of the kubecost-thanos Kubernetes secret as described in the Thanos setup documentation.
ETL backups rely on the secret defined by the Helm value .Values.kubecostModel.etlBucketConfigSecret. More details can be found in the ETL backup documentation.
See the list of all deployments and DaemonSets in this file.
Because the CRDs have a finalizer on them, we have to remove the finalizer from our old resources before they can be deleted. This lets us clean up without locking up.
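A sketch of removing the finalizer from an old resource (the resource name is a placeholder):

```bash
# Clear the finalizers so the old resource, and then the old CRD, can be deleted.
kubectl patch turndownschedules.kubecost.k8s.io <schedule-name> \
  --type=merge -p '{"metadata":{"finalizers":null}}'
```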
| Setting | Description | Example |
|---|---|---|
| request.autoscaling.kubecost.com/enabled | Whether to autoscale the workload. See note on KUBESCALER_RESIZE_ALL_DEFAULT. | true, false |
| request.autoscaling.kubecost.com/frequencyMinutes | How often to autoscale the workload, in minutes. If unset, a conservative default is used. | 73 |
| request.autoscaling.kubecost.com/scheduleStart | Optional augmentation to the frequency parameter. If both are set, the workload will be resized on the scheduled frequency, aligned to the start. If frequency is 24h and the start is midnight, the workload will be rescheduled at (about) midnight every day. Formatted as RFC3339. | 2022-11-28T00:00:00Z |
| cpu.request.autoscaling.kubecost.com/targetUtilization | Target utilization (CPU) for the recommendation algorithm. If unset, the backing recommendation service's default is used. | 0.8 |
| memory.request.autoscaling.kubecost.com/targetUtilization | Target utilization (Memory/RAM) for the recommendation algorithm. If unset, the backing recommendation service's default is used. | 0.8 |
| request.autoscaling.kubecost.com/recommendationQueryWindow | Value of the window parameter to be used when acquiring recommendations. See the Request sizing API for an explanation of the window parameter. If setting up autoscaling for a CronJob, it is strongly recommended to set this to a value greater than the duration between Job runs. For example, if you have a weekly CronJob, this parameter should be set to a value greater than 7d to ensure a recommendation is available. | 2d |
| clusterController.kubescaler.resizeAllDefault | If true, Kubescaler will switch to default-enabled for all workloads unless they are annotated with request.autoscaling.kubecost.com/enabled=false. This is recommended for low-stakes clusters where you want to prioritize workload efficiency without reworking deployment specs for all workloads. | true |
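For example, a sketch of enabling autoscaling on a hypothetical Deployment using the annotations above (the workload name and frequency are placeholders):

```bash
# Annotate a workload to opt it into request autoscaling roughly once per day.
kubectl annotate deployment my-app \
  request.autoscaling.kubecost.com/enabled=true \
  request.autoscaling.kubecost.com/frequencyMinutes=1440 \
  --overwrite
```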