Kubecost diagnostics run a series of tests to determine if resources necessary for accurate cost reporting are available.
You can access the Diagnostics page in the Kubecost UI by selecting Settings from the left navigation, then selecting View Full Diagnostics.
cAdvisor metrics are generated by cAdvisor directly and are required for core application functionality, including Kubecost Allocation and Savings insights.
A limited set of KSM data is required for core application functionality, including Kubecost Allocation and Savings insights.
Kubecost metrics are generated by the kubecost-cost-analyzer pod and are required for the core application to function. Specifically, these metrics are used for Kubecost Allocations, Assets, and Savings functionality.
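As a quick sanity check that these metrics are reaching Prometheus, you can query one of them through the Prometheus HTTP API. The sketch below uses node_total_hourly_cost, one of the metrics emitted by the cost-analyzer, and a placeholder Prometheus address.

```sh
# Query Prometheus for a Kubecost-emitted metric (placeholder address).
# A non-empty "result" array means the cost-analyzer is being scraped successfully.
curl -G 'http://<your-prometheus-address>/api/v1/query' \
  --data-urlencode 'query=node_total_hourly_cost'
```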
Node exporter metrics are used for the following features:
Reserved Instance Recommendations in Savings
Showing a compute 'breakdown' on Overview's Resource Efficiency graph, i.e. system vs idle vs user. The Compute bar on this graph will appear as a single solid-colored bar when this diagnostic is failing.
Various Kubecost Grafana dashboards
These metrics are not used in the core Assets and Allocation views and can therefore be considered optional. Learn how to disable node exporter here.
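If you do decide to disable the bundled node exporter, the Helm values below are a minimal sketch based on a default Helm installation; the release name and value paths are assumptions, so confirm them against the values file for your chart version.

```sh
# Example only: turn off the node exporter bundled with Kubecost's Prometheus.
# Verify these value paths against your chart version before applying.
helm upgrade kubecost kubecost/cost-analyzer --namespace kubecost \
  --set prometheus.nodeExporter.enabled=false \
  --set prometheus.serviceAccounts.nodeExporter.create=false
```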
If any of the above diagnostic tests fail, view the How to Troubleshoot Missing Metrics section below.
Kubecost requires kube-state-metrics >= v1.6.0. This version check is completed by verifying the existence of the kube_persistentvolume_capacity_bytes metric. If this diagnostic test is failing, we recommend you:
Confirm that the kube-state-metrics version requirement is met.
Verify that this metric, and potentially other kube-state-metrics metrics, are not being dropped by Prometheus relabel rules (a quick way to check is sketched below).
Determine whether any persistent volumes are present in this cluster. If there are none, you can ignore this diagnostic check.
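To check whether the metric is reaching Prometheus at all, you can query it directly through the Prometheus HTTP API; the address below is a placeholder for your own Prometheus endpoint.

```sh
# An empty "result" array means the metric is absent: a kube-state-metrics version
# that is too old, a relabel rule dropping it, or simply no persistent volumes in the cluster.
curl -G 'http://<your-prometheus-address>/api/v1/query' \
  --data-urlencode 'query=kube_persistentvolume_capacity_bytes'
```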
A diagnostic view is provided for both the Allocation and Assets pipelines and is designed to assist in diagnosing missing data found in the Allocation or Assets views. Kubecost's ETL pipelines run in the background to build a daily composition of the data required to build the cost model. For each day the data is collected, a file is written to disk containing the results. These files are used as both a cache and data backup, which the diagnostic view displays:
In the event of a problem, the diagnostic view would help you identify specific days where the ETL pipeline failed to collect data.
The file on Nov 20, 2020 in the above image appears in red. This is because the data in this file has been flagged by our diagnostics page as empty (failed to pass a minimum size threshold). This could happen if the database was temporarily unavailable while building that day.
The ETL pipelines provide a way to repair a specific day in the pipeline using the following URL:
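As an assumption based on recent Kubecost API documentation, the URL takes roughly this shape, where the pipeline segment is allocation or asset and the window bounds are RFC 3339 timestamps; confirm the exact path against the API reference for your version:

```
http://<your-kubecost-address>/model/etl/<allocation|asset>/repair?window=<start>,<end>
```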
To repair the file for the problematic date above (note that this example is for the Allocation pipeline), navigate to the following in a browser:
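For example, to repair the Allocation data for Nov 20, 2020 flagged above (placeholder address; again, verify the path against your version's API reference):

```
http://<your-kubecost-address>/model/etl/allocation/repair?window=2020-11-20T00:00:00Z,2020-11-21T00:00:00Z
```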
Previous versions of Kubecost (1.81.0 and prior) provided a similar repair feature under the /rebuild endpoint by passing a window:
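On those older releases the call looked roughly like the following; treat this as a sketch and check the documentation for your specific 1.x version, since a rebuild is more disruptive than a repair:

```
http://<your-kubecost-address>/model/etl/allocation/rebuild?window=2020-11-20T00:00:00Z,2020-11-21T00:00:00Z
```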
Once cloud integrations have been set up, each Cloud Store will have its own diagnostic view, which will include its provider key in the title. This view includes the Cloud Connection Status and metrics for the Reconciliation and Cloud Asset Processes of that provider, including:
Coverage: The window of time that the historical subprocess has covered
LastRun: The last time the process ran; updated each time the periodic subprocess runs
NextRun: Next scheduled run of the periodic subprocess
Progress: The ratio of Coverage to the total amount of time to be covered
RefreshRate: The interval at which the periodic subprocess runs
Resolution: The size of the assets being retrieved
StartTime: When the Cloud Process was started
For more information about Cloud Integration and related APIs, read the cloud-integration documentation.
Below are the minimum required versions:
Confirm that the pod for each metric exporter is in a Running state.
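For a default installation in the kubecost namespace this is a quick check; adjust the namespaces if your exporters run elsewhere.

```sh
# Confirm the metric exporter pods are Running and ready.
kubectl get pods -n kubecost
# If node-exporter or kube-state-metrics run outside the kubecost namespace:
kubectl get pods --all-namespaces | grep -i -e node-exporter -e kube-state-metrics
```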
You can see this information directly on the Kubecost Diagnostics page (screenshot below) or by visiting your Prometheus console and then Status > Targets in the top navigation bar.
If the necessary scrape target is not added to your Prometheus, then refer to this resource to learn how to add a new job under your Prometheus scrape_configs block. You can visit <your-prometheus-console-url>/config to view the current scrape_configs block being passed to your Prometheus.
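For reference, a scrape job for the cost-analyzer's metrics endpoint typically looks something like the sketch below; the job name, service name, and port reflect a default Helm installation and should be adapted to your deployment.

```yaml
scrape_configs:
  - job_name: kubecost
    honor_labels: true
    scrape_interval: 1m
    scrape_timeout: 10s
    metrics_path: /metrics
    scheme: http
    dns_sd_configs:
      # Resolves the cost-analyzer service's A record and scrapes its metrics port.
      - names:
          - kubecost-cost-analyzer
        type: 'A'
        port: 9003
```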
You can see information on recent Prometheus scrape errors directly on the Kubecost Diagnostics page when present or by visiting your Prometheus console and then Status > Targets in the top navigation bar.
Contact support@kubecost.com or send a message in our Slack workspace if you encounter an error that you do not recognize.
If metrics are being collected on a supported version of the desired metrics exporter, the final step is to verify that individual metrics are not being dropped in your Prometheus pipeline. This could be in the form of a drop (or overly restrictive keep) rule under a metric_relabel_configs block in your Prometheus .yaml configuration files.
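As an illustration, a hypothetical rule like the one below would silently drop one of the metrics Kubecost relies on; it is shown only so you know what pattern to look for in your configuration.

```yaml
metric_relabel_configs:
  # Any rule matching a required metric name with action: drop will hide it from Kubecost.
  - source_labels: [__name__]
    regex: kube_persistentvolume_capacity_bytes
    action: drop
```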