Kubecost Metrics
Kubecost Cost Model
The Cost Model both exports and consumes the following metrics.
Metric | Description |
---|---|
| Hourly cost per vCPU on this node |
| Hourly cost per GPU on this node |
| Hourly cost per GB of memory on this node |
| Total node cost per hour |
| Hourly cost of a load balancer |
| Hourly cost paid as a cluster management fee |
| Hourly cost per GB on a persistent volume |
| Number of GPUs available on node |
| Average number of CPUs requested/used over last 1m |
| Average number of GPUs requested over last 1m |
| Average bytes of RAM requested/used over last 1m |
| Bytes provisioned for a PVC attached to a pod |
| Cloud provider info about node preemptibility |
| Total cost per GB egress across zones |
| Total cost per GB egress across regions |
| Total cost per GB of internet egress |
| Service Selector Labels |
| Deployment Match Labels |
| StatefulSet Match Labels |
| (Created by recording rule) |
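To illustrate how the per-resource hourly prices in the table relate to a workload's cost, here is a minimal sketch that multiplies requested resources by unit prices. It is not Kubecost's allocation algorithm. The names `NodePrices` and `estimate_hourly_cost` and the price values are hypothetical, and the sketch assumes 1 GB = 1024^3 bytes.

```python
from dataclasses import dataclass

# Assumption: memory is priced per GB of 1024**3 bytes.
BYTES_PER_GB = 1024 ** 3


@dataclass
class NodePrices:
    """Hourly unit prices for one node (illustrative values only)."""
    cpu_hourly: float     # hourly cost per vCPU
    ram_gb_hourly: float  # hourly cost per GB of memory
    gpu_hourly: float     # hourly cost per GPU


def estimate_hourly_cost(prices: NodePrices, cpu_cores: float,
                         ram_bytes: float, gpus: float = 0.0) -> float:
    """Sum each requested resource multiplied by its hourly unit price."""
    ram_gb = ram_bytes / BYTES_PER_GB
    return (cpu_cores * prices.cpu_hourly
            + ram_gb * prices.ram_gb_hourly
            + gpus * prices.gpu_hourly)


# Hypothetical prices and a pod requesting 2 vCPUs and 4 GB of RAM.
prices = NodePrices(cpu_hourly=0.03, ram_gb_hourly=0.004, gpu_hourly=0.95)
cost = estimate_hourly_cost(prices, cpu_cores=2.0, ram_bytes=4 * BYTES_PER_GB)
print(f"estimated hourly cost: {cost:.4f}")
```

In practice the inputs would come from Prometheus queries over the metrics listed above rather than from hard-coded values.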
Kubecost Network Costs
The Kubecost network-costs DaemonSet collects network data on each node and exports egress, ingress, and performance statistics.
Metric | Description |
---|---|
| egressed byte counts by pod |
| ingressed byte counts by pod |
| total parsed conntrack entries |
| total time in milliseconds it took to parse conntrack entries |
cAdvisor
cAdvisor (Container Advisor) provides container users an understanding of the resource usage and performance characteristics of their running containers. It is a running daemon that collects, aggregates, processes, and exports information about running containers.
GitHub: https://github.com/google/cadvisor
Metric | Description |
---|---|
| Current memory usage, including all memory regardless of when it was accessed |
| Number of bytes that can be consumed by the container on this filesystem |
| Number of bytes that are consumed by the container on this filesystem |
| Current working set |
| Cumulative count of bytes received |
| Cumulative count of bytes transmitted |
| Cumulative cpu time consumed |
| Number of elapsed enforcement period intervals |
| Number of throttled period intervals |
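The last two counters in the table are commonly combined into a throttling ratio: the share of enforcement periods in which the container was throttled. The sketch below shows that arithmetic on two samples of each counter. The function name `throttle_ratio` and the sample values are hypothetical, and this is not necessarily how Kubecost uses these metrics.

```python
from typing import Optional


def throttle_ratio(throttled_start: float, throttled_end: float,
                   periods_start: float, periods_end: float) -> Optional[float]:
    """Fraction of elapsed enforcement periods that were throttled.

    Arguments are two samples of each cumulative counter, taken at the
    start and end of the same window. Returns None when no periods
    elapsed or a counter went backwards (for example after a restart).
    """
    periods = periods_end - periods_start
    throttled = throttled_end - throttled_start
    if periods <= 0 or throttled < 0:
        return None
    return throttled / periods


# Hypothetical samples: 30 of 200 periods were throttled in the window.
ratio = throttle_ratio(10, 40, 100, 300)
print(ratio)
```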
Kube-State-Metrics (KSM)
The default Kubecost installation does not include a KSM deployment, but Kubecost itself calculates and emits the metrics below. These metrics and their labels follow the conventions of KSM v1, not KSM v2.
Metric | Description |
---|---|
| Number of pods specified for a Deployment |
| Number of pods currently available for a Deployment |
| The number of pods which reached Phase Failed and the reason for failure |
| Kubernetes annotations converted to Prometheus labels |
| Kubernetes labels converted to Prometheus labels |
| Kubernetes labels converted to Prometheus labels |
| The allocatable for different resources of a node that are available for scheduling |
| Total allocatable cpu cores of the node (Deprecated in ksm 2.0.0) |
| Total allocatable memory bytes of the node (Deprecated in ksm 2.0.0) |
| The capacity for different resources of a node |
| Total cpu cores available on the node (Deprecated in ksm 2.0.0) |
| Total memory available on the node (bytes) (Deprecated in ksm 2.0.0) |
| The condition of a cluster node |
| Total capacity of a persistent volume (bytes) |
| Status of a persistent volume (Bound |
| Information about persistent volume claim |
| The capacity of storage requested by the persistent volume claim |
| Kubernetes annotations converted to Prometheus labels |
| The number of requested limit resource by a container |
| Limit on CPU cores that can be used by the container. (Deprecated in ksm 2.0.0) |
| Limit on the amount of memory that can be used by the container. (Deprecated in ksm 2.0.0) |
| The number of requested request resource by a container |
| The number of container restarts per container |
| Describes whether the container is currently in running state |
| Describes the reason the container is currently in terminated state |
| Kubernetes labels converted to Prometheus labels |
| Information about the Pod's owner |
| The pod's current phase (Pending
| Information about the ReplicaSet's owner |
Node exporter
Prometheus exporter for hardware and OS metrics exposed by *NIX kernels, written in Go with pluggable metric collectors.
GitHub: https://github.com/prometheus/node_exporter
Metric | Description |
---|---|
| Seconds the cpus spent in each mode |
| The total number of reads completed successfully |
| The total number of reads completed successfully |
| The total number of writes completed successfully |
| The total number of writes completed successfully |
| Whether an error occurred while getting statistics for the given device |
| Memory information field Buffers_bytes |
| Memory information field Cached_bytes |
| Memory information field MemAvailable_bytes |
| Memory information field MemFree_bytes |
| Memory information field MemTotal_bytes |
| Network device statistic transmit_bytes |
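As an example of how the memory fields above can be combined, a node's memory utilization is often derived from the MemAvailable and MemTotal values. The sketch below shows that calculation. The function name `memory_utilization` and the sample values are hypothetical, and Kubecost may compute utilization differently.

```python
def memory_utilization(mem_available_bytes: float, mem_total_bytes: float) -> float:
    """Return the fraction of node memory in use, between 0.0 and 1.0.

    Uses MemAvailable rather than MemFree, because MemAvailable also
    counts memory the kernel can reclaim from caches and buffers.
    """
    if mem_total_bytes <= 0:
        raise ValueError("mem_total_bytes must be positive")
    return 1.0 - mem_available_bytes / mem_total_bytes


# Hypothetical node: 16 GiB total, 4 GiB available.
GIB = 1024 ** 3
utilization = memory_utilization(4 * GIB, 16 * GIB)
print(f"{utilization:.0%}")
```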
Prometheus
Prometheus emits metrics which are used by Kubecost for diagnostic purposes:
Metric | Description |
---|---|
| Scrape target status |
| Amount of time between target scrapes |
NVIDIA K8s Device Plugin (GPU)
NVIDIA GPU monitoring support is explained in more detail on the Kubecost Blog: Monitoring NVIDIA GPU Usage in Kubernetes with Prometheus. The following metrics are consumed:
GitHub: https://github.com/NVIDIA/k8s-device-plugin
Metric | Description |
---|---|
| GPU utilization |