Container Request Right Sizing Recommendation API (V2)
GET
http://<kubecost-address>/model/savings/requestSizingV2
The container request right sizing recommendation API provides recommendations for container resource requests based on configurable parameters and estimates the savings from implementing those recommendations on a per-container, per-controller level. If the cluster-level resources stay static, then there may not be significant savings from applying Kubecost's recommendations until you reduce your cluster resources. Instead, your idle allocation will increase.
Query Parameters
algorithmCPU
string
The algorithm to be used to calculate CPU recommendations based on historical CPU usage data. Options are max and quantile. Max recommendations are based on the maximum-observed usage in window. Quantile recommendations are based on a quantile of observed usage in window (requires the qCPU parameter to set the desired quantile). Defaults to max. To use the quantile algorithm, the ContainerStats Pipeline must be enabled.
algorithmRAM
string
Like algorithmCPU, but for RAM recommendations.
qCPU
float in the range (0, 1]
The desired quantile to base CPU recommendations on. Only used if algorithmCPU=quantile. Note: a quantile of 0.95 is the same as a 95th percentile.
qRAM
float in the range (0, 1]
Like qCPU, but for RAM recommendations.
targetCPUUtilization
float in the range (0, 1]
A ratio of headroom on the base recommended CPU request. If the base recommendation is 100 mCPU and this parameter is 0.8, the recommended CPU request will be 100 / 0.8 = 125 mCPU. Defaults to 0.7. Inputs that fail to parse (see the Go docs) will default to 0.7.
targetRAMUtilization
float in the range (0, 1]
Like targetCPUUtilization, but for RAM recommendations.
minRecCPUMillicores
float
Lower bound, in millicores, of the CPU recommendation. Defaults to 10. Be careful when setting this below 10: Kubernetes currently recommends a maximum of 110 pods per node, and a 10m minimum recommendation allows close to that (if all nodes are single core) while also being a round number.
minRecRAMBytes
float
Lower bound, in bytes, of the RAM recommendation. Defaults to 20MiB (20 * 1024 * 1024).
window*
string
Required parameter. Duration of time over which to calculate usage. Supports days before the current time in the following format: 3d. Note: Hourly windows are not currently supported. Note: It's recommended to provide a window greater than 2d. See the Allocation API documentation for a more detailed explanation of valid inputs to window.
filter
string
A filter to reduce the set of workloads for which recommendations will be calculated. See our Filter Parameters doc for syntax. v1 filters are also supported.
sortBy
string
Column to sort the response by. Defaults to totalSavings. Options are totalSavings, currentEfficiency, cpuRecommended, cpuLatest, memoryRecommended, and memoryLatest.
sortByOrder
string
Order to sort by. Defaults to descending. Options are descending and ascending.
includeLabelsAndAnnotations
boolean
Displays all labels and annotations associated with each container request when set to true. Default is false.
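The sizing parameters above can be read as a small pipeline: pick a base value from observed usage (max or quantile), divide by the target utilization to add headroom, then enforce the lower bound. The following Python sketch illustrates that reading only. The function names are hypothetical, the nearest-rank quantile method is an assumption (Kubecost's actual quantile estimation may differ), and applying the lower bound after the headroom step is also an assumption.

```python
import math


def base_recommendation(samples, algorithm="max", q=None):
    """Pick a base value from the usage samples observed in the window."""
    if not samples:
        raise ValueError("no usage samples in window")
    if algorithm == "max":
        return max(samples)
    if algorithm == "quantile":
        if q is None or not (0 < q <= 1):
            raise ValueError("q must be in the range (0, 1]")
        ordered = sorted(samples)
        # Nearest-rank quantile: an assumption made for this sketch only.
        rank = math.ceil(q * len(ordered))
        return ordered[rank - 1]
    raise ValueError(f"unknown algorithm: {algorithm}")


def apply_headroom(base, target_utilization=0.7, lower_bound=0.0):
    """Divide the base by the target utilization, then enforce the lower bound."""
    if not (0 < target_utilization <= 1):
        raise ValueError("target utilization must be in the range (0, 1]")
    return max(base / target_utilization, lower_bound)


# Hypothetical CPU usage samples, in millicores.
cpu_samples_m = [40, 55, 60, 70, 100]
base_cpu_m = base_recommendation(cpu_samples_m, algorithm="max")
recommended_cpu_m = apply_headroom(base_cpu_m, target_utilization=0.8, lower_bound=10)
```

With a base of 100 mCPU and a target utilization of 0.8, this reproduces the 100 / 0.8 = 125 mCPU arithmetic from the targetCPUUtilization description.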
API examples
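As a minimal sketch of how a query might be assembled, the Python snippet below builds a request URL from the endpoint and query parameters documented above. The helper name build_request_sizing_url and the address localhost:9090 are hypothetical placeholders; substitute your own Kubecost address.

```python
from urllib.parse import urlencode


def build_request_sizing_url(kubecost_address, **params):
    """Build a requestSizingV2 URL; `window` is the only required parameter."""
    if "window" not in params:
        raise ValueError("the window parameter is required")
    # Render Python booleans as lowercase true/false for the query string.
    normalized = {
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in params.items()
    }
    query = urlencode(normalized)
    return f"http://{kubecost_address}/model/savings/requestSizingV2?{query}"


# localhost:9090 is an assumed address used for illustration only.
url = build_request_sizing_url(
    "localhost:9090",
    window="3d",
    algorithmCPU="quantile",
    qCPU=0.95,
    targetCPUUtilization=0.8,
)
print(url)
```

The resulting URL can then be requested with any HTTP client using GET.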
Querying with the /topline endpoint to view cost totals across the query (Aggregator only)
/topline is an optional API endpoint which can be added to your right-sizing query via .../savings/RequestSizingV2/topline... to provide a condensed overview of all items sampled. TotalMonthlySavings is the total estimated savings value from adopting right-sizing recommendations. Count refers to the number of items sampled. Recommendations should return null, because the endpoint cannot provide a single universal right-sizing recommendation.
Recommendation methodology
The "base" recommendation is calculated from the observed usage of each resource per unique container spec (e.g. a 2-replica, 3-container deployment will have 3 recommendations: one for each container spec).
Say you have a single-container deployment with two replicas: A and B.
A's container had peak usages of 120 mCPU and 300 MiB of RAM.
B's container had peak usages of 800 mCPU and 120 MiB of RAM.
The max algorithm recommendation for the deployment's container will be 800 mCPU and 300 MiB of RAM. Overhead will be added to the base recommendation according to the target utilization parameters as described above.
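The two-replica example can be worked through in a few lines of Python. This is an illustrative sketch, not Kubecost's implementation: the data layout and function name are hypothetical, and the headroom step assumes the default target utilization of 0.7 for both CPU and RAM.

```python
# Peak usage per replica as (mCPU, MiB of RAM), taken from the example above.
replica_peaks = {"A": (120, 300), "B": (800, 120)}


def max_base_recommendation(peaks):
    """Take the highest observed CPU and RAM peak across all replicas."""
    cpu_m = max(cpu for cpu, _ in peaks.values())
    ram_mib = max(ram for _, ram in peaks.values())
    return cpu_m, ram_mib


base_cpu_m, base_ram_mib = max_base_recommendation(replica_peaks)

# Add headroom by dividing by the target utilization (0.7 assumed here).
rec_cpu_m = base_cpu_m / 0.7
rec_ram_mib = base_ram_mib / 0.7

print(f"base: {base_cpu_m} mCPU, {base_ram_mib} MiB")
print(f"with headroom: {rec_cpu_m:.0f} mCPU, {rec_ram_mib:.0f} MiB")
```

Note that the CPU peak comes from replica B and the RAM peak from replica A: each resource is considered independently. With the assumed 0.7 target, the final recommendation is roughly 1143 mCPU and 429 MiB.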
Applying your request sizing recommendations
After providing right-sizing recommendations, Kubecost can also apply them directly to your environment. For more information, see the Container Request Recommendation Apply/Plan APIs doc.
Savings projection methodology
See V1 docs.