Overview
2.7 is considered a production release with a focus on:
- Diagnostics: Quickly see the health of all agents in multi-cluster environments
- Optimization: View efficiency reports by label and save custom performance profiles for request sizing
- Collections: Share collections across categories
Known Issues
- On net new installations, the Diagnostics UI will appear broken for 15 minutes.
- The Diagnostics UI is new and may display false positive issues. Use the jump menu to navigate and drill into each cluster to better understand the impact of a given issue.
- Grafana and Prometheus have unfixed CVEs. These will be patched when upstream projects release GA fixes.
Features
- Beta: Allow continuous container rightsizing from agent (secondary) clusters with service accounts by @cliffcolvin in #3784
- Collections sharing: Collections can now be shared across a category.
- Multi-cluster diagnostics: improved visibility into the health of Kubecost installations on secondary clusters from the primary UI.
- Change cost-model to panic when certain misconfigurations are detected. The panic log points users to the specific issue to fix. Also includes other log fixes.
- Collect all node labels by default by @avrodrigues5 in #3892
- Efficiency reports can now be aggregated by workload label in the "Idle By Type" view
Fixes
- Various image bumps to resolve known CVEs; see the current image list below.
- Fix: An issue with ETL Health Checks in diagnostics.
- Fix: An issue where discounts were not properly getting calculated.
- Fix: Add /diagnostic/nodeCount to aggregator.
- Fix: An issue with Idle Load Balancers.
- Fix: An issue where an agent appears healthy when node_total_hourly_cost is not set up in the Prometheus scrape.
- Fix: An issue with uppercase labels in autocomplete queries for assets.
- Fix: Add first resolved date to multi-cluster diagnostics.
- Fix: Warning response when failing clusters in cluster sizing.
- Fix: Add master API key for auth integrations.
- Fix: Network Insights Ingestion Panics.
- Fix: Cluster controller helm template by @mittal-ishaan in #3835 and #3886
- Fix: grafana dashboard templating by @takirala in #3826
- Prevent installing Grafana dashboards when Grafana is disabled by @chipzoller in #3911
- Fix forecasting helm to use consistent image/tags by @ivankube in #3863
- Streamline EKS container image pointers in https://github.com/kubecost/cost-analyzer-helm-chart/blob/v2.7.0/cost-analyzer/values-eks-cost-monitoring.yaml
- Hide bell by default by @jessegoodier in #3974
- Fix a cluster controller issue with timeouts on service startup when trying to reach the Kubecost API.
- Make continuous request sizing work on a secondary cluster that reaches the Kubecost API URL with a service token.
- Clean up many noisy logs in cluster controller.
- Fix a bug where help text for request sizing quantile algorithm showed percentages as a fraction of 1% instead of a fraction of 100%.
- Update the network card on the allocation details page to indicate that the card is primary-cluster-only. Previously, this was confusing because the rest of the page is multi-cluster.
- Fix an issue where filtering to a specific cluster on the Savings page was not possible.
- Fix an issue where in some cases, the Bell icon in the upper-right corner of pages was shown, even in environments that had disabled it.
- Fix an issue where the Reserved Instances page would crash due to an API change.
- All documentation links updated to reference IBM documentation site.
- Fix an issue where clicking through to a cluster from the Clusters page to Allocations would populate a deprecated parameter in the link.
- Fix a bug where toggling to a different cluster from the Cluster Detail page would not update the Budgets widget.
Other changes
- Several images are now based on UBI-9 minimal.
- Add text to the Settings page discouraging use of Share Idle By Cluster in Collections. The setting is deprecated and will be removed in a future release.
- Remove the "New" tag from External costs and Efficiency Reports navigation items.
Helm Changes
- Do not put the Kubecost logo behind auth by @nealormsbee in #3837
- Fix teams enablement condition by @kaelanspatel in #3855
- Add options to set additional nginx headers by @nealormsbee in #3817
- Remove thread limit on read database by @biancaburtoiu in #3845
- Fix repo for kubecost-modeling by @thomasvn in #3880
- Update kubecost.yaml and cost-analyzer/values.yaml to target v0.1.22 of the kubecost-modeling image by @nealormsbee in #3879
- Add category/chargeback nginx routes by @nickcurie in #3900
- [KCM-3289] Collections 3.0: Add /query/chargeback/total and /timeseries by @biancaburtoiu in #3899
- [KCM-3366] Restrict CORS headers by @nealormsbee in #3904
- Fix teams configmap from values by @kaelanspatel in #3908
- nginx routing for category sharing endpoints by @nickcurie in #3906
- KCM-3462: allow user to provide custom labels to append each provider's default nodegroup label using helm to identify nodegroups in their setup by @avrodrigues5 in #3901
- bump kubecost-network-costs 0.17.8, bump kubecost-modeling 0.1.23 by @cliffcolvin in #3912
- Bump cluster-controller to v0.16.14 by @thomasvn in #3913
- Bump prometheus/prometheus from v3.2.0 to v3.2.1 in /cost-analyzer by @dependabot in #3915
- Bump grafana/grafana from 11.5.1 to 11.5.2 in /cost-analyzer by @dependabot in #3914
- Teams config example by @jessegoodier in #3910
- Set runAsUser=0 for network-costs daemonset by @thomasvn in #3916
- Default-enable Multi Cluster Diagnostics by @thomasvn in #3397 and #3933
- Request Sizing Profiles saved on PV and Expose GET, POST, and DELETE Operations with Values from Helm by @avrodrigues5 in #3922
- Remove unnecessary nginx route after code changes in KCM branch by @avrodrigues5 in #3937
- Add master API key for specific api endpoints (likely not needed for most use cases) by @kaelanspatel in #3925
- Remove GCP Marketplace install from Helm chart by @thomasvn in #3941
- Remove tags from eks values file by @jessegoodier in #3958
- Replace docs links with IBM in #3971
- Remove hostPort from cluster-controller in #3975
Full Changelog: v2.6.5...v2.7.0
OpenCost
- [#3061] Remove node-label allowlist, so that all node labels can be tied to workloads.
- [#3080] Add Network Insights to OpenCost.
- [#3085] Fix duplicate labels error in emitted metrics.
- [#3086] Add metric collectors for promless storage.
Image List
- gcr.io/kubecost1/cost-model:prod-2.7.0
- gcr.io/kubecost1/frontend:prod-2.7.0
- gcr.io/kubecost1/cluster-controller:v0.16.16
- gcr.io/kubecost1/kubecost-modeling:v0.1.24
- gcr.io/kubecost1/kubecost-network-costs:v0.17.9
- ghcr.io/kiwigrid/k8s-sidecar:1.30.2
- grafana/grafana:11.5.2
- prom/node-exporter:v1.9.0
- quay.io/prometheus-operator/prometheus-config-reloader:v0.81.0
- quay.io/prometheus/alertmanager:v0.28.1
- quay.io/prometheus/prometheus:v3.2.1
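
One way to confirm that an upgraded environment matches the image list above is to compare the image references actually in use against it. The following is a minimal sketch, not part of the release: the function names `split_image` and `find_mismatches` are hypothetical, and it assumes you have already collected the running image references as plain strings (for example from your cluster tooling). Digest-pinned references (`@sha256:...`) are not handled.

```python
# Expected image references, copied from the Image List above.
EXPECTED_IMAGES = [
    "gcr.io/kubecost1/cost-model:prod-2.7.0",
    "gcr.io/kubecost1/frontend:prod-2.7.0",
    "gcr.io/kubecost1/cluster-controller:v0.16.16",
    "gcr.io/kubecost1/kubecost-modeling:v0.1.24",
    "gcr.io/kubecost1/kubecost-network-costs:v0.17.9",
    "ghcr.io/kiwigrid/k8s-sidecar:1.30.2",
    "grafana/grafana:11.5.2",
    "prom/node-exporter:v1.9.0",
    "quay.io/prometheus-operator/prometheus-config-reloader:v0.81.0",
    "quay.io/prometheus/alertmanager:v0.28.1",
    "quay.io/prometheus/prometheus:v3.2.1",
]


def split_image(ref: str) -> tuple[str, str]:
    """Split 'repository:tag' into (repository, tag); tag is '' if absent."""
    name, sep, tag = ref.rpartition(":")
    # A ':' followed by '/' belongs to a registry port (host:5000/img), not a tag.
    if not sep or "/" in tag:
        return ref, ""
    return name, tag


def find_mismatches(running: list[str]) -> dict[str, tuple[str, str]]:
    """Return {repository: (running_tag, expected_tag)} where the tags differ.

    Images that are not in the expected list are ignored.
    """
    expected = dict(split_image(ref) for ref in EXPECTED_IMAGES)
    mismatches = {}
    for ref in running:
        repo, tag = split_image(ref)
        if repo in expected and expected[repo] != tag:
            mismatches[repo] = (tag, expected[repo])
    return mismatches


if __name__ == "__main__":
    # Hypothetical input: one stale image, one current image, one unrelated image.
    running = [
        "gcr.io/kubecost1/cost-model:prod-2.6.5",
        "grafana/grafana:11.5.2",
        "nginx:1.27",
    ]
    for repo, (have, want) in find_mismatches(running).items():
        print(f"{repo}: running {have}, expected {want}")
```

An empty result means every listed image found in the input carries the tag from this release; it does not prove that all listed images are deployed.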