Bug Fixes
🐞 Unquoted cluster name could cause deployment failures 🐞
Fixes a bug raised in #193 where some cluster names that helm interprets as non-strings could cause deployment failures.
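As an illustration of the failure mode: YAML, and therefore helm, parses a name made up only of digits as a number unless it is quoted. The sketch below assumes the chart reads the name from a top-level clusterName value. That key name is an assumption and is not confirmed by these notes, so check the chart's values.yaml.

```yaml
# values.yaml (sketch; the clusterName key is an assumption)
# Unquoted, YAML would parse this value as the integer 12345:
#   clusterName: 12345
# Quoted, it is always read as a string:
clusterName: "12345"
```

On the command line, helm's --set-string flag forces a value to be treated as a string in the same way.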
🐞 Fixes typo with trn2u.48xlarge for neuron-monitor 🐞
Fixes a typo in the nodeSelector values for neuron-monitor with trn2u.48xlarge.
Enhancements
💡 Manage tolerations at a component level 💡
You can now specify tolerations for each of the individual components deployed by the helm chart.
The daemonsets (cloudwatch-agent, aws-for-fluent-bit, dcgm-exporter, neuron-monitor) deployed by the helm chart default to the root level tolerations, where they tolerate all taints.
The deployments (amazon-cloudwatch-agent-operator) specify empty tolerations at the component level to override the default at the root level.
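A minimal values sketch of how a root-level default and component-level overrides could fit together. The component key names (dcgmExporter, manager) are hypothetical placeholders, so check the chart's values.yaml for the real ones. The toleration fields themselves are standard Kubernetes.

```yaml
# Sketch only: dcgmExporter and manager are assumed key names.

# Root-level default: a toleration with no key and operator Exists
# matches every taint.
tolerations:
  - operator: Exists

dcgmExporter:
  # Component-level override: tolerate only the GPU taint.
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule

manager:
  # An empty list overrides the root-level default, so this
  # deployment tolerates no taints.
  tolerations: []
```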
💡 Manage affinity and nodeSelectors at a component level 💡
You can now specify the affinity and nodeSelectors for the individual components deployed by the helm chart.
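For example, a component could be pinned to specific nodes along these lines. The neuronMonitor key is a hypothetical placeholder for the component's section in values.yaml. The nodeSelector and affinity fields and the node labels are standard Kubernetes.

```yaml
# Sketch only: neuronMonitor is an assumed key name.
neuronMonitor:
  # Simple label match: schedule only on Linux nodes.
  nodeSelector:
    kubernetes.io/os: linux
  # More expressive rule: require a specific instance type.
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node.kubernetes.io/instance-type
                operator: In
                values:
                  - trn2u.48xlarge
```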
💡 Manage updateStrategy at a component level for cloudwatch-agent and aws-for-fluent-bit 💡
You can now specify the updateStrategy for cloudwatch-agent and aws-for-fluent-bit deployed by the helm chart.
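A sketch of what this could look like in a values file. The agent and fluentBit key names are hypothetical placeholders for the two components. The updateStrategy fields are the standard Kubernetes DaemonSet ones.

```yaml
# Sketch only: agent and fluentBit are assumed key names.
agent:
  updateStrategy:
    # Replace pods automatically, at most one unavailable node at a time.
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1

fluentBit:
  updateStrategy:
    # Replace a pod only after it has been deleted manually.
    type: OnDelete
```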
Component Versions
- cloudwatch-agent - v1.30055.0b1095
- aws-for-fluent-bit - v2.32.5.20250327
- dcgm-exporter - 3.3.9-3.6.1-ubuntu22.04
- neuron-monitor - v1.4.0
- adot-autoinstrumentation-java - v2.10.0
- adot-autoinstrumentation-python - v0.9.0
- adot-autoinstrumentation-dotnet - v1.7.0
- adot-autoinstrumentation-node - v0.6.0