-
Notifications
You must be signed in to change notification settings - Fork 24
[PULSE-223] feat: Implement metrics collection for Plane Enterprise #174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
- Added metrics configuration to values.yaml, enabling metrics collection and telemetry. - Introduced ClusterRole and ClusterRoleBinding for metrics scraping in service-account.yaml. - Created metrics-config ConfigMap for OpenTelemetry agent configuration. - Added metrics-agent deployment template to manage the OpenTelemetry collector. - Implemented helper function for generating installation UUID in _helpers.tpl. This update enhances observability by integrating metrics collection capabilities into the Plane Enterprise Helm chart.
WalkthroughThis pull request adds OpenTelemetry (OTEL) metrics and telemetry configuration to the Helm chart. A new metrics configuration block is introduced in values.yaml with telemetry endpoints and options. The configuration is conditionally passed to the application through environment variables in the app-env template. Additionally, the silo secret template is updated to decode base64-encoded HMAC keys before applying fallback defaults. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
- Increased the batch timeout in values.yaml from 60s to 300s for improved metrics publishing frequency. - Added license domain to metrics-config.yaml to enhance configuration clarity. These changes optimize metrics collection and provide better configuration management for the Plane Enterprise Helm chart.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (2)
charts/plane-enterprise/values.yaml (1)
399-429: Telemetry defaults and installation UUID expectationsThe new
metricsblock looks structurally fine, but two behavioral points are worth calling out:
metrics.enabled: truetogether with a non‑empty defaulttelemetry.http_endpointmeans that, out of the box, clusters will start exporting metrics (including installation + cluster metadata) to Plane’s SaaS endpoint. That’s a product/compliance decision rather than a bug, but it’s worth double‑checking this is intentional and clearly documented for operators who expect “no external telemetry by default.”- The comment
uuid: "" # Auto-generated if emptyrelies on the helper in_helpers.tpl. As implemented, that helper generates a fresh UUID on each template invocation when unset, so the “installation UUID” is not truly stable unless users explicitly provide one. You may want to either (a) require users to set an explicit UUID in values, or (b) implement a stable generation mechanism (e.g., vialookup+Secret or a deterministic hash) so the behavior matches the comment.charts/plane-enterprise/templates/service-account.yaml (1)
12-67: Cluster‑scoped RBAC: consider configurability / documentationThe new
ClusterRole+ClusterRoleBindinglook reasonable for a metrics agent, but they are cluster‑scoped and created automatically whenevermetrics.enabledis true.Many operators prefer to manage cluster‑scoped RBAC separately. Consider either:
- adding a
metrics.rbac.create(or similar) flag to allow disabling in‑chart RBAC, or- clearly documenting that enabling metrics will create these cluster‑level roles/bindings and that cluster‑admin privileges are required for installation.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
charts/plane-enterprise/templates/_helpers.tpl(1 hunks)charts/plane-enterprise/templates/config-secrets/metrics-config.yaml(1 hunks)charts/plane-enterprise/templates/service-account.yaml(1 hunks)charts/plane-enterprise/templates/workloads/metrics-agent.deployment.yaml(1 hunks)charts/plane-enterprise/values.yaml(1 hunks)
🧰 Additional context used
🪛 YAMLlint (1.37.1)
charts/plane-enterprise/templates/workloads/metrics-agent.deployment.yaml
[warning] 90-90: too many blank lines (1 > 0)
(empty-lines)
[error] 1-1: syntax error: expected the node content, but found '-'
(syntax)
charts/plane-enterprise/templates/config-secrets/metrics-config.yaml
[warning] 163-163: too many blank lines (1 > 0)
(empty-lines)
[error] 1-1: syntax error: expected the node content, but found '-'
(syntax)
🔇 Additional comments (2)
charts/plane-enterprise/templates/workloads/metrics-agent.deployment.yaml (1)
57-63: I'll verify theregexReplaceAllfunction signature and behavior in Helm/Sprig to confirm whether the original review comment's analysis is correct.
<function_calls>
Helm Sprig regexReplaceAll function signature arguments order
</function_calls>I'll also search the repository for examples of
regexReplaceAllusage to understand the actual patterns:
<function_calls>
shell
#!/bin/bashSearch for existing regexReplaceAll usages in the repository
rg "regexReplaceAll" -n -B2 -A2 --type yaml --type yaml
</function_calls>charts/plane-enterprise/templates/config-secrets/metrics-config.yaml (1)
86-94: The review comment's claim aboutregexReplaceAllargument order is incorrect.After verifying the Sprig template function documentation, the correct signature for
regexReplaceAllis:regexReplaceAll "<pattern>" "<input>" "<replacement>"The actual code at lines 92-93 follows this signature correctly:
- Line 92:
regexReplaceAll "Mi|Gi" (.Values.metrics.agent.memoryLimit | default "256Mi") ""
- Pattern:
"Mi|Gi"- Input:
(.Values.metrics.agent.memoryLimit | default "256Mi")- Replacement:
""This correctly removes the "Mi" or "Gi" suffix from the memory limit value (e.g.,
"256Mi"→"256"), which is the intended behavior.The review comment incorrectly claims the signature is
(regex, replacement, src)and proposes a "fix" that would actually swap the input and replacement arguments, breaking the template. The original code is correct as written.Likely an incorrect or invalid review comment.
charts/plane-enterprise/templates/config-secrets/metrics-config.yaml
Outdated
Show resolved
Hide resolved
|
Linked to Plane Work Item(s) This comment was auto-generated by Plane |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (1)
charts/plane-enterprise/templates/config-secrets/metrics-config.yaml (1)
112-123: ** Fix OTLP/HTTP endpoint format and TLS certificate verification.**This was flagged in previous review. The endpoint transformation (lines 113–117) strips the scheme and path, then appends
:443, producing an invalid format for OTLP/HTTP exporter.OTLP/HTTP requires a full URL with scheme and path (e.g.,
https://v2.metrics.plane.so:443/v1/traces), not a host:port string.Additionally,
tls.insecure: true(line 123) disables TLS certificate verification by default—a security downgrade for external telemetry export. The TODO comment (line 122) acknowledges this is incomplete.Proposed fix:
- Preserve the scheme in the input URL or reconstruct it (default to
https://if missing).- Preserve the path (
/v1/tracesor/v1/metrics) in the endpoint URL.- Add a config option
metrics.telemetry.tlsInsecure(defaultfalse) to allow opt-in certificate verification bypass.Example reconstruction:
{{- $urlWithScheme := .Values.metrics.telemetry.http_endpoint }} {{- if not (hasPrefix "https://" $urlWithScheme) }} {{- if not (hasPrefix "http://" $urlWithScheme) }} {{- $urlWithScheme = printf "https://%s" $urlWithScheme }} {{- end }} {{- end }} endpoint: {{ $urlWithScheme }} tls: insecure: {{ .Values.metrics.telemetry.tlsInsecure | default false }}Then add to values.yaml:
telemetry: http_endpoint: "https://v2.metrics.plane.so/v1/traces" tlsInsecure: false # Set to true only for development/testing headers: {}
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
charts/plane-enterprise/templates/config-secrets/metrics-config.yaml(1 hunks)charts/plane-enterprise/values.yaml(1 hunks)
🧰 Additional context used
🪛 YAMLlint (1.37.1)
charts/plane-enterprise/templates/config-secrets/metrics-config.yaml
[warning] 165-165: too many blank lines (1 > 0)
(empty-lines)
[error] 1-1: syntax error: expected the node content, but found '-'
(syntax)
🔇 Additional comments (1)
charts/plane-enterprise/values.yaml (1)
400-429: No inconsistency exists between values.yaml and metrics-config.yaml for batch timeout.The review comment misunderstands Helm template default behavior. In
metrics-config.yamlline 89, the syntax{{ .Values.metrics.agent.batch.timeout | default "60s" }}means the "60s" is a fallback DEFAULT, used only if.Values.metrics.agent.batch.timeoutis not provided. Sincevalues.yamlline 426 explicitly setsbatch.timeout: "300s", this value overrides the template default. The 300s will be applied—not 60s. This is correct Helm behavior and represents intentional configuration override, not an inconsistency.The first concern about
metrics.enabled: truebeing a default is a valid operational consideration for product review (potential impact on cluster load and telemetry data retention), though it's a design choice rather than a technical issue.Likely an incorrect or invalid review comment.
charts/plane-enterprise/templates/config-secrets/metrics-config.yaml
Outdated
Show resolved
Hide resolved
charts/plane-enterprise/templates/config-secrets/metrics-config.yaml
Outdated
Show resolved
Hide resolved
- Introduced a new field `push_interval_minutes` in values.yaml for configuring the telemetry push interval. - Updated app-env.yaml to conditionally set the `METRICS_PUSH_INTERVAL_MINUTES` environment variable based on the new configuration. These changes enhance the flexibility of metrics collection in the Plane Enterprise Helm chart.
- Deleted the metrics agent configuration from values.yaml, including the agent settings and deployment template. - Removed the associated metrics-config.yaml file, which contained the OpenTelemetry agent configuration. These changes streamline the Helm chart by eliminating the metrics agent, simplifying the configuration for users.
- Introduced the `OTEL_EXPORTER_OTLP_PROTOCOL` environment variable with a default value of "http/protobuf" to enhance OpenTelemetry configuration for metrics telemetry. This change improves the flexibility and compatibility of metrics collection in the Plane Enterprise Helm chart.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
charts/plane-enterprise/templates/config-secrets/app-env.yaml (1)
85-85: Consider making OTEL_EXPORTER_OTLP_PROTOCOL configurable.The protocol is hardcoded to
"http/protobuf". While this is a sensible default, consider exposing it as a configurable value in values.yaml (e.g.,metrics.telemetry.protocol) to support environments that may require alternative OTLP protocols (gRPC, JSON, etc.) in the future.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
charts/plane-enterprise/templates/config-secrets/app-env.yaml(1 hunks)charts/plane-enterprise/values.yaml(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- charts/plane-enterprise/values.yaml
🔇 Additional comments (2)
charts/plane-enterprise/templates/config-secrets/app-env.yaml (2)
84-84: Assess impact of empty default for OTLP_ENDPOINT.When
metrics.enabled=truebutmetrics.telemetry.http_endpointis not provided in values.yaml,OTLP_ENDPOINTwill be set to an empty string. Verify whether:
- The application gracefully handles a missing/empty OTLP endpoint, or
- This represents a misconfiguration that should fail fast with a validation error in values.yaml
83-89: Conditional metrics configuration structure is sound.The metrics block correctly:
- Gates all metrics variables behind the
metrics.enabledflag- Follows established Helm templating patterns used elsewhere in the file (compare lines 13-17, 27-33, 71-73)
- Uses defensive conditional for optional
METRICS_PUSH_INTERVAL_MINUTES
- Updated values.yaml to provide detailed comments on the installation UUID configuration for better clarity on its usage and persistence. - Improved the installation UUID generation logic in _helpers.tpl to ensure consistent UUIDs across Helm renders and provide fallback mechanisms. - Introduced a new Secret template (installation-uuid.yaml) to store the installation UUID, ensuring it persists across upgrades and can be referenced securely in workloads. These changes enhance the telemetry correlation capabilities and improve the overall user experience when configuring metrics in the Plane Enterprise Helm chart.
…erprise Helm chart - Modified values.yaml to update the primary telemetry endpoint and add a new v2 telemetry endpoint for improved metrics collection. - Updated app-env.yaml to include the new OTLP v2 endpoint for better OpenTelemetry configuration. - Adjusted silo.yaml to decode the HMAC secret key from base64, ensuring proper handling of sensitive data. These changes enhance the telemetry capabilities and improve the security of secret management in the Plane Enterprise Helm chart.
- Removed the installation UUID configuration from values.yaml and associated comments for clarity. - Deleted the installation-uuid Secret template to streamline the Helm chart and eliminate unnecessary complexity. - Updated the _helpers.tpl file to remove the installation UUID generation logic, simplifying the template structure. These changes enhance the overall maintainability of the Plane Enterprise Helm chart by reducing redundancy in UUID management.
…ated ClusterRole and ClusterRoleBinding from service-account.yaml - Modified values.yaml to include the full path for telemetry endpoints, ensuring proper configuration for OTLP exporters. - Removed the ClusterRole and ClusterRoleBinding definitions related to metrics scraping from service-account.yaml to streamline the Helm chart. These changes enhance the clarity of telemetry configurations and simplify the service account setup in the Plane Enterprise Helm chart.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
charts/plane-enterprise/README.md(1 hunks)charts/plane-enterprise/values.yaml(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- charts/plane-enterprise/values.yaml
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
charts/plane-enterprise/README.md
497-497: Bare URL used
(MD034, no-bare-urls)
498-498: Bare URL used
(MD034, no-bare-urls)
🔇 Additional comments (1)
charts/plane-enterprise/README.md (1)
492-508: Documentation section is well‑structured and appropriately detailed.The new "Metrics and Telemetry Configuration" section follows the existing documentation style and provides clear guidance on metrics configuration. The table format is consistent with other configuration sections, and the notes section helpfully explains path requirements, data scope, and behavior when disabled.
| SILO_HMAC_SECRET_KEY: {{ .Values.env.silo_envs.hmac_secret_key | quote }} | ||
| {{- else if (lookup "v1" "Secret" .Release.Namespace (printf "%s-silo-secrets" .Release.Name)) }} | ||
| SILO_HMAC_SECRET_KEY: {{ (lookup "v1" "Secret" .Release.Namespace (printf "%s-silo-secrets" .Release.Name)).data.SILO_HMAC_SECRET_KEY | default (randAlphaNum 32) | quote }} | ||
| SILO_HMAC_SECRET_KEY: {{ (lookup "v1" "Secret" .Release.Namespace (printf "%s-silo-secrets" .Release.Name)).data.SILO_HMAC_SECRET_KEY | b64dec | default (randAlphaNum 32) | quote }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This update enhances observability by integrating metrics collection capabilities into the Plane Enterprise Helm chart.
Description
Type of Change
Screenshots and Media (if applicable)
Test Scenarios
References
Summary by CodeRabbit
New Features
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.