Skip to content

Conversation

@jaypoulz
Copy link

@jaypoulz jaypoulz commented Oct 21, 2025

Introduces tnf.etcd.openshift.io/v1alpha1 API group with PacemakerStatus custom resource. This provides visibility into Pacemaker cluster health for dual-replica etcd deployments. The status-only resource is populated by a privileged controller and consumed by the cluster-etcd-operator healthcheck controller. Not gated because it's only used by CEO when two-node has transitioned.

Works in conjunction with openshift/cluster-etcd-operator#1487

Depends on kubernetes-sigs/controller-tools#1324, which in turn needs a bump for go 1.25
In the meantime, I've vendored: openshift/kubernetes-sigs-controller-tools#33

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Oct 21, 2025
@openshift-ci-robot
Copy link

openshift-ci-robot commented Oct 21, 2025

@jaypoulz: This pull request references OCPEDGE-2084 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Details

In response to this:

Introduces tnf.etcd.openshift.io/v1alpha1 API group with PacemakerStatus custom resource. This provides visibility into Pacemaker cluster health for dual-replica etcd deployments. The status-only resource is populated by a privileged controller and consumed by the cluster-etcd-operator healthcheck controller. Gated by DualReplica feature and managed by two-node-fencing component.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 21, 2025

Hello @jaypoulz! Some important instructions when contributing to openshift/api:
API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

@openshift-ci openshift-ci bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Oct 21, 2025
@openshift-ci openshift-ci bot added the do-not-merge/invalid-owners-file Indicates that a PR should not merge because it has an invalid OWNERS file in it. label Oct 21, 2025
@openshift-ci-robot
Copy link

openshift-ci-robot commented Oct 21, 2025

@jaypoulz: This pull request references OCPEDGE-2084 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Details

In response to this:

Introduces tnf.etcd.openshift.io/v1alpha1 API group with PacemakerStatus custom resource. This provides visibility into Pacemaker cluster health for dual-replica etcd deployments. The status-only resource is populated by a privileged controller and consumed by the cluster-etcd-operator healthcheck controller. Gated by DualReplica feature and managed by two-node-fencing component.

Works in conjunction with openshift/cluster-etcd-operator#1487

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot removed the do-not-merge/invalid-owners-file Indicates that a PR should not merge because it has an invalid OWNERS file in it. label Oct 21, 2025
@jaypoulz jaypoulz force-pushed the OCPEDGE-2084 branch 4 times, most recently from 2ba442d to 29b9fec Compare October 21, 2025 23:56
@saschagrunert
Copy link
Member

@jaypoulz thank you for the PR, do you mind making the CI happy?

@jaypoulz
Copy link
Author

Hi @saschagrunert :) Working on it! :D
New to this repo so working through beginner challenges 😸

@jaypoulz
Copy link
Author

A few open questions I have:

  1. This is a config object of a sort. It's created by cluster-etcd-operator only when you have a two-node cluster and only for the purposes of gathering information about the health of pacemaker (our ha tool) from the nodes. I put it in etcd/tnf (two node fencing) because it seemed sensible. But I'm not sure if it needs to be in config.

That said, it doesn't work like a normal config - there's no spec and it shouldn't be created during bootstrap. The CRD just needs to be present when the CEO runs an cronjob to post an update to it.

  1. bash hack/update-protobuf.sh failed for me because it wanted the path to be installed in my go path. That said, cursor happily runs it and copies over the files without issue. I'm just skeptical of the zz_generated files, but I assume those are verified by CI?

  2. For the non-boolean enum fields. Should I be creating static string definitions that can be exported to CEO? How do I generate those?

@jaypoulz jaypoulz force-pushed the OCPEDGE-2084 branch 2 times, most recently from b0ff230 to 1b57b09 Compare October 22, 2025 16:59
@openshift-ci openshift-ci bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Oct 22, 2025
@jaypoulz jaypoulz force-pushed the OCPEDGE-2084 branch 4 times, most recently from b9b727f to fdd53e9 Compare October 22, 2025 20:37
@saschagrunert
Copy link
Member

saschagrunert commented Oct 23, 2025

Yeah, I'll ignore the CI failures for now, running ./hack/update-codegen.sh locally also gives me a diff in openapi/generated_openapi/zz_generated.openapi.go. 🙃

A few open questions I have:

  1. This is a config object of a sort. It's created by cluster-etcd-operator only when you have a two-node cluster and only for the purposes of gathering information about the health of pacemaker (our ha tool) from the nodes. I put it in etcd/tnf (two node fencing) because it seemed sensible. But I'm not sure if it needs to be in config.

I'm new to API review, but my gut feeling tells me that a dedicated etcd API group sounds fine for that purpose.

That said, it doesn't work like a normal config - there's no spec and it shouldn't be created during bootstrap. The CRD just needs to be present when the CEO runs an cronjob to post an update to it.

  1. bash hack/update-protobuf.sh failed for me because it wanted the path to be installed in my go path. That said, cursor happily runs it and copies over the files without issue. I'm just skeptical of the zz_generated files, but I assume those are verified by CI?

You can also try to run it in a container by make verify-with-container.

  1. For the non-boolean enum fields. Should I be creating static string definitions that can be exported to CEO? How do I generate those?

Do you mind elaborating on that? Do you mean generating the code for the unions?

API docs ref: https://github.com/openshift/enhancements/blob/master/dev-guide/api-conventions.md#writing-a-union-in-go


@jaypoulz is there an OpenShift enhancement available for this change?

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@etcd/v1alpha1/types_pacemakercluster.go`:
- Around line 452-454: Update the kubebuilder XValidation marker on the
PacemakerClusterStatus type (the comment above PacemakerClusterStatus) to guard
the comparison against missing oldSelf.lastUpdated; replace the rule that
currently reads "self.lastUpdated >= oldSelf.lastUpdated" with one that first
checks for the absence of oldSelf.lastUpdated (e.g., "!has(oldSelf.lastUpdated)
|| self.lastUpdated >= oldSelf.lastUpdated") so the initial write (when
oldSelf.lastUpdated is not present) passes validation.

In
`@payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-TechPreviewNoUpgrade.crd.yaml`:
- Around line 157-175: The NodeAddress `address` property (inside the
`addresses` array) lacks IP format validation even though docs require "valid
global unicast IPv4 or IPv6 in canonical form" and exclusion of
unspecified/loopback/link-local/multicast; add an x-kubernetes-validations rule
(or XValidation) to enforce canonical IPv4/IPv6 and reject reserved ranges on
the `address` property (or at the `addresses` array level) similar to existing
DNS1123Subdomain/resource name validations so the CRD and corresponding Go tags
(e.g., the NodeAddress struct/address field) consistently enforce the
constraint.
♻️ Duplicate comments (3)
etcd/README.md (1)

1-36: Address documentation duplication concern.

A past reviewer noted this README duplicates content in the v1alpha1 directory. Consider consolidating to a single source of truth to avoid maintenance drift.

Options:

  1. Keep this file as a high-level overview with a pointer to v1alpha1 for details
  2. Move detailed documentation to v1alpha1/README.md and keep this minimal
openapi/generated_openapi/zz_generated.openapi.go (1)

26435-26485: Outstanding validation constraints from prior review.

This is a generated file. Past review comments have identified that documented validation requirements (minItems for conditions arrays, maxItems for nodes, CEL for lastUpdated monotonicity) are not enforced in the schema. Those concerns remain valid and should be addressed in the source Go types (etcd/v1alpha1/types_pacemakercluster.go) before regenerating.

No new issues identified beyond those already raised.

Also applies to: 26538-26644, 26646-26689, 26691-26755

payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-TechPreviewNoUpgrade.crd.yaml (1)

589-591: The lastUpdated validation rule will fail on first status update due to missing guard.

The CEL rule self.lastUpdated >= oldSelf.lastUpdated lacks the necessary guard for when oldSelf.lastUpdated is not yet set. On the initial status update, this comparison will fail because oldSelf will not have a lastUpdated field.

The codebase already uses this guard pattern elsewhere (line 434: !has(oldSelf.status) || has(self.status)), confirming the developers are aware of it. Add the same guard here:

Fix required
         x-kubernetes-validations:
         - message: lastUpdated cannot be set to an earlier timestamp
-          rule: self.lastUpdated >= oldSelf.lastUpdated
+          rule: '!has(oldSelf.lastUpdated) || self.lastUpdated >= oldSelf.lastUpdated'
🧹 Nitpick comments (2)
payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-CustomNoUpgrade.crd.yaml (1)

100-104: Consider quoting Unknown enum value for consistency.

The status enum uses mixed quoting: "True" and "False" are quoted while Unknown is not. While functionally equivalent (YAML parses all as strings), consistent quoting improves API schema clarity.

🔧 Suggested fix
               enum:
               - "True"
               - "False"
-              - Unknown
+              - "Unknown"
               type: string

This same pattern appears in all condition status fields throughout the CRD (node conditions, resource conditions, fencing agent conditions).

etcd/v1alpha1/types_pacemakercluster.go (1)

534-545: Add IP validation for addresses field to enforce documented constraints.

The documentation (lines 538–541) specifies that addresses must be "valid global unicast IPv4 or IPv6 address in canonical form," but no validation currently enforces this. Kubebuilder's CEL library provides the necessary functions (isIP, ip().isCanonical(), ip().isGlobalUnicast()).

Consider adding validation to the field:

// +kubebuilder:validation:items:XValidation:rule="isIP(self.Address) && ip(self.Address).isCanonical() && ip(self.Address).isGlobalUnicast()",message="address must be a valid global unicast IPv4 or IPv6 address in canonical form"

This validates each item in the Addresses list and ensures compliance with the documented requirements.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 3a62978 and 1b358fb.

⛔ Files ignored due to path filters (83)
  • etcd/v1alpha1/zz_generated.crd-manifests/0000_25_etcd_01_pacemakerclusters-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • etcd/v1alpha1/zz_generated.crd-manifests/0000_25_etcd_01_pacemakerclusters-DevPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • etcd/v1alpha1/zz_generated.crd-manifests/0000_25_etcd_01_pacemakerclusters-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • etcd/v1alpha1/zz_generated.featuregated-crd-manifests/pacemakerclusters.etcd.openshift.io/DualReplica.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • tools/go.sum is excluded by !**/*.sum
  • tools/vendor/k8s.io/gengo/v2/Makefile is excluded by !**/vendor/**
  • tools/vendor/k8s.io/gengo/v2/generator/execute.go is excluded by !**/vendor/**
  • tools/vendor/k8s.io/gengo/v2/generator/import_tracker.go is excluded by !**/vendor/**
  • tools/vendor/k8s.io/gengo/v2/namer/namer.go is excluded by !**/vendor/**
  • tools/vendor/k8s.io/gengo/v2/parser/parse.go is excluded by !**/vendor/**
  • tools/vendor/k8s.io/gengo/v2/parser/parse_122.go is excluded by !**/vendor/**
  • tools/vendor/k8s.io/gengo/v2/parser/parse_pre_122.go is excluded by !**/vendor/**
  • tools/vendor/modules.txt is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/arrayofstruct/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/commentstart/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/conditions/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/conflictingmarkers/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/defaultorrequired/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/defaultorrequired/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/defaultorrequired/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/defaults/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/defaults/config.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/defaults/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/defaults/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/dependenttags/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/dependenttags/config.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/dependenttags/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/dependenttags/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/duplicatemarkers/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/forbiddenmarkers/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/helpers/inspector/inspector.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/helpers/markers/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/initializer/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/integers/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/integers/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/jsontags/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/maxlength/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/minlength/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/minlength/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/minlength/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/namingconventions/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/namingconventions/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/nobools/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/nobools/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/nodurations/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/nofloats/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/nomaps/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/nomaps/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/nonpointerstructs/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/nonpointerstructs/config.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/nonpointerstructs/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/nonpointerstructs/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/noreferences/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/noreferences/config.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/noreferences/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/noreferences/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/optionalfields/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/optionalfields/config.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/optionalfields/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/optionalfields/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/optionalorrequired/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/preferredmarkers/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/preferredmarkers/config.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/preferredmarkers/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/preferredmarkers/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/requiredfields/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/requiredfields/config.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/ssatags/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/ssatags/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/statusoptional/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/statusoptional/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/statussubresource/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/uniquemarkers/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/uniquemarkers/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/utils/serialization/config.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/utils/serialization/serialization_check.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/utils/serialization/util.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/utils/type_check.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/utils/utils.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/utils/zero_value.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/markers/markers.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/plugin/base/base.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/registration/doc.go is excluded by !**/vendor/**
📒 Files selected for processing (10)
  • .golangci.yaml
  • etcd/README.md
  • etcd/v1alpha1/types_pacemakercluster.go
  • etcd/v1alpha1/zz_generated.swagger_doc_generated.go
  • openapi/generated_openapi/zz_generated.openapi.go
  • openapi/openapi.json
  • payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-CustomNoUpgrade.crd.yaml
  • payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-DevPreviewNoUpgrade.crd.yaml
  • payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-TechPreviewNoUpgrade.crd.yaml
  • tools/go.mod
🚧 Files skipped from review as they are similar to previous changes (2)
  • .golangci.yaml
  • etcd/v1alpha1/zz_generated.swagger_doc_generated.go
🔇 Additional comments (30)
tools/go.mod (1)

28-28: Both dependency versions are verified to exist in their respective upstream repositories:

  • k8s.io/gengo/v2 v2.0.0-20250922181213-ec3ebc5fd46b exists (Sept 22, 2025)
  • sigs.k8s.io/kube-api-linter v0.0.0-20260114104534-18147eee9c49 exists (Jan 14, 2026)

Both are legitimate versions sourced from official Kubernetes repositories. No issues identified.

openapi/openapi.json (8)

4575-4587: AcceptRisk type looks correct.

The new AcceptRisk definition is well-structured with the required name field. The description mentions a 256-character limit which should be enforced via validation markers in the source Go types.


5875-5886: Conditional update risk tracking additions look good.

The conditionalUpdateRisks, riskNames, and related conditions fields are well-structured with appropriate list-type annotations (map for keyed entries, set for unique values).


6348-6354: TLS profile description updates are consistent.

The description changes across TLS-related types provide cleaner formatting and more consistent documentation.

Also applies to: 8536-8539, 8814-8817, 9744-9747, 11322-11353


11639-11650: Risk acceptance flow is well-designed.

The acceptRisks field in Update spec complements the conditionalUpdateRisks in status, creating a clean spec/status pattern for cluster administrators to acknowledge specific risks.


14627-14652: PacemakerCluster root type is well-defined.

The singleton cluster-scoped resource design is appropriate for representing global pacemaker state. The Compatibility level 4 disclaimer clearly communicates the alpha API stability expectations.


14653-14682: Fencing agent separation is a good design choice.

Separating PacemakerClusterFencingAgentStatus from regular resources and mapping fencing agents to their target node (rather than where monitoring runs) correctly models STONITH semantics.


14713-14770: Node status structure is comprehensive.

The PacemakerClusterNodeStatus type captures:

  • Node identity (name, addresses)
  • Health conditions (9 types covering online, active, fencing states)
  • Per-node resources (Kubelet, Etcd)
  • Fencing agent mapping

The use of NodeAddress from core/v1 for the addresses field provides consistency with the Kubernetes Node API.


14797-14835: Cluster-level status provides clear health aggregation.

The PacemakerClusterStatus type with its lastUpdated timestamp and nodes list gives operators a single point to assess cluster health. The conditions (Healthy, InService, NodeCountAsExpected) provide meaningful top-level indicators for two-node fencing scenarios.

payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-CustomNoUpgrade.crd.yaml (4)

1-18: LGTM - CRD metadata and spec header.

The CRD metadata correctly specifies the feature set, includes appropriate OpenShift annotations, and follows naming conventions for cluster-scoped resources.


137-143: LGTM - Nodes array design allows catastrophic failure reporting.

The nodes array intentionally has no minItems, allowing empty lists during catastrophic failure states where no node information is available. The maxItems: 32 correctly matches Pacemaker's upper limit.


553-571: LGTM - Resource enum and validation.

The resource name enum restricts values to Kubelet and Etcd, with CEL validation ensuring both must be present. This enforces the Two Node OpenShift with Fencing requirements at the schema level.


589-599: LGTM - Top-level validations ensure data integrity.

The CEL validations correctly enforce:

  1. Timestamp monotonicity (allows equal timestamps for idempotent updates)
  2. Singleton semantics (must be named "cluster")
  3. Status immutability (cannot be removed once set)
etcd/README.md (3)

38-89: LGTM - Comprehensive status structure documentation.

The YAML example clearly illustrates the hierarchical status structure and required fields at each level. The documentation of addresses usage aligns with the CRD description.


91-143: LGTM - Clear fencing agent semantics documentation.

The distinction between FencingAvailable (at least one agent healthy) and FencingHealthy (all agents healthy) is well-documented, with clear operational implications for each state.


197-210: LGTM - Linter exclusions are well-documented.

The explanations for golangci-yaml exclusions provide valuable context for future maintainers, clarifying why certain rules conflict with the intentional design patterns (status subresource, required conditions when status present).

payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-DevPreviewNoUpgrade.crd.yaml (2)

1-10: LGTM - DevPreviewNoUpgrade variant is consistent.

The CRD correctly specifies DevPreviewNoUpgrade feature set while maintaining identical schema structure to the CustomNoUpgrade variant.


589-603: LGTM - Validation rules match CustomNoUpgrade variant.

Top-level validations (timestamp monotonicity, singleton name, status immutability) are correctly synchronized with the CustomNoUpgrade CRD.

openapi/generated_openapi/zz_generated.openapi.go (3)

547-552: LGTM!

The new etcd/v1alpha1 types are correctly registered in the OpenAPI definitions map following the established pattern.


26392-26433: LGTM - Schema structure is correct for status-only CRD.

The PacemakerCluster schema correctly declares metadata as required while making status optional, which aligns with the PR objective of a status-only resource populated by a privileged controller.


26487-26536: LGTM!

Standard list type schema following Kubernetes conventions.

payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-TechPreviewNoUpgrade.crd.yaml (3)

1-10: LGTM: CRD metadata and annotations are well-configured.

The CRD properly declares:

  • API approval reference
  • Feature-gate merge indicator
  • Release inclusion annotations for IBM Cloud and self-managed HA
  • TechPreviewNoUpgrade feature-set restriction

55-129: LGTM: Cluster-level conditions with thorough validation.

The conditions array properly:

  • Uses map list semantics with type as the key
  • Enforces minimum 3 / maximum 8 items
  • Requires Healthy, InService, and NodeCountAsExpected via CEL rules
  • Includes all required fields per metav1.Condition schema

593-603: LGTM: Root-level validations enforce singleton semantics correctly.

The object-level validations properly enforce:

  • Singleton naming (metadata.name == 'cluster')
  • Status immutability once set (!has(oldSelf.status) || has(self.status))
  • Status subresource enablement
etcd/v1alpha1/types_pacemakercluster.go (6)

469-477: Clarify: +required markers with omitempty JSON tags.

Several fields are marked +required but use omitempty in their JSON tags (e.g., Conditions, LastUpdated, Nodes). While this is a valid pattern in Kubernetes APIs (required in schema, omit empty in marshaling), it can cause confusion. Based on past review discussion, this appears intentional for status fields that are populated after initial creation.


11-74: LGTM: Cluster-level condition types and reasons are well-documented.

The condition type constants and their corresponding reasons are clearly documented with:

  • Expected True/False states and their meanings
  • Aggregation relationships (Healthy aggregates other conditions)
  • Context-specific explanations for Two Node OpenShift with Fencing

418-450: LGTM: PacemakerCluster type declaration with comprehensive markers.

The type properly declares:

  • Cluster-scoped resource with status subresource
  • Singleton enforcement via CEL validation (metadata.name == 'cluster')
  • Status immutability once set
  • Feature-gate control (DualReplica)
  • Compatibility level 4 (no stability guarantees)

576-621: LGTM: PacemakerClusterFencingAgentStatus is well-structured.

The fencing agent status type properly:

  • Documents STONITH concepts and target node mapping
  • Requires all 8 condition types via MinItems and XValidation
  • Validates name (1-253 chars) and method (1-63 chars) lengths

623-667: LGTM: PacemakerClusterResourceStatus is well-structured.

The resource status type properly:

  • Explains pacemaker resource concepts
  • Distinguishes between Kubelet and Etcd resources
  • Uses enum validation for resource names
  • Requires all 8 condition types

669-685: LGTM: PacemakerClusterList for API conventions.

The list type correctly:

  • Explains singleton semantics in documentation
  • Uses proper TypeMeta and ListMeta embedding
  • Satisfies Kubernetes API list type requirements

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@etcd/v1alpha1/types_pacemakercluster.go`:
- Around line 535-546: The Addresses field's comment requires each entry be a
canonical global-unicast IP but there is no CEL/XValidation enforcing that; add
a kubebuilder XValidation annotation on the Addresses slice to validate each
element (targeting the Addresses []corev1.NodeAddress declaration) with a rule
similar to self.all(addr, isIP(addr.address) &&
ip(addr.address).isGlobalUnicast()) and an appropriate message, or alternatively
edit the comment above the Addresses field to remove the strict
canonical/global-unicast claims so doc and validation match; ensure you
reference the Addresses field and update the surrounding kubebuilder tags
(+kubebuilder:validation:XValidation or the comment) accordingly.
♻️ Duplicate comments (2)
payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-TechPreviewNoUpgrade.crd.yaml (1)

159-177: IP address validation gap previously flagged.

The missing validation on NodeAddress.address was already identified in past review comments for this file. The same recommendation applies: add CEL validation to enforce the documented IP format requirements.

openapi/generated_openapi/zz_generated.openapi.go (1)

26615-26635: Consider adding maxItems constraint for fencingAgents.

The description states "up to 8 are supported for redundancy" (line 26625), but no maxItems: 8 constraint is enforced in the schema. This is similar to the nodes array issue already flagged in past review comments.

Since this is a generated file, add the constraint in the source Go type:

// +kubebuilder:validation:MaxItems=8
FencingAgents []PacemakerClusterFencingAgentStatus `json:"fencingAgents"`
🧹 Nitpick comments (2)
payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-CustomNoUpgrade.crd.yaml (1)

159-177: Consider adding IP address validation to NodeAddress.address.

The documentation (lines 156-158) states addresses must be "valid global unicast IPv4 or IPv6 address in canonical form" excluding loopback, link-local, and multicast, but the schema only specifies type: string without enforcement.

Consider adding CEL validation to the address property to enforce the documented constraints. Example:

address:
  description: The node address.
  type: string
  x-kubernetes-validations:
  - message: address must be a valid global unicast IP in canonical form
    rule: isIP(self) && ip(self).isCanonical() && ip(self).isGlobalUnicast()

This would align the schema with the documented requirements and prevent invalid data from being stored.

openapi/openapi.json (1)

4575-4587: Consider adding schema validation constraints for documented limits.

The name field description states it "must not exceed 256 characters" but the schema lacks a maxLength: 256 constraint. Since this is a generated file, ensure the source Go type has the appropriate kubebuilder validation marker (+kubebuilder:validation:MaxLength=256) so the constraint is enforced at admission time and reflected in the OpenAPI schema.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 1b358fb and 789e55d.

⛔ Files ignored due to path filters (83)
  • etcd/v1alpha1/zz_generated.crd-manifests/0000_25_etcd_01_pacemakerclusters-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • etcd/v1alpha1/zz_generated.crd-manifests/0000_25_etcd_01_pacemakerclusters-DevPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • etcd/v1alpha1/zz_generated.crd-manifests/0000_25_etcd_01_pacemakerclusters-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • etcd/v1alpha1/zz_generated.featuregated-crd-manifests/pacemakerclusters.etcd.openshift.io/DualReplica.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • tools/go.sum is excluded by !**/*.sum
  • tools/vendor/k8s.io/gengo/v2/Makefile is excluded by !**/vendor/**
  • tools/vendor/k8s.io/gengo/v2/generator/execute.go is excluded by !**/vendor/**
  • tools/vendor/k8s.io/gengo/v2/generator/import_tracker.go is excluded by !**/vendor/**
  • tools/vendor/k8s.io/gengo/v2/namer/namer.go is excluded by !**/vendor/**
  • tools/vendor/k8s.io/gengo/v2/parser/parse.go is excluded by !**/vendor/**
  • tools/vendor/k8s.io/gengo/v2/parser/parse_122.go is excluded by !**/vendor/**
  • tools/vendor/k8s.io/gengo/v2/parser/parse_pre_122.go is excluded by !**/vendor/**
  • tools/vendor/modules.txt is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/arrayofstruct/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/commentstart/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/conditions/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/conflictingmarkers/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/defaultorrequired/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/defaultorrequired/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/defaultorrequired/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/defaults/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/defaults/config.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/defaults/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/defaults/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/dependenttags/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/dependenttags/config.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/dependenttags/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/dependenttags/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/duplicatemarkers/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/forbiddenmarkers/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/helpers/inspector/inspector.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/helpers/markers/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/initializer/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/integers/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/integers/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/jsontags/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/maxlength/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/minlength/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/minlength/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/minlength/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/namingconventions/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/namingconventions/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/nobools/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/nobools/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/nodurations/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/nofloats/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/nomaps/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/nomaps/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/nonpointerstructs/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/nonpointerstructs/config.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/nonpointerstructs/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/nonpointerstructs/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/noreferences/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/noreferences/config.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/noreferences/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/noreferences/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/optionalfields/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/optionalfields/config.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/optionalfields/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/optionalfields/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/optionalorrequired/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/preferredmarkers/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/preferredmarkers/config.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/preferredmarkers/doc.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/preferredmarkers/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/requiredfields/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/requiredfields/config.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/ssatags/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/ssatags/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/statusoptional/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/statusoptional/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/statussubresource/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/uniquemarkers/analyzer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/uniquemarkers/initializer.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/utils/serialization/config.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/utils/serialization/serialization_check.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/utils/serialization/util.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/utils/type_check.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/utils/utils.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/analysis/utils/zero_value.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/markers/markers.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/plugin/base/base.go is excluded by !**/vendor/**
  • tools/vendor/sigs.k8s.io/kube-api-linter/pkg/registration/doc.go is excluded by !**/vendor/**
📒 Files selected for processing (10)
  • .golangci.yaml
  • etcd/README.md
  • etcd/v1alpha1/types_pacemakercluster.go
  • etcd/v1alpha1/zz_generated.swagger_doc_generated.go
  • openapi/generated_openapi/zz_generated.openapi.go
  • openapi/openapi.json
  • payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-CustomNoUpgrade.crd.yaml
  • payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-DevPreviewNoUpgrade.crd.yaml
  • payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-TechPreviewNoUpgrade.crd.yaml
  • tools/go.mod
🚧 Files skipped from review as they are similar to previous changes (3)
  • tools/go.mod
  • etcd/README.md
  • etcd/v1alpha1/zz_generated.swagger_doc_generated.go
🔇 Additional comments (31)
.golangci.yaml (1)

104-132: Well-documented linter exclusions for the status-only CRD pattern.

The exclusions are appropriate given the API design constraints:

  • Status subresource requires the Status field to be optional, but inner fields must be required when status is populated
  • Conditions, resources, and fencingAgents must be complete when present, which conflicts with Kubernetes conventions for optional fields

The regex patterns are specific to types_pacemakercluster.go and each exclusion has clear rationale in the comments.

payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-CustomNoUpgrade.crd.yaml (2)

1-28: CRD structure and singleton enforcement look correct.

The CRD properly enforces:

  • Cluster scope with singleton pattern (metadata.name == 'cluster')
  • Status immutability via transition rule
  • Feature-set annotation for CustomNoUpgrade
  • Compatibility level 4 documentation

The API approval annotation correctly references this PR.


592-602: Validation rules are well-designed.

The CEL rules properly handle:

  • First status creation (guarded with !has(oldSelf.lastUpdated))
  • Monotonic timestamp updates (self.lastUpdated >= oldSelf.lastUpdated)
  • Singleton enforcement (self.metadata.name == 'cluster')
  • Status immutability (!has(oldSelf.status) || has(self.status))
payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-TechPreviewNoUpgrade.crd.yaml (2)

1-10: TechPreviewNoUpgrade variant properly configured.

The CRD is correctly annotated with release.openshift.io/feature-set: TechPreviewNoUpgrade while maintaining the same schema as the CustomNoUpgrade variant.


592-606: Schema and validation rules are consistent with CustomNoUpgrade variant.

The validation logic and subresource configuration are correct and identical to the CustomNoUpgrade variant, ensuring consistent behavior across feature sets.

openapi/openapi.json (17)

5875-5886: LGTM - conditional update risks schema structure is correct.

The map-based list type with name key is appropriate for deduplication. Note: the description mentions a 500-entry limit; ensure the source type includes +kubebuilder:validation:MaxItems=500 if server-side enforcement is desired.


6094-6102: LGTM - set semantics are appropriate for unique risk names.


6129-6140: Verify maxItems constraint for conditions array.

The description states this array "must not contain more than one entry" for the Applies condition. Ensure the source type includes +kubebuilder:validation:MaxItems=1 if this constraint should be enforced at admission.


6351-6351: LGTM - TLS documentation improvements.


8537-8538: LGTM.


8815-8816: LGTM.


9745-9746: LGTM.


11323-11323: LGTM.


11334-11353: LGTM - TLS profile specifications are well-documented.

The cipher suites and version constraints are clearly documented for each profile type.


11639-11650: LGTM - acceptRisks field structure is correct.

The map-based list with name key allows efficient lookups and deduplication of accepted risks.


11685-11686: LGTM.


14627-14652: LGTM - PacemakerCluster root definition is well-structured.

The status-only resource pattern (no spec field) is appropriate for an informational CRD populated by a controller. The singleton constraint ("cluster" instance) is documented in the description.


14653-14682: LGTM - fencing agent status structure captures STONITH semantics correctly.

The separation of fencing agents from regular resources (mapped to target node rather than execution node) is a key Pacemaker concept that's well-documented here.


14683-14712: LGTM - standard list type for API conventions.


14713-14771: LGTM - node status schema is comprehensive.

The use of io.k8s.api.core.v1.NodeAddress for addresses provides consistency with the Kubernetes Node API. The separation of resources and fencingAgents reflects Pacemaker's distinct handling of these concepts.


14772-14796: LGTM - resource status schema is clear.

The documentation clearly explains that valid resource names are "Kubelet" and "Etcd", with fencing agents tracked separately.


14797-14835: LGTM - cluster status schema captures operational state effectively.

The required fields (conditions, lastUpdated, nodes) and the monotonic timestamp constraint for lastUpdated are appropriate for status reliability. The 32-node upper limit aligns with Pacemaker's architectural constraint.

payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-DevPreviewNoUpgrade.crd.yaml (1)

1-606: Well-designed CRD with comprehensive validations.

The PacemakerCluster CRD is well-structured:

  • Singleton enforcement via CEL rule ensures only one instance named "cluster" can exist
  • Status monotonicity validation prevents timestamp regression
  • Required condition types are enforced via CEL exists rules
  • List types use appropriate strategies (map with keys for conditions/nodes/resources, atomic for addresses)
  • Resource names are constrained to valid values via enum

The enum quoting pattern (quoted "True"/"False", unquoted Unknown) follows upstream Kubernetes metav1.Condition conventions—YAML requires quoting for True/False to prevent boolean interpretation, while Unknown doesn't need it.

openapi/generated_openapi/zz_generated.openapi.go (3)

547-552: LGTM - OpenAPI type registrations are correctly wired.

The new etcd/v1alpha1 types are properly registered in the definitions map with consistent naming conventions matching other API groups.


26392-26433: Well-structured status-only API definition.

The PacemakerCluster schema correctly models a status-only resource: metadata is required while status is optional (populated via the status subresource). The compatibility level 4 designation and singleton pattern are appropriately documented.


26487-26536: LGTM - Standard list type implementation.

The PacemakerClusterList follows Kubernetes API conventions correctly with required items and proper references.

etcd/v1alpha1/types_pacemakercluster.go (5)

577-668: LGTM!

The PacemakerClusterFencingAgentStatus and PacemakerClusterResourceStatus structs are well-designed with:

  • Consistent condition validation patterns (8 required condition types with XValidation rules)
  • Appropriate field constraints for names and methods
  • Good use of the PacemakerClusterResourceName enum for type-safe resource names

670-686: LGTM!

Standard Kubernetes list type implementation with appropriate documentation explaining the singleton nature of PacemakerCluster.


8-75: Well-designed cluster-level conditions with comprehensive documentation.

The condition types and reason constants are clearly documented with expected states, aggregation relationships, and operational context. Exporting these constants will help CEO maintain consistency when populating and interpreting the status.


418-450: Singleton enforcement and status immutability look good.

The CEL validation rules correctly enforce:

  • Singleton naming (self.metadata.name == 'cluster')
  • Status cannot be removed once set (!has(oldSelf.status) || has(self.status))

The feature gate DualReplica and compatibility level 4 are appropriate for this alpha API.


452-492: Status validation rules are correct.

The lastUpdated monotonicity validation properly handles the initial creation case with !has(oldSelf.lastUpdated) guard. The MinItems=0 for nodes is appropriately documented to handle catastrophic failure states.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

@jaypoulz
Copy link
Author

The verification of this patch depends on https://github.com/kubernetes-sigs/controller-tools/pull/1324/files
The validation order for the conditions are not guaranteed to be in order. This patches fixes that.

@openshift-ci-robot
Copy link

openshift-ci-robot commented Jan 15, 2026

@jaypoulz: This pull request references OCPEDGE-2084 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Introduces tnf.etcd.openshift.io/v1alpha1 API group with PacemakerStatus custom resource. This provides visibility into Pacemaker cluster health for dual-replica etcd deployments. The status-only resource is populated by a privileged controller and consumed by the cluster-etcd-operator healthcheck controller. Not gated because it's only used by CEO when two-node has transitioned.

Works in conjunction with openshift/cluster-etcd-operator#1487

Depends on kubernetes-sigs/controller-tools#1324

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Copy link

openshift-ci-robot commented Jan 15, 2026

@jaypoulz: This pull request references OCPEDGE-2084 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Introduces tnf.etcd.openshift.io/v1alpha1 API group with PacemakerStatus custom resource. This provides visibility into Pacemaker cluster health for dual-replica etcd deployments. The status-only resource is populated by a privileged controller and consumed by the cluster-etcd-operator healthcheck controller. Not gated because it's only used by CEO when two-node has transitioned.

Works in conjunction with openshift/cluster-etcd-operator#1487

Depends on kubernetes-sigs/controller-tools#1324, which in turn needs a bump for go 1.25
In the meantime, I've vendored: https://github.com/jaypoulz/controller-tools/tree/strict-validation-ordering-v0.18

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Copy link

openshift-ci-robot commented Jan 15, 2026

@jaypoulz: This pull request references OCPEDGE-2084 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Introduces tnf.etcd.openshift.io/v1alpha1 API group with PacemakerStatus custom resource. This provides visibility into Pacemaker cluster health for dual-replica etcd deployments. The status-only resource is populated by a privileged controller and consumed by the cluster-etcd-operator healthcheck controller. Not gated because it's only used by CEO when two-node has transitioned.

Works in conjunction with openshift/cluster-etcd-operator#1487

Depends on kubernetes-sigs/controller-tools#1324, which in turn needs a bump for go 1.25
In the meantime, I've vendored: openshift/kubernetes-sigs-controller-tools#33

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@jaypoulz jaypoulz force-pushed the OCPEDGE-2084 branch 2 times, most recently from f4b651f to 1d19aca Compare January 15, 2026 22:22
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@etcd/README.md`:
- Around line 85-89: Update the comment for fencingAgents -> conditions to match
the earlier clarification for resources: in the fencingAgents entry (symbol:
fencingAgents and the nested key conditions) replace the terse comment ("#
Required: Same 8 conditions as resources") with the expanded clarification used
for resources explaining that the conditions array must contain the same eight
named condition entries (not an omitted/empty placeholder) and show that the
empty "[]" in the example is only an abbreviated/placeholder form in the README;
mirror the exact wording/format used for resources so readers understand they
must supply the full condition list for each fencing agent.
- Around line 83-84: The example shows the Etcd resource with an empty
conditions array which contradicts the documented requirement of a minimum of 8
items; update the example entry for the "Etcd" resource so its "conditions"
field either contains a representative list of the required 8 (or more)
condition objects or replace the empty array with a clear inline comment like "#
abbreviated; show full 8+ conditions in real manifest" to avoid implying an
empty list is valid.
♻️ Duplicate comments (2)
etcd/README.md (1)

1-9: Address documentation duplication with v1alpha1 directory.

A past reviewer noted this README may duplicate content that exists in the v1alpha1 directory. Consider consolidating to a single location to avoid maintenance drift—either keep detailed docs here and reference from v1alpha1, or vice versa.

openapi/generated_openapi/zz_generated.openapi.go (1)

26692-26756: Outstanding concern: CEL validation for lastUpdated monotonicity and MaxItems for nodes.

The schema description documents that lastUpdated cannot go backwards and nodes is limited to 32, but the generated schema doesn't show enforcement via x-kubernetes-validations (CEL) or MaxItems. This was flagged in a previous review and remains unaddressed.

🧹 Nitpick comments (7)
etcd/v1alpha1/tests/pacemakerclusters.etcd.openshift.io/DualReplica.yaml (1)

682-1102: IP validation tests cover key cases; consider expanding IPv6 coverage.

The loopback and invalid format rejection tests are good. For completeness, consider adding tests for:

  • Valid IPv6 address (e.g., 2001:db8::1)
  • IPv6 loopback (::1)
  • Link-local addresses (169.254.x.x, fe80::)
  • Non-canonical IPv6 (e.g., 2001:0db8::1 with leading zeros)

These can be added in a follow-up if the validation rules for these cases are already implemented in the CRD schema.

openapi/openapi.json (5)

4575-4587: Add length validation for AcceptRisk.name.

The description mandates non-empty and ≤256 chars, but the schema only has type: string. Consider enforcing minLength/maxLength via kubebuilder validation on the source type and regenerate the OpenAPI.

Proposed OpenAPI shape (generated)
         "name": {
           "description": "name is the name of the acceptable risk. It must be a non-empty string and must not exceed 256 characters.",
-          "type": "string"
+          "type": "string",
+          "minLength": 1,
+          "maxLength": 256
         }

5875-5886: Enforce the stated 500-entry limit for conditionalUpdateRisks.

The description says the list must not exceed 500 entries, but there’s no maxItems. Consider adding it in the source type and regenerating.

Proposed OpenAPI shape (generated)
         "conditionalUpdateRisks": {
           "description": "conditionalUpdateRisks contains the list of risks associated with conditionalUpdates. ... conditionalUpdateRisks must not contain more than 500 entries.",
           "type": "array",
+          "maxItems": 500,
           "items": {
             "default": {},
             "$ref": "#/definitions/com.github.openshift.api.config.v1.ConditionalUpdateRisk"
           },

6094-6102: Align riskNames constraints with the doc.

The description states max 500 entries and each name ≤256 chars, but the schema doesn’t enforce either. Consider adding maxItems and maxLength (on items) in the source type.

Proposed OpenAPI shape (generated)
         "riskNames": {
           "description": "riskNames represents the set of the names ... Entries must be unique and must not exceed 256 characters. riskNames must not contain more than 500 entries.",
           "type": "array",
+          "maxItems": 500,
           "items": {
             "type": "string",
-            "default": ""
+            "default": "",
+            "maxLength": 256
           },
           "x-kubernetes-list-type": "set"
         },

6129-6140: Cap conditions to a single entry as described.

The doc says “must not contain more than one entry,” but there’s no maxItems. Consider enforcing this in the source type and regenerating.

Proposed OpenAPI shape (generated)
         "conditions": {
           "description": "conditions represents the observations ... conditions must not contain more than one entry.",
           "type": "array",
+          "maxItems": 1,
           "items": {
             "default": {},
             "$ref": "#/definitions/io.k8s.apimachinery.pkg.apis.meta.v1.Condition"
           },

11639-11650: Enforce the 1000-entry limit for acceptRisks.

The description requires ≤1000 entries, but the schema doesn’t specify maxItems. Consider adding it in the source type and regenerating.

Proposed OpenAPI shape (generated)
         "acceptRisks": {
           "description": "acceptRisks is an optional set ... acceptRisks must not contain more than 1000 entries. Entries in this list must be unique.",
           "type": "array",
+          "maxItems": 1000,
           "items": {
             "default": {},
             "$ref": "#/definitions/com.github.openshift.api.config.v1.AcceptRisk"
           },
openapi/generated_openapi/zz_generated.openapi.go (1)

26758-26784: Consider adding an enum constraint for the type field.

The description states only InternalIP is currently supported, but there's no enum constraint to enforce this. For a status-only API written by a trusted controller this is low risk, but adding an enum in the source types would provide schema-level documentation and future-proof the API.

💡 Suggested fix in source type (etcd/v1alpha1/types_pacemakercluster.go)
// +kubebuilder:validation:Enum=InternalIP
Type string `json:"type"`

Then regenerate with ./hack/update-codegen.sh.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between f4b651f and 1d19aca.

⛔ Files ignored due to path filters (5)
  • etcd/v1alpha1/zz_generated.crd-manifests/0000_25_etcd_01_pacemakerclusters-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • etcd/v1alpha1/zz_generated.crd-manifests/0000_25_etcd_01_pacemakerclusters-DevPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • etcd/v1alpha1/zz_generated.crd-manifests/0000_25_etcd_01_pacemakerclusters-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/*
  • etcd/v1alpha1/zz_generated.crd-manifests/doc.go is excluded by !**/zz_generated.crd-manifests/*
  • etcd/v1alpha1/zz_generated.featuregated-crd-manifests/pacemakerclusters.etcd.openshift.io/DualReplica.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
📒 Files selected for processing (18)
  • .golangci.yaml
  • etcd/README.md
  • etcd/install.go
  • etcd/v1alpha1/Makefile
  • etcd/v1alpha1/doc.go
  • etcd/v1alpha1/register.go
  • etcd/v1alpha1/tests/pacemakerclusters.etcd.openshift.io/DualReplica.yaml
  • etcd/v1alpha1/types_pacemakercluster.go
  • etcd/v1alpha1/zz_generated.deepcopy.go
  • etcd/v1alpha1/zz_generated.featuregated-crd-manifests.yaml
  • etcd/v1alpha1/zz_generated.swagger_doc_generated.go
  • hack/update-payload-crds.sh
  • install.go
  • openapi/generated_openapi/zz_generated.openapi.go
  • openapi/openapi.json
  • payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-CustomNoUpgrade.crd.yaml
  • payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-DevPreviewNoUpgrade.crd.yaml
  • payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-TechPreviewNoUpgrade.crd.yaml
🚧 Files skipped from review as they are similar to previous changes (6)
  • hack/update-payload-crds.sh
  • etcd/v1alpha1/doc.go
  • etcd/install.go
  • etcd/v1alpha1/zz_generated.featuregated-crd-manifests.yaml
  • .golangci.yaml
  • etcd/v1alpha1/register.go
🧰 Additional context used
🧬 Code graph analysis (4)
install.go (4)
etcd/install.go (1)
  • Install (17-17)
etcd/v1alpha1/register.go (1)
  • Install (14-14)
apiextensions/install.go (1)
  • Install (17-17)
apiextensions/v1alpha1/register.go (1)
  • Install (14-14)
etcd/v1alpha1/zz_generated.deepcopy.go (1)
etcd/v1alpha1/types_pacemakercluster.go (7)
  • PacemakerCluster (465-480)
  • PacemakerClusterStatus (486-522)
  • PacemakerClusterFencingAgentStatus (610-651)
  • PacemakerClusterList (706-715)
  • PacemakerClusterNodeStatus (526-604)
  • PacemakerNodeAddress (414-430)
  • PacemakerClusterResourceStatus (662-697)
etcd/v1alpha1/zz_generated.swagger_doc_generated.go (1)
etcd/v1alpha1/types_pacemakercluster.go (7)
  • PacemakerCluster (465-480)
  • PacemakerClusterFencingAgentStatus (610-651)
  • PacemakerClusterList (706-715)
  • PacemakerClusterNodeStatus (526-604)
  • PacemakerClusterResourceStatus (662-697)
  • PacemakerClusterStatus (486-522)
  • PacemakerNodeAddress (414-430)
etcd/v1alpha1/types_pacemakercluster.go (1)
etcd/v1alpha1/zz_generated.swagger_doc_generated.go (6)
  • PacemakerClusterStatus (75-77)
  • PacemakerClusterNodeStatus (54-56)
  • PacemakerNodeAddress (85-87)
  • PacemakerClusterResourceStatus (64-66)
  • PacemakerClusterFencingAgentStatus (31-33)
  • PacemakerCluster (20-22)
🪛 checkmake (0.2.2)
etcd/v1alpha1/Makefile

[warning] 1-1: Missing required phony target "all"

(minphony)


[warning] 1-1: Missing required phony target "clean"

(minphony)

🔇 Additional comments (45)
install.go (1)

58-58: LGTM!

The etcd package import and etcd.Install registration follow the established pattern for API group wiring. Alphabetical ordering is correctly maintained in both the import block and the schemeBuilder list.

Also applies to: 95-95

etcd/v1alpha1/Makefile (1)

1-3: LGTM - minimal test-focused Makefile is appropriate.

The focused test target correctly delegates to the central test suite. The static analysis warnings about missing all and clean targets are false positives—this Makefile intentionally provides only a test entry point for the v1alpha1 API.

Ensure a trailing newline is present at end of file to avoid editors auto-adding it (per past review feedback).

etcd/v1alpha1/tests/pacemakerclusters.etcd.openshift.io/DualReplica.yaml (2)

1-24: Solid test fixture foundation with good negative test coverage.

The onCreate tests properly validate the singleton naming constraint. The metadata and feature gate configuration are correct.


27-438: Comprehensive positive test case validates full status structure.

The full healthy cluster test exercises all required conditions at cluster, node, resource, and fencing agent levels. The expected block correctly mirrors the updated block, confirming the validation accepts properly structured status.

etcd/README.md (2)

21-36: Excellent documentation of status subresource design rationale.

The explanation of why atomic create-with-status wasn't feasible, along with the two-step operation flow, provides valuable context for future maintainers. This addresses the design questions raised in past review discussions.


107-142: Well-structured condition reference tables.

The condition tables clearly document the expected True/False states and corresponding reasons for cluster, node, and resource levels. This will be valuable for operators and developers working with the healthcheck controller.

openapi/openapi.json (8)

6351-6353: LGTM for the minTLSVersion example update.


8537-8538: LGTM for the Intermediate TLS profile description tweak.


8815-8816: LGTM for the Modern TLS profile description tweak.


9745-9746: LGTM for the Old TLS profile description tweak.


11323-11325: LGTM for the minTLSVersion example update.


11334-11352: LGTM for the TLS profile documentation expansion.


11685-11686: LGTM for the acceptedRisks description update.


14627-14858: LGTM for the new PacemakerCluster schema additions and detailed field docs.

openapi/generated_openapi/zz_generated.openapi.go (6)

547-553: LGTM!

The type registrations for the new etcd/v1alpha1 Pacemaker types follow the established pattern and maintain alphabetical ordering with adjacent entries.


26393-26434: LGTM!

The PacemakerCluster schema correctly defines a status-only resource with no spec field, aligning with the design for a singleton CRD populated via the status subresource. The compatibility level 4 designation appropriately signals alpha-level stability.


26436-26486: LGTM!

The fencing agent status schema properly models STONITH device status with conditions as a list-map keyed by type. The required fields (conditions, name, method) ensure complete status reporting.


26488-26537: LGTM!

Standard list type implementation following Kubernetes conventions.


26539-26645: LGTM!

The node status schema comprehensively models pacemaker node state including conditions, addresses, resources, and fencing agents. The use of atomic list for addresses (where the first address is used for etcd peer URLs) and list-maps for resources/fencing agents (keyed by name) follows appropriate Kubernetes patterns.


26647-26690: LGTM!

The resource status schema correctly models pacemaker resource state for Kubelet and Etcd resources.

etcd/v1alpha1/zz_generated.deepcopy.go (7)

13-42: Deepcopy logic for PacemakerCluster looks correct.


44-65: Deepcopy for PacemakerClusterFencingAgentStatus is correct.


67-98: PacemakerClusterList deepcopy methods look good.


100-140: PacemakerClusterNodeStatus deep copy correctly handles slices.


142-163: PacemakerClusterResourceStatus deepcopy looks correct.


165-194: PacemakerClusterStatus deepcopy aligns with fields.


196-209: PacemakerNodeAddress deepcopy is straightforward and correct.

etcd/v1alpha1/zz_generated.swagger_doc_generated.go (7)

14-22: PacemakerCluster Swagger docs match the type description.


24-33: Fencing agent Swagger docs align with fields.


35-43: PacemakerClusterList Swagger docs are consistent.


45-56: Node status Swagger docs look accurate.


58-66: Resource status Swagger docs look accurate.


68-76: Cluster status Swagger docs align with the schema.


79-87: Node address Swagger docs are consistent with the type.

etcd/v1alpha1/types_pacemakercluster.go (11)

10-73: Cluster condition types and reasons are consistent.


75-243: Node condition types and reasons are consistent.


245-399: Resource condition types and reasons are consistent.


401-430: Node address enum and validation look coherent with the docs.


432-446: Resource name enum values look correct.


448-480: PacemakerCluster singleton and status subresource annotations look good.


482-522: PacemakerClusterStatus schema and validations look aligned.


524-604: PacemakerClusterNodeStatus schema and validations look solid.


606-651: PacemakerClusterFencingAgentStatus schema looks correct.


653-697: PacemakerClusterResourceStatus schema looks correct.


699-715: PacemakerClusterList follows standard list conventions.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

@jaypoulz jaypoulz force-pushed the OCPEDGE-2084 branch 4 times, most recently from 72fdc82 to 715f512 Compare January 16, 2026 03:30
Comment on lines +102 to +105
// NoDataTypeChange is disabled because controller-tools now simplifies
// allOf enum structures to simple enums, changing the type field from
// "" (unset in allOf entries) to "string".
toRemove := []string{"NoBools", "NoFloats", "NoUints", "NoMaps", "ConditionsMustHaveProperSSATags", "NoDataTypeChange"}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@everettraven Are you happy that data type change is sufficiently covered by crdify now?

// +kubebuilder:validation:XValidation:rule="self.exists(c, c.type == 'Healthy')",message="conditions must contain a condition of type Healthy"
// +kubebuilder:validation:XValidation:rule="self.exists(c, c.type == 'InService')",message="conditions must contain a condition of type InService"
// +kubebuilder:validation:XValidation:rule="self.exists(c, c.type == 'NodeCountAsExpected')",message="conditions must contain a condition of type NodeCountAsExpected"
// +required
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is fine, and you will trigger a linter warning, which we will override. But this being required means that {} is not valid, so the linter should be telling you that you don't need a pointer. Can you try removing those linter exceptions and see if it does actually tell you to remove the pointer?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I made the change above before checking if the linter would correct me on it. Let me revert that and get you a definitive answer.

So - yes - I can confirm that the linter gave the right guidance as I purposely messed with PacemakerClusterStatus which eventually led to the line you suggested above.

// +required
Conditions []metav1.Condition `json:"conditions,omitempty"`

// name is the pacemaker resource name of the fencing agent (e.g., "master-0_redfish", "master-1_ipmi").
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's been a while, but, remind me, did we discuss validating this string? Is it always going to be a mixture of alphanumeric characters and some other punctuation? Would be good to limit it if possible

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember if we discussed this one.
So the good news is that we (cluster-etcd-operator) are the ones creating these fencing resources, so we control the name. The name structure is "_"
Right now, the only supported fencing device type is redfish. In the future, we'll likely add support for ipmi, and potential other, more hardware specific options.

All that said, let me look up the API for k8s node names to see what validations are in place.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nodes are named as DNS 1123 subdomain, so alphanumerics, hyphens and periods. If this has underscores too, there is a variation that allows those, though we'd have to use a CEL matches with the regex. Perhaps we start with that?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'll see if I can smash together some validation for {valid-node-name}_{valid-fencing-type}. We can always relax that validation down the line, but as long as we can support _redfish we should be good for now.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I err'ed on the side of explicitly checking for _redfish in the current design. Since other fencing device support is a future feature, we can look to relax this requirement in a future iteration.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 16, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🤖 Fix all issues with AI agents
In `@openapi/openapi.json`:
- Around line 5875-5886: The schema for conditionalUpdateRisks currently states
a 500-entry cap in its description but lacks an actual maxItems constraint;
update the OpenAPI schema by adding "maxItems": 500 to the
conditionalUpdateRisks array definition (the property named
conditionalUpdateRisks that references
com.github.openshift.api.config.v1.ConditionalUpdateRisk) or, if the cap is not
required, soften the description to remove the hard limit; after changing the
source type (where ConditionalUpdateRisk is defined) regenerate the openapi.json
so the validation is enforced.
- Around line 6129-6140: The "conditions" array schema currently states it "must
not contain more than one entry" but doesn't enforce this; update the OpenAPI
schema for the conditions property to add "maxItems: 1" (i.e., add maxItems: 1
alongside type/items/x-kubernetes-list-* entries for the "conditions" property
that references io.k8s.apimachinery.pkg.apis.meta.v1.Condition) so the
single-entry constraint is validated by tooling rather than only in the
description.
- Around line 4575-4586: The schema for AcceptRisk.name lacks an enforced
256-char limit; update the Go type for AcceptRisk by adding the kubebuilder
validation tag to the Name field (e.g., +kubebuilder:validation:MaxLength=256 on
the Name field in the AcceptRisk struct) and then run the OpenAPI generator
(rebuild the OpenAPI JSON) so openapi.json includes "maxLength": 256 for
AcceptRisk.name.
- Around line 11639-11649: The OpenAPI schema for the acceptRisks array does not
enforce the documented 1000-entry cap; update the OpenAPI source so the
acceptRisks property includes "maxItems": 1000 (in the same object that defines
"type": "array" and "items") for the definition that references
com.github.openshift.api.config.v1.AcceptRisk, then regenerate the openapi.json
so the maxItems constraint appears in the output.
- Around line 6094-6102: The OpenAPI schema for the riskNames array lacks the
limits described in its description; update the schema for the riskNames field
(the array named "riskNames" used by conditionalUpdateRisks /
spec.desiredUpdate.acceptRisks) to add "maxItems": 500 on the array and add
"maxLength": 256 on each string item (i.e., the items object for riskNames),
then regenerate the OpenAPI artifact so the documented constraints are enforced
in the JSON.

In
`@payload-manifests/crds/0000_10_config-operator_01_authentications-DevPreviewNoUpgrade.crd.yaml`:
- Around line 472-475: Replace the brittle regex validation on discoveryURL with
the same user-info check used by issuerURL: change the x-kubernetes-validations
rule that currently uses '!self.matches(''^https://.+:.+@.+/.*$'')' to use the
same expression as issuerURL (i.e., ensure the rule enforces self.find('@') ==
'' so any '@' in the URL is rejected); keep the existing "must be a valid URL"
validation intact.

In
`@payload-manifests/crds/0000_10_config-operator_01_authentications-TechPreviewNoUpgrade.crd.yaml`:
- Around line 473-482: The current rule with message "discoveryURL must not
contain user info" uses a regex that requires a trailing path and misses URLs
without one; replace that rule value so it detects userinfo in any HTTPS URL
(i.e. match any presence of user:pass@ or any '@' after the scheme) — update the
rule associated with the "discoveryURL must not contain user info" message to a
pattern that checks for user info regardless of path (e.g. match an '@'
following the https:// authority) so URLs like "https://user:[email protected]" are
rejected.

In
`@payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-DevPreviewNoUpgrade.crd.yaml`:
- Around line 141-142: The CRD's node list documentation and maxItems (the
descriptive text "up to 32 nodes" and the YAML field maxItems: 32) do not match
the Go types/Swagger cap of 5 (max control-plane); to fix, decide which limit is
correct and make them consistent: either update the CRD YAML to set maxItems: 5
and change the human-readable comment to reflect "up to 5 nodes" or update the
Go types/Swagger and regenerate CRDs if 32 is intended; ensure you update the
field that defines the node list (maxItems) and regenerate CRD manifests (or
update the Go struct/tag and swagger annotations) so the runtime and schema
match.
♻️ Duplicate comments (3)
payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-CustomNoUpgrade.crd.yaml (1)

173-176: Fix invalid CEL ip.isCanonical call (schema will fail validation).

The rule uses ip.isCanonical(self), which is invalid CEL; it should parse the string first and then call isCanonical() on the IP object. Please fix in the source type and regenerate all pacemakercluster CRD variants.

✅ Suggested fix
-                              rule: isIP(self) && ip.isCanonical(self) && ip(self).isGlobalUnicast()
+                              rule: isIP(self) && ip(self).isCanonical() && ip(self).isGlobalUnicast()
payload-manifests/crds/0000_10_config-operator_01_authentications-CustomNoUpgrade.crd.yaml (1)

472-475: Same discoveryURL user-info gap as DevPreview variant.

Please apply the same hardened check here (and any other generated variants) to keep schemas consistent.

openapi/openapi.json (1)

14667-14678: Cardinality limits are described but not enforced in schema.

Several fields (e.g., Line 14668 for fencing‑agent conditions, Line 14740 for node conditions, Line 14768 for resources, Line 14821 for cluster conditions, Line 14837 for nodes) describe required min/max counts, but the schema doesn’t include minItems/maxItems. If those limits are intended, please add the corresponding validation tags in the source types and regenerate; otherwise soften the wording.

Also applies to: 14739-14778, 14820-14837

@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 16, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@etcd/README.md`:
- Around line 50-83: The conditions shown under status, nodes, and resources
(e.g., the arrays named conditions under status, nodes->conditions, and
resources->conditions, and resource names like Kubelet and Etcd) are abbreviated
and may imply only the type field is required; update the README example to add
a short clarifying note immediately above or beside each conditions block
stating that condition objects are abbreviated for brevity and that full
condition objects must include required fields such as status,
lastTransitionTime/lastUpdated, reason, and message (and any cluster-specific
required fields), and explicitly call out that the example only shows the type
field for readability while the real schema requires the full condition
structure.

In
`@payload-manifests/crds/0000_10_config-operator_01_authentications-CustomNoUpgrade.crd.yaml`:
- Around line 479-482: The CEL rules for discoveryURL currently call url(self)
directly which can throw on invalid strings; update both rules to guard with
isURL(self) by changing the expressions to use isURL(self) &&
url(self).getQuery().size() == 0 and isURL(self) && url(self).getScheme() ==
'https' so the url() calls only run when isURL(self) is true; locate these
checks tied to the discoveryURL validation messages and update the rule
expressions accordingly.

In
`@payload-manifests/crds/0000_10_config-operator_01_authentications-TechPreviewNoUpgrade.crd.yaml`:
- Around line 479-482: The CEL rules for discoveryURL currently call url(self)
directly which can throw on invalid values; update the two rules to guard url()
with isURL(self) (e.g., replace rule: url(self).getQuery().size() == 0 and rule:
url(self).getScheme() == 'https' with guarded forms using isURL(self) &&
url(self)...). Locate the discoveryURL validation rules in this CRD and prepend
each url(...) call with isURL(self) && so evaluation errors are avoided.
♻️ Duplicate comments (4)
payload-manifests/crds/0000_10_config-operator_01_authentications-TechPreviewNoUpgrade.crd.yaml (1)

473-475: Fix discoveryURL user‑info detection to catch URLs without a path.

The current regex only matches when a trailing path exists, so values like https://user@host can bypass. Align with the issuerURL rule (or an authority-only pattern) to catch all user‑info cases.

🐛 Proposed fix
-                          - message: discoveryURL must not contain user info
-                            rule: '!self.matches(''^https://.+:.+@.+/.*$'')'
+                          - message: discoveryURL must not contain user info
+                            rule: self.find('@') == ''
payload-manifests/crds/0000_10_config-operator_01_authentications-CustomNoUpgrade.crd.yaml (1)

473-475: Fix discoveryURL user‑info detection to catch URLs without a path.

Same pattern as the TechPreview variant—https://user@host can slip through. Align with the issuerURL rule (or an authority-only pattern) to catch all user‑info cases.

🐛 Proposed fix
-                          - message: discoveryURL must not contain user info
-                            rule: '!self.matches(''^https://.+:.+@.+/.*$'')'
+                          - message: discoveryURL must not contain user info
+                            rule: self.find('@') == ''
payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-CustomNoUpgrade.crd.yaml (1)

171-175: Fix CEL IP canonical check syntax.

CEL requires ip(self).isCanonical(); ip.isCanonical(self) will fail validation. Since this manifest is generated, update the source Go marker and regenerate the CRDs.

🔧 Proposed fix
-                              rule: isIP(self) && ip.isCanonical(self) && ip(self).isGlobalUnicast()
+                              rule: isIP(self) && ip(self).isCanonical() && ip(self).isGlobalUnicast()
payload-manifests/crds/0000_10_config-operator_01_authentications-DevPreviewNoUpgrade.crd.yaml (1)

472-474: Discovery URL user-info check still misses user@host cases.

The regex only blocks user:pass@host/path; use the same self.find('@') == '' check as issuerURL.

🔧 Proposed fix
-                          - message: discoveryURL must not contain user info
-                            rule: '!self.matches(''^https://.+:.+@.+/.*$'')'
+                          - message: discoveryURL must not contain user info
+                            rule: self.find('@') == ''
🧹 Nitpick comments (2)
openapi/generated_openapi/zz_generated.openapi.go (1)

26730-26749: Minor inconsistency: nodes array items lack the Default property.

Other array items in this file (e.g., resources at line 26611, fencingAgents at line 26633) include Default: map[string]interface{}{}, but nodes items do not. This is likely a code generation quirk based on how the source type is defined.

Since nodes is a required field and this is a generated file, this has no functional impact, but if you want consistency, verify the source type definition matches other similar fields.

etcd/v1alpha1/types_pacemakercluster.go (1)

519-522: Consider using value slice instead of pointer slice.

Line 521 defines Nodes as []*PacemakerClusterNodeStatus (pointer slice), but typically Kubernetes APIs use value slices []PacemakerClusterNodeStatus. Pointer slices have several drawbacks:

  • Individual slice items can be nil, which has no clear semantic meaning
  • Adds complexity to DeepCopy implementations
  • The CRD schema doesn't indicate items can be null

If the intent is to allow the slice itself to be empty (which MinItems=0 already permits), a value slice with omitempty achieves the same goal.

Suggested change
-	Nodes []*PacemakerClusterNodeStatus `json:"nodes,omitempty"`
+	Nodes []PacemakerClusterNodeStatus `json:"nodes,omitempty"`

Update controller-tools to jaypoulz/controller-tools fork which includes
a fix for deterministic XValidation rule ordering. This ensures CRD
generation produces consistent output regardless of map iteration order.

Changes included:
- tools/go.mod: Point replace directive to jaypoulz/controller-tools
- tools/Makefile: Fix openapi-gen path (moved from code-generator to
  kube-openapi upstream)
- tools/codegen/pkg/crdify/generator.go: Ignore 'type' validation for
  allOf→enum schema simplification
- tools/codegen/pkg/schemacheck/generator.go: Disable NoDataTypeChange
  comparator for the same schema simplification

The new controller-tools generates simpler enum schemas without allOf
wrappers when both type and field have enum markers. This is functionally
equivalent but structurally different, requiring the crdify and
schemacheck adjustments to avoid false positive violations.

Upstream fix: kubernetes-sigs/controller-tools#1324
OpenShift carry: openshift/kubernetes-sigs-controller-tools#33

The OpenShift carry is a temporary backport until we can bump to Go 1.25,
which is required by the upstream controller-tools version with this fix.

Created with support from Claude Opus 4 (Anthropic)
Update kube-api-linter from v0.0.0-20251028144537 to v0.0.0-20260114104534.
This also bumps k8s.io/gengo/v2 to v2.0.0-20250922181213 as a transitive
dependency.

Created with support from Claude Opus 4 (Anthropic)
…hift with Fencing

Introduces etcd.openshift.io/v1alpha1 API group with a PacemakerCluster
custom resource. This provides visibility into pacemaker cluster health
for Two Node OpenShift with Fencing deployments.

This status-only resource reports the health of pacemaker nodes and their
managed resources (Kubelet, Etcd) along with fencing agent status. Each
node tracks conditions, IP addresses for etcd peer URLs, and per-resource
health using positive polarity conditions.

The cluster-etcd-operator is responsible for interpreting this status and
degrading appropriately when etcd is unhealthy or at risk of being unable
to recover automatically in a quorum-loss event.

Created with support from Claude Opus 4 (Anthropic)
@jaypoulz
Copy link
Author

Should be ready for another review. I had to fix some integration tests due to the update to validation ordering.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@openapi/openapi.json`:
- Around line 24952-24955: The schema for the property synchronizedAPI currently
only documents allowed values but doesn't enforce them; update the source type
that generates this OpenAPI (the struct/field that maps to synchronizedAPI) to
constrain it to an enum of "MachineAPI" and "ClusterAPI" and then regenerate the
OpenAPI spec, or if you prefer a softer change, update the description text to
say "expected values" instead of "Valid values"; ensure you apply this fix to
both occurrences of synchronizedAPI in the generated spec so the enforcement or
wording is consistent.

In
`@payload-manifests/crds/0000_10_config-operator_01_infrastructures-CustomNoUpgrade.crd.yaml`:
- Around line 275-279: The current CEL path validation rule using
url(self).getEscapedPath() is too restrictive (regex '^/(api/)?v[0-9]+/{0,1}$')
and rejects valid IBM Cloud endpoints with deeper path segments; update the rule
(the one paired with message: url path must match ...) to relax the regex so it
accepts an optional slash and any additional path segments after the version
(i.e., allow "/vN" or "/api/vN" followed by further path components), leaving
the scheme check url(self).getScheme() == "https" intact.
♻️ Duplicate comments (5)
payload-manifests/crds/0000_10_config-operator_01_authentications-TechPreviewNoUpgrade.crd.yaml (1)

473-482: Fix discoveryURL user-info detection and guard url() calls.

The current regex requires a path, so URLs like https://user:[email protected] can slip through. Also, url(self) can raise on invalid values because CEL rules don’t short‑circuit. Consider a simpler @ check and guard the url() rules.

🛠️ Suggested adjustment
-                          - message: discoveryURL must not contain user info
-                            rule: '!self.matches(''^https://.+:.+@.+/.*$'')'
+                          - message: discoveryURL must not contain user info
+                            rule: self.find('@') == ''
 ...
-                          - message: discoveryURL must not contain query parameters
-                            rule: url(self).getQuery().size() == 0
-                          - message: discoveryURL must be a valid https URL
-                            rule: url(self).getScheme() == 'https'
+                          - message: discoveryURL must not contain query parameters
+                            rule: isURL(self) && url(self).getQuery().size() == 0
+                          - message: discoveryURL must be a valid https URL
+                            rule: isURL(self) && url(self).getScheme() == 'https'
payload-manifests/crds/0000_10_config-operator_01_authentications-DevPreviewNoUpgrade.crd.yaml (1)

472-482: DiscoveryURL user-info validation still too narrow.

The regex only blocks user:pass@host/path and misses user@host or URLs without a path. Consider using the same self.find('@') == '' check as issuerURL.

Suggested fix
-                          - message: discoveryURL must not contain user info
-                            rule: '!self.matches(''^https://.+:.+@.+/.*$'')'
+                          - message: discoveryURL must not contain user info
+                            rule: self.find('@') == ''
payload-manifests/crds/0000_10_config-operator_01_authentications-CustomNoUpgrade.crd.yaml (1)

473-482: Guard url(self) with isURL(self) to avoid CEL eval errors.

url(self) throws on invalid strings; each rule is evaluated independently. Please guard both rules with isURL(self).

🛠️ Proposed fix
-                          - message: discoveryURL must not contain query parameters
-                            rule: url(self).getQuery().size() == 0
+                          - message: discoveryURL must not contain query parameters
+                            rule: isURL(self) && url(self).getQuery().size() == 0
-                          - message: discoveryURL must be a valid https URL
-                            rule: url(self).getScheme() == 'https'
+                          - message: discoveryURL must be a valid https URL
+                            rule: isURL(self) && url(self).getScheme() == 'https'
payload-manifests/crds/0000_25_etcd_01_pacemakerclusters-CustomNoUpgrade.crd.yaml (1)

171-174: Fix CEL syntax: ip.isCanonical(self) is invalid.
Use ip(self).isCanonical() (parse first, then call method). Also update the source markers and regenerate CRDs so all variants stay consistent.

🩹 Proposed fix
-                              rule: isIP(self) && ip.isCanonical(self) && ip(self).isGlobalUnicast()
+                              rule: isIP(self) && ip(self).isCanonical() && ip(self).isGlobalUnicast()
openapi/openapi.json (1)

14667-14846: Schema doesn’t enforce several documented list constraints.
Multiple fields describe required min/max counts (conditions/resources/fencingAgents/nodes/addresses) but the schema lacks corresponding minItems/maxItems (and IP/pattern constraints). Please add kubebuilder validation tags in the source Go types and regenerate, or soften the wording to avoid implying enforcement.

🧹 Nitpick comments (1)
etcd/v1alpha1/Makefile (1)

1-3: Consider adding minimal all/clean phony targets to satisfy checkmake.
If the linter gates CI, stub targets or a lint config exception would silence warnings.

Comment on lines +24952 to +24955
"synchronizedAPI": {
"description": "synchronizedAPI holds the last stable value of authoritativeAPI. It is used to detect migration cancellation requests and to restore the resource to its previous state. Valid values are \"MachineAPI\" and \"ClusterAPI\". When omitted, the resource has not yet been reconciled by the migration controller.",
"type": "string"
},
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Constrain synchronizedAPI to the documented values (or soften wording).
The description says the valid values are "MachineAPI" and "ClusterAPI", but the schema doesn’t enforce it. Consider adding an enum in the source types and regenerate, or change the text to “expected values.”

🔧 Expected OpenAPI shape (apply to both occurrences)
         "synchronizedAPI": {
           "description": "synchronizedAPI holds the last stable value of authoritativeAPI. It is used to detect migration cancellation requests and to restore the resource to its previous state. Valid values are \"MachineAPI\" and \"ClusterAPI\". When omitted, the resource has not yet been reconciled by the migration controller.",
-          "type": "string"
+          "type": "string",
+          "enum": [
+            "MachineAPI",
+            "ClusterAPI"
+          ]
         },

Also applies to: 25059-25062

🤖 Prompt for AI Agents
In `@openapi/openapi.json` around lines 24952 - 24955, The schema for the property
synchronizedAPI currently only documents allowed values but doesn't enforce
them; update the source type that generates this OpenAPI (the struct/field that
maps to synchronizedAPI) to constrain it to an enum of "MachineAPI" and
"ClusterAPI" and then regenerate the OpenAPI spec, or if you prefer a softer
change, update the description text to say "expected values" instead of "Valid
values"; ensure you apply this fix to both occurrences of synchronizedAPI in the
generated spec so the enforcement or wording is consistent.

Comment on lines +275 to 279
- message: url path must match /v[0,9]+ or /api/v[0,9]+
rule: matches((url(self).getEscapedPath()), '^/(api/)?v[0-9]+/{0,1}$')
- message: url must use https scheme
rule: url(self).getScheme() == "https"
required:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, examine the CRD file structure around the lines in question
wc -l payload-manifests/crds/0000_10_config-operator_01_infrastructures-CustomNoUpgrade.crd.yaml

Repository: openshift/api

Length of output: 152


🏁 Script executed:

# Get context around lines 275-279
sed -n '260,290p' payload-manifests/crds/0000_10_config-operator_01_infrastructures-CustomNoUpgrade.crd.yaml

Repository: openshift/api

Length of output: 1737


🏁 Script executed:

# Search for IBMCloud or IBMCloudServiceEndpoint references in the codebase
rg -i 'ibmcloud|ibmcloudserviceendpoint' --type yaml -A 3 -B 3

Repository: openshift/api

Length of output: 50370


🏁 Script executed:

# Check for any existing endpoint URL examples or test data
rg -i 'serviceendpoint|endpoint.*url' --type yaml -i | head -30

Repository: openshift/api

Length of output: 5027


🏁 Script executed:

# Search for actual URL examples in test files or documentation
find . -type f \( -name "*.yaml" -o -name "*.json" -o -name "*.go" \) | head -20 | xargs grep -l "serviceEndpoint" 2>/dev/null | head -5

Repository: openshift/api

Length of output: 39


🏁 Script executed:

# Look for Go test files or examples that might have IBMCloud endpoint examples
fd -t f "test.*\.go$|example.*\.go$" | xargs grep -l "IBMCloud\|serviceEndpoint" 2>/dev/null | head -5

Repository: openshift/api

Length of output: 39


🏁 Script executed:

# Check CEL validation syntax by searching for other CEL rules in the CRDs
grep -A 2 "getEscapedPath\|getScheme\|url(self)" payload-manifests/crds/0000_10_config-operator_01_infrastructures-CustomNoUpgrade.crd.yaml

Repository: openshift/api

Length of output: 366


🏁 Script executed:

# Search for any documentation or comments about URL patterns in the repo
rg -i "api/v|/v[0-9]" --type yaml -B 2 -A 2 | grep -i "ibmcloud\|endpoint" | head -20

Repository: openshift/api

Length of output: 3396


🏁 Script executed:

# Search for other CEL validation examples in the CRDs to understand common patterns
rg "x-kubernetes-validations" -A 10 payload-manifests/crds/ | grep -A 5 "url\|path" | head -40

Repository: openshift/api

Length of output: 5453


🏁 Script executed:

# Check if there's any version specification or documentation about CEL support
grep -r "kubernetes-validations\|CEL" --include="*.md" --include="*.txt" . 2>/dev/null | head -10

Repository: openshift/api

Length of output: 1460


🏁 Script executed:

# Look for test cases or examples with actual endpoint URLs
rg "https://" --type yaml | grep -i "ibm\|endpoint" | head -20

Repository: openshift/api

Length of output: 4051


🏁 Script executed:

# Extract the full context around the IBMCloud URL validation to understand the intent
sed -n '250,300p' payload-manifests/crds/0000_10_config-operator_01_infrastructures-CustomNoUpgrade.crd.yaml

Repository: openshift/api

Length of output: 2716


🏁 Script executed:

# Search for specific examples of IBM Cloud service endpoint URLs in documentation
grep -B 5 -A 5 "private\.iam\.cloud\.ibm\.com\|us\.south\.private\.iaas\.cloud\.ibm\.com" payload-manifests/crds/0000_10_config-operator_01_infrastructures-CustomNoUpgrade.crd.yaml

Repository: openshift/api

Length of output: 2077


🏁 Script executed:

# Let me test the regex against the documented example URLs
python3 << 'EOF'
import re
import urllib.parse

# The regex from the CRD
regex_pattern = r'^/(api/)?v[0-9]+/{0,1}$'

# Example URLs from the documentation
example_urls = [
    "https://private.iam.cloud.ibm.com",
    "https://us.south.private.iaas.cloud.ibm.com",
    "https://private.iam.cloud.ibm.com/",
    "https://private.iam.cloud.ibm.com/v1",
    "https://private.iam.cloud.ibm.com/v1/",
    "https://private.iam.cloud.ibm.com/api/v1",
    "https://private.iam.cloud.ibm.com/identity/token",
    "https://api.cloud.ibm.com/v2/resources"
]

regex = re.compile(regex_pattern)

print("Testing regex pattern:", regex_pattern)
print()

for url in example_urls:
    parsed = urllib.parse.urlparse(url)
    path = parsed.path if parsed.path else "/"
    matches = bool(regex.match(path))
    print(f"URL: {url}")
    print(f"  Path: '{path}'")
    print(f"  Matches: {matches}")
    print()
EOF

Repository: openshift/api

Length of output: 735


🏁 Script executed:

# Search for Kubernetes CEL documentation or version constraints
grep -r "apiVersion\|spec:" payload-manifests/crds/0000_10_config-operator_01_infrastructures-CustomNoUpgrade.crd.yaml | head -20

Repository: openshift/api

Length of output: 136


🏁 Script executed:

# Check if url() and getEscapedPath/getScheme are used elsewhere in the codebase
rg "url\(self\)|getEscapedPath|getScheme" --type yaml | head -20

Repository: openshift/api

Length of output: 3499


🏁 Script executed:

# Verify the suggested regex fix works better
python3 << 'EOF'
import re
import urllib.parse

# Original regex from the CRD
original_regex = r'^/(api/)?v[0-9]+/{0,1}$'
# Suggested fix from review
suggested_regex = r'^/(api/)?v[0-9]+(/.*)?$'

# Test cases
test_cases = [
    "/v1",
    "/v1/",
    "/api/v1",
    "/api/v1/",
    "/v2/resources",
    "/api/v2/identity/token",
    "/v1/some/deep/path",
    "/v",
    "/api/v",
    "/v1/v1/v1"  # edge case
]

original = re.compile(original_regex)
suggested = re.compile(suggested_regex)

print("Comparing regexes:")
print(f"Original:  {original_regex}")
print(f"Suggested: {suggested_regex}")
print()

for path in test_cases:
    orig_match = bool(original.match(path))
    sugg_match = bool(suggested.match(path))
    diff = "🔄" if orig_match != sugg_match else "✓"
    print(f"{diff} {path:30} | Original: {str(orig_match):5} | Suggested: {str(sugg_match):5}")
EOF

Repository: openshift/api

Length of output: 850


The regex is too restrictive for IBM Cloud endpoints with deeper path segments.

The path rule '^/(api/)?v[0-9]+/{0,1}$' rejects valid URLs with additional path components (e.g., /v2/resources, /api/v2/identity/token). Real IBM Cloud service endpoints may include deeper paths beyond the version prefix. The CEL functions url(self).getEscapedPath() and getScheme() are available in Kubernetes validation.

Consider relaxing the regex to allow optional path segments after the version:

Suggested fix
-                                rule: matches((url(self).getEscapedPath()), '^/(api/)?v[0-9]+/{0,1}$')
+                                rule: matches((url(self).getEscapedPath()), '^/(api/)?v[0-9]+(/.*)?$')
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- message: url path must match /v[0,9]+ or /api/v[0,9]+
rule: matches((url(self).getEscapedPath()), '^/(api/)?v[0-9]+/{0,1}$')
- message: url must use https scheme
rule: url(self).getScheme() == "https"
required:
- message: url path must match /v[0,9]+ or /api/v[0,9]+
rule: matches((url(self).getEscapedPath()), '^/(api/)?v[0-9]+(/.*)?$')
- message: url must use https scheme
rule: url(self).getScheme() == "https"
required:
🤖 Prompt for AI Agents
In
`@payload-manifests/crds/0000_10_config-operator_01_infrastructures-CustomNoUpgrade.crd.yaml`
around lines 275 - 279, The current CEL path validation rule using
url(self).getEscapedPath() is too restrictive (regex '^/(api/)?v[0-9]+/{0,1}$')
and rejects valid IBM Cloud endpoints with deeper path segments; update the rule
(the one paired with message: url path must match ...) to relax the regex so it
accepts an optional slash and any additional path segments after the version
(i.e., allow "/vN" or "/api/vN" followed by further path components), leaving
the scheme check url(self).getScheme() == "https" intact.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jan 17, 2026

@jaypoulz: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/okd-scos-e2e-aws-ovn 1d41200 link false /test okd-scos-e2e-aws-ovn
ci/prow/lint e6479a3 link true /test lint
ci/prow/verify-crdify e6479a3 link true /test verify-crdify

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants