101 changes: 97 additions & 4 deletions .claude/commands/run-test.md
@@ -14,19 +14,50 @@ You are helping the user run a test instance of the ci-chat-bot. Follow these st
- Logs at `/tmp/ci-chat-bot.log` may contain sensitive information
- For production deployments, use proper secret management (Kubernetes secrets, vault, etc.) instead of environment variables

1. **Check Environment Variables**: First verify that the required environment variables are set by checking BOT_TOKEN and BOT_SIGNING_SECRET.
1. **Check Environment Variables**: First ask the user if they want to load environment variables from a file.

If they are NOT set, ask the user to provide:
**Option A: Load from Environment File (Recommended)**

Ask the user for the path to their environment file (e.g., `.env`, `.env.local`, etc.).

The file should contain one variable per line in the format:
```bash
BOT_TOKEN=xoxb-your-token-here
BOT_SIGNING_SECRET=your-signing-secret
GITHUB_TOKEN=ghp_your-github-token
GCP_ACCESS_DRY_RUN=true
GCP_SERVICE_ACCOUNT_JSON={"type":"service_account",...}
ORG_DATA_BUCKET=your-org-data-bucket
```

If the user provides a file path:
- Verify the file exists
- Load the environment variables using `source` or `export $(cat file | xargs)`
- Store the file path to use in step 5
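One caveat about the loading options above: `xargs` interprets quote characters, and `source` applies shell quote removal to unquoted values, so a value such as the inline `GCP_SERVICE_ACCOUNT_JSON` may lose its double quotes on the way in. The following is a minimal sketch, not part of this change, of a loader that splits each line on the first `=` only and keeps the value byte-for-byte. The function name `parseEnvFile` is hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseEnvFile (hypothetical helper) reads KEY=VALUE lines, skipping blank
// lines and lines that start with '#'. It splits on the first '=' only, so
// values containing '=', quotes, or braces (such as inline JSON) are kept
// exactly as written.
func parseEnvFile(content string) (map[string]string, error) {
	vars := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(content))
	lineNo := 0
	for scanner.Scan() {
		lineNo++
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, found := strings.Cut(line, "=")
		key = strings.TrimSpace(key)
		if !found || key == "" {
			return nil, fmt.Errorf("line %d: expected KEY=VALUE", lineNo)
		}
		vars[key] = value
	}
	return vars, scanner.Err()
}

func main() {
	sample := "# sample file\n" +
		"BOT_TOKEN=xoxb-example\n" +
		"GCP_SERVICE_ACCOUNT_JSON={\"type\":\"service_account\"}\n"
	vars, err := parseEnvFile(sample)
	if err != nil {
		panic(err)
	}
	// The JSON value keeps its double quotes.
	fmt.Println(vars["GCP_SERVICE_ACCOUNT_JSON"]) // prints {"type":"service_account"}
}
```

If the shell-based loading is kept, wrapping the JSON value in single quotes inside the env file is a common way to preserve it under `source`.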

**Option B: Manual Entry (if no env file)**

If they do NOT have an env file or prefer manual entry, ask the user to provide:
- `BOT_TOKEN`: Slack Bot Token (required) - starts with `xoxb-`
- `BOT_SIGNING_SECRET`: Slack App Signing Secret (required)
- `GITHUB_TOKEN`: GitHub token (optional but recommended)
- `GCP_ACCESS_DRY_RUN`: Set to `true` to enable dry-run mode for GCP credentials (optional)
- `GCP_SERVICE_ACCOUNT_JSON`: GCP service account JSON for credentials command (optional)
- `ORG_DATA_BUCKET`: GCS bucket for organizational data (optional)

Store these values to use in step 5. Tell the user where to find these values:
- Go to https://api.slack.com/apps
- Select their app
- **BOT_TOKEN**: OAuth & Permissions → Bot User OAuth Token
- **BOT_SIGNING_SECRET**: Basic Information → App Credentials → Signing Secret
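The check described in step 1 (both required variables present, and `BOT_TOKEN` starting with `xoxb-`) can be sketched as follows. This is an illustrative helper only: `checkRequired` is a hypothetical name and the bot itself is not known to contain such a function. The lookup function is injected so the same check runs against `os.Getenv` or a test map, and the secret values are never printed.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// checkRequired (hypothetical helper) returns a list of problems found with
// the two required Slack settings. It reports only variable names, never the
// secret values themselves.
func checkRequired(lookup func(string) string) []string {
	var problems []string
	token := lookup("BOT_TOKEN")
	switch {
	case token == "":
		problems = append(problems, "BOT_TOKEN is not set")
	case !strings.HasPrefix(token, "xoxb-"):
		problems = append(problems, "BOT_TOKEN does not start with xoxb-")
	}
	if lookup("BOT_SIGNING_SECRET") == "" {
		problems = append(problems, "BOT_SIGNING_SECRET is not set")
	}
	return problems
}

func main() {
	problems := checkRequired(os.Getenv)
	if len(problems) == 0 {
		fmt.Println("required variables look usable")
		return
	}
	for _, p := range problems {
		fmt.Println(p)
	}
}
```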

**About GCP_ACCESS_DRY_RUN**:
- When set to `true`, the bot will skip all IAM policy changes for the `credentials` command
- BigQuery audit logging will still work normally
- Useful for testing the credentials command without affecting production IAM
- Safe to use even if you're already a project owner
- See TESTING_DRY_RUN.md for more details

2. **Verify Cluster Access**: Confirm the user has `oc` CLI access to the `app.ci` cluster context:
- Run `oc --context app.ci whoami` to verify access
- If this fails, the user needs to authenticate to the OpenShift CI cluster first
@@ -44,12 +75,29 @@ You are helping the user run a test instance of the ci-chat-bot. Follow these st

4. **Build the Project**: Run `make` to build the ci-chat-bot binary.

5. **Run the Full Setup**: Execute the complete setup with log redirection using the BOT_TOKEN and BOT_SIGNING_SECRET values obtained in step 1:
5. **Run the Full Setup**: Execute the complete setup with log redirection.

**If using an environment file (Option A from step 1):**
```bash
set -a && source /path/to/.env && set +a && make run > /tmp/ci-chat-bot.log 2>&1 &
```
Replace `/path/to/.env` with the actual file path provided by the user.

**If using manual entry (Option B from step 1):**

Normal mode (with IAM changes):
```bash
BOT_TOKEN=<token-from-step-1> BOT_SIGNING_SECRET=<secret-from-step-1> make run > /tmp/ci-chat-bot.log 2>&1 &
```

Use the actual values provided by the user in step 1. This will:
Dry-run mode (recommended for testing credentials command):
```bash
GCP_ACCESS_DRY_RUN=true BOT_TOKEN=<token-from-step-1> BOT_SIGNING_SECRET=<secret-from-step-1> make run > /tmp/ci-chat-bot.log 2>&1 &
```

Use the actual values provided by the user in step 1.

This will:
- Extract kubeconfig files from the `ci-chat-bot-kubeconfigs` secret
- Get Boskos credentials from the `boskos-credentials` secret
- Extract ROSA configuration (subnet IDs, OIDC config ID, billing account ID)
@@ -83,5 +131,50 @@ You are helping the user run a test instance of the ci-chat-bot. Follow these st
- `../release/core-services/ci-chat-bot/workflows-config.yaml` (workflow config)
- The bot runs with `--disable-rosa` flag and verbose logging (`--v=2`) by default
- If Slack isn't receiving events, verify the ngrok URL is correctly configured in Slack app settings
- **GCP Credentials dry-run mode**:
- Check logs for "DRY-RUN mode" message to confirm it's enabled
- Verify BigQuery audit logs are still being created
- Confirm IAM policy remains unchanged in GCP Console
- If testing credentials command, use: `credentials openshift gcp "test message"`

## Creating an Environment File Template

If the user wants to create an environment file, offer to create a template for them:

```bash
cat > .env.template << 'EOF'
# Required environment variables
BOT_TOKEN=xoxb-your-bot-token-here
BOT_SIGNING_SECRET=your-signing-secret-here

# Optional: GitHub integration
GITHUB_TOKEN=ghp_your-github-token-here

# Optional: GCP Credentials feature
GCP_ACCESS_DRY_RUN=true
GCP_SERVICE_ACCOUNT_JSON={"type":"service_account","project_id":"your-project",...}
ORG_DATA_BUCKET=your-org-data-bucket

# Add any other environment variables your bot needs
EOF
```

Tell the user to:
1. Copy `.env.template` to `.env`
2. Fill in their actual values
3. Never commit `.env` to git (add it to `.gitignore`)
4. Use `.env` when running the bot with the command from step 5

## Testing the Credentials Command

When running in dry-run mode (`GCP_ACCESS_DRY_RUN=true`), you can safely test the credentials command:

1. In Slack, send: `credentials openshift gcp "Testing dry-run mode"`
2. Check logs for: `grep "DRY-RUN" /tmp/ci-chat-bot.log`
3. You should see messages like:
- `GCP credentials manager running in DRY-RUN mode`
- `DRY-RUN: Would grant GCP IAM credentials to user...`
4. Verify BigQuery logs (if configured) are still created
5. Confirm no IAM changes were made in GCP Console

Guide the user through the setup process step by step.
5 changes: 5 additions & 0 deletions .gitignore
@@ -6,3 +6,8 @@
coverage.out
golangci-lint.out
report.json

# Environment files with secrets
.env
.env.local
.env.*.local
4 changes: 4 additions & 0 deletions .golangci.yml
@@ -0,0 +1,4 @@
version: "2"
run:
  build-tags:
    - gcs
61 changes: 60 additions & 1 deletion CLAUDE.md
@@ -256,9 +256,34 @@ workflow-launch openshift-e2e-gcp 4.19 "BASELINE_CAPABILITY_SET=None","ADDITIONA

### Testing
- Unit tests: `*_test.go` files (using Ginkgo/Gomega)
- Test command: `go test ./...`
- Test command: `make test` (includes race detection via `-race` flag)
- Manual test command: `go test ./...` or `go test -race ./...` for race detection
- Race detection is enabled by default in the Makefile to catch concurrency issues
- Integration tests in `pkg/manager/manager_test.go` and `pkg/manager/prow_test.go`

**Important**: The project includes concurrent operations (e.g., GCP access management with mutexes), so tests should always run with the race detector enabled to ensure thread safety.

### Pre-commit Verification
Before committing changes, run the following commands to ensure code quality:

```bash
make verify # Run verification checks (govet, gofmt, etc.)
make lint # Run golangci-lint for code quality checks
make test # Run all tests with race detection
make all # Build all binaries
```

Or run all checks at once:
```bash
make verify lint test all
```

This ensures that:
- Code passes static analysis and formatting checks
- All tests pass with race detection enabled
- The project builds successfully
- No regressions are introduced

## Configuration

### Required Environment Variables
@@ -317,3 +342,37 @@ workflow-launch openshift-e2e-gcp 4.19 "BASELINE_CAPABILITY_SET=None","ADDITIONA
- Cluster lifetimes are limited and automatically cleaned up
- Metal/bare-metal clusters require special proxy configuration
- The bot integrates deeply with Red Hat's Prow CI infrastructure

### Testing and Verification Guidelines for Claude

**When to run verification commands:**

After making any code changes (especially to `.go` files), you should proactively run:
```bash
make verify lint test all
```

This is REQUIRED for:
- Adding or modifying Go code in `pkg/`, `cmd/`, or any package
- Changing test files (`*_test.go`)
- Modifying the Makefile or build configuration
- Adding new dependencies or updating `go.mod`

This is OPTIONAL but RECOMMENDED for:
- Documentation-only changes (`.md` files)
- Configuration file changes (`.yaml`, `.json`)

**How to handle failures:**
- If `make verify` fails: Fix formatting/vet issues before proceeding
- If `make lint` fails: Address linting issues or document why they can be ignored
- If `make test` fails: Fix the failing tests or update them if behavior changed intentionally
- If `make all` fails: Fix build errors before suggesting the changes are complete

**Race detection note:**
The `-race` flag is enabled by default in `GO_TEST_FLAGS`. Any failures related to race conditions MUST be fixed, as this indicates actual concurrency bugs in production code (especially in GCP access management, Slack handlers, and other concurrent operations).

**Always inform the user:**
After running verification commands, inform the user of the results:
- ✅ "All verification checks passed: verify, lint, test, and build successful"
- ⚠️ "Verification passed with warnings: [describe warnings]"
- ❌ "Verification failed: [describe failures and fixes needed]"
10 changes: 9 additions & 1 deletion Makefile
@@ -11,10 +11,13 @@ version=v${build_date}-${git_commit}
SOURCE_GIT_TAG=v1.0.0+$(shell git rev-parse --short=7 HEAD)

GO_LD_EXTRAFLAGS=-X github.com/openshift/ci-chat-bot/vendor/k8s.io/client-go/pkg/version.gitCommit=$(shell git rev-parse HEAD) -X github.com/openshift/ci-chat-bot/vendor/k8s.io/client-go/pkg/version.gitVersion=${SOURCE_GIT_TAG} -X sigs.k8s.io/prow/version.Name=ci-chat-bot -X sigs.k8s.io/prow/version.Version=${version}
# Add gcs build tag for cyborg-data GCS support
GO_BUILD_FLAGS=-tags gcs
GO_TEST_FLAGS=-tags gcs -race
GOLINT=golangci-lint run

debug:
	go build -gcflags="all=-N -l" ${GO_LD_FLAGS} -mod vendor -o ci-chat-bot ./cmd/...
	go build -tags gcs -gcflags="all=-N -l" ${GO_LD_FLAGS} -mod vendor -o ci-chat-bot ./cmd/...
.PHONY: debug

vendor:
@@ -33,6 +36,11 @@ run:

lint: verify-golint

# Override verify-govet to include gcs build tag
verify-govet:
	go vet $(GO_MOD_FLAGS) -tags gcs $(GO_PACKAGES)
.PHONY: verify-govet

sonar-reports:
	go test ./... -coverprofile=coverage.out -covermode=count -json > report.json
	golangci-lint run ./... --verbose --no-config --out-format checkstyle --issues-exit-code 0 > golangci-lint.out
60 changes: 59 additions & 1 deletion cmd/ci-chat-bot/main.go
@@ -17,6 +17,7 @@ import (
ctrlruntimelog "sigs.k8s.io/controller-runtime/pkg/log"

"github.com/adrg/xdg"
orgdatacore "github.com/openshift-eng/cyborg-data/go"
"github.com/openshift/ci-chat-bot/pkg/manager"
"github.com/openshift/ci-chat-bot/pkg/slack"
"github.com/openshift/ci-chat-bot/pkg/utils"
@@ -323,6 +324,61 @@ func run() error {
}

	ctx := context.Background()

	// Load GCP service account credentials (used by both org data GCS and credentials manager)
	gcpServiceAccountJSON := os.Getenv("GCP_SERVICE_ACCOUNT_JSON")

	// Initialize cyborg-data service for organizational data
	// Note: This requires building with -tags gcs for GCS support
	orgDataService := orgdatacore.NewService()
	orgDataBucket := os.Getenv("ORG_DATA_BUCKET")
	orgDataObjectPath := os.Getenv("ORG_DATA_OBJECT_PATH")
	if orgDataObjectPath == "" {
		orgDataObjectPath = "orgdata/comprehensive_index_dump.json" // default path
	}

	if orgDataBucket != "" {
		// Load org data from GCS bucket using SDK implementation
		// Prepare GCS options
		var gcsOptions []orgdatacore.GCSOption
		if gcpServiceAccountJSON != "" {
			gcsOptions = append(gcsOptions, orgdatacore.WithCredentialsJSON(gcpServiceAccountJSON))
		}

		// Create GCS data source with SDK
		gcsDataSource, err := orgdatacore.NewGCSDataSourceWithSDK(ctx, orgDataBucket, orgDataObjectPath, gcsOptions...)
		if err != nil {
			klog.Warningf("Failed to create GCS data source: %v. Group-based validation will not work.", err)
		} else {
			if err := orgDataService.LoadFromDataSource(ctx, gcsDataSource); err != nil {
				klog.Warningf("Failed to load organizational data from GCS: %v. Group-based validation will not work.", err)
			} else {
				klog.Info("Successfully loaded organizational data from GCS")
			}
			if err := orgDataService.StartDataSourceWatcher(ctx, gcsDataSource); err != nil {
				klog.Warningf("Failed to start organizational data watcher: %v", err)
			}
		}
	} else {
		klog.Warning("ORG_DATA_BUCKET not set. Group-based credential validation will not work.")
	}

	// Initialize GCP access manager
	gcpDryRun := os.Getenv("GCP_ACCESS_DRY_RUN") == "true"
	var gcpAccessManager *manager.GCPAccessManager
	gcpAccessManager, err = manager.NewGCPAccessManager(gcpServiceAccountJSON, gcpDryRun)
	if err != nil {
		klog.Errorf("Failed to initialize GCP access manager: %v", err)
	} else if gcpAccessManager.IsEnabled() {
		if gcpDryRun {
			klog.Info("GCP access manager enabled (DRY-RUN MODE)")
		} else {
			klog.Info("GCP access manager enabled")
		}
	} else {
		klog.Info("GCP access manager disabled (service account credentials not configured)")
	}

	prowjobInformerFactory.Start(ctx.Done())

	jobManager := manager.NewJobManager(
@@ -349,6 +405,8 @@ func run() error {
hiveClient,
mceNamespaceClient,
dpcrCoreClient,
gcpAccessManager,
orgDataService,
)

klog.Infof("Waiting for caches to sync")
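A note on the environment handling in the `run()` hunk above: the dry-run switch is compared against the exact string `"true"`, so values such as `TRUE` or `1` leave dry-run mode off, and an empty `ORG_DATA_OBJECT_PATH` falls back to the default object path. The sketch below isolates those two rules for illustration. The helper names `dryRunEnabled` and `resolveObjectPath` are hypothetical and do not appear in the change.

```go
package main

import "fmt"

// defaultOrgDataObjectPath is the fallback used in the hunk above.
const defaultOrgDataObjectPath = "orgdata/comprehensive_index_dump.json"

// dryRunEnabled (hypothetical helper) mirrors the comparison
// os.Getenv("GCP_ACCESS_DRY_RUN") == "true": only the exact lowercase
// string enables dry-run mode.
func dryRunEnabled(value string) bool {
	return value == "true"
}

// resolveObjectPath (hypothetical helper) mirrors the fallback applied
// when ORG_DATA_OBJECT_PATH is unset or empty.
func resolveObjectPath(value string) string {
	if value == "" {
		return defaultOrgDataObjectPath
	}
	return value
}

func main() {
	for _, v := range []string{"true", "TRUE", "1", ""} {
		fmt.Printf("%q -> %v\n", v, dryRunEnabled(v))
	}
	fmt.Println(resolveObjectPath(""))
}
```

When testing the `credentials` command, it is therefore worth confirming the dry-run log line rather than assuming that any truthy-looking value enabled it.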
@@ -408,7 +466,7 @@ func manageRosaSubnetList(path string, subnetList *manager.RosaSubnets) {
if err != nil {
klog.Errorf("Failed to read %s: %v", path, err)
}
newSubnets := sets.New[string](strings.Split(string(subnetsRaw), ",")...)
newSubnets := sets.New(strings.Split(string(subnetsRaw), ",")...)
subnetList.Lock.Lock()
subnetList.Subnets = newSubnets
subnetList.Lock.Unlock()