
feat: add KEDA autoscaling playground for DocumentDB #313

Open
xgerman wants to merge 5 commits into documentdb:main from xgerman:xgerman/keda-autoscaling-playground

@xgerman xgerman commented Mar 16, 2026

Summary

Adds a self-contained playground demonstrating KEDA event-driven autoscaling with DocumentDB. KEDA's MongoDB scaler polls a DocumentDB collection for pending jobs and automatically scales a worker Deployment from 0 to N pods.

E2E tested on Kind — full autoscaling cycle verified: 0 → 2 pods on job insert, 2 → 0 pods after drain.
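
For orientation, a ScaledObject of the kind described above might look roughly like the manifest printed by the sketch below. The trigger fields follow KEDA's documented MongoDB scaler, and the polling interval, cooldown, and query value mirror numbers mentioned in this PR. The Deployment name `job-worker`, the auth resource name `documentdb-trigger-auth`, the query, and `maxReplicaCount` are assumptions for illustration and are not taken from the PR's manifests.

```shell
# Sketch: print a ScaledObject with a MongoDB trigger on appdb.jobs.
# ASSUMPTIONS (not from the PR): Deployment name, auth name, query, maxReplicaCount.
render_scaled_object() {
  local target_deployment="${1:-job-worker}"       # hypothetical Deployment name
  local auth_name="${2:-documentdb-trigger-auth}"  # hypothetical auth resource name
  cat <<EOF
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: documentdb-worker-scaler
spec:
  scaleTargetRef:
    name: ${target_deployment}
  minReplicaCount: 0
  maxReplicaCount: 5
  pollingInterval: 5
  cooldownPeriod: 30
  triggers:
    - type: mongodb
      metadata:
        dbName: appdb
        collection: jobs
        query: '{"status": "pending"}'
        queryValue: "5"
      authenticationRef:
        name: ${auth_name}
        kind: ClusterTriggerAuthentication
EOF
}

# Print the manifest; a real script would pipe it to kubectl apply -f -
render_scaled_object
```

Keeping the connection string in the referenced authentication resource, rather than in the trigger metadata, keeps credentials out of the ScaledObject itself.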

New files (10)

| File | Description |
| --- | --- |
| manifests/documentdb-instance.yaml | DocumentDB CR for Kind (1 node, 2Gi, ClusterIP) |
| manifests/keda-trigger-auth.yaml | ClusterTriggerAuthentication with DocumentDB connection string |
| manifests/keda-scaled-object.yaml | ScaledObject with MongoDB trigger on appdb.jobs |
| manifests/job-worker.yaml | Worker Deployment (scaled by KEDA, starts at 0) |
| manifests/seed-jobs.yaml | Job that inserts 10 pending documents |
| manifests/drain-jobs.yaml | Job that marks all pending documents as completed |
| scripts/setup.sh | Installs KEDA, deploys DocumentDB, configures demo |
| scripts/teardown.sh | Cleanup with optional KEDA/DocumentDB removal |
| scripts/demo.sh | Interactive walkthrough of scaling behavior |
| README.md | Quick start, connection guide, gotchas |

Key findings from E2E testing

| Finding | Detail |
| --- | --- |
| tlsInsecure=true required | KEDA's Go MongoDB driver needs tlsInsecure=true (not tlsAllowInvalidCertificates=true) to skip both certificate and hostname verification with self-signed certs |
| directConnection=true works | KEDA passes this through correctly |
| SCRAM-SHA-256 works | Compatible with the standard MongoDB auth handshake |
| Port 10260 | Must be explicit in the connection string |
| exposeViaService required | DocumentDB CR needs exposeViaService.serviceType: ClusterIP for the gateway service to be created |
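
Taken together, the findings above suggest a connection string of roughly the following shape. This is a minimal sketch, not the code in setup.sh: the function name `build_keda_uri` and the example host are hypothetical, and it assumes the username and password contain no characters that need URL-encoding.

```shell
# Sketch: assemble a KEDA-compatible connection string for DocumentDB.
# ASSUMPTION: user and password need no URL-encoding.
build_keda_uri() {
  local user="$1" pass="$2" host="$3" port="${4:-10260}"
  # Port is explicit; tlsInsecure=true skips certificate and hostname checks.
  printf 'mongodb://%s:%s@%s:%s/?directConnection=true&authMechanism=SCRAM-SHA-256&tls=true&tlsInsecure=true\n' \
    "$user" "$pass" "$host" "$port"
}

# Example with placeholder values (hypothetical host name).
build_keda_uri docdbadmin 'example-password' documentdb.example.svc.cluster.local
```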

Test results

Seed 10 pending jobs → ScaledObject ACTIVE: True → HPA TARGETS: 5/5 → 2 worker pods
Drain all jobs       → ScaledObject ACTIVE: False → HPA TARGETS: 0/5 → 0 worker pods (after 30s cooldown)
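
The pod counts above are consistent with the usual HPA arithmetic for an average-value target. The sketch below assumes a target of 5 pending jobs per pod, as the 5/5 reading suggests; the replica cap is a hypothetical value. The final step down to 0 pods comes from KEDA deactivating the ScaledObject after the cooldown, not from this formula.

```shell
# Sketch: replicas an HPA would request for an average-value target.
desired_replicas() {
  local pending="$1" target="$2" max="${3:-5}"    # max is a hypothetical cap
  local n=$(( (pending + target - 1) / target ))  # integer ceil(pending / target)
  if [ "$n" -gt "$max" ]; then
    n="$max"
  fi
  echo "$n"
}

desired_replicas 10 5   # 10 pending jobs, target of 5 per pod: prints 2
desired_replicas 0 5    # nothing pending: prints 0
```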

@xgerman xgerman marked this pull request as ready for review April 2, 2026 20:05
Copilot AI review requested due to automatic review settings April 2, 2026 20:05
German and others added 2 commits April 2, 2026 13:06
Adds a self-contained playground that demonstrates KEDA event-driven
autoscaling with DocumentDB. KEDA's MongoDB scaler polls a DocumentDB
collection for pending jobs and scales a worker Deployment from 0 to N.

New files:
- manifests/: DocumentDB CR, ScaledObject, ClusterTriggerAuthentication,
  worker Deployment, seed/drain Jobs
- scripts/setup.sh: installs KEDA, deploys DocumentDB, configures demo
- scripts/teardown.sh: cleanup with optional KEDA/DocumentDB removal
- scripts/demo.sh: interactive walkthrough of scaling behavior
- README.md: prerequisites, quick start, connection string guide, gotchas

Key DocumentDB integration details:
- Port 10260 (not 27017)
- directConnection=true (no replica set discovery)
- SCRAM-SHA-256 auth mechanism
- TLS with tlsAllowInvalidCertificates for self-signed certs
- ClusterTriggerAuthentication for cross-namespace secret access

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: German <geeichbe@microsoft.com>
Fixes discovered during end-to-end testing on Kind:

- Use tlsInsecure=true instead of tlsAllowInvalidCertificates=true in
  connection strings. KEDA's Go MongoDB driver requires tlsInsecure to
  skip both certificate AND hostname verification. Using only
  tlsAllowInvalidCertificates causes x509 hostname errors.
- Fix mongosh command: pass MONGODB_URI as positional arg via shell
  wrapper, not just as env var (mongosh --eval doesn't auto-use env)
- Fix container image: mongo:8.0 (not mongodb-community-server:8.0-ubuntu2404)
- Add exposeViaService.serviceType: ClusterIP to DocumentDB CR
  (required for the documentdb-service-* to be created)

Verified full autoscaling cycle:
  0 pods → seed 10 jobs → 2 pods → drain jobs → 0 pods

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: German <geeichbe@microsoft.com>
@xgerman xgerman force-pushed the xgerman/keda-autoscaling-playground branch from f0c1bb2 to 8bf1101 Compare April 2, 2026 20:06

@xgerman xgerman left a comment

Code Review: KEDA Autoscaling Playground

Summary

Well-structured playground that demonstrates KEDA event-driven autoscaling with DocumentDB. Clean separation of manifests, scripts, and documentation. The E2E-tested findings (especially tlsInsecure=true) are valuable. A few items to address before merge.


🟠 Major

1. [Required] README references wrong container image (README.md:135)
The README says mongodb/mongodb-community-server:8.0-ubuntu2404 but the actual manifests use mongo:8.0. This was fixed in the second commit but the README note was not updated.

-> **Note:** The `mongodb/mongodb-community-server:8.0-ubuntu2404` image is used for the seed and
+> **Note:** The `mongo:8.0` image is used for the seed and

2. [Required] setup.sh should use kubectl get documentdb for connection string (setup.sh:128-136)
The setup script manually constructs the connection string from the secret and hardcoded service name pattern. The DocumentDB resource provides the connection string in .status.connectionString. Using it would be more resilient to service naming changes:

RAW_CONN=$(kubectl get documentdb "$DOCUMENTDB_NAME" -n "$DOCUMENTDB_NAMESPACE" \
    -o jsonpath='{.status.connectionString}')
conn_string=$(eval echo "$RAW_CONN")

Note: The status connection string uses tlsAllowInvalidCertificates=true, but KEDA needs tlsInsecure=true. You may still need a sed replacement or keep the manual construction with a comment explaining why.
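
A minimal sketch of the sed replacement mentioned in the note, assuming the status connection string contains the literal parameter tlsAllowInvalidCertificates=true. The function name `to_keda_uri` and the sample URI are hypothetical.

```shell
# Sketch: make a status connection string usable by KEDA's Go driver.
to_keda_uri() {
  printf '%s\n' "$1" | sed 's/tlsAllowInvalidCertificates=true/tlsInsecure=true/'
}

# Sample input with placeholder credentials and host.
to_keda_uri 'mongodb://user:pass@host:10260/?directConnection=true&tls=true&tlsAllowInvalidCertificates=true'
```

A string that does not contain the parameter passes through unchanged, so the rewrite is safe to apply unconditionally.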

3. [Required] ClusterTriggerAuthentication missing namespace on secret reference (keda-trigger-auth.yaml)
ClusterTriggerAuthentication looks up secrets from the KEDA operator namespace by default. The secretTargetRef should explicitly specify the namespace where the secret lives, otherwise it depends on KEDA's default namespace behavior:

spec:
  secretTargetRef:
    - parameter: connectionString
      name: documentdb-keda-connection
      namespace: keda          # explicit namespace
      key: connectionString

This works today because setup.sh creates the secret in the KEDA namespace, but it's fragile — if someone changes KEDA_NAMESPACE, the manifest and secret would diverge silently.


🟡 Minor

4. [Suggestion] Hardcoded credentials in setup.sh (setup.sh:85-86)
The password KedaDemo2024! is hardcoded. For a playground this is acceptable, but consider adding a comment that users should change it, or accept it via an environment variable:

DOCUMENTDB_USER="${DOCUMENTDB_USER:-docdbadmin}"
DOCUMENTDB_PASS="${DOCUMENTDB_PASS:-KedaDemo2024!}"

5. [Suggestion] documentdb-instance.yaml includes sidecarInjectorPluginName (line 17)
This field is used for CNPG-I sidecar injection. While other playgrounds include it, it's not required for the KEDA demo to function. If you keep it, add a comment explaining its purpose, since it may confuse users who just want KEDA autoscaling.

6. [Suggestion] demo.sh uses timeout command (demo.sh:36,47)
The timeout command behaves differently on macOS (gtimeout via coreutils) vs Linux. Since this targets Kind (which can run on macOS), consider a note or a fallback:

TIMEOUT_CMD="timeout"
command -v gtimeout &>/dev/null && TIMEOUT_CMD="gtimeout"
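
One way to extend that fallback into a reusable helper is sketched below. The function name `run_with_timeout` is hypothetical; the last branch relies on perl being installed, which is an assumption about the host.

```shell
# Sketch: run a command with a time limit on both Linux and macOS.
run_with_timeout() {
  local seconds="$1"; shift
  if command -v timeout >/dev/null 2>&1; then
    timeout "$seconds" "$@"
  elif command -v gtimeout >/dev/null 2>&1; then
    gtimeout "$seconds" "$@"
  else
    # Last resort: the alarm set by perl survives exec and ends the command with SIGALRM.
    perl -e 'alarm shift; exec @ARGV' "$seconds" "$@"
  fi
}

# The command is stopped after about one second, so the fallback message is printed.
run_with_timeout 1 sleep 5 || echo "command stopped after the time limit"
```

In demo.sh the same helper could wrap the watch, for example `run_with_timeout 60 kubectl get pods -n "$APP_NAMESPACE" -w || true`.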

7. [Suggestion] No .gitignore or .helmignore
Not blocking, but adding a minimal .gitignore would prevent accidental commits of local test artifacts.


🟢 Nitpick

8. [Nitpick] Shebang inconsistency
Scripts use #!/bin/bash while the LightRAG playground uses #!/usr/bin/env bash. The env form is more portable. Not blocking — just a consistency note.

9. [Nitpick] seed-jobs.yaml and drain-jobs.yaml use mongo:8.0 without pinning a patch version or digest
For reproducibility, consider pinning to a specific patch tag such as mongo:8.0.4 (or an image digest), or adding a comment noting that the tag is intentionally floating.


✅ Positive Feedback

  • Excellent README: the connection string table (lines 72-78) with the tlsInsecure finding is genuinely useful documentation that will save other users from the same debugging session.
  • Clean script structure: setup.sh is well-organized with clear functions, color-coded output, and proper prerequisite checks.
  • Good teardown ordering: deleting the ScaledObject before the worker deployment avoids KEDA errors. This attention to deletion ordering shows operational experience.
  • Idempotent --dry-run=client | kubectl apply pattern for secrets and namespaces is the right approach.
  • Consistent labeling: app.kubernetes.io/part-of: keda-documentdb-demo across all resources enables clean bulk operations.
  • Mermaid architecture diagram in the README is a nice touch.

Verdict

/request-changes — Fix the README image reference (#1) and consider the connection string approach (#2). The rest are suggestions.

- Fix README: reference mongo:8.0 (not mongodb-community-server)
- Use kubectl get documentdb status.connectionString instead of
  manually constructing the URI from secrets
- Replace tlsAllowInvalidCertificates with tlsInsecure via sed since
  KEDA's Go driver requires it
- Add explicit namespace to ClusterTriggerAuthentication secretTargetRef
- Parameterize credentials via DOCUMENTDB_USER/DOCUMENTDB_PASS env vars

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: German <geeichbe@microsoft.com>

Copilot AI left a comment

Pull request overview

Adds a new, self-contained playground under documentdb-playground/ to demonstrate KEDA event-driven autoscaling against a DocumentDB-backed “jobs” collection, including scripts to set up, demo, and tear down the environment on a Kubernetes cluster (Kind-oriented).

Changes:

  • Introduces KEDA + DocumentDB demo manifests (DocumentDB CR, ScaledObject, ClusterTriggerAuthentication, worker Deployment, seed/drain Jobs).
  • Adds setup/teardown scripts to install KEDA, deploy demo resources, and clean up.
  • Adds an interactive demo script and README with architecture + usage notes.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| documentdb-playground/keda-autoscaling/scripts/setup.sh | Installs KEDA, deploys DocumentDB and KEDA resources, seeds demo data |
| documentdb-playground/keda-autoscaling/scripts/teardown.sh | Removes demo resources; optionally uninstalls KEDA and deletes DocumentDB |
| documentdb-playground/keda-autoscaling/scripts/demo.sh | Interactive walkthrough (seed → scale up → drain → scale down) |
| documentdb-playground/keda-autoscaling/README.md | Playground documentation, quickstart, and gotchas |
| documentdb-playground/keda-autoscaling/manifests/documentdb-instance.yaml | DocumentDB instance manifest for the demo |
| documentdb-playground/keda-autoscaling/manifests/keda-trigger-auth.yaml | ClusterTriggerAuthentication for MongoDB scaler auth |
| documentdb-playground/keda-autoscaling/manifests/keda-scaled-object.yaml | ScaledObject using MongoDB scaler to drive HPA from query results |
| documentdb-playground/keda-autoscaling/manifests/job-worker.yaml | Worker Deployment scaled by KEDA (starts at 0 replicas) |
| documentdb-playground/keda-autoscaling/manifests/seed-jobs.yaml | Job to insert pending “jobs” documents |
| documentdb-playground/keda-autoscaling/manifests/drain-jobs.yaml | Job to mark pending “jobs” as completed |

Comment on lines +5 to +12
DOCUMENTDB_NAMESPACE="${DOCUMENTDB_NAMESPACE:-documentdb-ns}"
DOCUMENTDB_NAME="${DOCUMENTDB_NAME:-keda-demo}"
DOCUMENTDB_SECRET="${DOCUMENTDB_SECRET:-documentdb-credentials}"
DOCUMENTDB_USER="${DOCUMENTDB_USER:-docdbadmin}"
DOCUMENTDB_PASS="${DOCUMENTDB_PASS:-KedaDemo2024!}"
APP_NAMESPACE="${APP_NAMESPACE:-app}"
KEDA_NAMESPACE="${KEDA_NAMESPACE:-keda}"
KEDA_VERSION="${KEDA_VERSION:-2.17.0}"

Copilot AI Apr 2, 2026

The script advertises configurable DOCUMENTDB_* / *_NAMESPACE values, but the YAML manifests you apply later have hard-coded metadata.name/metadata.namespace (and secret names). If a user overrides these env vars, the script will create namespaces/secrets in one place but kubectl apply -f will still deploy resources into the fixed namespaces/names from the manifest, causing setup/teardown to break. Consider either removing these env vars, or templating the manifests (envsubst/yq/kustomize) so they stay consistent with the script-provided values.
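
As a rough illustration of the templating option, the sketch below renders a manifest fragment with sed before it would be applied. The `__DOCUMENTDB_NAME__` and `__DOCUMENTDB_SECRET__` placeholder tokens and the `render_manifest` helper are hypothetical, and the fragment omits apiVersion and other fields on purpose. Kubernetes object names cannot contain `/`, so it is safe as the sed delimiter here.

```shell
# Sketch: render a manifest template with sed so it follows the script's variables.
# ASSUMPTION: the __NAME__-style placeholder tokens are hypothetical.
render_manifest() {
  local name="$1" secret="$2"
  sed -e "s/__DOCUMENTDB_NAME__/${name}/g" \
      -e "s/__DOCUMENTDB_SECRET__/${secret}/g"
}

# Manifest fragment (incomplete on purpose; apiVersion omitted).
template='kind: DocumentDB
metadata:
  name: __DOCUMENTDB_NAME__
spec:
  nodeCount: 1
  instancesPerNode: 1
  documentDbCredentialSecret: __DOCUMENTDB_SECRET__'

# A real script would pipe this into: kubectl apply -n "$DOCUMENTDB_NAMESPACE" -f -
printf '%s\n' "$template" | render_manifest keda-demo documentdb-credentials
```

Leaving metadata.namespace out of the template and passing `-n` on the command line keeps the namespace under the script's control as well.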

Comment on lines +83 to +87
    warn "Credentials secret already exists. Skipping."
else
    kubectl create secret generic "$DOCUMENTDB_SECRET" \
        --namespace "$DOCUMENTDB_NAMESPACE" \
        --from-literal=username="$DOCUMENTDB_USER" \

Copilot AI Apr 2, 2026

This script commits a fixed username/password (including a realistic-looking password string) into the repo and always creates the credentials Secret with those literals. To avoid secret-scanner noise and accidental reuse, prefer generating a random password at runtime (and printing it once), and/or require the user to provide it via an env var/flag.
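
A minimal sketch of that approach: use DOCUMENTDB_PASS when the caller provides it, otherwise generate a password at runtime. The helper names are hypothetical, and the hex-only format is an assumption; it does not account for any password complexity rules DocumentDB may enforce.

```shell
# Sketch: prefer a user-supplied password, otherwise generate one at runtime.
generate_password() {
  # 16 random bytes rendered as 32 lowercase hex characters.
  od -An -N16 -tx1 /dev/urandom | tr -d '[:space:]'
}

resolve_password() {
  if [ -n "${DOCUMENTDB_PASS:-}" ]; then
    printf '%s\n' "$DOCUMENTDB_PASS"
  else
    generate_password
    printf '\n'
  fi
}

password="$(resolve_password)"
echo "Password length: ${#password}"
```

Printing the generated value once at the end of setup, rather than storing it in the repository, avoids secret-scanner findings while still letting the user connect manually.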

Comment on lines +80 to +84
if [ "$DELETE_DOCUMENTDB" = true ]; then
    log "Deleting DocumentDB instance..."
    kubectl delete documentdb "$DOCUMENTDB_NAME" -n "$DOCUMENTDB_NAMESPACE" --ignore-not-found=true 2>/dev/null || true
    kubectl delete secret documentdb-credentials -n "$DOCUMENTDB_NAMESPACE" --ignore-not-found=true 2>/dev/null || true
fi

Copilot AI Apr 2, 2026

Teardown hard-codes the credentials secret name (documentdb-credentials) instead of using the same configurable secret name as setup.sh (DOCUMENTDB_SECRET). If a user changes the secret name for setup, teardown will leave the secret behind (or delete the wrong one). Consider adding DOCUMENTDB_SECRET config here and using it consistently.

Comment on lines +4 to +5
  name: keda-demo
  namespace: documentdb-ns

Copilot AI Apr 2, 2026

This manifest hard-codes metadata.name/metadata.namespace (keda-demo / documentdb-ns). That conflicts with DOCUMENTDB_NAME/DOCUMENTDB_NAMESPACE being configurable in scripts/README and makes overrides ineffective. Consider removing the namespace/name from the manifest and letting the scripts supply them (or template this file).

Suggested change
-  name: keda-demo
-  namespace: documentdb-ns

spec:
  nodeCount: 1
  instancesPerNode: 1
  documentDbCredentialSecret: documentdb-credentials

Copilot AI Apr 2, 2026

documentDbCredentialSecret is hard-coded to documentdb-credentials, but setup.sh allows overriding the secret name via DOCUMENTDB_SECRET. If a user changes DOCUMENTDB_SECRET, the DocumentDB CR will still reference documentdb-credentials and fail to authenticate. Align this with the script (template/patch the manifest or remove configurability).

Suggested change
-  documentDbCredentialSecret: documentdb-credentials
+  documentDbCredentialSecret: ${DOCUMENTDB_SECRET:-documentdb-credentials}

kind: ScaledObject
metadata:
  name: documentdb-worker-scaler
  namespace: app

Copilot AI Apr 2, 2026

This ScaledObject hard-codes metadata.namespace: app, but the scripts/README expose APP_NAMESPACE as configurable. Overriding APP_NAMESPACE will not move this resource, and subsequent kubectl get -n "$APP_NAMESPACE" calls will miss it. Consider templating/removing the namespace field so the scripts can control it.

Suggested change
-  namespace: app

kind: Job
metadata:
  name: seed-pending-jobs
  namespace: app

Copilot AI Apr 2, 2026

This Job hard-codes metadata.namespace: app, but APP_NAMESPACE is presented as configurable in scripts/README. If a user overrides APP_NAMESPACE, this Job will still run in app and may not find the Secret/ScaledObject in the expected namespace. Consider removing the namespace from the manifest and applying it with -n from the scripts (or template it).

Suggested change
-  namespace: app

kind: Job
metadata:
  name: drain-pending-jobs
  namespace: app

Copilot AI Apr 2, 2026

This Job hard-codes metadata.namespace: app, but APP_NAMESPACE is presented as configurable in scripts/README. If a user overrides APP_NAMESPACE, this Job will still run in app and may not find the Secret/ScaledObject in the expected namespace. Consider removing the namespace from the manifest and applying it with -n from the scripts (or template it).

Suggested change
-  namespace: app

Comment on lines +34 to +37
step "Watching pods scale up (Ctrl+C to continue)"
log "KEDA polls every 5 seconds. Worker pods should appear within 15-30 seconds."
timeout 60 kubectl get pods -n "$APP_NAMESPACE" -w 2>/dev/null || true

Copilot AI Apr 2, 2026

timeout is not available by default on some common dev environments (notably macOS without coreutils). Since this is a local playground, consider replacing these timeout ... kubectl -w calls with a portable loop + sleep, or add a prerequisite check/fallback so demo.sh doesn't fail immediately.
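
The "portable loop + sleep" option could look something like the sketch below, which polls a condition once per second until it succeeds or a deadline passes. The helper name `wait_for` is hypothetical, and the example condition is a stand-in for a real kubectl check.

```shell
# Sketch: poll a command once per second until it succeeds or a deadline passes.
wait_for() {
  local deadline="$1"; shift
  local elapsed=0
  while [ "$elapsed" -lt "$deadline" ]; do
    if "$@"; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}

# Stand-in condition that succeeds immediately; demo.sh would pass a kubectl check instead.
wait_for 3 test -d / && echo "condition met"
```

Unlike a bounded `kubectl get pods -w`, this returns as soon as the condition holds, so the demo does not need Ctrl+C to move on.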

Comment on lines +133 to +136
> everything in the same namespace, you can use a namespace-scoped `TriggerAuthentication` instead.

> **Note:** The `mongo:8.0` image is used for the seed and
> drain jobs because it includes `mongosh`. Any image with `mongosh` installed works.

Copilot AI Apr 2, 2026

README claims the seed/drain Jobs use mongodb/mongodb-community-server:8.0-ubuntu2404, but the actual manifests use mongo:8.0. Please update the README to match what gets deployed (or change the manifests), so users know which image to expect/pull.

Findings from AKS E2E testing:
- Remove replicaSet=rs0 from connection string — KEDA's Go driver
  fails topology negotiation with DocumentDB when replicaSet is
  specified alongside directConnection=true
- Replace ClusterIP with DNS name for cross-namespace service
  resolution
- Revert ClusterTriggerAuthentication namespace field — not supported
  in KEDA v1alpha1 API (KEDA looks up secrets from its own namespace)
- Document the replicaSet gotcha in README
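
A rough sketch of how the replicaSet parameter could be stripped from an existing connection string, wherever it appears in the query. The function name `strip_replica_set` and the sample URI are hypothetical, and this is not necessarily how setup.sh does it.

```shell
# Sketch: drop the replicaSet parameter from a MongoDB connection string.
strip_replica_set() {
  printf '%s\n' "$1" | sed -E \
    -e 's/([?&])replicaSet=[^&]*&/\1/' \
    -e 's/[?&]replicaSet=[^&]*$//'
}

# Sample input with a placeholder host.
strip_replica_set 'mongodb://host:10260/?directConnection=true&replicaSet=rs0&tls=true'
```

The first expression handles the parameter at the start or in the middle of the query; the second handles it at the end.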

Verified full autoscaling cycle on AKS:
  0 pods → seed 10 jobs → 2 pods → drain jobs → 0 pods

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: German <geeichbe@microsoft.com>
- Generate random password at runtime instead of hardcoded default
- Remove hardcoded namespaces from all manifest files
- Use configurable DOCUMENTDB_SECRET variable in teardown.sh
- Replace macOS-incompatible 'timeout' with portable perl alarm in demo.sh
- Use sed placeholders for DocumentDB name/secret in instance manifest
- Add explicit -n namespace flag to all kubectl apply commands

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: German <geeichbe@microsoft.com>

xgerman commented Apr 3, 2026

Addressed all review comments in commit 9f103de:

  • No hardcoded namespaces: Removed namespace: from all 5 manifest YAML files; scripts now use -n "$APP_NAMESPACE" flag
  • Configurable secret name: teardown.sh now uses $DOCUMENTDB_SECRET variable instead of hardcoded documentdb-credentials
  • macOS portability: Replaced timeout command in demo.sh with perl -e 'alarm N; exec @ARGV' (portable on macOS/Linux)
  • Manifest placeholders: documentdb-instance.yaml uses sed-substituted placeholders for name/secret
  • DCO sign-off: Commit includes Signed-off-by trailer
