Context Engineering Agent

This is not an officially supported Google product. This project is not eligible for the Google Open Source Software Vulnerability Rewards Program, Google Cloud Platform/SecOps Terms of Service, How Gemini for Google Cloud uses your data. This tool is provided "as is" without warranty of any kind. Users are solely responsible for understanding and managing the tool's interaction with their databases. Use of this tool constitutes acceptance of all risks associated with database access, reading, usage, and modifications.

Context Engineering Agent

Important

Repository Reorganization The original Go-based DB Schema Enricher CLI has been archived and moved to the legacy/go-enricher/ directory. This repository root is now dedicated to the Python-based Context Engineering Agent (MCP Server).

This project is a Gemini CLI extension that bridges the gap between Large Language Models (LLMs) and structured databases by generating and managing tailored context. This context helps the LLM understand database schema, business logic, and terminology, enabling more accurate Natural Language to SQL generation.

Core Concepts

A ContextSet is the central artifact generated by the Context Engineering Agent, containing structured knowledge in three primary forms:

Templates: End-to-end mappings linking a natural language query to a complete, runnable SQL query. They teach the system overarching operational logic, table join infrastructures, and broad business rules.

Full Structure Example:

{
  "nl_query": "How many accounts are in London?",
  "sql": "SELECT count(*) FROM account WHERE account.city = 'London'",
  "intent": "How many accounts are in London?",
  "manifest": "How many accounts are in a given city?",
  "parameterized": {
    "parameterized_sql": "SELECT count(*) FROM account WHERE account.city = $1",
    "parameterized_intent": "How many accounts are in $1?"
  }
}

Facets: Reusable, modular SQL fragments (like a WHERE clause or specialized join) linked to specific vocabulary or terminology. They are not standalone queries but dynamically injected filters.

Full Structure Example:

{
  "sql_snippet": "products.rating > 4.5",
  "intent": "highly rated products (above 4.5)",
  "manifest": "highly rated products (above a given number)",
  "parameterized": {
    "parameterized_sql_snippet": "products.rating > $1",
    "parameterized_intent": "highly rated products (above $1)"
  }
}

Value Searches: Specialized queries used when a value in the natural language query (e.g., "Lndn") does not perfectly match the stored value in the database ("London"). They employ mapping functions (like fuzzy trigram matching or semantic similarity) to find candidate values and their distance from the search term.
- Full Structure Example (Conceptual Fuzzy Match):
```
{
  "concept_type": "City",
  "query": "SELECT T.\"location\" AS value, 'users.location' AS columns, 'City' AS concept_type, fuzzy_distance(T.\"location\", $value) AS distance FROM \"users\" T WHERE fuzzy_match(T.\"location\", $value)",
  "description": "Fuzzy match for city in location column"
}
```

Prerequisites

Before you begin, ensure you have completed the following setup:

1. Required Services & Access

Enable the Data Analytics API with Gemini, Gemini for Google Cloud API, and Dataplex Universal Catalog API in your Google Cloud project.
Ensure your IAM user or service account has the necessary roles for database access (appropriate for your database engine), serviceusage.serviceUsageConsumer, and geminidataanalytics.queryDataUser.
Ensure the database instance has executesql permission enabled (Data API allowed).

2. Gemini CLI

Install the Gemini CLI. See Get Started with Gemini CLI.
Run gemini --version to ensure it's installed correctly.

3. Google Cloud Authentication

Configure Application Default Credentials (ADC) by running:
```
gcloud auth application-default login
```

Installation

To install the Context Engineering Agent via the Gemini CLI:

gemini extensions install https://github.com/GoogleCloudPlatform/db-context-enrichment

Optional: VSCode Integration

For an enhanced editing and diffing experience when reviewing context changes:

Install VSCode.
Install the Gemini CLI Companion extension from the VSCode Marketplace.
Open your workspace folder in VSCode, open the integrated terminal, and run gemini. You can verify the IDE extension is active by running /ide status.

Workflows

The extension is designed to support the Critical User Journeys (CUJs) for context engineering, following an iterative optimization loop:

Bootstrap: Generate an initial baseline context.
Evaluate: Measure context effectiveness against a golden dataset.
Hill-Climbing: Perform gap analysis on failures and generate automated fixes.
Iterate: Apply the improved context and re-run evaluation to continuously improve quality.
Final Validation (Optional): Verify mutations against a separated test set to ensure generalization and prevent overfitting.

All workflows are exposed as Agent Skills. Activate them by asking the agent in natural language (e.g. "bootstrap the context for my database") or by invoking the skill by name. Start the Gemini CLI by running gemini in your workspace folder, or launch Claude Code with the plugin installed.

Automated Iterative Optimization (Autoctx)

Initialize (autoctx-init): Sets up your local workspace by creating an autoctx/ directory. It checks for the presence of a valid tools.yaml configuration inside it. If missing, the agent will interactively prompt you for your database connection details and generate the file for you. It also creates the state.md file to track experiment progress and an experiments/ directory, all within autoctx/.
Generate Dataset (autoctx-dataset-generation): Rapidly creates or expands a baseline of evaluation questions (golden dataset). It asks you for sample queries or descriptions of what users might ask, and generates a JSON file with Natural Language Queries (NLQs) and Golden SQL statements.
Bootstrap (autoctx-bootstrap): Generates an initial context set. It performs progressive schema discovery to understand your database structure and qualified tables. It then generates starting Templates and Facets based on the schema and any user-provided documentation or sample queries.
Evaluate (autoctx-evaluate): Measures context effectiveness against your golden dataset. This step automatically generates all necessary Evalbench configuration files (db_config.yaml, model_config.yaml, run_config.yaml, llmrater_config.yaml) inside the experiment folder and runs the evaluation pipeline to produce accuracy scores and identify failure cases.
Hill-Climb (autoctx-hillclimb): Performs gap analysis on failures identified in the evaluation step. It reads the failure cases, determines why the LLM failed to generate correct SQL, and proposes updates or new additions to the context set (Templates or Facets) to improve performance in the next iteration.

Targeted Manual Generation

These are more basic workflows for context engineering to manually author specific context elements, each exposed via the context-generation-guide skill scoped to the relevant context type:

Generate Templates: Initiates a guided workflow where you provide a sample question and SQL, and the agent helps you parameterize and save it as a template.
Generate Facets: Guides you to define a specific intent and the corresponding SQL snippet (e.g., filter condition) to save as a facet.
Generate Value Searches: Helps you configure how the system searches for and matches specific values within a concept type (e.g., setting up exact match or trigram fuzzy search for product names).

Development and Testing

To develop a new feature and test it with the correct dependencies and full bundling (including Evalbench and Toolbox binaries), it is recommended to create a custom release in your fork.

1. Integration Testing (Fork & Release Workflow)

Create a fork of the repository on GitHub.

Set up the fork/upstream relationship in your local environment:

git clone https://github.com/YOUR-USERNAME/db-context-enrichment.git
git remote add upstream https://github.com/GoogleCloudPlatform/db-context-enrichment

Develop changes in a new branch in your local repo and push them to your fork.
Create a new release for your fork on GitHub (e.g., using tag 0.0.1-test).
Wait for the release assets to be generated by GitHub Actions in your fork.
Install the custom release via Gemini CLI:
```
gemini extensions install https://github.com/YOUR-USERNAME/db-context-enrichment --ref 0.0.1-test
```
Note: Use gemini extensions uninstall google-cloud-db-context-engineering to remove previous installations if needed.

2. Official Release Pipeline

Releases are versioned and prepared automatically by the Release Please GitHub App.
When functional PRs are merged, Release Please opens/updates a pending Release PR (bumping the extension version and updating the changelog).
Merging the Release PR signals Release Please to tag the commit and create an official GitHub Release.
The creation of the GitHub Release triggers the .github/workflows/release.yml pipeline.
The pipeline uses PyInstaller to build standalone binary executables for Linux (x64), macOS (arm64), and Windows (x64).
The pipeline packages the binary, LICENSE, GEMINI.md, and dynamically updates gemini-extension.json into .tar.gz and .zip archives.
These archives are automatically attached back to the GitHub release as downloadable assets.
Users receive the update the next time they install or upgrade the extension via Gemini CLI (gemini extensions update --all).

Name		Name	Last commit message	Last commit date
Latest commit History 188 Commits
.ci		.ci
.claude-plugin		.claude-plugin
.github		.github
dev-plugin		dev-plugin
docs		docs
evals		evals
legacy/go-enricher		legacy/go-enricher
plugin		plugin
src/google/cloud/db_context_enrichment		src/google/cloud/db_context_enrichment
tests		tests
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
.python-version		.python-version
.release-please-manifest.json		.release-please-manifest.json
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
pyinstaller.spec		pyinstaller.spec
pyproject.toml		pyproject.toml
release-please-config.json		release-please-config.json
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Context Engineering Agent

Core Concepts

Prerequisites

1. Required Services & Access

2. Gemini CLI

3. Google Cloud Authentication

Installation

Optional: VSCode Integration

Workflows

Automated Iterative Optimization (Autoctx)

Targeted Manual Generation

Development and Testing

1. Integration Testing (Fork & Release Workflow)

2. Official Release Pipeline

About

Uh oh!

Releases 16

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Context Engineering Agent

Core Concepts

Prerequisites

1. Required Services & Access

2. Gemini CLI

3. Google Cloud Authentication

Installation

Optional: VSCode Integration

Workflows

Automated Iterative Optimization (Autoctx)

Targeted Manual Generation

Development and Testing

1. Integration Testing (Fork & Release Workflow)

2. Official Release Pipeline

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 16

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages