feat: Add browser session tools (Tetra + CDP + semantic queries) by paveldudka · Pull Request #35 · tinyfish-io/agentql-mcp

paveldudka · 2026-02-17T02:42:21Z

Overview

Adds browser session management and semantic query tools to the MCP server, powered by AgentQL SDK + Tetra cloud browsers. The existing extract-web-data tool is preserved — this PR extends the server from a stateless REST wrapper into a full browser automation platform.

Version bump: 1.0.1 → 2.0.0 (new capabilities, API migration)

What Changed

New Tools

| Tool | Purpose |
| --- | --- |
| `create_session` | Provision a Tetra cloud browser, return CDP URL |
| `close_session` | Tear down a browser session |
| `query_data` | Extract structured data from a live page (AgentQL semantic queries) |
| `query_elements` | Find interactive elements with actionable CSS selectors |

Existing Tool (preserved)

| Tool | Purpose |
| --- | --- |
| `extract-web-data` | One-shot stateless extraction via REST API (unchanged behavior) |

Migration

Server class → McpServer with registerTool() (the old server.tool() and manual setRequestHandler are deprecated)
Added agentql and playwright as dependencies

Architecture

Before: Stateless REST Wrapper

Agent ──MCP──▶ MCP Server ──HTTP──▶ AgentQL REST API
                                         │
                                    navigates, extracts,
                                    returns JSON

One tool, one call, done. No browser control.

After: Stateless REST + Full Browser Sessions

┌─────────────────────────────────────────────────────────────┐
│                          Agent                              │
│                                                             │
│  ┌──────────────┐         ┌─────────────────────────────┐   │
│  │  MCP Client  │         │ Native Browser (Playwright) │   │
│  └──────┬───────┘         └─────────────┬───────────────┘   │
└─────────┼───────────────────────────────┼───────────────────┘
          │ stdio                         │ CDP WebSocket
          ▼                               ▼
┌─────────────────────────┐    ┌─────────────────────────┐
│     MCP Server          │    │  Tetra Cloud Browser    │
│                         │    │  (Chromium instance)    │
│  extract-web-data ──REST──▶ AgentQL API               │
│                         │    │                         │
│  create_session ────────┼───▶│  wss://cdp-url          │
│  close_session          │    │                         │
│  query_data ────────────┼───▶│  (semantic queries)     │
│  query_elements ────────┼───▶│                         │
└─────────────────────────┘    └─────────────────────────┘

Session Lifecycle

create_session
     │
     ▼
Tetra provisions cloud browser
     │
     ▼
MCP server connects via CDP (holds connection alive)
     │
     ▼
Returns CDP URL to agent
     │
     ▼
Agent connects to same CDP URL (dual connection)
     │
     ├──▶ Agent browses directly (navigate, click, type)
     ├──▶ query_data extracts structured info from current page
     └──▶ query_elements finds elements with CSS selectors
               │
               ▼
          Agent clicks using selector:
          page.click('[tf623_id="42"]')
               │
               ▼
          close_session → MCP disconnects → browser dies

When to Use What

                ┌─────────────────────────┐
                │  "I need web data"      │
                └────────────┬────────────┘
                             │
                   ┌─────────▼─────────┐
                   │  Single URL,      │
                   │  no interaction?  │
                   └────┬─────────┬────┘
                    YES │         │ NO
                        ▼         ▼
              ┌──────────────┐  ┌──────────────────┐
              │ extract-web- │  │  create_session   │
              │ data         │  │  + browse via CDP │
              │              │  │  + query_data     │
              │ One call,    │  │  + query_elements │
              │ done.        │  │  + close_session  │
              └──────────────┘  └──────────────────┘
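The decision flow above can also be written as a small helper. This is an illustrative sketch only: `planTools` and the `Task` shape are hypothetical names that do not appear in the PR; only the tool names are taken from the diagram.

```typescript
// Hypothetical task description; not a type from the PR.
interface Task {
  singleUrl: boolean;
  needsInteraction: boolean;
}

// Returns the MCP tools to call, in order, following the diagram above.
function planTools(task: Task): string[] {
  if (task.singleUrl && !task.needsInteraction) {
    return ["extract-web-data"]; // one stateless call, no session
  }
  // Direct browsing happens over CDP between create_session and close_session.
  return ["create_session", "query_data", "query_elements", "close_session"];
}
```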

Key Design Decisions

Dual CDP connections: MCP server holds one CDP connection (keeps browser alive, powers semantic queries). Agent holds another (for direct browsing). Either alone keeps the browser alive.
Stateful, in-memory: Session state lives in a Map — no database, no persistence. The MCP server process lifecycle = session lifecycle. When the process dies, Tetra cleans up.
Actionable selectors from query_elements: Returns [tf623_id="..."] CSS selectors that agents use directly over CDP. No Playwright locators cross the MCP boundary.
Clear tool descriptions: Each tool explains when to use it vs alternatives, with AgentQL query syntax guide and examples baked into the description.

Testing

Verified end-to-end:

extract-web-data → HN top stories ✅
create_session → Tetra browser provisioned ✅
Agent CDP connection → navigate, screenshot ✅
query_data → pricing data extracted ✅
query_elements → selectors returned, used to click buttons ✅
close_session → cleanup ✅

…ery_elements)

- Add Tetra cloud browser integration via AgentQL SDK + Playwright
- New tools: create_session, close_session, query_data, query_elements
- Existing extract-web-data tool preserved (stateless REST API)
- Migrate from deprecated Server class to McpServer.registerTool()
- Stateful in-memory session management with CDP connection holding
- query_elements returns actionable CSS selectors for CDP interaction
- Clear tool descriptions guide agents on when to use each tool
- Bump version to 2.0.0

coderabbitai · 2026-02-17T02:42:41Z

📝 Walkthrough

This pull request introduces browser automation and web data extraction capabilities to the MCP Server. It adds a new session management module that enables persistent browser sessions via Playwright and AgentQL, supporting both stateless REST-based extraction and interactive session-based queries. The entry point is refactored to register five tools: the existing extract-web-data plus the new create_session, close_session, query_data, and query_elements. Dependencies are updated to include agentql and playwright, and the package version is bumped to 2.0.0. Documentation is expanded with installation instructions for multiple IDEs, architecture details, and development guidance.

Sequence Diagram(s)

sequenceDiagram
    participant Client as MCP Client
    participant Server as MCP Server
    participant SessionMgr as Session Manager
    participant Browser as Playwright Browser
    participant AgentQL as AgentQL Service

    rect rgba(100, 200, 150, 0.5)
    Note over Client,AgentQL: Create Browser Session Flow
    Client->>Server: create_session(options)
    Server->>SessionMgr: createSession(params)
    SessionMgr->>AgentQL: Create remote session
    AgentQL-->>SessionMgr: CDP URL + Streaming URL
    SessionMgr->>Browser: Connect via CDP
    Browser-->>SessionMgr: Browser instance ready
    SessionMgr-->>Server: {session_id, cdp_url}
    Server-->>Client: Session created
    end

    rect rgba(100, 150, 200, 0.5)
    Note over Client,AgentQL: Query Data from Session
    Client->>Server: query_data(session_id, query)
    Server->>SessionMgr: queryData(sessionId, query)
    SessionMgr->>Browser: Get page from context
    Browser-->>SessionMgr: Page instance
    SessionMgr->>AgentQL: Execute semantic query
    AgentQL-->>SessionMgr: Query results
    SessionMgr-->>Server: {data: result}
    Server-->>Client: Query results
    end

    rect rgba(200, 150, 100, 0.5)
    Note over Client,AgentQL: Close Session
    Client->>Server: close_session(session_id)
    Server->>SessionMgr: closeSession(sessionId)
    SessionMgr->>Browser: Close browser
    Browser-->>SessionMgr: Closed
    SessionMgr->>AgentQL: Cleanup session
    SessionMgr-->>Server: {ok: true}
    Server-->>Client: Session closed
    end

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change (adding browser session tools with Tetra, CDP, and semantic queries), matching the substantial code additions in src/index.ts and src/sessions.ts. |
| Description check | ✅ Passed | The description is comprehensive and directly related to the changeset, detailing new tools (create_session, close_session, query_data, query_elements), architecture diagrams, design decisions, and test verification that align with the code changes. |
| Merge Conflict Detection | ✅ Passed | ✅ No merge conflicts detected when merging into `main` |

coderabbitai

🧹 Nitpick comments (2)

README.md (2)

19-29: Add language specifier to fenced code block.

This ASCII diagram code block is missing a language specifier. Use text or plaintext to satisfy markdown linting rules.

📝 Proposed fix

-```
+```text
 "I need web data"

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 19 - 29, The fenced ASCII diagram starting with the
line "I need web data" is missing a language tag; open the code fence that
currently begins with ``` and add a language specifier such as text or plaintext
(e.g., change ``` to ```text) so the diagram satisfies Markdown linting rules
and renders as plain text—update the code block that contains the diagram lines
including "Single URL, no interaction needed?" and the branches to
extract-web-data / create_session accordingly.

180-199: Add language specifier to architecture diagram code block.

Same issue as the decision flow diagram — add text or plaintext for markdown lint compliance.

📝 Proposed fix

-```
+```text
 ┌──────────────────────────────────────────────────────────┐

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 180 - 199, The fenced ASCII architecture diagram
block in README.md lacks a language specifier; update the opening
triple-backtick for that diagram (the block containing the Agent / MCP Server /
Tetra Cloud Browser ASCII art) to include a language tag such as text or
plaintext (e.g., ```text) so markdown linters accept it and rendering stays
unchanged.

colriot

A few thoughts and questions to discuss:

found elements can be used for clicks/interactions — who's gonna interact with them? Job for another MCP? Then we need to mention that and suggest options
Did you observe cases when session is not closed after interaction? (minor thing, user can close themselves, but still)
Naming — extract-web-data but query-data. I think we agreed to use extract-* (and get_elements) in all integrations, and even in general it just makes sense to have aligned naming within the same thing (plus underscores vs dashes)
We have prompt and AQL inputs for our APIs. For the old one we used prompt only, for the new ones you are using AQL only. Wdyt about aligning these?
Also, how reliable is AQL query generation from simple examples in this PR?

colriot · 2026-02-20T20:45:34Z

+┌────────────────────────┐   ┌─────────────────────────┐
+│     MCP Server         │   │  Tetra Cloud Browser    │
+│                        │   │  (Chromium instance)    │
+│  extract-web-data ─REST─▶ AgentQL API               │


Smth is off with this line. What should it be pointing at?

colriot · 2026-02-20T20:46:25Z

+        │
+  Single URL, no interaction needed?
+        │
+    YES │           NO


NO branch looks disconnected

paveldudka · 2026-02-20T22:13:51Z

@colriot

found elements can be used for clicks/interactions — who's gonna interact with them? Job for another MCP? Then we need to mention that and suggest options

Found elements are returned as JSON that contains enough breadcrumbs for the AI to dynamically build a Playwright locator. That path is for cases where the AI is directly connected to the browser over CDP. So instead of analyzing HTML, the AI triggers the query_data tool. Tested locally, works.

Did you observe cases when session is not closed after interaction? (minor thing, user can close themselves, but still)

I haven't, but Tetra currently has a guardrail to auto-terminate on inactivity (5 mins by default). Maybe worth adding a guardrail in the MCP tool itself too, so we also clean up the browser from the in-memory map on timeout. I'll update.

Naming — extract-web-data but query-data. I think we agreed to use extract-* (and get_elements) in all integrations, and even in general it just makes sense to have aligned naming within the same thing (plus underscores vs dashes)

Underscores vs dashes: legit. I'll update.
But for the others, what are your thoughts? extract-web-data for the REST API path and extract-data for the CDP path? I find that worse.
The AgentQL docs describe those pieces as query-data and query-elements, so I'm trying to be consistent. All integrations so far only focused on the REST API. But in this case we tap into the fundamental AgentQL APIs, and those are query-data and query-elements. We agreed to keep AgentQL API naming intact.

We have prompt and AQL inputs for our APIs. For the old one we used prompt only, for the new ones you are using AQL only. Wdyt about aligning these?

That sounds like an extra MCP tool, which we don't need at this point. I also don't want to switch to prompt-only: that would introduce an extra AI roundtrip (which we would be paying for) to generate a query and eventually call exactly the same query-data API. This is already an AI path, so why not generate the query on the user side? It will be much faster. The consistency argument is overrated in this context, IMO.
The argument for a prompt-based approach is that we can probably generate the query a bit more reliably, since otherwise query-gen quality depends on the user's choice of model, and on our side it's more stable. But I would optimize for the happy path here and use our query gen only as a fallback rather than forcing all users to pay the latency penalty.

ayc1 · 2026-04-07T23:21:14Z

i see closeAll is called on shutdown - that covers the happy path. the risk is ungraceful exits (SIGKILL, OOM, MCP client crash) where closeAll never runs and sessions leak.

if Tetra has idle timeouts on sessions this is probably fine in practice. if sessions are billed or have no TTL it's worth adding a SIGTERM/SIGINT handler or a heartbeat-based cleanup.
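A minimal sketch of the signal-handler option, assuming `closeAll` returns a promise. `installShutdownHooks`, `SignalSource`, and the injected `exit` callback are hypothetical names chosen so the logic can be exercised without real signals. Note that SIGKILL and OOM kills cannot be intercepted by any handler, so those cases would still depend on a server-side idle timeout.

```typescript
// Minimal structural type so the sketch can be tested with a fake source.
interface SignalSource {
  once(signal: "SIGINT" | "SIGTERM", handler: () => void): unknown;
}

function installShutdownHooks(
  closeAll: () => Promise<void>,
  source: SignalSource,
  exit: (code: number) => void,
): void {
  let shuttingDown = false;
  const handler = (): void => {
    if (shuttingDown) return; // ignore a second signal while cleanup runs
    shuttingDown = true;
    closeAll()
      .catch((err) => console.error("closeAll failed during shutdown:", err))
      .finally(() => exit(0));
  };
  source.once("SIGINT", handler);
  source.once("SIGTERM", handler);
}

// In the server this would presumably be wired to the Node process object:
//   installShutdownHooks(closeAll, process, (code) => process.exit(code));
```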

londondavila · 2026-04-08T17:51:52Z

+  const tetraSession = await createBrowserSession(sessionOpts);
+  const cdpUrl = tetraSession.cdpUrl;
+
+  // Connect Playwright to hold the connection alive
+  const browser = await chromium.connectOverCDP(cdpUrl);
+
+  const sessionId = `sess_${++idCounter}_${Date.now()}`;
+  const streamingUrl = tetraSession.getPageStreamingUrl(0);
+
+  sessions.set(sessionId, {
+    sessionId,
+    cdpUrl,
+    streamingUrl,
+    browser,
+    tetraSession,
+    createdAt: new Date(),
+  });
+
+  return {
+    session_id: sessionId,
+    cdp_url: cdpUrl,
+    streaming_url: streamingUrl,
+  };


If connectOverCDP() or getPageStreamingUrl() throws after createBrowserSession() succeeds, I believe the cloud browser is leaked

we can wrap these lines 62-84 in try/catch, call await tetraSession.close() in the catch
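A rough sketch of that try/catch, written against injected functions so it stands alone. `connectOrCleanup` and the `RemoteSession` interface are hypothetical; the only assumption about the real session object is that it exposes `cdpUrl` and an async `close()`, as the suggestion above implies.

```typescript
// Hypothetical structural type; the real SDK object may expose more.
interface RemoteSession {
  cdpUrl: string;
  close(): Promise<void>;
}

async function connectOrCleanup<B>(
  createRemote: () => Promise<RemoteSession>,
  connect: (cdpUrl: string) => Promise<B>,
): Promise<{ remote: RemoteSession; browser: B }> {
  const remote = await createRemote();
  try {
    const browser = await connect(remote.cdpUrl);
    return { remote, browser };
  } catch (err) {
    // Best-effort teardown so the cloud browser is not orphaned;
    // a failure here must not mask the original error.
    await remote.close().catch(() => undefined);
    throw err;
  }
}
```

In `createSession`, `createRemote` would wrap `createBrowserSession(sessionOpts)` and `connect` would wrap `chromium.connectOverCDP`; any later step that can throw (such as reading the streaming URL) would need to sit inside the same `try`.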

londondavila · 2026-04-08T17:52:43Z

+  const entry = sessions.get(sessionId);
+  if (entry) {
+    try {
+      await entry.browser.close();


browser.close() disconnects the local CDP client but never calls tetraSession.close(). Remote browsers may keep running and billing.

we should add await entry.tetraSession.close() before or after browser.close().
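One possible shape for that fix, sketched with structural types instead of the real Playwright and AgentQL objects. The `Closable` and `SessionEntry` interfaces here are simplified stand-ins, and the sketch assumes both `close()` methods are async.

```typescript
interface Closable {
  close(): Promise<void>;
}

// Simplified stand-in for the entry stored in the sessions Map.
interface SessionEntry {
  browser: Closable;      // local CDP client
  tetraSession: Closable; // remote cloud browser
}

// Returns true only if the session existed and both closes succeeded.
async function closeSession(
  sessions: Map<string, SessionEntry>,
  sessionId: string,
): Promise<boolean> {
  const entry = sessions.get(sessionId);
  if (!entry) return false;
  sessions.delete(sessionId); // drop first so a retry cannot double-close
  // allSettled: a failure closing the local client must not skip remote teardown.
  const results = await Promise.allSettled([
    entry.browser.close(),
    entry.tetraSession.close(),
  ]);
  return results.every((r) => r.status === "fulfilled");
}
```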

londondavila · 2026-04-08T17:53:39Z

+  await wrappedPage.queryElements(query, {
+    includeHidden: options.include_hidden ?? false,
+    mode: (options.mode ?? "fast") as "standard" | "fast",
+  });


getLastResponse() race in queryElements here; two concurrent calls on the same page can return each other's selectors. Since selectors are used for clicks/typing, this is silent data corruption.

let's serialize queries per page, or use an API that returns results directly
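A small sketch of the serialization option: a per-key promise chain so that at most one query runs at a time for a given page or session. `KeyedQueue` is a hypothetical helper, not something in the PR or the AgentQL SDK.

```typescript
// Chains tasks per key so that at most one runs at a time for that key.
class KeyedQueue<K> {
  // Tails never reject, so chaining on them with .then() is safe.
  private readonly tails = new Map<K, Promise<unknown>>();

  run<T>(key: K, task: () => Promise<T>): Promise<T> {
    const prev = this.tails.get(key) ?? Promise.resolve();
    const next = prev.then(task);
    const tail = next.catch(() => undefined);
    this.tails.set(key, tail);
    // Drop the entry once the queue for this key is idle.
    void tail.then(() => {
      if (this.tails.get(key) === tail) this.tails.delete(key);
    });
    return next;
  }
}
```

With this in place, the `queryElements` call and the read of its result would run inside a single `queue.run(sessionId, ...)` task, so a concurrent call on the same session waits rather than overwriting the last response.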

londondavila · 2026-04-08T17:54:34Z

-    "node-fetch": "^3.3.2"
+    "agentql": "^1.17.0",
+    "node-fetch": "^3.3.2",
+    "playwright": "^1.58.2"


i think we can get away with playwright-core here. as far as i can tell the MCP server only uses connectOverCDP, the whole playwright lib downloads ~300MB of unused browser binaries

londondavila · 2026-04-08T17:55:16Z

+  proxy_url?: string;
+}
+
+const sessions = new Map<string, SessionEntry>();


Unbounded Map, no cap, no reaper. A buggy agent loop can exhaust memory and Tetra resources - i think we want session limits or TTL
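A sketch of what a cap plus an idle-TTL reaper could look like. `BoundedSessions` and `Tracked` are hypothetical names, and the limit and TTL values are left to the caller; nothing here reflects limits actually chosen for the PR.

```typescript
// Hypothetical minimal entry shape: last activity time plus an async close.
interface Tracked {
  lastUsedAt: number; // epoch milliseconds
  close(): Promise<void>;
}

class BoundedSessions {
  private readonly entries = new Map<string, Tracked>();
  private readonly maxSessions: number;
  private readonly ttlMs: number;

  constructor(maxSessions: number, ttlMs: number) {
    this.maxSessions = maxSessions;
    this.ttlMs = ttlMs;
  }

  get size(): number {
    return this.entries.size;
  }

  add(id: string, entry: Tracked): void {
    if (this.entries.size >= this.maxSessions) {
      throw new Error(`Session limit reached (${this.maxSessions}); close one first`);
    }
    this.entries.set(id, entry);
  }

  // Closes and removes sessions idle longer than ttlMs; returns the reaped ids.
  async reap(now: number = Date.now()): Promise<string[]> {
    const expired: string[] = [];
    for (const [id, entry] of this.entries) {
      if (now - entry.lastUsedAt > this.ttlMs) expired.push(id);
    }
    for (const id of expired) {
      const entry = this.entries.get(id);
      this.entries.delete(id);
      if (entry) await entry.close().catch(() => undefined);
    }
    return expired;
  }
}
```

The reaper would typically be driven by a periodic timer, with `lastUsedAt` refreshed on each query; a TTL close to Tetra's own inactivity timeout (5 minutes by default, per the discussion above) seems like a natural starting point.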

londondavila · 2026-04-08T17:55:35Z

+    "agentql": "^1.17.0",
+    "node-fetch": "^3.3.2",
+    "playwright": "^1.58.2"
  },


Also think we need to add zod

londondavila · 2026-04-08T17:56:21Z

+  if (params.profile !== "stealth") {
+    sessionOpts.uaPreset = uaPreset;
+  }
+  if (params.proxy === "tetra") {


proxy: "custom" without proxy_url silently ignored: Tool description says required, code doesn't enforce it. Falls through to unproxied
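A minimal sketch of the missing check, to be called before the session options are built. `assertProxyParams` is a hypothetical helper name; the `proxy` and `proxy_url` field names are taken from the diff.

```typescript
// Rejects proxy: "custom" without a usable proxy_url instead of
// silently falling through to an unproxied session.
function assertProxyParams(params: { proxy?: string; proxy_url?: string }): void {
  if (params.proxy === "custom" && !params.proxy_url?.trim()) {
    throw new Error('proxy_url is required when proxy is "custom"');
  }
}
```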

londondavila · 2026-04-08T17:56:47Z

+  // Connect Playwright to hold the connection alive
+  const browser = await chromium.connectOverCDP(cdpUrl);
+
+  const sessionId = `sess_${++idCounter}_${Date.now()}`;


Sequential counter + Date.now() - why not use crypto.randomUUID()?
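For illustration, the ID generation could look roughly like this. `newSessionId` is a hypothetical helper; keeping the `sess_` prefix from the diff is an assumption about what callers expect.

```typescript
import { randomUUID } from "node:crypto";

// Random, non-guessable session id; replaces the counter + timestamp scheme.
function newSessionId(): string {
  return `sess_${randomUUID()}`;
}
```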

londondavila

left a bunch of suggestions @paveldudka . approving with notes + nits

paveldudka requested a review from colriot February 17, 2026 02:43

fix: replace any with unknown to satisfy eslint no-explicit-any

b06e87d

paveldudka requested review from KateZhang98, ayc1, londondavila, lozzle, shuhaodo and uttambharadwaj February 17, 2026 02:43

docs: update README and package description for browser session tools

e382335

coderabbitai Bot reviewed Feb 17, 2026

colriot requested changes Feb 20, 2026

ayc1 approved these changes Apr 7, 2026

londondavila reviewed Apr 8, 2026

londondavila approved these changes Apr 8, 2026
