Add rendered HTML content support via Accept header #195

papertray3 · 2025-11-08T15:37:41Z

Summary

This PR adds support for retrieving rendered HTML content from markdown files using HTTP content negotiation. Clients can now request fully-rendered HTML by specifying Accept: application/vnd.olrapi.note+html in GET requests.

Changes

New Content Type: Added ContentTypes.olrapiNoteHtml constant (application/vnd.olrapi.note+html)
HTML Rendering: Implemented renderMarkdownToHtml() method using Obsidian's native MarkdownRenderer.render() API
Request Handling: Updated _vaultGet() to handle HTML content type requests
Documentation: Updated OpenAPI specification with new content format examples
Memory Management: Proper component lifecycle management to prevent memory leaks

Features

The rendered HTML output includes:

Wiki-links converted to HTML anchors with proper data-href attributes
Obsidian-specific markdown (callouts, embeds, transclusions)
Plugin integrations preserved (e.g., Metadata Menu icons)
Syntax-highlighted code blocks
Full semantic HTML structure

API Usage

# Get rendered HTML
curl -k https://127.0.0.1:27124/vault/path/to/note.md \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Accept: application/vnd.olrapi.note+html"

Backward Compatibility

This feature:

Uses standard HTTP content negotiation (Accept header)
Does not modify existing behavior for markdown or JSON responses
Works with all GET endpoints (/vault/, /active/, /periodic/)
Is fully optional - defaults to existing markdown behavior

Testing

Tested with:

Various markdown files with wiki-links, embeds, and frontmatter
Complex documents with Dataview queries
Plugin integrations (Metadata Menu)
Both /vault/{filename} and /active/ endpoints

Use Cases

This feature enables:

Digital garden publishing workflows with preview capabilities
External rendering of Obsidian notes
Wiki-link validation before publishing
Integration with static site generators

🤖 Generated with Claude Code

papertray3 · 2025-11-09T06:56:22Z

Follow-up: I discovered Dataview/DataviewJS blocks were still empty unless the renderer had a real DOM host. The latest commit
attaches a hidden container to document.body, renders there, waits for dynamic blocks to settle, then removes it. This ensures the HTML
returned by /vault matches what Obsidian shows (full Workbench/Queue/Vault stats, etc.). Added coverage for the Accept: application/
vnd.olrapi.note+html path while I was here.

Implement DOM-based extraction of rendered Obsidian content with support for both plain text and structured JSON output formats. Features: - rendered-text content type: Plain text extraction from rendered DOM - rendered-json content type: Structured JSON with typed content blocks - Cache system for fast repeated access - Full Dataview/DataviewJS rendering support - Settings UI for cache configuration - Automatic cache invalidation on file changes Content Types: - application/vnd.olrapi.note+rendered-text (plain text) - application/vnd.olrapi.note+rendered-json (structured JSON) JSON Structure: - metadata: source path, render timestamp, format version - frontmatter: complete YAML frontmatter as object - content: array of typed blocks (heading, table, list, paragraph, code, callout) Table blocks include headers and rows as 2D arrays, making them easy to parse and analyze programmatically. Implementation: - RenderCacheManager: handles rendering and caching - StructuredExtractor: converts DOM to typed JSON blocks - Cache stored in .obsidian/render-cache/ with hash-based keys - 2-second content settlement wait for async plugin rendering - Restores original user view after extraction Benefits for AI Clients: - Tables as parseable 2D arrays - Document structure preserved with type tags - Frontmatter metadata accessible - Content queryable by type - No parsing ambiguity Testing: - Verified with complex Dataview dashboards - Tables extract correctly with headers and rows - All frontmatter fields preserved - Bundle size: 2.4mb (optimized, no PDF dependencies) 🤖 Generated with Claude Code https://claude.com/claude-code Co-Authored-By: Claude <[email protected]>

Document the new rendered-text and rendered-json content types, including usage examples, JSON schema, caching system, and troubleshooting guide. - Updated README.md with rendered content overview - Added comprehensive RENDERED_CONTENT.md guide - Includes API reference, examples, and use cases - Documents both text and JSON output formats 🤖 Generated with Claude Code Co-Authored-By: Claude <[email protected]>

coddingtonbear

Thanks so much for the patch, @papertray3 — there are some great ideas here! I’m definitely open to supporting a feature that returns rendered HTML when the user supplies an appropriate Accept header (likely text/html, as I noted in a few comments).

I did notice that this PR bundles together a few larger changes that aren’t directly related to that HTML-rendering behavior. To keep things easy to review and to get your work merged more quickly, would you mind splitting this into smaller, focused PRs, each introducing a single change or feature? When you do, a quick update to the docs in /docs would also help people understand how to use the new functionality.

Once you’ve had a chance to break things out, feel free to ping me — I really appreciate the contribution, and I think the rendered-HTML feature in particular will be super useful to a lot of users!

coddingtonbear · 2025-11-15T18:50:41Z

README.md


 This was inspired by [Vinzent03](https://github.com/Vinzent03)'s [advanced-uri plugin](https://github.com/Vinzent03/obsidian-advanced-uri) with hopes of expanding the automation options beyond the limitations of custom URL schemes.
+
+## Rendered Content Support


We have public docs in the /docs/ folder. Instead of documenting this in the readme, that should probably be in the docs themselves if you could.

coddingtonbear · 2025-11-15T18:51:40Z

RENDERED_CONTENT.md

@@ -0,0 +1,540 @@
+# Rendered Content API


I'm betting this is an artifact created when you asked an LLM to help you code this and I bet you didn't intend to include this in the PR.

coddingtonbear · 2025-11-15T18:55:10Z

src/constants.ts

  olrapiNoteJson = "application/vnd.olrapi.note+json",
+  olrapiNoteHtml = "application/vnd.olrapi.note+html",
+  olrapiRenderedText = "application/vnd.olrapi.note+rendered-text",
+  olrapiRenderedJson = "application/vnd.olrapi.note+rendered-json",


Forgive me, but this PR is already extremely long -- could I trouble you to break this apart into individual PRs for each of these types of content? Only one of these content types appears to be discussed in this PR; so I think sticking to rendered HTML in this one is probably the right choice, and I want to say: I might need some convincing before I'll be up for taking on a patch for the other two rendering options.

Second thing: you don't need to invent a new content type for things like html/text/json -- those already exist:

text/html

text/plain

application/json

coddingtonbear · 2025-11-15T18:56:51Z

src/main.ts

+        );
+
+      // Cache statistics
+      if (this.plugin.renderCacheManager) {


Was there a particular performance problem you found that inspired you to want to cache renders? It seems like kind of a lot of complication to add unless there are severe performance problems.

Copilot

Pull request overview

This PR adds support for retrieving rendered HTML and structured content from Obsidian notes via HTTP content negotiation. The implementation introduces three new Accept header content types: HTML rendering for notes, structured JSON extraction for AI/LLM integration, and plain text extraction. A comprehensive caching system stores rendered content to improve performance on subsequent requests.

Key Changes

New rendering capabilities: HTML rendering via MarkdownRenderer.render(), structured JSON extraction from DOM, and plain text extraction
Caching infrastructure: MD5-based content-addressable cache with automatic invalidation on file modifications
Settings UI: Added configuration options for cache directory, size limits, auto-cleanup, and render timeout

Reviewed changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 19 comments.

Show a summary per file

File	Description
`src/structuredExtractor.ts`	New file implementing DOM-to-JSON structured content extraction with support for headings, tables, lists, code blocks, callouts, and paragraphs
`src/renderCacheManager.ts`	New caching manager with file-based cache storage, MD5 content hashing, and automatic invalidation via vault event listeners
`src/requestHandler.ts`	Added request handlers for three new content types (HTML, rendered JSON, rendered text) with proper error handling and content negotiation
`src/main.ts`	Integrated render cache manager initialization, lifecycle management, and settings UI with cache statistics and manual clear functionality
`src/types.ts`	Added cache-related settings: directory path, max size, auto-cleanup flag, and render timeout
`src/constants.ts`	Added three new content type constants and default cache settings
`package.json`	Added `@types/pdf-parse` and `pdfjs-dist` dependencies (unused)
`RENDERED_CONTENT.md`	Comprehensive documentation covering API usage, JSON schema, caching system, performance characteristics, and use cases
`README.md`	Added rendered content feature overview with examples and usage instructions

Comments suppressed due to low confidence (2)

src/renderCacheManager.ts:5

Unused import pdfParse.

import pdfParse from "pdf-parse";

src/structuredExtractor.ts:66

Unused variable currentTable.

    let currentTable: TableBlock | null = null;

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2025-11-23T16:07:26Z

src/structuredExtractor.ts

+    contentEl: HTMLElement
+  ): StructuredContent {
+    const content: ContentBlock[] = [];
+    let currentTable: TableBlock | null = null;


The variable currentTable is declared but never used in this method. Consider removing it or implementing the intended table-handling logic.

Suggested change

let currentTable: TableBlock | null = null;

Copilot · 2025-11-23T16:07:27Z

src/renderCacheManager.ts

+        this.settings.cacheDirectory
+      );
+      for (const dir of files.folders) {
+        if (!dir.endsWith("CacheIndex.json")) {


The logic for clearing cache is incorrect. This checks if folder names end with "CacheIndex.json" (which they won't, since folders are directories not files). It should check if dir === this.cacheIndexPath or use path.basename(dir) !== "CacheIndex.json" to avoid deleting the index file.

Suggested change

if (!dir.endsWith("CacheIndex.json")) {

if (path.basename(dir) !== "CacheIndex.json") {

Copilot · 2025-11-23T16:07:27Z

src/renderCacheManager.ts

+    const content = await this.app.vault.cachedRead(file);
+    const fileHash = this.getFileHash(content);
+
+    if (this.cacheIndex[fileHash]) {


The cache invalidation reads the current file content to compute its hash, but if the file was deleted, this will fail. Consider catching errors from cachedRead for the delete event or using the file hash from the cache index based on file path instead.

Suggested change

const content = await this.app.vault.cachedRead(file);

const fileHash = this.getFileHash(content);

if (this.cacheIndex[fileHash]) {

let fileHash: string | undefined;

try {

const content = await this.app.vault.cachedRead(file);

fileHash = this.getFileHash(content);

} catch (error) {

// File may have been deleted; try to find hash from cache index

for (const [hash, metadata] of Object.entries(this.cacheIndex)) {

if (metadata.sourcePath === file.path) {

fileHash = hash;

break;

}

}

}

if (fileHash && this.cacheIndex[fileHash]) {

Copilot · 2025-11-23T16:07:28Z

src/requestHandler.ts

+              console.error("Error rendering to text:", error);
+              this.returnCannedResponse(res, {
+                statusCode: 500,
+                message: `Failed to render content: ${error.message}`,


The error object should be properly typed or checked before accessing its message property. TypeScript's error in catch blocks is of type unknown. Consider using error instanceof Error ? error.message : String(error) for safer error message extraction.

Suggested change

message: `Failed to render content: ${error.message}`,

message: `Failed to render content: ${error instanceof Error ? error.message : String(error)}`,

Copilot · 2025-11-23T16:07:28Z

src/renderCacheManager.ts

+  private async extractTextFromDOM(file: TFile): Promise<string> {
+    // Store current active file to restore later
+    const originalFile = this.app.workspace.getActiveFile();
+    const activeLeaf = this.app.workspace.getLeaf(false);
+    const previousMode = (activeLeaf.view as any).getMode?.();
+
+    try {
+      // Open the file
+      await activeLeaf.openFile(file);


[nitpick] Opening files in the active leaf to render them can cause unexpected user experience issues if rendering fails or times out, as the original file might not be restored. Consider using a hidden/detached leaf or container for rendering to avoid disrupting the user's workspace.

Copilot · 2025-11-23T16:07:31Z

src/renderCacheManager.ts

+    const pdfPath = this.getCachePath(fileHash, "pdf");
+    const textPath = this.getCachePath(fileHash, "txt");
+
+    try {
+      await this.app.vault.adapter.stat(pdfPath);


Cache validation checks for the existence of both PDF and text files, but the code only generates text files (see cacheTextOnly method). This will cause cache misses even when text cache exists. Either remove the PDF check or implement PDF caching.

Suggested change

const pdfPath = this.getCachePath(fileHash, "pdf");

const textPath = this.getCachePath(fileHash, "txt");

try {

await this.app.vault.adapter.stat(pdfPath);

const textPath = this.getCachePath(fileHash, "txt");

try {

Copilot · 2025-11-23T16:07:31Z

src/renderCacheManager.ts

+      }
+
+      // Wait for content to settle (Dataview, etc.)
+      await new Promise(resolve => setTimeout(resolve, 2000));


The timeout is hardcoded to 2000ms but should use this.settings.renderTimeoutMs to respect the user's configuration. Apply the same fix to the renderToJson method (line 189).

Suggested change

await new Promise(resolve => setTimeout(resolve, 2000));

await new Promise(resolve => setTimeout(resolve, this.settings.renderTimeoutMs));

Copilot · 2025-11-23T16:07:31Z

src/renderCacheManager.ts

+        await view.setState({ mode: "preview" }, {});
+      }
+
+      await new Promise(resolve => setTimeout(resolve, 2000));


The timeout is hardcoded to 2000ms but should use this.settings.renderTimeoutMs to respect the user's configuration.

Copilot · 2025-11-23T16:07:31Z

src/requestHandler.ts

+        if (req.headers.accept === ContentTypes.olrapiNoteHtml) {
+          const file = this.app.vault.getAbstractFileByPath(path) as TFile;
+          if (file && mimeType === ContentTypes.markdown) {
+            const markdown = await this.app.vault.cachedRead(file);
+            const html = await this.renderMarkdownToHtml(markdown, path);
+            res.setHeader("Content-Type", ContentTypes.olrapiNoteHtml);
+            res.send(html);
+            return;
+          } else {
+            this.returnCannedResponse(res, {
+              statusCode: 400,
+              message: "Rendered HTML is only available for markdown files",
+            });
+            return;
+          }
+        }


The original handler for ContentTypes.olrapiNoteJson was removed in this PR. This content type returns file metadata via getFileMetadataObject(). If this was intentional (replaced by olrapiRenderedJson), the constant should be removed from ContentTypes enum and the bodyParser middleware (line 1348). If unintentional, the original handler should be restored to maintain backward compatibility.

Copilot · 2025-11-23T16:07:32Z

src/structuredExtractor.ts

+        el.querySelectorAll("li").forEach((li) => {
+          const text = li.textContent?.trim();
+          if (text) items.push(text);
+        });


The list extraction using querySelectorAll("li") will capture all nested list items, which could lead to incorrect flattening of nested lists. Consider using direct children only (> li) or implementing recursive handling to preserve list structure.

papertray3 force-pushed the main branch from af31923 to 121398c Compare November 10, 2025 18:33

coddingtonbear requested changes Nov 15, 2025

View reviewed changes

coddingtonbear requested a review from Copilot November 23, 2025 16:01

Copilot started reviewing on behalf of coddingtonbear November 23, 2025 16:01 View session

Copilot finished reviewing on behalf of coddingtonbear November 23, 2025 16:05

Copilot AI reviewed Nov 23, 2025

View reviewed changes


		This was inspired by [Vinzent03](https://github.com/Vinzent03)'s [advanced-uri plugin](https://github.com/Vinzent03/obsidian-advanced-uri) with hopes of expanding the automation options beyond the limitations of custom URL schemes.

		## Rendered Content Support

	if (!dir.endsWith("CacheIndex.json")) {
	if (path.basename(dir) !== "CacheIndex.json") {

-    const content = await this.app.vault.cachedRead(file);
-    const fileHash = this.getFileHash(content);
-    if (this.cacheIndex[fileHash]) {
+    let fileHash: string | undefined;
+    try {
+      const content = await this.app.vault.cachedRead(file);
+      fileHash = this.getFileHash(content);
+    } catch (error) {
+      // File may have been deleted; try to find hash from cache index
+      for (const [hash, metadata] of Object.entries(this.cacheIndex)) {
+        if (metadata.sourcePath === file.path) {
+          fileHash = hash;
+          break;
+        }
+      }
+    }
+    if (fileHash && this.cacheIndex[fileHash]) {

	message: `Failed to render content: ${error.message}`,
	message: `Failed to render content: ${error instanceof Error ? error.message : String(error)}`,

	await new Promise(resolve => setTimeout(resolve, 2000));
	await new Promise(resolve => setTimeout(resolve, this.settings.renderTimeoutMs));

Add rendered HTML content support via Accept header #195

Are you sure you want to change the base?

Add rendered HTML content support via Accept header #195

Uh oh!

Conversation

papertray3 commented Nov 8, 2025

Summary

Changes

Features

API Usage

Backward Compatibility

Testing

Use Cases

Uh oh!

papertray3 commented Nov 9, 2025

Uh oh!

coddingtonbear left a comment

Choose a reason for hiding this comment

Uh oh!

coddingtonbear Nov 15, 2025

Choose a reason for hiding this comment

Uh oh!

coddingtonbear Nov 15, 2025

Choose a reason for hiding this comment

Uh oh!

coddingtonbear Nov 15, 2025

Choose a reason for hiding this comment

Uh oh!

coddingtonbear Nov 15, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Key Changes

Reviewed changes

Uh oh!

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 23, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants