feat: AST-based parsing improvements and examples-update command by maxim-uvarov · Pull Request #38 · nushell-prophet/dotnu

maxim-uvarov · 2026-01-10T01:38:34Z

Summary

Add ast-complete command: Fills gaps in ast --flatten output with synthetic tokens (semicolons, whitespace, assignments, etc.) for complete byte coverage
Add split-statements command: Splits source code into individual statements using AST analysis, correctly handling nested blocks
Add examples-update command: Executes @example blocks and updates their --result values (similar to embeds-update but for examples)
Refactor list-module-commands: Uses new AST infrastructure for more accurate scope detection and attribute parsing

Changes

New commands

ast-complete - Complete AST output by filling gaps with synthetic tokens
split-statements - Split source into statements using AST analysis
find-examples - Find @example blocks with their code and result sections
execute-example - Execute example code and return result as nuon
examples-update - Update @example result values by executing them

Improvements

list-module-commands now uses ast-complete and split-statements for better accuracy
Added descriptions to @example attributes throughout codebase
Fixed @example result formats to use proper nuon syntax

Documentation

Added AST behavior test cases in tests/ast-cases/ documenting ast --flatten and ast --json behavior
Updated CLAUDE.md with project conventions
Added development notes in todo/

Tests

Unit tests for find-examples, execute-example, examples-update
Updated integration test fixtures

Test plan

nu toolkit.nu test passes
Manual verification of examples-update on real files
Review AST edge case documentation

🤖 Generated with Claude Code

Fill in example descriptions for the dependencies, filter-commands-with-no-tests, and set-x commands. These descriptions are used by nutest's generate-example-tests to create documented test files.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add a command that executes @example blocks and updates their --result values with actual execution output. Similar to embeds-update, but for @example attributes.

- Parses @example blocks with single-line results
- Executes code and updates results in nuon format
- Skips multiline results (starting with single quote)
- Updated existing example results to canonical nuon format

Documents that `ast --flatten` omits:

- Statement-ending semicolons
- Variable assignment operators (=)

Uses dotnu embed format for captured outputs.

Documents the shape_block vs shape_closure distinction:

- shape_closure: def bodies, standalone closures
- shape_block: if/else, @example args

Also documents:

- Whitespace in brace tokens
- @example produces shape_garbage
- @ prefix not included in token

Documents @example, @test, @deprecated attribute parsing:

- @ prefix not included in token content
- Detection via byte check at (span.start - 1)
- @test → shape_garbage, @example → shape_internalcall
- @ inside strings not tokenized separately
- Comments produce empty AST

Documents def/export def tokenization:

- "export def" is a single token (not two)
- Command name is shape_string (quotes preserved)
- Signature is a single shape_signature token
- Flags (--env, --wrapped) appear as shape_flag

Adds the `ast-complete` command, which fills gaps in `ast --flatten` output with synthetic tokens, providing complete byte coverage.

Synthetic shapes added:

- shape_semicolon: statement-ending `;`
- shape_assignment: variable assignment `=`
- shape_whitespace: spaces between tokens
- shape_pipe: pipe operator `|`
- shape_comma: comma separator `,`
- shape_gap: unclassified content (like the `@` prefix)

This enables reliable span-based text replacement without string matching.
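To illustrate the synthetic shapes listed above, here is a minimal Python sketch of how the text of a gap could be mapped to a shape name. It is not the Nushell code from this PR: the name `classify_gap` mirrors the `classify-gap` helper mentioned in a later commit message, but the body, and the choice to trim surrounding whitespace before matching, are assumptions made for illustration.

```python
def classify_gap(text: str) -> str:
    """Map the text of an uncovered byte range to a synthetic shape name.

    Assumption: surrounding whitespace is ignored when matching a single
    punctuation character, so " = " counts as an assignment.
    """
    stripped = text.strip()
    if stripped == "":
        return "shape_whitespace"
    punctuation = {
        ";": "shape_semicolon",
        "=": "shape_assignment",
        "|": "shape_pipe",
        ",": "shape_comma",
    }
    # Anything not recognized (for example "\n\n@") falls back to shape_gap.
    return punctuation.get(stripped, "shape_gap")
```

Keeping an explicit fallback shape means every byte of the source still ends up in some token, even when the gap is not one of the known separators.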

- All commands in commands.nu are exported by default
- mod.nu controls the public API via selective re-exports
- Internal commands are accessible but not in the public API

Replace regex-based parsing with an AST-based approach for more reliable @example detection:

- Use `ast --flatten` to tokenize source and get byte positions
- Detect @example by checking that the byte at (start-1) is "@"
- Extract code from shape_block token boundaries
- Handle --result flag detection via shape_flag tokens

This fixes:

- False positives from @example inside strings
- Potential crash from `| last` on empty input
- Fragile line-based parsing
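The byte check at (start-1) described above can be sketched in Python as follows. This is an illustration only, not the PR's Nushell code: the function name `is_attribute_token` and the token record layout (`content`, `start`, `end`) are assumptions modeled on the fields the commit messages mention.

```python
def is_attribute_token(source: bytes, token: dict, name: str = "example") -> bool:
    """Return True when `token` is `name` and is directly preceded by '@'.

    The '@' prefix is not part of the token content, so it has to be read
    from the source bytes just before the token's start offset.
    """
    start = token["start"]
    return (
        token["content"] == name
        and start > 0  # a token at offset 0 has no preceding byte
        and source[start - 1:start] == b"@"
    )
```

Because a string literal such as `"@example"` is a single string token, its content never equals the bare word `example`, which is how a check like this avoids the false positives that text matching produced.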

Fix duplicate result bug:

- Use full original text for matching instead of just the result line
- This ensures unique matches even when multiple examples have the same result

Improve error handling:

- Use `do -i` with `complete` to capture subprocess errors properly
- Skip failed examples instead of corrupting the file with error messages
- Print a warning to stderr with code and error details

Module name stripping reviewed and verified working correctly.

Add 13 new tests covering:

find-examples (7 tests):

- Basic @example detection
- Multiple @examples in file
- @example inside string (ignored)
- @example without --result (skipped)
- Empty input handling
- Malformed @example handling
- Multiline code extraction

execute-example (3 tests):

- Simple expression execution
- Error handling (returns error record)
- Multiline result handling

examples-update (3 tests):

- Updates result values correctly
- Handles multiple examples
- Preserves file when no examples

Add three new AST behavior documentation files:

string-literals.nu:

- Single/double quoted strings (shape_string)
- Interpolated strings (shape_string_interpolation with nested tokens)
- Raw strings (shape_raw_string)
- Backtick strings (shape_external)
- Multiline strings, empty strings

operators.nu:

- Arithmetic operators (+, -, *, /, **)
- Comparison operators (==, !=, <, >)
- Logical operators (and, or, not)
- Range operators (.., ..<)
- Pipeline operator (shape_pipe)

variables.nu:

- Variable declaration (let/mut with shape_vardecl)
- Variable references (shape_variable vs shape_garbage)
- Environment variables ($env.X split into shape_variable + shape_string)
- Special variables ($in, $nu)
- Type annotations, variable shadowing

Mark all four examples-update improvement tasks as completed:

- 001: AST-based find-examples (commit 6808e50)
- 002: Fix reliability bugs (commit faac906)
- 003: Add unit tests (commit 2662ff5)
- 004: Add AST test cases (commit 7f4491b)

These internal commands need to be exported per project convention (all commands in commands.nu are exported for testing).

- Use sentinel tokens [{end: 0}] and [{start: len}] to handle leading, inter-token, and trailing gaps in a single pass
- Remove redundant dead code in classify-gap (unreachable branch)
- Reduce ast-complete from ~60 to ~25 lines
- Reduce classify-gap from ~25 to ~10 lines
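The sentinel idea above can be sketched in Python. This is a hedged illustration rather than the actual ast-complete code: the function name `complete_tokens`, the record fields, and the hand-written tokens in the usage lines are assumptions, and every gap is labeled `shape_gap` here instead of being classified.

```python
def complete_tokens(source: bytes, tokens: list[dict]) -> list[dict]:
    """Fill every uncovered byte range between tokens with a gap token.

    A leading sentinel {"end": 0} and a trailing sentinel {"start": len}
    let one loop over adjacent pairs handle leading, inter-token, and
    trailing gaps without special cases.
    """
    ordered = sorted(tokens, key=lambda t: t["start"])
    padded = [{"end": 0}] + ordered + [{"start": len(source)}]
    completed = []
    for prev, cur in zip(padded, padded[1:]):
        if cur["start"] > prev["end"]:
            completed.append({
                "content": source[prev["end"]:cur["start"]].decode("utf-8"),
                "shape": "shape_gap",  # a real version would classify the gap
                "start": prev["end"],
                "end": cur["start"],
            })
        if "content" in cur:  # skip the trailing sentinel itself
            completed.append(cur)
    return completed


# Hand-written tokens for illustration; not captured `ast --flatten` output.
source = b" let x = 1; ls "
tokens = [
    {"content": "let", "shape": "shape_internalcall", "start": 1, "end": 4},
    {"content": "x", "shape": "shape_vardecl", "start": 5, "end": 6},
    {"content": "1", "shape": "shape_int", "start": 9, "end": 10},
    {"content": "ls", "shape": "shape_internalcall", "start": 12, "end": 14},
]
completed = complete_tokens(source, tokens)
# Complete byte coverage: joining all contents reproduces the source.
rebuilt = "".join(t["content"] for t in completed)
```

Comparing `rebuilt` with the original source is a cheap invariant check that no byte was dropped or duplicated.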

- Emphasize using `nu toolkit.nu test` (not separate commands)
- Add `--update` flag documentation
- Remove outdated test count

Reduce code from 21 to 14 lines by inlining variables and using the idiomatic where/each pattern instead of each/if/compact.

…ection

Replace manual byte-checking for the @ prefix with ast-complete, which exposes @ as shape_gap tokens. This simplifies the @example detection logic:

- Check for a shape_gap ending with "@" followed by an "example" token
- Handle gaps that include preceding newlines (e.g., "\n\n@")
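A minimal Python sketch of the pattern described above, assuming a token list that already contains synthetic gap tokens. The function name `find_attribute_indices` and the record fields are hypothetical, not the PR's Nushell code.

```python
def find_attribute_indices(tokens: list[dict], name: str = "example") -> list[int]:
    """Return indices of `name` tokens that directly follow a gap ending in '@'.

    Using endswith("@") rather than equality covers gaps that also hold
    preceding newlines, such as "\n\n@".
    """
    return [
        index + 1
        for index, (gap, token) in enumerate(zip(tokens, tokens[1:]))
        if gap["shape"] == "shape_gap"
        and gap["content"].endswith("@")
        and token["content"] == name
    ]
```

Pairing each token with its successor keeps the check local to two adjacent tokens, so no byte offsets need to be inspected at all.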

New command that splits source code into individual statements using AST analysis. Uses ast-complete to identify statement boundaries (semicolons and newlines at top level). Correctly handles nested blocks: newlines inside blocks don't create new statements. Returns a table with statement text and byte positions for precise extraction.
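The boundary rule described above (semicolons and newlines split statements only at nesting depth zero) can be sketched in Python. This is a simplified illustration under stated assumptions, not the split-statements implementation: depth is tracked by counting braces in block and closure tokens, gap tokens that start with a newline are not handled, and `tokens_from_pieces` is a hypothetical helper that builds hand-written tokens.

```python
BLOCK_SHAPES = {"shape_block", "shape_closure"}


def tokens_from_pieces(pieces: list[tuple[str, str]]) -> list[dict]:
    """Hypothetical helper: assign consecutive byte spans to (content, shape) pairs."""
    tokens, pos = [], 0
    for content, shape in pieces:
        size = len(content.encode("utf-8"))
        tokens.append({"content": content, "shape": shape, "start": pos, "end": pos + size})
        pos += size
    return tokens


def split_statements(source: bytes, tokens: list[dict]) -> list[dict]:
    """Split on top-level semicolons and newlines; return text plus byte spans."""
    statements, depth, start, end = [], 0, None, None

    def flush() -> None:
        nonlocal start, end
        if start is not None:
            text = source[start:end].decode("utf-8")
            statements.append({"statement": text, "start": start, "end": end})
        start = end = None

    for tok in tokens:
        shape, text = tok["shape"], tok["content"]
        newline_ws = shape == "shape_whitespace" and "\n" in text
        if depth == 0 and (shape == "shape_semicolon" or newline_ws):
            flush()
            continue
        if shape in BLOCK_SHAPES:
            # "{}" in one token nets to zero, so self-contained blocks are fine.
            depth += text.count("{") - text.count("}")
        if shape == "shape_whitespace":
            continue  # whitespace never starts or extends a statement span
        if start is None:
            start = tok["start"]
        end = tok["end"]
    flush()
    return statements


pieces = [
    ("def", "shape_internalcall"), (" ", "shape_whitespace"),
    ("f", "shape_string"), (" ", "shape_whitespace"),
    ("[]", "shape_signature"), (" ", "shape_whitespace"),
    ("{", "shape_closure"), ("\n  ", "shape_whitespace"),
    ("1", "shape_int"), ("\n", "shape_whitespace"),
    ("}", "shape_closure"), ("\n", "shape_whitespace"),
    ("let", "shape_internalcall"), (" ", "shape_whitespace"),
    ("x", "shape_vardecl"), (" = ", "shape_assignment"),
    ("2", "shape_int"), (";", "shape_semicolon"),
    (" ", "shape_whitespace"), ("ls", "shape_internalcall"),
]
source = "".join(content for content, _ in pieces).encode("utf-8")
result = split_statements(source, tokens_from_pieces(pieces))
```

In this sample the newlines inside the `def` body do not split anything, while the newline after the closing brace and the semicolon after `2` do.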

…pe detection

Replace line-based def detection with split-statements, which provides:

- Accurate statement boundaries via AST analysis
- Proper scope ranges (start, end) for each def
- Better handling of multi-line def signatures

Also fix split-statements to:

- Handle self-contained blocks like {} with no net depth change
- Recognize shape_gap starting with a newline as a statement boundary (comments are bundled into gaps by AST)

…mands

Replace manual byte-checking for the @ prefix with ast-complete pattern matching. Now uses the same approach as find-examples: detect a shape_gap ending with @ followed by an attribute token.

Also removes the unused code_bytes variable, since all AST operations now use ast-complete.

Document the ast-complete and split-statements work:

- Problem: ast --flatten omits semicolons, pipes, @, whitespace
- Solution: ast-complete fills gaps with synthetic tokens
- Built split-statements on top for statement boundary detection
- Refactored find-examples and list-module-commands to use these

Future work outlined:

- Document ast --json output with test cases
- General-purpose ast --json parser
- Pipeline analysis tool
- History command parser for nushell-history-based-completions

Adds a new section outlining the first step for future work: creating test cases to document ast --json behavior before building parsers on top of it. This follows the same literate programming approach used for ast --flatten documentation.

Add test case files documenting Nushell's `ast --json` behavior using literate programming annotations. Covers basic output structure, command calls with arguments/flags, blocks/closures/control flow, and span mapping.

…bstring)

Add detailed documentation comparing two approaches for extracting source text from AST spans, recommending `bytes at` for its semantic match with AST's exclusive end convention.
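As a language-neutral illustration of why span extraction should work on bytes, here is a small Python sketch. It does not reproduce the Nushell commands compared in the documentation; `text_at` is a hypothetical helper, and the example assumes, as the commit messages state, that AST spans are byte positions with an exclusive end.

```python
def text_at(source: str, start: int, end: int) -> str:
    """Return the source text for the byte span [start, end), end exclusive."""
    return source.encode("utf-8")[start:end].decode("utf-8")


source = 'let s = "café"; ls'
data = source.encode("utf-8")

# Byte span of the trailing `ls` command.
start = data.index(b"ls")
end = start + len(b"ls")

by_bytes = text_at(source, start, end)  # slices the encoded bytes
by_chars = source[start:end]            # slices by character index instead
```

Because "é" occupies two bytes in UTF-8, the character-indexed slice is shifted by one position and yields the wrong text, while the byte-indexed slice returns the token exactly.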

Updated the todo with findings showing that ast --json is better suited than ast --flatten for parsing history commands. Added a feature comparison table, example outputs for common patterns (flags, parameters, positional args), and a mapping to the database schema for the history-based-completions project.

claude and others added 30 commits January 2, 2026 16:55

chore: format

dca7305

feat: add tests for ast-complete

db5b591

chore: remove todo/ from gitignore

bf5d280

add todos

8046b00

test: update embeds-update fixture (random int output)

f357bbe

claude and others added 6 commits January 5, 2026 21:46

docs: add todo for extract-pipelines command using ast --json

b27885f

fix: use cross-platform temp path in tests

58b87d6

Replace hardcoded '/tmp/' paths with $nu.temp-path for Windows compatibility.

fix: make example tests order-independent

edbb047

Sort actual and expected values before comparison to handle platform differences in glob ordering.

fix: sort integration test output for cross-platform consistency

541295b

Add sort-by to dependencies integration tests to ensure deterministic output across platforms (macOS vs Windows glob ordering).

fix: normalize CRLF in list-module-commands for Windows

a65f027

Add CRLF to LF conversion when reading files on Windows to ensure byte positions from AST parsing are correct.
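A short Python sketch of why line endings affect byte positions. It illustrates the general problem only; `normalize_newlines` is a hypothetical name, and the PR's actual fix lives in the Nushell list-module-commands code.

```python
def normalize_newlines(text: str) -> str:
    """Convert CRLF line endings to LF so byte offsets match across platforms."""
    return text.replace("\r\n", "\n")


crlf_source = "def a [] {}\r\ndef b [] {}"
lf_source = normalize_newlines(crlf_source)

# The second def starts one byte later for every preceding CRLF.
offset_crlf = crlf_source.encode("utf-8").index(b"def b")
offset_lf = lf_source.encode("utf-8").index(b"def b")
```

If spans are computed on one form of the text and applied to the other, each earlier line ending shifts the result by one byte, so normalizing once on read keeps both sides consistent.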

maxim-uvarov merged commit d244a97 into main Jan 10, 2026
2 checks passed

maxim-uvarov deleted the ast branch January 10, 2026 01:52

maxim-uvarov merged 36 commits into main from ast