Merged
25 commits
285c3a8
Full architecture overhaul: bidirectional HCL2 ↔ JSON pipeline with t…
kkozik-amplify Mar 9, 2026
8180b34
add CLAUDE.md (#260)
kkozik-amplify Mar 9, 2026
d300020
update package metadata: (#263)
kkozik-amplify Mar 9, 2026
989b9f0
add reconstructor unit tests (#265)
kkozik-amplify Mar 9, 2026
bdea212
`BaseFormatter` - fixes to complex function args formatting; (#266)
kkozik-amplify Mar 9, 2026
5bf7a9a
CHANGELOG.md - 8.0.0 notes (#264)
kkozik-amplify Mar 9, 2026
a80e8e2
update publish.yml workflow so that `rc` packages are uploaded to pyp…
kkozik-amplify Mar 9, 2026
4911962
add function tuples round-trip test suite (#268)
kkozik-amplify Mar 10, 2026
3efcd37
SerializationOptions - add an option to strip string quotes
kkozik-amplify Mar 10, 2026
f60b01c
add descriptions to each option of SerializerOptions, DeserializerOpt…
kkozik-amplify Mar 10, 2026
2dd72b7
Add postlexer to support multiline binary operators and ternary expre…
kkozik-amplify Mar 11, 2026
43dd96f
more robust whitespace handling in reconstruction (#271)
kkozik-amplify Mar 11, 2026
c886e46
update changelog for 8.0.0rc2 (#273)
kkozik-amplify Mar 11, 2026
0c30780
hq: read-only query CLI for HCL2 files (#277)
kkozik-amplify Mar 31, 2026
70a5307
Fix missing space between binary operator and unary operand in recons…
kkozik-amplify Mar 31, 2026
e42f880
Add template directives support (%{if}, %{for}) in quoted strings (#276)
kkozik-amplify Mar 31, 2026
133efa2
add roadmap section to README (#279)
kkozik-amplify Mar 31, 2026
51b3861
agent-friendly conversion CLIs (#274)
kkozik-amplify Apr 2, 2026
b8f1dfe
Add .[] as jq-compatible alias for [*] and document jq interop (#283)
kkozik-amplify Apr 4, 2026
e9be26b
Fix comment serialization: multi-token NL_OR_COMMENT and classificati…
kkozik-amplify Apr 7, 2026
68ad804
Add adjacent comment support to hq BlockView and AttributeView querie…
kkozik-amplify Apr 7, 2026
5da4887
Refactor hq CLI for readability and reduced complexity (#284)
kkozik-amplify Apr 7, 2026
33590fc
Consolidate v8 RC changelog into v8.1.0 release (2026-04-07) (#285)
kkozik-amplify Apr 7, 2026
23dd269
Add v7-to-v8 migration guide and use absolute GitHub links in README …
kkozik-amplify Apr 7, 2026
472f033
Prepare v8.1.1 release (#288)
kkozik-amplify Apr 7, 2026
10 changes: 8 additions & 2 deletions .coveragerc
@@ -1,9 +1,15 @@
[run]
branch = true
omit =
hcl2/__main__.py
hcl2/lark_parser.py
hcl2/version.py
hcl2/__main__.py
hcl2/__init__.py
hcl2/rules/__init__.py
cli/__init__.py

[report]
show_missing = true
fail_under = 80
fail_under = 95
exclude_lines =
raise NotImplementedError
14 changes: 9 additions & 5 deletions .github/ISSUE_TEMPLATE/hcl2-parsing-error.md
@@ -1,27 +1,31 @@
---
______________________________________________________________________

name: HCL2 parsing error
about: Template for reporting a bug related to parsing HCL2 code
title: ''
labels: bug
assignees: kkozik-amplify

---
______________________________________________________________________

**Describe the bug**

A clear and concise description of what the bug is.

**Software:**
- OS: [macOS / Windows / Linux]
- Python version (e.g. 3.9.21)
- python-hcl2 version (e.g. 7.0.0)

- OS: \[macOS / Windows / Linux\]
- Python version (e.g. 3.9.21)
- python-hcl2 version (e.g. 7.0.0)

**Snippet of HCL2 code causing the unexpected behaviour:**

```terraform
locals {
foo = "bar"
}
```

**Expected behavior**

A clear and concise description of what you expected to happen, e.g. python dictionary or JSON you expected to receive as a result of parsing.
2 changes: 1 addition & 1 deletion .github/workflows/publish.yml
@@ -2,7 +2,7 @@
name: Publish
on:
release:
types: [released]
types: [published]

jobs:
build-publish:
1 change: 1 addition & 0 deletions .pre-commit-config.yaml
@@ -6,6 +6,7 @@ repos:
rev: v4.3.0
hooks:
- id: trailing-whitespace
exclude: ^test/integration/(hcl2_reconstructed|specialized)/
- id: end-of-file-fixer
- id: check-added-large-files
- id: no-commit-to-branch # Prevent commits directly to master
38 changes: 38 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,44 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.

- Nothing yet.

## \[8.1.1\] - 2026-04-07

### Added

- v7-to-v8 migration guide and absolute GitHub links in README docs table. ([#287](https://github.com/amplify-education/python-hcl2/pull/287))

## \[8.1.0\] - 2026-04-07

### Added

- Full architecture overhaul: bidirectional HCL2 ↔ JSON pipeline with typed rule classes. ([#203](https://github.com/amplify-education/python-hcl2/pull/203))
- `hq` read-only query CLI for HCL2 files ([#277](https://github.com/amplify-education/python-hcl2/pull/277))
- Agent-friendly conversion CLIs: `hcl2tojson` and `jsontohcl2` ([#274](https://github.com/amplify-education/python-hcl2/pull/274))
- Add template directives support (`%{if}`, `%{for}`) in quoted strings ([#276](https://github.com/amplify-education/python-hcl2/pull/276))
- Support loading comments ([#134](https://github.com/amplify-education/python-hcl2/issues/134))
- CLAUDE.md ([#260](https://github.com/amplify-education/python-hcl2/pull/260))

### Fixed

- Ternary with strings parse error ([#55](https://github.com/amplify-education/python-hcl2/issues/55))
- "No terminal matches '|' in the current parser context" when parsing multi-line conditional ([#142](https://github.com/amplify-education/python-hcl2/issues/142))
- reverse_transform not working with object-type variables ([#231](https://github.com/amplify-education/python-hcl2/issues/231))
- reverse_transform not handling nested functions ([#235](https://github.com/amplify-education/python-hcl2/issues/235))
- `writes` omits quotes around map keys with `/` ([#236](https://github.com/amplify-education/python-hcl2/issues/236))
- Operator precedence bug ([#248](https://github.com/amplify-education/python-hcl2/issues/248))
- Empty string dictionary keys can't be parsed twice ([#249](https://github.com/amplify-education/python-hcl2/issues/249))
- jsonencode not deserialized correctly ([#250](https://github.com/amplify-education/python-hcl2/issues/250))
- Literal string "string" incorrectly quoted ([#251](https://github.com/amplify-education/python-hcl2/issues/251))
- Interpolation literals added to locals/variables in maps ([#252](https://github.com/amplify-education/python-hcl2/issues/252))
- Object literal expression can't be serialized ([#253](https://github.com/amplify-education/python-hcl2/issues/253))
- Heredocs should interpret backslash literally ([#262](https://github.com/amplify-education/python-hcl2/issues/262))
- Parsing a multi-line multi-conditional expression causes exception — Unexpected token Token('QMARK', '?') ([#269](https://github.com/amplify-education/python-hcl2/issues/269))
- Parsing error for multiline binary operators ([#246](https://github.com/amplify-education/python-hcl2/pull/246))

### Changed

- Updated package metadata: development status, dropped Python 3.7 support. ([#263](https://github.com/amplify-education/python-hcl2/pull/263))

## \[7.3.1\] - 2025-07-24

### Fixed
204 changes: 204 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,204 @@
# HCL2 Parser — CLAUDE.md

## Pipeline

```
Forward: HCL2 Text → [PostLexer] → Lark Parse Tree → LarkElement Tree → Python Dict/JSON
Reverse: Python Dict/JSON → LarkElement Tree → Lark Tree → HCL2 Text
Direct: HCL2 Text → [PostLexer] → Lark Parse Tree → LarkElement Tree → Lark Tree → HCL2 Text
```

The **Direct** pipeline (`parse_to_tree` → `transform` → `to_lark` → `reconstruct`) skips serialization to dict, so all IR nodes (including `NewLineOrCommentRule` nodes for whitespace/comments) directly influence the reconstructed output. Any information discarded before the IR is lost in this pipeline.

## Module Map

| Module | Role |
|---|---|
| `hcl2/hcl2.lark` | Lark grammar definition |
| `hcl2/api.py` | Public API (`load/loads/dump/dumps` + intermediate stages) |
| `hcl2/postlexer.py` | Token stream transforms between lexer and parser |
| `hcl2/parser.py` | Lark parser factory with caching |
| `hcl2/transformer.py` | Lark parse tree → LarkElement tree |
| `hcl2/deserializer.py` | Python dict → LarkElement tree |
| `hcl2/formatter.py` | Whitespace alignment and spacing on LarkElement trees |
| `hcl2/reconstructor.py` | LarkElement tree → HCL2 text via Lark |
| `hcl2/builder.py` | Programmatic HCL document construction |
| `hcl2/walk.py` | Generic tree-walking primitives for the LarkElement IR tree |
| `hcl2/utils.py` | `SerializationOptions`, `SerializationContext`, string helpers |
| `hcl2/const.py` | Constants: `IS_BLOCK`, `COMMENTS_KEY`, `INLINE_COMMENTS_KEY` |
| `cli/helpers.py` | File/directory/stdin conversion helpers |
| `cli/hcl_to_json.py` | `hcl2tojson` entry point |
| `cli/json_to_hcl.py` | `jsontohcl2` entry point |
| `cli/hq.py` | `hq` CLI entry point — query dispatch, formatting, optional operator |
| `hcl2/query/__init__.py` | Public query API exports |
| `hcl2/query/_base.py` | `NodeView` base class, view registry, `view_for()` factory |
| `hcl2/query/body.py` | `DocumentView`, `BodyView` facades for top-level and body queries |
| `hcl2/query/blocks.py` | `BlockView` facade for block queries |
| `hcl2/query/attributes.py` | `AttributeView` facade for attribute queries |
| `hcl2/query/containers.py` | `TupleView`, `ObjectView` facades for container queries |
| `hcl2/query/expressions.py` | `ConditionalView` facade for conditional expressions |
| `hcl2/query/functions.py` | `FunctionCallView` facade for function call queries |
| `hcl2/query/for_exprs.py` | `ForTupleView`, `ForObjectView` facades for for-expressions |
| `hcl2/query/path.py` | Structural path parser (`PathSegment`, `parse_path`, `[select()]`, `type:name`) |
| `hcl2/query/resolver.py` | Path resolver — segment-by-segment with label depth, type filter |
| `hcl2/query/pipeline.py` | Pipe operator — `split_pipeline`, `classify_stage`, `execute_pipeline` |
| `hcl2/query/builtins.py` | Built-in transforms: `keys`, `values`, `length` |
| `hcl2/query/diff.py` | Structural diff between two HCL documents |
| `hcl2/query/predicate.py` | `select()` predicate tokenizer, recursive descent parser, evaluator |
| `hcl2/query/safe_eval.py` | AST-validated Python expression eval for hybrid/eval modes |
| `hcl2/query/introspect.py` | `--describe` and `--schema` output generation |

`hcl2/__main__.py` is a thin wrapper that imports `cli.hcl_to_json:main`.

### Rules (one class per grammar rule)

| File | Domain |
|---|---|
| `rules/abstract.py` | `LarkElement`, `LarkRule`, `LarkToken` base classes |
| `rules/tokens.py` | `StringToken` (cached factory), `StaticStringToken`, punctuation constants |
| `rules/base.py` | `StartRule`, `BodyRule`, `BlockRule`, `AttributeRule` |
| `rules/containers.py` | `TupleRule`, `ObjectRule`, `ObjectElemRule`, `ObjectElemKeyRule` |
| `rules/expressions.py` | `ExprTermRule`, `BinaryOpRule`, `UnaryOpRule`, `ConditionalRule` |
| `rules/literal_rules.py` | `IntLitRule`, `FloatLitRule`, `IdentifierRule`, `KeywordRule` |
| `rules/strings.py` | `StringRule`, `InterpolationRule`, `HeredocTemplateRule`, `TemplateStringRule` |
| `rules/functions.py` | `FunctionCallRule`, `ArgumentsRule` |
| `rules/indexing.py` | `GetAttrRule`, `SqbIndexRule`, splat rules |
| `rules/for_expressions.py` | `ForTupleExprRule`, `ForObjectExprRule`, `ForIntroRule`, `ForCondRule` |
| `rules/directives.py` | `TemplateIfRule`, `TemplateForRule`, and flat directive start/end rules |
| `rules/whitespace.py` | `NewLineOrCommentRule`, `InlineCommentMixIn` |

## Public API (`api.py`)

Follows the `json` module convention. All option parameters are keyword-only.

- `load/loads` — HCL2 text → Python dict
- `dump/dumps` — Python dict → HCL2 text
- `query` — HCL2 text/file → `DocumentView` for structured queries
- Intermediate stages: `parse/parses`, `parse_to_tree/parses_to_tree`, `transform`, `serialize`, `from_dict`, `from_json`, `reconstruct`

### Option Dataclasses

**`SerializationOptions`** (LarkElement → dict):
`with_comments`, `with_meta`, `wrap_objects`, `wrap_tuples`, `explicit_blocks`, `preserve_heredocs`, `force_operation_parentheses`, `preserve_scientific_notation`, `strip_string_quotes`

**`DeserializerOptions`** (dict → LarkElement):
`heredocs_to_strings`, `strings_to_heredocs`, `object_elements_colon`, `object_elements_trailing_comma`

**`FormatterOptions`** (whitespace/alignment):
`indent_length`, `open_empty_blocks`, `open_empty_objects`, `open_empty_tuples`, `vertically_align_attributes`, `vertically_align_object_elements`

## CLI

Console scripts defined in `pyproject.toml`. All three CLIs accept positional `PATH` arguments (files, directories, glob patterns, or `-` for stdin). When no `PATH` is given, stdin is read by default (like `jq`).

### Exit Codes

All CLIs use structured error output (plain text to stderr) and distinct exit codes:

| Code | `hcl2tojson` | `jsontohcl2` | `hq` |
|------|---|---|---|
| 0 | Success | Success | Success |
| 1 | Partial (some skipped) | JSON/encoding parse error | No results |
| 2 | All unparsable | Bad HCL structure | Parse error |
| 3 | — | — | Query error |
| 4 | I/O error | I/O error | I/O error |
| 5 | — | Differences found (`--diff` / `--semantic-diff`) | — |
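
For scripts that wrap these CLIs, the table above can be mirrored as a lookup. This is a minimal sketch, not part of the package: `EXIT_CODES` and `describe_exit` are hypothetical names, and the codes and meanings are copied from the table.

```python
# Hypothetical helper for wrapper scripts; not part of python-hcl2.
# Codes and meanings are copied from the exit-code table.
EXIT_CODES = {
    "hcl2tojson": {
        0: "Success",
        1: "Partial (some skipped)",
        2: "All unparsable",
        4: "I/O error",
    },
    "jsontohcl2": {
        0: "Success",
        1: "JSON/encoding parse error",
        2: "Bad HCL structure",
        4: "I/O error",
        5: "Differences found (--diff / --semantic-diff)",
    },
    "hq": {
        0: "Success",
        1: "No results",
        2: "Parse error",
        3: "Query error",
        4: "I/O error",
    },
}


def describe_exit(cli: str, code: int) -> str:
    """Return the documented meaning of an exit code for the given CLI."""
    return EXIT_CODES[cli].get(code, f"Undocumented exit code {code}")
```

A wrapper could pass `subprocess.run(...).returncode` to `describe_exit` to decide whether a non-zero exit is fatal. For `hq`, for example, code 1 means an empty result rather than a failure.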

### `hcl2tojson`

```
hcl2tojson file.tf # single file to stdout
hcl2tojson --ndjson dir/ # directory → NDJSON to stdout
hcl2tojson a.tf b.tf -o out/ # multiple files to output dir
hcl2tojson --ndjson 'modules/**/*.tf' # glob + NDJSON streaming
hcl2tojson --only resource,module file.tf # block type filtering
hcl2tojson --exclude variable file.tf # exclude block types
hcl2tojson --fields cpu,memory file.tf # field projection
hcl2tojson --compact file.tf # single-line JSON
hcl2tojson -q dir/ -o out/ # quiet (no stderr progress)
echo 'x = 1' | hcl2tojson # stdin (no args needed)
```

Key flags: `--ndjson`, `--compact`, `--only`/`--exclude`, `--fields`, `-q`/`--quiet`, `--json-indent N`, `--with-meta`, `--with-comments`, `--strip-string-quotes` (breaks round-trip). Multi-file NDJSON adds a `__file__` provenance key to each object.
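
The `__file__` provenance key lets a consumer regroup a multi-file NDJSON stream by source file. The sketch below assumes each line is one JSON object carrying `__file__` at the top level; the rest of each object's shape is not specified here, and `group_by_file` is a hypothetical helper.

```python
import json
from collections import defaultdict


def group_by_file(ndjson_text: str) -> dict:
    """Group NDJSON objects by their `__file__` key (hypothetical helper)."""
    grouped = defaultdict(list)
    for line in ndjson_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between records
        obj = json.loads(line)
        # Assumption: the provenance key sits at the top level of each object.
        source = obj.pop("__file__", None)
        grouped[source].append(obj)
    return dict(grouped)


# Illustrative input only; real output depends on the parsed HCL.
sample = '{"__file__": "a.tf", "x": 1}\n{"__file__": "b.tf", "y": 2}\n'
by_file = group_by_file(sample)
```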

### `jsontohcl2`

```
jsontohcl2 file.json # single file to stdout
jsontohcl2 --diff original.tf modified.json # preview text changes
jsontohcl2 --semantic-diff original.tf modified.json # semantic-only changes
jsontohcl2 --semantic-diff original.tf --diff-json m.json # semantic diff as JSON
jsontohcl2 --dry-run file.json # convert without writing
jsontohcl2 --fragment - # attribute snippets from stdin
jsontohcl2 --indent 4 --no-align file.json
```

Key flags: `--diff ORIGINAL`, `--semantic-diff ORIGINAL`, `--diff-json`, `--dry-run`, `--fragment`, `-q`/`--quiet`, `--indent N`, `--no-align`, `--colon-separator`.

Add new options as `parser.add_argument()` calls in the relevant entry point module.

## PostLexer (`postlexer.py`)

Lark's `postlex` parameter accepts a single object with a `process(stream)` method that transforms the token stream between the lexer and LALR parser. The `PostLexer` class is designed for extensibility: each transformation is a private method that accepts and yields tokens, and `process()` chains them together.

Current passes:

- `_merge_newlines_into_operators`

To add a new pass: create a private method with the same `(self, stream) -> generator` signature, and add a `yield from` call in `process()`.
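
The chaining pattern can be sketched without Lark. Everything here is illustrative: `SketchPostLexer` and both passes are hypothetical, tokens are plain `(type, value)` tuples rather than Lark `Token` objects, and the real `_merge_newlines_into_operators` pass is not reproduced.

```python
from typing import Iterable, Iterator, Tuple

Token = Tuple[str, str]  # (type, value) stand-in for a Lark token


class SketchPostLexer:
    """Toy post-lexer: each pass is a generator that accepts and yields tokens."""

    # Lark reads this attribute from postlex objects; empty here.
    always_accept = ()

    def process(self, stream: Iterable[Token]) -> Iterator[Token]:
        # Passes are chained by nesting; adding a pass means wrapping once more.
        yield from self._drop_empty(self._strip_values(stream))

    def _strip_values(self, stream: Iterable[Token]) -> Iterator[Token]:
        # Hypothetical pass: trim surrounding whitespace from token values.
        for kind, value in stream:
            yield kind, value.strip()

    def _drop_empty(self, stream: Iterable[Token]) -> Iterator[Token]:
        # Hypothetical pass: discard tokens whose value is now empty.
        for kind, value in stream:
            if value:
                yield kind, value


tokens = [("NAME", " x "), ("NL", "  "), ("EQ", "=")]
result = list(SketchPostLexer().process(tokens))
```

Because every pass is lazy, the token stream is consumed once regardless of how many passes are chained.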

## Hard Rules

These are project-specific constraints that must not be violated:

1. **Always use the LarkElement IR.** Never transform directly from Lark parse tree to Python dict or vice versa.
1. **Block vs object distinction.** Use `__is_block__` markers (`const.IS_BLOCK`) to preserve semantic intent during round-trips. The deserializer must distinguish blocks from regular objects.
1. **Bidirectional completeness.** Every serialization path must have a corresponding deserialization path. Test round-trip integrity: Parse → Serialize → Deserialize → Serialize produces identical results.
1. **One grammar rule = one `LarkRule` class.** Each class implements `lark_name()`, typed property accessors, `serialize()`, and declares `_children_layout: Tuple[...]` (annotation only, no assignment) to document child structure.
1. **Token caching.** Use the `StringToken` factory in `rules/tokens.py` — never create token instances directly.
1. **Interpolation context.** `${...}` generation depends on nesting depth — always pass and respect `SerializationContext`.
1. **Update both directions.** When adding language features, update transformer.py, deserializer.py, formatter.py and reconstructor.py.
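
Rule 2 can be illustrated with a small check. This sketch assumes the marker is stored as a `"__is_block__": True` entry inside the block's dict and that `const.IS_BLOCK` holds that key string; both are assumptions, and `classify` is a hypothetical helper rather than deserializer code.

```python
# Assumed value of const.IS_BLOCK, based on the `__is_block__` marker in rule 2.
IS_BLOCK = "__is_block__"


def classify(value) -> str:
    """Classify a deserializer input value as block, object, or other (sketch)."""
    if isinstance(value, dict):
        # Only an explicit True marker counts as a block in this sketch.
        return "block" if value.get(IS_BLOCK) is True else "object"
    return "other"


block_like = {"ami": "abc", IS_BLOCK: True}
object_like = {"ami": "abc"}
```

Without the marker the two dicts are indistinguishable, which is why the distinction cannot be recovered after it is dropped.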

## Adding a New Language Construct

1. Add grammar rules to `hcl2.lark`
1. If the new construct creates LALR ambiguities with `NL_OR_COMMENT`, add a postlexer pass in `postlexer.py`
1. Create rule class(es) in the appropriate `rules/` file
1. Add transformer method(s) in `transformer.py`
1. Implement `serialize()` in the rule class
1. Update `deserializer.py`, `formatter.py` and `reconstructor.py` for round-trip support

## Testing

Framework: `unittest.TestCase` (not pytest).

```
python -m unittest discover -s test -p "test_*.py" -v
```

**Unit tests** (`test/unit/`): instantiate rule objects directly (no parsing).

- `rules/` — one file per rules module
- `cli/` — one file per CLI module
- `test_*.py` — tests for corresponding files from `hcl2/` directory

Use concrete stubs when testing ABCs (e.g., `StubExpression(ExpressionRule)`).
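
A minimal sketch of the stub pattern, using a stand-in ABC instead of the real `ExpressionRule` (whose abstract methods are not listed here). `ExpressionBase` and its `describe` method are hypothetical.

```python
import unittest
from abc import ABC, abstractmethod


class ExpressionBase(ABC):
    """Stand-in for an abstract rule class; not the real ExpressionRule."""

    @abstractmethod
    def serialize(self) -> str:
        ...

    def describe(self) -> str:
        # Concrete behaviour on the ABC that the test wants to exercise.
        return f"expr:{self.serialize()}"


class StubExpression(ExpressionBase):
    """Concrete stub: implements only what the ABC requires."""

    def serialize(self) -> str:
        return "stub"


class TestExpressionBase(unittest.TestCase):
    def test_describe_uses_serialize(self):
        self.assertEqual(StubExpression().describe(), "expr:stub")

    def test_abc_cannot_be_instantiated(self):
        with self.assertRaises(TypeError):
            ExpressionBase()
```

The stub keeps the test focused on the base-class logic, with no parsing involved.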

**Integration tests** (`test/integration/`): full-pipeline tests with golden files.

- `test_round_trip.py` — iterates over all suites in `hcl2_original/`, tests HCL→JSON, JSON→JSON, JSON→HCL, and full round-trip
- `test_specialized.py` — feature-specific tests with golden files in `specialized/`

Always run the full round-trip test suite after any modification.

## Pre-commit Checks

Hooks are defined in `.pre-commit-config.yaml` (includes black, mypy, pylint, and others). All changed files must pass these checks before committing. When writing or modifying code:

- Format Python with **black** (Python 3.8 target).
- Ensure **mypy** and **pylint** pass. Pylint config is in `pylintrc`, scoped to `hcl2/` and `test/`.
- End files with a newline; strip trailing whitespace (except under `test/integration/(hcl2_reconstructed|specialized)/`).

## Keeping Docs Current

Update this file when architecture, modules, API surface, or testing conventions change. Also update `README.md` and the docs in `docs/` (`01_getting_started.md`, `02_querying.md`, `03_advanced_api.md`, `04_hq.md`, `05_hq_examples.md`) when changes affect the public API, CLI flags, or option fields.