diff --git a/.coveragerc b/.coveragerc index 4facabdc..6e581776 100644 --- a/.coveragerc +++ b/.coveragerc @@ -1,9 +1,15 @@ [run] branch = true omit = - hcl2/__main__.py hcl2/lark_parser.py + hcl2/version.py + hcl2/__main__.py + hcl2/__init__.py + hcl2/rules/__init__.py + cli/__init__.py [report] show_missing = true -fail_under = 80 +fail_under = 95 +exclude_lines = + raise NotImplementedError diff --git a/.github/ISSUE_TEMPLATE/hcl2-parsing-error.md b/.github/ISSUE_TEMPLATE/hcl2-parsing-error.md index 4837d3ff..1b526e9a 100644 --- a/.github/ISSUE_TEMPLATE/hcl2-parsing-error.md +++ b/.github/ISSUE_TEMPLATE/hcl2-parsing-error.md @@ -1,27 +1,31 @@ ---- +______________________________________________________________________ + name: HCL2 parsing error about: Template for reporting a bug related to parsing HCL2 code title: '' labels: bug assignees: kkozik-amplify ---- +______________________________________________________________________ **Describe the bug** A clear and concise description of what the bug is. **Software:** - - OS: [macOS / Windows / Linux] - - Python version (e.g. 3.9.21) - - python-hcl2 version (e.g. 7.0.0) + +- OS: \[macOS / Windows / Linux\] +- Python version (e.g. 3.9.21) +- python-hcl2 version (e.g. 7.0.0) **Snippet of HCL2 code causing the unexpected behaviour:** + ```terraform locals { foo = "bar" } ``` + **Expected behavior** A clear and concise description of what you expected to happen, e.g. python dictionary or JSON you expected to receive as a result of parsing. 
diff --git a/.github/workflows/publish.yml b/.github/workflows/publish.yml index f4242c57..75872e23 100644 --- a/.github/workflows/publish.yml +++ b/.github/workflows/publish.yml @@ -2,7 +2,7 @@ name: Publish on: release: - types: [released] + types: [published] jobs: build-publish: diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index 11b63555..ef43294d 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -6,6 +6,7 @@ repos: rev: v4.3.0 hooks: - id: trailing-whitespace + exclude: ^test/integration/(hcl2_reconstructed|specialized)/ - id: end-of-file-fixer - id: check-added-large-files - id: no-commit-to-branch # Prevent commits directly to master diff --git a/CHANGELOG.md b/CHANGELOG.md index 1f3590fd..0e2b5a61 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,6 +9,44 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0. - Nothing yet. +## \[8.1.1\] - 2026-04-07 + +### Added + +- v7-to-v8 migration guide and absolute GitHub links in README docs table. ([#287](https://github.com/amplify-education/python-hcl2/pull/287)) + +## \[8.1.0\] - 2026-04-07 + +### Added + +- Full architecture overhaul: bidirectional HCL2 ↔ JSON pipeline with typed rule classes. 
([#203](https://github.com/amplify-education/python-hcl2/pull/203)) +- `hq` read-only query CLI for HCL2 files ([#277](https://github.com/amplify-education/python-hcl2/pull/277)) +- Agent-friendly conversion CLIs: `hcl2tojson` and `jsontohcl2` ([#274](https://github.com/amplify-education/python-hcl2/pull/274)) +- Add template directives support (`%{if}`, `%{for}`) in quoted strings ([#276](https://github.com/amplify-education/python-hcl2/pull/276)) +- Support loading comments ([#134](https://github.com/amplify-education/python-hcl2/issues/134)) +- CLAUDE.md ([#260](https://github.com/amplify-education/python-hcl2/pull/260)) + +### Fixed + +- Ternary with strings parse error ([#55](https://github.com/amplify-education/python-hcl2/issues/55)) +- "No terminal matches '|' in the current parser context" when parsing multi-line conditional ([#142](https://github.com/amplify-education/python-hcl2/issues/142)) +- reverse_transform not working with object-type variables ([#231](https://github.com/amplify-education/python-hcl2/issues/231)) +- reverse_transform not handling nested functions ([#235](https://github.com/amplify-education/python-hcl2/issues/235)) +- `writes` omits quotes around map keys with `/` ([#236](https://github.com/amplify-education/python-hcl2/issues/236)) +- Operator precedence bug ([#248](https://github.com/amplify-education/python-hcl2/issues/248)) +- Empty string dictionary keys can't be parsed twice ([#249](https://github.com/amplify-education/python-hcl2/issues/249)) +- jsonencode not deserialized correctly ([#250](https://github.com/amplify-education/python-hcl2/issues/250)) +- Literal string "string" incorrectly quoted ([#251](https://github.com/amplify-education/python-hcl2/issues/251)) +- Interpolation literals added to locals/variables in maps ([#252](https://github.com/amplify-education/python-hcl2/issues/252)) +- Object literal expression can't be serialized ([#253](https://github.com/amplify-education/python-hcl2/issues/253)) +- Heredocs 
should interpret backslash literally ([#262](https://github.com/amplify-education/python-hcl2/issues/262)) +- Parsing a multi-line multi-conditional expression causes exception — Unexpected token Token('QMARK', '?') ([#269](https://github.com/amplify-education/python-hcl2/issues/269)) +- Parsing error for multiline binary operators ([#246](https://github.com/amplify-education/python-hcl2/pull/246)) + +### Changed + +- Updated package metadata: development status, dropped Python 3.7 support. ([#263](https://github.com/amplify-education/python-hcl2/pull/263)) + ## \[7.3.1\] - 2025-07-24 ### Fixed diff --git a/CLAUDE.md b/CLAUDE.md new file mode 100644 index 00000000..b89a9ec0 --- /dev/null +++ b/CLAUDE.md @@ -0,0 +1,204 @@ +# HCL2 Parser — CLAUDE.md + +## Pipeline + +``` +Forward: HCL2 Text → [PostLexer] → Lark Parse Tree → LarkElement Tree → Python Dict/JSON +Reverse: Python Dict/JSON → LarkElement Tree → Lark Tree → HCL2 Text +Direct: HCL2 Text → [PostLexer] → Lark Parse Tree → LarkElement Tree → Lark Tree → HCL2 Text +``` + +The **Direct** pipeline (`parse_to_tree` → `transform` → `to_lark` → `reconstruct`) skips serialization to dict, so all IR nodes (including `NewLineOrCommentRule` nodes for whitespace/comments) directly influence the reconstructed output. Any information discarded before the IR is lost in this pipeline. 
+ +## Module Map + +| Module | Role | +|---|---| +| `hcl2/hcl2.lark` | Lark grammar definition | +| `hcl2/api.py` | Public API (`load/loads/dump/dumps` + intermediate stages) | +| `hcl2/postlexer.py` | Token stream transforms between lexer and parser | +| `hcl2/parser.py` | Lark parser factory with caching | +| `hcl2/transformer.py` | Lark parse tree → LarkElement tree | +| `hcl2/deserializer.py` | Python dict → LarkElement tree | +| `hcl2/formatter.py` | Whitespace alignment and spacing on LarkElement trees | +| `hcl2/reconstructor.py` | LarkElement tree → HCL2 text via Lark | +| `hcl2/builder.py` | Programmatic HCL document construction | +| `hcl2/walk.py` | Generic tree-walking primitives for the LarkElement IR tree | +| `hcl2/utils.py` | `SerializationOptions`, `SerializationContext`, string helpers | +| `hcl2/const.py` | Constants: `IS_BLOCK`, `COMMENTS_KEY`, `INLINE_COMMENTS_KEY` | +| `cli/helpers.py` | File/directory/stdin conversion helpers | +| `cli/hcl_to_json.py` | `hcl2tojson` entry point | +| `cli/json_to_hcl.py` | `jsontohcl2` entry point | +| `cli/hq.py` | `hq` CLI entry point — query dispatch, formatting, optional operator | +| `hcl2/query/__init__.py` | Public query API exports | +| `hcl2/query/_base.py` | `NodeView` base class, view registry, `view_for()` factory | +| `hcl2/query/body.py` | `DocumentView`, `BodyView` facades for top-level and body queries | +| `hcl2/query/blocks.py` | `BlockView` facade for block queries | +| `hcl2/query/attributes.py` | `AttributeView` facade for attribute queries | +| `hcl2/query/containers.py` | `TupleView`, `ObjectView` facades for container queries | +| `hcl2/query/expressions.py` | `ConditionalView` facade for conditional expressions | +| `hcl2/query/functions.py` | `FunctionCallView` facade for function call queries | +| `hcl2/query/for_exprs.py` | `ForTupleView`, `ForObjectView` facades for for-expressions | +| `hcl2/query/path.py` | Structural path parser (`PathSegment`, `parse_path`, `[select()]`, 
`type:name`) | +| `hcl2/query/resolver.py` | Path resolver — segment-by-segment with label depth, type filter | +| `hcl2/query/pipeline.py` | Pipe operator — `split_pipeline`, `classify_stage`, `execute_pipeline` | +| `hcl2/query/builtins.py` | Built-in transforms: `keys`, `values`, `length` | +| `hcl2/query/diff.py` | Structural diff between two HCL documents | +| `hcl2/query/predicate.py` | `select()` predicate tokenizer, recursive descent parser, evaluator | +| `hcl2/query/safe_eval.py` | AST-validated Python expression eval for hybrid/eval modes | +| `hcl2/query/introspect.py` | `--describe` and `--schema` output generation | + +`hcl2/__main__.py` is a thin wrapper that imports `cli.hcl_to_json:main`. + +### Rules (one class per grammar rule) + +| File | Domain | +|---|---| +| `rules/abstract.py` | `LarkElement`, `LarkRule`, `LarkToken` base classes | +| `rules/tokens.py` | `StringToken` (cached factory), `StaticStringToken`, punctuation constants | +| `rules/base.py` | `StartRule`, `BodyRule`, `BlockRule`, `AttributeRule` | +| `rules/containers.py` | `TupleRule`, `ObjectRule`, `ObjectElemRule`, `ObjectElemKeyRule` | +| `rules/expressions.py` | `ExprTermRule`, `BinaryOpRule`, `UnaryOpRule`, `ConditionalRule` | +| `rules/literal_rules.py` | `IntLitRule`, `FloatLitRule`, `IdentifierRule`, `KeywordRule` | +| `rules/strings.py` | `StringRule`, `InterpolationRule`, `HeredocTemplateRule`, `TemplateStringRule` | +| `rules/functions.py` | `FunctionCallRule`, `ArgumentsRule` | +| `rules/indexing.py` | `GetAttrRule`, `SqbIndexRule`, splat rules | +| `rules/for_expressions.py` | `ForTupleExprRule`, `ForObjectExprRule`, `ForIntroRule`, `ForCondRule` | +| `rules/directives.py` | `TemplateIfRule`, `TemplateForRule`, and flat directive start/end rules | +| `rules/whitespace.py` | `NewLineOrCommentRule`, `InlineCommentMixIn` | + +## Public API (`api.py`) + +Follows the `json` module convention. All option parameters are keyword-only. 
+ +- `load/loads` — HCL2 text → Python dict +- `dump/dumps` — Python dict → HCL2 text +- `query` — HCL2 text/file → `DocumentView` for structured queries +- Intermediate stages: `parse/parses`, `parse_to_tree/parses_to_tree`, `transform`, `serialize`, `from_dict`, `from_json`, `reconstruct` + +### Option Dataclasses + +**`SerializationOptions`** (LarkElement → dict): +`with_comments`, `with_meta`, `wrap_objects`, `wrap_tuples`, `explicit_blocks`, `preserve_heredocs`, `force_operation_parentheses`, `preserve_scientific_notation`, `strip_string_quotes` + +**`DeserializerOptions`** (dict → LarkElement): +`heredocs_to_strings`, `strings_to_heredocs`, `object_elements_colon`, `object_elements_trailing_comma` + +**`FormatterOptions`** (whitespace/alignment): +`indent_length`, `open_empty_blocks`, `open_empty_objects`, `open_empty_tuples`, `vertically_align_attributes`, `vertically_align_object_elements` + +## CLI + +Console scripts defined in `pyproject.toml`. All three CLIs accept positional `PATH` arguments (files, directories, glob patterns, or `-` for stdin). When no `PATH` is given, stdin is read by default (like `jq`). 
+ +### Exit Codes + +All CLIs use structured error output (plain text to stderr) and distinct exit codes: + +| Code | `hcl2tojson` | `jsontohcl2` | `hq` | +|------|---|---|---| +| 0 | Success | Success | Success | +| 1 | Partial (some skipped) | JSON/encoding parse error | No results | +| 2 | All unparsable | Bad HCL structure | Parse error | +| 3 | — | — | Query error | +| 4 | I/O error | I/O error | I/O error | +| 5 | — | Differences found (`--diff` / `--semantic-diff`) | — | + +### `hcl2tojson` + +``` +hcl2tojson file.tf # single file to stdout +hcl2tojson --ndjson dir/ # directory → NDJSON to stdout +hcl2tojson a.tf b.tf -o out/ # multiple files to output dir +hcl2tojson --ndjson 'modules/**/*.tf' # glob + NDJSON streaming +hcl2tojson --only resource,module file.tf # block type filtering +hcl2tojson --exclude variable file.tf # exclude block types +hcl2tojson --fields cpu,memory file.tf # field projection +hcl2tojson --compact file.tf # single-line JSON +hcl2tojson -q dir/ -o out/ # quiet (no stderr progress) +echo 'x = 1' | hcl2tojson # stdin (no args needed) +``` + +Key flags: `--ndjson`, `--compact`, `--only`/`--exclude`, `--fields`, `-q`/`--quiet`, `--json-indent N`, `--with-meta`, `--with-comments`, `--strip-string-quotes` (breaks round-trip). Multi-file NDJSON adds a `__file__` provenance key to each object. + +### `jsontohcl2` + +``` +jsontohcl2 file.json # single file to stdout +jsontohcl2 --diff original.tf modified.json # preview text changes +jsontohcl2 --semantic-diff original.tf modified.json # semantic-only changes +jsontohcl2 --semantic-diff original.tf --diff-json m.json # semantic diff as JSON +jsontohcl2 --dry-run file.json # convert without writing +jsontohcl2 --fragment - # attribute snippets from stdin +jsontohcl2 --indent 4 --no-align file.json +``` + +Key flags: `--diff ORIGINAL`, `--semantic-diff ORIGINAL`, `--diff-json`, `--dry-run`, `--fragment`, `-q`/`--quiet`, `--indent N`, `--no-align`, `--colon-separator`. 
+ +Add new options as `parser.add_argument()` calls in the relevant entry point module. + +## PostLexer (`postlexer.py`) + +Lark's `postlex` parameter accepts a single object with a `process(stream)` method that transforms the token stream between the lexer and LALR parser. The `PostLexer` class is designed for extensibility: each transformation is a private method that accepts and yields tokens, and `process()` chains them together. + +Current passes: + +- `_merge_newlines_into_operators` + +To add a new pass: create a private method with the same `(self, stream) -> generator` signature, and add a `yield from` call in `process()`. + +## Hard Rules + +These are project-specific constraints that must not be violated: + +1. **Always use the LarkElement IR.** Never transform directly from Lark parse tree to Python dict or vice versa. +1. **Block vs object distinction.** Use `__is_block__` markers (`const.IS_BLOCK`) to preserve semantic intent during round-trips. The deserializer must distinguish blocks from regular objects. +1. **Bidirectional completeness.** Every serialization path must have a corresponding deserialization path. Test round-trip integrity: Parse → Serialize → Deserialize → Serialize produces identical results. +1. **One grammar rule = one `LarkRule` class.** Each class implements `lark_name()`, typed property accessors, `serialize()`, and declares `_children_layout: Tuple[...]` (annotation only, no assignment) to document child structure. +1. **Token caching.** Use the `StringToken` factory in `rules/tokens.py` — never create token instances directly. +1. **Interpolation context.** `${...}` generation depends on nesting depth — always pass and respect `SerializationContext`. +1. **Update both directions.** When adding language features, update transformer.py, deserializer.py, formatter.py and reconstructor.py. + +## Adding a New Language Construct + +1. Add grammar rules to `hcl2.lark` +1. 
If the new construct creates LALR ambiguities with `NL_OR_COMMENT`, add a postlexer pass in `postlexer.py` +1. Create rule class(es) in the appropriate `rules/` file +1. Add transformer method(s) in `transformer.py` +1. Implement `serialize()` in the rule class +1. Update `deserializer.py`, `formatter.py` and `reconstructor.py` for round-trip support + +## Testing + +Framework: `unittest.TestCase` (not pytest). + +``` +python -m unittest discover -s test -p "test_*.py" -v +``` + +**Unit tests** (`test/unit/`): instantiate rule objects directly (no parsing). + +- `rules/` — one file per rules module +- `cli/` — one file per CLI module +- `test_*.py` — tests for corresponding files from `hcl2/` directory + +Use concrete stubs when testing ABCs (e.g., `StubExpression(ExpressionRule)`). + +**Integration tests** (`test/integration/`): full-pipeline tests with golden files. + +- `test_round_trip.py` — iterates over all suites in `hcl2_original/`, tests HCL→JSON, JSON→JSON, JSON→HCL, and full round-trip +- `test_specialized.py` — feature-specific tests with golden files in `specialized/` + +Always run the full round-trip test suite after any modification. + +## Pre-commit Checks + +Hooks are defined in `.pre-commit-config.yaml` (includes black, mypy, pylint, and others). All changed files must pass these checks before committing. When writing or modifying code: + +- Format Python with **black** (Python 3.8 target). +- Ensure **mypy** and **pylint** pass. Pylint config is in `pylintrc`, scoped to `hcl2/` and `test/`. +- End files with a newline; strip trailing whitespace (except under `test/integration/(hcl2_reconstructed|specialized)/`). + +## Keeping Docs Current + +Update this file when architecture, modules, API surface, or testing conventions change. Also update `README.md` and the docs in `docs/` (`01_getting_started.md`, `02_querying.md`, `03_advanced_api.md`, `04_hq.md`, `05_hq_examples.md`) when changes affect the public API, CLI flags, or option fields.
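Review note on the PostLexer section of the CLAUDE.md diff above: the "private generator method plus a `yield from` in `process()`" pattern may be easier to follow with a small sketch. The code below is illustrative only and is not taken from `hcl2/postlexer.py`. The names `Tok`, `SketchPostLexer`, `_drop_newline_before_operator`, and `_collapse_repeated_newlines` are hypothetical, and the token types `NL`, `OP`, and `NAME` are assumptions chosen for the example; the real pass `_merge_newlines_into_operators` may work differently.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class Tok(NamedTuple):
    """Hypothetical stand-in for a lexer token (not Lark's Token class)."""

    type: str
    value: str


class SketchPostLexer:
    """Illustrative only: chains generator passes as CLAUDE.md describes."""

    def process(self, stream: Iterable[Tok]) -> Iterator[Tok]:
        # Each pass wraps the previous one, so tokens flow through lazily.
        stream = self._drop_newline_before_operator(stream)
        stream = self._collapse_repeated_newlines(stream)
        yield from stream

    def _drop_newline_before_operator(self, stream: Iterable[Tok]) -> Iterator[Tok]:
        # Hold back each newline until the next token is known; discard it
        # when that next token is an operator (assumed behavior for the sketch).
        pending: Optional[Tok] = None
        for tok in stream:
            if pending is not None:
                if tok.type != "OP":
                    yield pending
                pending = None
            if tok.type == "NL":
                pending = tok
            else:
                yield tok
        if pending is not None:
            yield pending

    def _collapse_repeated_newlines(self, stream: Iterable[Tok]) -> Iterator[Tok]:
        # Second hypothetical pass: keep only the first of consecutive newlines.
        previous_type: Optional[str] = None
        for tok in stream:
            if tok.type == "NL" and previous_type == "NL":
                continue
            previous_type = tok.type
            yield tok


tokens = [
    Tok("NAME", "a"),
    Tok("NL", "\n"),
    Tok("OP", "||"),
    Tok("NAME", "b"),
    Tok("NL", "\n"),
    Tok("NL", "\n"),
    Tok("NAME", "c"),
]
result = list(SketchPostLexer().process(tokens))
```

Adding a third pass in this sketch means writing one more method with the same signature and one more line in `process()`, which is the extension point the CLAUDE.md text describes.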
diff --git a/README.md b/README.md index 1ff75876..ce255029 100644 --- a/README.md +++ b/README.md @@ -1,5 +1,4 @@ [![Codacy Badge](https://app.codacy.com/project/badge/Grade/2e2015f9297346cbaa788c46ab957827)](https://app.codacy.com/gh/amplify-education/python-hcl2/dashboard?utm_source=gh&utm_medium=referral&utm_content=&utm_campaign=Badge_grade) -[![Build Status](https://travis-ci.org/amplify-education/python-hcl2.svg?branch=master)](https://travis-ci.org/amplify-education/python-hcl2) [![License](https://img.shields.io/badge/license-MIT-blue.svg)](https://raw.githubusercontent.com/amplify-education/python-hcl2/master/LICENSE) [![PyPI](https://img.shields.io/pypi/v/python-hcl2.svg)](https://pypi.org/project/python-hcl2/) [![Python Versions](https://img.shields.io/pypi/pyversions/python-hcl2.svg)](https://pypi.python.org/pypi/python-hcl2) @@ -8,8 +7,9 @@ # Python HCL2 A parser for [HCL2](https://github.com/hashicorp/hcl/blob/hcl2/hclsyntax/spec.md) written in Python using -[Lark](https://github.com/lark-parser/lark). This parser only supports HCL2 and isn't backwards compatible -with HCL v1. It can be used to parse any HCL2 config file such as Terraform. +[Lark](https://github.com/lark-parser/lark). It can be used as a Python library or through its CLI tools: +`hcl2tojson`, `jsontohcl2`, and `hq` — a jq-like query tool for HCL files. +Supports HCL2 only (not backwards compatible with HCL v1) and works with any HCL2 config file such as Terraform. ## About Amplify @@ -24,7 +24,7 @@ Learn more at ### Prerequisites -python-hcl2 requires Python 3.7 or higher to run. +python-hcl2 requires Python 3.8 or higher to run. 
### Installing @@ -34,21 +34,78 @@ This package can be installed using `pip` pip3 install python-hcl2 ``` +To install the CLI tools (`hcl2tojson`, `jsontohcl2`, `hq`) globally without affecting your project environments, use [pipx](https://pipx.pypa.io/): + +```sh +pipx install python-hcl2 +``` + ### Usage +**HCL2 to Python dict:** + ```python import hcl2 -with open('foo.tf', 'r') as file: - dict = hcl2.load(file) + +with open("main.tf") as f: + data = hcl2.load(f) ``` -### Parse Tree to HCL2 reconstruction +**Python dict to HCL2:** + +```python +import hcl2 -With version 6.x the possibility of HCL2 reconstruction from the Lark Parse Tree and Python dictionaries directly was introduced. +hcl_string = hcl2.dumps(data) + +with open("output.tf", "w") as f: + hcl2.dump(data, f) +``` -Documentation and an example of manipulating Lark Parse Tree and reconstructing it back into valid HCL2 can be found in [tree-to-hcl2-reconstruction.md](https://github.com/amplify-education/python-hcl2/blob/main/tree-to-hcl2-reconstruction.md) file. +**Building HCL from scratch:** + +```python +import hcl2 -More details about reconstruction implementation can be found in PRs #169 and #177. 
+doc = hcl2.Builder() +res = doc.block("resource", labels=["aws_instance", "web"], ami="abc-123", instance_type="t2.micro") +res.block("tags", Name="HelloWorld") + +hcl_string = hcl2.dumps(doc.build()) +``` + +### Documentation + +| Guide | Contents | +|---|---| +| [Getting Started](https://github.com/amplify-education/python-hcl2/blob/main/docs/01_getting_started.md) | Installation, load/dump, options, CLI converters | +| [Querying HCL (Python)](https://github.com/amplify-education/python-hcl2/blob/main/docs/02_querying.md) | DocumentView, BlockView, tree walking, view hierarchy | +| [Advanced API](https://github.com/amplify-education/python-hcl2/blob/main/docs/03_advanced_api.md) | Pipeline stages, Builder | +| [hq Reference](https://github.com/amplify-education/python-hcl2/blob/main/docs/04_hq.md) | `hq` CLI — structural queries, hybrid/eval, introspection | +| [hq Examples](https://github.com/amplify-education/python-hcl2/blob/main/docs/05_hq_examples.md) | Real-world queries for discovery, compliance, extraction | +| [Migrating to v8](https://github.com/amplify-education/python-hcl2/blob/main/docs/06_migrating_to_v8.md) | Breaking changes, updated API patterns, v7-compat options | + +### CLI Tools + +python-hcl2 ships three command-line tools: + +```sh +# HCL2 → JSON +hcl2tojson main.tf # prints JSON to stdout +hcl2tojson main.tf output.json # writes to file +hcl2tojson terraform/ output/ # converts a directory + +# JSON → HCL2 +jsontohcl2 output.json # prints HCL2 to stdout +jsontohcl2 output.json main.tf # writes to file +jsontohcl2 output/ terraform/ # converts a directory + +# Query HCL2 files +hq 'resource.aws_instance.main.ami' main.tf +hq 'variable[*]' variables.tf --json +``` + +All commands accept `-` as PATH to read from stdin. Run `--help` on any command for the full list of flags. ## Building From Source @@ -61,7 +118,7 @@ Running `tox` will automatically execute linters as well as the unit tests. 
You can also run them individually with the `-e` argument. -For example, `tox -e py37-unit` will run the unit tests for python 3.7 +For example, `tox -e py310-unit` will run the unit tests for python 3.10 To see all the available options, run `tox -l`. @@ -71,6 +128,17 @@ To create a new release go to Releases page, press 'Draft a new release', create with a version you want to be released, fill the release notes and press 'Publish release'. Github actions will take care of publishing it to PyPi. +## Roadmap + +Planned features, roughly in priority order: + +- **MCP server** — expose parsing, querying, and formatting as MCP tools for AI agents +- **Predictable formatting** — source-derived formatting heuristics and stable whitespace defaults +- **In-place tree edits** — `hq set`, `hq delete` for programmatic HCL modification +- **Comment deserialization** — comments survive JSON round-trips and tree edits +- **Expression intelligence** — variable reference tracking, unused variable detection, cross-file analysis +- **Refactoring operations** — `hq rename`, `hq extract-module`, `hq sort` for high-level code transforms + ## Responsible Disclosure If you have any security issue to report, contact project maintainers privately. @@ -81,21 +149,10 @@ You can reach us at We welcome pull requests! For your pull request to be accepted smoothly, we suggest that you: - For any sizable change, first open a GitHub issue to discuss your idea. -- Create a pull request. Explain why you want to make the change and what it’s for. +- Create a pull request. Explain why you want to make the change and what it's for. -We’ll try to answer any PR’s promptly. +We'll try to answer any PR's promptly. 
## Limitations -### Using inline expression as an object key - -- Object key can be an expression as long as it is wrapped in parentheses: - ```terraform - locals { - foo = "bar" - baz = { - (format("key_prefix_%s", local.foo)) : "value" - # format("key_prefix_%s", local.foo) : "value" this will fail - } - } - ``` +None that are known. diff --git a/hcl2/py.typed b/cli/__init__.py similarity index 100% rename from hcl2/py.typed rename to cli/__init__.py diff --git a/cli/hcl_to_json.py b/cli/hcl_to_json.py new file mode 100644 index 00000000..961e87ac --- /dev/null +++ b/cli/hcl_to_json.py @@ -0,0 +1,465 @@ +"""``hcl2tojson`` CLI entry point — convert HCL2 files to JSON.""" + +import argparse +import json +import os +import sys +from typing import IO, List, Optional, TextIO + +from hcl2 import load +from hcl2.utils import SerializationOptions +from hcl2.version import __version__ +from cli.helpers import ( + EXIT_IO_ERROR, + EXIT_PARSE_ERROR, + EXIT_PARTIAL, + EXIT_SUCCESS, + HCL_SKIPPABLE, + _collect_files, + _convert_directory, + _convert_multiple_files, + _convert_single_file, + _error, + _expand_file_args, + _install_sigpipe_handler, +) + +_HCL_EXTENSIONS = {".tf", ".hcl"} + + +def _filter_data( + data: dict, + only: Optional[str] = None, + exclude: Optional[str] = None, + fields: Optional[str] = None, +) -> dict: + """Apply block-type filtering and field projection to parsed HCL data.""" + if only: + types = {t.strip() for t in only.split(",")} + data = {k: val for k, val in data.items() if k in types} + elif exclude: + types = {t.strip() for t in exclude.split(",")} + data = {k: val for k, val in data.items() if k not in types} + if fields: + field_set = {f.strip() for f in fields.split(",")} + data = _project_fields(data, field_set) + return data + + +def _project_fields(data, field_set): + """Keep only specified fields (plus metadata keys) in nested dicts. 
+ + Structural keys (whose values are dicts or lists) are always preserved + so the block hierarchy stays intact. Only leaf attribute keys are + filtered. + """ + if isinstance(data, dict): + result = {} + for key, val in data.items(): + if key in field_set or key.startswith("__"): + result[key] = val + elif isinstance(val, dict): + projected = _project_fields(val, field_set) + if projected: + result[key] = projected + elif isinstance(val, list) and any(isinstance(item, dict) for item in val): + projected = _project_fields(val, field_set) + if projected: + result[key] = projected + # else: leaf value (scalar or leaf list) not in field_set — drop it + return result + if isinstance(data, list): + out = [_project_fields(item, field_set) for item in data] + return [item for item in out if not isinstance(item, (dict, list)) or item] + return data + + +def _hcl_to_json( # pylint: disable=too-many-arguments,too-many-positional-arguments + in_file: TextIO, + out_file: IO, + options: SerializationOptions, + json_indent: Optional[int] = None, + compact_separators: bool = False, + only: Optional[str] = None, + exclude: Optional[str] = None, + fields: Optional[str] = None, +) -> None: + data = load(in_file, serialization_options=options) + data = _filter_data(data, only, exclude, fields) + separators = (",", ":") if compact_separators else None + json.dump(data, out_file, indent=json_indent, separators=separators) + + +def _load_to_dict( + in_file: TextIO, + options: SerializationOptions, + only: Optional[str] = None, + exclude: Optional[str] = None, + fields: Optional[str] = None, +) -> dict: + """Load HCL2 and return the parsed dict (no JSON serialization).""" + data = load(in_file, serialization_options=options) + return _filter_data(data, only, exclude, fields) + + +def _stream_ndjson( # pylint: disable=too-many-arguments,too-many-positional-arguments + file_paths: List[str], + options: SerializationOptions, + json_indent: Optional[int], + skip: bool, + quiet: bool, + 
add_provenance: bool, + only: Optional[str] = None, + exclude: Optional[str] = None, + fields: Optional[str] = None, +) -> int: + """Stream one JSON object per file to stdout (NDJSON). + + Returns the worst exit code encountered. + """ + worst_exit = EXIT_SUCCESS + any_success = False + worst_skip_exit = EXIT_PARSE_ERROR # default for all-fail case + for file_path in file_paths: + if not quiet and file_path != "-": + print(file_path, file=sys.stderr, flush=True) + try: + if file_path == "-": + data = _load_to_dict( + sys.stdin, options, only=only, exclude=exclude, fields=fields + ) + else: + with open(file_path, "r", encoding="utf-8") as f: + data = _load_to_dict( + f, options, only=only, exclude=exclude, fields=fields + ) + except HCL_SKIPPABLE as exc: + if skip: + worst_exit = max(worst_exit, EXIT_PARTIAL) + continue + print( + _error( + str(exc), use_json=True, error_type="parse_error", file=file_path + ), + file=sys.stderr, + ) + return EXIT_PARSE_ERROR + except (OSError, IOError) as exc: + if skip: + worst_exit = max(worst_exit, EXIT_PARTIAL) + worst_skip_exit = EXIT_IO_ERROR + continue + print( + _error(str(exc), use_json=True, error_type="io_error", file=file_path), + file=sys.stderr, + ) + return EXIT_IO_ERROR + + # Skip empty results after filtering (no useful data for agents) + if not data: + continue + if add_provenance: + data = {"__file__": file_path, **data} + print(json.dumps(data, indent=json_indent, separators=(",", ":")), flush=True) + any_success = True + + if not any_success and worst_exit > EXIT_SUCCESS: + return worst_skip_exit + return worst_exit + + +_EXAMPLES = """\ +examples: + hcl2tojson file.tf # single file to stdout + hcl2tojson --ndjson dir/ # directory to stdout (NDJSON) + hcl2tojson a.tf b.tf -o out/ # multiple files to output dir + hcl2tojson --ndjson a.tf b.tf # multiple files as NDJSON + hcl2tojson --ndjson 'modules/**/*.tf' # glob + NDJSON streaming + hcl2tojson --only resource,module file.tf # block type filtering + hcl2tojson 
--exclude variable file.tf # exclude block types + hcl2tojson --fields cpu,memory file.tf # field projection + hcl2tojson --compact file.tf # single-line JSON + echo 'x = 1' | hcl2tojson # stdin (no args needed) + +exit codes: + 0 Success + 1 Partial success (some files skipped via -s) + 2 Parse error (all input unparsable) + 4 I/O error (file not found) +""" + + +def main(): # pylint: disable=too-many-branches,too-many-statements,too-many-locals + """The ``hcl2tojson`` console_scripts entry point.""" + _install_sigpipe_handler() + parser = argparse.ArgumentParser( + description="Convert HCL2 files to JSON", + epilog=_EXAMPLES, + formatter_class=argparse.RawDescriptionHelpFormatter, + ) + parser.add_argument( + "-s", dest="skip", action="store_true", help="Skip un-parsable files" + ) + parser.add_argument( + "PATH", + nargs="*", + help="Files, directories, or glob patterns to convert (default: stdin)", + ) + parser.add_argument( + "-o", + "--output", + dest="output", + help="Output path (file for single input, directory for multiple inputs)", + ) + parser.add_argument( + "-q", + "--quiet", + action="store_true", + help="Suppress progress output on stderr (errors still shown)", + ) + parser.add_argument( + "--ndjson", + action="store_true", + help="Output one JSON object per line (newline-delimited JSON)", + ) + parser.add_argument("--version", action="version", version=__version__) + + # SerializationOptions flags + parser.add_argument( + "--with-meta", + action="store_true", + help="Add meta parameters like __start_line__ and __end_line__", + ) + parser.add_argument( + "--with-comments", + action="store_true", + help="Include comments in the output", + ) + parser.add_argument( + "--wrap-objects", + action="store_true", + help="Wrap object values as an inline HCL2", + ) + parser.add_argument( + "--wrap-tuples", + action="store_true", + help="Wrap tuple values as an inline HCL2", + ) + parser.add_argument( + "--no-explicit-blocks", + action="store_true", +
help="Disable explicit block markers. Note: round-trip through json_to_hcl " + "is NOT supported with this option.", + ) + parser.add_argument( + "--no-preserve-heredocs", + action="store_true", + help="Convert heredocs to plain strings", + ) + parser.add_argument( + "--force-parens", + action="store_true", + help="Force parentheses around all operations", + ) + parser.add_argument( + "--no-preserve-scientific", + action="store_true", + help="Convert scientific notation to standard floats", + ) + parser.add_argument( + "--strip-string-quotes", + action="store_true", + help="Strip surrounding double-quotes from serialized string values. " + "Note: round-trip through json_to_hcl is NOT supported with this option.", + ) + + # JSON output formatting + parser.add_argument( + "--json-indent", + type=int, + default=None, + metavar="N", + help="JSON indentation width (default: 2 for TTY, compact otherwise)", + ) + parser.add_argument( + "--compact", + action="store_true", + help="Compact single-line JSON output (no whitespace)", + ) + + # Filtering + filter_group = parser.add_mutually_exclusive_group() + filter_group.add_argument( + "--only", + metavar="TYPES", + help="Comma-separated block types to include (e.g. resource,module)", + ) + filter_group.add_argument( + "--exclude", + metavar="TYPES", + help="Comma-separated block types to exclude (e.g. 
variable,output)", + ) + parser.add_argument( + "--fields", + metavar="FIELDS", + help="Comma-separated field names to keep in output", + ) + + args = parser.parse_args() + + options = SerializationOptions( + with_meta=args.with_meta, + with_comments=args.with_comments, + wrap_objects=args.wrap_objects, + wrap_tuples=args.wrap_tuples, + explicit_blocks=not args.no_explicit_blocks, + preserve_heredocs=not args.no_preserve_heredocs, + force_operation_parentheses=args.force_parens, + preserve_scientific_notation=not args.no_preserve_scientific, + strip_string_quotes=args.strip_string_quotes, + ) + + # Resolve JSON indent: --compact > explicit --json-indent > TTY default (2) > compact + if args.compact: + json_indent: Optional[int] = None + elif args.json_indent is not None: + json_indent = args.json_indent + elif sys.stdout.isatty(): + json_indent = 2 + else: + json_indent = None + + compact = args.compact + quiet = args.quiet + ndjson = args.ndjson + only = args.only + exclude = args.exclude + fields = args.fields + + def convert(in_file, out_file): + _hcl_to_json( + in_file, + out_file, + options, + json_indent=json_indent, + compact_separators=compact, + only=only, + exclude=exclude, + fields=fields, + ) + + # Default to stdin when no paths given + paths = args.PATH if args.PATH else ["-"] + paths = _expand_file_args(paths) + output = args.output + + try: + # NDJSON streaming mode (explicit --ndjson flag) + if ndjson: + file_paths = _resolve_file_paths(paths, parser) + if args.json_indent is not None and not quiet: + print( + "Warning: --json-indent is ignored in NDJSON mode", + file=sys.stderr, + ) + # NDJSON always uses compact output (one object per line) + ndjson_indent = None + exit_code = _stream_ndjson( + file_paths, + options, + ndjson_indent, + args.skip, + quiet, + add_provenance=len(file_paths) > 1, + only=only, + exclude=exclude, + fields=fields, + ) + if exit_code != EXIT_SUCCESS: + sys.exit(exit_code) + return + + if len(paths) == 1: + path = paths[0] 
+ if path == "-" or os.path.isfile(path): + if not _convert_single_file( + path, output, convert, args.skip, HCL_SKIPPABLE, quiet=quiet + ): + sys.exit(EXIT_PARTIAL) + elif os.path.isdir(path): + if output is None: + parser.error("directory to stdout requires --ndjson or -o ") + if _convert_directory( + path, + output, + convert, + args.skip, + HCL_SKIPPABLE, + in_extensions=_HCL_EXTENSIONS, + out_extension=".json", + quiet=quiet, + ): + sys.exit(EXIT_PARTIAL) + else: + print( + _error( + f"File not found: {path}", + error_type="io_error", + file=path, + ), + file=sys.stderr, + ) + sys.exit(EXIT_IO_ERROR) + else: + # Validate all paths are files (stdin not supported with multiple) + for file_path in paths: + if file_path == "-": + parser.error("stdin (-) cannot be combined with other files") + if not os.path.isfile(file_path): + print( + _error( + f"File not found: {file_path}", + error_type="io_error", + file=file_path, + ), + file=sys.stderr, + ) + sys.exit(EXIT_IO_ERROR) + if output is None: + parser.error("multiple files to stdout requires --ndjson or -o ") + if _convert_multiple_files( + paths, + output, + convert, + args.skip, + HCL_SKIPPABLE, + out_extension=".json", + quiet=quiet, + ): + sys.exit(EXIT_PARTIAL) + except HCL_SKIPPABLE as exc: + print( + _error(str(exc), error_type="parse_error"), + file=sys.stderr, + ) + sys.exit(EXIT_PARSE_ERROR) + except (OSError, IOError) as exc: + print( + _error(str(exc), error_type="io_error"), + file=sys.stderr, + ) + sys.exit(EXIT_IO_ERROR) + + +def _resolve_file_paths(paths: List[str], parser) -> List[str]: + """Expand directories into individual file paths for NDJSON streaming.""" + file_paths: List[str] = [] + for path in paths: + file_paths.extend(_collect_files(path, _HCL_EXTENSIONS)) + if not file_paths: + parser.error("no HCL files found in the given paths") + return file_paths + + +if __name__ == "__main__": + main() diff --git a/cli/helpers.py b/cli/helpers.py new file mode 100644 index 00000000..fe416d8a --- 
/dev/null +++ b/cli/helpers.py @@ -0,0 +1,254 @@ +"""Shared file-conversion helpers for the HCL2 CLI commands.""" + +import glob as glob_mod +import json +import os +import signal +import sys +from io import StringIO +from typing import Callable, IO, List, Optional, Set, Tuple, Type + +from lark import UnexpectedCharacters, UnexpectedToken + +# Exit codes shared across CLIs +EXIT_SUCCESS = 0 +EXIT_PARTIAL = 1 # hcl2tojson: some files skipped; jsontohcl2: JSON/encoding error +EXIT_PARSE_ERROR = 2 # hcl2tojson: all unparsable; jsontohcl2: bad HCL structure +EXIT_IO_ERROR = 4 +EXIT_DIFF = 5 # jsontohcl2 --diff: differences found + +# Exceptions that can be skipped when -s is passed +HCL_SKIPPABLE = (UnexpectedToken, UnexpectedCharacters, UnicodeDecodeError) +JSON_SKIPPABLE = (json.JSONDecodeError, UnicodeDecodeError) + + +def _install_sigpipe_handler() -> None: + """Reset SIGPIPE to default so piping to ``head`` etc. exits cleanly.""" + if hasattr(signal, "SIGPIPE"): + signal.signal(signal.SIGPIPE, signal.SIG_DFL) + + +def _error(msg: str, use_json: bool = False, **extra) -> str: + """Format an error message for stderr. + + When *use_json* is true the result is a single-line JSON object with + ``error`` and ``message`` keys (plus any *extra* fields). Otherwise + a plain ``Error: …`` string is returned. + """ + if use_json: + data: dict = {"error": extra.pop("error_type", "error"), "message": msg} + data.update(extra) + return json.dumps(data) + return f"Error: {msg}" + + +def _expand_file_args(file_args: List[str]) -> List[str]: + """Expand glob patterns in file arguments. + + For each arg containing glob metacharacters (``*``, ``?``, ``[``), + expand via :func:`glob.glob` with ``recursive=True``. Literal paths + and ``-`` (stdin) pass through unchanged. If a glob matches nothing, + the literal pattern is kept so the caller produces an IO error. 
+ """ + expanded: List[str] = [] + for arg in file_args: + if arg == "-": + expanded.append(arg) + continue + if any(c in arg for c in "*?["): + matches = sorted(glob_mod.glob(arg, recursive=True)) + if matches: + expanded.extend(matches) + else: + expanded.append(arg) # keep literal — will produce IO error + else: + expanded.append(arg) + return expanded + + +def _collect_files(path: str, extensions: Set[str]) -> List[str]: + """Return a sorted list of files under *path* matching *extensions*. + + If *path* is ``-`` (stdin marker) or a plain file, it is returned as-is + in a single-element list. Directories are walked recursively. + """ + if path == "-": + return ["-"] + if os.path.isfile(path): + return [path] + if os.path.isdir(path): + files: List[str] = [] + for dirpath, _, filenames in os.walk(path): + for fname in filenames: + if os.path.splitext(fname)[1] in extensions: + files.append(os.path.join(dirpath, fname)) + files.sort() + return files + # Not a file or directory — return as-is so caller can report IO error + return [path] + + +def _convert_single_file( # pylint: disable=too-many-positional-arguments + in_path: str, + out_path: Optional[str], + convert_fn: Callable[[IO, IO], None], + skip: bool, + skippable: Tuple[Type[BaseException], ...], + quiet: bool = False, +) -> bool: + """Convert a single file. 
Returns ``True`` on success, ``False`` if skipped.""" + if in_path == "-": + if out_path is not None: + try: + with open(out_path, "w", encoding="utf-8") as out_file: + convert_fn(sys.stdin, out_file) + except skippable: + if skip: + if os.path.exists(out_path): + os.remove(out_path) + return False + raise + return True + return _convert_single_stream(sys.stdin, convert_fn, skip, skippable) + with open(in_path, "r", encoding="utf-8") as in_file: + if not quiet: + print(in_path, file=sys.stderr, flush=True) + if out_path is not None: + try: + with open(out_path, "w", encoding="utf-8") as out_file: + convert_fn(in_file, out_file) + except skippable: + if skip: + if os.path.exists(out_path): + os.remove(out_path) + return False + raise + elif skip: + buf = StringIO() + try: + convert_fn(in_file, buf) + except skippable: + return False + sys.stdout.write(buf.getvalue()) + sys.stdout.write("\n") + else: + convert_fn(in_file, sys.stdout) + sys.stdout.write("\n") + return True + + +def _convert_directory( # pylint: disable=too-many-positional-arguments,too-many-locals + in_path: str, + out_path: Optional[str], + convert_fn: Callable[[IO, IO], None], + skip: bool, + skippable: Tuple[Type[BaseException], ...], + in_extensions: Set[str], + out_extension: str, + quiet: bool = False, +) -> bool: + """Convert all matching files in a directory. 
Returns ``True`` if any were skipped.""" + if out_path is None: + raise RuntimeError("Output path is required for directory conversion (use -o)") + if not os.path.exists(out_path): + os.makedirs(out_path) + + any_skipped = False + processed_files: set = set() + for current_dir, _, files in os.walk(in_path): + dir_prefix = os.path.commonpath([in_path, current_dir]) + relative_current_dir = os.path.relpath(current_dir, dir_prefix) + current_out_path = os.path.normpath( + os.path.join(out_path, relative_current_dir) + ) + if not os.path.exists(current_out_path): + os.makedirs(current_out_path) + for file_name in files: + _, ext = os.path.splitext(file_name) + if ext not in in_extensions: + continue + + in_file_path = os.path.join(current_dir, file_name) + out_file_path = os.path.join(current_out_path, file_name) + out_file_path = os.path.splitext(out_file_path)[0] + out_extension + + if in_file_path in processed_files or out_file_path in processed_files: + continue + + processed_files.add(in_file_path) + processed_files.add(out_file_path) + + with open(in_file_path, "r", encoding="utf-8") as in_file: + if not quiet: + print(in_file_path, file=sys.stderr, flush=True) + try: + with open(out_file_path, "w", encoding="utf-8") as out_file: + convert_fn(in_file, out_file) + except skippable: + if skip: + any_skipped = True + if os.path.exists(out_file_path): + os.remove(out_file_path) + continue + raise + return any_skipped + + +def _convert_multiple_files( # pylint: disable=too-many-positional-arguments + in_paths: List[str], + out_path: str, + convert_fn: Callable[[IO, IO], None], + skip: bool, + skippable: Tuple[Type[BaseException], ...], + out_extension: str, + quiet: bool = False, +) -> bool: + """Convert multiple files into an output directory. + + Preserves relative path structure to avoid basename collisions when + files from different directories share the same name. Returns ``True`` + if any files were skipped. 
+ """ + if not os.path.exists(out_path): + os.makedirs(out_path) + abs_paths = [os.path.abspath(p) for p in in_paths] + common = os.path.commonpath(abs_paths) if len(abs_paths) > 1 else "" + if common and not os.path.isdir(common): + common = os.path.dirname(common) + any_skipped = False + for in_path, abs_path in zip(in_paths, abs_paths): + if common: + rel = os.path.relpath(abs_path, common) + else: + rel = os.path.basename(in_path) + dest = os.path.splitext(rel)[0] + out_extension + file_out = os.path.join(out_path, dest) + file_out_dir = os.path.dirname(file_out) + if file_out_dir and not os.path.exists(file_out_dir): + os.makedirs(file_out_dir) + if not _convert_single_file( + in_path, file_out, convert_fn, skip, skippable, quiet=quiet + ): + any_skipped = True + return any_skipped + + +def _convert_single_stream( + in_file: IO, + convert_fn: Callable[[IO, IO], None], + skip: bool, + skippable: Tuple[Type[BaseException], ...], +) -> bool: + """Convert from a stream (e.g. stdin) to stdout. 
Returns ``True`` on success.""" + if skip: + buf = StringIO() + try: + convert_fn(in_file, buf) + except skippable: + return False + sys.stdout.write(buf.getvalue()) + sys.stdout.write("\n") + else: + convert_fn(in_file, sys.stdout) + sys.stdout.write("\n") + return True diff --git a/cli/hq.py b/cli/hq.py new file mode 100755 index 00000000..d6344d71 --- /dev/null +++ b/cli/hq.py @@ -0,0 +1,870 @@ +"""``hq`` CLI entry point — query HCL2 files.""" + +import argparse +import dataclasses +import json +import multiprocessing +import os +import sys +from typing import Any, List, Optional, Tuple + +from hcl2.query._base import NodeView +from hcl2.utils import SerializationOptions +from hcl2.query.body import DocumentView +from hcl2.query.introspect import build_schema, describe_results +from hcl2.query.path import QuerySyntaxError +from hcl2.query.pipeline import classify_stage, execute_pipeline, split_pipeline +from hcl2.query.resolver import resolve_path +from hcl2.query.safe_eval import ( + UnsafeExpressionError, + _SAFE_CALLABLE_NAMES, + safe_eval, +) +from hcl2.version import __version__ +from .helpers import _expand_file_args # noqa: F401 — re-exported for tests + +# --------------------------------------------------------------------------- +# Constants +# --------------------------------------------------------------------------- + +EXIT_SUCCESS = 0 +EXIT_NO_RESULTS = 1 +EXIT_PARSE_ERROR = 2 +EXIT_QUERY_ERROR = 3 +EXIT_IO_ERROR = 4 + +_EXIT_TO_ERROR_TYPE = { + EXIT_IO_ERROR: "io_error", + EXIT_PARSE_ERROR: "parse_error", + EXIT_QUERY_ERROR: "query_error", +} + +_HCL_EXTENSIONS = {".tf", ".hcl", ".tfvars"} + +_EVAL_PREFIXES = tuple(f"{name}(" for name in sorted(_SAFE_CALLABLE_NAMES)) + ("doc",) + +EXAMPLES_TEXT = """\ +examples: + # Structural queries + hq 'resource.aws_instance.main.ami' main.tf + hq 'variable[*]' variables.tf --json + echo 'x = 1' | hq 'x' --value + + # Multiple files and globs + hq 'resource[*]' file1.tf file2.tf --json + hq 'variable[*]' 
modules/ --ndjson + hq 'resource[*]' 'modules/**/*.tf' --json + + # Pipes + hq 'resource.aws_instance[*] | .tags' main.tf + hq 'variable[*] | select(.default) | .default' vars.tf --json + + # Builtins + hq 'x | keys' file.tf --json + hq 'x | length' file.tf --value + + # Select (bracket syntax) + hq '*[select(.name == "x")]' file.tf --value + + # String functions (jq-compatible) + hq 'module~[select(.source | contains("docker"))]' dir/ + hq 'resource~[select(.ami | test("^ami-"))]' dir/ + hq 'resource~[select(has("tags"))]' main.tf + hq 'resource~[select(.tags | not)]' main.tf + + # Object construction (jq-style) + hq 'resource[*] | {type: .block_type, name: .name_labels}' main.tf --json + + # Optional (exit 0 on empty results) + hq 'nonexistent?' file.tf --value + + # Raw output (strip quotes, ideal for shell piping) + hq 'resource.aws_instance.main.ami' main.tf --raw + + # NDJSON (one JSON object per line, ideal for streaming) + hq 'resource[*]' dir/ --ndjson + + # Source location metadata + hq 'resource[*]' main.tf --json --with-location + + # Comments in output + hq 'resource[*]' main.tf --json --with-comments + + # Structural diff + hq file1.tf --diff file2.tf + hq file1.tf --diff file2.tf --json + + # Hybrid (structural::eval) + hq 'resource.aws_instance[*]::name_labels' main.tf + hq 'variable[*]::block_type' variables.tf --value + + # Pure eval (-e) + hq -e 'doc.blocks("variable")[0].attribute("default").value' variables.tf --json + + # Introspection + hq --describe 'variable[*]' variables.tf + hq --schema + +docs: https://github.com/amplify-education/python-hcl2/tree/main/docs +""" + +# --------------------------------------------------------------------------- +# Helpers: strings +# --------------------------------------------------------------------------- + + +def _strip_dollar_wrap(text: str) -> str: + """Strip ``${...}`` wrapping from a serialized expression string.""" + if text.startswith("${") and text.endswith("}"): + return text[2:-1] + return text 
+ + +def _strip_quotes(text: str) -> str: + """Strip surrounding quotes from a string value.""" + if len(text) >= 2 and text[0] == '"' and text[-1] == '"': + return text[1:-1] + return text + + +def _rawify(value: Any) -> Any: + """Recursively strip quotes and ${} wrapping from all string values.""" + if isinstance(value, str): + return _strip_dollar_wrap(_strip_quotes(value)) + if isinstance(value, dict): + return {k: _rawify(v) for k, v in value.items()} + if isinstance(value, list): + return [_rawify(v) for v in value] + return value + + +# --------------------------------------------------------------------------- +# Helpers: I/O & errors +# --------------------------------------------------------------------------- + + +def _read_input(path: str) -> str: + """Read from a file path, or stdin if path is ``-``.""" + if path == "-": + return sys.stdin.read() + with open(path, encoding="utf-8") as f: + return f.read() + + +def _collect_files(path: str) -> List[str]: + """Return a list of HCL file paths from a file path, directory, or stdin marker.""" + if path == "-": + return ["-"] + if os.path.isdir(path): + return sorted( + os.path.join(dp, fn) + for dp, _, fnames in os.walk(path) + for fn in fnames + if os.path.splitext(fn)[1] in _HCL_EXTENSIONS + ) + return [path] + + +# _expand_file_args is imported from .helpers and re-exported at module level. 
+ + +def _error(msg: str, use_json: bool, **extra) -> str: + """Format an error message.""" + if use_json: + return json.dumps( + {"error": extra.get("error_type", "error"), "message": msg, **extra} + ) + return f"Error: {msg}" + + +# --------------------------------------------------------------------------- +# Helpers: JSON conversion & result metadata +# --------------------------------------------------------------------------- + + +def _convert_for_json( + value: Any, + options: Optional[SerializationOptions] = None, +) -> Any: + """Recursively convert NodeViews to dicts for JSON serialization.""" + if isinstance(value, NodeView): + return value.to_dict(options=options) + if isinstance(value, list): + return [_convert_for_json(item, options=options) for item in value] + return value + + +def _inject_provenance(converted: Any, file_path: str) -> Any: + """Add ``__file__`` key to dict results for multi-file provenance.""" + if isinstance(converted, dict): + return {"__file__": file_path, **converted} + return converted + + +def _extract_location(result: Any, file_path: str) -> dict: + """Extract source location metadata from a result.""" + from hcl2.query.pipeline import _LocatedDict + + loc: dict = {"__file__": file_path} + meta = None + if isinstance(result, NodeView): + meta = getattr(result.raw, "_meta", None) + elif isinstance(result, _LocatedDict): + meta = result._source_meta + if meta is not None: + for attr in ("line", "end_line", "column", "end_column"): + if hasattr(meta, attr): + loc[f"__{attr}__"] = getattr(meta, attr) + return loc + + +def _merge_location(converted: Any, location: dict) -> Any: + """Merge location metadata into a converted JSON value.""" + if isinstance(converted, dict): + return {**location, **converted} + return {"__value__": converted, **location} + + +# --------------------------------------------------------------------------- +# Query dispatch +# --------------------------------------------------------------------------- + + 
+def _normalize_eval_expr(expr_part: str) -> str: + """Normalize the eval expression after '::' for ergonomics.""" + stripped = expr_part.strip() + if not stripped: + return "_" + if stripped.startswith("_"): + return stripped + if stripped.startswith("."): + return "_" + stripped + # Check if it starts with a known function/variable name + if stripped.startswith(_EVAL_PREFIXES): + return stripped + return "_." + stripped + + +def _dispatch_query( + query_str: str, + is_eval: bool, + doc_view: DocumentView, + file_path: str = "", +) -> List[Any]: + """Dispatch a query and return results.""" + if is_eval: + result = safe_eval(query_str, {"doc": doc_view}) + if isinstance(result, list): + return result + return [result] + + # Hybrid mode: checked before pipeline since "::" is unambiguous + if "::" in query_str: + from hcl2.query.path import parse_path + + path_part, expr_part = query_str.split("::", 1) + segments = parse_path(path_part) + nodes = resolve_path(doc_view, segments) + expr = _normalize_eval_expr(expr_part) + return [safe_eval(expr, {"_": node, "doc": doc_view}) for node in nodes] + + # Structural mode: route through pipeline (handles pipes, builtins, select) + stages = [classify_stage(s) for s in split_pipeline(query_str)] + return execute_pipeline(doc_view, stages, file_path=file_path) + + +# --------------------------------------------------------------------------- +# Output: formatting & lifecycle +# --------------------------------------------------------------------------- + + +@dataclasses.dataclass +class OutputConfig: + """Output mode configuration for hq results. + + All fields are primitives or dataclasses, ensuring picklability + for ``multiprocessing.Pool`` workers. 
+ """ + + output_json: bool = False + output_value: bool = False + output_raw: bool = False + json_indent: Optional[int] = None + ndjson: bool = False + with_location: bool = False + with_comments: bool = False + no_filename: bool = False + serialization_options: Optional[SerializationOptions] = None + + def format_result(self, result: Any) -> str: + """Format a single result for output.""" + if self.output_json: + return json.dumps( + _convert_for_json(result, options=self.serialization_options), + indent=self.json_indent, + default=str, + ) + + if self.output_raw: + if isinstance(result, NodeView): + val = result.to_dict() + if isinstance(val, str): + return _strip_dollar_wrap(_strip_quotes(val)) + # For dicts with a single key (e.g. attribute), extract the value + if isinstance(val, dict) and len(val) == 1: + inner = next(iter(val.values())) + if isinstance(inner, str): + return _strip_dollar_wrap(_strip_quotes(inner)) + return str(inner) + return json.dumps(_rawify(val), default=str) + if isinstance(result, dict): + return json.dumps(_rawify(result), default=str) + if isinstance(result, str): + return _strip_dollar_wrap(_strip_quotes(result)) + return str(result) + + if self.output_value: + if isinstance(result, NodeView): + val = result.to_dict() + # Auto-unwrap single-key dicts (e.g. AttributeView → inner value) + if isinstance(val, dict) and len(val) == 1: + inner = next(iter(val.values())) + return _strip_dollar_wrap(str(inner)) + return _strip_dollar_wrap(str(val)) + if isinstance(result, str): + return _strip_dollar_wrap(result) + return str(result) + + # Default: HCL output + if isinstance(result, NodeView): + return result.to_hcl() + if isinstance(result, list): + return self.format_list(result) + if isinstance(result, str): + return _strip_dollar_wrap(result) + return str(result) + + def format_list(self, items: list) -> str: + """Format a list result (e.g. 
from hybrid mode returning a list).""" + if self.output_json: + converted = [ + _convert_for_json(item, options=self.serialization_options) + for item in items + ] + return json.dumps(converted, indent=self.json_indent, default=str) + parts = [] + for item in items: + if isinstance(item, NodeView): + parts.append( + item.to_hcl() if not self.output_value else str(item.to_dict()) + ) + else: + parts.append(str(item)) + if not self.output_value: + return "[" + ", ".join(parts) + "]" + return "\n".join(parts) + + def format_output(self, results: List[Any]) -> str: + """Format results for final output.""" + if self.output_json and len(results) > 1: + items = [ + _convert_for_json(item, options=self.serialization_options) + for item in results + ] + return json.dumps(items, indent=self.json_indent, default=str) + return "\n".join(self.format_result(r) for r in results) + + +def _convert_results( + results: List[Any], + file_path: str, + multi: bool, + output_config: OutputConfig, +) -> List[Any]: + """Convert query results for JSON output with location/provenance metadata.""" + converted = [] + for result in results: + item = _convert_for_json(result, options=output_config.serialization_options) + if output_config.with_location: + loc = _extract_location(result, file_path) + item = _merge_location(item, loc) + elif multi and not output_config.no_filename: + item = _inject_provenance(item, file_path) + converted.append(item) + return converted + + +class OutputSink: + """Owns the result output lifecycle: stream or accumulate, then flush.""" + + def __init__(self, output_config: OutputConfig, multi: bool): + self.config = output_config + self.multi = multi + self._accumulator: List[Any] = [] + + def __enter__(self): + return self + + def __exit__(self, *exc_info): + self.flush() + return False + + def emit(self, results: List[Any], file_path: str) -> None: + """Emit raw query results for one file (serial path).""" + # NDJSON — stream immediately + if self.config.ndjson: + 
for item in _convert_results(results, file_path, self.multi, self.config): + line = json.dumps(item, default=str) + if ( + self.multi + and not self.config.no_filename + and not self.config.with_location + and not isinstance(item, dict) + ): + line = f"{file_path}:{line}" + print(line, flush=True) + return + + # JSON + multi — accumulate for merged output + if self.config.output_json and self.multi: + self._accumulator.extend( + _convert_results(results, file_path, self.multi, self.config) + ) + return + + # Single-file output (with_location or default) + if self.config.with_location: + items = _convert_results(results, file_path, self.multi, self.config) + data = items[0] if len(items) == 1 else items + output = json.dumps(data, indent=self.config.json_indent, default=str) + else: + output = self.config.format_output(results) + + if self.multi and not self.config.no_filename and not self.config.with_location: + prefix = f"{file_path}:" + print("\n".join(prefix + ln for ln in output.splitlines())) + else: + print(output) + + def emit_converted(self, converted: List[Any]) -> None: + """Emit pre-converted results (parallel path).""" + if self.config.ndjson: + for item in converted: + print(json.dumps(item, default=str), flush=True) + elif self.config.output_json and self.multi: + self._accumulator.extend(converted) + + def flush(self) -> None: + """Sort and emit accumulated JSON results.""" + if not self._accumulator: + return + self._accumulator.sort( + key=lambda x: x.get("__file__", "") if isinstance(x, dict) else "" + ) + print( + json.dumps(self._accumulator, indent=self.config.json_indent, default=str) + ) + self._accumulator.clear() + + +# --------------------------------------------------------------------------- +# File-level query execution +# --------------------------------------------------------------------------- + + +def _run_query_on_file( + file_path: str, + query: str, + is_eval: bool, + use_json: bool, + raw_query: str, +) -> 
Tuple[Optional[List[Any]], int]: + """Parse a file and run a query. + + Returns ``(results, exit_code)``. On error, results is ``None`` and + exit_code is one of the ``EXIT_*`` constants. + """ + try: + text = _read_input(file_path) + except (OSError, IOError) as exc: + print(_error(str(exc), use_json, error_type="io_error"), file=sys.stderr) + return None, EXIT_IO_ERROR + + try: + doc = DocumentView.parse(text) + except Exception as exc: # pylint: disable=broad-except + print( + _error(str(exc), use_json, error_type="parse_error", file=file_path), + file=sys.stderr, + ) + return None, EXIT_PARSE_ERROR + + try: + return _dispatch_query(query, is_eval, doc, file_path=file_path), EXIT_SUCCESS + except Exception as exc: # pylint: disable=broad-except + if isinstance(exc, QuerySyntaxError): + extra = {"error_type": "query_syntax", "query": raw_query} + elif isinstance(exc, UnsafeExpressionError): + extra = {"error_type": "unsafe_expression", "expression": raw_query} + else: + extra = {"error_type": "eval_error", "query": raw_query} + print(_error(str(exc), use_json, **extra), file=sys.stderr) + return None, EXIT_QUERY_ERROR + + +def _process_file(args_tuple): + """Worker: parse, query, and convert results for one file. + + Returns ``(file_path, exit_code, converted_results, error_msg)``. + All return values are picklable plain Python objects. 
+ """ + file_path, query, is_eval, raw_query, multi, output_config = args_tuple + + try: + text = _read_input(file_path) + except (OSError, IOError) as exc: + return (file_path, EXIT_IO_ERROR, None, str(exc)) + + try: + doc = DocumentView.parse(text) + except Exception as exc: # pylint: disable=broad-except + return (file_path, EXIT_PARSE_ERROR, None, str(exc)) + + try: + results = _dispatch_query(query, is_eval, doc, file_path=file_path) + except Exception as exc: # pylint: disable=broad-except + return (file_path, EXIT_QUERY_ERROR, None, str(exc)) + + if not results: + return (file_path, EXIT_SUCCESS, [], None) + + converted = _convert_results(results, file_path, multi, output_config) + + return (file_path, EXIT_SUCCESS, converted, None) + + +def _run_diff( + file1: str, file2: str, use_json: bool, json_indent: Optional[int] +) -> int: + """Run structural diff between two HCL files. + + Returns an exit code: 0 if files are identical, 1 if they differ. + Exits directly on I/O or parse errors (matching ``diff(1)`` convention). 
+ """ + import hcl2 + from hcl2.query.diff import diff_dicts, format_diff_json, format_diff_text + + opts = SerializationOptions( + with_comments=False, with_meta=False, explicit_blocks=True + ) + for path in (file1, file2): + if path == "-": + continue + if not os.path.isfile(path): + print( + _error(f"File not found: {path}", use_json, error_type="io_error"), + file=sys.stderr, + ) + sys.exit(EXIT_IO_ERROR) + + try: + text1 = _read_input(file1) + text2 = _read_input(file2) + except (OSError, IOError) as exc: + print(_error(str(exc), use_json, error_type="io_error"), file=sys.stderr) + sys.exit(EXIT_IO_ERROR) + + try: + dict1 = hcl2.loads(text1, serialization_options=opts) + dict2 = hcl2.loads(text2, serialization_options=opts) + except Exception as exc: # pylint: disable=broad-except + print(_error(str(exc), use_json, error_type="parse_error"), file=sys.stderr) + sys.exit(EXIT_PARSE_ERROR) + + entries = diff_dicts(dict1, dict2) + if not entries: + return EXIT_SUCCESS + + if use_json: + print(format_diff_json(entries)) + else: + print(format_diff_text(entries)) + return EXIT_NO_RESULTS + + +# --------------------------------------------------------------------------- +# CLI: argument parsing & orchestration +# --------------------------------------------------------------------------- + + +def _build_parser() -> argparse.ArgumentParser: + """Build the argument parser for ``hq``.""" + parser = argparse.ArgumentParser( + prog="hq", + description=( + "Query HCL2 files using jq-like structural paths. " + "Supports pipes, select(), string functions, object construction. " + "Prefer structural queries over -e (eval) mode." 
+ ), + epilog=EXAMPLES_TEXT, + formatter_class=argparse.RawDescriptionHelpFormatter, + ) + parser.add_argument( + "QUERY", + nargs="?", + default=None, + help="Structural path, hybrid path::expr, or -e for eval", + ) + parser.add_argument( + "FILE", + nargs="*", + default=["-"], + help="HCL2 files or directories (default: stdin)", + ) + parser.add_argument( + "-e", + "--eval", + action="store_true", + help="Treat QUERY as a Python expression (doc bound to DocumentView)", + ) + + output_group = parser.add_mutually_exclusive_group() + output_group.add_argument("--json", action="store_true", help="Output as JSON") + output_group.add_argument( + "--value", action="store_true", help="Output raw value only" + ) + output_group.add_argument( + "--raw", + action="store_true", + help="Output raw string (strip surrounding quotes)", + ) + + parser.add_argument( + "--ndjson", + action="store_true", + help="Output one JSON object per line (newline-delimited JSON)", + ) + parser.add_argument( + "--json-indent", + type=int, + default=None, + metavar="N", + help="JSON indentation width (default: 2 for TTY, compact otherwise)", + ) + parser.add_argument( + "--version", + action="version", + version=__version__, + ) + parser.add_argument( + "--describe", + action="store_true", + help="Show type and available properties/methods for query results", + ) + parser.add_argument( + "--schema", + action="store_true", + help="Dump full view API schema as JSON (ignores QUERY/FILE)", + ) + parser.add_argument( + "--no-filename", + action="store_true", + help="Suppress filename prefix when querying directories", + ) + parser.add_argument( + "--diff", + metavar="FILE2", + help="Structural diff against FILE2", + ) + parser.add_argument( + "--with-location", + action="store_true", + help="Include source file and line numbers in JSON output", + ) + parser.add_argument( + "--with-comments", + action="store_true", + help="Include comments in JSON output", + ) + parser.add_argument( + "-j", + "--jobs", 
+ type=int, + default=None, + metavar="N", + help="Parallel workers (default: auto for large file sets, 0 or 1 = serial)", + ) + return parser + + +def _validate_and_configure( + parser: argparse.ArgumentParser, + args: argparse.Namespace, +) -> Tuple[bool, OutputConfig]: + """Validate argument combinations and build output configuration.""" + if args.ndjson: + if args.value: + parser.error("--ndjson cannot be combined with --value") + if args.raw: + parser.error("--ndjson cannot be combined with --raw") + + use_json = args.json or args.describe or args.schema or args.ndjson + output_raw = getattr(args, "raw", False) + + # Resolve JSON indent: explicit flag > TTY default (2) > compact (None) + if args.json_indent is not None: + json_indent: Optional[int] = args.json_indent + elif sys.stdout.isatty(): + json_indent = 2 + else: + json_indent = None + + if args.with_location and not use_json: + parser.error("--with-location requires --json or --ndjson") + if args.with_comments and not use_json: + parser.error("--with-comments requires --json or --ndjson") + + serialization_options = None + if args.with_comments: + serialization_options = SerializationOptions(with_comments=True) + + output_config = OutputConfig( + output_json=args.json, + output_value=args.value, + output_raw=output_raw, + json_indent=json_indent, + ndjson=args.ndjson, + with_location=args.with_location, + with_comments=args.with_comments, + no_filename=args.no_filename, + serialization_options=serialization_options, + ) + return use_json, output_config + + +def _resolve_query( + args: argparse.Namespace, + parser: argparse.ArgumentParser, + use_json: bool, + output_config: OutputConfig, +) -> Tuple[str, bool]: + """Handle early exits and resolve the query string. + + May call ``sys.exit`` for ``--schema``/``--diff`` or ``parser.error`` + for invalid arguments and never return. Otherwise returns + ``(query, optional)``. 
+ """ + # --schema: dump schema and exit + if args.schema: + print(json.dumps(build_schema(), indent=2)) + sys.exit(EXIT_SUCCESS) + + # --diff: structural diff mode + if args.diff: + file1 = args.QUERY + if file1 is None: + parser.error("--diff requires two files: hq FILE1 --diff FILE2") + sys.exit(_run_diff(file1, args.diff, use_json, output_config.json_indent)) + + # QUERY is required unless --schema or --diff + if args.QUERY is None: + parser.error("the following arguments are required: QUERY") + + # Detect common mistake: user passed a file path but no query. + if ( + args.FILE == ["-"] + and sys.stdin.isatty() + and args.QUERY + and (os.path.exists(args.QUERY) or os.sep in args.QUERY) + ): + parser.error(f"missing QUERY argument (did you mean: hq QUERY {args.QUERY}?)") + + # Handle trailing '?' (optional operator — exit 0 on empty results) + query = args.QUERY + optional = query.rstrip().endswith("?") and not args.eval + if optional: + query = query.rstrip()[:-1].rstrip() + + return query, optional + + +def _execute_and_emit( + args: argparse.Namespace, + query: str, + optional: bool, + use_json: bool, + output_config: OutputConfig, +) -> int: + """Execute queries across files and emit results. 
Returns an exit code.""" + file_paths = [ + fp for fa in _expand_file_args(args.FILE) for fp in _collect_files(fa) + ] + any_results = False + worst_exit = EXIT_SUCCESS + multi = len(file_paths) > 1 + + use_parallel = ( + multi + and len(file_paths) >= 20 + and "-" not in file_paths + and not args.eval + and not args.describe + and (args.json or args.ndjson) + and (args.jobs is None or args.jobs > 1) + ) + + with OutputSink(output_config, multi) as sink: + if use_parallel: + n_workers = args.jobs or min(os.cpu_count() or 1, len(file_paths)) + worker_args = [ + (fp, query, False, args.QUERY, multi, output_config) + for fp in file_paths + ] + with multiprocessing.Pool(n_workers) as pool: + for fp, exit_code, converted, error_msg in pool.imap_unordered( + _process_file, worker_args + ): + if error_msg: + etype = _EXIT_TO_ERROR_TYPE.get(exit_code, "error") + print( + _error(error_msg, use_json, error_type=etype), + file=sys.stderr, + ) + worst_exit = max(worst_exit, exit_code) + continue + if not converted: + continue + any_results = True + sink.emit_converted(converted) + else: + for file_path in file_paths: + results, exit_code = _run_query_on_file( + file_path, query, args.eval, use_json, args.QUERY + ) + if results is None: + worst_exit = max(worst_exit, exit_code) + continue + if not results: + continue + any_results = True + if args.describe: + print(json.dumps(describe_results(results), indent=2)) + continue + sink.emit(results, file_path) + + if any_results: + return EXIT_SUCCESS + return EXIT_SUCCESS if optional else worst_exit or EXIT_NO_RESULTS + + +def main(): + """The ``hq`` console_scripts entry point.""" + parser = _build_parser() + args = parser.parse_args() + use_json, output_config = _validate_and_configure(parser, args) + query, optional = _resolve_query(args, parser, use_json, output_config) + sys.exit(_execute_and_emit(args, query, optional, use_json, output_config)) + + +if __name__ == "__main__": + main() diff --git a/cli/json_to_hcl.py 
b/cli/json_to_hcl.py new file mode 100644 index 00000000..fa284322 --- /dev/null +++ b/cli/json_to_hcl.py @@ -0,0 +1,462 @@ +"""``jsontohcl2`` CLI entry point — convert JSON files to HCL2.""" + +import argparse +import difflib +import json +import os +import sys +from io import StringIO +from typing import TextIO + +import hcl2 +from hcl2 import dump +from hcl2.deserializer import DeserializerOptions +from hcl2.formatter import FormatterOptions +from hcl2.query.diff import diff_dicts, format_diff_json, format_diff_text +from hcl2.utils import SerializationOptions +from hcl2.version import __version__ +from cli.helpers import ( + EXIT_DIFF, + EXIT_IO_ERROR, + EXIT_PARSE_ERROR, + EXIT_PARTIAL, + JSON_SKIPPABLE, # used in _convert_* calls for skip handling + _convert_directory, + _convert_multiple_files, + _convert_single_file, + _error, + _expand_file_args, + _install_sigpipe_handler, +) + + +def _json_to_hcl( + in_file: TextIO, + out_file: TextIO, + d_opts: DeserializerOptions, + f_opts: FormatterOptions, +) -> None: + data = json.load(in_file) + dump(data, out_file, deserializer_options=d_opts, formatter_options=f_opts) + + +def _json_to_hcl_string( + in_file: TextIO, + d_opts: DeserializerOptions, + f_opts: FormatterOptions, +) -> str: + """Convert JSON input to an HCL string (for --diff / --dry-run).""" + buf = StringIO() + _json_to_hcl(in_file, buf, d_opts, f_opts) + return buf.getvalue() + + +def _json_to_hcl_fragment( + in_file: TextIO, + d_opts: DeserializerOptions, + f_opts: FormatterOptions, +) -> str: + """Convert a JSON fragment to HCL attribute assignments. + + Unlike normal conversion, this strips ``__is_block__`` markers so the + input is always treated as flat attributes — even if it came from + ``hcl2tojson`` output. 
+ """ + data = json.load(in_file) + if not isinstance(data, dict): + raise TypeError(f"--fragment expects a JSON object, got {type(data).__name__}") + data = _strip_block_markers(data) + buf = StringIO() + dump(data, buf, deserializer_options=d_opts, formatter_options=f_opts) + return buf.getvalue() + + +def _strip_block_markers(data): + """Recursively remove ``__is_block__`` keys from nested dicts.""" + if isinstance(data, dict): + return { + k: _strip_block_markers(v) for k, v in data.items() if k != "__is_block__" + } + if isinstance(data, list): + return [_strip_block_markers(item) for item in data] + return data + + +_EXAMPLES = """\ +examples: + jsontohcl2 file.json # single file to stdout + jsontohcl2 a.json b.json -o out/ # multiple files to output dir + jsontohcl2 --diff original.tf modified.json # preview text changes + jsontohcl2 --semantic-diff original.tf modified.json # semantic-only changes + jsontohcl2 --semantic-diff original.tf --diff-json m.json # semantic diff as JSON + jsontohcl2 --dry-run file.json # convert without writing + jsontohcl2 --fragment - # attribute snippet from stdin + echo '{"x": 1}' | jsontohcl2 # stdin (no args needed) + +fragment string format: + Strings use python-hcl2's inner-quote convention. To produce HCL "value", + the JSON string must be: "\\"value\\"". Unquoted strings become identifiers. 
+ Example: {"name": "\\"test\\"", "count": 3} => name = "test" count = 3 + +exit codes: + 0 Success + 1 JSON/encoding parse error + 2 Valid JSON but incompatible HCL structure + 4 I/O error (file not found) + 5 Differences found (--diff / --semantic-diff) +""" + + +def main(): # pylint: disable=too-many-branches,too-many-statements,too-many-locals + """The ``jsontohcl2`` console_scripts entry point.""" + _install_sigpipe_handler() + parser = argparse.ArgumentParser( + description="Convert JSON files to HCL2", + epilog=_EXAMPLES, + formatter_class=argparse.RawDescriptionHelpFormatter, + ) + parser.add_argument( + "-s", dest="skip", action="store_true", help="Skip un-parsable files" + ) + parser.add_argument( + "PATH", + nargs="*", + help="Files, directories, or glob patterns to convert (default: stdin)", + ) + parser.add_argument( + "-o", + "--output", + dest="output", + help="Output path (file for single input, directory for multiple inputs)", + ) + parser.add_argument( + "-q", + "--quiet", + action="store_true", + help="Suppress progress output on stderr (errors still shown)", + ) + mode_group = parser.add_mutually_exclusive_group() + mode_group.add_argument( + "--diff", + metavar="ORIGINAL", + help="Show unified diff against ORIGINAL file instead of writing output", + ) + mode_group.add_argument( + "--dry-run", + action="store_true", + help="Convert and print to stdout without writing files", + ) + mode_group.add_argument( + "--semantic-diff", + metavar="ORIGINAL", + help="Show semantic-only diff against ORIGINAL (ignores formatting)", + ) + mode_group.add_argument( + "--fragment", + action="store_true", + help="Treat input as a JSON fragment (attribute dict, not full HCL " + 'document). 
Strings must use inner quotes: \'"\\"value\\""\' for HCL ' + '"value", bare strings become identifiers', + ) + parser.add_argument( + "--diff-json", + action="store_true", + help="Output diff results as JSON (works with --diff and --semantic-diff)", + ) + parser.add_argument("--version", action="version", version=__version__) + + # DeserializerOptions flags + parser.add_argument( + "--colon-separator", + action="store_true", + help="Use colons instead of equals in object elements", + ) + parser.add_argument( + "--no-trailing-comma", + action="store_true", + help="Omit trailing commas in object elements", + ) + parser.add_argument( + "--heredocs-to-strings", + action="store_true", + help="Convert heredocs to plain strings", + ) + parser.add_argument( + "--strings-to-heredocs", + action="store_true", + help="Convert strings containing escaped newlines to heredocs", + ) + + # FormatterOptions flags + parser.add_argument( + "--indent", + type=int, + default=2, + metavar="N", + help="Indentation width (default: 2)", + ) + parser.add_argument( + "--no-open-empty-blocks", + action="store_true", + help="Collapse empty blocks to a single line", + ) + parser.add_argument( + "--no-open-empty-objects", + action="store_true", + help="Collapse empty objects to a single line", + ) + parser.add_argument( + "--open-empty-tuples", + action="store_true", + help="Expand empty tuples across multiple lines", + ) + parser.add_argument( + "--no-align", + action="store_true", + help="Disable vertical alignment of attributes and object elements", + ) + + args = parser.parse_args() + + d_opts = DeserializerOptions( + object_elements_colon=args.colon_separator, + object_elements_trailing_comma=not args.no_trailing_comma, + heredocs_to_strings=args.heredocs_to_strings, + strings_to_heredocs=args.strings_to_heredocs, + ) + f_opts = FormatterOptions( + indent_length=args.indent, + open_empty_blocks=not args.no_open_empty_blocks, + open_empty_objects=not args.no_open_empty_objects, + 
open_empty_tuples=args.open_empty_tuples, + vertically_align_attributes=not args.no_align, + vertically_align_object_elements=not args.no_align, + ) + quiet = args.quiet + + def convert(in_file, out_file): + _json_to_hcl(in_file, out_file, d_opts, f_opts) + + # Default to stdin when no paths given + paths = args.PATH if args.PATH else ["-"] + paths = _expand_file_args(paths) + output = args.output + + try: + # --diff mode: convert JSON, diff against original file + if args.diff: + if len(paths) != 1: + parser.error("--diff requires exactly one input file") + json_path = paths[0] + original_path = args.diff + + if not os.path.isfile(original_path): + print( + _error( + f"File not found: {original_path}", + error_type="io_error", + file=original_path, + ), + file=sys.stderr, + ) + sys.exit(EXIT_IO_ERROR) + + if json_path == "-": + hcl_output = _json_to_hcl_string(sys.stdin, d_opts, f_opts) + else: + with open(json_path, "r", encoding="utf-8") as f: + hcl_output = _json_to_hcl_string(f, d_opts, f_opts) + + with open(original_path, "r", encoding="utf-8") as f: + original_lines = f.readlines() + + converted_lines = hcl_output.splitlines(keepends=True) + diff_output = list( + difflib.unified_diff( + original_lines, + converted_lines, + fromfile=original_path, + tofile=f"(from {json_path})", + ) + ) + if diff_output: + if args.diff_json: + print( + json.dumps( + { + "from_file": original_path, + "to_file": json_path, + "diff": "".join(diff_output), + }, + indent=2, + ) + ) + else: + sys.stdout.writelines(diff_output) + sys.exit(EXIT_DIFF) + return + + # --semantic-diff mode: compare semantic dicts (ignores formatting) + if args.semantic_diff: + if len(paths) != 1: + parser.error("--semantic-diff requires exactly one input file") + json_path = paths[0] + original_path = args.semantic_diff + + if not os.path.isfile(original_path): + print( + _error( + f"File not found: {original_path}", + error_type="io_error", + file=original_path, + ), + file=sys.stderr, + ) + 
sys.exit(EXIT_IO_ERROR) + + # Parse original HCL → normalized dict + sem_opts = SerializationOptions( + with_comments=False, with_meta=False, explicit_blocks=True + ) + with open(original_path, "r", encoding="utf-8") as f: + dict_original = hcl2.load(f, serialization_options=sem_opts) + + # Load modified JSON → dict + if json_path == "-": + dict_modified = json.load(sys.stdin) + else: + with open(json_path, "r", encoding="utf-8") as f: + dict_modified = json.load(f) + + entries = diff_dicts(dict_original, dict_modified) + if entries: + if args.diff_json: + print(format_diff_json(entries)) + else: + print(format_diff_text(entries)) + sys.exit(EXIT_DIFF) + return + + # --dry-run mode: convert to stdout without writing + if args.dry_run: + if len(paths) != 1: + parser.error("--dry-run requires exactly one input file") + json_path = paths[0] + if json_path == "-": + hcl_output = _json_to_hcl_string(sys.stdin, d_opts, f_opts) + else: + with open(json_path, "r", encoding="utf-8") as f: + hcl_output = _json_to_hcl_string(f, d_opts, f_opts) + sys.stdout.write(hcl_output) + return + + # --fragment mode: convert JSON fragment to HCL attributes + if args.fragment: + if len(paths) != 1: + parser.error("--fragment requires exactly one input file") + json_path = paths[0] + if json_path == "-": + hcl_output = _json_to_hcl_fragment(sys.stdin, d_opts, f_opts) + else: + with open(json_path, "r", encoding="utf-8") as f: + hcl_output = _json_to_hcl_fragment(f, d_opts, f_opts) + sys.stdout.write(hcl_output) + return + + if len(paths) == 1: + path = paths[0] + if path == "-" or os.path.isfile(path): + if not _convert_single_file( + path, output, convert, args.skip, JSON_SKIPPABLE, quiet=quiet + ): + sys.exit(EXIT_PARTIAL) + elif os.path.isdir(path): + if output is None: + parser.error("directory conversion requires -o ") + if _convert_directory( + path, + output, + convert, + args.skip, + JSON_SKIPPABLE, + in_extensions={".json"}, + out_extension=".tf", + quiet=quiet, + ): + 
sys.exit(EXIT_PARTIAL) + else: + print( + _error( + f"File not found: {path}", + error_type="io_error", + file=path, + ), + file=sys.stderr, + ) + sys.exit(EXIT_IO_ERROR) + else: + for file_path in paths: + if file_path == "-": + parser.error("stdin (-) cannot be combined with other files") + if not os.path.isfile(file_path): + print( + _error( + f"File not found: {file_path}", + error_type="io_error", + file=file_path, + ), + file=sys.stderr, + ) + sys.exit(EXIT_IO_ERROR) + any_skipped = False + if output is None: + for file_path in paths: + if not _convert_single_file( + file_path, + None, + convert, + args.skip, + JSON_SKIPPABLE, + quiet=quiet, + ): + any_skipped = True + else: + any_skipped = _convert_multiple_files( + paths, + output, + convert, + args.skip, + JSON_SKIPPABLE, + out_extension=".tf", + quiet=quiet, + ) + if any_skipped: + sys.exit(EXIT_PARTIAL) + except json.JSONDecodeError as exc: + print( + _error(str(exc), error_type="json_parse_error"), + file=sys.stderr, + ) + sys.exit(EXIT_PARTIAL) + except UnicodeDecodeError as exc: + print( + _error(str(exc), error_type="parse_error"), + file=sys.stderr, + ) + sys.exit(EXIT_PARTIAL) + except (KeyError, TypeError, ValueError) as exc: + print( + _error(str(exc), error_type="structure_error"), + file=sys.stderr, + ) + sys.exit(EXIT_PARSE_ERROR) + except (OSError, IOError) as exc: + print( + _error(str(exc), error_type="io_error"), + file=sys.stderr, + ) + sys.exit(EXIT_IO_ERROR) + + +if __name__ == "__main__": + main() diff --git a/docs/01_getting_started.md b/docs/01_getting_started.md new file mode 100644 index 00000000..1c389a3c --- /dev/null +++ b/docs/01_getting_started.md @@ -0,0 +1,264 @@ +# Getting Started + +python-hcl2 parses [HCL2](https://github.com/hashicorp/hcl/blob/hcl2/hclsyntax/spec.md) into Python dicts and converts them back. This guide covers installation, everyday usage, and the CLI tools. + +## Installation + +python-hcl2 requires Python 3.8 or higher. 
+ +```sh +pip install python-hcl2 +``` + +For the CLI tools only (`hcl2tojson`, `jsontohcl2`, `hq`), [pipx](https://pipx.pypa.io/) installs them globally in an isolated environment: + +```sh +pipx install python-hcl2 +``` + +## Quick Reference + +| Function | Description | +|---|---| +| `hcl2.load(file)` | Parse an HCL2 file to a Python dict | +| `hcl2.loads(text)` | Parse an HCL2 string to a Python dict | +| `hcl2.dump(data, file)` | Write a Python dict as HCL2 to a file | +| `hcl2.dumps(data)` | Convert a Python dict to an HCL2 string | +| `hcl2.parse(file)` | Parse an HCL2 file to a LarkElement tree | +| `hcl2.parses(text)` | Parse an HCL2 string to a LarkElement tree | +| `hcl2.parse_to_tree(file)` | Parse an HCL2 file to a raw Lark tree | +| `hcl2.parses_to_tree(text)` | Parse an HCL2 string to a raw Lark tree | +| `hcl2.transform(lark_tree)` | Transform a raw Lark tree into a LarkElement tree | +| `hcl2.serialize(tree)` | Serialize a LarkElement tree to a Python dict | +| `hcl2.from_dict(data)` | Convert a Python dict into a LarkElement tree | +| `hcl2.from_json(text)` | Convert a JSON string into a LarkElement tree | +| `hcl2.reconstruct(tree)` | Convert a LarkElement tree (or Lark tree) to HCL2 text | +| `hcl2.Builder()` | Build HCL documents programmatically | +| `hcl2.query(source)` | Query HCL documents with typed view facades | + +For intermediate pipeline stages (`parse_to_tree`, `transform`, `serialize`, `from_dict`, `from_json`, `reconstruct`) and the `Builder` class, see [Advanced API Reference](03_advanced_api.md). 
+ +## HCL to Python dict + +Use `load` / `loads` to parse HCL2 into a Python dictionary: + +```python +import hcl2 + +with open("main.tf") as f: + data = hcl2.load(f) + +# or from a string +data = hcl2.loads('resource "aws_instance" "web" { ami = "abc-123" }') +``` + +### SerializationOptions + +The default serialization options are tuned for **content fidelity** — the output preserves enough detail (`__is_block__` markers, heredoc delimiters, quoted strings like `'"hello"'`, scientific notation, etc.) that it can be deserialized back into a LarkElement tree and reconstructed into valid HCL2 without information loss. This makes the defaults ideal for round-trip workflows (`load` → modify → `dump`), but it does add noise to the output compared to what you might expect from a plain JSON conversion. If you only need to *read* values and don't plan to reconstruct HCL2 from the dict, you can disable options like `explicit_blocks` and `preserve_heredocs`, or enable `strip_string_quotes` for cleaner output. + +Pass `serialization_options` to control how the dict is produced: + +```python +from hcl2 import loads, SerializationOptions + +data = loads(text, serialization_options=SerializationOptions( + with_meta=True, + wrap_objects=True, +)) +``` + +| Field | Type | Default | Description | +|---|---|---|-------------------------------------------------------------------------------------------------------------------------------------------------| +| `with_comments` | `bool` | `True` | Include comments as `__comments__` and `__inline_comments__` keys (see [Comment Format](#comment-format)) | +| `with_meta` | `bool` | `False` | Add `__start_line__` / `__end_line__` metadata | +| `wrap_objects` | `bool` | `False` | Wrap object values as inline HCL2 strings | +| `wrap_tuples` | `bool` | `False` | Wrap tuple values as inline HCL2 strings | +| `explicit_blocks` | `bool` | `True` | Add `__is_block__: True` markers to blocks. 
**Mandatory for JSON->HCL2 deserialization and reconstruction.** | +| `preserve_heredocs` | `bool` | `True` | Keep heredocs in their original form | +| `force_operation_parentheses` | `bool` | `False` | Force parentheses around all operations | +| `preserve_scientific_notation` | `bool` | `True` | Keep scientific notation as-is | +| `strip_string_quotes` | `bool` | `False` | Remove surrounding quotes from string values (e.g. `"hello"` instead of `'"hello"'`). **Breaks JSON->HCL2 deserialization and reconstruction.** | + +### Comment Format + +When `with_comments` is enabled (the default), comments are included as lists of objects under the `__comments__` and `__inline_comments__` keys. Each object has a `"value"` key containing the comment text (with delimiters stripped): + +```python +from hcl2 import loads, SerializationOptions + +data = loads( + "# Configure the provider\nx = 1\n", + serialization_options=SerializationOptions(with_comments=True), +) + +data["__comments__"] +# [{"value": "Configure the provider"}] +``` + +`__comments__` contains standalone comments (on their own lines), while `__inline_comments__` contains comments found inside expressions. + +> **Note:** Comments are currently **read-only** — they are captured during parsing but not restored when converting a dict back to HCL2 with `dump`/`dumps`. 
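The comment layout described above lends itself to a small traversal helper. The following is a minimal sketch, not part of python-hcl2: `collect_comments` is a hypothetical name, and the sample dict is hand-written to mimic the documented shape (lists of `{"value": ...}` objects under `__comments__` and `__inline_comments__`). It also assumes that nested blocks may carry their own comment keys, which the text above does not spell out.

```python
from typing import Any, List

COMMENT_KEYS = ("__comments__", "__inline_comments__")


def collect_comments(node: Any) -> List[str]:
    """Return every comment string found anywhere in a parsed dict.

    Hypothetical helper: walks dicts and lists depth-first and reads the
    "value" field of each entry stored under a comment key.
    """
    found: List[str] = []
    if isinstance(node, dict):
        for key, child in node.items():
            if key in COMMENT_KEYS and isinstance(child, list):
                # Each entry is expected to look like {"value": "text"}.
                found.extend(
                    entry["value"]
                    for entry in child
                    if isinstance(entry, dict) and "value" in entry
                )
            else:
                found.extend(collect_comments(child))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_comments(item))
    return found


# Hand-written sample that mimics the documented output shape (assumption).
sample = {
    "__comments__": [{"value": "Configure the provider"}],
    "x": 1,
    "locals": [
        {
            "__inline_comments__": [{"value": "fallback region"}],
            "region": '"eu-west-1"',
            "__is_block__": True,
        }
    ],
}

print(collect_comments(sample))
```

Since comments are not written back by `dump`/`dumps`, a helper like this is only suited to inspection or reporting, not to editing comments in place.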
+ +## Python dict to HCL + +Use `dump` / `dumps` to convert a Python dictionary back into HCL2 text: + +```python +import hcl2 + +hcl_string = hcl2.dumps(data) + +with open("output.tf", "w") as f: + hcl2.dump(data, f) +``` + +### DeserializerOptions + +Control how the dict is interpreted when building the LarkElement tree: + +```python +from hcl2 import dumps, DeserializerOptions + +text = dumps(data, deserializer_options=DeserializerOptions( + object_elements_colon=True, +)) +``` + +| Field | Type | Default | Description | +|---|---|---|---| +| `heredocs_to_strings` | `bool` | `False` | Convert heredocs to plain strings | +| `strings_to_heredocs` | `bool` | `False` | Convert strings with `\n` to heredocs | +| `object_elements_colon` | `bool` | `False` | Use `:` instead of `=` in object elements | +| `object_elements_trailing_comma` | `bool` | `True` | Add trailing commas in object elements | + +### FormatterOptions + +Control whitespace and alignment in the generated HCL2: + +```python +from hcl2 import dumps, FormatterOptions + +text = dumps(data, formatter_options=FormatterOptions( + indent_length=4, + vertically_align_attributes=False, +)) +``` + +| Field | Type | Default | Description | +|---|---|---|---| +| `indent_length` | `int` | `2` | Number of spaces per indentation level | +| `open_empty_blocks` | `bool` | `True` | Expand empty blocks across multiple lines | +| `open_empty_objects` | `bool` | `True` | Expand empty objects across multiple lines | +| `open_empty_tuples` | `bool` | `False` | Expand empty tuples across multiple lines | +| `vertically_align_attributes` | `bool` | `True` | Vertically align `=` signs in attribute groups | +| `vertically_align_object_elements` | `bool` | `True` | Vertically align `=` signs in object elements | + +## CLI Tools + +python-hcl2 ships three console scripts: `hcl2tojson`, `jsontohcl2`, and [`hq`](04_hq.md). + +### hcl2tojson + +Convert HCL2 files to JSON. 
Accepts files, directories, glob patterns, or stdin (default when no args given). + +```sh +hcl2tojson main.tf # single file to stdout +hcl2tojson main.tf -o output.json # single file to output file +hcl2tojson terraform/ -o output/ # directory to output dir +hcl2tojson --ndjson terraform/ # directory to stdout (NDJSON) +hcl2tojson --ndjson 'modules/**/*.tf' # glob + NDJSON streaming +hcl2tojson a.tf b.tf -o output/ # multiple files to output dir +hcl2tojson --only resource,module main.tf # block type filtering +hcl2tojson --fields cpu,memory main.tf # field projection +hcl2tojson --compact main.tf # single-line JSON +echo 'x = 1' | hcl2tojson # stdin (no args needed) +``` + +**Exit codes:** 0 = success, 1 = partial (some skipped), 2 = all unparsable, 4 = I/O error. + +**Flags:** + +| Flag | Description | +|---|---| +| `-o`, `--output` | Output path (file for single input, directory for multiple) | +| `-s` | Skip un-parsable files | +| `-q`, `--quiet` | Suppress progress output on stderr | +| `--ndjson` | One JSON object per line (newline-delimited JSON). Multi-file adds `__file__` provenance key. 
| +| `--compact` | Compact JSON output (no whitespace) | +| `--json-indent N` | JSON indentation width (default: 2 for TTY, compact otherwise) | +| `--only TYPES` | Comma-separated block types to include | +| `--exclude TYPES` | Comma-separated block types to exclude | +| `--fields FIELDS` | Comma-separated field names to keep | +| `--with-meta` | Add `__start_line__` / `__end_line__` metadata | +| `--with-comments` | Include comments as `__comments__` / `__inline_comments__` object lists | +| `--wrap-objects` | Wrap object values as inline HCL2 | +| `--wrap-tuples` | Wrap tuple values as inline HCL2 | +| `--no-explicit-blocks` | Disable `__is_block__` markers | +| `--no-preserve-heredocs` | Convert heredocs to plain strings | +| `--force-parens` | Force parentheses around all operations | +| `--no-preserve-scientific` | Convert scientific notation to standard floats | +| `--strip-string-quotes` | Strip surrounding double-quotes from string values (breaks round-trip) | +| `--version` | Show version and exit | + +> **Note on `--strip-string-quotes`:** This removes the surrounding `"..."` from serialized string values (e.g. `"\"my-bucket\""` becomes `"my-bucket"`). Useful for read-only workflows but round-trip through `jsontohcl2` is **not supported** with this option, as the parser cannot distinguish bare strings from expressions. + +### jsontohcl2 + +Convert JSON files to HCL2. Accepts files, directories, glob patterns, or stdin (default when no args given). 
+ +```sh +jsontohcl2 output.json # single file to stdout +jsontohcl2 output.json -o main.tf # single file to output file +jsontohcl2 output/ -o terraform/ # directory conversion +jsontohcl2 --diff original.tf modified.json # preview changes as unified diff +jsontohcl2 --semantic-diff original.tf modified.json # semantic-only diff (ignores formatting) +jsontohcl2 --semantic-diff original.tf --diff-json m.json # semantic diff as JSON +jsontohcl2 --dry-run file.json # convert without writing +jsontohcl2 --fragment - # attribute snippets from stdin +echo '{"x": 1}' | jsontohcl2 # stdin (no args needed) +``` + +**Exit codes:** 0 = success, 1 = JSON/encoding parse error, 2 = bad HCL structure, 4 = I/O error, 5 = differences found (`--diff` / `--semantic-diff`). + +**Flags:** + +| Flag | Description | +|---|---| +| `-o`, `--output` | Output path (file for single input, directory for multiple) | +| `-s` | Skip un-parsable files | +| `-q`, `--quiet` | Suppress progress output on stderr | +| `--diff ORIGINAL` | Show unified diff against ORIGINAL file (exit 0 = identical, 5 = differs) | +| `--semantic-diff ORIGINAL` | Show semantic-only diff against ORIGINAL (ignores formatting differences) | +| `--diff-json` | Output diff results as JSON (works with `--diff` and `--semantic-diff`) | +| `--dry-run` | Convert and print to stdout without writing files | +| `--fragment` | Treat input as attribute dict, not full HCL document (see note below) | +| `--indent N` | Indentation width (default: 2) | +| `--colon-separator` | Use `:` instead of `=` in object elements | +| `--no-trailing-comma` | Omit trailing commas in object elements | +| `--heredocs-to-strings` | Convert heredocs to plain strings | +| `--strings-to-heredocs` | Convert strings with escaped newlines to heredocs | +| `--no-open-empty-blocks` | Collapse empty blocks to a single line | +| `--no-open-empty-objects` | Collapse empty objects to a single line | +| `--open-empty-tuples` | Expand empty tuples across multiple 
lines | +| `--no-align` | Disable vertical alignment of attributes and object elements | +| `--version` | Show version and exit | + +> **Note on `--fragment` string format:** `--fragment` uses python-hcl2's standard JSON format, where HCL string values carry inner quotes. To produce the HCL attribute `name = "test"`, the JSON value must be `"\"test\""` (escaped inner quotes). A plain JSON string like `"test"` becomes the bare identifier `test`. This is the same convention used by `hcl2tojson` output — so piping `hcl2tojson` output into `jsontohcl2 --fragment` works correctly. Numbers, booleans, and expressions (`var.foo`, `local.name`) do not need quoting. + +### hq + +Query HCL2 files by structure, with optional Python expressions. + +```sh +hq 'resource.aws_instance.main.ami' main.tf +hq 'variable[*]' variables.tf --json +``` + +For the full guide, see [hq Reference](04_hq.md). + +## Next Steps + +- [Querying HCL (Python)](02_querying.md) — navigate documents with typed view facades +- [Advanced API Reference](03_advanced_api.md) — intermediate pipeline stages, Builder, pipeline diagram +- [hq Reference](04_hq.md) — query HCL files from the command line +- [hq Examples](05_hq_examples.md) — validated real-world queries by use case diff --git a/docs/02_querying.md b/docs/02_querying.md new file mode 100644 index 00000000..eacbd3d7 --- /dev/null +++ b/docs/02_querying.md @@ -0,0 +1,221 @@ +# Querying HCL (Python API) + +The query system lets you navigate HCL documents by structure rather than serializing to dicts. This page covers the Python API; for the `hq` CLI tool, see [hq Reference](04_hq.md). 
+ +## Quick Start + +```python +import hcl2 + +doc = hcl2.query('resource "aws_instance" "main" { ami = "abc-123" }') + +for block in doc.blocks("resource"): + print(block.block_type, block.name_labels) + ami = block.attribute("ami") + if ami: + print(f" ami = {ami.value}") +``` + +You can also parse from a file: + +```python +from hcl2.query import DocumentView + +doc = DocumentView.parse_file("main.tf") +``` + +## DocumentView + +The entry point for queries. Wraps a `StartRule`. + +```python +doc = DocumentView.parse(text) # from string +doc = DocumentView.parse_file("main.tf") # from file +doc = hcl2.query(text) # convenience alias +doc = hcl2.query(open("main.tf")) # also accepts file objects +``` + +| Method / Property | Returns | Description | +|---|---|---| +| `body` | `BodyView` | The document body | +| `blocks(block_type?, *labels)` | `List[BlockView]` | Blocks matching type and optional labels | +| `attributes(name?)` | `List[AttributeView]` | Attributes, optionally filtered by name | +| `attribute(name)` | `AttributeView \| None` | Single attribute by name | + +## BodyView + +Wraps a `BodyRule`. Same filtering methods as `DocumentView`. + +## BlockView + +Wraps a `BlockRule`. 
+ +```python +block = doc.blocks("resource", "aws_instance")[0] +block.block_type # "resource" +block.labels # ["resource", "aws_instance", "main"] +block.name_labels # ["aws_instance", "main"] +block.body # BodyView +``` + +| Property / Method | Returns | Description | +|---|---|---| +| `block_type` | `str` | First label (the block type name) | +| `labels` | `List[str]` | All labels as plain strings | +| `name_labels` | `List[str]` | Labels after the block type (`labels[1:]`) | +| `body` | `BodyView` | The block body | +| `blocks(...)` | `List[BlockView]` | Nested blocks (delegates to body) | +| `attributes(...)` | `List[AttributeView]` | Nested attributes (delegates to body) | +| `attribute(name)` | `AttributeView \| None` | Single nested attribute | + +## AttributeView + +Wraps an `AttributeRule`. + +```python +attr = doc.attribute("ami") +attr.name # "ami" +attr.value # '"abc-123"' (serialized Python value) +attr.value_node # NodeView over the expression +``` + +## Container Views + +### TupleView + +Wraps a `TupleRule`. Access via `find_all` or by navigating to a tuple-valued attribute. + +```python +from hcl2.query.containers import TupleView +from hcl2.walk import find_first +from hcl2.rules.containers import TupleRule + +doc = DocumentView.parse('x = [1, 2, 3]\n') +node = find_first(doc.attribute("x").raw, TupleRule) +tv = TupleView(node) +len(tv) # 3 +tv[0] # NodeView for the first element +tv.elements # List[NodeView] +``` + +### ObjectView + +Wraps an `ObjectRule`. + +```python +from hcl2.query.containers import ObjectView +from hcl2.rules.containers import ObjectRule + +node = find_first(doc.attribute("tags").raw, ObjectRule) +ov = ObjectView(node) +ov.keys # ["Name", "Env"] +ov.get("Name") # NodeView for the value +ov.entries # List[Tuple[str, NodeView]] +``` + +## Expression Views + +### ForTupleView / ForObjectView + +Wraps `ForTupleExprRule` / `ForObjectExprRule`. 
+ +```python +from hcl2.query.for_exprs import ForTupleView +from hcl2.rules.for_expressions import ForTupleExprRule + +doc = DocumentView.parse('x = [for item in var.list : item]\n') +node = find_first(doc.raw, ForTupleExprRule) +fv = ForTupleView(node) +fv.iterator_name # "item" +fv.second_iterator_name # None (or "v" for "k, v in ...") +fv.iterable # NodeView +fv.value_expr # NodeView +fv.has_condition # bool +fv.condition # NodeView | None +``` + +`ForObjectView` adds `key_expr` and `has_ellipsis`. + +### ConditionalView + +Wraps a `ConditionalRule` (ternary `condition ? true : false`). + +```python +from hcl2.query.expressions import ConditionalView +from hcl2.rules.expressions import ConditionalRule + +doc = DocumentView.parse('x = var.enabled ? "on" : "off"\n') +node = find_first(doc.raw, ConditionalRule) +cv = ConditionalView(node) +cv.condition # NodeView over the condition expression +cv.true_val # NodeView over the true branch +cv.false_val # NodeView over the false branch +``` + +### FunctionCallView + +Wraps a `FunctionCallRule`. 
+ +```python +from hcl2.query.functions import FunctionCallView +from hcl2.rules.functions import FunctionCallRule + +doc = DocumentView.parse('x = length(var.list)\n') +node = find_first(doc.raw, FunctionCallRule) +fv = FunctionCallView(node) +fv.name # "length" +fv.args # List[NodeView] +fv.has_ellipsis # bool +``` + +## Common NodeView Methods + +All view classes inherit from `NodeView`: + +| Method / Property | Returns | Description | +|---|---|---| +| `raw` | `LarkElement` | The underlying IR node | +| `parent_view` | `NodeView \| None` | View over the parent node | +| `to_hcl()` | `str` | Reconstruct this subtree as HCL text | +| `to_dict(options?)` | `Any` | Serialize to a Python value | +| `find_all(rule_type)` | `List[NodeView]` | Find descendants by rule class | +| `find_by_predicate(fn)` | `List[NodeView]` | Find descendants where `fn(view)` is truthy | +| `walk_semantic()` | `List[NodeView]` | All semantic descendant nodes | +| `walk_rules()` | `List[NodeView]` | All rule descendant nodes | + +## Tree Walking Primitives + +The `hcl2.walk` module provides free functions for traversing the IR tree directly (without view wrappers): + +```python +from hcl2.walk import walk, walk_rules, walk_semantic, find_all, find_first, ancestors +from hcl2.rules.base import AttributeRule + +tree = hcl2.parses('x = 1\ny = 2\n') + +# All nodes depth-first (including tokens) +for node in walk(tree): + print(node) + +# Only LarkRule nodes +for rule in walk_rules(tree): + print(rule) + +# Only semantic rules (skip NewLineOrCommentRule) +for rule in walk_semantic(tree): + print(rule) + +# Find specific rule types +attrs = list(find_all(tree, AttributeRule)) +first_attr = find_first(tree, AttributeRule) + +# Walk up the parent chain +for parent in ancestors(first_attr): + print(parent) +``` + +## Next Steps + +- [hq Reference](04_hq.md) — query HCL files from the command line +- [Advanced API Reference](03_advanced_api.md) — intermediate pipeline stages, Builder +- [Getting 
Started](01_getting_started.md) — core API (`load`/`dump`), options, CLI converters diff --git a/docs/03_advanced_api.md b/docs/03_advanced_api.md new file mode 100644 index 00000000..d8cc5b54 --- /dev/null +++ b/docs/03_advanced_api.md @@ -0,0 +1,147 @@ +# Advanced API Reference + +This document covers the intermediate pipeline stages, programmatic document construction with `Builder`, and the full pipeline diagram. For basic `load`/`dump` usage and options, see [Getting Started](01_getting_started.md). + +## Intermediate Pipeline Stages + +The full pipeline looks like this: + +``` +Forward: HCL2 Text → Lark Parse Tree → LarkElement Tree → Python Dict +Reverse: Python Dict → LarkElement Tree → HCL2 Text +``` + +You can access each stage individually for advanced use cases. + +### parse / parses — HCL2 text to LarkElement tree + +```python +tree = hcl2.parses('x = 1') # StartRule +tree = hcl2.parse(open("main.tf")) # StartRule +``` + +Pass `discard_comments=True` to strip comments during transformation. + +### parse_to_tree / parses_to_tree — HCL2 text to raw Lark tree + +```python +lark_tree = hcl2.parses_to_tree('x = 1') # lark.Tree +``` + +### transform — raw Lark tree to LarkElement tree + +```python +lark_tree = hcl2.parses_to_tree('x = 1') +tree = hcl2.transform(lark_tree) # StartRule +``` + +### serialize — LarkElement tree to Python dict + +```python +tree = hcl2.parses('x = 1') +data = hcl2.serialize(tree) +# or with options: +from hcl2 import SerializationOptions +data = hcl2.serialize(tree, serialization_options=SerializationOptions(with_meta=True)) +``` + +### from_dict / from_json — Python dict or JSON to LarkElement tree + +```python +tree = hcl2.from_dict(data) # StartRule +tree = hcl2.from_json('{"x": 1}') # StartRule +``` + +Both accept optional `deserializer_options`, `formatter_options`, and `apply_format` (default `True`). 
+ +### reconstruct — LarkElement tree (or Lark tree) to HCL2 text + +```python +tree = hcl2.from_dict(data) +text = hcl2.reconstruct(tree) +``` + +## Builder + +The `Builder` class produces dicts with the correct `__is_block__` markers so that `dumps` can distinguish blocks from plain objects: + +```python +import hcl2 + +doc = hcl2.Builder() +res = doc.block("resource", labels=["aws_instance", "web"], + ami="abc-123", instance_type="t2.micro") +res.block("tags", Name="HelloWorld") + +hcl_string = hcl2.dumps(doc.build()) +``` + +Output: + +```hcl +resource "aws_instance" "web" { + ami = "abc-123" + instance_type = "t2.micro" + + tags { + Name = "HelloWorld" + } +} +``` + +### Builder.block() + +```python +block( + block_type: str, + labels: Optional[List[str]] = None, + __nested_builder__: Optional[Builder] = None, + **attributes, +) -> Builder +``` + +Returns the child `Builder` for the new block, allowing chained calls. + +## Pipeline Diagram + +``` + Forward Pipeline + ================ + HCL2 Text + │ + ▼ + ┌──────────────────┐ parse_to_tree / parses_to_tree + │ Lark Parse Tree │ + └────────┬─────────┘ + │ transform + ▼ + ┌──────────────────┐ + │ LarkElement Tree │ parse / parses (shortcut: HCL2 text → here) + └────────┬─────────┘ + │ serialize + ▼ + ┌──────────────────┐ + │ Python Dict │ load / loads (shortcut: HCL2 text → here) + └──────────────────┘ + + + Reverse Pipeline + ================ + Python Dict / JSON + │ + ▼ + ┌──────────────────┐ from_dict / from_json + │ LarkElement Tree │ + └────────┬─────────┘ + │ reconstruct + ▼ + ┌──────────────────┐ + │ HCL2 Text │ dump / dumps (shortcut: Python Dict / JSON → here) + └──────────────────┘ +``` + +## See Also + +- [Getting Started](01_getting_started.md) — basic `load`/`dump` usage, options reference +- [Querying HCL (Python)](02_querying.md) — typed view facades and tree walking +- [hq Reference](04_hq.md) — query HCL files from the command line diff --git a/docs/04_hq.md b/docs/04_hq.md new file mode 100644 
index 00000000..96b4e20b --- /dev/null +++ b/docs/04_hq.md @@ -0,0 +1,667 @@ +# hq — HCL Query CLI + +`hq` is a jq-like query tool for HCL2 files. It ships with python-hcl2 and supports structural queries, hybrid Python expressions, and full eval mode. + +**Mode preference** (use the first one that works): + +1. **Structural** (default) — jq-like syntax with pipes, select, string functions, object construction; recommended. +1. **Hybrid** (`::`) — structural path on the left, Python expression on the right. Only when structural can't express the transform. +1. **Eval** (`-e`) — full Python expressions. Last resort — many operations are blocked for safety. + +## Structural Queries + +`hq` queries HCL2 files using dot-separated paths. Segments match block types, then name labels, then body contents. + +```sh +# Get all variable blocks +hq 'variable[*]' variables.tf + +# Navigate into a specific resource +hq 'resource.aws_instance.main.ami' main.tf + +# Output as JSON +hq 'variable[*]' variables.tf --json + +# Output raw values only +hq 'resource.aws_instance.main.ami' main.tf --value + +# Wildcard: all top-level blocks/attributes +hq '*' main.tf + +# Index: first variable only +hq 'variable[0]' variables.tf +``` + +### Path Grammar + +``` +path := segment ("." segment)* +segment := (type_filter ":")? name "~"? ("[*]" | "[" INT "]" | "[select(PRED)]")? +name := "*" | IDENTIFIER +type_filter := IDENTIFIER +``` + +- `name` matches block types and attribute names +- `type:name` matches only nodes of the given type (e.g. `function_call:length`) +- `name~` skips all block labels, going straight to the body (see below) +- `[*]` or `.[]` selects all matches at that level (`.[]` is a jq-compatible alias) +- `[N]` selects the Nth match (zero-based) +- `[select(PRED)]` filters matches using a predicate (see below) + +### Resolution Rules + +1. On a `DocumentView`/`BodyView`: segment matches block types and attribute names +1. 
On a `BlockView` with unconsumed name labels: segment matches the next label +1. On a `BlockView` with all labels consumed: delegates to body +1. On an `AttributeView`: unwraps to the value expression +1. On an `ObjectView`: segment matches keys +1. On a `TupleView`: `[N]` or `[*]` selects elements + +### Skip Labels (`~`) + +HCL blocks have labels (e.g. `resource "aws_instance" "main"`). Normally you consume them one segment at a time: `resource.aws_instance.main`. The `~` suffix explicitly skips all remaining labels and goes straight to the block body: + +```sh +# Without ~: must name every label or use wildcard +hq 'resource.aws_instance.main.ami' main.tf + +# With ~: skip all labels, access body directly +hq 'resource~.ami' main.tf + +# All resource blocks regardless of labels +hq 'resource~[*]' main.tf + +# Filter blocks by body content +hq 'resource~[select(.ami)]' main.tf + +# Combine with wildcards +hq '*~[*] | .block_type' main.tf --value +``` + +### Pipes + +Chain stages with `|`. Each stage feeds its results into the next. Between stages, attributes unwrap to their values and blocks unwrap to their bodies (for non-path stages like builtins and select). + +```sh +# Navigate through block body, then extract an attribute +hq 'resource.aws_instance.main | .ami' main.tf --value + +# Select with pipe +hq 'variable[*] | select(.default)' variables.tf --json + +# Builtins +hq 'x | keys' file.tf --json +hq 'x | length' file.tf --value +``` + +**Property accessors** — when a pipe stage like `.name` or `.block_type` doesn't match a structural path, it falls back to Python properties on the view: + +| View Type | Available Properties | +|---|---| +| `BlockView` | `.block_type` (e.g. `"resource"`), `.labels` (all labels including type), `.name_labels` (labels after the block type, e.g. 
`["aws_instance", "main"]`) | +| `AttributeView` | `.name` (attribute name), `.value` (serialized value) | +| `FunctionCallView` | `.name` (function name), `.args` (argument list), `.has_ellipsis` | +| `ForTupleView` | `.iterator_name`, `.second_iterator_name`, `.iterable`, `.value_expr`, `.has_condition`, `.condition` | +| `ForObjectView` | same as ForTupleView plus `.key_expr`, `.has_ellipsis` | +| `ConditionalView` | `.condition`, `.true_val`, `.false_val` | +| `TupleView` | `.elements` | +| `ObjectView` | `.keys`, `.entries` | +| All views | `.type` (short string like `"block"`, `"attribute"`, etc.) | + +```sh +# Get block types +hq 'resource[*] | .block_type' main.tf --value + +# Get name labels (labels after the block type) +hq 'resource[*] | .name_labels' main.tf --json + +# Get all labels including block type +hq 'resource[*] | .labels' main.tf --json + +# Get attribute names +hq '*[*] | .name' main.tf --value + +# Get function call names +hq '*..function_call:*[*] | .name' main.tf --value + +# Chain with builtins +hq 'resource[*] | .labels | length' main.tf --value +``` + +### Pipeline Semantics + +Between pipe stages: + +- **AttributeView** → unwraps to value node (ObjectView, TupleView, etc.) +- **BlockView** → unwraps to body (BodyView) +- **ExprTermRule** → unwraps to inner expression + +This means label traversal should be done within a single stage: + +```sh +# Good: label traversal in one stage, pipe at body boundary +hq 'resource.aws_instance.main | .ami' main.tf + +# Won't work: labels split across pipe stages +# hq 'resource | .aws_instance | .main | .ami' main.tf +``` + +### Select Predicates + +Filter results without eval. 
Two syntactic positions: + +**Bracket syntax** — inline in a path segment (works with type qualifiers too): + +```sh +hq '*[select(.name == "x")]' file.tf --value +hq 'variable[select(.default)]' variables.tf +hq '*..function_call:*[select(.args[2])]' file.tf # functions with >2 args +``` + +**Pipe stage** — as a pipeline stage: + +```sh +hq 'variable[*] | select(.default)' variables.tf --json +hq 'resource[*] | select(.block_type == "resource")' main.tf +``` + +**Predicate grammar:** + +``` +predicate := or_expr +or_expr := and_expr ("or" and_expr)* +and_expr := not_expr ("and" not_expr)* +not_expr := "not" not_expr | comparison +comparison := accessor (comp_op literal)? | any_all | has_expr +any_all := ("any" | "all") "(" accessor ";" predicate ")" +has_expr := "has" "(" STRING ")" +accessor := "." IDENT ("." IDENT)* ("[" INT "]")? ("|" BUILTIN_OR_FUNC)? +BUILTIN := "keys" | "values" | "length" | "not" +FUNC := ("contains" | "test" | "startswith" | "endswith") "(" STRING ")" +literal := STRING | NUMBER | "true" | "false" | "null" +comp_op := "==" | "!=" | "<" | ">" | "<=" | ">=" +``` + +Without a comparison operator, the accessor is an existence/truthy check. 
+ +**Builtin transforms in accessors** — append `| builtin` to apply a transform before comparing: + +```sh +# Functions with more than 2 arguments +hq '*..function_call:*[select(.args | length > 2)] | .name' file.tf --value + +# Blocks with more than 2 labels +hq '*[select(.labels | length > 2)]' file.tf +``` + +**String functions** (jq-compatible) — filter by substring, regex, or prefix/suffix: + +```sh +# contains — substring match +hq 'module~[select(.source | contains("docker"))]' dir/ --value + +# test — regex match +hq 'resource.aws_instance~[select(.ami | test("^ami-[0-9]+"))]' dir/ --value + +# startswith / endswith +hq '*[select(.name | startswith("prod-"))]' file.tf +hq '*[select(.path | endswith("/api"))]' file.tf +``` + +**`has("key")`** — jq-compatible key existence check: + +```sh +# Blocks with a "tags" attribute +hq 'resource~[select(has("tags"))]' main.tf + +# Equivalent to: select(.tags) +``` + +**Postfix `not`** — jq-style postfix negation (equivalent to prefix `not`): + +```sh +# Blocks without tags +hq 'resource~[select(.tags | not)]' main.tf + +# Equivalent to: select(not .tags) +``` + +**`any` / `all`** — iterate over a list-valued accessor and test a predicate on each element (jq-style `any(generator; condition)`): + +```sh +# Tuples that contain function calls +hq '*..tuple:*[*] | select(any(.elements; .type == "function_call"))' file.tf + +# Tuples where ALL elements are plain nodes +hq '*..tuple:*[*] | select(all(.elements; .type == "node"))' file.tf + +# Combine with boolean operators +hq '*..tuple:*[*] | select(any(.elements; .type == "function_call" or .type == "tuple"))' file.tf +``` + +**Virtual accessor `.type`** — returns the view type as a short string: + +| View Class | `.type` value | +|---|---| +| `DocumentView` | `"document"` | +| `BodyView` | `"body"` | +| `BlockView` | `"block"` | +| `AttributeView` | `"attribute"` | +| `ObjectView` | `"object"` | +| `TupleView` | `"tuple"` | +| `ForTupleView` | `"for_tuple"` | +| 
`ForObjectView` | `"for_object"` | +| `FunctionCallView` | `"function_call"` | +| `NodeView` | `"node"` | + +```sh +# Filter to only object-valued attributes, then get keys +hq '*[select(.type == "attribute")] | select(.type == "object") | keys' file.tf --json + +# Or use the type qualifier syntax +hq 'attribute:*[*] | select(.type == "object") | keys' file.tf --json +``` + +### Type Qualifiers + +Prefix a segment with `type:` to match only nodes of that type. Most useful with recursive descent: + +```sh +# Find all function calls named "length" anywhere in the document +hq '*..function_call:length' file.tf + +# Get the arguments of a specific function call +hq '*..function_call:length | .args' file.tf --json + +# All function calls (wildcard name) +hq '*..function_call:*[*]' file.tf + +# Filter top-level to only blocks +hq 'block:*[*]' file.tf +``` + +After resolving to a `FunctionCallView`, you can navigate into it: + +| Segment | Behavior | +|---|---| +| `args` | All arguments | +| `args[*]` | All arguments (explicit select-all) | +| `args[N]` | Nth argument (zero-based) | + +### Builtins + +Terminal transforms available as pipe stages: + +| Builtin | Description | +|---|---| +| `keys` | Object → key list; Body/Document → block type + attribute names; Block → labels | +| `values` | Object → values; Tuple → elements; Body → blocks + attributes | +| `length` | Tuple/Object/Body → count; others → 1 | + +Append `[*]` to unpack list results into individual pipeline items: + +```sh +hq 'tags | keys' file.tf --json # one JSON array +hq 'tags | keys[*]' file.tf --value # one key per line +hq 'items | length' file.tf --value +``` + +### Object Construction `{...}` (jq-style) + +Extract multiple fields into a JSON object per result. 
Matches jq syntax: + +```sh +# Shorthand — field name = key name +hq 'module~[*] | {source, cpu, memory}' dir/ --json +# Output: {"source": "...", "cpu": 2, "memory": 512} + +# Renamed keys +hq 'resource[*] | {type: .block_type, name: .name_labels}' main.tf --json +# Output: {"type": "resource", "name": ["aws_instance", "main"]} + +# Combine with select +hq 'resource~[select(has("tags"))] | {name: .name_labels, tags}' main.tf --json +``` + +### Optional Operator (`?`) + +Append `?` to a query to exit 0 even when no results are found: + +```sh +hq 'nonexistent?' file.tf --value; echo "exit: $?" # exit: 0 +hq 'x?' file.tf --value # prints value, exit: 0 +``` + +The `?` is a CLI-level concern only — it is not stripped in eval mode. It works with all query forms including `[select()]`. + +## Input: Multiple Files and Globs + +`hq` accepts multiple FILE arguments — files, directories, and glob patterns: + +```sh +# Multiple files +hq 'resource[*]' file1.tf file2.tf --json + +# Directory (walks for .tf, .hcl, .tfvars) +hq 'variable[*]' modules/ + +# Mix files and directories +hq 'resource[*]' main.tf modules/ --json + +# Glob patterns (expanded by hq, not the shell) +hq 'resource[*]' 'modules/**/*.tf' --json + +# Stdin (default when no FILE given) +echo 'x = 1' | hq 'x' --value +``` + +Glob patterns containing `*`, `?`, or `[` are expanded by `hq` itself (using Python's `glob.glob` with `recursive=True`), so they work even when the shell doesn't expand them (e.g. when quoted or used by agents). + +When multiple files are queried, output lines are prefixed with `filename:` (like grep). Use `--no-filename` to suppress. + +## Output Modes + +| Flag | Behavior | +|---|---| +| *(default)* | HCL reconstruction via `to_hcl()`, `str()` for primitives | +| `--json` | `to_dict()` then `json.dumps()`. Array for multiple results | +| `--value` | `to_dict()` for views, `str()` for primitives. Auto-unwraps single-key dicts (e.g. attribute → inner value). 
One per line | +| `--raw` | Like `--value` but strips surrounding `"quotes"` from strings — ideal for shell piping | +| `--ndjson` | One JSON object per line (newline-delimited JSON). Ideal for streaming/piping large result sets | + +### Compact JSON for Non-TTY + +When stdout is not a TTY (e.g. piped to another command), JSON output is compact (no indentation) by default. When stdout is a TTY, JSON is pretty-printed with 2-space indent. Override with `--json-indent N`. + +### NDJSON + +`--ndjson` emits one JSON object per line, flushed immediately. Useful for streaming large monorepos: + +```sh +# One JSON per result, compact +hq 'resource[*]' dir/ --ndjson + +# With source location +hq 'resource[*]' dir/ --ndjson --with-location + +# Pipe to jq for further processing +hq 'resource[*]' dir/ --ndjson | jq '.tags' +``` + +Cannot be combined with `--value` or `--raw`. + +### JSON Provenance (`__file__`) + +When querying multiple files with `--json` or `--ndjson`, dict results automatically include a `"__file__"` key indicating the source file. Use `--no-filename` to suppress. 
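
As a rough illustration of consuming this provenance key downstream, the sketch below groups NDJSON records by `__file__` using only the Python standard library. The `group_by_file` helper and the sample records are hypothetical and were written for this example; real `hq` output carries whatever fields your query returns.

```python
import json
from collections import defaultdict


def group_by_file(ndjson_text):
    """Group NDJSON records by their "__file__" key (hypothetical helper)."""
    grouped = defaultdict(list)
    for line in ndjson_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        # Non-dict results and records without "__file__" are grouped under None.
        source = record.pop("__file__", None) if isinstance(record, dict) else None
        grouped[source].append(record)
    return dict(grouped)


# Made-up sample shaped like the documented output; not captured from a real run.
sample = "\n".join([
    '{"__file__": "main.tf", "ami": "\\"ami-12345\\""}',
    '{"__file__": "modules/db/main.tf", "engine": "\\"postgres\\""}',
    '{"__file__": "main.tf", "ami": "\\"ami-67890\\""}',
])

for path, records in group_by_file(sample).items():
    print(path, len(records))
```

Removing `__file__` from each record while grouping keeps the remaining payload identical to what a single-file query would return, which may simplify later comparisons.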
+ +### Source Location (`--with-location`) + +Add `--with-location` to `--json` or `--ndjson` output to include source file and line numbers: + +```sh +hq 'resource[*]' main.tf --json --with-location +``` + +```json +{ + "__file__": "main.tf", + "__line__": 3, + "__end_line__": 7, + "__column__": 1, + "__end_column__": 2, + "ami": "\"ami-12345\"" +} +``` + +### Comments (`--with-comments`) + +Add `--with-comments` to `--json` or `--ndjson` output to include comments in the serialized output (uses `SerializationOptions(with_comments=True)`): + +```sh +hq 'resource[*]' main.tf --json --with-comments +``` + +## Exit Codes + +| Code | Meaning | +|---|---| +| 0 | Query matched at least one result (or `?` suffix used) | +| 1 | No results found | +| 2 | HCL parse error | +| 3 | Query syntax error or unsafe expression | +| 4 | I/O error (file not found, permission denied) | + +When querying multiple files, **"worst error wins"** — but only when no results. If any file produces results, exit 0 (grep-like semantics: a single unparseable file in a 100-file directory shouldn't mask success). + +### Parallel Processing + +When querying 20+ files with `--json` or `--ndjson`, `hq` automatically uses multiprocessing for faster results. Each file is parsed and queried in a separate worker process, with results merged at the end. + +```sh +# Auto-parallel (default for 20+ files with --json/--ndjson) +hq 'resource[*]' large-monorepo/ --ndjson + +# Force serial processing +hq 'resource[*]' large-monorepo/ --ndjson --jobs 0 + +# Explicit worker count +hq 'resource[*]' large-monorepo/ --ndjson --jobs 8 +``` + +Parallel mode is used when all of these are true: + +- 20+ files to process +- `--json` or `--ndjson` output mode +- Not reading from stdin +- Not using `--eval` or `--describe` +- `--jobs` is not `0` or `1` + +Text output modes (`--value`, `--raw`, default HCL) always run serially to preserve file ordering. 
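
The conditions above can be restated as a single predicate. The sketch below is only a paraphrase of the documented rules in Python and is not `hq`'s actual implementation; the function name, its parameters, and the use of `None` for "`--jobs` not given" are assumptions made for illustration.

```python
PARALLEL_THRESHOLD = 20  # "20+ files", per the list above
JSON_MODES = {"json", "ndjson"}


def would_run_parallel(file_count, output_mode, *, from_stdin=False,
                       eval_mode=False, describe=False, jobs=None):
    """Return True when every documented condition for parallel mode holds.

    `jobs=None` stands for "--jobs not given" (an assumption of this sketch).
    """
    return (
        file_count >= PARALLEL_THRESHOLD
        and output_mode in JSON_MODES
        and not from_stdin
        and not eval_mode
        and not describe
        and jobs not in (0, 1)
    )


print(would_run_parallel(50, "ndjson"))        # True
print(would_run_parallel(50, "value"))         # False: text modes run serially
print(would_run_parallel(50, "json", jobs=0))  # False: --jobs 0 forces serial
print(would_run_parallel(5, "json"))           # False: fewer than 20 files
```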
+ +### Agent Tips + +- Use distinct exit codes to distinguish "no results" (1) from "bad query" (3) from "file not found" (4) +- Pipe output through `--ndjson` for streaming; compact JSON is default for non-TTY +- Use `--with-location` for IDE/editor integration (file + line numbers) +- Quote glob patterns to prevent shell expansion: `hq 'query' 'modules/**/*.tf'` +- Use `?` suffix for optional queries that shouldn't fail on empty results +- Large repos: `--ndjson` with auto-parallel gives the best throughput + +## Diff + +Compare two HCL files structurally: + +```sh +hq file1.tf --diff file2.tf +hq file1.tf --diff file2.tf --json +``` + +## Hybrid Queries + +> **Note:** Most queries should use structural mode (pipes, select, object construction). Only reach for hybrid mode when you need a Python transform that structural mode can't express. + +Use `::` to split a structural path (left) from a Python eval expression (right). The expression runs once per result from the structural path, with `_` bound to each result. + +```sh +# Get name_labels for all variables +hq 'variable[*]::name_labels' variables.tf + +# Get block_type +hq 'variable[*]::block_type' variables.tf --value + +# Call methods +hq 'resource.aws_instance[*].tags::entries()' main.tf + +# Use builtins +hq 'variable[*]::len(_.name_labels)' variables.tf --value +``` + +**Expression normalization** (right of `::`): + +| Input | Normalized | +|---|---| +| `name_labels` | `_.name_labels` | +| `.foo` | `_.foo` | +| `_.foo` | `_.foo` (unchanged) | +| `len(_.x)` | `len(_.x)` (unchanged) | +| `doc.blocks()` | `doc.blocks()` (unchanged) | + +## Eval Mode + +> **Note:** Eval mode is a last resort. Many operations (comprehensions, imports, f-strings) are blocked for safety. Prefer structural queries with pipes, select, and object construction. + +Use `-e` to treat the entire query as a Python expression. `doc` is bound to the `DocumentView`. 
+ +```sh +# Access specific block attributes +hq -e 'doc.blocks("variable")[0].attribute("default").value' variables.tf --json + +# Sort blocks +hq -e 'sorted(doc.blocks("variable"), key=lambda b: b.name_labels[0])' variables.tf + +# Filter blocks +hq -e 'list(filter(lambda b: b.attribute("default"), doc.blocks("variable")))' variables.tf + +# Find by predicate +hq -e 'doc.find_by_predicate(lambda n: n.type == "attribute" and n.name == "ami")' main.tf +``` + +### Safe Eval Namespace + +- **Variables:** `doc` (DocumentView), `_` (per-result in hybrid mode) +- **Builtins:** `len`, `str`, `int`, `float`, `bool`, `list`, `tuple`, `type`, `isinstance`, `sorted`, `reversed`, `enumerate`, `zip`, `range`, `min`, `max`, `print`, `any`, `all`, `filter`, `map` +- **Allowed:** attribute access (except dunder attributes), method calls, subscripts, lambdas, comparisons, boolean/arithmetic ops, keyword arguments +- **Blocked:** imports, comprehensions, assignments, f-strings, walrus operator, `exec`/`eval`/`__import__`, dunder attribute access (`__class__`, `__subclasses__`, etc.) 
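
To make the allowed/blocked split more concrete, here is a minimal sketch of how an expression could be screened with Python's `ast` module before evaluation. It illustrates the general technique only and is not `hq`'s implementation; `find_violations`, `BLOCKED_NAMES`, and the returned messages are hypothetical, and the sketch covers just the blocked constructs listed above.

```python
import ast

BLOCKED_NAMES = {"exec", "eval", "__import__"}
COMPREHENSIONS = (ast.ListComp, ast.SetComp, ast.DictComp, ast.GeneratorExp)


def find_violations(expression):
    """Return a list of reasons why `expression` would be rejected (empty if none)."""
    try:
        # mode="eval" accepts a single expression, so statements such as
        # imports and assignments fail here with SyntaxError.
        tree = ast.parse(expression, mode="eval")
    except SyntaxError:
        return ["not a single expression"]

    problems = []
    for node in ast.walk(tree):
        if isinstance(node, COMPREHENSIONS):
            problems.append("comprehension")
        elif isinstance(node, ast.JoinedStr):
            problems.append("f-string")
        elif isinstance(node, ast.NamedExpr):
            problems.append("walrus operator")
        elif isinstance(node, ast.Name) and node.id in BLOCKED_NAMES:
            problems.append(f"blocked name: {node.id}")
        elif (isinstance(node, ast.Attribute)
              and node.attr.startswith("__") and node.attr.endswith("__")):
            problems.append(f"dunder attribute: {node.attr}")
    return problems


print(find_violations('doc.blocks("variable")[0].name_labels'))  # []
print(find_violations("[x for x in _]"))                          # ['comprehension']
print(find_violations("doc.__class__"))                           # ['dunder attribute: __class__']
print(find_violations("import os"))                               # ['not a single expression']
```

Checking the parsed tree rather than the source string avoids false positives on text that merely looks like a blocked construct, for example the word `eval` inside a string literal.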
+ +## Introspection + +**`--describe`** — Show type info and available API for query results: + +```sh +hq --describe 'variable[*]' variables.tf +``` + +```json +{ + "results": [ + { + "type": "BlockView", + "properties": ["block_type", "labels", "name_labels", "body", "raw", "parent_view"], + "methods": ["blocks(...)", "attributes(...)", "attribute(...)"], + "summary": "block_type='variable', labels=['variable', 'name']" + } + ] +} +``` + +**`--schema`** — Dump the full view API hierarchy as JSON (no QUERY or FILE needed): + +```sh +hq --schema +``` + +## All Flags + +| Flag | Description | +|---|---| +| `-e`, `--eval` | Treat QUERY as a Python expression | +| `--json` | Output as JSON | +| `--value` | Output raw values only (auto-unwraps attributes) | +| `--raw` | Output raw strings (strip surrounding quotes) | +| `--ndjson` | One JSON object per line (newline-delimited) | +| `--json-indent N` | JSON indentation width (default: 2 for TTY, compact otherwise) | +| `--with-location` | Include `__file__`, `__line__`, `__end_line__` in JSON output | +| `--with-comments` | Include comments in JSON output | +| `--describe` | Show type and available properties/methods | +| `--schema` | Dump full view API schema as JSON | +| `--diff FILE2` | Structural diff against FILE2 | +| `--no-filename` | Suppress filename prefix when querying directories | +| `-j N`, `--jobs N` | Parallel workers (default: auto for 20+ files, `0` or `1` = serial) | +| `--version` | Show version and exit | + +## Error Output + +Errors are printed to stderr. 
When `--json`, `--describe`, or `--schema` is active, errors are JSON: + +```json +{"error": "query_syntax", "message": "Invalid path segment: '123' in '123invalid'", "query": "123invalid"} +{"error": "unsafe_expression", "message": "comprehensions are not allowed", "expression": "[x for x in _]"} +{"error": "parse_error", "message": "Unexpected token ..."} +``` + +## Real-World Examples + +For a comprehensive collection of validated, task-oriented examples (discovery, compliance, cost analysis, deployment, etc.), see [hq Examples](05_hq_examples.md). + +A few syntax-focused examples showing feature combinations: + +```sh +# Skip labels, unpack keys — tag keys across all resources +hq 'resource~[*] | .tags | keys[*]' main.tf --value + +# Select + string functions — modules sourcing "docker" +hq 'module~[select(.source | contains("docker"))]' dir/ --value + +# Recursive descent + type qualifier — all function calls named "length" +hq '*..function_call:length | .args' file.tf --json + +# Object construction — multiple fields per result +hq 'resource[*] | {type: .block_type, name: .name_labels}' main.tf --json + +# Hybrid mode — Python expression on structural results +hq 'variable[select(.default)]::name_labels[0] + " = " + str(_.attribute("default").value)' variables.tf --value +``` + +## Piping to jq + +hq handles structural HCL navigation (blocks, labels, type filters, predicates). 
For data transforms on the results, pipe `--json` or `--ndjson` output to jq: + +```sh +# Defaults — fill missing values +hq 'resource~[*] | {name: .name_labels, tags}' main.tf --json | jq '.tags // "none"' + +# Reshaping — extract specific fields into arrays +hq 'resource[*]' main.tf --json | jq '[.block_type, .name]' + +# Mapping — transform each result +hq 'module~[*] | .source' dir/ --ndjson | jq -r 'ascii_downcase' + +# Filtering on JSON values +hq 'resource~[*]' main.tf --ndjson | jq 'select(.count > 3)' + +# Aggregation — group or sort across results +hq 'resource[*]' dir/ --ndjson | jq -s 'group_by(.block_type)' +``` + +**Mental model:** hq navigates HCL structure (block types, labels, attributes, nesting). Once you have `--json` output, you're in jq's world — use jq for arithmetic, string transforms, reshaping, defaults, and aggregation. + +## Coming from jq + +| jq | hq equivalent | Notes | +|---|---|---| +| `.key` | `.key` | Same syntax | +| `.[]` | `.[]` or `[*]` | `.[]` is an alias for `[*]` | +| `.[N]` | `[N]` | Zero-based index | +| `select(.x == "y")` | `[select(.x == "y")]` | Bracket or pipe syntax | +| `keys` | `\| keys` | Pipe stage | +| `length` | `\| length` | Pipe stage | +| `has("key")` | `[select(has("key"))]` | Inside select predicates | +| `contains("str")` | `[select(.field \| contains("str"))]` | Inside select predicates | +| `test("regex")` | `[select(.field \| test("regex"))]` | Inside select predicates | +| `{a, b}` | `{a, b}` | Object construction | +| `{newkey: .old}` | `{newkey: .old}` | Renamed keys | +| `map(.f)` | — | `--json \| jq 'map(.f)'` | +| `.x // "default"` | — | `--json \| jq '.x // "default"'` | +| `[.x, .y]` | — | `--json \| jq '[.x, .y]'` | +| `group_by(.f)` | — | `--ndjson \| jq -s 'group_by(.f)'` | +| `sort_by(.f)` | — | `--ndjson \| jq -s 'sort_by(.f)'` | +| `if-then-else` | — | `--json \| jq 'if ...'` or hybrid (`::`) mode | + +Features unique to hq (no jq equivalent): block type navigation 
(`resource.aws_instance`), label traversal, skip labels (`~`), type qualifiers (`function_call:name`), recursive descent (`..`), `--describe` / `--schema` introspection. + +## See Also + +- [hq Examples](05_hq_examples.md) — validated real-world queries by use case +- [Getting Started](01_getting_started.md) — core API (`load`/`dump`), options, CLI converters +- [Querying HCL (Python)](02_querying.md) — typed view facades for programmatic access +- [Advanced API Reference](03_advanced_api.md) — pipeline stages, Builder diff --git a/docs/05_hq_examples.md b/docs/05_hq_examples.md new file mode 100644 index 00000000..37297131 --- /dev/null +++ b/docs/05_hq_examples.md @@ -0,0 +1,207 @@ +# hq Examples — Real-World Queries + +Validated queries against a production Terraform codebase. All examples use structural mode. + +For query syntax reference, see [04_hq.md](04_hq.md). + +______________________________________________________________________ + +## Quick Reference + +```sh +hq 'resource[*]' main.tf --json # Query a file +hq 'resource[*]' infra/ --ndjson # Query a directory (recursive) +hq 'resource[*]' main.tf vars.tf --json # Multiple files +hq 'resource[*]' 'modules/**/*.tf' --json # Glob pattern +hq 'resource[*]' infra/ --json --with-location # With line numbers +hq 'locals.app_name?' dir/ --value # Exit 0 even if missing +``` + +______________________________________________________________________ + +## Discovery & Inventory + +### List all resources + +```sh +hq 'resource[*] | .name_labels' infra/ --value --no-filename +``` + +``` +['aws_instance', 'web_server'] +['aws_security_group', 'allow_https'] +['aws_iam_role', 'lambda_exec'] +``` + +### List resources of a specific type + +```sh +hq 'resource.aws_s3_bucket[*] | .name_labels' infra/ --value +``` + +### List all modules and their sources + +```sh +hq 'module~[*] | .source | .value' infra/ --value +``` + +The `~` skips remaining labels (module names) and descends into all named module blocks. 
`.source | .value` unwraps the attribute. + +``` +infra/prod/api/main.tf:"../../modules/ecs_service/v2" +infra/prod/api/main.tf:"../../modules/alb/v1" +``` + +### List module outputs + +```sh +hq 'output[*] | .name_labels' modules/ecs_service/v2/ --value --no-filename +``` + +### Count resources (blast radius) + +```sh +hq 'resource[*] | .name_labels' infra/prod/ --value --no-filename | wc -l +``` + +______________________________________________________________________ + +## Tags & Compliance + +### Find resources without tags + +```sh +hq 'resource~[select(not .tags)] | .name_labels' infra/ --value +``` + +Non-empty output = potential compliance violation. + +### Check if a required local exists + +```sh +# Script validation: exit 1 if missing +hq 'locals.app_name' infra/prod/new-service/ --value + +# Soft check: exit 0 either way +hq 'locals.app_name?' infra/prod/new-service/ --value +``` + +______________________________________________________________________ + +## Multi-Attribute Extraction + +Object construction extracts multiple fields per result in one query. 
+ +### Instance types for cost analysis + +```sh +hq 'resource.aws_instance~[*] | {name: .name_labels, type: .instance_type}' infra/ --ndjson +``` + +```json +{"__file__": "infra/prod/bastion/ec2.tf", "name": ["aws_instance", "bastion"], "type": "\"t3.small\""} +``` + +### CPU and memory for ECS services + +```sh +hq 'module~[select(.cpu)] | {cpu, memory}' infra/ --ndjson +``` + +```json +{"__file__": "infra/prod/api/ecs.tf", "cpu": 1024, "memory": 2048} +{"__file__": "infra/prod/worker/ecs.tf", "cpu": 2048, "memory": 4096} +``` + +### Resource inventory with tags + +```sh +hq 'resource~[*] | {type: .name_labels, tags}' infra/prod/ --ndjson +``` + +______________________________________________________________________ + +## Deployment & Scaling + +### Modules with a specific attribute + +```sh +hq 'module~[select(.auto_deploy)] | .auto_deploy | .value' infra/ --value +``` + +### Resources using for_each or count + +```sh +hq 'resource~[select(.for_each)] | .name_labels' infra/ --value +hq 'resource~[select(.count)] | .count | .value' infra/ --value +``` + +______________________________________________________________________ + +## Secrets & Parameters + +### List SSM parameter paths + +```sh +hq 'resource.aws_ssm_parameter~[*] | .name | .value' infra/ --value +``` + +``` +infra/prod/api/ssm.tf:"/${var.region}/${var.env}/api/db-password" +``` + +### Secrets passed to modules + +```sh +hq 'module~[select(.secrets)] | .secrets' infra/ --ndjson +``` + +______________________________________________________________________ + +## Provider & Module Versions + +### Provider version pins + +```sh +hq 'terraform.required_providers' infra/ --ndjson +``` + +```json +{"__file__": "infra/prod/api/versions.tf", "aws": {"source": "\"hashicorp/aws\"", "version": "\"5.80.0\""}} +``` + +### Find modules on a specific version + +```sh +hq 'module~[*] | {source}' infra/ --ndjson | grep 'ecs_service/v1' +``` + +______________________________________________________________________ + 
+## IAM & Networking + +```sh +hq 'data.aws_iam_policy_document[*] | .name_labels' infra/ --value +hq 'resource.aws_iam_role[*] | .name_labels' infra/ --value +hq 'resource.aws_route53_record[*] | .name_labels' infra/ --value +``` + +______________________________________________________________________ + +## Non-AWS and Non-Terraform + +`hq` parses any HCL2 file — not just `.tf`. + +```sh +hq 'resource.datadog_monitor[*] | .name_labels' config/datadog/ --value +hq 'resource.github_team[*] | .name_labels' config/github/ --value +hq 'resource[*] | .name_labels' config/snowflake/ --value +hq 'inputs' terragrunt.hcl --json +hq 'plugin[*]' .tflint.hcl --value +``` + +For output modes, exit codes, and all flags, see [hq Reference](04_hq.md). + +**Tip:** Use `--ndjson` for directory queries (streams, works with `head`/`grep`/`jq`). Use `--json` for single files or when you need a parseable array. + +**Performance:** When querying 20+ files with `--json` or `--ndjson`, `hq` automatically parallelizes across CPU cores. Use `--jobs 0` to force serial, or `--jobs N` to set an explicit worker count. diff --git a/docs/06_migrating_to_v8.md b/docs/06_migrating_to_v8.md new file mode 100644 index 00000000..bc9677c3 --- /dev/null +++ b/docs/06_migrating_to_v8.md @@ -0,0 +1,205 @@ +# Migrating to v8 + +This guide covers breaking changes when upgrading from python-hcl2 v7 to v8. Changes are ordered by likelihood of impact — if you only use `load()`/`loads()` to read HCL files, focus on the first three sections. + +## String values now include HCL quotes + +**Impact: high** — silently changes output without raising errors. + +In v7, `load()` stripped the surrounding double-quotes from HCL string values. In v8, quotes are preserved by default to enable lossless round-trips. 
+ +```python +# Given: name = "hello" + +# v7 +data["name"] # 'hello' + +# v8 (default) +data["name"] # '"hello"' +``` + +To restore v7 behavior: + +```python +import hcl2 +from hcl2 import SerializationOptions + +data = hcl2.load(f, serialization_options=SerializationOptions(strip_string_quotes=True)) +``` + +> **Note:** `strip_string_quotes=True` is one-way — dicts produced with it cannot round-trip back to HCL via `dumps()` because the quotes needed to distinguish strings from identifiers are gone. + +## New metadata keys in output dicts + +**Impact: high** — code that iterates keys or does exact-match assertions will break. + +v8 adds two new key categories to output dicts by default: + +| Key | Default | Purpose | +|---|---|---| +| `__is_block__` | on (`explicit_blocks=True`) | Distinguishes HCL blocks from plain objects | +| `__comments__`, `__inline_comments__` | on (`with_comments=True`) | Preserves HCL comments | + +To suppress them: + +```python +opts = SerializationOptions(explicit_blocks=False, with_comments=False) +data = hcl2.load(f, serialization_options=opts) +``` + +> **Note:** `explicit_blocks=False` disables round-trip support via `dumps()` — the deserializer needs `__is_block__` markers to reconstruct blocks correctly. + +The v7 metadata keys `__start_line__` and `__end_line__` are still available but remain opt-in: + +```python +opts = SerializationOptions(with_meta=True) +``` + +## `load()` / `loads()` signature changed + +**Impact: high** — calls using `with_meta` will raise `TypeError`. 
+ +The `with_meta` positional/keyword parameter has been replaced by a `SerializationOptions` object: + +```python +# v7 +data = hcl2.load(f, with_meta=True) +data = hcl2.loads(text, with_meta=True) + +# v8 +from hcl2 import SerializationOptions +data = hcl2.load(f, serialization_options=SerializationOptions(with_meta=True)) +data = hcl2.loads(text, serialization_options=SerializationOptions(with_meta=True)) +``` + +The `serialization_options` argument on `load()`/`loads()` is now keyword-only; the file or text argument remains positional. + +## `reverse_transform()` and `writes()` removed + +**Impact: medium** — calls will raise `ImportError` / `AttributeError`. + +The v7 two-step dict-to-HCL workflow has been replaced by `dump()`/`dumps()`: + +```python +# v7 +ast = hcl2.reverse_transform(data) +text = hcl2.writes(ast) + +# v8 +text = hcl2.dumps(data) + +# or to a file: +with open("output.tf", "w") as f: + hcl2.dump(data, f) +``` + +`dumps()` accepts optional `deserializer_options` and `formatter_options` for controlling the output: + +```python +from hcl2 import DeserializerOptions, FormatterOptions + +text = hcl2.dumps( + data, + deserializer_options=DeserializerOptions(object_elements_colon=True), + formatter_options=FormatterOptions(indent_length=4), +) +``` + +## `parse()` / `parses()` return type changed + +**Impact: medium** — code accessing Lark tree internals will break. + +These functions now return a typed `StartRule` (a `LarkElement` node) instead of a raw `lark.Tree`: + +```python +# v7 +tree = hcl2.parses(text) # -> lark.Tree +tree.data # 'start' +tree.children # [lark.Tree, ...] + +# v8 +tree = hcl2.parses(text) # -> StartRule +tree.body # typed BodyRule accessor +``` + +If you need the raw Lark tree, use the new explicit functions: + +```python +lark_tree = hcl2.parses_to_tree(text) # -> lark.Tree (raw) +rule_tree = hcl2.transform(lark_tree) # -> StartRule (typed) +``` + +## `transform()` signature and return type changed + +**Impact: medium** — same cause as above. 
+ +```python +# v7 +data = hcl2.transform(ast, with_meta=True) # -> dict + +# v8 +rule_tree = hcl2.transform(lark_tree, discard_comments=False) # -> StartRule +data = hcl2.serialize(rule_tree, serialization_options=opts) # -> dict +``` + +In v8, `transform()` produces a typed IR tree. To get a dict, follow it with `serialize()`. + +## `DictTransformer` and `reconstruction_parser` removed + +**Impact: low** — only affects code importing internals. + +| v7 import | v8 replacement | +|---|---| +| `from hcl2.transformer import DictTransformer` | Use `hcl2.transform()` + `hcl2.serialize()` | +| `from hcl2.parser import reconstruction_parser` | Use `hcl2.parser.parser()` (single parser) | +| `from hcl2.reconstructor import HCLReverseTransformer` | Use `hcl2.from_dict()` + `hcl2.reconstruct()` | + +## New pipeline stages + +v8 exposes the full bidirectional pipeline as composable functions: + +``` +Forward: HCL text -> parses_to_tree() -> transform() -> serialize() -> dict +Reverse: dict -> from_dict() -> reconstruct() -> HCL text +``` + +| Function | Input | Output | +|---|---|---| +| `parses_to_tree(text)` | HCL string | raw `lark.Tree` | +| `transform(lark_tree)` | `lark.Tree` | `StartRule` | +| `serialize(tree)` | `StartRule` | `dict` | +| `from_dict(data)` | `dict` | `StartRule` | +| `from_json(text)` | JSON string | `StartRule` | +| `reconstruct(tree)` | `StartRule` | HCL string | + +## CLI changes + +The `hcl2tojson` entry point moved from `hcl2.__main__:main` to `cli.hcl_to_json:main`. A shim keeps `python -m hcl2` working, but direct imports from `hcl2.__main__` should be updated. + +Two new CLI tools ship with v8: + +- **`jsontohcl2`** — convert JSON back to HCL2, with diff/dry-run support +- **`hq`** — structural query tool for HCL files (jq-like syntax) + +## Python 3.7 no longer supported + +The minimum Python version is now **3.8**. 
+ +## Quick reference: v7-compatible defaults + +If you want v8 to behave as closely to v7 as possible: + +```python +import hcl2 +from hcl2 import SerializationOptions + +V7_COMPAT = SerializationOptions( + strip_string_quotes=True, + explicit_blocks=False, + with_comments=False, +) + +data = hcl2.load(f, serialization_options=V7_COMPAT) +``` + +This restores the v7 dict shape but disables round-trip support and comment preservation. diff --git a/hcl2/__init__.py b/hcl2/__init__.py index 62f5a198..73f1ee07 100644 --- a/hcl2/__init__.py +++ b/hcl2/__init__.py @@ -8,11 +8,22 @@ from .api import ( load, loads, + dump, + dumps, parse, parses, + parse_to_tree, + parses_to_tree, + from_dict, + from_json, + reconstruct, transform, - reverse_transform, - writes, + serialize, + query, ) from .builder import Builder +from .deserializer import DeserializerOptions +from .formatter import FormatterOptions +from .rules.base import StartRule +from .utils import SerializationOptions diff --git a/hcl2/__main__.py b/hcl2/__main__.py index 17a021e1..7431bb13 100644 --- a/hcl2/__main__.py +++ b/hcl2/__main__.py @@ -1,106 +1,5 @@ -#!/usr/bin/env python -""" -This script recursively converts hcl2 files to json - -Usage: - hcl2tojson [-s] PATH [OUT_PATH] - -Options: - -s Skip un-parsable files - PATH The path to convert - OUT_PATH The path to write files to - --with-meta If set add meta parameters to the output_json like __start_line__ and __end_line__ -""" -import argparse -import json -import os -import sys - -from lark import UnexpectedCharacters, UnexpectedToken - -from . 
import load -from .version import __version__ - - -def main(): - """The `console_scripts` entry point""" - - parser = argparse.ArgumentParser( - description="This script recursively converts hcl2 files to json" - ) - parser.add_argument( - "-s", dest="skip", action="store_true", help="Skip un-parsable files" - ) - parser.add_argument("PATH", help="The file or directory to convert") - parser.add_argument( - "OUT_PATH", - nargs="?", - help="The path where to write files to. Optional when parsing a single file. " - "Output is printed to stdout if OUT_PATH is blank", - ) - parser.add_argument("--version", action="version", version=__version__) - parser.add_argument( - "--with-meta", - action="store_true", - help="If set add meta parameters to the output_json like __start_line__ and __end_line__", - ) - - args = parser.parse_args() - - skippable_exceptions = (UnexpectedToken, UnexpectedCharacters, UnicodeDecodeError) - - if os.path.isfile(args.PATH): - with open(args.PATH, "r", encoding="utf-8") as in_file: - # pylint: disable=R1732 - out_file = ( - sys.stdout - if args.OUT_PATH is None - else open(args.OUT_PATH, "w", encoding="utf-8") - ) - print(args.PATH, file=sys.stderr, flush=True) - json.dump(load(in_file, with_meta=args.with_meta), out_file) - if args.OUT_PATH is None: - out_file.write("\n") - out_file.close() - elif os.path.isdir(args.PATH): - processed_files = set() - if args.OUT_PATH is None: - raise RuntimeError("Positional OUT_PATH parameter shouldn't be empty") - if not os.path.exists(args.OUT_PATH): - os.mkdir(args.OUT_PATH) - for current_dir, _, files in os.walk(args.PATH): - dir_prefix = os.path.commonpath([args.PATH, current_dir]) - relative_current_dir = os.path.relpath(current_dir, dir_prefix) - current_out_path = os.path.normpath( - os.path.join(args.OUT_PATH, relative_current_dir) - ) - if not os.path.exists(current_out_path): - os.mkdir(current_out_path) - for file_name in files: - in_file_path = os.path.join(current_dir, file_name) - out_file_path 
= os.path.join(current_out_path, file_name) - out_file_path = os.path.splitext(out_file_path)[0] + ".json" - - # skip any files that we already processed or generated to avoid loops and file lock errors - if in_file_path in processed_files or out_file_path in processed_files: - continue - - processed_files.add(in_file_path) - processed_files.add(out_file_path) - - with open(in_file_path, "r", encoding="utf-8") as in_file: - print(in_file_path, file=sys.stderr, flush=True) - try: - parsed_data = load(in_file) - except skippable_exceptions: - if args.skip: - continue - raise - with open(out_file_path, "w", encoding="utf-8") as out_file: - json.dump(parsed_data, out_file) - else: - raise RuntimeError("Invalid Path", args.PATH) - +"""Allow ``python -m hcl2`` to run the hcl2tojson command.""" +from cli.hcl_to_json import main if __name__ == "__main__": main() diff --git a/hcl2/api.py b/hcl2/api.py index 399ba929..5d1c9520 100644 --- a/hcl2/api.py +++ b/hcl2/api.py @@ -1,67 +1,234 @@ -"""The API that will be exposed to users of this package""" -from typing import TextIO +"""The API that will be exposed to users of this package. + +Follows the json module convention: load/loads for reading, dump/dumps for writing. +Also exposes intermediate pipeline stages for advanced usage. 
+""" + +import json as _json +from typing import TextIO, Optional from lark.tree import Tree -from hcl2.parser import parser, reconstruction_parser -from hcl2.transformer import DictTransformer -from hcl2.reconstructor import HCLReconstructor, HCLReverseTransformer +from hcl2.deserializer import BaseDeserializer, DeserializerOptions +from hcl2.formatter import BaseFormatter, FormatterOptions +from hcl2.parser import parser as _get_parser +from hcl2.reconstructor import HCLReconstructor +from hcl2.rules.base import StartRule +from hcl2.transformer import RuleTransformer +from hcl2.utils import SerializationOptions + + +# --------------------------------------------------------------------------- +# Primary API: load / loads / dump / dumps +# --------------------------------------------------------------------------- + + +def load( + file: TextIO, + *, + serialization_options: Optional[SerializationOptions] = None, +) -> dict: + """Load a HCL2 file and return a Python dict. -def load(file: TextIO, with_meta=False) -> dict: - """Load a HCL2 file. - :param file: File with hcl2 to be loaded as a dict. - :param with_meta: If set to true then adds `__start_line__` and `__end_line__` - parameters to the output dict. Default to false. + :param file: File with HCL2 content. + :param serialization_options: Options controlling serialization behavior. """ - return loads(file.read(), with_meta=with_meta) + return loads(file.read(), serialization_options=serialization_options) -def loads(text: str, with_meta=False) -> dict: - """Load HCL2 from a string. - :param text: Text with hcl2 to be loaded as a dict. - :param with_meta: If set to true then adds `__start_line__` and `__end_line__` - parameters to the output dict. Default to false. +def loads( + text: str, + *, + serialization_options: Optional[SerializationOptions] = None, +) -> dict: + """Load HCL2 from a string and return a Python dict. + + :param text: HCL2 text. 
+ :param serialization_options: Options controlling serialization behavior. + """ + tree = parses(text) + return serialize(tree, serialization_options=serialization_options) + + +def dump( + data: dict, + file: TextIO, + *, + deserializer_options: Optional[DeserializerOptions] = None, + formatter_options: Optional[FormatterOptions] = None, +) -> None: + """Write a Python dict as HCL2 to a file. + + :param data: Python dict (as produced by :func:`load`). + :param file: Writable text file. + :param deserializer_options: Options controlling deserialization behavior. + :param formatter_options: Options controlling formatting behavior. """ - # append new line as a workaround for https://github.com/lark-parser/lark/issues/237 + file.write( + dumps( + data, + deserializer_options=deserializer_options, + formatter_options=formatter_options, + ) + ) + + +def dumps( + data: dict, + *, + deserializer_options: Optional[DeserializerOptions] = None, + formatter_options: Optional[FormatterOptions] = None, +) -> str: + """Convert a Python dict to an HCL2 string. + + :param data: Python dict (as produced by :func:`load`). + :param deserializer_options: Options controlling deserialization behavior. + :param formatter_options: Options controlling formatting behavior. + """ + tree = from_dict( + data, + deserializer_options=deserializer_options, + formatter_options=formatter_options, + ) + return reconstruct(tree) + + +# --------------------------------------------------------------------------- +# Parsing: HCL text -> LarkElement tree or raw Lark tree +# --------------------------------------------------------------------------- + + +def parse(file: TextIO, *, discard_comments: bool = False) -> StartRule: + """Parse a HCL2 file into a LarkElement tree. + + :param file: File with HCL2 content. + :param discard_comments: If True, discard comments during transformation. 
+ """ + return parses(file.read(), discard_comments=discard_comments) + + +def parses(text: str, *, discard_comments: bool = False) -> StartRule: + """Parse a HCL2 string into a LarkElement tree. + + :param text: HCL2 text. + :param discard_comments: If True, discard comments during transformation. + """ + lark_tree = parses_to_tree(text) + return transform(lark_tree, discard_comments=discard_comments) + + +def parse_to_tree(file: TextIO) -> Tree: + """Parse a HCL2 file into a raw Lark parse tree. + + :param file: File with HCL2 content. + """ + return parses_to_tree(file.read()) + + +def parses_to_tree(text: str) -> Tree: + """Parse a HCL2 string into a raw Lark parse tree. + + :param text: HCL2 text. + """ + # Append newline as workaround for https://github.com/lark-parser/lark/issues/237 # Lark doesn't support EOF token so our grammar can't look for "new line or end of file" - # This means that all blocks must end in a new line even if the file ends - # Append a new line as a temporary fix - tree = parser().parse(text + "\n") - return DictTransformer(with_meta=with_meta).transform(tree) + return _get_parser().parse(text + "\n") + +# --------------------------------------------------------------------------- +# Intermediate pipeline stages +# --------------------------------------------------------------------------- -def parse(file: TextIO) -> Tree: - """Load HCL2 syntax tree from a file. - :param file: File with hcl2 to be loaded as a dict. + +def from_dict( + data: dict, + *, + deserializer_options: Optional[DeserializerOptions] = None, + formatter_options: Optional[FormatterOptions] = None, + apply_format: bool = True, +) -> StartRule: + """Convert a Python dict into a LarkElement tree. + + :param data: Python dict (as produced by :func:`load`). + :param deserializer_options: Options controlling deserialization behavior. + :param formatter_options: Options controlling formatting behavior. + :param apply_format: If True (default), apply formatting to the tree. 
""" - return parses(file.read()) + deserializer = BaseDeserializer(deserializer_options) + tree = deserializer.load_python(data) + if apply_format: + formatter = BaseFormatter(formatter_options) + formatter.format_tree(tree) + return tree + +def from_json( + text: str, + *, + deserializer_options: Optional[DeserializerOptions] = None, + formatter_options: Optional[FormatterOptions] = None, + apply_format: bool = True, +) -> StartRule: + """Convert a JSON string into a LarkElement tree. -def parses(text: str) -> Tree: - """Load HCL2 syntax tree from a string. - :param text: Text with hcl2 to be loaded as a dict. + :param text: JSON string. + :param deserializer_options: Options controlling deserialization behavior. + :param formatter_options: Options controlling formatting behavior. + :param apply_format: If True (default), apply formatting to the tree. """ - return reconstruction_parser().parse(text) + data = _json.loads(text) + return from_dict( + data, + deserializer_options=deserializer_options, + formatter_options=formatter_options, + apply_format=apply_format, + ) -def transform(ast: Tree, with_meta=False) -> dict: - """Convert an HCL2 AST to a dictionary. - :param ast: HCL2 syntax tree, output from `parse` or `parses` - :param with_meta: If set to true then adds `__start_line__` and `__end_line__` - parameters to the output dict. Default to false. +def reconstruct(tree) -> str: + """Convert a LarkElement tree (or raw Lark tree) to an HCL2 string. + + :param tree: A :class:`StartRule` (LarkElement tree) or :class:`lark.Tree`. """ - return DictTransformer(with_meta=with_meta).transform(ast) + reconstructor = HCLReconstructor() + if isinstance(tree, StartRule): + tree = tree.to_lark() + return reconstructor.reconstruct(tree) + +def transform(lark_tree: Tree, *, discard_comments: bool = False) -> StartRule: + """Transform a raw Lark parse tree into a LarkElement tree. -def reverse_transform(hcl2_dict: dict) -> Tree: - """Convert a dictionary to an HCL2 AST. 
- :param hcl2_dict: a dictionary produced by `load` or `transform` + :param lark_tree: Raw Lark tree from :func:`parse_to_tree` or :func:`parses_to_tree`. + :param discard_comments: If True, discard comments during transformation. """ - return HCLReverseTransformer().transform(hcl2_dict) + return RuleTransformer(discard_new_line_or_comments=discard_comments).transform( + lark_tree + ) + + +def query(source): + """Parse HCL2 text or file into a DocumentView for querying. + + :param source: HCL2 text string or file-like object. + """ + from hcl2.query.body import DocumentView # avoid circular with hcl2.query package + + if hasattr(source, "read"): + return DocumentView(parse(source)) + return DocumentView(parses(source)) + +def serialize( + tree: StartRule, + *, + serialization_options: Optional[SerializationOptions] = None, +) -> dict: + """Serialize a LarkElement tree to a Python dict. -def writes(ast: Tree) -> str: - """Convert an HCL2 syntax tree to a string. - :param ast: HCL2 syntax tree, output from `parse` or `parses` + :param tree: A :class:`StartRule` (LarkElement tree). + :param serialization_options: Options controlling serialization behavior. """ - return HCLReconstructor(reconstruction_parser()).reconstruct(ast) + if serialization_options is not None: + return tree.serialize(options=serialization_options) + return tree.serialize() diff --git a/hcl2/builder.py b/hcl2/builder.py index b5b149da..5ef0c416 100644 --- a/hcl2/builder.py +++ b/hcl2/builder.py @@ -3,18 +3,16 @@ from collections import defaultdict -from hcl2.const import START_LINE_KEY, END_LINE_KEY +from hcl2.const import IS_BLOCK class Builder: """ The `hcl2.Builder` class produces a dictionary that should be identical to the - output of `hcl2.load(example_file, with_meta=True)`. The `with_meta` keyword - argument is important here. HCL "blocks" in the Python dictionary are - identified by the presence of `__start_line__` and `__end_line__` metadata - within them. 
The `Builder` class handles adding that metadata. If that metadata - is missing, the `hcl2.reconstructor.HCLReverseTransformer` class fails to - identify what is a block and what is just an attribute with an object value. + output of `hcl2.load(example_file)`. HCL "blocks" in the Python dictionary are + identified by the presence of `__is_block__: True` markers within them. + The `Builder` class handles adding that marker. If that marker is missing, + the deserializer fails to distinguish blocks from regular object attributes. """ def __init__(self, attributes: Optional[dict] = None): @@ -49,8 +47,7 @@ def build(self): body.update( { - START_LINE_KEY: -1, - END_LINE_KEY: -1, + IS_BLOCK: True, **self.attributes, } ) @@ -79,7 +76,7 @@ def _add_nested_blocks( """Add nested blocks defined within another `Builder` instance to the `block` dictionary""" nested_block = nested_blocks_builder.build() for key, value in nested_block.items(): - if key not in (START_LINE_KEY, END_LINE_KEY): + if key != IS_BLOCK: if key not in block.keys(): block[key] = [] block[key].extend(value) diff --git a/hcl2/const.py b/hcl2/const.py index 1d46f35a..555c56aa 100644 --- a/hcl2/const.py +++ b/hcl2/const.py @@ -1,4 +1,5 @@ """Module for various constants used across the library""" -START_LINE_KEY = "__start_line__" -END_LINE_KEY = "__end_line__" +IS_BLOCK = "__is_block__" +COMMENTS_KEY = "__comments__" +INLINE_COMMENTS_KEY = "__inline_comments__" diff --git a/hcl2/deserializer.py b/hcl2/deserializer.py new file mode 100644 index 00000000..3902f9ca --- /dev/null +++ b/hcl2/deserializer.py @@ -0,0 +1,400 @@ +"""Deserialize Python dicts (or JSON) into LarkElement trees.""" + +import json +import re +from abc import ABC, abstractmethod +from dataclasses import dataclass +from functools import cached_property +from typing import Any, TextIO, List, Optional, Union + +from regex import regex + +from hcl2.parser import parser as _get_parser +from hcl2.const import IS_BLOCK, COMMENTS_KEY, 
INLINE_COMMENTS_KEY +from hcl2.rules.abstract import LarkElement, LarkRule +from hcl2.rules.base import ( + BlockRule, + AttributeRule, + BodyRule, + StartRule, +) +from hcl2.rules.containers import ( + TupleRule, + ObjectRule, + ObjectElemRule, + ObjectElemKeyExpressionRule, + ObjectElemKeyRule, +) +from hcl2.rules.expressions import ExprTermRule +from hcl2.rules.literal_rules import ( + IdentifierRule, + IntLitRule, + FloatLitRule, +) +from hcl2.rules.strings import ( + StringRule, + InterpolationRule, + StringPartRule, + HeredocTemplateRule, + HeredocTrimTemplateRule, +) +from hcl2.rules.tokens import ( + NAME, + EQ, + DBLQUOTE, + STRING_CHARS, + ESCAPED_INTERPOLATION, + INTERP_START, + RBRACE, + IntLiteral, + FloatLiteral, + RSQB, + LSQB, + COMMA, + LBRACE, + HEREDOC_TRIM_TEMPLATE, + HEREDOC_TEMPLATE, + COLON, +) +from hcl2.transformer import RuleTransformer +from hcl2.utils import HEREDOC_TRIM_PATTERN, HEREDOC_PATTERN + + +@dataclass +class DeserializerOptions: + """Options controlling how Python dicts are deserialized into LarkElement trees.""" + + # Convert heredoc values (< LarkElement: + """Deserialize a JSON string into a LarkElement tree.""" + raise NotImplementedError() + + def load(self, file: TextIO) -> LarkElement: + """Deserialize a JSON file into a LarkElement tree.""" + return self.loads(file.read()) + + +class BaseDeserializer(LarkElementTreeDeserializer): + """Default deserializer: Python dict/JSON → LarkElement tree.""" + + @cached_property + def _transformer(self) -> RuleTransformer: + return RuleTransformer() + + def load_python(self, value: Any) -> StartRule: + """Deserialize a Python object into a StartRule tree.""" + if not isinstance(value, dict): + raise TypeError( + f"Expected dict for top-level HCL body, got {type(value).__name__}" + ) + # Top-level dict is always a body (attributes + blocks), not an object + children = self._deserialize_block_elements(value) + return StartRule([BodyRule(children)]) + + def loads(self, value: str) -> 
LarkElement: + """Deserialize a JSON string into a LarkElement tree.""" + return self.load_python(json.loads(value)) + + def _deserialize(self, value: Any) -> LarkElement: + if isinstance(value, dict): + if self._contains_block_marker(value): + + children: List[Any] = [] + + block_elements = self._deserialize_block_elements(value) + for element in block_elements: + children.append(element) + + return BodyRule(children) + + return self._deserialize_object(value) + + if isinstance(value, list): + return self._deserialize_list(value) + + return self._deserialize_text(value) + + def _deserialize_block_elements(self, value: dict) -> List[LarkElement]: + children: List[LarkElement] = [] + for key, val in value.items(): + if self._is_block(val): + # this value is a list of blocks, iterate over each block and deserialize them + for block in val: + children.append(self._deserialize_block(key, block)) + + else: + # otherwise it's just an attribute + if not self._is_reserved_key(key): + children.append(self._deserialize_attribute(key, val)) + + return children + + # pylint: disable=R0911 + def _deserialize_text(self, value: Any) -> LarkRule: + # bool must be checked before int since bool is a subclass of int + if isinstance(value, bool): + return self._deserialize_identifier(str(value).lower()) + + if isinstance(value, float): + return FloatLitRule([FloatLiteral(value)]) + + if isinstance(value, int): + return IntLitRule([IntLiteral(value)]) + + if isinstance(value, str): + if value.startswith('"') and value.endswith('"'): + if not self.options.heredocs_to_strings and value.startswith('"<<-'): + match = HEREDOC_TRIM_PATTERN.match(value[1:-1]) + if match: + return self._deserialize_heredoc(value[1:-1], True) + + if not self.options.heredocs_to_strings and value.startswith('"<<'): + match = HEREDOC_PATTERN.match(value[1:-1]) + if match: + return self._deserialize_heredoc(value[1:-1], False) + + if self.options.strings_to_heredocs: + inner = value[1:-1] + if "\\n" in inner: + 
return self._deserialize_string_as_heredoc(inner) + + return self._deserialize_string(value) + + if self._is_expression(value): + return self._deserialize_expression(value) + + return self._deserialize_identifier(value) + + return self._deserialize_identifier(str(value)) + + def _deserialize_identifier(self, value: str) -> IdentifierRule: + return IdentifierRule([NAME(value)]) + + def _deserialize_string(self, value: str) -> StringRule: + # If the string contains template directives, delegate to parser + inner = value[1:-1] if value.startswith('"') and value.endswith('"') else value + # Check for unescaped %{ (i.e. %{ not preceded by another %) + stripped = inner.replace("%%{", "") + if "%{" in stripped: + return self._deserialize_string_via_parser(value) + + result = [] + # split string into individual parts based on lark grammar + # e.g. 'aaa$${bbb}ccc${"ddd-${eee}"}' -> ['aaa', '$${bbb}', 'ccc', '${"ddd-${eee}"}'] + # 'aa-${"bb-${"cc-${"dd-${5 + 5}"}"}"}' -> ['aa-', '${"bb-${"cc-${"dd-${5 + 5}"}"}"}'] + pattern = regex.compile(r"(\${1,2}\{(?:[^{}]|(?R))*\})") + parts = [part for part in pattern.split(value) if part != ""] + + for part in parts: + if part == '"': + continue + + if part.startswith('"'): + part = part[1:] + if part.endswith('"'): + part = part[:-1] + + string_part = self._deserialize_string_part(part) + result.append(string_part) + + return StringRule([DBLQUOTE(), *result, DBLQUOTE()]) + + def _deserialize_string_via_parser(self, value: str) -> StringRule: + """Deserialize a string containing template directives by parsing it.""" + # Ensure the value is quoted + if not (value.startswith('"') and value.endswith('"')): + value = f'"{value}"' + snippet = f"temp = {value}" + parsed_tree = _get_parser().parse(snippet) + rules_tree = self._transformer.transform(parsed_tree) + # Extract the string from: start -> body -> attribute -> expression -> string + expr = rules_tree.body.children[0].expression + # The expression is an ExprTermRule wrapping a 
StringRule + for child in expr.children: + if isinstance(child, StringRule): + return child + # Fallback: shouldn't happen, but return as-is + return expr # type: ignore[return-value] + + def _deserialize_string_part(self, value: str) -> StringPartRule: + if value.startswith("$${") and value.endswith("}"): + return StringPartRule([ESCAPED_INTERPOLATION(value)]) + + if value.startswith("${") and value.endswith("}"): + return StringPartRule( + [ + InterpolationRule( + [INTERP_START(), self._deserialize_expression(value), RBRACE()] + ) + ] + ) + + return StringPartRule([STRING_CHARS(value)]) + + def _deserialize_heredoc( + self, value: str, trim: bool + ) -> Union[HeredocTemplateRule, HeredocTrimTemplateRule]: + if trim: + return HeredocTrimTemplateRule([HEREDOC_TRIM_TEMPLATE(value)]) + return HeredocTemplateRule([HEREDOC_TEMPLATE(value)]) + + def _deserialize_string_as_heredoc(self, inner: str) -> HeredocTemplateRule: + """Convert a quoted string with escaped newlines back into a heredoc.""" + # Single-pass unescape: \\n → \n, \\" → ", \\\\ → \ + content = re.sub( + r'\\(n|"|\\)', + lambda m: "\n" if m.group(1) == "n" else m.group(1), + inner, + ) + heredoc = f"< ExprTermRule: + """Deserialize an expression string into an ExprTermRule.""" + # instead of processing expression manually and trying to recognize what kind of expression it is, + # turn it into HCL2 code and parse it with lark: + + # unwrap from ${ and } + value = value[2:-1] + # create HCL2 snippet + value = f"temp = {value}" + # parse the above + parsed_tree = _get_parser().parse(value) + # transform parsed tree into LarkElement tree + rules_tree = self._transformer.transform(parsed_tree) + # extract expression from the tree + result = rules_tree.body.children[0].expression + + return result + + def _deserialize_block(self, first_label: str, value: dict) -> BlockRule: + """Deserialize a block by extracting labels and body""" + labels = [first_label] + body = value + + # Keep peeling off single-key layers 
until we hit the body (dict with IS_BLOCK) + while isinstance(body, dict) and not body.get(IS_BLOCK): + non_block_keys = [k for k in body.keys() if not self._is_reserved_key(k)] + if len(non_block_keys) == 1: + # This is another label level + label = non_block_keys[0] + labels.append(label) + body = body[label] + else: + # Multiple keys = this is the body + break + + return BlockRule( + [ + *[self._deserialize(label) for label in labels], + LBRACE(), + self._deserialize(body), + RBRACE(), + ] + ) + + def _deserialize_attribute(self, name: str, value: Any) -> AttributeRule: + expr_term = self._deserialize(value) + + if not isinstance(expr_term, ExprTermRule): + expr_term = ExprTermRule([expr_term]) + + children = [ + self._deserialize_identifier(name), + EQ(), + expr_term, + ] + return AttributeRule(children) + + def _deserialize_list(self, value: List) -> TupleRule: + children: List[Any] = [] + for element in value: + deserialized = self._deserialize(element) + if not isinstance(deserialized, ExprTermRule): + # whatever an element of the list is, it has to be nested inside ExprTermRule + deserialized = ExprTermRule([deserialized]) + children.append(deserialized) + children.append(COMMA()) + + return TupleRule([LSQB(), *children, RSQB()]) + + def _deserialize_object(self, value: dict) -> ObjectRule: + children: List[Any] = [] + for key, val in value.items(): + children.append(self._deserialize_object_elem(key, val)) + + if self.options.object_elements_trailing_comma: + children.append(COMMA()) + + return ObjectRule([LBRACE(), *children, RBRACE()]) + + def _deserialize_object_elem(self, key: Any, value: Any) -> ObjectElemRule: + key_rule: Union[ObjectElemKeyExpressionRule, ObjectElemKeyRule] + + if self._is_expression(key): + expr = self._deserialize_expression(key) + key_rule = ObjectElemKeyExpressionRule([expr]) + else: + key = self._deserialize_text(key) + key_rule = ObjectElemKeyRule([key]) + + result = [ + key_rule, + COLON() if 
self.options.object_elements_colon else EQ(), + ExprTermRule([self._deserialize(value)]), + ] + + return ObjectElemRule(result) + + def _is_reserved_key(self, key: str) -> bool: + """Check if a key is a reserved metadata key that should be skipped during deserialization.""" + return key in (IS_BLOCK, COMMENTS_KEY, INLINE_COMMENTS_KEY) + + def _is_expression(self, value: Any) -> bool: + return isinstance(value, str) and value.startswith("${") and value.endswith("}") + + def _is_block(self, value: Any) -> bool: + """Simple check: if it's a list containing dicts with IS_BLOCK markers""" + if not isinstance(value, list) or len(value) == 0: + return False + + # Check if any item in the list has IS_BLOCK marker (directly or nested) + for item in value: + if isinstance(item, dict) and self._contains_block_marker(item): + return True + + return False + + def _contains_block_marker(self, obj: dict) -> bool: + """Recursively check if a dict contains IS_BLOCK marker anywhere""" + if obj.get(IS_BLOCK): + return True + for value in obj.values(): + if isinstance(value, dict) and self._contains_block_marker(value): + return True + if isinstance(value, list): + for element in value: + if isinstance(element, dict) and self._contains_block_marker( + element + ): + return True + return False diff --git a/hcl2/formatter.py b/hcl2/formatter.py new file mode 100644 index 00000000..f45a37ff --- /dev/null +++ b/hcl2/formatter.py @@ -0,0 +1,332 @@ +"""Format LarkElement trees with indentation, alignment, and spacing.""" +from abc import ABC, abstractmethod +from dataclasses import dataclass +from typing import List, Optional + +from hcl2.rules.abstract import LarkElement, LarkRule +from hcl2.rules.base import ( + StartRule, + BlockRule, + AttributeRule, + BodyRule, +) +from hcl2.rules.containers import ( + ObjectRule, + ObjectElemRule, + ObjectElemKeyExpressionRule, + TupleRule, +) +from hcl2.rules.expressions import ExprTermRule +from hcl2.rules.functions import FunctionCallRule +from 
hcl2.rules.for_expressions import ( + ForTupleExprRule, + ForObjectExprRule, + ForIntroRule, + ForCondRule, +) +from hcl2.rules.tokens import NL_OR_COMMENT, LBRACE, COLON, LSQB, COMMA +from hcl2.rules.whitespace import NewLineOrCommentRule + + +@dataclass +class FormatterOptions: + """Options controlling whitespace formatting of LarkElement trees.""" + + # Number of spaces per indentation level. + indent_length: int = 2 + # Use multi-line format for empty blocks (opening brace on same line, + # closing brace on next). When False, empty blocks collapse to "{}". + open_empty_blocks: bool = True + # Use multi-line format for empty objects. When False, empty objects + # collapse to "{}". + open_empty_objects: bool = False + # Use multi-line format for empty tuples. When False, empty tuples + # collapse to "[]". + open_empty_tuples: bool = False + # Pad attribute equals signs so they align vertically within a block body. + vertically_align_attributes: bool = True + # Pad object element equals/colons so they align vertically within an object. 
+ vertically_align_object_elements: bool = True + + +class LarkElementTreeFormatter(ABC): + """Abstract base for formatters that operate on LarkElement trees.""" + + def __init__(self, options: Optional[FormatterOptions] = None): + self.options = options or FormatterOptions() + + @abstractmethod + def format_tree(self, tree: LarkElement): + """Apply formatting to the given LarkElement tree in place.""" + raise NotImplementedError() + + +class BaseFormatter(LarkElementTreeFormatter): + """Default formatter: adds indentation, newlines, and vertical alignment.""" + + def __init__(self, options: Optional[FormatterOptions] = None): + super().__init__(options) + self._last_new_line: Optional[NewLineOrCommentRule] = None + + def format_tree(self, tree: LarkElement): + """Apply formatting to the given LarkElement tree in place.""" + if isinstance(tree, StartRule): + self.format_start_rule(tree) + + def format_start_rule(self, rule: StartRule): + """Format the top-level start rule.""" + self.format_body_rule(rule.body, 0) + + def format_block_rule(self, rule: BlockRule, indent_level: int = 0): + """Format a block rule with its body and closing brace.""" + if self.options.vertically_align_attributes: + self._vertically_align_attributes_in_body(rule.body) + + self.format_body_rule(rule.body, indent_level) + if len(rule.body.children) > 0: + rule.children.insert(-1, self._build_newline(indent_level - 1)) + elif self.options.open_empty_blocks: + rule.children.insert(-1, self._build_newline(indent_level - 1, 2)) + + def format_body_rule(self, rule: BodyRule, indent_level: int = 0): + """Format a body rule, adding newlines between attributes and blocks.""" + in_start = isinstance(rule.parent, StartRule) + + new_children = [] + if not in_start: + new_children.append(self._build_newline(indent_level)) + + for i, child in enumerate(rule.children): + new_children.append(child) + + if isinstance(child, AttributeRule): + self.format_attribute_rule(child, indent_level) + 
new_children.append(self._build_newline(indent_level)) + + if isinstance(child, BlockRule): + self.format_block_rule(child, indent_level + 1) + + if i > 0: + new_children.insert(-2, self._build_newline(indent_level)) + new_children.append(self._build_newline(indent_level, 2)) + + if new_children: + new_children.pop(-1) + self._set_children(rule, new_children) + + def format_attribute_rule(self, rule: AttributeRule, indent_level: int = 0): + """Format an attribute rule by formatting its value expression.""" + self.format_expression(rule.expression, indent_level + 1) + + def format_tuple_rule(self, rule: TupleRule, indent_level: int = 0): + """Format a tuple rule with one element per line.""" + if len(rule.elements) == 0: + if self.options.open_empty_tuples: + rule.children.insert(1, self._build_newline(indent_level - 1, 2)) + return + + new_children = [] + for child in rule.children: + new_children.append(child) + if isinstance(child, ExprTermRule): + self.format_expression(child, indent_level + 1) + + if isinstance(child, (COMMA, LSQB)): # type: ignore[misc] + new_children.append(self._build_newline(indent_level)) + + # If no trailing comma, add newline before closing bracket + if not isinstance(new_children[-2], NewLineOrCommentRule): + new_children.insert(-1, self._build_newline(indent_level)) + + self._deindent_last_line() + self._set_children(rule, new_children) + + def format_object_rule(self, rule: ObjectRule, indent_level: int = 0): + """Format an object rule with one element per line and optional alignment.""" + if len(rule.elements) == 0: + if self.options.open_empty_objects: + rule.children.insert(1, self._build_newline(indent_level - 1, 2)) + return + + new_children = [] + for i, child in enumerate(rule.children): + next_child = rule.children[i + 1] if i + 1 < len(rule.children) else None + new_children.append(child) + + if isinstance(child, LBRACE): # type: ignore[misc] + new_children.append(self._build_newline(indent_level)) + + if ( + next_child + and 
isinstance(next_child, ObjectElemRule) + and isinstance(child, (ObjectElemRule, COMMA)) # type: ignore[misc] + ): + new_children.append(self._build_newline(indent_level)) + + if isinstance(child, ObjectElemRule): + self.format_expression(child.expression, indent_level + 1) + + new_children.insert(-1, self._build_newline(indent_level)) + self._deindent_last_line() + + self._set_children(rule, new_children) + + if self.options.vertically_align_object_elements: + self._vertically_align_object_elems(rule) + + def format_expression(self, rule: ExprTermRule, indent_level: int = 0): + """Dispatch formatting for the inner expression of an ExprTermRule.""" + if isinstance(rule.expression, ObjectRule): + self.format_object_rule(rule.expression, indent_level) + + elif isinstance(rule.expression, TupleRule): + self.format_tuple_rule(rule.expression, indent_level) + + elif isinstance(rule.expression, ForTupleExprRule): + self.format_fortupleexpr(rule.expression, indent_level) + + elif isinstance(rule.expression, ForObjectExprRule): + self.format_forobjectexpr(rule.expression, indent_level) + + elif isinstance(rule.expression, FunctionCallRule): + self.format_function_call(rule.expression, indent_level) + + elif isinstance(rule.expression, ExprTermRule): + self.format_expression(rule.expression, indent_level) + + def format_function_call(self, rule: FunctionCallRule, indent_level: int = 0): + """Format a function call by recursively formatting its arguments.""" + if rule.arguments is not None: + for arg in rule.arguments.arguments: + if isinstance(arg, ExprTermRule): + self.format_expression(arg, indent_level) + + def format_fortupleexpr(self, expression: ForTupleExprRule, indent_level: int = 0): + """Format a for-tuple expression with newlines around clauses.""" + for child in expression.children: + if isinstance(child, ExprTermRule): + self.format_expression(child, indent_level + 1) + elif isinstance(child, (ForIntroRule, ForCondRule)): + for sub_child in child.children: + if 
isinstance(sub_child, ExprTermRule): + self.format_expression(sub_child, indent_level + 1) + + for index in [1, 3]: + expression.children[index] = self._build_newline(indent_level) + + if expression.condition is not None: + expression.children[5] = self._build_newline(indent_level) + else: + expression.children[5] = None + + expression.children[7] = self._build_newline(indent_level) + self._deindent_last_line() + + def format_forobjectexpr( + self, expression: ForObjectExprRule, indent_level: int = 0 + ): + """Format a for-object expression with newlines around clauses.""" + for child in expression.children: + if isinstance(child, ExprTermRule): + self.format_expression(child, indent_level + 1) + elif isinstance(child, (ForIntroRule, ForCondRule)): + for sub_child in child.children: + if isinstance(sub_child, ExprTermRule): + self.format_expression(sub_child, indent_level + 1) + + for index in [1, 3]: + expression.children[index] = self._build_newline(indent_level) + + for index in [6, 8]: + child = expression.children[index] + if not isinstance(child, NewLineOrCommentRule) or child.to_list() is None: + expression.children[index] = None + + if expression.condition is not None: + expression.children[10] = self._build_newline(indent_level) + else: + expression.children[10] = None + + expression.children[12] = self._build_newline(indent_level) + self._deindent_last_line() + + @staticmethod + def _set_children(rule: LarkRule, new_children): + """Replace a rule's children and re-establish parent/index links.""" + rule._children = new_children + for i, child in enumerate(new_children): + if child is not None: + child.set_index(i) + child.set_parent(rule) + + def _vertically_align_attributes_in_body(self, body: BodyRule): + attributes_sequence: List[AttributeRule] = [] + + for child in body.children: + if isinstance(child, AttributeRule): + attributes_sequence.append(child) + + elif attributes_sequence: + self._align_attributes_sequence(attributes_sequence) + 
attributes_sequence = [] + + if attributes_sequence: + self._align_attributes_sequence(attributes_sequence) + + def _align_attributes_sequence(self, attributes_sequence: List[AttributeRule]): + max_length = max( + len(attribute.identifier.token.value) for attribute in attributes_sequence + ) + for attribute in attributes_sequence: + name_length = len(attribute.identifier.token.value) + spaces_to_add = max_length - name_length + base = attribute.children[1].value.lstrip(" \t") + attribute.children[1].set_value(" " * (spaces_to_add + 1) + base) + + def _vertically_align_object_elems(self, rule: ObjectRule): + max_length = max(self._key_text_width(elem.key) for elem in rule.elements) + for elem in rule.elements: + key_length = self._key_text_width(elem.key) + + spaces_to_add = max_length - key_length + + separator = elem.children[1] + if isinstance(separator, COLON): # type: ignore[misc] + spaces_to_add += 1 + + base = separator.value.lstrip(" \t") + # EQ gets +1 for the base space (reconstructor no longer adds it + # when the token already has leading whitespace); COLON already + # has its own +1 above and the reconstructor doesn't add space. + extra = 1 if not isinstance(separator, COLON) else 0 # type: ignore[misc] + elem.children[1].set_value(" " * (spaces_to_add + extra) + base) + + @staticmethod + def _key_text_width(key: LarkElement) -> int: + """Compute the HCL text width of an object element key.""" + width = len(str(key.serialize())) + # Expression keys serialize with ${...} wrapping (+3 chars vs HCL text). 
+ if isinstance(key, ObjectElemKeyExpressionRule): + width -= 3 + return width + + def _build_newline( + self, next_line_indent: int = 0, count: int = 1 + ) -> NewLineOrCommentRule: + result = NewLineOrCommentRule( + [ + NL_OR_COMMENT( + ("\n" * count) + " " * self.options.indent_length * next_line_indent + ) + ] + ) + self._last_new_line = result + return result + + def _deindent_last_line(self, times: int = 1): + if self._last_new_line is None: + return + token = self._last_new_line.token + for _ in range(times): + if token.value.endswith(" " * self.options.indent_length): + token.set_value(token.value[: -self.options.indent_length]) diff --git a/hcl2/hcl2.lark b/hcl2/hcl2.lark index 78ba3ca6..a9ae6128 100644 --- a/hcl2/hcl2.lark +++ b/hcl2/hcl2.lark @@ -1,27 +1,33 @@ -start : body -body : (new_line_or_comment? (attribute | block))* new_line_or_comment? -attribute : identifier EQ expression -block : identifier (identifier | string)* new_line_or_comment? "{" body "}" -new_line_or_comment: ( NL_OR_COMMENT )+ +// ============================================================================ +// Terminals +// ============================================================================ + +// Whitespace and Comments NL_OR_COMMENT: /\n[ \t]*/ | /#.*\n/ | /\/\/.*\n/ | /\/\*(.|\n)*?(\*\/)/ -identifier : NAME | IN | FOR | IF | FOR_EACH -NAME : /[a-zA-Z_][a-zA-Z0-9_-]*/ +// Keywords IF : "if" IN : "in" FOR : "for" FOR_EACH : "for_each" +ELSE : "else" +ENDIF : "endif" +ENDFOR : "endfor" -?expression : expr_term | operation | conditional -conditional : expression "?" new_line_or_comment? expression new_line_or_comment? ":" new_line_or_comment? expression +// Literals +NAME : /[a-zA-Z_][a-zA-Z0-9_-]*/ +ESCAPED_INTERPOLATION.2: /\$\$\{[^}]*\}/ +ESCAPED_DIRECTIVE.2: /%%\{[^}]*\}/ +STRING_CHARS.1: /(?:(?!\$\$\{)(?!\$\{)(?!%%\{)(?!%\{)[^"\\]|\\.|(?:\$(?!\$?\{))|(?:%(?!%?\{)))+/ +DECIMAL : "0".."9" +NEGATIVE_DECIMAL : "-" DECIMAL +EXP_MARK : ("e" | "E") ("+" | "-")? 
DECIMAL+ +INT_LITERAL: NEGATIVE_DECIMAL? DECIMAL+ +FLOAT_LITERAL: (NEGATIVE_DECIMAL? DECIMAL+ | NEGATIVE_DECIMAL+) "." DECIMAL+ (EXP_MARK)? + | (NEGATIVE_DECIMAL? DECIMAL+ | NEGATIVE_DECIMAL+) (EXP_MARK) -?operation : unary_op | binary_op -!unary_op : ("-" | "!") expr_term -binary_op : expression binary_term new_line_or_comment? -!binary_operator : BINARY_OP -binary_term : binary_operator new_line_or_comment? expression -BINARY_OP : DOUBLE_EQ | NEQ | LT | GT | LEQ | GEQ | MINUS | ASTERISK | SLASH | PERCENT | DOUBLE_AMP | DOUBLE_PIPE | PLUS +// Operators DOUBLE_EQ : "==" NEQ : "!=" LT : "<" @@ -35,74 +41,194 @@ PERCENT : "%" DOUBLE_AMP : "&&" DOUBLE_PIPE : "||" PLUS : "+" +NOT : "!" +QMARK : "?" + +// Punctuation LPAR : "(" RPAR : ")" +LBRACE : "{" +RBRACE : "}" +LSQB : "[" +RSQB : "]" COMMA : "," DOT : "." -COLON : ":" +EQ : /[ \t]*=(?!=|>)/ +COLON : /[ \t]*:(?!:)/ +DBLQUOTE : "\"" +// Escaped-quote string inside template directives: matches \"content\" +// (Lark escape-processes the regex, so the compiled pattern is \\?"(...)\\?" — +// i.e. one literal backslash + quote as delimiters.) +TEMPLATE_STRING.3 : /\\\\"(?:[^"\\\\]|\\\\.)*\\\\"/ + +// Interpolation +INTERP_START : "${" + +// Template Directives +DIRECTIVE_START : "%{" +STRIP_MARKER : "~" + +// Splat Operators +ATTR_SPLAT : ".*" +FULL_SPLAT_START : "[*]" +// Special Operators +FOR_OBJECT_ARROW : "=>" +ELLIPSIS : "..." +COLONS: "::" + +// Heredocs +HEREDOC_TEMPLATE : /<<(?P<heredoc>[a-zA-Z][a-zA-Z0-9._-]+)\n(?:.|\n)*?\n\s*(?P=heredoc)\n/ +HEREDOC_TEMPLATE_TRIM : /<<-(?P<heredoc_trim>[a-zA-Z][a-zA-Z0-9._-]+)\n(?:.|\n)*?\n\s*(?P=heredoc_trim)\n/ + +// Ignore whitespace (but not newlines, as they're significant in HCL) +%ignore /[ \t]+/ + +// ============================================================================ +// Rules +// ============================================================================ + +// Top-level structure +start : body + +// Body and basic constructs +body : (new_line_or_comment? 
(attribute | block))* new_line_or_comment? +attribute : _attribute_name EQ expression +_attribute_name : identifier | keyword +block : identifier (identifier | string)* new_line_or_comment? LBRACE body RBRACE + +// Whitespace and comments +new_line_or_comment: ( NL_OR_COMMENT )+ + +// Basic literals and identifiers +identifier : NAME +keyword: IN | FOR | IF | FOR_EACH | ELSE | ENDIF | ENDFOR +int_lit: INT_LITERAL +float_lit: FLOAT_LITERAL +string: DBLQUOTE string_part* DBLQUOTE +string_part: STRING_CHARS + | ESCAPED_INTERPOLATION + | ESCAPED_DIRECTIVE + | interpolation + | template_if_start + | template_else + | template_endif + | template_for_start + | template_endfor + +// Expressions +?expression : or_expr QMARK new_line_or_comment? expression new_line_or_comment? COLON new_line_or_comment? expression -> conditional + | or_expr +interpolation: INTERP_START expression RBRACE + +// Template directives (flat rules — transformer assembles if/for structure) +template_if_start: DIRECTIVE_START STRIP_MARKER? IF expression STRIP_MARKER? RBRACE +template_else: DIRECTIVE_START STRIP_MARKER? ELSE STRIP_MARKER? RBRACE +template_endif: DIRECTIVE_START STRIP_MARKER? ENDIF STRIP_MARKER? RBRACE +template_for_start: DIRECTIVE_START STRIP_MARKER? FOR identifier (COMMA identifier)? IN expression STRIP_MARKER? RBRACE +template_endfor: DIRECTIVE_START STRIP_MARKER? ENDFOR STRIP_MARKER? RBRACE + +// Operator precedence ladder (lowest to highest) +// Each level uses left recursion for left-associativity. +// Rule aliases (-> binary_op, -> binary_term, -> binary_operator) maintain +// transformer compatibility with BinaryOpRule / BinaryTermRule / BinaryOperatorRule. + +// Logical OR +?or_expr : or_expr or_binary_term new_line_or_comment? -> binary_op + | and_expr +or_binary_term : or_binary_operator new_line_or_comment? and_expr -> binary_term +!or_binary_operator : DOUBLE_PIPE -> binary_operator + +// Logical AND +?and_expr : and_expr and_binary_term new_line_or_comment? 
-> binary_op + | eq_expr +and_binary_term : and_binary_operator new_line_or_comment? eq_expr -> binary_term +!and_binary_operator : DOUBLE_AMP -> binary_operator + +// Equality +?eq_expr : eq_expr eq_binary_term new_line_or_comment? -> binary_op + | rel_expr +eq_binary_term : eq_binary_operator new_line_or_comment? rel_expr -> binary_term +!eq_binary_operator : DOUBLE_EQ -> binary_operator + | NEQ -> binary_operator + +// Relational +?rel_expr : rel_expr rel_binary_term new_line_or_comment? -> binary_op + | add_expr +rel_binary_term : rel_binary_operator new_line_or_comment? add_expr -> binary_term +!rel_binary_operator : LT -> binary_operator + | GT -> binary_operator + | LEQ -> binary_operator + | GEQ -> binary_operator + +// Additive +?add_expr : add_expr add_binary_term new_line_or_comment? -> binary_op + | mul_expr +add_binary_term : add_binary_operator new_line_or_comment? mul_expr -> binary_term +!add_binary_operator : PLUS -> binary_operator + | MINUS -> binary_operator + +// Multiplicative +?mul_expr : mul_expr mul_binary_term new_line_or_comment? -> binary_op + | unary_expr +mul_binary_term : mul_binary_operator new_line_or_comment? unary_expr -> binary_term +!mul_binary_operator : ASTERISK -> binary_operator + | SLASH -> binary_operator + | PERCENT -> binary_operator + +// Unary (highest precedence for operations) +?unary_expr : unary_op | expr_term +!unary_op : (MINUS | NOT) expr_term + +// Expression terms expr_term : LPAR new_line_or_comment? expression new_line_or_comment? 
RPAR | float_lit | int_lit | string + | template_string | tuple | object - | function_call - | index_expr_term - | get_attr_expr_term | identifier - | provider_function_call + | function_call | heredoc_template | heredoc_template_trim + | index_expr_term + | get_attr_expr_term | attr_splat_expr_term | full_splat_expr_term | for_tuple_expr | for_object_expr -string: "\"" string_part* "\"" -string_part: STRING_CHARS - | ESCAPED_INTERPOLATION - | interpolation -interpolation: "${" expression "}" -ESCAPED_INTERPOLATION.2: /\$\$\{[^}]*\}/ -STRING_CHARS.1: /(?:(?!\$\$\{)(?!\$\{)[^"\\]|\\.|(?:\$(?!\$?\{)))+/ - -int_lit : NEGATIVE_DECIMAL? DECIMAL+ | NEGATIVE_DECIMAL+ -!float_lit: (NEGATIVE_DECIMAL? DECIMAL+ | NEGATIVE_DECIMAL+) "." DECIMAL+ (EXP_MARK)? - | (NEGATIVE_DECIMAL? DECIMAL+ | NEGATIVE_DECIMAL+) ("." DECIMAL+)? (EXP_MARK) -NEGATIVE_DECIMAL : "-" DECIMAL -DECIMAL : "0".."9" -EXP_MARK : ("e" | "E") ("+" | "-")? DECIMAL+ -EQ : /[ \t]*=(?!=|>)/ +template_string : TEMPLATE_STRING -tuple : "[" (new_line_or_comment* expression new_line_or_comment* ",")* (new_line_or_comment* expression)? new_line_or_comment* "]" -object : "{" new_line_or_comment? (new_line_or_comment* (object_elem | (object_elem COMMA)) new_line_or_comment*)* "}" +// Collections +tuple : LSQB new_line_or_comment? (expression new_line_or_comment? COMMA new_line_or_comment?)* (expression new_line_or_comment? COMMA? new_line_or_comment?)? RSQB +object : LBRACE new_line_or_comment? ((object_elem | (object_elem new_line_or_comment? 
COMMA)) new_line_or_comment?)* RBRACE object_elem : object_elem_key ( EQ | COLON ) expression -object_elem_key : float_lit | int_lit | identifier | string | object_elem_key_dot_accessor | object_elem_key_expression -object_elem_key_expression : LPAR expression RPAR -object_elem_key_dot_accessor : identifier (DOT identifier)+ +object_elem_key : expression -heredoc_template : /<<(?P<heredoc>[a-zA-Z][a-zA-Z0-9._-]+)\n?(?:.|\n)*?\n\s*(?P=heredoc)\n/ -heredoc_template_trim : /<<-(?P<heredoc_trim>[a-zA-Z][a-zA-Z0-9._-]+)\n?(?:.|\n)*?\n\s*(?P=heredoc_trim)\n/ +// Heredocs +heredoc_template : HEREDOC_TEMPLATE +heredoc_template_trim : HEREDOC_TEMPLATE_TRIM -function_call : identifier "(" new_line_or_comment? arguments? new_line_or_comment? ")" -arguments : (expression (new_line_or_comment* "," new_line_or_comment* expression)* ("," | "...")? new_line_or_comment*) -colons: "::" -provider_function_call: identifier colons identifier colons identifier "(" new_line_or_comment? arguments? new_line_or_comment? ")" +// Functions +function_call : identifier (COLONS identifier COLONS identifier)? LPAR new_line_or_comment? arguments? new_line_or_comment? RPAR +arguments : (expression (new_line_or_comment? COMMA new_line_or_comment? expression)* (COMMA | ELLIPSIS)? new_line_or_comment?) +// Indexing and attribute access index_expr_term : expr_term index get_attr_expr_term : expr_term get_attr attr_splat_expr_term : expr_term attr_splat full_splat_expr_term : expr_term full_splat -index : "[" new_line_or_comment? expression new_line_or_comment? "]" | "." DECIMAL+ -get_attr : "." identifier -attr_splat : ".*" get_attr* -full_splat : "[*]" (get_attr | index)* +?index : braces_index | short_index +braces_index : LSQB new_line_or_comment? expression new_line_or_comment? RSQB +short_index : DOT INT_LITERAL +get_attr : DOT identifier +attr_splat : ATTR_SPLAT (get_attr | index)* +full_splat : FULL_SPLAT_START (get_attr | index)* -FOR_OBJECT_ARROW : "=>" -!for_tuple_expr : "[" new_line_or_comment? 
for_intro new_line_or_comment? expression new_line_or_comment? for_cond? new_line_or_comment? "]" -!for_object_expr : "{" new_line_or_comment? for_intro new_line_or_comment? expression FOR_OBJECT_ARROW new_line_or_comment? expression new_line_or_comment? "..."? new_line_or_comment? for_cond? new_line_or_comment? "}" -!for_intro : "for" new_line_or_comment? identifier ("," identifier new_line_or_comment?)? new_line_or_comment? "in" new_line_or_comment? expression new_line_or_comment? ":" new_line_or_comment? -!for_cond : "if" new_line_or_comment? expression - -%ignore /[ \t]+/ +// For expressions +!for_tuple_expr : LSQB new_line_or_comment? for_intro new_line_or_comment? expression new_line_or_comment? for_cond? new_line_or_comment? RSQB +!for_object_expr : LBRACE new_line_or_comment? for_intro new_line_or_comment? expression FOR_OBJECT_ARROW new_line_or_comment? expression new_line_or_comment? ELLIPSIS? new_line_or_comment? for_cond? new_line_or_comment? RBRACE +!for_intro : FOR new_line_or_comment? identifier (COMMA identifier new_line_or_comment?)? new_line_or_comment? IN new_line_or_comment? expression new_line_or_comment? COLON new_line_or_comment? +!for_cond : IF new_line_or_comment? expression diff --git a/hcl2/parser.py b/hcl2/parser.py index 79d50122..d275a589 100644 --- a/hcl2/parser.py +++ b/hcl2/parser.py @@ -4,6 +4,8 @@ from lark import Lark +from hcl2.postlexer import PostLexer + PARSER_FILE = Path(__file__).absolute().resolve().parent / ".lark_cache.bin" @@ -17,26 +19,5 @@ def parser() -> Lark: cache=str(PARSER_FILE), # Disable/Delete file to effect changes to the grammar rel_to=__file__, propagate_positions=True, - ) - - -@functools.lru_cache() -def reconstruction_parser() -> Lark: - """ - Build parser for transforming python structures into HCL2 text. - This is duplicated from `parser` because we need different options here for - the reconstructor. Please make sure changes are kept in sync between the two - if necessary. 
- """ - return Lark.open( - "hcl2.lark", - parser="lalr", - # Caching must be disabled to allow for reconstruction until lark-parser/lark#1472 is fixed: - # - # https://github.com/lark-parser/lark/issues/1472 - # - # cache=str(PARSER_FILE), # Disable/Delete file to effect changes to the grammar - rel_to=__file__, - propagate_positions=True, - maybe_placeholders=False, # Needed for reconstruction + postlex=PostLexer(), ) diff --git a/hcl2/postlexer.py b/hcl2/postlexer.py new file mode 100644 index 00000000..1d22cc03 --- /dev/null +++ b/hcl2/postlexer.py @@ -0,0 +1,79 @@ +"""Postlexer that transforms the token stream between the Lark lexer and parser. + +Each transformation is implemented as a private method that accepts and yields +tokens. The public ``process`` method chains them together, making it easy to +add new passes without touching existing logic. +""" + +from typing import FrozenSet, Iterator, Optional, Tuple + +from lark import Token + +# Type alias for a token stream consumed and produced by each pass. +TokenStream = Iterator[Token] + +# Operator token types that may legally follow a line-continuation newline. +# MINUS is excluded — it is also the unary negation operator, and merging a +# newline into MINUS would incorrectly consume statement-separating newlines +# before negative literals (e.g. "a = 1\nb = -2"). +OPERATOR_TYPES: FrozenSet[str] = frozenset( + { + "DOUBLE_EQ", + "NEQ", + "LT", + "GT", + "LEQ", + "GEQ", + "ASTERISK", + "SLASH", + "PERCENT", + "DOUBLE_AMP", + "DOUBLE_PIPE", + "PLUS", + "QMARK", + } +) + + +class PostLexer: + """Transform the token stream before it reaches the LALR parser.""" + + def process(self, stream: TokenStream) -> TokenStream: + """Chain all postlexer passes over the token stream.""" + yield from self._merge_newlines_into_operators(stream) + + def _merge_newlines_into_operators(self, stream: TokenStream) -> TokenStream: + """Merge NL_OR_COMMENT tokens into immediately following operator tokens. 
+ + LALR parsers cannot distinguish a statement-ending newline from a + line-continuation newline before a binary operator. This pass resolves + the ambiguity by merging NL_OR_COMMENT into the operator token's value + when the next token is a binary operator or QMARK. The transformer + later extracts the newline prefix and creates a NewLineOrCommentRule + node, preserving round-trip fidelity. + """ + pending_nl: Optional[Token] = None + for token in stream: + if token.type == "NL_OR_COMMENT": + if pending_nl is not None: + yield pending_nl + pending_nl = token + else: + if pending_nl is not None: + if token.type in OPERATOR_TYPES: + token = token.update(value=str(pending_nl) + str(token)) + else: + yield pending_nl + pending_nl = None + yield token + if pending_nl is not None: + yield pending_nl + + @property + def always_accept(self) -> Tuple[()]: + """Terminal names the parser must accept even when not expected by LALR. + + Lark requires this property on postlexer objects. An empty tuple + means no extra terminals are injected. 
+ """ + return () diff --git a/hcl2/query/__init__.py b/hcl2/query/__init__.py new file mode 100644 index 00000000..39ce9e5d --- /dev/null +++ b/hcl2/query/__init__.py @@ -0,0 +1,46 @@ +"""Query facades for navigating HCL2 LarkElement trees.""" + +from hcl2.query._base import NodeView, view_for, register_view +from hcl2.query.body import DocumentView, BodyView +from hcl2.query.blocks import BlockView +from hcl2.query.attributes import AttributeView +from hcl2.query.containers import TupleView, ObjectView +from hcl2.query.for_exprs import ForTupleView, ForObjectView +from hcl2.query.functions import FunctionCallView +from hcl2.query.expressions import ConditionalView +from hcl2.query.pipeline import ( + split_pipeline, + classify_stage, + execute_pipeline, + PathStage, + BuiltinStage, + SelectStage, +) +from hcl2.query.builtins import apply_builtin, BUILTIN_NAMES +from hcl2.query.predicate import parse_predicate, evaluate_predicate + +__all__ = [ + "NodeView", + "view_for", + "register_view", + "DocumentView", + "BodyView", + "BlockView", + "AttributeView", + "TupleView", + "ObjectView", + "ForTupleView", + "ForObjectView", + "FunctionCallView", + "ConditionalView", + "split_pipeline", + "classify_stage", + "execute_pipeline", + "PathStage", + "BuiltinStage", + "SelectStage", + "apply_builtin", + "BUILTIN_NAMES", + "parse_predicate", + "evaluate_predicate", +] diff --git a/hcl2/query/_base.py b/hcl2/query/_base.py new file mode 100644 index 00000000..27421410 --- /dev/null +++ b/hcl2/query/_base.py @@ -0,0 +1,120 @@ +"""Base view class and registry for query facades.""" + +from typing import ( + Any, + Callable, + Dict, + List, + Optional, + Type, + TypeVar, +) + +from hcl2.rules.abstract import LarkElement, LarkRule +from hcl2.utils import SerializationOptions +from hcl2 import walk as _walk_mod + +T = TypeVar("T", bound=LarkRule) + +_VIEW_REGISTRY: Dict[Type[LarkElement], Type["NodeView"]] = {} + + +def register_view(rule_type: Type[LarkElement]): + """Class 
decorator: register a view class for a given rule type.""" + + def decorator(cls): + _VIEW_REGISTRY[rule_type] = cls + return cls + + return decorator + + +def view_for(node: LarkElement) -> "NodeView": + """Factory: dispatch by type, walk MRO for base matches, fallback to NodeView.""" + node_type = type(node) + # Direct match + if node_type in _VIEW_REGISTRY: + return _VIEW_REGISTRY[node_type](node) + # Walk MRO + for base in node_type.__mro__: + if base in _VIEW_REGISTRY: + return _VIEW_REGISTRY[base](node) + return NodeView(node) + + +class NodeView: + """Base view wrapping a LarkElement node.""" + + def __init__(self, node: LarkElement): + self._node = node + + @property + def raw(self) -> LarkElement: + """Return the underlying IR node.""" + return self._node + + @property + def parent_view(self) -> Optional["NodeView"]: + """Return a view over the parent node, or None.""" + parent = getattr(self._node, "_parent", None) + if parent is None: + return None + return view_for(parent) + + def find_all(self, rule_type: Type[T]) -> List["NodeView"]: + """Find all descendants matching a rule class, returned as views.""" + return [view_for(n) for n in _walk_mod.find_all(self._node, rule_type)] + + def find_by_predicate(self, predicate: Callable[..., bool]) -> List["NodeView"]: + """Find descendants matching a predicate on their views.""" + results = [] + for element in _walk_mod.walk_semantic(self._node): + wrapped = view_for(element) + if predicate(wrapped): + results.append(wrapped) + return results + + def walk_semantic(self) -> List["NodeView"]: + """Return all semantic descendant nodes as views.""" + return [view_for(n) for n in _walk_mod.walk_semantic(self._node)] + + def walk_rules(self) -> List["NodeView"]: + """Return all rule descendant nodes as views.""" + return [view_for(n) for n in _walk_mod.walk_rules(self._node)] + + def to_hcl(self) -> str: + """Reconstruct this subtree as HCL text.""" + from hcl2.reconstructor import HCLReconstructor + + reconstructor 
= HCLReconstructor() + return reconstructor.reconstruct_fragment(self._node) + + def to_dict(self, options: Optional[SerializationOptions] = None) -> Any: + """Serialize this node to a Python value.""" + if options is not None: + return self._node.serialize(options=options) + return self._node.serialize() + + def __repr__(self) -> str: + return f"<{self.__class__.__name__} wrapping {self._node!r}>" + + +VIEW_TYPE_NAMES = { + "DocumentView": "document", + "BodyView": "body", + "BlockView": "block", + "AttributeView": "attribute", + "TupleView": "tuple", + "ObjectView": "object", + "ForTupleView": "for_tuple", + "ForObjectView": "for_object", + "FunctionCallView": "function_call", + "ConditionalView": "conditional", + "NodeView": "node", +} + + +def view_type_name(node: "NodeView") -> str: + """Return a short type name string for a view node.""" + cls_name = type(node).__name__ + return VIEW_TYPE_NAMES.get(cls_name, cls_name.lower()) diff --git a/hcl2/query/attributes.py b/hcl2/query/attributes.py new file mode 100644 index 00000000..567bb037 --- /dev/null +++ b/hcl2/query/attributes.py @@ -0,0 +1,51 @@ +"""AttributeView facade.""" + +from typing import Any, List, Optional + +from hcl2.query._base import NodeView, register_view, view_for +from hcl2.rules.abstract import LarkElement +from hcl2.rules.base import AttributeRule +from hcl2.utils import SerializationOptions + + +@register_view(AttributeRule) +class AttributeView(NodeView): + """View over an HCL2 attribute (AttributeRule).""" + + def __init__( + self, + node: LarkElement, + adjacent_comments: Optional[List[dict]] = None, + ): + super().__init__(node) + self._adjacent_comments = adjacent_comments + + @property + def name(self) -> str: + """Return the attribute name as a plain string.""" + node: AttributeRule = self._node # type: ignore[assignment] + return node.identifier.serialize() + + @property + def value(self) -> Any: + """Return the serialized Python value of the attribute expression.""" + node: 
AttributeRule = self._node # type: ignore[assignment] + return node.expression.serialize() + + @property + def value_node(self) -> "NodeView": + """Return a view over the expression node.""" + node: AttributeRule = self._node # type: ignore[assignment] + return view_for(node.expression) + + def to_dict(self, options: Optional[SerializationOptions] = None) -> Any: + """Serialize, merging adjacent comments from the parent body.""" + result = super().to_dict(options=options) + if ( + self._adjacent_comments + and options is not None + and options.with_comments + and isinstance(result, dict) + ): + result["__comments__"] = self._adjacent_comments + return result diff --git a/hcl2/query/blocks.py b/hcl2/query/blocks.py new file mode 100644 index 00000000..2a5fd6cf --- /dev/null +++ b/hcl2/query/blocks.py @@ -0,0 +1,100 @@ +"""BlockView facade.""" + +from typing import Any, List, Optional + +from hcl2.const import COMMENTS_KEY +from hcl2.query._base import NodeView, register_view +from hcl2.rules.abstract import LarkElement +from hcl2.rules.base import BlockRule +from hcl2.rules.literal_rules import IdentifierRule +from hcl2.rules.strings import StringRule +from hcl2.utils import SerializationOptions + + +def _label_to_str(label) -> str: + """Convert a block label (IdentifierRule or StringRule) to a plain string.""" + if isinstance(label, IdentifierRule): + return label.serialize() + if isinstance(label, StringRule): + raw = label.serialize() + # Strip surrounding quotes + if isinstance(raw, str) and len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"': + return raw[1:-1] + return str(raw) + return str(label.serialize()) + + +@register_view(BlockRule) +class BlockView(NodeView): + """View over an HCL2 block (BlockRule).""" + + def __init__( + self, + node: LarkElement, + adjacent_comments: Optional[List[dict]] = None, + ): + super().__init__(node) + self._adjacent_comments = adjacent_comments + + @property + def block_type(self) -> str: + """Return the block type (first 
label) as a plain string.""" + node: BlockRule = self._node # type: ignore[assignment] + return _label_to_str(node.labels[0]) + + @property + def labels(self) -> List[str]: + """Return all labels as plain strings.""" + node: BlockRule = self._node # type: ignore[assignment] + return [_label_to_str(lbl) for lbl in node.labels] + + @property + def name_labels(self) -> List[str]: + """Return labels after the block type (labels[1:]) as plain strings.""" + return self.labels[1:] + + @property + def body(self) -> "NodeView": + """Return the block body as a BodyView.""" + from hcl2.query.body import BodyView + + node: BlockRule = self._node # type: ignore[assignment] + return BodyView(node.body) + + def to_dict(self, options: Optional[SerializationOptions] = None) -> Any: + """Serialize, merging adjacent comments from the parent body.""" + result = super().to_dict(options=options) + if ( + self._adjacent_comments + and options is not None + and options.with_comments + and isinstance(result, dict) + ): + # Place adjacent comments at the outer level of the block dict, + # alongside the label keys — not drilled into the body dict. 
+ existing = result.get(COMMENTS_KEY, []) + result[COMMENTS_KEY] = self._adjacent_comments + existing + return result + + def blocks( + self, block_type: Optional[str] = None, *labels: str + ) -> List["NodeView"]: + """Delegate to body.""" + from hcl2.query.body import BodyView + + node: BlockRule = self._node # type: ignore[assignment] + return BodyView(node.body).blocks(block_type, *labels) + + def attributes(self, name: Optional[str] = None) -> List["NodeView"]: + """Delegate to body.""" + from hcl2.query.body import BodyView + + node: BlockRule = self._node # type: ignore[assignment] + return BodyView(node.body).attributes(name) + + def attribute(self, name: str) -> Optional["NodeView"]: + """Delegate to body.""" + from hcl2.query.body import BodyView + + node: BlockRule = self._node # type: ignore[assignment] + return BodyView(node.body).attribute(name) diff --git a/hcl2/query/body.py b/hcl2/query/body.py new file mode 100644 index 00000000..a3bce89e --- /dev/null +++ b/hcl2/query/body.py @@ -0,0 +1,124 @@ +"""DocumentView and BodyView facades.""" + +from typing import List, Optional + +from hcl2.query._base import NodeView, register_view +from hcl2.rules.base import AttributeRule, BlockRule, BodyRule, StartRule +from hcl2.rules.whitespace import NewLineOrCommentRule + + +def _collect_leading_comments(body: BodyRule, child_index: int) -> List[dict]: + """Collect comments from NewLineOrCommentRule siblings preceding *child_index*. + + Walks backward through ``body.children`` from ``child_index - 1``, + collecting comment dicts via ``to_list()``, stopping at the first + ``BlockRule`` or ``AttributeRule`` (the previous semantic sibling) or + the start of the children list. 
+ """ + chunks: List[List[dict]] = [] + for i in range(child_index - 1, -1, -1): + sibling = body.children[i] + if isinstance(sibling, (BlockRule, AttributeRule)): + break + if isinstance(sibling, NewLineOrCommentRule): + comments = sibling.to_list() + if comments: + chunks.append(comments) + # Reverse node order (walked backward) but keep each node's comments in order + chunks.reverse() + result: List[dict] = [] + for chunk in chunks: + result.extend(chunk) + return result + + +@register_view(StartRule) +class DocumentView(NodeView): + """View over the top-level HCL2 document (StartRule).""" + + @staticmethod + def parse(text: str) -> "DocumentView": + """Parse HCL2 text into a DocumentView.""" + from hcl2 import api + + tree = api.parses(text) + return DocumentView(tree) + + @staticmethod + def parse_file(path: str) -> "DocumentView": + """Parse an HCL2 file into a DocumentView.""" + from hcl2 import api + + with open(path, encoding="utf-8") as f: + tree = api.parse(f) + return DocumentView(tree) + + @property + def body(self) -> "BodyView": + """Return the document body as a BodyView.""" + node: StartRule = self._node # type: ignore[assignment] + return BodyView(node.body) + + def blocks( + self, block_type: Optional[str] = None, *labels: str + ) -> List["NodeView"]: + """Return matching blocks, delegating to body.""" + return self.body.blocks(block_type, *labels) + + def attributes(self, name: Optional[str] = None) -> List["NodeView"]: + """Return matching attributes, delegating to body.""" + return self.body.attributes(name) + + def attribute(self, name: str) -> Optional["NodeView"]: + """Return a single attribute by name, or None.""" + return self.body.attribute(name) + + +@register_view(BodyRule) +class BodyView(NodeView): + """View over an HCL2 body (BodyRule).""" + + def blocks( + self, block_type: Optional[str] = None, *labels: str + ) -> List["NodeView"]: + """Return blocks, optionally filtered by type and labels.""" + from hcl2.query.blocks import 
BlockView + + node: BodyRule = self._node # type: ignore[assignment] + results: List[NodeView] = [] + for child in node.children: + if not isinstance(child, BlockRule): + continue + adjacent = _collect_leading_comments(node, child.index) or None + block_view = BlockView(child, adjacent_comments=adjacent) + if block_type is not None and block_view.block_type != block_type: + continue + if labels: + name_lbls = block_view.name_labels + if len(labels) > len(name_lbls): + continue + if any(l != nl for l, nl in zip(labels, name_lbls)): + continue + results.append(block_view) + return results + + def attributes(self, name: Optional[str] = None) -> List["NodeView"]: + """Return attributes, optionally filtered by name.""" + from hcl2.query.attributes import AttributeView + + node: BodyRule = self._node # type: ignore[assignment] + results: List[NodeView] = [] + for child in node.children: + if not isinstance(child, AttributeRule): + continue + adjacent = _collect_leading_comments(node, child.index) or None + attr_view = AttributeView(child, adjacent_comments=adjacent) + if name is not None and attr_view.name != name: + continue + results.append(attr_view) + return results + + def attribute(self, name: str) -> Optional["NodeView"]: + """Return a single attribute by name, or None.""" + attrs = self.attributes(name) + return attrs[0] if attrs else None diff --git a/hcl2/query/builtins.py b/hcl2/query/builtins.py new file mode 100644 index 00000000..575310fb --- /dev/null +++ b/hcl2/query/builtins.py @@ -0,0 +1,115 @@ +"""Built-in terminal transforms for the hq query pipeline.""" + +from typing import Any, List + +from hcl2.query.path import QuerySyntaxError + +BUILTIN_NAMES = frozenset({"keys", "values", "length"}) + + +def apply_builtin(name: str, nodes: List[Any]) -> List[Any]: + """Apply a builtin function to a list of nodes. + + Each builtin produces one result per supported input node. + Unsupported input types are silently skipped (filtered out). 
+ """ + nodes = _unwrap_to_values(nodes) + if name == "keys": + return _apply_keys(nodes) + if name == "values": + return _apply_values(nodes) + if name == "length": + return _apply_length(nodes) + raise QuerySyntaxError(f"Unknown builtin: {name!r}") + + +def _unwrap_to_values(nodes: List[Any]) -> List[Any]: + """Unwrap AttributeView and ExprTermRule wrappers for builtins.""" + from hcl2.query._base import NodeView, view_for + from hcl2.query.attributes import AttributeView + from hcl2.rules.expressions import ExprTermRule + + result: List[Any] = [] + for node in nodes: + if isinstance(node, AttributeView): + node = node.value_node + if isinstance(node, NodeView) and isinstance(node._node, ExprTermRule): + inner = node._node.expression + if inner is not None: + node = view_for(inner) + result.append(node) + return result + + +def _apply_keys(nodes: List[Any]) -> List[Any]: + from hcl2.query.blocks import BlockView + from hcl2.query.body import BodyView, DocumentView + from hcl2.query.containers import ObjectView + + results: List[Any] = [] + for node in nodes: + if isinstance(node, ObjectView): + results.append(node.keys) + elif isinstance(node, (DocumentView, BodyView)): + body = node.body if isinstance(node, DocumentView) else node + names: List[str] = [] + for blk in body.blocks(): + names.append(blk.block_type) # type: ignore[attr-defined] + for attr in body.attributes(): + names.append(attr.name) # type: ignore[attr-defined] + results.append(names) + elif isinstance(node, BlockView): + results.append(node.labels) + elif isinstance(node, dict): + results.append(list(node.keys())) + # other types silently produce nothing + return results + + +def _apply_values(nodes: List[Any]) -> List[Any]: + from hcl2.query.body import BodyView, DocumentView + from hcl2.query.containers import ObjectView, TupleView + + results: List[Any] = [] + for node in nodes: + if isinstance(node, ObjectView): + results.append([v for _, v in node.entries]) + elif isinstance(node, 
TupleView): + results.append(node.elements) + elif isinstance(node, (DocumentView, BodyView)): + body = node.body if isinstance(node, DocumentView) else node + items: list = [] + items.extend(body.blocks()) + items.extend(body.attributes()) + results.append(items) + elif isinstance(node, dict): + results.append(list(node.values())) + elif isinstance(node, list): + results.append(node) + return results + + +def _apply_length(nodes: List[Any]) -> List[Any]: + from hcl2.query._base import NodeView + from hcl2.query.body import BodyView, DocumentView + from hcl2.query.containers import ObjectView, TupleView + from hcl2.query.functions import FunctionCallView + + results: List[Any] = [] + for node in nodes: + if isinstance(node, TupleView): + results.append(len(node)) + elif isinstance(node, ObjectView): + results.append(len(node.entries)) + elif isinstance(node, FunctionCallView): + results.append(len(node.args)) + elif isinstance(node, (DocumentView, BodyView)): + body = node.body if isinstance(node, DocumentView) else node + results.append(len(body.blocks()) + len(body.attributes())) + elif isinstance(node, NodeView): + results.append(1) + elif isinstance(node, (list, dict, str)): + results.append(len(node)) + else: + results.append(1) + return results diff --git a/hcl2/query/containers.py b/hcl2/query/containers.py new file mode 100644 index 00000000..812f3719 --- /dev/null +++ b/hcl2/query/containers.py @@ -0,0 +1,53 @@ +"""TupleView and ObjectView facades.""" + +from typing import List, Optional, Tuple + +from hcl2.query._base import NodeView, register_view, view_for +from hcl2.rules.containers import ObjectRule, TupleRule + + +@register_view(TupleRule) +class TupleView(NodeView): + """View over an HCL2 tuple (TupleRule).""" + + @property + def elements(self) -> List[NodeView]: + """Return the tuple elements as views.""" + node: TupleRule = self._node # type: ignore[assignment] + return [view_for(elem) for elem in node.elements] + + def __len__(self) -> int: + 
node: TupleRule = self._node # type: ignore[assignment] + return len(node.elements) + + def __getitem__(self, index: int) -> NodeView: + node: TupleRule = self._node # type: ignore[assignment] + return view_for(node.elements[index]) + + +@register_view(ObjectRule) +class ObjectView(NodeView): + """View over an HCL2 object (ObjectRule).""" + + @property + def entries(self) -> List[Tuple[str, NodeView]]: + """Return (key, value_view) pairs.""" + node: ObjectRule = self._node # type: ignore[assignment] + result = [] + for elem in node.elements: + key = str(elem.key.serialize()) + val = view_for(elem.expression) + result.append((key, val)) + return result + + def get(self, key: str) -> Optional[NodeView]: + """Get a value view by key, or None.""" + for entry_key, entry_val in self.entries: + if entry_key == key: + return entry_val + return None + + @property + def keys(self) -> List[str]: + """Return all keys as strings.""" + return [k for k, _ in self.entries] diff --git a/hcl2/query/diff.py b/hcl2/query/diff.py new file mode 100644 index 00000000..432af783 --- /dev/null +++ b/hcl2/query/diff.py @@ -0,0 +1,88 @@ +"""Structural diff between two HCL documents.""" + +import json +from dataclasses import dataclass +from typing import Any, List + + +@dataclass +class DiffEntry: + """A single difference between two structures.""" + + path: str + kind: str # "added", "removed", "changed" + left: Any = None + right: Any = None + + +def diff_dicts(left: Any, right: Any, path: str = "") -> List[DiffEntry]: + """Recursively compare two Python structures and return differences.""" + entries: List[DiffEntry] = [] + + if isinstance(left, dict) and isinstance(right, dict): + all_keys = sorted(set(list(left.keys()) + list(right.keys()))) + for key in all_keys: + child_path = f"{path}.{key}" if path else key + if key not in left: + entries.append( + DiffEntry(path=child_path, kind="added", right=right[key]) + ) + elif key not in right: + entries.append( + DiffEntry(path=child_path, 
kind="removed", left=left[key]) + ) + else: + entries.extend(diff_dicts(left[key], right[key], child_path)) + elif isinstance(left, list) and isinstance(right, list): + max_len = max(len(left), len(right)) + for i in range(max_len): + child_path = f"{path}[{i}]" + if i >= len(left): + entries.append(DiffEntry(path=child_path, kind="added", right=right[i])) + elif i >= len(right): + entries.append(DiffEntry(path=child_path, kind="removed", left=left[i])) + else: + entries.extend(diff_dicts(left[i], right[i], child_path)) + elif left != right: + entries.append( + DiffEntry(path=path or "(root)", kind="changed", left=left, right=right) + ) + + return entries + + +def format_diff_text(entries: List[DiffEntry]) -> str: + """Format diff entries as human-readable text.""" + if not entries: + return "" + lines: List[str] = [] + for entry in entries: + if entry.kind == "added": + lines.append(f"+ {entry.path}: {_fmt_val(entry.right)}") + elif entry.kind == "removed": + lines.append(f"- {entry.path}: {_fmt_val(entry.left)}") + elif entry.kind == "changed": + lines.append( + f"~ {entry.path}: {_fmt_val(entry.left)} -> {_fmt_val(entry.right)}" + ) + return "\n".join(lines) + + +def format_diff_json(entries: List[DiffEntry]) -> str: + """Format diff entries as JSON.""" + data = [] + for entry in entries: + item: dict = {"path": entry.path, "kind": entry.kind} + if entry.left is not None: + item["left"] = entry.left + if entry.right is not None: + item["right"] = entry.right + data.append(item) + return json.dumps(data, indent=2, default=str) + + +def _fmt_val(val: Any) -> str: + """Format a value for text diff display.""" + if isinstance(val, str): + return repr(val) + return str(val) diff --git a/hcl2/query/expressions.py b/hcl2/query/expressions.py new file mode 100644 index 00000000..ee57c755 --- /dev/null +++ b/hcl2/query/expressions.py @@ -0,0 +1,27 @@ +"""View facade for HCL2 conditional expressions.""" + +from hcl2.query._base import NodeView, register_view, view_for 
+from hcl2.rules.expressions import ConditionalRule + + +@register_view(ConditionalRule) +class ConditionalView(NodeView): + """View over a ternary conditional expression (condition ? true : false).""" + + @property + def condition(self) -> NodeView: + """Return the condition expression.""" + node: ConditionalRule = self._node # type: ignore[assignment] + return view_for(node.condition) + + @property + def true_val(self) -> NodeView: + """Return the true-branch expression.""" + node: ConditionalRule = self._node # type: ignore[assignment] + return view_for(node.if_true) + + @property + def false_val(self) -> NodeView: + """Return the false-branch expression.""" + node: ConditionalRule = self._node # type: ignore[assignment] + return view_for(node.if_false) diff --git a/hcl2/query/for_exprs.py b/hcl2/query/for_exprs.py new file mode 100644 index 00000000..64638f40 --- /dev/null +++ b/hcl2/query/for_exprs.py @@ -0,0 +1,112 @@ +"""ForTupleView and ForObjectView facades.""" + +from typing import Optional + +from hcl2.query._base import NodeView, register_view, view_for +from hcl2.rules.for_expressions import ForObjectExprRule, ForTupleExprRule + + +@register_view(ForTupleExprRule) +class ForTupleView(NodeView): + """View over a for-tuple expression ([for ...]).""" + + @property + def iterator_name(self) -> str: + """Return the first iterator identifier name.""" + node: ForTupleExprRule = self._node # type: ignore[assignment] + return node.for_intro.first_iterator.serialize() + + @property + def second_iterator_name(self) -> Optional[str]: + """Return the second iterator identifier name, or None.""" + node: ForTupleExprRule = self._node # type: ignore[assignment] + second = node.for_intro.second_iterator + if second is None: + return None + return second.serialize() + + @property + def iterable(self) -> NodeView: + """Return a view over the iterable expression.""" + node: ForTupleExprRule = self._node # type: ignore[assignment] + return view_for(node.for_intro.iterable) 
+ + @property + def value_expr(self) -> NodeView: + """Return a view over the value expression.""" + node: ForTupleExprRule = self._node # type: ignore[assignment] + return view_for(node.value_expr) + + @property + def has_condition(self) -> bool: + """Return whether the for expression has an if condition.""" + node: ForTupleExprRule = self._node # type: ignore[assignment] + return node.condition is not None + + @property + def condition(self) -> Optional[NodeView]: + """Return a view over the condition, or None.""" + node: ForTupleExprRule = self._node # type: ignore[assignment] + cond = node.condition + if cond is None: + return None + return view_for(cond) + + +@register_view(ForObjectExprRule) +class ForObjectView(NodeView): + """View over a for-object expression ({for ...}).""" + + @property + def iterator_name(self) -> str: + """Return the first iterator identifier name.""" + node: ForObjectExprRule = self._node # type: ignore[assignment] + return node.for_intro.first_iterator.serialize() + + @property + def second_iterator_name(self) -> Optional[str]: + """Return the second iterator identifier name, or None.""" + node: ForObjectExprRule = self._node # type: ignore[assignment] + second = node.for_intro.second_iterator + if second is None: + return None + return second.serialize() + + @property + def iterable(self) -> NodeView: + """Return a view over the iterable expression.""" + node: ForObjectExprRule = self._node # type: ignore[assignment] + return view_for(node.for_intro.iterable) + + @property + def key_expr(self) -> NodeView: + """Return a view over the key expression.""" + node: ForObjectExprRule = self._node # type: ignore[assignment] + return view_for(node.key_expr) + + @property + def value_expr(self) -> NodeView: + """Return a view over the value expression.""" + node: ForObjectExprRule = self._node # type: ignore[assignment] + return view_for(node.value_expr) + + @property + def has_ellipsis(self) -> bool: + """Return whether the for expression 
has an ellipsis.""" + node: ForObjectExprRule = self._node # type: ignore[assignment] + return node.ellipsis is not None + + @property + def has_condition(self) -> bool: + """Return whether the for expression has an if condition.""" + node: ForObjectExprRule = self._node # type: ignore[assignment] + return node.condition is not None + + @property + def condition(self) -> Optional[NodeView]: + """Return a view over the condition, or None.""" + node: ForObjectExprRule = self._node # type: ignore[assignment] + cond = node.condition + if cond is None: + return None + return view_for(cond) diff --git a/hcl2/query/functions.py b/hcl2/query/functions.py new file mode 100644 index 00000000..58e3c496 --- /dev/null +++ b/hcl2/query/functions.py @@ -0,0 +1,35 @@ +"""FunctionCallView facade.""" + +from typing import List + +from hcl2.query._base import NodeView, register_view, view_for +from hcl2.rules.functions import FunctionCallRule + + +@register_view(FunctionCallRule) +class FunctionCallView(NodeView): + """View over an HCL2 function call (FunctionCallRule).""" + + @property + def name(self) -> str: + """Return the function name (namespace::name joined).""" + node: FunctionCallRule = self._node # type: ignore[assignment] + return "::".join(ident.serialize() for ident in node.identifiers) + + @property + def args(self) -> List[NodeView]: + """Return the function arguments as views.""" + node: FunctionCallRule = self._node # type: ignore[assignment] + args_rule = node.arguments + if args_rule is None: + return [] + return [view_for(arg) for arg in args_rule.arguments] + + @property + def has_ellipsis(self) -> bool: + """Return whether the argument list ends with ellipsis.""" + node: FunctionCallRule = self._node # type: ignore[assignment] + args_rule = node.arguments + if args_rule is None: + return False + return args_rule.has_ellipsis diff --git a/hcl2/query/introspect.py b/hcl2/query/introspect.py new file mode 100644 index 00000000..55c06234 --- /dev/null +++ 
b/hcl2/query/introspect.py @@ -0,0 +1,186 @@ +"""Introspection utilities for --describe and --schema flags.""" + +import inspect +from typing import Any, Dict, List + +from hcl2.query._base import NodeView, _VIEW_REGISTRY +from hcl2.query.safe_eval import _SAFE_CALLABLE_NAMES + + +def describe_results(results: List[Any]) -> Dict[str, Any]: + """Build a description dict for --describe output.""" + described = [] + for result in results: + if isinstance(result, NodeView): + described.append(_describe_view(result)) + else: + described.append( + { + "type": type(result).__name__, + "value": repr(result), + } + ) + return {"results": described} + + +def _describe_view(view: NodeView) -> Dict[str, Any]: + """Describe a single view instance.""" + cls = type(view) + props = [] + methods = [] + + for name, obj in inspect.getmembers(cls): + if name.startswith("_"): + continue + if isinstance(obj, property): + props.append(name) + elif callable(obj) and not isinstance(obj, (staticmethod, classmethod)): + sig = "" + try: + sig = str(inspect.signature(obj)) + except (ValueError, TypeError): + pass + methods.append(f"{name}{sig}") + + summary = _summarize_view(view) + + result: Dict[str, Any] = { + "type": cls.__name__, + "properties": props, + "methods": methods, + } + if summary: + result["summary"] = summary + return result + + +def _summarize_view(view: NodeView) -> str: + """Generate a brief summary string for a view.""" + from hcl2.query.blocks import BlockView + from hcl2.query.attributes import AttributeView + + if isinstance(view, BlockView): + return f"block_type={view.block_type!r}, labels={view.labels!r}" + if isinstance(view, AttributeView): + return f"name={view.name!r}" + return "" + + +def build_schema() -> Dict[str, Any]: + """Build the full view API schema for --schema output.""" + views = {} + for rule_type, view_cls in _VIEW_REGISTRY.items(): + views[view_cls.__name__] = _schema_for_class(view_cls, rule_type) + + # Add base NodeView + views["NodeView"] = 
_schema_for_class(NodeView, None) + + return { + "docs": "https://github.com/amplify-education/python-hcl2/tree/main/docs", + "query_guide": { + "mode_preference": [ + "1. Structural (default) — preferred for all queries. jq-like syntax.", + "2. Hybrid (::) — only when you need Python on structural results.", + "3. Eval (-e) — last resort. Many expressions are blocked for safety.", + ], + "structural_syntax": { + "navigate": "resource.aws_instance.main.ami", + "wildcard": "variable[*]", + "skip_labels": "resource~[*]", + "pipes": "resource[*] | .tags | keys", + "select": "resource~[select(.ami)]", + "string_functions": ( + 'select(.source | contains("x")), ' + 'select(.ami | test("^ami-")), ' + 'select(.name | startswith("prod-")), ' + 'select(.path | endswith("/api"))' + ), + "has": 'select(has("tags"))', + "postfix_not": "select(.tags | not)", + "any_all": 'any(.elements; .type == "function_call")', + "construct": "{name: .name, type: .block_type, file: .__file__}", + "recursive": "*..function_call:*", + "optional": "nonexistent?", + }, + "output_flags": { + "--json": "JSON output", + "--value": "Raw value (keeps quotes on strings)", + "--raw": "Raw value (strips quotes, ideal for shell piping)", + "--no-filename": "Suppress filename prefix in multi-file mode", + }, + "examples": [ + "hq 'resource.aws_instance~[*] | .ami' dir/ --raw", + "hq 'module~[select(.source | contains(\"docker\"))]' dir/ --json", + "hq 'resource~[select(has(\"tags\"))] | {name: .name_labels, tags}' dir/ --json", + "hq 'variable~[select(.default)] | {name: .name_labels, default}' . --raw", + "hq file1.tf --diff file2.tf --json", + ], + }, + "views": views, + "eval_namespace": { + "note": "Eval mode (-e) is a last resort. 
Prefer structural queries.", + "builtins": sorted(_SAFE_CALLABLE_NAMES), + "variables": { + "doc": "DocumentView", + "_": "NodeView (per-result in hybrid mode)", + }, + }, + } + + +# pylint: disable-next=too-many-locals +def _schema_for_class(cls, rule_type) -> Dict[str, Any]: + """Build schema for a single view class.""" + result: Dict[str, Any] = {} + if rule_type is not None: + result["wraps"] = rule_type.__name__ + + props = {} + methods = {} + static_methods = {} + + # Collect staticmethod names from __dict__ of cls and its bases + static_names = set() + for klass in cls.__mro__: + for attr_name, attr_val in klass.__dict__.items(): + if isinstance(attr_val, staticmethod): + static_names.add(attr_name) + + for name in sorted(dir(cls)): + if name.startswith("_"): + continue + obj = getattr(cls, name) + if isinstance(obj, property): + # Get return annotation if available + ann = "" + if obj.fget and hasattr(obj.fget, "__annotations__"): + ret = obj.fget.__annotations__.get("return") + if ret: + ann = str(ret) + prop_info: Dict[str, str] = {"type": ann or "Any"} + # Extract description from property docstring + doc = obj.fget.__doc__ if obj.fget else None + if doc: + prop_info["description"] = doc.strip() + props[name] = prop_info + elif name in static_names: + try: + sig = str(inspect.signature(obj)) + except (ValueError, TypeError): + sig = "(...)" + static_methods[name] = sig + elif callable(obj): + try: + sig = str(inspect.signature(obj)) + except (ValueError, TypeError): + sig = "(...)" + methods[name] = sig + + if props: + result["properties"] = props + if methods: + result["methods"] = methods + if static_methods: + result["static_methods"] = static_methods + + return result diff --git a/hcl2/query/path.py b/hcl2/query/path.py new file mode 100644 index 00000000..de0e4a71 --- /dev/null +++ b/hcl2/query/path.py @@ -0,0 +1,258 @@ +"""Structural path parser for the hq query language.""" + +import re +from dataclasses import dataclass +from typing import List, 
Optional, Tuple + + +class QuerySyntaxError(Exception): + """Raised when a structural path cannot be parsed.""" + + +@dataclass(frozen=True) +class PathSegment: + """A single segment in a structural path.""" + + name: str # identifier or "*" for wildcard + select_all: bool # True if [*] suffix + index: Optional[int] # integer if [N] suffix, None otherwise + recursive: bool = False # True for ".." recursive descent + predicate: object = None # PredicateNode if [select(...)] suffix + type_filter: Optional[str] = None # e.g. "function_call" in function_call:name + skip_labels: bool = False # True if ~ suffix (skip remaining block labels) + + +# Optional type qualifier prefix: type_filter:name~?[bracket]? +_SEGMENT_RE = re.compile( + r"^(?:([a-z_]+):)?([a-zA-Z_][a-zA-Z0-9_-]*|\*)(~)?(?:\[(\*|[0-9]+)\])?\??$" +) + + +def parse_path(path_str: str) -> List[PathSegment]: # pylint: disable=too-many-locals + """Parse a structural path string into segments. + + Grammar: + path := segment ("." segment)* + segment := name ("[*]" | "[" INT "]")? + name := "*" | IDENTIFIER + + Raises QuerySyntaxError on invalid input. 
+ """ + if not path_str or not path_str.strip(): + raise QuerySyntaxError("Empty path") + + # jq compat: .[] is an alias for [*] + path_str = path_str.replace(".[]", "[*]") + + segments: List[PathSegment] = [] + parts = _split_path(path_str) + + for is_recursive, part in parts: + # Check for [select(...)] syntax + select_match = _extract_select(part) + if select_match is not None: + seg_name, predicate, type_filter, skip, sel_all, sel_idx = select_match + segments.append( + PathSegment( + name=seg_name, + select_all=sel_all, + index=sel_idx, + recursive=is_recursive, + predicate=predicate, + type_filter=type_filter, + skip_labels=skip, + ) + ) + continue + + match = _SEGMENT_RE.match(part) + if not match: + raise QuerySyntaxError(f"Invalid path segment: {part!r} in {path_str!r}") + + type_filter = match.group(1) # optional "type:" prefix + name = match.group(2) + skip_labels = match.group(3) is not None # "~" suffix + bracket = match.group(4) + + if bracket is None: + segments.append( + PathSegment( + name=name, + select_all=False, + index=None, + recursive=is_recursive, + type_filter=type_filter, + skip_labels=skip_labels, + ) + ) + elif bracket == "*": + segments.append( + PathSegment( + name=name, + select_all=True, + index=None, + recursive=is_recursive, + type_filter=type_filter, + skip_labels=skip_labels, + ) + ) + else: + segments.append( + PathSegment( + name=name, + select_all=False, + index=int(bracket), + recursive=is_recursive, + type_filter=type_filter, + skip_labels=skip_labels, + ) + ) + + return segments + + +# pylint: disable-next=too-many-statements +def _split_path(path_str: str) -> List[Tuple[bool, str]]: + """Split a path string into (is_recursive, segment_text) pairs. + + Handles both single dots (normal) and double dots (recursive descent). + Bracket-aware: dots inside ``[...]`` are not treated as separators. 
+ """ + result: List[Tuple[bool, str]] = [] + i = 0 + current: List[str] = [] + bracket_depth = 0 + paren_depth = 0 + + while i < len(path_str): + char = path_str[i] + + if char == "[": + bracket_depth += 1 + current.append(char) + i += 1 + elif char == "]": + bracket_depth -= 1 + current.append(char) + i += 1 + elif char == "(": + paren_depth += 1 + current.append(char) + i += 1 + elif char == ")": + paren_depth -= 1 + current.append(char) + i += 1 + elif char == '"': + # Consume entire quoted string, respecting escaped quotes + current.append(char) + i += 1 + while i < len(path_str) and path_str[i] != '"': + if path_str[i] == "\\" and i + 1 < len(path_str): + current.append(path_str[i]) + i += 1 + current.append(path_str[i]) + i += 1 + if i < len(path_str): + current.append(path_str[i]) + i += 1 + elif char == "." and bracket_depth == 0 and paren_depth == 0: + # Emit current segment if any + if current: + result.append((False, "".join(current))) + current = [] + elif not result: + raise QuerySyntaxError(f"Path cannot start with '.': {path_str!r}") + + # Check for ".." (recursive descent) + if i + 1 < len(path_str) and path_str[i + 1] == ".": + i += 2 # skip both dots + # Collect the next segment (respecting brackets) + next_seg: List[str] = [] + bracket_depth = 0 + while i < len(path_str): + char = path_str[i] + if char == "[": + bracket_depth += 1 + elif char == "]": + bracket_depth -= 1 + elif char == "." 
and bracket_depth == 0: + break + next_seg.append(char) + i += 1 + if not next_seg: + raise QuerySyntaxError(f"Expected segment after '..': {path_str!r}") + result.append((True, "".join(next_seg))) + else: + i += 1 # skip single dot + else: + current.append(char) + i += 1 + + if current: + result.append((False, "".join(current))) + + if not result: + raise QuerySyntaxError(f"Empty path: {path_str!r}") + + return result + + +def _extract_select(part: str) -> Optional[tuple]: # pylint: disable=too-many-locals + """Extract ``name[select(...)]`` from a segment string. + + Returns ``(name, predicate, type_filter, skip_labels, select_all, index)`` or ``None``. + """ + select_marker = "[select(" + idx = part.find(select_marker) + if idx == -1: + return None + + seg_name = part[:idx] + if not seg_name or not re.match( + r"^(?:[a-z_]+:)?(?:[a-zA-Z_][a-zA-Z0-9_-]*|\*)~?$", seg_name + ): + raise QuerySyntaxError(f"Invalid segment name before [select(): {seg_name!r}") + + # Parse optional type_filter:name prefix + type_filter = None + if ":" in seg_name: + type_filter, seg_name = seg_name.split(":", 1) + + # Parse optional ~ suffix + skip_labels = seg_name.endswith("~") + if skip_labels: + seg_name = seg_name[:-1] + + # Find matching )] for select(...), allowing optional trailing [*] or [N] + inner_start = idx + len(select_marker) + close_idx = part.find(")]", inner_start) + if close_idx == -1: + raise QuerySyntaxError(f"Expected )] at end of select bracket in: {part!r}") + inner = part[inner_start:close_idx] + tail = part[close_idx + 2 :] # text after ")]" + + from hcl2.query.predicate import parse_predicate + + predicate = parse_predicate(inner) + + # Parse optional trailing [*] or [N] after [select(...)], with optional ? + select_all = True # default: select returns all matches + index = None + if tail: + # Strip trailing ? 
(optional operator is a no-op at segment level) + clean_tail = tail.rstrip("?") + if clean_tail: + tail_match = re.match(r"^\[(\*|[0-9]+)\]$", clean_tail) + if not tail_match: + raise QuerySyntaxError( + f"Unexpected suffix after [select(...)]: {tail!r} in {part!r}" + ) + bracket = tail_match.group(1) + if bracket == "*": + select_all = True + else: + select_all = False + index = int(bracket) + + return (seg_name, predicate, type_filter, skip_labels, select_all, index) diff --git a/hcl2/query/pipeline.py b/hcl2/query/pipeline.py new file mode 100644 index 00000000..06b163fe --- /dev/null +++ b/hcl2/query/pipeline.py @@ -0,0 +1,474 @@ +"""Pipeline operator for chaining query stages.""" + +from dataclasses import dataclass +from typing import Any, List, Tuple + +from hcl2.query.path import QuerySyntaxError, PathSegment, parse_path + + +@dataclass(frozen=True) +class PathStage: + """A normal dotted-path stage.""" + + segments: List[PathSegment] + + +@dataclass(frozen=True) +class BuiltinStage: + """A builtin function stage (keys, values, length).""" + + name: str + unpack: bool = False # True when [*] suffix is used + + +@dataclass(frozen=True) +class SelectStage: + """A select() predicate stage.""" + + predicate: Any # PredicateNode from predicate.py + + +@dataclass(frozen=True) +class ConstructStage: + """A ``{field1, field2, key: .path}`` object construction stage.""" + + fields: List[Tuple[str, List[PathSegment]]] # [(output_key, path_segments), ...] + + +class _LocatedDict(dict): + """Dict that carries source location metadata from a construct stage. + + When ``execute_pipeline`` builds a dict from a ``ConstructStage``, the + input ``NodeView``'s ``_meta`` (line/column info) is stored here so + downstream consumers (e.g. ``--with-location``) can still access it. + """ + + _source_meta: Any = None + + +def split_pipeline(query_str: str) -> List[str]: + """Split a query string on ``|`` at depth 0. 
+ + Tracks ``[]``, ``()`` depth and ``"..."`` quote state so that + pipes inside brackets, parentheses, or strings are not split. + + Raises :class:`QuerySyntaxError` on empty stages. + """ + stages: List[str] = [] + current: List[str] = [] + bracket_depth = 0 + paren_depth = 0 + brace_depth = 0 + in_string = False + + for i, char in enumerate(query_str): + if in_string: + current.append(char) + if char == '"' and (i == 0 or query_str[i - 1] != "\\"): + in_string = False + continue + + if char == '"': + in_string = True + current.append(char) + elif char == "[": + bracket_depth += 1 + current.append(char) + elif char == "]": + bracket_depth -= 1 + current.append(char) + elif char == "(": + paren_depth += 1 + current.append(char) + elif char == ")": + paren_depth -= 1 + current.append(char) + elif char == "{": + brace_depth += 1 + current.append(char) + elif char == "}": + brace_depth -= 1 + current.append(char) + elif ( + char == "|" and bracket_depth == 0 and paren_depth == 0 and brace_depth == 0 + ): + stage = "".join(current).strip() + if not stage: + raise QuerySyntaxError("Empty stage in pipeline") + stages.append(stage) + current = [] + else: + current.append(char) + + # Final stage + tail = "".join(current).strip() + if not tail and stages: + raise QuerySyntaxError("Empty stage in pipeline") + if tail: + stages.append(tail) + + if not stages: + raise QuerySyntaxError("Empty pipeline") + + return stages + + +def classify_stage(stage_str: str) -> Any: + """Classify a stage string into a PipeStage type. + + - ``select(...)`` → :class:`SelectStage` + - ``keys`` / ``values`` / ``length`` → :class:`BuiltinStage` + - Otherwise → :class:`PathStage` + """ + from hcl2.query.builtins import BUILTIN_NAMES + + stripped = stage_str.strip() + + # Strip trailing ? 
(optional operator is a no-op at stage level) + if stripped.endswith("?"): + stripped = stripped[:-1].rstrip() + + if stripped.startswith("select(") and stripped.endswith(")"): + from hcl2.query.predicate import parse_predicate + + inner = stripped[len("select(") : -1] + predicate = parse_predicate(inner) + return SelectStage(predicate=predicate) + + if stripped in BUILTIN_NAMES: + return BuiltinStage(name=stripped) + + # Allow builtin[*] to unpack list results into individual items + if stripped.endswith("[*]") and stripped[:-3] in BUILTIN_NAMES: + return BuiltinStage(name=stripped[:-3], unpack=True) + + # Object construction: {field1, field2} or {key: .path, ...} + if stripped.startswith("{") and stripped.endswith("}"): + fields = _parse_construct(stripped[1:-1]) + return ConstructStage(fields=fields) + + # Allow jq-style leading dot (e.g. ".foo" in a pipe stage) + path_str = stripped + if path_str == ".": + return PathStage(segments=[]) + if path_str.startswith(".") and len(path_str) > 1 and path_str[1] != ".": + path_str = path_str[1:] + + return PathStage(segments=parse_path(path_str)) + + +def _split_construct_fields(inner: str) -> List[str]: + """Split the inner part of ``{...}`` on commas, respecting brackets and parens.""" + fields: List[str] = [] + current: List[str] = [] + bracket_depth = 0 + paren_depth = 0 + + for char in inner: + if char == "[": + bracket_depth += 1 + current.append(char) + elif char == "]": + bracket_depth -= 1 + current.append(char) + elif char == "(": + paren_depth += 1 + current.append(char) + elif char == ")": + paren_depth -= 1 + current.append(char) + elif char == "," and bracket_depth == 0 and paren_depth == 0: + field = "".join(current).strip() + if field: + fields.append(field) + current = [] + else: + current.append(char) + + tail = "".join(current).strip() + if tail: + fields.append(tail) + + return fields + + +def _parse_construct(inner: str) -> List[Tuple[str, List[PathSegment]]]: + """Parse the fields inside ``{...}`` 
into (key, path_segments) pairs.""" + raw_fields = _split_construct_fields(inner) + if not raw_fields: + raise QuerySyntaxError("Empty object construction: {}") + + result: List[Tuple[str, List[PathSegment]]] = [] + for field in raw_fields: + if ":" in field: + # Renamed: key: .path + colon_idx = field.index(":") + key = field[:colon_idx].strip() + path_str = field[colon_idx + 1 :].strip() + if path_str.startswith(".") and len(path_str) > 1: + path_str = path_str[1:] + result.append((key, parse_path(path_str))) + elif field.startswith("."): + # Dotted shorthand: .path → key=last segment + path_str = field[1:] + segments = parse_path(path_str) + key = segments[-1].name + result.append((key, segments)) + else: + # Shorthand: field_name → key=field_name, path=field_name + result.append((field, parse_path(field))) + + return result + + +def _unwrap_construct_value(value: Any) -> Any: + """Unwrap an AttributeView to its value for object construction. + + When constructing ``{name, type}``, resolving ``name`` returns an + ``AttributeView`` whose ``to_dict()`` produces ``{"name": "..."}`` + — but we want just the value, not the key-value wrapper. + """ + from hcl2.query.attributes import AttributeView + + if isinstance(value, AttributeView): + return value.value_node + if isinstance(value, list): + return [_unwrap_construct_value(v) for v in value] + return value + + +def _to_json_value(value: Any) -> Any: + """Convert a value to a JSON-serializable Python value.""" + from hcl2.query._base import NodeView + + if isinstance(value, NodeView): + return value.to_dict() + if isinstance(value, list): + return [_to_json_value(v) for v in value] + return value + + +def _resolve_path_item(item: Any, segments: List[PathSegment]) -> List[Any]: + """Resolve a path stage against a single item. + + Tries property access, then structural resolution, then structural + resolution on an unwrapped version of the item. 
As a last resort, + checks whether the unwrapped item itself satisfies a type-qualifier + filter (so ``object:*`` in a pipe stage acts like ``select(.type == …)``). + """ + from hcl2.query._base import NodeView + from hcl2.query.resolver import resolve_path + + if not isinstance(item, NodeView): + return [] + + # Try property access first (before unwrapping) + prop = _try_property_access(item, segments) + if prop is not None: + return [prop] + + # Structural resolution on the item as-is + resolved = resolve_path(item, segments) + if resolved: + return resolved + + # Try structural resolution on unwrapped item + unwrapped_item = _unwrap_single(item) + if unwrapped_item is not item: + resolved = resolve_path(unwrapped_item, segments) + if resolved: + return resolved + + # Last resort: single type-qualified wildcard in a pipe stage can match + # the unwrapped item itself (e.g. ``| object:*`` keeps only objects). + if unwrapped_item is not item: + matched = _try_type_match(unwrapped_item, segments) + if matched is not None: + return [matched] + + return [] + + +# pylint: disable-next=too-many-locals +def execute_pipeline(root: Any, stages: List[Any], file_path: str = "") -> List[Any]: + """Execute a list of stages against a root view. + + Starts with ``[root]`` and feeds results through each stage. + """ + from hcl2.query.builtins import apply_builtin + from hcl2.query.predicate import evaluate_predicate + + results: List[Any] = [root] + + for i, stage in enumerate(stages): + next_results: List[Any] = [] + + if isinstance(stage, PathStage): + for item in results: + next_results.extend(_resolve_path_item(item, stage.segments)) + + # When the next stage is a builtin or select, unwrap so they + # see underlying values instead of wrapper views. + # Don't unwrap for ConstructStage — it needs original views + # for property access like .block_type, .name_labels. 
+ if i < len(stages) - 1 and not isinstance( + stages[i + 1], (PathStage, ConstructStage) + ): + next_results = _unwrap_for_next_stage(next_results) + + elif isinstance(stage, BuiltinStage): + next_results = apply_builtin(stage.name, results) + if stage.unpack: + unpacked: List[Any] = [] + for item in next_results: + if isinstance(item, list): + unpacked.extend(item) + else: + unpacked.append(item) + next_results = unpacked + elif isinstance(stage, SelectStage): + for item in results: + if evaluate_predicate(stage.predicate, item): + next_results.append(item) + elif isinstance(stage, ConstructStage): + from hcl2.query._base import NodeView + + for item in results: + obj = _LocatedDict() + if isinstance(item, NodeView): + obj._source_meta = getattr(item.raw, "_meta", None) + elif isinstance(item, _LocatedDict): + obj._source_meta = item._source_meta + for key, segments in stage.fields: + # __file__ is a virtual field resolved to the source path + if len(segments) == 1 and segments[0].name == "__file__": + obj[key] = file_path + continue + resolved = _resolve_path_item(item, segments) + if resolved: + val = resolved[0] if len(resolved) == 1 else resolved + obj[key] = _to_json_value(_unwrap_construct_value(val)) + else: + obj[key] = None + next_results.append(obj) + else: + raise QuerySyntaxError(f"Unknown stage type: {stage!r}") + + results = next_results + if not results: + return [] + + return results + + +def _try_type_match(node: Any, segments: List[PathSegment]) -> Any: + """Check if a node matches a single type-qualified wildcard segment. + + Enables ``| object:*`` as a pipe-stage type filter. Returns the node + if it matches, or ``None`` otherwise. 
+ """ + from hcl2.query._base import NodeView, view_type_name + + if len(segments) != 1: + return None + + seg = segments[0] + if seg.type_filter is None or seg.name != "*": + return None + + if not isinstance(node, NodeView): + return None + + if view_type_name(node) == seg.type_filter: + return node + return None + + +def _try_property_access( # pylint: disable=too-many-return-statements + node: Any, segments: List[PathSegment] +) -> Any: + """Try resolving segments as Python property accesses on a view. + + Called by ``_resolve_path_item`` before structural resolution is attempted. + Only handles single-segment paths (no dots) with no type filter. + Returns the property value, or ``None`` if no matching property exists. + """ + from hcl2.query._base import NodeView + + if len(segments) != 1: + return None + + seg = segments[0] + if seg.type_filter is not None or not isinstance(node, NodeView): + return None + + # Check for a Python property on the view class + # In query context, .value resolves to .value_node so it formats + # consistently across output modes (HCL expression, not ${...} wrapped). + prop_name = seg.name + if prop_name == "value" and hasattr(type(node), "value_node"): + prop_name = "value_node" + + prop_descriptor = getattr(type(node), prop_name, None) + if not isinstance(prop_descriptor, property): + return None + + value = getattr(node, prop_name) + + # Apply index/select_all to list-valued properties + if seg.select_all and isinstance(value, list): + return value + if seg.index is not None and isinstance(value, list): + if 0 <= seg.index < len(value): + return value[seg.index] + return None + + return value + + +def _unwrap_single(item: Any) -> Any: + """Unwrap a single view for structural resolution. + + Returns the unwrapped view, or the original item if no unwrapping applies. 
+ """ + from hcl2.query._base import NodeView, view_for + from hcl2.query.attributes import AttributeView + from hcl2.query.blocks import BlockView + from hcl2.rules.expressions import ExprTermRule + + if isinstance(item, AttributeView): + item = item.value_node + elif isinstance(item, BlockView): + item = item.body + if isinstance(item, NodeView) and isinstance(item._node, ExprTermRule): + inner = item._node.expression + if inner is not None: + item = view_for(inner) + return item + + +def _unwrap_for_next_stage(results: List[Any]) -> List[Any]: + """Unwrap views for pipeline chaining between stages. + + - AttributeView → value node (unwrapped from ExprTermRule) + - BlockView → body (so subsequent stages see attributes/blocks, not labels) + - ExprTermRule wrapper → concrete inner view + """ + from hcl2.query._base import NodeView, view_for + from hcl2.query.attributes import AttributeView + from hcl2.query.blocks import BlockView + from hcl2.rules.expressions import ExprTermRule + + unwrapped: List[Any] = [] + for item in results: + if isinstance(item, AttributeView): + item = item.value_node + elif isinstance(item, BlockView): + item = item.body + # Unwrap ExprTermRule wrappers to concrete view types + if isinstance(item, NodeView) and isinstance(item._node, ExprTermRule): + inner = item._node.expression + if inner is not None: + item = view_for(inner) + unwrapped.append(item) + return unwrapped diff --git a/hcl2/query/predicate.py b/hcl2/query/predicate.py new file mode 100644 index 00000000..cf8756c0 --- /dev/null +++ b/hcl2/query/predicate.py @@ -0,0 +1,570 @@ +"""Self-contained recursive descent parser and evaluator for select() predicates. + +Predicate grammar:: + + predicate := or_expr + or_expr := and_expr ("or" and_expr)* + and_expr := not_expr ("and" not_expr)* + not_expr := "not" not_expr | comparison + comparison := accessor (comp_op literal)? 
| any_all | has_expr + any_all := ("any" | "all") "(" accessor ";" predicate ")" + has_expr := "has" "(" STRING ")" + accessor := "." IDENT ("." IDENT)* ("[" INT "]")? ("|" BUILTIN_OR_FUNC)? + BUILTIN := "keys" | "values" | "length" | "not" + FUNC := ("contains" | "test" | "startswith" | "endswith") "(" STRING ")" + literal := STRING | NUMBER | "true" | "false" | "null" + comp_op := "==" | "!=" | "<" | ">" | "<=" | ">=" + +No Python eval() is used. +""" + +import re +from dataclasses import dataclass +from typing import Any, List, Optional, Union + +from hcl2.query.path import QuerySyntaxError + + +# --------------------------------------------------------------------------- +# AST nodes +# --------------------------------------------------------------------------- + + +_STRING_FUNCTIONS = frozenset({"contains", "test", "startswith", "endswith"}) + + +@dataclass(frozen=True) +class Accessor: + """A dotted accessor, e.g. ``.foo.bar[0]`` or ``.foo | length``.""" + + parts: List[str] # ["foo", "bar"] + index: Optional[int] = None # [0] suffix + builtin: Optional[str] = None # "length", "keys", "values", "not" + builtin_arg: Optional[str] = None # argument for string functions + + +@dataclass(frozen=True) +class Comparison: + """``accessor comp_op literal`` or bare ``accessor`` (existence check).""" + + accessor: Accessor + operator: Optional[str] = None # "==", "!=", "<", ">", "<=", ">=" + value: Any = None # Python literal value + + +@dataclass(frozen=True) +class NotExpr: + """``not expr``.""" + + child: Any # PredicateNode + + +@dataclass(frozen=True) +class AndExpr: + """``expr and expr ...``.""" + + children: List[Any] + + +@dataclass(frozen=True) +class OrExpr: + """``expr or expr ...``.""" + + children: List[Any] + + +@dataclass(frozen=True) +class AnyExpr: + """``any(accessor; predicate)`` — true if any element matches.""" + + accessor: "Accessor" + predicate: Any # PredicateNode + + +@dataclass(frozen=True) +class AllExpr: + """``all(accessor; predicate)`` — 
 true if all elements match.""" + + accessor: "Accessor" + predicate: Any # PredicateNode + + +@dataclass(frozen=True) +class HasExpr: + """``has("key")`` — true if the key exists on the target.""" + + key: str + + +PredicateNode = Union[Comparison, NotExpr, AndExpr, OrExpr, AnyExpr, AllExpr, HasExpr] + + +# --------------------------------------------------------------------------- +# Tokeniser +# --------------------------------------------------------------------------- + +_TOKEN_RE = re.compile( + r""" + (?P<DOT>\.) + | (?P<PIPE>\|) + | (?P<SEMI>;) + | (?P<LPAREN>\() + | (?P<RPAREN>\)) + | (?P<LBRACKET>\[) + | (?P<RBRACKET>\]) + | (?P<OP>==|!=|<=|>=|<|>) + | (?P<STRING>"(?:[^"\\]|\\.)*") + | (?P<NUMBER>-?[0-9]+(?:\.[0-9]+)?) + | (?P<WORD>[a-zA-Z_][a-zA-Z0-9_-]*) + | (?P<WS>\s+) + """, + re.VERBOSE, +) + + +@dataclass +class Token: + """A single token from the predicate tokeniser.""" + + kind: str + value: str + + +def tokenize(text: str) -> List[Token]: + """Tokenize a predicate string.""" + tokens: List[Token] = [] + pos = 0 + while pos < len(text): + match = _TOKEN_RE.match(text, pos) + if match is None: + raise QuerySyntaxError( + f"Unexpected character at position {pos} in predicate: {text!r}" + ) + pos = match.end() + kind = match.lastgroup + assert kind is not None + if kind == "WS": + continue + tokens.append(Token(kind=kind, value=match.group())) + return tokens + + +# --------------------------------------------------------------------------- +# Recursive descent parser +# --------------------------------------------------------------------------- + + +class _Parser: # pylint: disable=too-few-public-methods + """Consumes token list and builds a predicate AST.""" + + def __init__(self, tokens: List[Token]): + self.tokens = tokens + self.pos = 0 + + def _peek(self) -> Optional[Token]: + """Return current token without consuming.""" + if self.pos < len(self.tokens): + return self.tokens[self.pos] + return None + + def _advance(self) -> Token: + """Consume and return the current token.""" + tok = self.tokens[self.pos] + self.pos += 1 + 
return tok + + def _expect(self, kind: str) -> Token: + """Consume token of *kind*, or raise.""" + tok = self._peek() + if tok is None or tok.kind != kind: + found = tok.value if tok else "end-of-input" + raise QuerySyntaxError(f"Expected {kind}, got {found!r}") + return self._advance() + + def parse(self) -> PredicateNode: + """Parse the full token stream into a predicate AST.""" + node = self._or_expr() + if self.pos < len(self.tokens): + raise QuerySyntaxError(f"Unexpected token: {self.tokens[self.pos].value!r}") + return node + + def _or_expr(self) -> PredicateNode: + """Parse ``and_expr ('or' and_expr)*``.""" + children = [self._and_expr()] + tok = self._peek() + while tok and tok.kind == "WORD" and tok.value == "or": + self._advance() + children.append(self._and_expr()) + tok = self._peek() + return children[0] if len(children) == 1 else OrExpr(children=children) + + def _and_expr(self) -> PredicateNode: + """Parse ``not_expr ('and' not_expr)*``.""" + children = [self._not_expr()] + tok = self._peek() + while tok and tok.kind == "WORD" and tok.value == "and": + self._advance() + children.append(self._not_expr()) + tok = self._peek() + return children[0] if len(children) == 1 else AndExpr(children=children) + + def _not_expr(self) -> PredicateNode: + """Parse ``'not' not_expr | comparison``.""" + tok = self._peek() + if tok and tok.kind == "WORD" and tok.value == "not": + self._advance() + return NotExpr(child=self._not_expr()) + return self._comparison() + + def _comparison(self) -> PredicateNode: + """Parse ``accessor (comp_op literal)?``, ``any/all(...)``, or ``has(...)``.""" + tok = self._peek() + if tok and tok.kind == "WORD" and tok.value in ("any", "all"): + return self._any_all() + + if tok and tok.kind == "WORD" and tok.value == "has": + return self._has_expr() + + accessor = self._accessor() + tok = self._peek() + if tok and tok.kind == "OP": + comp_op = self._advance().value + value = self._literal() + return Comparison(accessor=accessor, 
operator=comp_op, value=value) + return Comparison(accessor=accessor) + + def _has_expr(self) -> PredicateNode: + """Parse ``has("key")``.""" + self._advance() # consume "has" + self._expect("LPAREN") + key_tok = self._expect("STRING") + key = key_tok.value[1:-1].replace('\\"', '"').replace("\\\\", "\\") + self._expect("RPAREN") + return HasExpr(key=key) + + def _any_all(self) -> PredicateNode: + """Parse ``any(accessor; predicate)`` or ``all(accessor; predicate)``.""" + func_name = self._advance().value # "any" or "all" + self._expect("LPAREN") + accessor = self._accessor() + self._expect("SEMI") + predicate = self._or_expr() + self._expect("RPAREN") + if func_name == "any": + return AnyExpr(accessor=accessor, predicate=predicate) + return AllExpr(accessor=accessor, predicate=predicate) + + def _accessor(self) -> Accessor: + """Parse ``'.' IDENT ('.' IDENT)* ('[' INT ']')? ('|' BUILTIN)?``.""" + from hcl2.query.builtins import BUILTIN_NAMES + + parts: List[str] = [] + self._expect("DOT") + parts.append(self._expect("WORD").value) + + tok = self._peek() + while tok and tok.kind == "DOT": + self._advance() + parts.append(self._expect("WORD").value) + tok = self._peek() + + # Optional [N] index + index = None + tok = self._peek() + if tok and tok.kind == "LBRACKET": + self._advance() + num_tok = self._expect("NUMBER") + index = int(num_tok.value) + self._expect("RBRACKET") + + # Optional | builtin/function (e.g. 
``| length``, ``| contains("x")``, + # ``| not``) + builtin = None + builtin_arg = None + tok = self._peek() + if tok and tok.kind == "PIPE": + self._advance() + # Allow optional leading dot (jq-style ``| .length``) + dot_tok = self._peek() + if dot_tok and dot_tok.kind == "DOT": + self._advance() + word_tok = self._expect("WORD") + if word_tok.value in _STRING_FUNCTIONS: + builtin = word_tok.value + self._expect("LPAREN") + arg_tok = self._expect("STRING") + builtin_arg = ( + arg_tok.value[1:-1].replace('\\"', '"').replace("\\\\", "\\") + ) + self._expect("RPAREN") + elif word_tok.value == "not": + builtin = "not" + elif word_tok.value in BUILTIN_NAMES: + builtin = word_tok.value + else: + raise QuerySyntaxError( + f"Expected builtin or string function after |, " + f"got {word_tok.value!r}" + ) + + return Accessor( + parts=parts, index=index, builtin=builtin, builtin_arg=builtin_arg + ) + + def _literal(self) -> Any: # pylint: disable=too-many-return-statements + """Parse a literal value (string, number, boolean, or null).""" + tok = self._peek() + if tok is None: + raise QuerySyntaxError("Expected literal, got end-of-input") + + if tok.kind == "STRING": + self._advance() + return tok.value[1:-1].replace('\\"', '"').replace("\\\\", "\\") + + if tok.kind == "NUMBER": + self._advance() + if "." 
in tok.value: + return float(tok.value) + return int(tok.value) + + if tok.kind == "WORD": + if tok.value == "true": + self._advance() + return True + if tok.value == "false": + self._advance() + return False + if tok.value == "null": + self._advance() + return None + + raise QuerySyntaxError(f"Expected literal, got {tok.value!r}") + + +def parse_predicate(text: str) -> PredicateNode: + """Parse a predicate expression string into an AST.""" + tokens = tokenize(text) + if not tokens: + raise QuerySyntaxError("Empty predicate") + return _Parser(tokens).parse() + + +# --------------------------------------------------------------------------- +# Evaluator +# --------------------------------------------------------------------------- + + +def _resolve_accessor( # pylint: disable=too-many-return-statements + accessor: Accessor, target: Any +) -> Any: + """Resolve an accessor path against a target (typically a NodeView).""" + from hcl2.query._base import NodeView + from hcl2.query.blocks import BlockView + from hcl2.query.path import parse_path + from hcl2.query.resolver import resolve_path + + current = target + + for part in accessor.parts: + if current is None: + return None + + # Virtual ".type" accessor — returns short type name string + # Unwraps ExprTermRule so concrete inner type is reported. 
+ if part == "type" and isinstance(current, NodeView): + from hcl2.query._base import view_for, view_type_name + from hcl2.rules.expressions import ExprTermRule + + unwrapped = current + if ( + type(current).__name__ == "NodeView" + and isinstance(current._node, ExprTermRule) + and current._node.expression is not None + ): + unwrapped = view_for(current._node.expression) + current = view_type_name(unwrapped) + continue + + # Try Python property first + if isinstance(current, NodeView) and hasattr(type(current), part): + prop = getattr(type(current), part, None) + if isinstance(prop, property): + current = getattr(current, part) + continue + + # Try structural resolution + if isinstance(current, NodeView): + segments = parse_path(part) + resolved = resolve_path(current, segments) + # For BlockViews, if label matching fails, try the body directly + if not resolved and isinstance(current, BlockView): + resolved = resolve_path(current.body, segments) + if not resolved: + current = None + break + current = resolved[0] if len(resolved) == 1 else resolved + elif isinstance(current, dict): + current = current.get(part) + else: + current = None + break + + # Apply index + if accessor.index is not None: + if isinstance(current, (list, tuple)): + if 0 <= accessor.index < len(current): + current = current[accessor.index] + else: + return None + elif hasattr(current, "__getitem__"): + try: + current = current[accessor.index] + except (IndexError, KeyError): + return None + else: + return None + + # Apply builtin transform (e.g. 
``| length``, ``| contains("x")``, ``| not``) + # Note: postfix not and string functions must run even when current is None + if accessor.builtin is not None: + if accessor.builtin == "not": + return not (current is not None and current is not False and current != 0) + if accessor.builtin_arg is not None: + return _apply_string_function( + accessor.builtin, accessor.builtin_arg, current + ) + if current is not None: + current = _apply_accessor_builtin(accessor.builtin, current) + + return current + + +def _coerce_str(value: Any) -> str: + """Coerce a value to a string for string function matching.""" + from hcl2.query._base import NodeView + + if isinstance(value, NodeView): + dict_val = value.to_dict() + if isinstance(dict_val, str): + return dict_val + return str(dict_val) + if isinstance(value, str): + # Strip surrounding quotes from serialized HCL strings + if len(value) >= 2 and value[0] == '"' and value[-1] == '"': + return value[1:-1] + return value + if value is None: + return "" + return str(value) + + +def _apply_string_function(name: str, arg: str, current: Any) -> bool: + """Apply a string function (contains, test, startswith, endswith).""" + if current is None: + return False + coerced = _coerce_str(current) + if name == "contains": + return arg in coerced + if name == "startswith": + return coerced.startswith(arg) + if name == "endswith": + return coerced.endswith(arg) + if name == "test": + try: + return bool(re.search(arg, coerced)) + except re.error as exc: + raise QuerySyntaxError(f"Invalid regex in test(): {exc}") from exc + raise QuerySyntaxError(f"Unknown string function: {name!r}") + + +def _apply_accessor_builtin(name: str, value: Any) -> Any: + """Apply a builtin transform inside a predicate accessor.""" + from hcl2.query.builtins import apply_builtin + + results = apply_builtin(name, [value]) + if results: + return results[0] + return None + + +_KEYWORD_MAP = {"true": True, "false": False, "null": None} + + +def _to_comparable(value: Any) 
-> Any: + """Convert a NodeView to a comparable Python value.""" + from hcl2.query._base import NodeView + + if isinstance(value, NodeView): + value = value.to_dict() + # Coerce HCL keyword strings to Python types so that + # ``select(.x == true)`` matches the HCL keyword ``true``. + if isinstance(value, str) and value in _KEYWORD_MAP: + return _KEYWORD_MAP[value] + return value + + +_COMPARISON_OPS = { + "==": lambda a, b: a == b, + "!=": lambda a, b: a != b, + "<": lambda a, b: a < b, + ">": lambda a, b: a > b, + "<=": lambda a, b: a <= b, + ">=": lambda a, b: a >= b, +} + + +# pylint: disable-next=too-many-return-statements +def evaluate_predicate(pred: PredicateNode, target: Any) -> bool: + """Evaluate a predicate against a target (typically a NodeView).""" + if isinstance(pred, HasExpr): + return _evaluate_has(pred.key, target) + + if isinstance(pred, Comparison): + resolved = _resolve_accessor(pred.accessor, target) + if pred.operator is None: + # String functions and postfix not return bool directly + if isinstance(resolved, bool): + return resolved + # Existence / truthy check + return resolved is not None and resolved is not False and resolved != 0 + left = _to_comparable(resolved) + comp_fn = _COMPARISON_OPS.get(pred.operator) + if comp_fn is None: + raise QuerySyntaxError(f"Unknown operator: {pred.operator!r}") + return comp_fn(left, pred.value) + + if isinstance(pred, NotExpr): + return not evaluate_predicate(pred.child, target) + + if isinstance(pred, AndExpr): + return all(evaluate_predicate(c, target) for c in pred.children) + + if isinstance(pred, OrExpr): + return any(evaluate_predicate(c, target) for c in pred.children) + + if isinstance(pred, (AnyExpr, AllExpr)): + return _evaluate_any_all(pred, target) + + raise QuerySyntaxError(f"Unknown predicate node type: {type(pred).__name__}") + + +def _evaluate_has(key: str, target: Any) -> bool: + """Evaluate ``has("key")`` — check if a key exists on the target.""" + # Same as existence check for the 
given key + accessor = Accessor(parts=[key]) + resolved = _resolve_accessor(accessor, target) + return resolved is not None and resolved is not False and resolved != 0 + + +def _evaluate_any_all(pred: Union[AnyExpr, AllExpr], target: Any) -> bool: + """Evaluate ``any(accessor; predicate)`` or ``all(accessor; predicate)``.""" + resolved = _resolve_accessor(pred.accessor, target) + if resolved is None: + return isinstance(pred, AllExpr) # all() on empty is True, any() is False + + # Ensure we iterate over a list + if not isinstance(resolved, list): + resolved = [resolved] + + check = all if isinstance(pred, AllExpr) else any + return check(evaluate_predicate(pred.predicate, item) for item in resolved) diff --git a/hcl2/query/resolver.py b/hcl2/query/resolver.py new file mode 100644 index 00000000..46ec38a0 --- /dev/null +++ b/hcl2/query/resolver.py @@ -0,0 +1,350 @@ +"""Structural path resolver for the hq query language.""" + +from dataclasses import dataclass +from typing import TYPE_CHECKING, List, cast + +from hcl2 import walk as _walk_mod +from hcl2.query._base import NodeView +from hcl2.query.path import PathSegment + +if TYPE_CHECKING: + from hcl2.query.blocks import BlockView + + +@dataclass +class _ResolverState: + """Tracks position within multi-label blocks during resolution.""" + + node: NodeView + label_depth: int = 0 # how many block labels consumed so far + + +def resolve_path(root: NodeView, segments: List[PathSegment]) -> List[NodeView]: + """Resolve a structural path against a document view.""" + if not segments: + return [root] + + states = [_ResolverState(node=root)] + + for segment in segments: + next_states: List[_ResolverState] = [] + + if segment.recursive: + # Recursive descent: collect all descendants, then match + for state in states: + next_states.extend(_resolve_recursive(state, segment)) + else: + for state in states: + next_states.extend(_resolve_segment(state, segment)) + + states = next_states + if not states: + return [] + + return 
[s.node for s in states] + + +def _resolve_segment( # pylint: disable=too-many-return-statements + state: _ResolverState, segment: PathSegment +) -> List[_ResolverState]: + """Resolve a single segment against a state.""" + from hcl2.query.attributes import AttributeView + from hcl2.query.blocks import BlockView + from hcl2.query.body import BodyView, DocumentView + from hcl2.query.containers import ObjectView, TupleView + from hcl2.query.expressions import ConditionalView + from hcl2.query.functions import FunctionCallView + + node = state.node + + # DocumentView/BodyView: look up blocks and attributes by name + if isinstance(node, (DocumentView, BodyView)): + return _resolve_on_body(node, segment) + + # BlockView with unconsumed labels + if isinstance(node, BlockView) and state.label_depth < len(node.name_labels): + return _resolve_on_block_labels(node, segment, state.label_depth) + + # BlockView with labels consumed: delegate to body + if isinstance(node, BlockView): + return _resolve_on_body(node.body, segment) + + # AttributeView: unwrap to value_node + if isinstance(node, AttributeView): + value_view = node.value_node + return _resolve_segment(_ResolverState(node=value_view), segment) + + # ExprTermRule wrapper: unwrap to inner rule + if _is_expr_term(node): + inner = _unwrap_expr_term(node) + if inner is not None: + return _resolve_segment(_ResolverState(node=inner), segment) + return [] + + # ObjectView + if isinstance(node, ObjectView): + return _resolve_on_object(node, segment) + + # TupleView + if isinstance(node, TupleView): + return _resolve_on_tuple(node, segment) + + # FunctionCallView: resolve .args and .name + if isinstance(node, FunctionCallView): + return _resolve_on_function_call(node, segment) + + # ConditionalView: resolve .condition, .true_val, .false_val + if isinstance(node, ConditionalView): + return _resolve_on_conditional(node, segment) + + return [] + + +def _resolve_recursive( + state: _ResolverState, segment: PathSegment +) -> 
List[_ResolverState]: + """Recursive descent: try matching segment on the node and all descendants.""" + from hcl2.query._base import view_for + + results: List[_ResolverState] = [] + seen_ids: set = set() + + # Collect all descendant views to try matching against + candidates = [state] + for element in _walk_mod.walk_semantic(state.node._node): + wrapped = view_for(element) + candidates.append(_ResolverState(node=wrapped)) + + if segment.type_filter is not None: + # Type-qualified matching: match by type and name directly + results = _match_by_type_and_name(candidates, segment, seen_ids) + else: + non_recursive = PathSegment( + name=segment.name, + select_all=segment.select_all, + index=segment.index, + recursive=False, + predicate=segment.predicate, + type_filter=None, + ) + for candidate in candidates: + for match in _resolve_segment(candidate, non_recursive): + node_id = id(match.node._node) + if node_id not in seen_ids: + seen_ids.add(node_id) + results.append(match) + + return _apply_index_filter(results, segment) + + +def _match_by_type_and_name( + candidates: List[_ResolverState], segment: PathSegment, seen_ids: set +) -> List[_ResolverState]: + """Match candidates by type filter and name property.""" + from hcl2.query._base import view_type_name + + results: List[_ResolverState] = [] + for candidate in candidates: + node = candidate.node + type_name = view_type_name(node) + if type_name != segment.type_filter: + continue + + # Check name match + if segment.name == "*" or _node_matches_name(node, segment.name): + node_id = id(node._node) + if node_id not in seen_ids: + seen_ids.add(node_id) + results.append(candidate) + + return results + + +def _node_matches_name(node: NodeView, name: str) -> bool: + """Check if a node's name property matches the given name.""" + node_name = getattr(node, "name", None) + if node_name is not None: + return node_name == name + # BlockView: check block_type + block_type = getattr(node, "block_type", None) + if block_type is 
not None: + return block_type == name + return False + + +def _resolve_on_body(node: NodeView, segment: PathSegment) -> List[_ResolverState]: + """Resolve a segment on a DocumentView or BodyView.""" + from hcl2.query.body import BodyView, DocumentView + + # Get the actual body view for delegation + if isinstance(node, DocumentView): + body = node.body + elif isinstance(node, BodyView): + body = node + else: + return [] + + candidates: List[_ResolverState] = [] + + if segment.name == "*": + # Wildcard: all blocks and attributes + for blk in body.blocks(): + blk_view = cast("BlockView", blk) + depth = len(blk_view.name_labels) if segment.skip_labels else 0 + candidates.append(_ResolverState(node=blk, label_depth=depth)) + for attr in body.attributes(): + candidates.append(_ResolverState(node=attr)) + else: + # Match block types + for blk in body.blocks(segment.name): + blk_view = cast("BlockView", blk) + depth = len(blk_view.name_labels) if segment.skip_labels else 0 + candidates.append(_ResolverState(node=blk, label_depth=depth)) + # Match attribute names + for attr in body.attributes(segment.name): + candidates.append(_ResolverState(node=attr)) + + return _apply_index_filter(candidates, segment) + + +def _resolve_on_block_labels( + node: "NodeView", segment: PathSegment, label_depth: int +) -> List[_ResolverState]: + """Resolve a segment against unconsumed block labels.""" + from hcl2.query.blocks import BlockView + + # Type-qualified segments (e.g. 
tuple:*) never match labels + if segment.type_filter is not None: + return [] + + block: BlockView = node # type: ignore[assignment] + name_labels = block.name_labels + + if segment.name == "*": + # Wildcard matches any label + return [_ResolverState(node=block, label_depth=label_depth + 1)] + + if label_depth < len(name_labels) and name_labels[label_depth] == segment.name: + return [_ResolverState(node=block, label_depth=label_depth + 1)] + + return [] + + +def _resolve_on_object(node: "NodeView", segment: PathSegment) -> List[_ResolverState]: + """Resolve a segment on an ObjectView.""" + from hcl2.query.containers import ObjectView + + obj: ObjectView = node # type: ignore[assignment] + + if segment.name == "*": + candidates = [_ResolverState(node=v) for _, v in obj.entries] + return _apply_index_filter(candidates, segment) + + val = obj.get(segment.name) + if val is not None: + return [_ResolverState(node=val)] + return [] + + +def _resolve_on_tuple(node: "NodeView", segment: PathSegment) -> List[_ResolverState]: + """Resolve a segment on a TupleView.""" + from hcl2.query.containers import TupleView + + tup: TupleView = node # type: ignore[assignment] + + if segment.select_all: + return [_ResolverState(node=elem) for elem in tup.elements] + + if segment.index is not None: + try: + elem = tup[segment.index] + return [_ResolverState(node=elem)] + except IndexError: + return [] + + return [] + + +def _resolve_on_function_call( + node: "NodeView", segment: PathSegment +) -> List[_ResolverState]: + """Resolve a segment on a FunctionCallView.""" + from hcl2.query.functions import FunctionCallView + + func: FunctionCallView = node # type: ignore[assignment] + + if segment.name == "args": + args = func.args + candidates = [_ResolverState(node=arg) for arg in args] + return _apply_index_filter(candidates, segment) + + return [] + + +def _resolve_on_conditional( + node: "NodeView", segment: PathSegment +) -> List[_ResolverState]: + """Resolve a segment on a 
ConditionalView.""" + from hcl2.query.expressions import ConditionalView + + cond: ConditionalView = node # type: ignore[assignment] + + if segment.name == "condition": + return [_ResolverState(node=cond.condition)] + if segment.name == "true_val": + return [_ResolverState(node=cond.true_val)] + if segment.name == "false_val": + return [_ResolverState(node=cond.false_val)] + + return [] + + +def _apply_index_filter( + candidates: List[_ResolverState], segment: PathSegment +) -> List[_ResolverState]: + """Apply type filter, predicate filter, and [*]/[N] index to candidates.""" + # Apply type filter if present + if segment.type_filter is not None: + from hcl2.query._base import view_type_name + + candidates = [ + c for c in candidates if view_type_name(c.node) == segment.type_filter + ] + + # Apply predicate filter if present + if segment.predicate is not None: + from hcl2.query.predicate import evaluate_predicate + + pred = segment.predicate + candidates = [ + c + for c in candidates + if evaluate_predicate(pred, c.node) # type: ignore[arg-type] + ] + + if segment.select_all: + return candidates + if segment.index is not None: + if 0 <= segment.index < len(candidates): + return [candidates[segment.index]] + return [] + return candidates + + +def _is_expr_term(node: NodeView) -> bool: + """Check if a node wraps an ExprTermRule.""" + from hcl2.rules.expressions import ExprTermRule + + return isinstance(node._node, ExprTermRule) + + +def _unwrap_expr_term(node: NodeView): + """Unwrap ExprTermRule to a view over its inner rule.""" + from hcl2.query._base import view_for + from hcl2.rules.expressions import ExprTermRule + + expr_term: ExprTermRule = node._node # type: ignore[assignment] + inner = expr_term.expression + if inner is not None: + return view_for(inner) + return None diff --git a/hcl2/query/safe_eval.py b/hcl2/query/safe_eval.py new file mode 100644 index 00000000..12f277d9 --- /dev/null +++ b/hcl2/query/safe_eval.py @@ -0,0 +1,165 @@ +"""AST-validated 
restricted eval for the hq query language.""" + +import ast +from typing import Any, Dict + + +class UnsafeExpressionError(Exception): + """Raised when an expression contains disallowed constructs.""" + + +_ALLOWED_NODES = { + # Expression wrapper + ast.Expression, + # Core access patterns + ast.Attribute, + ast.Subscript, + ast.Call, + ast.Name, + ast.Constant, + ast.Starred, + # Index/slice + ast.Slice, + # Literal collections (as arguments) + ast.List, + ast.Tuple, + # Lambdas (for find_by_predicate, sorted key=, etc.) + ast.Lambda, + ast.arguments, + ast.arg, + # Keyword args + ast.keyword, + # Comparisons and boolean ops + ast.Compare, + ast.BoolOp, + ast.UnaryOp, + ast.BinOp, + ast.Eq, + ast.NotEq, + ast.Lt, + ast.Gt, + ast.LtE, + ast.GtE, + ast.Is, + ast.IsNot, + ast.In, + ast.NotIn, + ast.And, + ast.Or, + ast.Not, + ast.Add, + ast.Sub, + ast.Mult, + ast.Div, + ast.Mod, + ast.FloorDiv, + ast.USub, + ast.UAdd, + # Context + ast.Load, +} + +# Python 3.8 wraps subscript slices in ast.Index / ast.ExtSlice; +# the parser stopped emitting them in 3.9, where they are deprecated. +if hasattr(ast, "Index"): + _ALLOWED_NODES.add(ast.Index) +if hasattr(ast, "ExtSlice"): + _ALLOWED_NODES.add(ast.ExtSlice) + +_SAFE_CALLABLE_NAMES = frozenset( + { + "len", + "str", + "int", + "float", + "bool", + "list", + "tuple", + "type", + "isinstance", + "sorted", + "reversed", + "enumerate", + "zip", + "range", + "min", + "max", + "print", + "any", + "all", + "filter", + "map", + } +) + +_SAFE_BUILTINS = { + name: ( + __builtins__[name] # type: ignore[index] + if isinstance(__builtins__, dict) + else getattr(__builtins__, name) + ) + for name in _SAFE_CALLABLE_NAMES +} +_SAFE_BUILTINS.update({"True": True, "False": False, "None": None}) + +_MAX_AST_DEPTH = 20 +_MAX_NODE_COUNT = 200 + + +def validate_expression(expr_str: str) -> ast.Expression: + """Parse and validate a Python expression. 
Raises UnsafeExpressionError on violations.""" + try: + tree = ast.parse(expr_str, mode="eval") + except SyntaxError as exc: + raise UnsafeExpressionError(f"Syntax error: {exc}") from exc + + node_count = 0 + + def _validate(node, depth=0): + nonlocal node_count + node_count += 1 + + if depth > _MAX_AST_DEPTH: + raise UnsafeExpressionError("Expression exceeds maximum depth") + if node_count > _MAX_NODE_COUNT: + raise UnsafeExpressionError("Expression exceeds maximum node count") + + if type(node) not in _ALLOWED_NODES: + raise UnsafeExpressionError(f"{type(node).__name__} is not allowed") + + # Block dunder attribute access (prevents sandbox escapes via + # __class__, __subclasses__, __globals__, etc.) + if isinstance(node, ast.Attribute) and node.attr.startswith("__"): + raise UnsafeExpressionError( + f"Access to dunder attribute {node.attr!r} is not allowed" + ) + + # Validate Call nodes + if isinstance(node, ast.Call): + func = node.func + # Allow method calls (attr access) + if isinstance(func, ast.Attribute): + pass + # Allow safe built-in names + elif isinstance(func, ast.Name): + if func.id not in _SAFE_CALLABLE_NAMES: + raise UnsafeExpressionError(f"Calling {func.id!r} is not allowed") + else: + raise UnsafeExpressionError( + "Only method calls and safe built-in calls are allowed" + ) + + for child in ast.iter_child_nodes(node): + _validate(child, depth + 1) + + _validate(tree) + return tree + + +def safe_eval(expr_str: str, variables: Dict[str, Any]) -> Any: + """Validate, compile, and eval with restricted namespace.""" + tree = validate_expression(expr_str) + code = compile(tree, "", "eval") + namespace = dict(_SAFE_BUILTINS) + namespace.update(variables) + return eval(code, {"__builtins__": {}}, namespace) # pylint: disable=eval-used diff --git a/hcl2/reconstructor.py b/hcl2/reconstructor.py index 7f957d7b..3f968627 100644 --- a/hcl2/reconstructor.py +++ b/hcl2/reconstructor.py @@ -1,734 +1,368 @@ -"""A reconstructor for HCL2 implemented using Lark's 
experimental reconstruction functionality""" - -import re -from typing import List, Dict, Callable, Optional, Union, Any, Tuple - -from lark import Lark, Tree -from lark.grammar import Terminal, Symbol -from lark.lexer import Token, PatternStr, TerminalDef -from lark.reconstruct import Reconstructor -from lark.tree_matcher import is_discarded_terminal -from lark.visitors import Transformer_InPlace -from regex import regex - -from hcl2.const import START_LINE_KEY, END_LINE_KEY -from hcl2.parser import reconstruction_parser - - -# function to remove the backslashes within interpolated portions -def reverse_quotes_within_interpolation(interp_s: str) -> str: - """ - A common operation is to `json.dumps(s)` where s is a string to output in - HCL. This is useful for automatically escaping any quotes within the - string, but this escapes quotes within interpolation incorrectly. This - method removes any erroneous escapes within interpolated segments of a - string. - """ - return re.sub(r"\$\{(.*)}", lambda m: m.group(0).replace('\\"', '"'), interp_s) - - -class WriteTokensAndMetaTransformer(Transformer_InPlace): - """ - Inserts discarded tokens into their correct place, according to the rules - of grammar, and annotates with metadata during reassembly. The metadata - tracked here include the terminal which generated a particular string - output, and the rule that that terminal was matched on. - - This is a modification of lark.reconstruct.WriteTokensTransformer - """ - - tokens: Dict[str, TerminalDef] - term_subs: Dict[str, Callable[[Symbol], str]] - - def __init__( - self, - tokens: Dict[str, TerminalDef], - term_subs: Dict[str, Callable[[Symbol], str]], - ) -> None: - super().__init__() - self.tokens = tokens - self.term_subs = term_subs - - def __default__(self, data, children, meta): - """ - This method is called for every token the transformer visits. 
- """ - - if not getattr(meta, "match_tree", False): - return Tree(data, children) - iter_args = iter( - [child[2] if isinstance(child, tuple) else child for child in children] - ) - to_write = [] - for sym in meta.orig_expansion: - if is_discarded_terminal(sym): - try: - value = self.term_subs[sym.name](sym) - except KeyError as exc: - token = self.tokens[sym.name] - if not isinstance(token.pattern, PatternStr): - raise NotImplementedError( - f"Reconstructing regexps not supported yet: {token}" - ) from exc - - value = token.pattern.value - - # annotate the leaf with the specific rule (data) and terminal - # (sym) it was generated from - to_write.append((data, sym, value)) - else: - item = next(iter_args) - if isinstance(item, list): - to_write += item - else: - if isinstance(item, Token): - # annotate the leaf with the specific rule (data) and - # terminal (sym) it was generated from - to_write.append((data, sym, item)) - else: - to_write.append(item) - - return to_write - - -class HCLReconstructor(Reconstructor): +"""Reconstruct HCL2 text from a Lark Tree AST.""" + +from typing import List, Optional, Union + +from lark import Tree, Token +from hcl2.rules import tokens +from hcl2.rules.base import BlockRule +from hcl2.rules.containers import ObjectElemRule +from hcl2.rules.directives import ( + TemplateIfRule, + TemplateForRule, + TemplateIfStartRule, + TemplateElseRule, + TemplateEndifRule, + TemplateForStartRule, + TemplateEndforRule, +) +from hcl2.rules.for_expressions import ForIntroRule, ForTupleExprRule, ForObjectExprRule +from hcl2.rules.literal_rules import IdentifierRule +from hcl2.rules.strings import StringRule +from hcl2.rules.expressions import ( + ExprTermRule, + ConditionalRule, + UnaryOpRule, +) + + +class HCLReconstructor: """This class converts a Lark.Tree AST back into a string representing the underlying HCL code.""" - def __init__( - self, - parser: Lark, - term_subs: Optional[Dict[str, Callable[[Symbol], str]]] = None, - ): - 
Reconstructor.__init__(self, parser, term_subs) - - self.write_tokens: WriteTokensAndMetaTransformer = ( - WriteTokensAndMetaTransformer( - {token.name: token for token in self.tokens}, term_subs or {} - ) - ) - - # these variables track state during reconstruction to enable us to make - # informed decisions about formatting output. They are primarily used - # by the _should_add_space(...) method. - self._last_char_space = True - self._last_terminal: Union[Terminal, None] = None - self._last_rule: Union[Tree, Token, None] = None - self._deferred_item = None - - def should_be_wrapped_in_spaces(self, terminal: Terminal) -> bool: - """Whether given terminal should be wrapped in spaces""" - return terminal.name in { - "IF", - "IN", - "FOR", - "FOR_EACH", - "FOR_OBJECT_ARROW", - "COLON", - "QMARK", - "BINARY_OP", - } - - def _is_equals_sign(self, terminal) -> bool: - return ( - isinstance(self._last_rule, Token) - and self._last_rule.value in ("attribute", "object_elem") - and self._last_terminal == Terminal("EQ") - and terminal != Terminal("NL_OR_COMMENT") - ) - - # pylint: disable=too-many-branches, too-many-return-statements - def _should_add_space(self, rule, current_terminal, is_block_label: bool = False): - """ - This method documents the situations in which we add space around - certain tokens while reconstructing the generated HCL. - - Additional rules can be added here if the generated HCL has - improper whitespace (affecting parse OR affecting ability to perfectly - reconstruct a file down to the whitespace level.) 
- - It has the following information available to make its decision: - - - the last token (terminal) we output - - the last rule that token belonged to - - the current token (terminal) we're about to output - - the rule the current token belongs to + _binary_op_types = { + "DOUBLE_EQ", + "NEQ", + "LT", + "GT", + "LEQ", + "GEQ", + "MINUS", + "ASTERISK", + "SLASH", + "PERCENT", + "DOUBLE_AMP", + "DOUBLE_PIPE", + "PLUS", + } + + _directive_rule_names = { + TemplateIfStartRule.lark_name(), + TemplateElseRule.lark_name(), + TemplateEndifRule.lark_name(), + TemplateForStartRule.lark_name(), + TemplateEndforRule.lark_name(), + TemplateIfRule.lark_name(), + TemplateForRule.lark_name(), + } - This should be sufficient to make a spacing decision. - """ - - # we don't need to add multiple spaces - if self._last_char_space: + def __init__(self): + self._last_was_space = True + self._current_indent = 0 + self._last_token_name: Optional[str] = None + self._last_rule_name: Optional[str] = None + + def _reset_state(self): + """Reset state tracking for formatting decisions.""" + self._last_was_space = True + self._current_indent = 0 + self._last_token_name = None + self._last_rule_name = None + + # pylint:disable=R0911,R0912 + def _should_add_space_before( + self, current_node: Union[Tree, Token], parent_rule_name: Optional[str] = None + ) -> bool: + """Determine if we should add a space before the current token/rule.""" + + # Don't add space if we already have one + if self._last_was_space: return False - # we don't add a space at the start of the file - if not self._last_terminal or not self._last_rule: + # Don't add space at the beginning + if self._last_token_name is None: return False - if self._is_equals_sign(current_terminal): - return True + if isinstance(current_node, Token): + token_type = current_node.type - if is_block_label and isinstance(rule, Token) and rule.value == "string": + # Space before '{' in blocks if ( - current_terminal == self._last_terminal == 
Terminal("DBLQUOTE") - or current_terminal == Terminal("DBLQUOTE") - and self._last_terminal == Terminal("NAME") + token_type == tokens.LBRACE.lark_name() + and parent_rule_name == BlockRule.lark_name() ): return True - # if we're in a ternary or binary operator, add space around the operator - if ( - isinstance(rule, Token) - and rule.value - in [ - "conditional", - "binary_operator", - ] - and self.should_be_wrapped_in_spaces(current_terminal) - ): - return True - - # if we just left a ternary or binary operator, add space around the - # operator unless there's a newline already - if ( - isinstance(self._last_rule, Token) - and self._last_rule.value - in [ - "conditional", - "binary_operator", - ] - and self.should_be_wrapped_in_spaces(self._last_terminal) - and current_terminal != Terminal("NL_OR_COMMENT") - ): - return True - - # if we're in a for or if statement and find a keyword, add a space - if ( - isinstance(rule, Token) - and rule.value - in [ - "for_object_expr", - "for_cond", - "for_intro", - ] - and self.should_be_wrapped_in_spaces(current_terminal) - ): - return True - - # if we've just left a for or if statement and find a keyword, add a - # space, unless we have a newline - if ( - isinstance(self._last_rule, Token) - and self._last_rule.value - in [ - "for_object_expr", - "for_cond", - "for_intro", - ] - and self.should_be_wrapped_in_spaces(self._last_terminal) - and current_terminal != Terminal("NL_OR_COMMENT") - ): - return True - - # if we're in a block - if (isinstance(rule, Token) and rule.value == "block") or ( - isinstance(rule, str) and re.match(r"^__block_(star|plus)_.*", rule) - ): - # always add space before the starting brace - if current_terminal == Terminal("LBRACE"): + # Space around Conditional Expression operators + if parent_rule_name == ConditionalRule.lark_name() and ( + token_type in [tokens.COLON.lark_name(), tokens.QMARK.lark_name()] + or self._last_token_name + in [tokens.COLON.lark_name(), tokens.QMARK.lark_name()] + ): + # 
COLON may already carry leading whitespace from the grammar + if token_type == tokens.COLON.lark_name() and str( + current_node + ).startswith((" ", "\t")): + return False return True - # always add space before the closing brace - if current_terminal == Terminal( - "RBRACE" - ) and self._last_terminal != Terminal("LBRACE"): + # Space before colon in for_intro + if ( + parent_rule_name == ForIntroRule.lark_name() + and token_type == tokens.COLON.lark_name() + ): + if str(current_node).startswith((" ", "\t")): + return False return True - # always add space between string literals - if current_terminal == Terminal("STRING_CHARS"): + # Space after commas in tuples and function arguments... + if self._last_token_name == tokens.COMMA.lark_name(): + # ... except before closing brackets or newlines + if token_type in (tokens.RSQB.lark_name(), "NL_OR_COMMENT"): + return False return True - # if we just opened a block, add a space, unless the block is empty - # or has a newline - if ( - isinstance(self._last_rule, Token) - and self._last_rule.value == "block" - and self._last_terminal == Terminal("LBRACE") - and current_terminal not in [Terminal("RBRACE"), Terminal("NL_OR_COMMENT")] - ): - return True - - # if we're in a tuple or function arguments (this rule matches commas between items) - if isinstance(self._last_rule, str) and re.match( - r"^__(tuple|arguments)_(star|plus)_.*", self._last_rule - ): - - # string literals, decimals, and identifiers should always be - # preceded by a space if they're following a comma in a tuple or - # function arg - if current_terminal in [ - Terminal("DBLQUOTE"), - Terminal("DECIMAL"), - Terminal("NAME"), - Terminal("NEGATIVE_DECIMAL"), + # Template directive spacing: %{~ keyword ~} patterns + if parent_rule_name in self._directive_rule_names: + # Space after DIRECTIVE_START (before keyword or strip marker) + if self._last_token_name == tokens.DIRECTIVE_START.lark_name(): + # No space before strip marker + if token_type == 
tokens.STRIP_MARKER.lark_name(): + return False + return True + # Space after STRIP_MARKER (before keyword) + if self._last_token_name == tokens.STRIP_MARKER.lark_name(): + # After strip marker: space before keyword, no space before RBRACE + if token_type == tokens.RBRACE.lark_name(): + return False + return True + # Space after keywords + if self._last_token_name in [ + tokens.FOR.lark_name(), + tokens.IN.lark_name(), + tokens.IF.lark_name(), + ]: + return True + # Space before IN keyword (after identifier) + if token_type == tokens.IN.lark_name(): + return True + # Space before STRIP_MARKER (before closing }) + if token_type == tokens.STRIP_MARKER.lark_name(): + return True + # Space before RBRACE (closing directive, no strip marker) + if token_type == tokens.RBRACE.lark_name(): + return True + # Space after COMMA in for directives + if self._last_token_name == tokens.COMMA.lark_name(): + return True + return False + + if token_type in [ + tokens.FOR.lark_name(), + tokens.IN.lark_name(), + tokens.IF.lark_name(), + tokens.ELLIPSIS.lark_name(), ]: return True - # the catch-all case, we're not sure, so don't add a space - return False - - def _reconstruct(self, tree, is_block_label=False): - unreduced_tree = self.match_tree(tree, tree.data) - res = self.write_tokens.transform(unreduced_tree) - for item in res: - # any time we encounter a child tree, we recurse - if isinstance(item, Tree): - yield from self._reconstruct( - item, (unreduced_tree.data == "block" and item.data != "body") - ) - - # every leaf should be a tuple, which contains information about - # which terminal the leaf represents - elif isinstance(item, tuple): - rule, terminal, value = item - - # first, handle any deferred items - if self._deferred_item is not None: - ( - deferred_rule, - deferred_terminal, - deferred_value, - ) = self._deferred_item - - # if we deferred a comma and the next character ends a - # parenthesis or block, we can throw it out - if deferred_terminal == Terminal("COMMA") and 
terminal in [ - Terminal("RPAR"), - Terminal("RBRACE"), - ]: - pass - # in any other case, we print the deferred item - else: - yield deferred_value - - # and do our bookkeeping - self._last_terminal = deferred_terminal - self._last_rule = deferred_rule - if deferred_value and not deferred_value[-1].isspace(): - self._last_char_space = False - - # clear the deferred item - self._deferred_item = None - - # potentially add a space before the next token - if self._should_add_space(rule, terminal, is_block_label): - yield " " - self._last_char_space = True - - # potentially defer the item if needed - if terminal in [Terminal("COMMA")]: - self._deferred_item = item - else: - # otherwise print the next token - yield value - - # and do our bookkeeping so we can make an informed - # decision about formatting next time - self._last_terminal = terminal - self._last_rule = rule - if value: - self._last_char_space = value[-1].isspace() - - else: - raise RuntimeError(f"Unknown bare token type: {item}") - - def reconstruct(self, tree, postproc=None, insert_spaces=False): - """Convert a Lark.Tree AST back into a string representation of HCL.""" - return Reconstructor.reconstruct( - self, - tree, - postproc, - insert_spaces, - ) - - -class HCLReverseTransformer: - """ - The reverse of hcl2.transformer.DictTransformer. This method attempts to - convert a dict back into a working AST, which can be written back out. 
- """ - - @staticmethod - def _name_to_identifier(name: str) -> Tree: - """Converts a string to a NAME token within an identifier rule.""" - return Tree(Token("RULE", "identifier"), [Token("NAME", name)]) - - @staticmethod - def _escape_interpolated_str(interp_s: str) -> str: - if interp_s.strip().startswith("<<-") or interp_s.strip().startswith("<<"): - # For heredoc strings, preserve their format exactly - return reverse_quotes_within_interpolation(interp_s) - # Escape backslashes first (very important to do this first) - escaped = interp_s.replace("\\", "\\\\") - # Escape quotes - escaped = escaped.replace('"', '\\"') - # Escape control characters - escaped = escaped.replace("\n", "\\n") - escaped = escaped.replace("\r", "\\r") - escaped = escaped.replace("\t", "\\t") - escaped = escaped.replace("\b", "\\b") - escaped = escaped.replace("\f", "\\f") - # find each interpolation within the string and remove the backslashes - interp_s = reverse_quotes_within_interpolation(f"{escaped}") - return interp_s - - @staticmethod - def _block_has_label(block: dict) -> bool: - return len(block.keys()) == 1 - - def __init__(self): - pass + if ( + self._last_token_name + in [ + tokens.FOR.lark_name(), + tokens.IN.lark_name(), + tokens.IF.lark_name(), + ] + and token_type != "NL_OR_COMMENT" + ): + return True - def transform(self, hcl_dict: dict) -> Tree: - """Given a dict, return a Lark.Tree representing the HCL AST.""" - level = 0 - body = self._transform_dict_to_body(hcl_dict, level) - start = Tree(Token("RULE", "start"), [body]) - return start + # Space around for_object arrow + if tokens.FOR_OBJECT_ARROW.lark_name() in [ + token_type, + self._last_token_name, + ]: + return True - @staticmethod - def _is_string_wrapped_tf(interp_s: str) -> bool: - """ - Determines whether a string is a complex HCL data structure - wrapped in ${ interpolation } characters. 
- """ - if not interp_s.startswith("${") or not interp_s.endswith("}"): - return False + # Space after ellipsis in function arguments + # ... except before newlines which provide their own whitespace + if self._last_token_name == tokens.ELLIPSIS.lark_name(): + if token_type == "NL_OR_COMMENT": + return False + return True - nested_tokens = [] - for match in re.finditer(r"\$?\{|}", interp_s): - if match.group(0) in ["${", "{"]: - nested_tokens.append(match.group(0)) - elif match.group(0) == "}": - nested_tokens.pop() - - # if we exit ${ interpolation } before the end of the string, - # this interpolated string has string parts and can't represent - # a valid HCL expression on its own (without quotes) - if len(nested_tokens) == 0 and match.end() != len(interp_s): + # Space around EQ and COLON separators in attributes/object elements. + # Both terminals may carry leading whitespace from the original + # source (e.g. " =" for aligned attributes, " :" for object + # elements). Skip the automatic space when the token already + # provides it. COLON only gets space if it already has leading + # whitespace (unlike EQ which always gets at least one space). 
+ if token_type == tokens.EQ.lark_name(): + if str(current_node).startswith((" ", "\t")): + return False + return True + if token_type == tokens.COLON.lark_name(): return False + if self._last_token_name == tokens.EQ.lark_name(): + # Don't add space before newlines which provide their own whitespace + if token_type == "NL_OR_COMMENT": + return False + return True - return True - - @classmethod - def _unwrap_interpolation(cls, value: str) -> str: - if cls._is_string_wrapped_tf(value): - return value[2:-1] - return value - - def _newline(self, level: int, count: int = 1) -> Tree: - return Tree( - Token("RULE", "new_line_or_comment"), - [Token("NL_OR_COMMENT", f"\n{' ' * level}") for _ in range(count)], - ) - - def _build_string_rule(self, string: str, level: int = 0) -> Tree: - # grammar in hcl2.lark defines that a string is built of any number of string parts, - # each string part can be either interpolation expression, escaped interpolation string - # or regular string - # this method build hcl2 string rule based on arbitrary string, - # splitting such string into individual parts and building a lark tree out of them - # - result = [] + # Don't add space around operator tokens inside unary_op + if parent_rule_name == UnaryOpRule.lark_name(): + return False - pattern = regex.compile(r"(\${1,2}\{(?:[^{}]|(?R))*\})") - parts = [part for part in pattern.split(string) if part != ""] - # e.g. 
'aaa$${bbb}ccc${"ddd-${eee}"}' -> ['aaa', '$${bbb}', 'ccc', '${"ddd-${eee}"}'] - # 'aa-${"bb-${"cc-${"dd-${5 + 5}"}"}"}' -> ['aa-', '${"bb-${"cc-${"dd-${5 + 5}"}"}"}'] - - for part in parts: - if part.startswith("$${") and part.endswith("}"): - result.append(Token("ESCAPED_INTERPOLATION", part)) - - # unwrap interpolation expression and recurse into it - elif part.startswith("${") and part.endswith("}"): - part = part[2:-1] - if part.startswith('"') and part.endswith('"'): - part = part[1:-1] - part = self._transform_value_to_expr_term(part, level) - else: - part = Tree( - Token("RULE", "expr_term"), - [Tree(Token("RULE", "identifier"), [Token("NAME", part)])], - ) - - result.append(Tree(Token("RULE", "interpolation"), [part])) - - else: - result.append(Token("STRING_CHARS", part)) - - result = [Tree(Token("RULE", "string_part"), [element]) for element in result] - return Tree(Token("RULE", "string"), result) - - def _is_block(self, value: Any) -> bool: - if isinstance(value, dict): - block_body = value - if START_LINE_KEY in block_body.keys() or END_LINE_KEY in block_body.keys(): + if ( + token_type in self._binary_op_types + or self._last_token_name in self._binary_op_types + ): return True - try: - # if block is labeled, actual body might be nested - # pylint: disable=W0612 - block_label, block_body = next(iter(value.items())) - except StopIteration: - # no more potential labels = nothing more to check - return False + elif isinstance(current_node, Tree): + rule_name = current_node.data - return self._is_block(block_body) + # Space after binary operator tokens before a tree node (e.g. 
&& !foo) + if self._last_token_name in self._binary_op_types: + return True - if isinstance(value, list): - if len(value) > 0: - return self._is_block(value[0]) + if parent_rule_name == BlockRule.lark_name(): + # Add space between multiple string/identifier labels in blocks + if rule_name in [ + StringRule.lark_name(), + IdentifierRule.lark_name(), + ] and self._last_rule_name in [ + StringRule.lark_name(), + IdentifierRule.lark_name(), + ]: + return True + + # Space after QMARK/COLON in conditional expressions + if ( + parent_rule_name == ConditionalRule.lark_name() + and self._last_token_name + in [tokens.COLON.lark_name(), tokens.QMARK.lark_name()] + ): + return True + + # Space after colon in for expressions and object elements + # (before value expression, but not before newline/comment + # which provides its own whitespace) + if ( + self._last_token_name == tokens.COLON.lark_name() + and parent_rule_name + in [ + ForTupleExprRule.lark_name(), + ForObjectExprRule.lark_name(), + ObjectElemRule.lark_name(), + ] + and rule_name != "new_line_or_comment" + ): + return True return False - def _calculate_block_labels(self, block: dict) -> Tuple[List[str], dict]: - # if block doesn't have a label - if len(block.keys()) != 1: - return [], block - - # otherwise, find the label - curr_label = list(block)[0] - potential_body = block[curr_label] - - # __start_line__ and __end_line__ metadata are not labels - if ( - START_LINE_KEY in potential_body.keys() - or END_LINE_KEY in potential_body.keys() - ): - return [curr_label], potential_body - - # recurse and append the label - next_label, block_body = self._calculate_block_labels(potential_body) - return [curr_label] + next_label, block_body - - # pylint:disable=R0914 - def _transform_dict_to_body(self, hcl_dict: dict, level: int) -> Tree: - # we add a newline at the top of a body within a block, not the root body - # >2 here is to ignore the __start_line__ and __end_line__ metadata - if level > 0 and len(hcl_dict) > 2: - 
children = [self._newline(level)] + def _reconstruct_tree( + self, tree: Tree, parent_rule_name: Optional[str] = None + ) -> List[str]: + """Recursively reconstruct a Tree node into HCL text fragments.""" + result = [] + rule_name = tree.data + + # Check spacing BEFORE processing children, while _last_rule_name + # still reflects the previous sibling (not a child of this tree). + needs_space = self._should_add_space_before(tree, parent_rule_name) + if needs_space: + # A space will be inserted before this tree's output, so tell + # children that the last character was a space to prevent the + # first child from adding a duplicate leading space. + self._last_was_space = True + + if rule_name == UnaryOpRule.lark_name(): + for i, child in enumerate(tree.children): + result.extend(self._reconstruct_node(child, rule_name)) + if i == 0: + # Suppress space between unary operator and its operand + self._last_was_space = True + + elif rule_name == ExprTermRule.lark_name(): + for child in tree.children: + result.extend(self._reconstruct_node(child, rule_name)) + else: - children = [] - - # iterate through each attribute or sub-block of this block - for key, value in hcl_dict.items(): - if key in [START_LINE_KEY, END_LINE_KEY]: - continue - - # construct the identifier, whether that be a block type name or an attribute key - identifier_name = self._name_to_identifier(key) - - # first, check whether the value is a "block" - if self._is_block(value): - for block_v in value: - block_labels, block_body_dict = self._calculate_block_labels( - block_v - ) - block_label_trees = [ - self._build_string_rule(block_label, level) - for block_label in block_labels - ] - block_body = self._transform_dict_to_body( - block_body_dict, level + 1 - ) - - # create our actual block to add to our own body - block = Tree( - Token("RULE", "block"), - [identifier_name] + block_label_trees + [block_body], - ) - children.append(block) - # add empty line after block - new_line = self._newline(level - 1) - 
# add empty line with indentation for next element in the block - new_line.children.append(self._newline(level).children[0]) - - children.append(new_line) - - # if the value isn't a block, it's an attribute - else: - expr_term = self._transform_value_to_expr_term(value, level) - attribute = Tree( - Token("RULE", "attribute"), - [identifier_name, Token("EQ", " ="), expr_term], - ) - children.append(attribute) - children.append(self._newline(level)) - - # since we're leaving a block body here, reduce the indentation of the - # final newline if it exists - if ( - len(children) > 0 - and isinstance(children[-1], Tree) - and children[-1].data.type == "RULE" - and children[-1].data.value == "new_line_or_comment" - ): - children[-1] = self._newline(level - 1) - - return Tree(Token("RULE", "body"), children) - - # pylint: disable=too-many-branches, too-many-return-statements too-many-statements - def _transform_value_to_expr_term(self, value, level) -> Union[Token, Tree]: - """Transforms a value from a dictionary into an "expr_term" (a value in HCL2) - - Anything passed to this function is treated "naively". Any lists passed - are assumed to be tuples, and any dicts passed are assumed to be objects. - No more checks will be performed for either to see if they are "blocks" - as this check happens in `_transform_dict_to_body`. 
+ for child in tree.children: + result.extend(self._reconstruct_node(child, rule_name)) + + if needs_space: + result.insert(0, " ") + + # Update state tracking + self._last_rule_name = rule_name + if result: + self._last_was_space = result[-1].endswith(" ") or result[-1].endswith("\n") + + return result + + def _reconstruct_token( + self, token: Token, parent_rule_name: Optional[str] = None + ) -> str: + """Reconstruct a Token node into HCL text fragments.""" + result = str(token.value) + if self._should_add_space_before(token, parent_rule_name): + result = " " + result + + self._last_token_name = token.type + if len(token) != 0: + self._last_was_space = result[-1].endswith(" ") or result[-1].endswith("\n") + + return result + + def _reconstruct_node( + self, node: Union[Tree, Token], parent_rule_name: Optional[str] = None + ) -> List[str]: + """Reconstruct any node (Tree or Token) into HCL text fragments.""" + if isinstance(node, Tree): + return self._reconstruct_tree(node, parent_rule_name) + if isinstance(node, Token): + return [self._reconstruct_token(node, parent_rule_name)] + # Fallback: convert to string + return [str(node)] + + def reconstruct_fragment(self, tree) -> str: + """Reconstruct a subtree without trailing-newline normalization. + + Suitable for rendering individual nodes (blocks, attributes, etc.) + rather than full documents. 
""" + from hcl2.rules.abstract import LarkRule + + self._reset_state() + if isinstance(tree, LarkRule): + tree = tree.to_lark() + fragments = self._reconstruct_node(tree) + return "".join(fragments) + + def reconstruct(self, tree: Tree, postproc=None) -> str: + """Convert a Lark.Tree AST back into a string representation of HCL.""" + # Reset state + self._reset_state() + + # Reconstruct the tree + fragments = self._reconstruct_node(tree) + + # Join fragments and apply post-processing + result = "".join(fragments) + + if postproc: + result = postproc(result) + + # The grammar's body rule ends with an optional new_line_or_comment + # which captures the final newline. The parser often produces two + # NL_OR_COMMENT tokens for a single trailing newline (the statement + # separator plus the EOF newline), resulting in a spurious blank line. + # Strip exactly one trailing newline when there are two or more. + if result.endswith("\n\n"): + result = result[:-1] + + # Ensure file ends with newline + if result and not result.endswith("\n"): + result += "\n" - # for lists, recursively turn the child elements into expr_terms and - # store within a tuple - if isinstance(value, list): - tuple_tree = Tree( - Token("RULE", "tuple"), - [ - self._transform_value_to_expr_term(tuple_v, level) - for tuple_v in value - ], - ) - return Tree(Token("RULE", "expr_term"), [tuple_tree]) - - if value is None: - return Tree( - Token("RULE", "expr_term"), - [Tree(Token("RULE", "identifier"), [Token("NAME", "null")])], - ) - - # for dicts, recursively turn the child k/v pairs into object elements - # and store within an object - if isinstance(value, dict): - elements = [] - - # if the object has elements, put it on a newline - if len(value) > 0: - elements.append(self._newline(level + 1)) - - # iterate through the items and add them to the object - for i, (k, dict_v) in enumerate(value.items()): - if k in [START_LINE_KEY, END_LINE_KEY]: - continue - - value_expr_term = 
self._transform_value_to_expr_term(dict_v, level + 1) - k = self._unwrap_interpolation(k) - elements.append( - Tree( - Token("RULE", "object_elem"), - [ - Tree( - Token("RULE", "object_elem_key"), - [Tree(Token("RULE", "identifier"), [Token("NAME", k)])], - ), - Token("EQ", " ="), - value_expr_term, - ], - ) - ) - - # add indentation appropriately - if i < len(value) - 1: - elements.append(self._newline(level + 1)) - else: - elements.append(self._newline(level)) - return Tree( - Token("RULE", "expr_term"), [Tree(Token("RULE", "object"), elements)] - ) - - # treat booleans appropriately - if isinstance(value, bool): - return Tree( - Token("RULE", "expr_term"), - [ - Tree( - Token("RULE", "identifier"), - [Token("NAME", "true" if value else "false")], - ) - ], - ) - - # store integers as literals, digit by digit - if isinstance(value, int): - return Tree( - Token("RULE", "expr_term"), - [ - Tree( - Token("RULE", "int_lit"), - [Token("DECIMAL", digit) for digit in str(value)], - ) - ], - ) - - if isinstance(value, float): - value = str(value) - literal = [] - - if value[0] == "-": - # pop two first chars - minus and a digit - literal.append(Token("NEGATIVE_DECIMAL", value[:2])) - value = value[2:] - - while value != "": - char = value[0] - - if char == ".": - # current char marks beginning of decimal part: pop all remaining chars and end the loop - literal.append(Token("DOT", char)) - literal.extend(Token("DECIMAL", char) for char in value[1:]) - break - - if char == "e": - # current char marks beginning of e-notation: pop all remaining chars and end the loop - literal.append(Token("EXP_MARK", value)) - break - - literal.append(Token("DECIMAL", char)) - value = value[1:] - - return Tree( - Token("RULE", "expr_term"), - [Tree(Token("RULE", "float_lit"), literal)], - ) - - # store strings as single literals - if isinstance(value, str): - # potentially unpack a complex syntax structure - if self._is_string_wrapped_tf(value): - # we have to unpack it by parsing it - 
wrapped_value = re.match(r"\$\{(.*)}", value).group(1) # type:ignore - ast = reconstruction_parser().parse(f"value = {wrapped_value}") - - if ast.data != Token("RULE", "start"): - raise RuntimeError("Token must be `start` RULE") - - body = ast.children[0] - if body.data != Token("RULE", "body"): - raise RuntimeError("Token must be `body` RULE") - - attribute = body.children[0] - if attribute.data != Token("RULE", "attribute"): - raise RuntimeError("Token must be `attribute` RULE") - - if attribute.children[1] != Token("EQ", " ="): - raise RuntimeError("Token must be `EQ (=)` rule") - - parsed_value = attribute.children[2] - return parsed_value - - # otherwise it's a string - return Tree( - Token("RULE", "expr_term"), - [self._build_string_rule(self._escape_interpolated_str(value), level)], - ) - - # otherwise, we don't know the type - raise RuntimeError(f"Unknown type to transform {type(value)}") + return result diff --git a/hcl2/rules/__init__.py b/hcl2/rules/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/hcl2/rules/abstract.py b/hcl2/rules/abstract.py new file mode 100644 index 00000000..554bc44d --- /dev/null +++ b/hcl2/rules/abstract.py @@ -0,0 +1,139 @@ +"""Abstract base classes for the LarkElement tree intermediate representation.""" + +from abc import ABC, abstractmethod +from typing import Any, Union, List, Optional, Callable + +from lark import Token, Tree +from lark.tree import Meta + +from hcl2.utils import SerializationOptions, SerializationContext + + +class LarkElement(ABC): + """Base class for all elements in the LarkElement tree.""" + + @staticmethod + @abstractmethod + def lark_name() -> str: + """Return the corresponding Lark grammar rule or token name.""" + raise NotImplementedError() + + def __init__(self, index: int = -1, parent: Optional["LarkElement"] = None): + self._index = index + self._parent = parent + + def set_index(self, i: int): + """Set the position index of this element within its parent.""" + self._index = 
i + + def set_parent(self, node: "LarkElement"): + """Set the parent element that contains this element.""" + self._parent = node + + @abstractmethod + def to_lark(self) -> Any: + """Convert this element back to a Lark Tree or Token.""" + raise NotImplementedError() + + @abstractmethod + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize this element to a Python object (dict, list, str, etc.).""" + raise NotImplementedError() + + +class LarkToken(LarkElement, ABC): + """Base class for terminal token elements (leaves of the tree).""" + + def __init__(self, value: Optional[Union[str, int, float]] = None): + self._value = value + super().__init__() + + @property + @abstractmethod + def serialize_conversion(self) -> Callable: + """Return the callable used to convert this token's value during serialization.""" + raise NotImplementedError() + + @property + def value(self): + """Return the raw value of this token.""" + return self._value + + def set_value(self, value: Any): + """Set the raw value of this token.""" + self._value = value + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize this token using its serialize_conversion callable.""" + return self.serialize_conversion(self.value) + + def to_lark(self) -> Token: + """Convert this token back to a Lark Token.""" + return Token(self.lark_name(), self.value) + + def __str__(self) -> str: + return str(self._value) + + def __repr__(self) -> str: + return f"" + + +class LarkRule(LarkElement, ABC): + """Base class for non-terminal rule elements (internal nodes of the tree). + + Subclasses should declare `_children_layout: Tuple[...]` (without assignment) + to document the expected positional structure of `_children`. For variable-length + rules, use `_children_layout: List[Union[...]]`. 
This annotation exists only in + `__annotations__` and does not create an attribute or conflict with the runtime + `_children` list. + """ + + @abstractmethod + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize this rule and its children to a Python object.""" + raise NotImplementedError() + + @property + def children(self) -> List[Any]: + """Return the list of child elements.""" + return self._children + + @property + def parent(self): + """Return the parent element.""" + return self._parent + + @property + def index(self): + """Return the position index within the parent.""" + return self._index + + def to_lark(self) -> Tree: + """Convert this rule and its children back to a Lark Tree.""" + result_children = [] + for child in self._children: + if child is None: + continue + + result_children.append(child.to_lark()) + + return Tree(self.lark_name(), result_children, meta=self._meta) + + def __init__(self, children: List[Any], meta: Optional[Meta] = None): + super().__init__() + self._children: List[Any] = children + self._meta = meta or Meta() + + for index, child in enumerate(children): + if child is not None: + child.set_index(index) + child.set_parent(self) + + def __repr__(self): + return f"" diff --git a/hcl2/rules/base.py b/hcl2/rules/base.py new file mode 100644 index 00000000..dacec8b4 --- /dev/null +++ b/hcl2/rules/base.py @@ -0,0 +1,172 @@ +"""Rule classes for HCL2 structural elements (attributes, bodies, blocks).""" + +from collections import defaultdict +from typing import Tuple, Any, List, Union, Optional + +from lark.tree import Meta + +from hcl2.const import IS_BLOCK, INLINE_COMMENTS_KEY +from hcl2.rules.abstract import LarkRule, LarkToken +from hcl2.rules.expressions import ExprTermRule +from hcl2.rules.literal_rules import IdentifierRule +from hcl2.rules.strings import StringRule +from hcl2.rules.tokens import EQ, LBRACE, RBRACE + +from hcl2.rules.whitespace import NewLineOrCommentRule 
+from hcl2.utils import SerializationOptions, SerializationContext + + +class AttributeRule(LarkRule): + """Rule for key = value attribute assignments.""" + + _children_layout: Tuple[ + IdentifierRule, + EQ, + ExprTermRule, + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "attribute" + + @property + def identifier(self) -> IdentifierRule: + """Return the attribute name identifier.""" + return self._children[0] + + @property + def expression(self) -> ExprTermRule: + """Return the attribute value expression.""" + return self._children[2] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to a single-entry dict.""" + return {self.identifier.serialize(options): self.expression.serialize(options)} + + +class BodyRule(LarkRule): + """Rule for a body containing attributes, blocks, and comments.""" + + _children_layout: List[ + Union[ + NewLineOrCommentRule, + AttributeRule, + "BlockRule", + ] + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "body" + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to a dict, grouping blocks under their type name.""" + attribute_names = set() + comments = [] + inline_comments = [] + + result = defaultdict(list) + + for child in self._children: + + if isinstance(child, BlockRule): + name = child.labels[0].serialize(options) + if name in attribute_names: + raise RuntimeError(f"Attribute {name} is already defined.") + result[name].append(child.serialize(options)) + + if isinstance(child, AttributeRule): + attribute_names.add(child.identifier.serialize(options)) + result.update(child.serialize(options)) + if options.with_comments: + inline_comments.extend(child.expression.inline_comments()) + comments.extend(child.expression.absorbed_comments()) + + if isinstance(child, NewLineOrCommentRule) and options.with_comments: + 
child_comments = child.to_list() + if child_comments: + comments.extend(child_comments) + + if options.with_comments: + if comments: + result["__comments__"] = comments + if inline_comments: + result[INLINE_COMMENTS_KEY] = inline_comments + + return dict(result.items()) + + +class StartRule(LarkRule): + """Rule for the top-level start rule of an HCL2 document.""" + + _children_layout: Tuple[BodyRule] + + @property + def body(self) -> BodyRule: + """Return the document body.""" + return self._children[0] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "start" + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize by delegating to the body.""" + return self.body.serialize(options) + + +class BlockRule(LarkRule): + """Rule for HCL2 blocks (e.g. resource 'type' 'name' { ... }).""" + + _children_layout: Tuple[ + IdentifierRule, + Optional[Union[IdentifierRule, StringRule]], + LBRACE, + BodyRule, + RBRACE, + ] + + def __init__(self, children, meta: Optional[Meta] = None): + super().__init__(children, meta) + + *self._labels, self._body = [ + child for child in children if not isinstance(child, LarkToken) + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "block" + + @property + def labels(self) -> List[Union[IdentifierRule, StringRule]]: + """Return the block label chain (type name, optional string labels).""" + return list(filter(lambda label: label is not None, self._labels)) + + @property + def body(self) -> BodyRule: + """Return the block body.""" + return self._body + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to a nested dict with labels as keys.""" + result = self._body.serialize(options) + if options.explicit_blocks: + result.update({IS_BLOCK: True}) + + labels = self._labels + for label in reversed(labels[1:]): + result = 
{label.serialize(options): result} + + return result diff --git a/hcl2/rules/containers.py b/hcl2/rules/containers.py new file mode 100644 index 00000000..671d98b7 --- /dev/null +++ b/hcl2/rules/containers.py @@ -0,0 +1,229 @@ +"""Rule classes for HCL2 tuples, objects, and their elements.""" + +from typing import Tuple, List, Optional, Union, Any + +from hcl2.rules.abstract import LarkRule +from hcl2.rules.expressions import ExpressionRule +from hcl2.rules.literal_rules import ( + FloatLitRule, + IntLitRule, + IdentifierRule, +) +from hcl2.rules.strings import StringRule +from hcl2.rules.tokens import ( + COLON, + EQ, + LBRACE, + COMMA, + RBRACE, + LSQB, + RSQB, +) +from hcl2.rules.whitespace import ( + NewLineOrCommentRule, + InlineCommentMixIn, +) +from hcl2.utils import ( + SerializationOptions, + SerializationContext, + to_dollar_string, +) + + +class TupleRule(InlineCommentMixIn): + """Rule for tuple/array literals ([elem, ...]).""" + + _children_layout: Tuple[ + LSQB, + Optional[NewLineOrCommentRule], + Tuple[ + ExpressionRule, + Optional[NewLineOrCommentRule], + COMMA, + Optional[NewLineOrCommentRule], + # ... 
+ ], + ExpressionRule, + Optional[NewLineOrCommentRule], + Optional[COMMA], + Optional[NewLineOrCommentRule], + RSQB, + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "tuple" + + @property + def elements(self) -> List[ExpressionRule]: + """Return the expression elements of the tuple.""" + return [ + child for child in self.children[1:-1] if isinstance(child, ExpressionRule) + ] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to a Python list or bracketed string.""" + if not options.wrap_tuples and not context.inside_dollar_string: + return [element.serialize(options, context) for element in self.elements] + + with context.modify(inside_dollar_string=True): + result = "[" + result += ", ".join( + str(element.serialize(options, context)) for element in self.elements + ) + result += "]" + + if not context.inside_dollar_string: + result = to_dollar_string(result) + + return result + + +class ObjectElemKeyRule(LarkRule): + """Rule for an object element key.""" + + key_T = Union[FloatLitRule, IntLitRule, IdentifierRule, StringRule] + + _children_layout: Tuple[key_T] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "object_elem_key" + + @property + def value(self) -> key_T: + """Return the key value (identifier, string, or number).""" + return self._children[0] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize the key, coercing numbers to strings.""" + result = self.value.serialize(options, context) + # Object keys must be strings for JSON compatibility + if isinstance(result, (int, float)): + result = str(result) + return result + + +class ObjectElemKeyExpressionRule(LarkRule): + """Rule for expression keys in objects (bare or parenthesized). + + Holds a single ExpressionRule child. 
Parenthesized keys like + ``(var.account)`` arrive as an ExprTermRule whose own ``serialize()`` + already emits the surrounding ``(…)``, so this class does not need + separate handling for bare vs parenthesized forms. + """ + + _children_layout: Tuple[ExpressionRule] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "object_elem_key_expr" + + @property + def expression(self) -> ExpressionRule: + """Return the key expression.""" + return self._children[0] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to '${expression}' string.""" + with context.modify(inside_dollar_string=True): + result = str(self.expression.serialize(options, context)) + if not context.inside_dollar_string: + result = to_dollar_string(result) + return result + + +class ObjectElemRule(LarkRule): + """Rule for a single key = value element in an object.""" + + _children_layout: Tuple[ + ObjectElemKeyRule, + Union[EQ, COLON], + ExpressionRule, + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "object_elem" + + @property + def key(self) -> ObjectElemKeyRule: + """Return the key rule.""" + return self._children[0] + + @property + def expression(self): + """Return the value expression.""" + return self._children[2] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to a single-entry dict.""" + return { + self.key.serialize(options, context): self.expression.serialize( + options, context + ) + } + + +class ObjectRule(InlineCommentMixIn): + """Rule for object literals ({key = value, ...}).""" + + _children_layout: Tuple[ + LBRACE, + Optional[NewLineOrCommentRule], + Tuple[ + ObjectElemRule, + Optional[NewLineOrCommentRule], + Optional[COMMA], + Optional[NewLineOrCommentRule], + ], + RBRACE, + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return 
"object" + + @property + def elements(self) -> List[ObjectElemRule]: + """Return the list of object element rules.""" + return [ + child for child in self.children[1:-1] if isinstance(child, ObjectElemRule) + ] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to a Python dict or braced string.""" + if not options.wrap_objects and not context.inside_dollar_string: + dict_result: dict = {} + for element in self.elements: + dict_result.update(element.serialize(options, context)) + return dict_result + + with context.modify(inside_dollar_string=True): + str_result = "{" + str_result += ", ".join( + f"{element.key.serialize(options, context)}" + f" = " + f"{element.expression.serialize(options, context)}" + for element in self.elements + ) + str_result += "}" + + if not context.inside_dollar_string: + str_result = to_dollar_string(str_result) + return str_result diff --git a/hcl2/rules/directives.py b/hcl2/rules/directives.py new file mode 100644 index 00000000..ff9cc532 --- /dev/null +++ b/hcl2/rules/directives.py @@ -0,0 +1,429 @@ +"""Rule classes for HCL2 template directives (%{if}, %{for}).""" + +from typing import Any, List, Optional, Tuple + +from lark.tree import Meta + +from hcl2.rules.abstract import LarkRule +from hcl2.rules.expressions import ExpressionRule +from hcl2.rules.literal_rules import IdentifierRule +from hcl2.rules.tokens import ( + DIRECTIVE_START, + STRIP_MARKER, + IF, + ELSE, + ENDIF, + FOR, + IN, + ENDFOR, + COMMA, + RBRACE, + StaticStringToken, +) +from hcl2.utils import SerializationOptions, SerializationContext + + +def _is_strip(child) -> bool: + """Check if a child is a STRIP_MARKER token.""" + return isinstance(child, StaticStringToken) and child.lark_name() == "STRIP_MARKER" + + +def _strip_prefix(is_strip: bool) -> str: + """Return strip-marker prefix string for directive serialization.""" + return "~ " if is_strip else " " + + +def _strip_suffix(is_strip: bool) -> str: 
+ """Return strip-marker suffix string for directive serialization.""" + return " ~" if is_strip else " " + + +def _insert_strip_optionals(children: List, indexes: List[int]): + """Insert None placeholders at positions where optional STRIP_MARKER may appear.""" + for index in sorted(indexes): + try: + child = children[index] + except IndexError: + children.insert(index, None) + else: + if not _is_strip(child): + children.insert(index, None) + + +class TemplateIfStartRule(LarkRule): + """Rule for %{if condition} opening directive.""" + + _children_layout: Tuple[ + DIRECTIVE_START, + Optional[STRIP_MARKER], + IF, + ExpressionRule, + Optional[STRIP_MARKER], + RBRACE, + ] + + def __init__(self, children, meta: Optional[Meta] = None): + _insert_strip_optionals(children, [1, 4]) + super().__init__(children, meta) + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "template_if_start" + + @property + def strip_open(self) -> bool: + """Check if there's a strip marker after %{.""" + return self._children[1] is not None + + @property + def condition(self) -> ExpressionRule: + """Return the condition expression.""" + return self._children[3] + + @property + def strip_close(self) -> bool: + """Check if there's a strip marker before }.""" + return self._children[4] is not None + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to %{ if EXPR } or %{~ if EXPR ~}.""" + with context.modify(inside_dollar_string=True): + cond_str = self.condition.serialize(options, context) + prefix = _strip_prefix(self.strip_open) + suffix = _strip_suffix(self.strip_close) + return f"%{{{prefix}if {cond_str}{suffix}}}" + + +class TemplateElseRule(LarkRule): + """Rule for %{else} directive.""" + + _children_layout: Tuple[ + DIRECTIVE_START, + Optional[STRIP_MARKER], + ELSE, + Optional[STRIP_MARKER], + RBRACE, + ] + + def __init__(self, children, meta: Optional[Meta] = None): + 
_insert_strip_optionals(children, [1, 3]) + super().__init__(children, meta) + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "template_else" + + @property + def strip_open(self) -> bool: + """Check if there's a strip marker after %{.""" + return self._children[1] is not None + + @property + def strip_close(self) -> bool: + """Check if there's a strip marker before }.""" + return self._children[3] is not None + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to %{ else } or %{~ else ~}.""" + prefix = _strip_prefix(self.strip_open) + suffix = _strip_suffix(self.strip_close) + return f"%{{{prefix}else{suffix}}}" + + +class TemplateEndifRule(LarkRule): + """Rule for %{endif} directive.""" + + _children_layout: Tuple[ + DIRECTIVE_START, + Optional[STRIP_MARKER], + ENDIF, + Optional[STRIP_MARKER], + RBRACE, + ] + + def __init__(self, children, meta: Optional[Meta] = None): + _insert_strip_optionals(children, [1, 3]) + super().__init__(children, meta) + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "template_endif" + + @property + def strip_open(self) -> bool: + """Check if there's a strip marker after %{.""" + return self._children[1] is not None + + @property + def strip_close(self) -> bool: + """Check if there's a strip marker before }.""" + return self._children[3] is not None + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to %{ endif } or %{~ endif ~}.""" + prefix = _strip_prefix(self.strip_open) + suffix = _strip_suffix(self.strip_close) + return f"%{{{prefix}endif{suffix}}}" + + +class TemplateForStartRule(LarkRule): + """Rule for %{for VAR in EXPR} opening directive.""" + + _children_layout: Tuple[ + DIRECTIVE_START, + Optional[STRIP_MARKER], + FOR, + IdentifierRule, + Optional[COMMA], + Optional[IdentifierRule], + IN, + ExpressionRule, + 
Optional[STRIP_MARKER], + RBRACE, + ] + + def __init__(self, children, meta: Optional[Meta] = None): + self._setup_optionals(children) + super().__init__(children, meta) + + def _setup_optionals(self, children: List): + """Insert None placeholders for optional strip markers and second iterator. + + Parser output varies: + [DIRECTIVE_START, STRIP?, FOR, id, (COMMA, id)?, IN, expr, STRIP?, RBRACE] + Target layout (10 positions): + [0:DIRECTIVE_START, 1:STRIP?, 2:FOR, 3:id, 4:COMMA?, 5:id?, 6:IN, 7:expr, 8:STRIP?, 9:RBRACE] + """ + # Step 1: Insert strip_open placeholder at position 1 + _insert_strip_optionals(children, [1]) + + # Step 2: Handle optional comma + second identifier + # After step 1, FOR is at index 2, first identifier at 3. + # Count identifiers before IN to distinguish iterator(s) from collection + ids_before_in = [] + for child in children: + if isinstance(child, StaticStringToken) and child.lark_name() == "IN": + break + if isinstance(child, IdentifierRule): + ids_before_in.append(child) + if len(ids_before_in) < 2: + # No second iterator — insert None for COMMA and second id at 4, 5 + children.insert(4, None) + children.insert(5, None) + + # Step 3: Insert strip_close placeholder at position 8 + _insert_strip_optionals(children, [8]) + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "template_for_start" + + @property + def strip_open(self) -> bool: + """Check if there's a strip marker after %{.""" + return self._children[1] is not None + + @property + def strip_close(self) -> bool: + """Check if there's a strip marker before }.""" + return self._children[8] is not None + + @property + def iterator(self) -> IdentifierRule: + """Return the first iterator identifier.""" + return self._children[3] + + @property + def key_iterator(self) -> Optional[IdentifierRule]: + """Return the second iterator identifier, or None.""" + return self._children[5] + + @property + def collection(self) -> ExpressionRule: + """Return 
the collection expression after IN.""" + return self._children[7] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to %{ for VAR in EXPR } or %{~ for VAR in EXPR ~}.""" + prefix = _strip_prefix(self.strip_open) + suffix = _strip_suffix(self.strip_close) + with context.modify(inside_dollar_string=True): + iter_str = self.iterator.serialize(options, context) + if self.key_iterator is not None: + iter_str += f", {self.key_iterator.serialize(options, context)}" + coll_str = self.collection.serialize(options, context) + return f"%{{{prefix}for {iter_str} in {coll_str}{suffix}}}" + + +class TemplateEndforRule(LarkRule): + """Rule for %{endfor} directive.""" + + _children_layout: Tuple[ + DIRECTIVE_START, + Optional[STRIP_MARKER], + ENDFOR, + Optional[STRIP_MARKER], + RBRACE, + ] + + def __init__(self, children, meta: Optional[Meta] = None): + _insert_strip_optionals(children, [1, 3]) + super().__init__(children, meta) + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "template_endfor" + + @property + def strip_open(self) -> bool: + """Check if there's a strip marker after %{.""" + return self._children[1] is not None + + @property + def strip_close(self) -> bool: + """Check if there's a strip marker before }.""" + return self._children[3] is not None + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to %{ endfor } or %{~ endfor ~}.""" + prefix = _strip_prefix(self.strip_open) + suffix = _strip_suffix(self.strip_close) + return f"%{{{prefix}endfor{suffix}}}" + + +class TemplateIfRule(LarkRule): + """Assembled rule for a complete %{if}...%{else}...%{endif} template. + + This is NOT produced by the parser directly — it is assembled by the + transformer from flat TemplateIfStartRule/TemplateElseRule/TemplateEndifRule + and interleaved StringPartRule children. 
+ """ + + _children_layout: Tuple[ + TemplateIfStartRule, + # ... variable number of body StringPartRules ... + # Optional[TemplateElseRule], + # ... variable number of else body StringPartRules ... + TemplateEndifRule, + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "template_if" + + def __init__( # pylint: disable=R0917 + self, + if_start: TemplateIfStartRule, + if_body: list, + else_rule: Optional[TemplateElseRule], + else_body: Optional[list], + endif: TemplateEndifRule, + meta: Optional[Meta] = None, + ): + self._if_start = if_start + self._if_body = if_body + self._else_rule = else_rule + self._else_body = else_body or [] + self._endif = endif + + # Build children list for to_lark + children = [if_start, *if_body] + if else_rule is not None: + children.extend([else_rule, *self._else_body]) + children.append(endif) + super().__init__(children, meta) + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize the full if/else/endif directive.""" + result = self._if_start.serialize(options, context) + for part in self._if_body: + result += part.serialize(options, context) + if self._else_rule is not None: + result += self._else_rule.serialize(options, context) + for part in self._else_body: + result += part.serialize(options, context) + result += self._endif.serialize(options, context) + return result + + def to_lark(self): + """Convert back to flat sequence of Lark trees for reconstruction.""" + result_children = [] + result_children.extend(self._if_start.to_lark().children) + for part in self._if_body: + result_children.append(part.to_lark()) + if self._else_rule is not None: + result_children.extend(self._else_rule.to_lark().children) + for part in self._else_body: + result_children.append(part.to_lark()) + result_children.extend(self._endif.to_lark().children) + from lark import Tree # pylint: disable=C0415 + + return Tree("template_if", result_children, 
meta=self._meta) + + +class TemplateForRule(LarkRule): + """Assembled rule for a complete %{for}...%{endfor} template.""" + + _children_layout: Tuple[ + TemplateForStartRule, + # ... variable number of body StringPartRules ... + TemplateEndforRule, + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "template_for" + + def __init__( + self, + for_start: TemplateForStartRule, + body: list, + endfor: TemplateEndforRule, + meta: Optional[Meta] = None, + ): + self._for_start = for_start + self._body = body + self._endfor = endfor + + children = [for_start, *body, endfor] + super().__init__(children, meta) + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize the full for/endfor directive.""" + result = self._for_start.serialize(options, context) + for part in self._body: + result += part.serialize(options, context) + result += self._endfor.serialize(options, context) + return result + + def to_lark(self): + """Convert back to flat sequence of Lark trees for reconstruction.""" + result_children = [] + result_children.extend(self._for_start.to_lark().children) + for part in self._body: + result_children.append(part.to_lark()) + result_children.extend(self._endfor.to_lark().children) + from lark import Tree # pylint: disable=C0415 + + return Tree("template_for", result_children, meta=self._meta) diff --git a/hcl2/rules/expressions.py b/hcl2/rules/expressions.py new file mode 100644 index 00000000..057e1ffc --- /dev/null +++ b/hcl2/rules/expressions.py @@ -0,0 +1,332 @@ +"""Rule classes for HCL2 expressions, conditionals, and binary/unary operations.""" + +from abc import ABC +from typing import Any, Optional, Tuple + +from lark.tree import Meta + +from hcl2.rules.abstract import ( + LarkToken, +) +from hcl2.rules.literal_rules import BinaryOperatorRule +from hcl2.rules.tokens import LPAR, RPAR, QMARK, COLON +from hcl2.rules.whitespace import ( + NewLineOrCommentRule, + 
InlineCommentMixIn, +) +from hcl2.utils import ( + wrap_into_parentheses, + to_dollar_string, + SerializationOptions, + SerializationContext, +) + + +class ExpressionRule(InlineCommentMixIn, ABC): + """Base class for all HCL2 expression rules.""" + + @staticmethod + def lark_name() -> str: + """?expression is transparent in Lark — subclasses must override.""" + raise NotImplementedError("ExpressionRule.lark_name() must be overridden") + + def __init__( + self, children, meta: Optional[Meta] = None, parentheses: bool = False + ): + super().__init__(children, meta) + self._parentheses = parentheses + + def _wrap_into_parentheses( + self, + value: str, + _options=SerializationOptions(), + context=SerializationContext(), + ) -> str: + """Wrap value in parentheses if inside a nested expression.""" + # do not wrap into parentheses if + # 1. already wrapped or + # 2. is top-level expression (unless explicitly wrapped) + if context.inside_parentheses: + return value + # Look through ExprTermRule wrapper to determine if truly nested + parent = getattr(self, "parent", None) + if parent is None: + return value + if isinstance(parent, ExprTermRule): + if not isinstance(parent.parent, ExpressionRule): + return value + elif not isinstance(parent, ExpressionRule): + return value + return wrap_into_parentheses(value) + + +class ExprTermRule(ExpressionRule): + """Rule for expression terms, optionally wrapped in parentheses.""" + + _children_layout: Tuple[ + Optional[LPAR], + Optional[NewLineOrCommentRule], + ExpressionRule, + Optional[NewLineOrCommentRule], + Optional[RPAR], + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "expr_term" + + def __init__(self, children, meta: Optional[Meta] = None): + parentheses = False + if ( + isinstance(children[0], LarkToken) + and children[0].lark_name() == "LPAR" + and isinstance(children[-1], LarkToken) + and children[-1].lark_name() == "RPAR" + ): + parentheses = True + else: + children = [None, 
*children, None] + self._insert_optionals(children, [1, 3]) + super().__init__(children, meta, parentheses) + + @property + def parentheses(self) -> bool: + """Return whether this term is wrapped in parentheses.""" + return self._parentheses + + @property + def expression(self) -> ExpressionRule: + """Return the inner expression.""" + return self._children[2] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize, handling parenthesized expression wrapping.""" + with context.modify( + inside_parentheses=self.parentheses or context.inside_parentheses + ): + result = self.expression.serialize(options, context) + + if self.parentheses: + result = wrap_into_parentheses(result) + if not context.inside_dollar_string: + result = to_dollar_string(result) + + return result + + +class ConditionalRule(ExpressionRule): + """Rule for ternary conditional expressions (condition ? true : false).""" + + _children_layout: Tuple[ + ExpressionRule, + Optional[NewLineOrCommentRule], + QMARK, + Optional[NewLineOrCommentRule], + ExpressionRule, + Optional[NewLineOrCommentRule], + COLON, + Optional[NewLineOrCommentRule], + ExpressionRule, + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "conditional" + + def __init__(self, children, meta: Optional[Meta] = None): + self._insert_optionals(children, [1, 3, 5, 7]) + super().__init__(children, meta) + + @property + def condition(self) -> ExpressionRule: + """Return the condition expression.""" + return self._children[0] + + @property + def if_true(self) -> ExpressionRule: + """Return the true-branch expression.""" + return self._children[4] + + @property + def if_false(self) -> ExpressionRule: + """Return the false-branch expression.""" + return self._children[8] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to ternary expression string.""" + with 
context.modify(inside_dollar_string=True): + result = ( + f"{self.condition.serialize(options, context)} " + f"? {self.if_true.serialize(options, context)} " + f": {self.if_false.serialize(options, context)}" + ) + + if not context.inside_dollar_string: + result = to_dollar_string(result) + + if options.force_operation_parentheses: + result = self._wrap_into_parentheses(result, options, context) + + return result + + +class BinaryTermRule(ExpressionRule): + """Rule for the operator+operand portion of a binary operation.""" + + _children_layout: Tuple[ + Optional[NewLineOrCommentRule], + BinaryOperatorRule, + Optional[NewLineOrCommentRule], + ExprTermRule, + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "binary_term" + + def __init__(self, children, meta: Optional[Meta] = None): + self._insert_optionals(children, [0, 2]) + super().__init__(children, meta) + + @property + def binary_operator(self) -> BinaryOperatorRule: + """Return the binary operator.""" + return self._children[1] + + @property + def expr_term(self) -> ExprTermRule: + """Return the right-hand operand.""" + return self._children[3] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to 'operator operand' string.""" + op_str = self.binary_operator.serialize(options, context) + term_str = self.expr_term.serialize(options, context) + return f"{op_str} {term_str}" + + +class BinaryOpRule(ExpressionRule): + """Rule for complete binary operations (lhs operator rhs).""" + + _children_layout: Tuple[ + ExprTermRule, + BinaryTermRule, + Optional[NewLineOrCommentRule], + ] + + def __init__(self, children, meta: Optional[Meta] = None): + self._insert_optionals(children, [2]) + super().__init__(children, meta) + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "binary_op" + + @property + def expr_term(self) -> ExprTermRule: + """Return the left-hand operand.""" + 
return self._children[0] + + @property + def binary_term(self) -> BinaryTermRule: + """Return the binary term (operator + right-hand operand).""" + return self._children[1] + + @property + def _trailing_nl(self) -> Optional[NewLineOrCommentRule]: + """Return the trailing new_line_or_comment child, if present.""" + child = self._children[2] + if isinstance(child, NewLineOrCommentRule): + return child + return None + + def inline_comments(self): + """Collect inline comments, excluding absorbed body-level comments.""" + trailing = self._trailing_nl + result = [] + for child in self._children: + if isinstance(child, NewLineOrCommentRule): + # Trailing NL_OR_COMMENT with a leading newline contains + # body-level comments absorbed by the grammar, not inline ones. + if child is trailing and not child.is_inline: + continue + comments = child.to_list() + if comments is not None: + result.extend(comments) + elif isinstance(child, InlineCommentMixIn): + result.extend(child.inline_comments()) + return result + + def absorbed_comments(self): + """Return body-level comments absorbed into the trailing NL_OR_COMMENT.""" + trailing = self._trailing_nl + if trailing is not None and not trailing.is_inline: + return trailing.to_list() or [] + return [] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to 'lhs operator rhs' string.""" + with context.modify(inside_dollar_string=True): + lhs = self.expr_term.serialize(options, context) + operator = str( + self.binary_term.binary_operator.serialize(options, context) + ).strip() + rhs = self.binary_term.expr_term.serialize(options, context) + + result = f"{lhs} {operator} {rhs}" + + if not context.inside_dollar_string: + result = to_dollar_string(result) + + if options.force_operation_parentheses: + result = self._wrap_into_parentheses(result, options, context) + return result + + +class UnaryOpRule(ExpressionRule): + """Rule for unary operations (e.g. 
negation, logical not).""" + + _children_layout: Tuple[LarkToken, ExprTermRule] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "unary_op" + + @property + def operator(self) -> str: + """Return the unary operator string.""" + return str(self._children[0]) + + @property + def expr_term(self): + """Return the operand.""" + return self._children[1] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to 'operator operand' string.""" + with context.modify(inside_dollar_string=True): + operator = self.operator.rstrip() + result = f"{operator}{self.expr_term.serialize(options, context)}" + + if not context.inside_dollar_string: + result = to_dollar_string(result) + + if options.force_operation_parentheses: + result = self._wrap_into_parentheses(result, options, context) + + return result diff --git a/hcl2/rules/for_expressions.py b/hcl2/rules/for_expressions.py new file mode 100644 index 00000000..eb018343 --- /dev/null +++ b/hcl2/rules/for_expressions.py @@ -0,0 +1,321 @@ +"""Rule classes for HCL2 for-tuple and for-object expressions.""" + +from dataclasses import replace +from typing import Any, Tuple, Optional, List + +from lark.tree import Meta + +from hcl2.rules.expressions import ExpressionRule +from hcl2.rules.literal_rules import IdentifierRule +from hcl2.rules.tokens import ( + LSQB, + RSQB, + LBRACE, + RBRACE, + FOR, + IN, + IF, + COMMA, + COLON, + ELLIPSIS, + FOR_OBJECT_ARROW, + StaticStringToken, +) +from hcl2.rules.whitespace import ( + NewLineOrCommentRule, + InlineCommentMixIn, +) +from hcl2.utils import ( + SerializationOptions, + SerializationContext, + to_dollar_string, +) + + +class ForIntroRule(InlineCommentMixIn): + """Rule for the intro part of for expressions: 'for key, value in collection :'""" + + _children_layout: Tuple[ + FOR, + Optional[NewLineOrCommentRule], + IdentifierRule, + Optional[COMMA], + Optional[IdentifierRule], + 
Optional[NewLineOrCommentRule], + IN, + Optional[NewLineOrCommentRule], + ExpressionRule, + Optional[NewLineOrCommentRule], + COLON, + Optional[NewLineOrCommentRule], + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "for_intro" + + def __init__(self, children, meta: Optional[Meta] = None): + + self._insert_optionals(children) + super().__init__(children, meta) + + def _insert_optionals( # type: ignore[override] + self, children: List, indexes: Optional[List[int]] = None + ): + """Insert None placeholders, handling optional comma and second identifier.""" + identifiers = [child for child in children if isinstance(child, IdentifierRule)] + second_identifier = identifiers[1] if len(identifiers) == 2 else None + + indexes = [1, 5, 7, 9, 11] + if second_identifier is None: + indexes.extend([3, 4]) + + super()._insert_optionals(children, sorted(indexes)) + + if second_identifier is not None: + children[3] = COMMA() # type: ignore[abstract] # pylint: disable=abstract-class-instantiated + children[4] = second_identifier + + @property + def first_iterator(self) -> IdentifierRule: + """Return the first iterator identifier.""" + return self._children[2] + + @property + def second_iterator(self) -> Optional[IdentifierRule]: + """Return the second iterator identifier, or None if not present.""" + return self._children[4] + + @property + def iterable(self) -> ExpressionRule: + """Return the collection expression being iterated over.""" + return self._children[8] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> str: + """Serialize to 'for key, value in collection : ' string.""" + result = "for " + + result += f"{self.first_iterator.serialize(options, context)}" + if self.second_iterator: + result += f", {self.second_iterator.serialize(options, context)}" + + result += f" in {self.iterable.serialize(options, context)} : " + return result + + +class ForCondRule(InlineCommentMixIn): + 
"""Rule for the optional condition in for expressions: 'if condition'""" + + _children_layout: Tuple[ + IF, + Optional[NewLineOrCommentRule], + ExpressionRule, # condition expression + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "for_cond" + + def __init__(self, children, meta: Optional[Meta] = None): + self._insert_optionals(children, [1]) + super().__init__(children, meta) + + @property + def condition_expr(self) -> ExpressionRule: + """Return the condition expression.""" + return self._children[2] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> str: + """Serialize to 'if condition' string.""" + return f"if {self.condition_expr.serialize(options, context)}" + + +class ForTupleExprRule(ExpressionRule): + """Rule for tuple/array for expressions: [for item in items : expression]""" + + _children_layout: Tuple[ + LSQB, + Optional[NewLineOrCommentRule], + ForIntroRule, + Optional[NewLineOrCommentRule], + ExpressionRule, + Optional[NewLineOrCommentRule], + Optional[ForCondRule], + Optional[NewLineOrCommentRule], + RSQB, + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "for_tuple_expr" + + def __init__(self, children, meta: Optional[Meta] = None): + self._insert_optionals(children) + super().__init__(children, meta) + + def _insert_optionals( # type: ignore[override] + self, children: List, indexes: Optional[List[int]] = None + ): + """Insert None placeholders, handling optional condition.""" + condition = None + + for child in children: + if isinstance(child, ForCondRule): + condition = child + break + + indexes = [1, 3, 5, 7] + + if condition is None: + indexes.append(6) + + super()._insert_optionals(children, sorted(indexes)) + + children[6] = condition + + @property + def for_intro(self) -> ForIntroRule: + """Return the for intro rule.""" + return self._children[2] + + @property + def value_expr(self) -> ExpressionRule: + 
"""Return the value expression.""" + return self._children[4] + + @property + def condition(self) -> Optional[ForCondRule]: + """Return the optional condition rule.""" + return self._children[6] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to '[for ... : expr]' string.""" + result = "[" + + with context.modify(inside_dollar_string=True): + result += self.for_intro.serialize(options, context) + result += self.value_expr.serialize(options, context) + + if self.condition is not None: + result += f" {self.condition.serialize(options, context)}" + + result += "]" + if not context.inside_dollar_string: + result = to_dollar_string(result) + return result + + +class ForObjectExprRule(ExpressionRule): + """Rule for object for expressions: {for key, value in items : key => value}""" + + _children_layout: Tuple[ + LBRACE, + Optional[NewLineOrCommentRule], + ForIntroRule, + Optional[NewLineOrCommentRule], + ExpressionRule, + FOR_OBJECT_ARROW, + Optional[NewLineOrCommentRule], + ExpressionRule, + Optional[NewLineOrCommentRule], + Optional[ELLIPSIS], + Optional[NewLineOrCommentRule], + Optional[ForCondRule], + Optional[NewLineOrCommentRule], + RBRACE, + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "for_object_expr" + + def __init__(self, children, meta: Optional[Meta] = None): + self._insert_optionals(children) + super().__init__(children, meta) + + def _insert_optionals( # type: ignore[override] + self, children: List, indexes: Optional[List[int]] = None + ): + """Insert None placeholders, handling optional ellipsis and condition.""" + ellipsis_ = None + condition = None + + for child in children: + if ( + ellipsis_ is None + and isinstance(child, StaticStringToken) + and child.lark_name() == "ELLIPSIS" + ): + ellipsis_ = child + if condition is None and isinstance(child, ForCondRule): + condition = child + + indexes = [1, 3, 6, 8, 10, 12] + + if ellipsis_ is 
None: + indexes.append(9) + if condition is None: + indexes.append(11) + + super()._insert_optionals(children, sorted(indexes)) + + children[9] = ellipsis_ + children[11] = condition + + @property + def for_intro(self) -> ForIntroRule: + """Return the for intro rule.""" + return self._children[2] + + @property + def key_expr(self) -> ExpressionRule: + """Return the key expression.""" + return self._children[4] + + @property + def value_expr(self) -> ExpressionRule: + """Return the value expression.""" + return self._children[7] + + @property + def ellipsis(self): + """Return the optional ellipsis token.""" + return self._children[9] + + @property + def condition(self) -> Optional[ForCondRule]: + """Return the optional condition rule.""" + return self._children[11] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to '{for ... : key => value}' string.""" + result = "{" + with context.modify(inside_dollar_string=True): + result += self.for_intro.serialize(options, context) + result += f"{self.key_expr.serialize(options, context)} => " + + result += self.value_expr.serialize( + replace(options, wrap_objects=True), context + ) + if self.ellipsis is not None: + result += self.ellipsis.serialize(options, context) + + if self.condition is not None: + result += f" {self.condition.serialize(options, context)}" + + result += "}" + if not context.inside_dollar_string: + result = to_dollar_string(result) + return result diff --git a/hcl2/rules/functions.py b/hcl2/rules/functions.py new file mode 100644 index 00000000..bd574ebe --- /dev/null +++ b/hcl2/rules/functions.py @@ -0,0 +1,113 @@ +"""Rule classes for HCL2 function calls and arguments.""" + +from typing import Any, Optional, Tuple, Union, List + +from hcl2.rules.expressions import ExpressionRule +from hcl2.rules.literal_rules import IdentifierRule +from hcl2.rules.tokens import COMMA, ELLIPSIS, StringToken, LPAR, RPAR +from hcl2.rules.whitespace import ( 
+ InlineCommentMixIn, + NewLineOrCommentRule, +) +from hcl2.utils import ( + SerializationOptions, + SerializationContext, + to_dollar_string, +) + + +class ArgumentsRule(InlineCommentMixIn): + """Rule for a comma-separated list of function arguments.""" + + _children_layout: Tuple[ + ExpressionRule, + Tuple[ + Optional[NewLineOrCommentRule], + COMMA, + Optional[NewLineOrCommentRule], + ExpressionRule, + # ... + ], + Optional[Union[COMMA, ELLIPSIS]], + Optional[NewLineOrCommentRule], + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "arguments" + + @property + def has_ellipsis(self) -> bool: + """Return whether the argument list ends with an ellipsis (...).""" + for child in self._children[-2:]: + if isinstance(child, StringToken) and child.lark_name() == "ELLIPSIS": + return True + return False + + @property + def arguments(self) -> List[ExpressionRule]: + """Return the list of expression arguments.""" + return [child for child in self._children if isinstance(child, ExpressionRule)] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to a comma-separated argument string.""" + result = ", ".join( + str(argument.serialize(options, context)) for argument in self.arguments + ) + if self.has_ellipsis: + result += " ..." + return result + + +class FunctionCallRule(InlineCommentMixIn): + """Rule for function call expressions (e.g. 
func(args)).""" + + _children_layout: Tuple[ + IdentifierRule, + Optional[IdentifierRule], + Optional[IdentifierRule], + LPAR, + Optional[NewLineOrCommentRule], + Optional[ArgumentsRule], + Optional[NewLineOrCommentRule], + RPAR, + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "function_call" + + @property + def identifiers(self) -> List[IdentifierRule]: + """Return the function name identifier(s).""" + return [child for child in self._children if isinstance(child, IdentifierRule)] + + @property + def arguments(self) -> Optional[ArgumentsRule]: + """Return the arguments rule, or None if no arguments.""" + for child in self._children: + if isinstance(child, ArgumentsRule): + return child + return None + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to 'func(args)' string.""" + with context.modify(inside_dollar_string=True): + name = "::".join( + identifier.serialize(options, context) + for identifier in self.identifiers + ) + args = self.arguments + args_str = args.serialize(options, context) if args else "" + result = f"{name}({args_str})" + + if not context.inside_dollar_string: + result = to_dollar_string(result) + + return result diff --git a/hcl2/rules/indexing.py b/hcl2/rules/indexing.py new file mode 100644 index 00000000..4cc292c0 --- /dev/null +++ b/hcl2/rules/indexing.py @@ -0,0 +1,293 @@ +"""Rule classes for HCL2 indexing, attribute access, and splat expressions.""" + +from typing import List, Optional, Tuple, Any, Union + +from lark.tree import Meta + +from hcl2.rules.abstract import LarkRule +from hcl2.rules.expressions import ExprTermRule, ExpressionRule +from hcl2.rules.literal_rules import IdentifierRule +from hcl2.rules.tokens import ( + DOT, + IntLiteral, + LSQB, + RSQB, + ATTR_SPLAT, + FULL_SPLAT, +) +from hcl2.rules.whitespace import ( + InlineCommentMixIn, + NewLineOrCommentRule, +) +from hcl2.utils import ( + SerializationOptions, 
+ to_dollar_string, + SerializationContext, +) + + +class ShortIndexRule(LarkRule): + """Rule for dot-numeric index access (e.g. .0).""" + + _children_layout: Tuple[ + DOT, + IntLiteral, + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "short_index" + + @property + def index(self): + """Return the index token.""" + return self.children[1] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to '.N' string.""" + return f".{self.index.serialize(options, context)}" + + +class SqbIndexRule(InlineCommentMixIn): + """Rule for square-bracket index access (e.g. [expr]).""" + + _children_layout: Tuple[ + LSQB, + Optional[NewLineOrCommentRule], + ExprTermRule, + Optional[NewLineOrCommentRule], + RSQB, + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "braces_index" + + @property + def index_expression(self): + """Return the index expression inside the brackets.""" + return self.children[2] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to '[expr]' string.""" + return f"[{self.index_expression.serialize(options, context)}]" + + def __init__(self, children, meta: Optional[Meta] = None): + self._insert_optionals(children, [1, 3]) + super().__init__(children, meta) + + +class IndexExprTermRule(ExpressionRule): + """Rule for index access on an expression term.""" + + _children_layout: Tuple[ExprTermRule, SqbIndexRule] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "index_expr_term" + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to 'expr[index]' string.""" + with context.modify(inside_dollar_string=True): + expr = self.children[0].serialize(options, context) + index = self.children[1].serialize(options, context) + result = f"{expr}{index}" + if not 
context.inside_dollar_string: + result = to_dollar_string(result) + return result + + +class GetAttrRule(LarkRule): + """Rule for dot-attribute access (e.g. .name).""" + + _children_layout: Tuple[ + DOT, + IdentifierRule, + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "get_attr" + + @property + def identifier(self) -> IdentifierRule: + """Return the accessed identifier.""" + return self._children[1] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to '.identifier' string.""" + return f".{self.identifier.serialize(options, context)}" + + +class GetAttrExprTermRule(ExpressionRule): + """Rule for attribute access on an expression term.""" + + _children_layout: Tuple[ + ExprTermRule, + GetAttrRule, + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "get_attr_expr_term" + + @property + def expr_term(self) -> ExprTermRule: + """Return the base expression term.""" + return self._children[0] + + @property + def get_attr(self) -> GetAttrRule: + """Return the attribute access rule.""" + return self._children[1] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to 'expr.attr' string.""" + with context.modify(inside_dollar_string=True): + expr = self.expr_term.serialize(options, context) + attr = self.get_attr.serialize(options, context) + result = f"{expr}{attr}" + if not context.inside_dollar_string: + result = to_dollar_string(result) + return result + + +class AttrSplatRule(LarkRule): + """Rule for attribute splat expressions (e.g. 
.*.attr).""" + + _children_layout: Tuple[ + ATTR_SPLAT, + Tuple[Union[GetAttrRule, Union[SqbIndexRule, ShortIndexRule]], ...], + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "attr_splat" + + @property + def get_attrs( + self, + ) -> List[Union[GetAttrRule, SqbIndexRule, ShortIndexRule]]: + """Return the trailing accessor chain.""" + return self._children[1:] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to '.*...' string.""" + return ".*" + "".join( + get_attr.serialize(options, context) for get_attr in self.get_attrs + ) + + +class AttrSplatExprTermRule(ExpressionRule): + """Rule for attribute splat on an expression term.""" + + _children_layout: Tuple[ExprTermRule, AttrSplatRule] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "attr_splat_expr_term" + + @property + def expr_term(self) -> ExprTermRule: + """Return the base expression term.""" + return self._children[0] + + @property + def attr_splat(self) -> AttrSplatRule: + """Return the attribute splat rule.""" + return self._children[1] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to 'expr.*...' string.""" + with context.modify(inside_dollar_string=True): + expr = self.expr_term.serialize(options, context) + splat = self.attr_splat.serialize(options, context) + result = f"{expr}{splat}" + + if not context.inside_dollar_string: + result = to_dollar_string(result) + return result + + +class FullSplatRule(LarkRule): + """Rule for full splat expressions (e.g. 
[*].attr).""" + + _children_layout: Tuple[ + FULL_SPLAT, + Tuple[Union[GetAttrRule, Union[SqbIndexRule, ShortIndexRule]], ...], + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "full_splat" + + @property + def get_attrs( + self, + ) -> List[Union[GetAttrRule, SqbIndexRule, ShortIndexRule]]: + """Return the trailing accessor chain.""" + return self._children[1:] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to '[*]...' string.""" + return "[*]" + "".join( + get_attr.serialize(options, context) for get_attr in self.get_attrs + ) + + +class FullSplatExprTermRule(ExpressionRule): + """Rule for full splat on an expression term.""" + + _children_layout: Tuple[ExprTermRule, FullSplatRule] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "full_splat_expr_term" + + @property + def expr_term(self) -> ExprTermRule: + """Return the base expression term.""" + return self._children[0] + + @property + def attr_splat(self) -> FullSplatRule: + """Return the full splat rule.""" + return self._children[1] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to 'expr[*]...' 
string.""" + with context.modify(inside_dollar_string=True): + expr = self.expr_term.serialize(options, context) + splat = self.attr_splat.serialize(options, context) + result = f"{expr}{splat}" + + if not context.inside_dollar_string: + result = to_dollar_string(result) + return result diff --git a/hcl2/rules/literal_rules.py b/hcl2/rules/literal_rules.py new file mode 100644 index 00000000..1db333f5 --- /dev/null +++ b/hcl2/rules/literal_rules.py @@ -0,0 +1,86 @@ +"""Rule classes for literal values (keywords, identifiers, numbers, operators).""" + +from abc import ABC +from typing import Any, Tuple + +from hcl2.rules.abstract import LarkRule, LarkToken +from hcl2.utils import SerializationOptions, SerializationContext, to_dollar_string + + +class TokenRule(LarkRule, ABC): + """Base rule wrapping a single token child.""" + + _children_layout: Tuple[LarkToken] + + @property + def token(self) -> LarkToken: + """Return the single token child.""" + return self._children[0] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize by delegating to the token's own serialization.""" + return self.token.serialize() + + +class KeywordRule(TokenRule): + """Rule for HCL2 keyword literals (true, false, null).""" + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "keyword" + + +class IdentifierRule(TokenRule): + """Rule for HCL2 identifiers.""" + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "identifier" + + +class IntLitRule(TokenRule): + """Rule for integer literal expressions.""" + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "int_lit" + + +class FloatLitRule(TokenRule): + """Rule for floating-point literal expressions.""" + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "float_lit" + + def serialize( + self, options=SerializationOptions(), 
context=SerializationContext() + ) -> Any: + """Serialize, preserving scientific notation when configured.""" + value = self.token.value + # Scientific notation (e.g. 1.23e5) cannot survive a Python float() + # round-trip, so preserve it as a ${...} expression string. + if ( + options.preserve_scientific_notation + and isinstance(value, str) + and "e" in value.lower() + ): + if context.inside_dollar_string: + return value + return to_dollar_string(value) + return self.token.serialize() + + +class BinaryOperatorRule(TokenRule): + """Rule for binary operator tokens.""" + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "binary_operator" diff --git a/hcl2/rules/strings.py b/hcl2/rules/strings.py new file mode 100644 index 00000000..c71aeb87 --- /dev/null +++ b/hcl2/rules/strings.py @@ -0,0 +1,232 @@ +"""Rule classes for HCL2 string literals, interpolation, and heredoc templates.""" + +import sys +from typing import Tuple, List, Any, Union + +from hcl2.rules.abstract import LarkRule +from hcl2.rules.expressions import ExpressionRule +from hcl2.rules.tokens import ( + INTERP_START, + RBRACE, + DBLQUOTE, + STRING_CHARS, + ESCAPED_INTERPOLATION, + ESCAPED_DIRECTIVE, + TEMPLATE_STRING, + HEREDOC_TEMPLATE, + HEREDOC_TRIM_TEMPLATE, +) +from hcl2.utils import ( + SerializationOptions, + SerializationContext, + to_dollar_string, + HEREDOC_TRIM_PATTERN, + HEREDOC_PATTERN, +) + + +class InterpolationRule(LarkRule): + """Rule for ${expression} interpolation within strings.""" + + _children_layout: Tuple[ + INTERP_START, + ExpressionRule, + RBRACE, + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "interpolation" + + @property + def expression(self): + """Return the interpolated expression.""" + return self.children[1] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to ${expression} string.""" + return 
to_dollar_string(self.expression.serialize(options, context)) + + +class StringPartRule(LarkRule): + """Rule for a single part of a string (literal text, escape, interpolation, or directive).""" + + # Content may be a plain token (STRING_CHARS, ESCAPED_INTERPOLATION, + # ESCAPED_DIRECTIVE), an InterpolationRule, or a template directive rule + # (TemplateIfRule, TemplateForRule, and flat variants). Forward refs are + # quoted to avoid circular imports. + _children_layout: Tuple[ # type: ignore[type-arg] + Union[STRING_CHARS, ESCAPED_INTERPOLATION, ESCAPED_DIRECTIVE, InterpolationRule] + ] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "string_part" + + @property + def content(self): + """Return the content element (string chars, escape, interpolation, or directive).""" + return self._children[0] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize this string part.""" + return self.content.serialize(options, context) + + +class StringRule(LarkRule): + """Rule for quoted string literals.""" + + _children_layout: Tuple[DBLQUOTE, List[StringPartRule], DBLQUOTE] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "string" + + @property + def string_parts(self): + """Return the list of string parts between quotes.""" + return self.children[1:-1] + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to a quoted string.""" + inner = "".join(part.serialize(options, context) for part in self.string_parts) + if options.strip_string_quotes: + return inner + return '"' + inner + '"' + + +class HeredocTemplateRule(LarkRule): + """Rule for heredoc template strings (<<MARKER).""" + + _children_layout: Tuple[HEREDOC_TEMPLATE] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "heredoc_template" + + @property + def heredoc(self): + """Return the raw heredoc token.""" + return self.children[0] + + def serialize( + self, 
options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize the heredoc, optionally stripping to a plain string.""" + heredoc = self.heredoc.serialize(options, context) + + if not options.preserve_heredocs: + match = HEREDOC_PATTERN.match(heredoc) + if not match: + raise RuntimeError(f"Invalid Heredoc token: {heredoc}") + heredoc = match.group(2).rstrip(self._trim_chars) + heredoc = ( + heredoc.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n") + ) + if options.strip_string_quotes: + return heredoc + return f'"{heredoc}"' + + result = heredoc.rstrip(self._trim_chars) + if options.strip_string_quotes: + return result + return f'"{result}"' + + +class HeredocTrimTemplateRule(HeredocTemplateRule): + """Rule for indented heredoc template strings (<<-MARKER).""" + + _children_layout: Tuple[HEREDOC_TRIM_TEMPLATE] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "heredoc_template_trim" + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize the trim heredoc, stripping common leading whitespace.""" + # See https://github.com/hashicorp/hcl2/blob/master/hcl/hclsyntax/spec.md#template-expressions + # This is a special version of heredocs that are declared with "<<-" + # This will calculate the minimum number of leading spaces in each line of a heredoc + # and then remove that number of spaces from each line + + heredoc = self.heredoc.serialize(options, context) + + if not options.preserve_heredocs: + match = HEREDOC_TRIM_PATTERN.match(heredoc) + if not match: + raise RuntimeError(f"Invalid Heredoc token: {heredoc}") + heredoc = match.group(2) + + heredoc = heredoc.rstrip(self._trim_chars) + lines = heredoc.split("\n") + + # calculate the min number of leading spaces in each line + min_spaces = sys.maxsize + for line in lines: + leading_spaces = len(line) - len(line.lstrip(" ")) + min_spaces = min(min_spaces, leading_spaces) + + # 
trim off that number of leading spaces from each line + lines = [line[min_spaces:] for line in lines] + + if not options.preserve_heredocs: + lines = [line.replace("\\", "\\\\").replace('"', '\\"') for line in lines] + + sep = "\\n" if not options.preserve_heredocs else "\n" + inner = sep.join(lines) + if options.strip_string_quotes: + return inner + return '"' + inner + '"' + + +class TemplateStringRule(LarkRule): + """Rule for escaped-quote-delimited strings in template expressions (\\\"...\\\" ).""" + + _children_layout: Tuple[TEMPLATE_STRING] + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "template_string" + + @property + def raw_value(self) -> str: + """Return the raw token value including escaped quotes.""" + return str(self._children[0].value) + + @property + def inner_value(self) -> str: + """Return the string content without the escaped quote delimiters.""" + raw = self.raw_value + # Strip leading \" and trailing \" + if raw.startswith('\\"') and raw.endswith('\\"'): + return raw[2:-2] + return raw + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize preserving escaped-quote delimiters for round-trip fidelity. + + Inside template directive expressions, strings are delimited by \\" + rather than plain ". We preserve these as \\" in serialized form so + the deserializer can reconstruct them correctly. 
+ """ + raw = self.raw_value + if options.strip_string_quotes: + return self.inner_value + return raw diff --git a/hcl2/rules/tokens.py b/hcl2/rules/tokens.py new file mode 100644 index 00000000..c182e62c --- /dev/null +++ b/hcl2/rules/tokens.py @@ -0,0 +1,157 @@ +"""Token classes for terminal elements in the LarkElement tree.""" + +from functools import lru_cache +from typing import Callable, Any, Dict, Type, Optional, Tuple + +from hcl2.rules.abstract import LarkToken + + +class StringToken(LarkToken): + """ + Single run-time base class; every `StringToken["..."]` call returns a + cached subclass whose static `lark_name()` yields the given string. + """ + + @staticmethod + def lark_name() -> str: + """Overridden by dynamic subclasses created via ``__class_getitem__``.""" + raise NotImplementedError( + "Use StringToken['NAME'] to create a concrete subclass" + ) + + @classmethod + @lru_cache(maxsize=None) + def __build_subclass(cls, name: str) -> Type["StringToken"]: + """Create a subclass with a constant `lark_name`.""" + return type( # type: ignore + f"{name}_TOKEN", + (StringToken,), + { + "__slots__": (), + "lark_name": staticmethod(lambda _n=name: _n), + }, + ) + + def __class_getitem__(cls, name: str) -> Type["StringToken"]: + """Return a cached subclass keyed by the given grammar token name.""" + if not isinstance(name, str): + raise TypeError("StringToken[...] 
expects a single str argument") + return cls.__build_subclass(name) + + @property + def serialize_conversion(self) -> Callable[[Any], str]: + """Return str as the conversion callable.""" + return str + + +class StaticStringToken(StringToken): + """A StringToken subclass with a fixed default value set at class-creation time.""" + + classes_by_value: Dict[Optional[str], Type["StringToken"]] = {} + + @classmethod + @lru_cache(maxsize=None) + def __build_subclass( + cls, name: str, default_value: Optional[str] = None + ) -> Type["StringToken"]: + """Create a subclass with a constant `lark_name` and default value.""" + + result = type( # type: ignore + f"{name}_TOKEN", + (cls,), + { + "__slots__": (), + "lark_name": staticmethod(lambda _n=name: _n), + "_default_value": default_value, + }, + ) + cls.classes_by_value[default_value] = result + return result + + def __class_getitem__( # type: ignore[override] + cls, name: Tuple[str, str] + ) -> Type["StringToken"]: + """Return a cached subclass keyed by a (token_name, default_value) tuple.""" + token_name, default_value = name + return cls.__build_subclass(token_name, default_value) + + def __init__(self): + super().__init__(getattr(self, "_default_value")) + + @property + def serialize_conversion(self) -> Callable[[Any], str]: + """Return str as the conversion callable.""" + return str + + +# Explicitly define various kinds of string-based tokens for type hinting. +# mypy cannot follow the dynamic __class_getitem__ pattern, so every alias +# in this block carries a blanket ``type: ignore``. 
+# pylint: disable=invalid-name + +# variable values +NAME = StringToken["NAME"] # type: ignore +STRING_CHARS = StringToken["STRING_CHARS"] # type: ignore +ESCAPED_INTERPOLATION = StringToken["ESCAPED_INTERPOLATION"] # type: ignore +ESCAPED_DIRECTIVE = StringToken["ESCAPED_DIRECTIVE"] # type: ignore +TEMPLATE_STRING = StringToken["TEMPLATE_STRING"] # type: ignore +BINARY_OP = StringToken["BINARY_OP"] # type: ignore +HEREDOC_TEMPLATE = StringToken["HEREDOC_TEMPLATE"] # type: ignore +HEREDOC_TRIM_TEMPLATE = StringToken["HEREDOC_TRIM_TEMPLATE"] # type: ignore +NL_OR_COMMENT = StringToken["NL_OR_COMMENT"] # type: ignore +# static values +EQ = StaticStringToken[("EQ", "=")] # type: ignore +COLON = StaticStringToken[("COLON", ":")] # type: ignore +LPAR = StaticStringToken[("LPAR", "(")] # type: ignore +RPAR = StaticStringToken[("RPAR", ")")] # type: ignore +LBRACE = StaticStringToken[("LBRACE", "{")] # type: ignore +RBRACE = StaticStringToken[("RBRACE", "}")] # type: ignore +DOT = StaticStringToken[("DOT", ".")] # type: ignore +COMMA = StaticStringToken[("COMMA", ",")] # type: ignore +ELLIPSIS = StaticStringToken[("ELLIPSIS", "...")] # type: ignore +QMARK = StaticStringToken[("QMARK", "?")] # type: ignore +LSQB = StaticStringToken[("LSQB", "[")] # type: ignore +RSQB = StaticStringToken[("RSQB", "]")] # type: ignore +INTERP_START = StaticStringToken[("INTERP_START", "${")] # type: ignore +DIRECTIVE_START = StaticStringToken[("DIRECTIVE_START", "%{")] # type: ignore +STRIP_MARKER = StaticStringToken[("STRIP_MARKER", "~")] # type: ignore +DBLQUOTE = StaticStringToken[("DBLQUOTE", '"')] # type: ignore +ATTR_SPLAT = StaticStringToken[("ATTR_SPLAT", ".*")] # type: ignore +FULL_SPLAT = StaticStringToken[("FULL_SPLAT", "[*]")] # type: ignore +FOR = StaticStringToken[("FOR", "for")] # type: ignore +IN = StaticStringToken[("IN", "in")] # type: ignore +IF = StaticStringToken[("IF", "if")] # type: ignore +FOR_OBJECT_ARROW = StaticStringToken[("FOR_OBJECT_ARROW", "=>")] # type: 
ignore +ELSE = StaticStringToken[("ELSE", "else")] # type: ignore +ENDIF = StaticStringToken[("ENDIF", "endif")] # type: ignore +ENDFOR = StaticStringToken[("ENDFOR", "endfor")] # type: ignore + +# pylint: enable=invalid-name + + +class IntLiteral(LarkToken): + """Token for integer literal values.""" + + @staticmethod + def lark_name() -> str: + """Return the grammar token name.""" + return "INT_LITERAL" + + @property + def serialize_conversion(self) -> Callable: + """Return int as the conversion callable.""" + return int + + +class FloatLiteral(LarkToken): + """Token for floating-point literal values.""" + + @staticmethod + def lark_name() -> str: + """Return the grammar token name.""" + return "FLOAT_LITERAL" + + @property + def serialize_conversion(self) -> Callable: + """Return float as the conversion callable.""" + return float diff --git a/hcl2/rules/whitespace.py b/hcl2/rules/whitespace.py new file mode 100644 index 00000000..9cb464f7 --- /dev/null +++ b/hcl2/rules/whitespace.py @@ -0,0 +1,111 @@ +"""Rule classes for whitespace, comments, and inline comment handling.""" + +from abc import ABC +from typing import Optional, List, Any + +from hcl2.rules.abstract import LarkRule +from hcl2.rules.literal_rules import TokenRule +from hcl2.rules.tokens import NL_OR_COMMENT +from hcl2.utils import SerializationOptions, SerializationContext + + +class NewLineOrCommentRule(TokenRule): + """Rule for newline and comment tokens.""" + + @staticmethod + def lark_name() -> str: + """Return the grammar rule name.""" + return "new_line_or_comment" + + @classmethod + def from_string(cls, string: str) -> "NewLineOrCommentRule": + """Create an instance from a raw comment or newline string.""" + return cls([NL_OR_COMMENT(string)]) # type: ignore[abstract] # pylint: disable=abstract-class-instantiated + + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ) -> Any: + """Serialize to the raw comment/newline string.""" + return 
"".join(child.serialize() for child in self._children) + + @property + def is_inline(self) -> bool: + """True if this comment is on the same line as preceding code. + + A raw string starting with ``\\n`` means the comment sits on its own + line (standalone). One starting with ``#``, ``//``, or ``/*`` is + inline — it follows code on the same line. + """ + return not self.serialize().startswith("\n") + + def to_list( + self, options: SerializationOptions = SerializationOptions() + ) -> Optional[List[dict]]: + """Extract comment objects, or None if only a newline.""" + raw = self.serialize(options) + if raw == "\n": + return None + + stripped = raw.strip() + + # Block comments: keep as a single value + if stripped.startswith("/*") and stripped.endswith("*/"): + text = stripped[2:-2].strip() + if text: + return [{"value": text}] + return None + + # Line comments: one value per line + result = [] + for line in raw.split("\n"): + line = line.strip() + + for delimiter in ("//", "#"): + if line.startswith(delimiter): + line = line[len(delimiter) :] + break + + if line != "": + result.append({"value": line.strip()}) + + return result + + +class InlineCommentMixIn(LarkRule, ABC): + """Mixin for rules that may contain inline comments among their children.""" + + def _insert_optionals(self, children: List, indexes: Optional[List[int]] = None): + """Insert None placeholders at expected optional-child positions.""" + if indexes is None: + return + for index in indexes: + try: + child = children[index] + except IndexError: + children.insert(index, None) + else: + if not isinstance(child, NewLineOrCommentRule): + children.insert(index, None) + + def inline_comments(self): + """Collect all inline comment strings from this rule's children.""" + result = [] + for child in self._children: + + if isinstance(child, NewLineOrCommentRule): + comments = child.to_list() + if comments is not None: + result.extend(comments) + + elif isinstance(child, InlineCommentMixIn): + 
result.extend(child.inline_comments()) + + return result + + def absorbed_comments(self): + """Return body-level comments absorbed by grammar into this expression. + + Default: empty. ``BinaryOpRule`` overrides this because its trailing + ``new_line_or_comment?`` can swallow the next body-level comment. + """ + return [] diff --git a/hcl2/transformer.py b/hcl2/transformer.py index 382092d6..73146514 100644 --- a/hcl2/transformer.py +++ b/hcl2/transformer.py @@ -1,399 +1,424 @@ -"""A Lark Transformer for transforming a Lark parse tree into a Python dict""" -import json -import re -import sys -from collections import namedtuple -from typing import List, Dict, Any - -from lark import Token +"""Transform Lark parse trees into typed LarkElement rule trees.""" + +# pylint: disable=missing-function-docstring,unused-argument +from lark import Token, Tree, v_args, Transformer, Discard from lark.tree import Meta -from lark.visitors import Transformer, Discard, _DiscardType, v_args -from .reconstructor import reverse_quotes_within_interpolation +from hcl2.rules.base import ( + StartRule, + BodyRule, + BlockRule, + AttributeRule, +) +from hcl2.rules.containers import ( + ObjectRule, + ObjectElemRule, + ObjectElemKeyRule, + TupleRule, + ObjectElemKeyExpressionRule, +) +from hcl2.rules.expressions import ( + BinaryTermRule, + UnaryOpRule, + BinaryOpRule, + ExprTermRule, + ConditionalRule, +) +from hcl2.rules.for_expressions import ( + ForTupleExprRule, + ForObjectExprRule, + ForIntroRule, + ForCondRule, +) +from hcl2.rules.functions import ArgumentsRule, FunctionCallRule +from hcl2.rules.indexing import ( + IndexExprTermRule, + SqbIndexRule, + ShortIndexRule, + GetAttrRule, + GetAttrExprTermRule, + AttrSplatExprTermRule, + AttrSplatRule, + FullSplatRule, + FullSplatExprTermRule, +) +from hcl2.rules.literal_rules import ( + FloatLitRule, + IntLitRule, + IdentifierRule, + BinaryOperatorRule, + KeywordRule, +) +from hcl2.rules.strings import ( + InterpolationRule, + StringRule, + 
StringPartRule, + HeredocTemplateRule, + HeredocTrimTemplateRule, + TemplateStringRule, +) +from hcl2.rules.directives import ( + TemplateIfRule, + TemplateForRule, + TemplateIfStartRule, + TemplateElseRule, + TemplateEndifRule, + TemplateForStartRule, + TemplateEndforRule, +) +from hcl2.rules.tokens import ( + NAME, + IntLiteral, + FloatLiteral, + StringToken, + StaticStringToken, +) +from hcl2.rules.whitespace import NewLineOrCommentRule + + +class RuleTransformer(Transformer): + """Takes a syntax tree generated by the parser and + transforms it to a tree of LarkRule instances + """ + + with_meta: bool + def transform(self, tree: Tree) -> StartRule: + return super().transform(tree) -HEREDOC_PATTERN = re.compile(r"<<([a-zA-Z][a-zA-Z0-9._-]+)\n([\s\S]*)\1", re.S) -HEREDOC_TRIM_PATTERN = re.compile(r"<<-([a-zA-Z][a-zA-Z0-9._-]+)\n([\s\S]*)\1", re.S) + def __init__(self, discard_new_line_or_comments: bool = False): + super().__init__() + self.discard_new_line_or_comments = discard_new_line_or_comments + def __default_token__(self, token: Token) -> StringToken: + # TODO make this return StaticStringToken where applicable + value = token.value + # The EQ terminal /[ \t]*=(?!=|>)/ captures leading whitespace. + # Preserve it so the direct pipeline (to_lark → reconstruct) retains + # original alignment. The reconstructor skips its own space insertion + # when the EQ token already carries leading whitespace. -START_LINE = "__start_line__" -END_LINE = "__end_line__" + # Don't convert STRING_CHARS or ESCAPED_* tokens to static tokens. + # E.g., STRING_CHARS("=") must stay STRING_CHARS, not become EQ. 
+ if token.type in ("STRING_CHARS", "ESCAPED_INTERPOLATION", "ESCAPED_DIRECTIVE"): + return StringToken[token.type](value) # type: ignore[misc] + if value in StaticStringToken.classes_by_value: + return StaticStringToken.classes_by_value[value]() + return StringToken[token.type](value) # type: ignore[misc] -Attribute = namedtuple("Attribute", ("key", "value")) + # pylint: disable=C0103 + def FLOAT_LITERAL(self, token: Token) -> FloatLiteral: + return FloatLiteral(token.value) + # pylint: disable=C0103 + def NAME(self, token: Token) -> NAME: + return NAME(token.value) -# pylint: disable=missing-function-docstring,unused-argument -class DictTransformer(Transformer): - """Takes a syntax tree generated by the parser and - transforms it to a dict. - """ + # pylint: disable=C0103 + def INT_LITERAL(self, token: Token) -> IntLiteral: + return IntLiteral(token.value) - with_meta: bool + @v_args(meta=True) + def start(self, meta: Meta, args) -> StartRule: + return StartRule(args, meta) - @staticmethod - def is_type_keyword(value: str) -> bool: - return value in {"bool", "number", "string"} + @v_args(meta=True) + def body(self, meta: Meta, args) -> BodyRule: + return BodyRule(args, meta) + + @v_args(meta=True) + def block(self, meta: Meta, args) -> BlockRule: + return BlockRule(args, meta) + + @v_args(meta=True) + def attribute(self, meta: Meta, args) -> AttributeRule: + # _attribute_name is flattened, so args[0] may be KeywordRule or IdentifierRule + if isinstance(args[0], KeywordRule): + args[0] = IdentifierRule([NAME(args[0].token.value)], meta) + return AttributeRule(args, meta) + + @v_args(meta=True) + def new_line_or_comment( + self, meta: Meta, args + ): # -> NewLineOrCommentRule | Discard + if self.discard_new_line_or_comments: + return Discard + return NewLineOrCommentRule(args, meta) + + @v_args(meta=True) + def identifier(self, meta: Meta, args) -> IdentifierRule: + return IdentifierRule(args, meta) - def __init__(self, with_meta: bool = False): + 
@v_args(meta=True) + def keyword(self, meta: Meta, args) -> KeywordRule: + return KeywordRule(args, meta) + + @v_args(meta=True) + def int_lit(self, meta: Meta, args) -> IntLitRule: + return IntLitRule(args, meta) + + @v_args(meta=True) + def float_lit(self, meta: Meta, args) -> FloatLitRule: + return FloatLitRule(args, meta) + + @v_args(meta=True) + def string(self, meta: Meta, args) -> StringRule: + # Assemble flat directive parts into nested TemplateIfRule/TemplateForRule + args = self._assemble_directives(list(args), meta) + return StringRule(args, meta) + + def _assemble_directives(self, parts, meta: Meta): + """Assemble flat directive string_parts into nested template rules. + + Scans for TemplateIfStartRule/TemplateForStartRule within StringPartRules + and collects children up to matching endif/endfor, handling nesting. """ - :param with_meta: If set to true then adds `__start_line__` and `__end_line__` - parameters to the output dict. Default to false. + result = [] + i = 0 + while i < len(parts): + assembled, i = self._try_assemble_nested(parts, i, meta) + if assembled is not None: + result.append(StringPartRule([assembled], meta)) + else: + result.append(parts[i]) + i += 1 + return result + + def _try_assemble_nested(self, parts, idx, meta): + """If parts[idx] starts a directive, assemble and return (rule, next_idx). + + Returns (None, idx) if parts[idx] is not a directive opener. """ - self.with_meta = with_meta - super().__init__() + part = parts[idx] + if isinstance(part, StringPartRule): + content = part.content + if isinstance(content, TemplateIfStartRule): + return self._assemble_template_if(parts, idx, meta) + if isinstance(content, TemplateForStartRule): + return self._assemble_template_for(parts, idx, meta) + return None, idx + + def _collect_body(self, parts, start, end_types, meta): + """Collect body parts from start until a StringPartRule with end_types content. + + Recursively assembles nested directives. 
Returns (body_list, end_content, next_idx). + """ + body: list = [] + i = start + while i < len(parts): + part = parts[i] + if isinstance(part, StringPartRule) and isinstance(part.content, end_types): + return body, part.content, i + 1 + assembled, i = self._try_assemble_nested(parts, i, meta) + if assembled is not None: + body.append(StringPartRule([assembled], meta)) + else: + body.append(parts[i]) + i += 1 + return body, None, i + + def _assemble_template_if(self, parts, start_idx, meta: Meta): + """Assemble a TemplateIfRule from flat parts starting at start_idx.""" + if_start = parts[start_idx].content + # Collect if-body until else or endif + if_body, end, i = self._collect_body( + parts, start_idx + 1, (TemplateElseRule, TemplateEndifRule), meta + ) + else_rule = None + else_body = None + if isinstance(end, TemplateElseRule): + else_rule = end + else_body, end, i = self._collect_body(parts, i, (TemplateEndifRule,), meta) + if not isinstance(end, TemplateEndifRule): + raise RuntimeError("Unterminated template if directive") + return TemplateIfRule(if_start, if_body, else_rule, else_body, end, meta), i + + def _assemble_template_for(self, parts, start_idx, meta: Meta): + """Assemble a TemplateForRule from flat parts starting at start_idx.""" + for_start = parts[start_idx].content + body, end, i = self._collect_body( + parts, start_idx + 1, (TemplateEndforRule,), meta + ) + if not isinstance(end, TemplateEndforRule): + raise RuntimeError("Unterminated template for directive") + return TemplateForRule(for_start, body, end, meta), i - def float_lit(self, args: List) -> float: - value = "".join([self.to_tf_inline(arg) for arg in args]) - if "e" in value: - return self.to_string_dollar(value) - return float(value) + @v_args(meta=True) + def string_part(self, meta: Meta, args) -> StringPartRule: + return StringPartRule(args, meta) + + @v_args(meta=True) + def interpolation(self, meta: Meta, args) -> InterpolationRule: + return InterpolationRule(args, meta) - def 
int_lit(self, args: List) -> int: - return int("".join([self.to_tf_inline(arg) for arg in args])) + @v_args(meta=True) + def heredoc_template(self, meta: Meta, args) -> HeredocTemplateRule: + return HeredocTemplateRule(args, meta) + + @v_args(meta=True) + def heredoc_template_trim(self, meta: Meta, args) -> HeredocTrimTemplateRule: + return HeredocTrimTemplateRule(args, meta) + + @v_args(meta=True) + def expr_term(self, meta: Meta, args) -> ExprTermRule: + return ExprTermRule(args, meta) - def expr_term(self, args: List) -> Any: - args = self.strip_new_line_tokens(args) + def _extract_nl_prefix(self, token): + """Strip leading newlines from a token value. - if args[0] == "true": - return True - if args[0] == "false": - return False - if args[0] == "null": + If the token contains a newline prefix (from the postlexer merging a + line-continuation newline into the operator token), strip it and + return a NewLineOrCommentRule. Otherwise return None. + """ + value = str(token.value) + stripped = value.lstrip("\n \t") + if len(stripped) == len(value): + return None + nl_text = value[: len(value) - len(stripped)] + token.set_value(stripped) + if self.discard_new_line_or_comments: return None + return NewLineOrCommentRule.from_string(nl_text) - if args[0] == "(" and args[-1] == ")": - return "".join(str(arg) for arg in args) + @v_args(meta=True) + def conditional(self, meta: Meta, args) -> ConditionalRule: + # args: [condition, QMARK, NL?, if_true, NL?, COLON, NL?, if_false] + # QMARK is at index 1 — check for NL prefix from the postlexer + qmark_token = args[1] + nl_rule = self._extract_nl_prefix(qmark_token) + if nl_rule is not None: + args = list(args) + args.insert(1, nl_rule) + return ConditionalRule(args, meta) - return args[0] + @v_args(meta=True) + def binary_operator(self, meta: Meta, args) -> BinaryOperatorRule: + return BinaryOperatorRule(args, meta) - def index_expr_term(self, args: List) -> str: - args = self.strip_new_line_tokens(args) - return 
f"{args[0]}{args[1]}" + @v_args(meta=True) + def binary_term(self, meta: Meta, args) -> BinaryTermRule: + # args: [BinaryOperatorRule, NL?, ExprTermRule] + # The operator's token may contain a NL prefix from the postlexer + op_rule = args[0] + nl_rule = self._extract_nl_prefix(op_rule.token) + if nl_rule is not None: + args = [nl_rule] + list(args) + return BinaryTermRule(args, meta) - def index(self, args: List) -> str: - args = self.strip_new_line_tokens(args) - return f"[{args[0]}]" + @v_args(meta=True) + def unary_op(self, meta: Meta, args) -> UnaryOpRule: + return UnaryOpRule(args, meta) - def get_attr_expr_term(self, args: List) -> str: - return f"{args[0]}{args[1]}" + @v_args(meta=True) + def binary_op(self, meta: Meta, args) -> BinaryOpRule: + return BinaryOpRule(args, meta) - def get_attr(self, args: List) -> str: - return f".{args[0]}" + @v_args(meta=True) + def tuple(self, meta: Meta, args) -> TupleRule: + return TupleRule(args, meta) - def attr_splat_expr_term(self, args: List) -> str: - return f"{args[0]}{args[1]}" + @v_args(meta=True) + def object(self, meta: Meta, args) -> ObjectRule: + return ObjectRule(args, meta) - def attr_splat(self, args: List) -> str: - args_str = "".join(self.to_tf_inline(arg) for arg in args) - return f".*{args_str}" + @v_args(meta=True) + def object_elem(self, meta: Meta, args) -> ObjectElemRule: + return ObjectElemRule(args, meta) - def full_splat_expr_term(self, args: List) -> str: - return f"{args[0]}{args[1]}" + @v_args(meta=True) + def object_elem_key(self, meta: Meta, args): + expr = args[0] + # Simple literals (identifier, string, int, float) wrapped in ExprTermRule + if isinstance(expr, ExprTermRule) and len(expr.children) == 5: + inner = expr.children[2] # position 2 in [None, None, inner, None, None] + if isinstance( + inner, (IdentifierRule, StringRule, IntLitRule, FloatLitRule) + ): + return ObjectElemKeyRule([inner], meta) + # Any other expression (parenthesized or bare) + return 
ObjectElemKeyExpressionRule([expr], meta) - def full_splat(self, args: List) -> str: - args_str = "".join(self.to_tf_inline(arg) for arg in args) - return f"[*]{args_str}" + @v_args(meta=True) + def arguments(self, meta: Meta, args) -> ArgumentsRule: + return ArgumentsRule(args, meta) - def tuple(self, args: List) -> List: - return [self.to_string_dollar(arg) for arg in self.strip_new_line_tokens(args)] + @v_args(meta=True) + def function_call(self, meta: Meta, args) -> FunctionCallRule: + return FunctionCallRule(args, meta) - def object_elem(self, args: List) -> Dict: - # This returns a dict with a single key/value pair to make it easier to merge these - # into a bigger dict that is returned by the "object" function + @v_args(meta=True) + def index_expr_term(self, meta: Meta, args) -> IndexExprTermRule: + return IndexExprTermRule(args, meta) - key = str(args[0].children[0]) - if not re.match(r".*?(\${).*}.*", key): - # do not strip quotes of a interpolation string - key = self.strip_quotes(key) + @v_args(meta=True) + def braces_index(self, meta: Meta, args) -> SqbIndexRule: + return SqbIndexRule(args, meta) - value = self.to_string_dollar(args[2]) - return {key: value} + @v_args(meta=True) + def short_index(self, meta: Meta, args) -> ShortIndexRule: + return ShortIndexRule(args, meta) - def object_elem_key_dot_accessor(self, args: List) -> str: - return "".join(args) + @v_args(meta=True) + def get_attr(self, meta: Meta, args) -> GetAttrRule: + return GetAttrRule(args, meta) - def object_elem_key_expression(self, args: List) -> str: - return self.to_string_dollar("".join(args)) + @v_args(meta=True) + def get_attr_expr_term(self, meta: Meta, args) -> GetAttrExprTermRule: + return GetAttrExprTermRule(args, meta) - def object(self, args: List) -> Dict: - args = self.strip_new_line_tokens(args) - result: Dict[str, Any] = {} - for arg in args: - if ( - isinstance(arg, Token) and arg.type == "COMMA" - ): # skip optional comma at the end of object element - continue + 
@v_args(meta=True) + def attr_splat(self, meta: Meta, args) -> AttrSplatRule: + return AttrSplatRule(args, meta) - result.update(arg) - return result + @v_args(meta=True) + def attr_splat_expr_term(self, meta: Meta, args) -> AttrSplatExprTermRule: + return AttrSplatExprTermRule(args, meta) - def function_call(self, args: List) -> str: - args = self.strip_new_line_tokens(args) - args_str = "" - if len(args) > 1: - args_str = ", ".join( - [self.to_tf_inline(arg) for arg in args[1] if arg is not Discard] - ) - return f"{args[0]}({args_str})" - - def provider_function_call(self, args: List) -> str: - args = self.strip_new_line_tokens(args) - args_str = "" - if len(args) > 5: - args_str = ", ".join( - [self.to_tf_inline(arg) for arg in args[5] if arg is not Discard] - ) - provider_func = "::".join([args[0], args[2], args[4]]) - return f"{provider_func}({args_str})" - - def arguments(self, args: List) -> List: - return self.process_nulls(args) - - @v_args(meta=True) - def block(self, meta: Meta, args: List) -> Dict: - *block_labels, block_body = args - result: Dict[str, Any] = block_body - if self.with_meta: - result.update( - { - START_LINE: meta.line, - END_LINE: meta.end_line, - } - ) - - # create nested dict. i.e. 
{label1: {label2: {labelN: result}}} - for label in reversed(block_labels): - label_str = self.strip_quotes(label) - result = {label_str: result} + @v_args(meta=True) + def full_splat(self, meta: Meta, args) -> FullSplatRule: + return FullSplatRule(args, meta) - return result + @v_args(meta=True) + def full_splat_expr_term(self, meta: Meta, args) -> FullSplatExprTermRule: + return FullSplatExprTermRule(args, meta) - def attribute(self, args: List) -> Attribute: - key = str(args[0]) - if key.startswith('"') and key.endswith('"'): - key = key[1:-1] - value = self.to_string_dollar(args[2]) - return Attribute(key, value) - - def conditional(self, args: List) -> str: - args = self.strip_new_line_tokens(args) - args = self.process_nulls(args) - return f"{args[0]} ? {args[1]} : {args[2]}" - - def binary_op(self, args: List) -> str: - return " ".join( - [self.unwrap_string_dollar(self.to_tf_inline(arg)) for arg in args] - ) + @v_args(meta=True) + def for_tuple_expr(self, meta: Meta, args) -> ForTupleExprRule: + return ForTupleExprRule(args, meta) - def unary_op(self, args: List) -> str: - args = self.process_nulls(args) - return "".join([self.to_tf_inline(arg) for arg in args]) - - def binary_term(self, args: List) -> str: - args = self.strip_new_line_tokens(args) - args = self.process_nulls(args) - return " ".join([self.to_tf_inline(arg) for arg in args]) - - def body(self, args: List) -> Dict[str, List]: - # See https://github.com/hashicorp/hcl/blob/main/hclsyntax/spec.md#bodies - # --- - # A body is a collection of associated attributes and blocks. - # - # An attribute definition assigns a value to a particular attribute - # name within a body. Each distinct attribute name may be defined no - # more than once within a single body. - # - # A block creates a child body that is annotated with a block type and - # zero or more block labels. Blocks create a structural hierarchy which - # can be interpreted by the calling application. 
- # --- - # - # There can be more than one child body with the same block type and - # labels. This means that all blocks (even when there is only one) - # should be transformed into lists of blocks. - args = self.strip_new_line_tokens(args) - attributes = set() - result: Dict[str, Any] = {} - for arg in args: - if isinstance(arg, Attribute): - if arg.key in result: - raise RuntimeError(f"{arg.key} already defined") - result[arg.key] = arg.value - attributes.add(arg.key) - else: - # This is a block. - for key, value in arg.items(): - key = str(key) - if key in result: - if key in attributes: - raise RuntimeError(f"{key} already defined") - result[key].append(value) - else: - result[key] = [value] + @v_args(meta=True) + def for_object_expr(self, meta: Meta, args) -> ForObjectExprRule: + return ForObjectExprRule(args, meta) - return result + @v_args(meta=True) + def for_intro(self, meta: Meta, args) -> ForIntroRule: + return ForIntroRule(args, meta) - def start(self, args: List) -> Dict: - args = self.strip_new_line_tokens(args) - return args[0] - - def binary_operator(self, args: List) -> str: - return str(args[0]) - - def heredoc_template(self, args: List) -> str: - match = HEREDOC_PATTERN.match(str(args[0])) - if not match: - raise RuntimeError(f"Invalid Heredoc token: {args[0]}") - - trim_chars = "\n\t " - result = match.group(2).rstrip(trim_chars) - return f'"{result}"' - - def heredoc_template_trim(self, args: List) -> str: - # See https://github.com/hashicorp/hcl2/blob/master/hcl/hclsyntax/spec.md#template-expressions - # This is a special version of heredocs that are declared with "<<-" - # This will calculate the minimum number of leading spaces in each line of a heredoc - # and then remove that number of spaces from each line - match = HEREDOC_TRIM_PATTERN.match(str(args[0])) - if not match: - raise RuntimeError(f"Invalid Heredoc token: {args[0]}") - - trim_chars = "\n\t " - text = match.group(2).rstrip(trim_chars) - lines = text.split("\n") - - # calculate 
the min number of leading spaces in each line - min_spaces = sys.maxsize - for line in lines: - leading_spaces = len(line) - len(line.lstrip(" ")) - min_spaces = min(min_spaces, leading_spaces) - - # trim off that number of leading spaces from each line - lines = [line[min_spaces:] for line in lines] - - return '"%s"' % "\n".join(lines) - - def new_line_or_comment(self, args: List) -> _DiscardType: - return Discard - - def for_tuple_expr(self, args: List) -> str: - args = self.strip_new_line_tokens(args) - for_expr = " ".join([self.to_tf_inline(arg) for arg in args[1:-1]]) - return f"[{for_expr}]" - - def for_intro(self, args: List) -> str: - args = self.strip_new_line_tokens(args) - return " ".join([self.to_tf_inline(arg) for arg in args]) - - def for_cond(self, args: List) -> str: - args = self.strip_new_line_tokens(args) - return " ".join([self.to_tf_inline(arg) for arg in args]) - - def for_object_expr(self, args: List) -> str: - args = self.strip_new_line_tokens(args) - for_expr = " ".join([self.to_tf_inline(arg) for arg in args[1:-1]]) - # doubled curly braces stands for inlining the braces - # and the third pair of braces is for the interpolation - # e.g. f"{2 + 2} {{2 + 2}}" == "4 {2 + 2}" - return f"{{{for_expr}}}" - - def string(self, args: List) -> str: - return '"' + "".join(args) + '"' - - def string_part(self, args: List) -> str: - value = self.to_tf_inline(args[0]) - if value.startswith('"') and value.endswith('"'): - value = value[1:-1] - return value - - def interpolation(self, args: List) -> str: - return '"${' + str(args[0]) + '}"' - - def strip_new_line_tokens(self, args: List) -> List: - """ - Remove new line and Discard tokens. 
- The parser will sometimes include these in the tree so we need to strip them out here - """ - return [arg for arg in args if arg != "\n" and arg is not Discard] - - def is_string_dollar(self, value: str) -> bool: - if not isinstance(value, str): - return False - return value.startswith("${") and value.endswith("}") - - def to_string_dollar(self, value: Any) -> Any: - """Wrap a string in ${ and }""" - if not isinstance(value, str): - return value - # if it's already wrapped, pass it unmodified - if self.is_string_dollar(value): - return value - - if value.startswith('"') and value.endswith('"'): - value = str(value)[1:-1] - return self.process_escape_sequences(value) - - if self.is_type_keyword(value): - return value - - return f"${{{value}}}" - - def unwrap_string_dollar(self, value: str): - if self.is_string_dollar(value): - return value[2:-1] - return value - - def strip_quotes(self, value: Any) -> Any: - """Remove quote characters from the start and end of a string""" - if isinstance(value, str): - if value.startswith('"') and value.endswith('"'): - value = str(value)[1:-1] - return self.process_escape_sequences(value) - return value - - def process_escape_sequences(self, value: str) -> str: - """Process HCL escape sequences within quoted template expressions.""" - if isinstance(value, str): - # normal escape sequences - value = value.replace("\\n", "\n") - value = value.replace("\\r", "\r") - value = value.replace("\\t", "\t") - value = value.replace('\\"', '"') - value = value.replace("\\\\", "\\") - - # we will leave Unicode escapes (\uNNNN and \UNNNNNNNN) untouched - # for now, but this method can be extended in the future - return value - - def process_nulls(self, args: List) -> List: - return ["null" if arg is None else arg for arg in args] - - def to_tf_inline(self, value: Any) -> str: - """ - Converts complex objects (e.g.) 
dicts to an "inline" HCL syntax - for use in function calls and ${interpolation} strings - """ - if isinstance(value, dict): - dict_v = json.dumps(value) - return reverse_quotes_within_interpolation(dict_v) - if isinstance(value, list): - value = [self.to_tf_inline(item) for item in value] - return f"[{', '.join(value)}]" - if isinstance(value, bool): - return "true" if value else "false" - if isinstance(value, str): - return value - if isinstance(value, (int, float)): - return str(value) - if value is None: - return "None" - - raise RuntimeError(f"Invalid type to convert to inline HCL: {type(value)}") - - def identifier(self, value: Any) -> Any: - # Making identifier a token by capitalizing it to IDENTIFIER - # seems to return a token object instead of the str - # So treat it like a regular rule - # In this case we just convert the whole thing to a string - return str(value[0]) + @v_args(meta=True) + def for_cond(self, meta: Meta, args) -> ForCondRule: + return ForCondRule(args, meta) + + @v_args(meta=True) + def template_if_start(self, meta: Meta, args) -> TemplateIfStartRule: + return TemplateIfStartRule(args, meta) + + @v_args(meta=True) + def template_else(self, meta: Meta, args) -> TemplateElseRule: + return TemplateElseRule(args, meta) + + @v_args(meta=True) + def template_endif(self, meta: Meta, args) -> TemplateEndifRule: + return TemplateEndifRule(args, meta) + + @v_args(meta=True) + def template_for_start(self, meta: Meta, args) -> TemplateForStartRule: + return TemplateForStartRule(args, meta) + + @v_args(meta=True) + def template_endfor(self, meta: Meta, args) -> TemplateEndforRule: + return TemplateEndforRule(args, meta) + + @v_args(meta=True) + def template_string(self, meta: Meta, args) -> TemplateStringRule: + return TemplateStringRule(args, meta) diff --git a/hcl2/utils.py b/hcl2/utils.py new file mode 100644 index 00000000..d701ae25 --- /dev/null +++ b/hcl2/utils.py @@ -0,0 +1,94 @@ +"""Serialization options, context tracking, and string utility 
helpers.""" +import re +from contextlib import contextmanager +from dataclasses import dataclass, replace + +HEREDOC_PATTERN = re.compile(r"<<([a-zA-Z][a-zA-Z0-9._-]+)\n([\s\S]*)\1", re.S) +HEREDOC_TRIM_PATTERN = re.compile(r"<<-([a-zA-Z][a-zA-Z0-9._-]+)\n([\s\S]*)\1", re.S) + + +@dataclass +class SerializationOptions: + """Options controlling how LarkElement trees are serialized to Python dicts.""" + + # Include __comments__ and __inline_comments__ keys in the output. + with_comments: bool = True + # Add __start_line__ and __end_line__ metadata to each block/attribute. + with_meta: bool = False + # Serialize nested objects as inline HCL strings (e.g. "${{key = value}}") + # instead of Python dicts. + wrap_objects: bool = False + # Serialize tuples as inline HCL strings (e.g. "${[1, 2, 3]}") + # instead of Python lists. + wrap_tuples: bool = False + # Add __is_block__ markers to distinguish blocks from plain objects. + # Note: round-trip through from_dict/dumps is NOT supported WITHOUT this option. 
+ explicit_blocks: bool = True + # Keep heredoc syntax (< "SerializationContext": + """Return a new context with the given fields overridden.""" + return replace(self, **kwargs) + + @contextmanager + def modify(self, **kwargs): + """Context manager that temporarily mutates fields, restoring on exit.""" + original_values = {key: getattr(self, key) for key in kwargs} + + for key, value in kwargs.items(): + setattr(self, key, value) + + try: + yield + finally: + # Restore original values + for key, value in original_values.items(): + setattr(self, key, value) + + +def is_dollar_string(value: str) -> bool: + """Return True if value is a ${...} interpolation wrapper.""" + if not isinstance(value, str): + return False + return value.startswith("${") and value.endswith("}") + + +def to_dollar_string(value: str) -> str: + """Wrap value in ${...} if not already wrapped.""" + if not is_dollar_string(value): + return f"${{{value}}}" + return value + + +def unwrap_dollar_string(value: str) -> str: + """Strip the ${...} wrapper from value if present.""" + if is_dollar_string(value): + return value[2:-1] + return value + + +def wrap_into_parentheses(value: str) -> str: + """Wrap value in parentheses, preserving ${...} wrappers.""" + if is_dollar_string(value): + value = unwrap_dollar_string(value) + return to_dollar_string(f"({value})") + return f"({value})" diff --git a/hcl2/walk.py b/hcl2/walk.py new file mode 100644 index 00000000..be2e4bc0 --- /dev/null +++ b/hcl2/walk.py @@ -0,0 +1,62 @@ +"""Generic tree-walking primitives for the LarkElement IR tree.""" + +from typing import Callable, Iterator, Optional, Type, TypeVar + +from hcl2.rules.abstract import LarkElement, LarkRule +from hcl2.rules.whitespace import NewLineOrCommentRule + +T = TypeVar("T", bound=LarkElement) + + +def walk(node: LarkElement) -> Iterator[LarkElement]: + """Depth-first pre-order traversal yielding all nodes including tokens.""" + yield node + if isinstance(node, LarkRule): + for child in 
node.children: + if child is not None: + yield from walk(child) + + +def walk_rules(node: LarkElement) -> Iterator[LarkRule]: + """Walk yielding only LarkRule nodes (skip LarkTokens).""" + for element in walk(node): + if isinstance(element, LarkRule): + yield element + + +def walk_semantic(node: LarkElement) -> Iterator[LarkRule]: + """Walk yielding only semantic LarkRule nodes (skip tokens and whitespace/comments).""" + for element in walk_rules(node): + if not isinstance(element, NewLineOrCommentRule): + yield element + + +def find_all(node: LarkElement, rule_type: Type[T]) -> Iterator[T]: + """Find all descendants matching a rule class (semantic walk).""" + for element in walk_semantic(node): + if isinstance(element, rule_type): + yield element + + +def find_first(node: LarkElement, rule_type: Type[T]) -> Optional[T]: + """Find first descendant matching a rule class, or None.""" + for element in find_all(node, rule_type): + return element + return None + + +def find_by_predicate( + node: LarkElement, predicate: Callable[[LarkElement], bool] +) -> Iterator[LarkElement]: + """Find all descendants matching an arbitrary predicate.""" + for element in walk(node): + if predicate(element): + yield element + + +def ancestors(node: LarkElement) -> Iterator[LarkElement]: + """Walk up the parent chain (excludes node itself).""" + current = getattr(node, "_parent", None) + while current is not None: + yield current + current = getattr(current, "_parent", None) diff --git a/pylintrc b/pylintrc index edd28005..05707ffb 100644 --- a/pylintrc +++ b/pylintrc @@ -9,7 +9,7 @@ # Add to the black list. It should be a base name, not a # path. You may set this option multiple times. -ignore=CVS +ignore=CVS,version.py # Pickle collected data for later comparisons. persistent=yes @@ -46,7 +46,10 @@ load-plugins= # E1103: %s %r has no %r member (but some types could not be inferred) - fails to infer real members of types, e.g. 
in Celery # W0231: method from base class is not called - complains about not invoking empty __init__s in parents, which is annoying # R0921: abstract class not referenced, when in fact referenced from another egg -disable=F0401,E0611,E1101,W0212,W0703,R0801,R0901,W0511,E1103,W0231 +# C0415: import-outside-toplevel - needed for circular dep avoidance in query package +# W1113: keyword-arg-before-vararg - intentional API design (blocks(block_type=None, *labels)) +# R0912: too-many-branches - introspect schema builder needs the branches +disable=F0401,E0611,E1101,W0212,W0703,R0801,R0901,W0511,E1103,W0231,C0415,W1113,R0912,R0401 [REPORTS] diff --git a/pyproject.toml b/pyproject.toml index 4440461a..e5591815 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -9,14 +9,13 @@ license = {text = "MIT"} description = "A parser for HCL2" keywords = [] classifiers = [ - "Development Status :: 4 - Beta", + "Development Status :: 5 - Production/Stable", "Topic :: Software Development :: Libraries :: Python Modules", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Operating System :: OS Independent", "Programming Language :: Python", "Programming Language :: Python :: 3", - "Programming Language :: Python :: 3.7", "Programming Language :: Python :: 3.8", "Programming Language :: Python :: 3.9", "Programming Language :: Python :: 3.10", @@ -24,7 +23,7 @@ classifiers = [ "Programming Language :: Python :: 3.12", "Programming Language :: Python :: 3.13", ] -requires-python = ">=3.7.0" +requires-python = ">=3.8.0" dependencies = [ "lark>=1.1.5,<2.0", @@ -40,10 +39,12 @@ content-type = "text/markdown" Homepage = "https://github.com/amplify-education/python-hcl2" [project.scripts] -hcl2tojson = "hcl2.__main__:main" +hcl2tojson = "cli.hcl_to_json:main" +jsontohcl2 = "cli.json_to_hcl:main" +hq = "cli.hq:main" [tool.setuptools] -packages = ["hcl2"] +packages = ["hcl2", "hcl2.rules", "hcl2.query", "cli"] zip-safe = false include-package-data = true diff --git 
a/test/helpers/__init__.py b/test/helpers/__init__.py deleted file mode 100644 index ba33e308..00000000 --- a/test/helpers/__init__.py +++ /dev/null @@ -1,3 +0,0 @@ -""" -Helper functions for tests -""" diff --git a/test/helpers/hcl2_helper.py b/test/helpers/hcl2_helper.py deleted file mode 100644 index 5acee1e7..00000000 --- a/test/helpers/hcl2_helper.py +++ /dev/null @@ -1,21 +0,0 @@ -# pylint:disable=C0114,C0115,C0116 - -from lark import Tree - -from hcl2.parser import parser -from hcl2.transformer import DictTransformer - - -class Hcl2Helper: - @classmethod - def load(cls, syntax: str) -> Tree: - return parser().parse(syntax) - - @classmethod - def load_to_dict(cls, syntax) -> dict: - tree = cls.load(syntax) - return DictTransformer().transform(tree) - - @classmethod - def build_argument(cls, identifier: str, expression: str = '"expression"') -> str: - return f"{identifier} = {expression}" diff --git a/test/helpers/terraform-config-json/backend.json b/test/helpers/terraform-config-json/backend.json deleted file mode 100644 index 482838c7..00000000 --- a/test/helpers/terraform-config-json/backend.json +++ /dev/null @@ -1,40 +0,0 @@ -{ - "provider": [ - { - "aws": { - "region": "${var.region}" - } - }, - { - "aws": { - "region": "${(var.backup_region)}", - "alias": "backup" - } - } - ], - "terraform": [ - { - "required_version": "0.12" - }, - { - "backend": [ - { - "gcs": {} - } - ], - "required_providers": [ - { - "aws": { - "source": "hashicorp/aws" - }, - "null": { - "source": "hashicorp/null" - }, - "template": { - "source": "hashicorp/template" - } - } - ] - } - ] -} diff --git a/test/helpers/terraform-config-json/blocks.json b/test/helpers/terraform-config-json/blocks.json deleted file mode 100644 index 716ece56..00000000 --- a/test/helpers/terraform-config-json/blocks.json +++ /dev/null @@ -1,34 +0,0 @@ -{ - "block": [ - { - "a": 1 - }, - { - "label": { - "b": 2, - "nested_block_1": [ - { - "a": { - "foo": "bar" - } - }, - { - "a": { - "b": { - "bar": 
"foo" - } - } - }, - { - "foobar": "barfoo" - } - ], - "nested_block_2": [ - { - "barfoo": "foobar" - } - ] - } - } - ] -} diff --git a/test/helpers/terraform-config-json/cloudwatch.json b/test/helpers/terraform-config-json/cloudwatch.json deleted file mode 100644 index f9dafc99..00000000 --- a/test/helpers/terraform-config-json/cloudwatch.json +++ /dev/null @@ -1,28 +0,0 @@ -{ - "resource": [ - { - "aws_cloudwatch_event_rule": { - "aws_cloudwatch_event_rule": { - "name": "name", - "event_pattern": " {\n \"foo\": \"bar\",\n \"foo2\": \"EOF_CONFIG\"\n }" - } - } - }, - { - "aws_cloudwatch_event_rule": { - "aws_cloudwatch_event_rule2": { - "name": "name", - "event_pattern": "{\n \"foo\": \"bar\",\n \"foo2\": \"EOF_CONFIG\"\n}" - } - } - }, - { - "aws_cloudwatch_event_rule": { - "aws_cloudwatch_event_rule2": { - "name": "name", - "event_pattern": "${jsonencode(var.cloudwatch_pattern_deploytool)}" - } - } - } - ] -} diff --git a/test/helpers/terraform-config-json/data_sources.json b/test/helpers/terraform-config-json/data_sources.json deleted file mode 100644 index f159c937..00000000 --- a/test/helpers/terraform-config-json/data_sources.json +++ /dev/null @@ -1,12 +0,0 @@ -{ - "data": [ - { - "terraform_remote_state": { - "map": { - "for_each": "${{for s3_bucket_key in data.aws_s3_bucket_objects.remote_state_objects.keys : regex(local.remote_state_regex, s3_bucket_key)[\"account_alias\"] => s3_bucket_key if length(regexall(local.remote_state_regex, s3_bucket_key)) > 0}}", - "backend": "s3" - } - } - } - ] -} diff --git a/test/helpers/terraform-config-json/empty-heredoc.json b/test/helpers/terraform-config-json/empty-heredoc.json deleted file mode 100644 index c1989c0d..00000000 --- a/test/helpers/terraform-config-json/empty-heredoc.json +++ /dev/null @@ -1 +0,0 @@ -{"bar": ""} diff --git a/test/helpers/terraform-config-json/escapes.json b/test/helpers/terraform-config-json/escapes.json deleted file mode 100644 index 41c7d54f..00000000 --- 
a/test/helpers/terraform-config-json/escapes.json +++ /dev/null @@ -1,9 +0,0 @@ -{ - "block": [ - { - "block_with_newlines": { - "a": "line1\nline2" - } - } - ] -} diff --git a/test/helpers/terraform-config-json/iam.json b/test/helpers/terraform-config-json/iam.json deleted file mode 100644 index 8705360e..00000000 --- a/test/helpers/terraform-config-json/iam.json +++ /dev/null @@ -1,41 +0,0 @@ -{ - "data": [ - { - "aws_iam_policy_document": { - "policy": { - "statement": [ - { - "effect": "Deny", - "principals": [ - { - "type": "AWS", - "identifiers": [ - "*" - ] - } - ], - "actions": [ - "s3:PutObjectAcl" - ], - "resources": "${aws_s3_bucket.bucket.*.arn.bar}" - } - ] - } - } - }, - { - "aws_iam_policy_document": { - "s3_proxy_policy": { - "statement": [ - { - "actions": [ - "s3:GetObject" - ], - "resources": "${[for bucket_name in local.buckets_to_proxy : \"arn:aws:s3:::${bucket_name}/*\" if substr(bucket_name, 0, 1) == \"l\"]}" - } - ] - } - } - } - ] -} diff --git a/test/helpers/terraform-config-json/locals_embedded_condition.json b/test/helpers/terraform-config-json/locals_embedded_condition.json deleted file mode 100644 index 6c41e5e8..00000000 --- a/test/helpers/terraform-config-json/locals_embedded_condition.json +++ /dev/null @@ -1,11 +0,0 @@ -{ - "locals": [ - { - "terraform": { - "channels": "${(local.running_in_ci ? 
local.ci_channels : local.local_channels)}", - "authentication": [], - "foo": null - } - } - ] -} diff --git a/test/helpers/terraform-config-json/locals_embedded_function.json b/test/helpers/terraform-config-json/locals_embedded_function.json deleted file mode 100644 index 51cf6454..00000000 --- a/test/helpers/terraform-config-json/locals_embedded_function.json +++ /dev/null @@ -1,7 +0,0 @@ -{ - "locals": [ - { - "function_test": "${var.basename}-${var.forwarder_function_name}_${md5(\"${var.vpc_id}${data.aws_region.current.name}\")}" - } - ] -} diff --git a/test/helpers/terraform-config-json/locals_embedded_multi_function_nested.json b/test/helpers/terraform-config-json/locals_embedded_multi_function_nested.json deleted file mode 100644 index f210a087..00000000 --- a/test/helpers/terraform-config-json/locals_embedded_multi_function_nested.json +++ /dev/null @@ -1,8 +0,0 @@ -{ - "locals": [ - { - "multi_function": "${substr(split(\"-\", \"us-west-2\")[0], 0, 1)}", - "multi_function_embedded": "${substr(split(\"-\", \"us-west-2\")[0], 0, 1)}" - } - ] -} diff --git a/test/helpers/terraform-config-json/multiline_expressions.json b/test/helpers/terraform-config-json/multiline_expressions.json deleted file mode 100644 index 7f3405c0..00000000 --- a/test/helpers/terraform-config-json/multiline_expressions.json +++ /dev/null @@ -1,56 +0,0 @@ -{ - "resource": [ - { - "null_resource": { - "multiline_comment_multiline": { - "triggers": [] - } - } - }, - { - "null_resource": { - "multiline_comment_single_line_before_closing_bracket": { - "triggers": [] - } - } - }, - { - "null_resource": { - "multiline_comment_single_line_between_brackets": { - "triggers": [] - } - } - }, - { - "null_resource": { - "multiline_comment_single_line_after_opening_bracket": { - "triggers": [] - } - } - }, - { - "null_resource": { - "multiline_comment_multiple_single_element": { - "triggers": [ - 2 - ] - } - } - } - ], - "variable": [ - { - "some_var2": { - "description": "description", - "type": 
"string", - "default": "${cidrsubnets(\"10.0.0.0/24\", 2, 2)}" - } - }, - { - "some_var3": { - "description": "description", - "default": "${concat([{\"1\": \"1\"}], [{\"2\": \"2\"}])}" - } - } - ] -} diff --git a/test/helpers/terraform-config-json/nulls.json b/test/helpers/terraform-config-json/nulls.json deleted file mode 100644 index d4a9d448..00000000 --- a/test/helpers/terraform-config-json/nulls.json +++ /dev/null @@ -1 +0,0 @@ -{"terraform": {"unary": "${!null}", "binary": "${(a == null)}", "tuple": [null, 1, 2], "single": null, "conditional": "${null ? null : null}"}} diff --git a/test/helpers/terraform-config-json/provider_function.json b/test/helpers/terraform-config-json/provider_function.json deleted file mode 100644 index 2b749c13..00000000 --- a/test/helpers/terraform-config-json/provider_function.json +++ /dev/null @@ -1,8 +0,0 @@ -{ - "locals": [ - { - "name2": "${provider::test2::test(\"a\")}", - "name3": "${test(\"a\")}" - } - ] -} diff --git a/test/helpers/terraform-config-json/resource_keyword_attribute.json b/test/helpers/terraform-config-json/resource_keyword_attribute.json deleted file mode 100644 index 11ff88f9..00000000 --- a/test/helpers/terraform-config-json/resource_keyword_attribute.json +++ /dev/null @@ -1,16 +0,0 @@ -{ - "resource": [ - { - "custom_provider_resource": { - "resource_name": { - "name": "resource_name", - "attribute": "attribute_value", - "if" : "attribute_value2", - "in" : "attribute_value3", - "for" : "attribute_value4", - "for_each" : "attribute_value5" - } - } - } - ] -} diff --git a/test/helpers/terraform-config-json/route_table.json b/test/helpers/terraform-config-json/route_table.json deleted file mode 100644 index af21a922..00000000 --- a/test/helpers/terraform-config-json/route_table.json +++ /dev/null @@ -1,24 +0,0 @@ -{ - "resource": [ - { - "aws_route": { - "tgw": { - "count": "${(var.tgw_name == \"\" ? 
0 : var.number_of_az)}", - "route_table_id": "${aws_route_table.rt[count.index].id}", - "destination_cidr_block": "10.0.0.0/8", - "transit_gateway_id": "${data.aws_ec2_transit_gateway.tgw[0].id}" - } - } - }, - { - "aws_route": { - "tgw-dot-index": { - "count": "${(var.tgw_name == \"\" ? 0 : var.number_of_az)}", - "route_table_id": "${aws_route_table.rt[count.index].id}", - "destination_cidr_block": "10.0.0.0/8", - "transit_gateway_id": "${data.aws_ec2_transit_gateway.tgw[0].id}" - } - } - } - ] -} diff --git a/test/helpers/terraform-config-json/s3.json b/test/helpers/terraform-config-json/s3.json deleted file mode 100644 index d3318a21..00000000 --- a/test/helpers/terraform-config-json/s3.json +++ /dev/null @@ -1,47 +0,0 @@ -{ - "resource": [ - { - "aws_s3_bucket": { - "name": { - "bucket": "name", - "acl": "log-delivery-write", - "lifecycle_rule": [ - { - "id": "to_glacier", - "prefix": "", - "enabled": true, - "expiration": [ - { - "days": 365 - } - ], - "transition": { - "days": 30, - "storage_class": "GLACIER" - } - } - ], - "versioning": [ - { - "enabled": true - } - ] - } - } - } - ], - "module": [ - { - "bucket_name": { - "source": "s3_bucket_name", - "name": "audit", - "account": "${var.account}", - "region": "${var.region}", - "providers": { - "aws.ue1": "${aws}", - "aws.uw2.attribute": "${aws.backup}" - } - } - } - ] -} diff --git a/test/helpers/terraform-config-json/string_interpolations.json b/test/helpers/terraform-config-json/string_interpolations.json deleted file mode 100644 index 885baf89..00000000 --- a/test/helpers/terraform-config-json/string_interpolations.json +++ /dev/null @@ -1,13 +0,0 @@ -{ - "locals": [ - { - "simple_interpolation": "prefix:${var.foo}-suffix", - "embedded_interpolation": "(long substring without interpolation); ${module.special_constants.aws_accounts[\"aaa-${local.foo}-${local.bar}\"]}/us-west-2/key_foo", - "deeply_nested_interpolation": "prefix1-${\"prefix2-${\"prefix3-$${foo:bar}\"}\"}", - "escaped_interpolation": 
"prefix:$${aws:username}-suffix", - "simple_and_escaped": "${\"bar\"}$${baz:bat}", - "simple_and_escaped_reversed": "$${baz:bat}${\"bar\"}", - "nested_escaped": "bar-${\"$${baz:bat}\"}" - } - ] -} diff --git a/test/helpers/terraform-config-json/test_floats.json b/test/helpers/terraform-config-json/test_floats.json deleted file mode 100644 index 87ed65c3..00000000 --- a/test/helpers/terraform-config-json/test_floats.json +++ /dev/null @@ -1,30 +0,0 @@ -{ - "locals": [ - { - "simple_float": 123.456, - "small_float": 0.123, - "large_float": 9876543.21, - "negative_float": -42.5, - "negative_small": -0.001, - "scientific_positive": "${1.23e5}", - "scientific_negative": "${9.87e-3}", - "scientific_large": "${6.022e+23}", - "integer_as_float": 100.0, - "float_calculation": "${105e+2 * 3.0 / 2.1}", - "float_comparison": "${5e1 > 2.3 ? 1.0 : 0.0}", - "float_list": [ - 1.1, - 2.2, - 3.3, - -4.4, - "${5.5e2}" - ], - "float_object": { - "pi": 3.14159, - "euler": 2.71828, - "sqrt2": 1.41421, - "scientific": "${-123e+2}" - } - } - ] -} diff --git a/test/helpers/terraform-config-json/unicode_strings.json b/test/helpers/terraform-config-json/unicode_strings.json deleted file mode 100644 index 8eedf932..00000000 --- a/test/helpers/terraform-config-json/unicode_strings.json +++ /dev/null @@ -1,20 +0,0 @@ -{ - "locals": [ - { - "basic_unicode": "Hello, 世界! こんにちは Привет नमस्ते", - "unicode_escapes": "© ♥ ♪ ☠ ☺", - "emoji_string": "🚀 🌍 🔥 🎉", - "rtl_text": "English and العربية text mixed", - "complex_unicode": "Python (파이썬) es 很棒的! 
♥ αβγδ", - "ascii": "ASCII: abc123", - "emoji": "Emoji: 🚀🌍🔥🎉", - "math": "Math: ∑∫√∞≠≤≥", - "currency": "Currency: £€¥₹₽₩", - "arrows": "Arrows: ←↑→↓↔↕", - "cjk": "CJK: 你好世界안녕하세요こんにちは", - "cyrillic": "Cyrillic: Привет мир", - "special": "Special: ©®™§¶†‡", - "mixed_content": "Line with interpolation: ${var.name}\nLine with emoji: 👨‍👩‍👧‍👦\nLine with quotes: \"quoted text\"\nLine with backslash: \\escaped" - } - ] -} diff --git a/test/helpers/terraform-config-json/variables.json b/test/helpers/terraform-config-json/variables.json deleted file mode 100644 index d344902c..00000000 --- a/test/helpers/terraform-config-json/variables.json +++ /dev/null @@ -1,117 +0,0 @@ -{ - "variable": [ - { - "region": {} - }, - { - "account": {} - }, - { - "azs": { - "default": { - "us-west-1": "us-west-1c,us-west-1b", - "us-west-2": "us-west-2c,us-west-2b,us-west-2a", - "us-east-1": "us-east-1c,us-east-1b,us-east-1a", - "eu-central-1": "eu-central-1a,eu-central-1b,eu-central-1c", - "sa-east-1": "sa-east-1a,sa-east-1c", - "ap-northeast-1": "ap-northeast-1a,ap-northeast-1c,ap-northeast-1d", - "ap-southeast-1": "ap-southeast-1a,ap-southeast-1b,ap-southeast-1c", - "ap-southeast-2": "ap-southeast-2a,ap-southeast-2b,ap-southeast-2c" - } - } - }, - { - "options": { - "type": "string", - "default": {} - } - }, - { - "var_with_validation": { - "type": "${list(object({\"id\": \"string\", \"nested\": \"${list(object({\"id\": \"string\", \"type\": \"string\"}))}\"}))}", - "validation": [ - { - "condition": "${!contains([for v in flatten(var.var_with_validation[*].id) : can(regex(\"^(A|B)$\", v))], false)}", - "error_message": "The property `id` must be one of value [A, B]." - }, - { - "condition": "${!contains([for v in flatten(var.var_with_validation[*].nested[*].type) : can(regex(\"^(A|B)$\", v))], false)}", - "error_message": "The property `nested.type` must be one of value [A, B]." 
- } - ] - } - } - ], - "locals": [ - { - "foo": "${var.account}_bar", - "bar": { - "baz": 1, - "${(var.account)}": 2, - "${(format(\"key_prefix_%s\", local.foo))}": 3, - "\"prefix_${var.account}:${var.user}_suffix\"": "interpolation" - }, - "tuple": ["${local.foo}"], - "empty_tuple": [] - }, - { - "route53_forwarding_rule_shares": "${{for forwarding_rule_key in keys(var.route53_resolver_forwarding_rule_shares) : \"${forwarding_rule_key}\" => {\"aws_account_ids\": \"${[for account_name in var.route53_resolver_forwarding_rule_shares[forwarding_rule_key].aws_account_names : module.remote_state_subaccounts.map[account_name].outputs[\"aws_account_id\"]]}\"} ...}}", - "has_valid_forwarding_rules_template_inputs": "${(length(keys(var.forwarding_rules_template.copy_resolver_rules)) > 0 && length(var.forwarding_rules_template.replace_with_target_ips) > 0 && length(var.forwarding_rules_template.exclude_cidrs) > 0)}", - "for_whitespace": "${{for i in [1, 2, 3] : i => i ...}}" - }, - { - "nested_data": [ - { - "id": 1, - "nested": [ - { - "id": "a", - "again": [ - { - "id": "a1" - }, - { - "id": "b1" - } - ] - }, - { - "id": "c" - } - ] - }, - { - "id": 1, - "nested": [ - { - "id": "a", - "again": [ - { - "id": "a2" - }, - { - "id": "b2" - } - ] - }, - { - "id": "b", - "again": [ - { - "id": "a" - }, - { - "id": "b" - } - ] - } - ] - } - ], - "ids_level_1": "${distinct(local.nested_data[*].id)}", - "ids_level_2": "${flatten(local.nested_data[*].nested[*].id)}", - "ids_level_3": "${flatten(local.nested_data[*].nested[*].again[*][0].foo.bar[0])}", - "bindings_by_role": "${distinct(flatten([for name in local.real_entities : [for role , members in var.bindings : {\"name\": \"${name}\", \"role\": \"${role}\", \"members\": \"${members}\"}]]))}" - } - ] -} diff --git a/test/helpers/terraform-config-json/vars.auto.json b/test/helpers/terraform-config-json/vars.auto.json deleted file mode 100644 index e8ead394..00000000 --- a/test/helpers/terraform-config-json/vars.auto.json +++ 
/dev/null @@ -1,7 +0,0 @@ -{ - "foo": "bar", - "arr": [ - "foo", - "bar" - ] -} diff --git a/test/helpers/terraform-config/backend.tf b/test/helpers/terraform-config/backend.tf deleted file mode 100644 index bd22a869..00000000 --- a/test/helpers/terraform-config/backend.tf +++ /dev/null @@ -1,31 +0,0 @@ -// test new line braces style -provider "aws" -{ - region = var.region -} - -# another comment -provider "aws" { - region = (var.backup_region) - alias = "backup" -} - -/* -one last comment -*/ -terraform { required_version = "0.12" } - -terraform { - backend "gcs" {} - required_providers { - aws = { - source = "hashicorp/aws", - } - null = { - source = "hashicorp/null", - } - template = { - source = "hashicorp/template", - } - } -} diff --git a/test/helpers/terraform-config/blocks.tf b/test/helpers/terraform-config/blocks.tf deleted file mode 100644 index bd8e5159..00000000 --- a/test/helpers/terraform-config/blocks.tf +++ /dev/null @@ -1,22 +0,0 @@ -block { - a = 1 -} - -block "label" { - b = 2 - nested_block_1 "a" { - foo = "bar" - } - - nested_block_1 "a" "b" { - bar = "foo" - } - - nested_block_1 { - foobar = "barfoo" - } - - nested_block_2 { - barfoo = "foobar" - } -} diff --git a/test/helpers/terraform-config/cloudwatch.tf b/test/helpers/terraform-config/cloudwatch.tf deleted file mode 100644 index 8928b810..00000000 --- a/test/helpers/terraform-config/cloudwatch.tf +++ /dev/null @@ -1,24 +0,0 @@ -resource "aws_cloudwatch_event_rule" "aws_cloudwatch_event_rule" { - name = "name" - event_pattern = < s3_bucket_key - if length(regexall(local.remote_state_regex, s3_bucket_key)) > 0 - } - backend = "s3" -} diff --git a/test/helpers/terraform-config/empty-heredoc.hcl2 b/test/helpers/terraform-config/empty-heredoc.hcl2 deleted file mode 100644 index c701dac2..00000000 --- a/test/helpers/terraform-config/empty-heredoc.hcl2 +++ /dev/null @@ -1,2 +0,0 @@ -bar = < { - aws_account_ids = [ - for account_name in var.route53_resolver_forwarding_rule_shares[ - 
forwarding_rule_key - ].aws_account_names : - module.remote_state_subaccounts.map[account_name].outputs["aws_account_id"] - ] - } - ... - } - has_valid_forwarding_rules_template_inputs = ( - length(keys(var.forwarding_rules_template.copy_resolver_rules)) > 0 - && length(var.forwarding_rules_template.replace_with_target_ips) > 0 && - length(var.forwarding_rules_template.exclude_cidrs) > 0 - ) - - for_whitespace = { for i in [1, 2, 3] : - i => - i ... - } -} - -locals { - nested_data = [ - { - id = 1, - nested = [ - { - id = "a" - again = [ - { id = "a1" }, - { id = "b1" } - ] - }, - { id = "c" } - ] - }, - { - id = 1 - nested = [ - { - id = "a" - again = [ - { id = "a2" }, - { id = "b2" } - ] - }, - { - id = "b" - again = [ - { id = "a" }, - { id = "b" } - ] - } - ] - } - ] - - ids_level_1 = distinct(local.nested_data[*].id) - ids_level_2 = flatten(local.nested_data[*].nested[*].id) - ids_level_3 = flatten(local.nested_data[*].nested[*].again[*][0].foo.bar[0]) - bindings_by_role = distinct(flatten([ - for name in local.real_entities - : [ - for role, members in var.bindings - : { name = name, role = role, members = members } - ] - ])) -} diff --git a/test/helpers/terraform-config/vars.auto.tfvars b/test/helpers/terraform-config/vars.auto.tfvars deleted file mode 100644 index 9fd3a49d..00000000 --- a/test/helpers/terraform-config/vars.auto.tfvars +++ /dev/null @@ -1,2 +0,0 @@ -foo = "bar" -arr = ["foo", "bar"] diff --git a/test/helpers/with-meta/data_sources.json b/test/helpers/with-meta/data_sources.json deleted file mode 100644 index f04e0ff9..00000000 --- a/test/helpers/with-meta/data_sources.json +++ /dev/null @@ -1,14 +0,0 @@ -{ - "data": [ - { - "terraform_remote_state": { - "map": { - "for_each": "${{for s3_bucket_key in data.aws_s3_bucket_objects.remote_state_objects.keys : regex(local.remote_state_regex, s3_bucket_key)[\"account_alias\"] => s3_bucket_key if length(regexall(local.remote_state_regex, s3_bucket_key)) > 0}}", - "backend": "s3", - 
"__start_line__": 1, - "__end_line__": 8 - } - } - } - ] -} diff --git a/test/helpers/with-meta/data_sources.tf b/test/helpers/with-meta/data_sources.tf deleted file mode 100644 index 8e4cc25a..00000000 --- a/test/helpers/with-meta/data_sources.tf +++ /dev/null @@ -1,8 +0,0 @@ -data "terraform_remote_state" "map" { - for_each = { - for s3_bucket_key in data.aws_s3_bucket_objects.remote_state_objects.keys : - regex(local.remote_state_regex, s3_bucket_key)["account_alias"] => s3_bucket_key - if length(regexall(local.remote_state_regex, s3_bucket_key)) > 0 - } - backend = "s3" -} diff --git a/test/integration/__init__.py b/test/integration/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/test/helpers/terraform-config/test_floats.tf b/test/integration/hcl2_original/floats.tf similarity index 100% rename from test/helpers/terraform-config/test_floats.tf rename to test/integration/hcl2_original/floats.tf diff --git a/test/integration/hcl2_original/function_objects.tf b/test/integration/hcl2_original/function_objects.tf new file mode 100644 index 00000000..d9e733b7 --- /dev/null +++ b/test/integration/hcl2_original/function_objects.tf @@ -0,0 +1,22 @@ +variable "object" { + type = object({ + key = string + value = string + }) +} + +variable "nested" { + type = map(object({ + name = string + enabled = bool + })) +} + +variable "multi_arg" { + default = merge({ + a = 1 + b = 2 + }, { + c = 3 + }) +} diff --git a/test/integration/hcl2_original/function_tuples.tf b/test/integration/hcl2_original/function_tuples.tf new file mode 100644 index 00000000..8e60bcea --- /dev/null +++ b/test/integration/hcl2_original/function_tuples.tf @@ -0,0 +1,17 @@ +resource "octopusdeploy_process_templated_step" "step" { + parameters = { + "Octopus.Action.Aws.IamCapabilities" = jsonencode([ + "CAPABILITY_AUTO_EXPAND", + "CAPABILITY_IAM", + "CAPABILITY_NAMED_IAM", + ]) + } +} + +variable "list" { + default = toset([ + "a", + "b", + "c", + ]) +} diff --git 
a/test/helpers/terraform-config/nulls.tf b/test/integration/hcl2_original/nulls.tf similarity index 100% rename from test/helpers/terraform-config/nulls.tf rename to test/integration/hcl2_original/nulls.tf diff --git a/test/integration/hcl2_original/object_keys.tf b/test/integration/hcl2_original/object_keys.tf new file mode 100644 index 00000000..c3f33146 --- /dev/null +++ b/test/integration/hcl2_original/object_keys.tf @@ -0,0 +1,11 @@ +bar = { + 0: 0, + "foo": 1 + baz : 2, + (var.account) : 3 + (format("key_prefix_%s", local.foo)) : 4 + "prefix_${var.account}:${var.user}_suffix": 5, + 1 + 1 = "two", + (2 + 2) = "four", + format("key_%s", var.name) = "dynamic" +} diff --git a/test/integration/hcl2_original/operators.tf b/test/integration/hcl2_original/operators.tf new file mode 100644 index 00000000..f8351161 --- /dev/null +++ b/test/integration/hcl2_original/operators.tf @@ -0,0 +1,15 @@ +locals { + addition_1 = ((a + b) + c) + addition_2 = a + b + addition_3 = (a + b) + eq_before_and = var.env == "prod" && var.debug + and_before_ternary = true && true ? 1 : 0 + mixed_arith_cmp = var.a + var.b * var.c > 10 + full_chain = a + b == c && d || e + left_assoc_sub = a - b - c + left_assoc_mul_div = (a * b) / c + nested_ternary = (a ? b : c) ? d : e + unary_precedence = !a && b + neg_precedence = (-a) + b + neg_parentheses = -(a + b) +} diff --git a/test/helpers/terraform-config/resource_keyword_attribute.tf b/test/integration/hcl2_original/resource_keyword_attribute.tf similarity index 100% rename from test/helpers/terraform-config/resource_keyword_attribute.tf rename to test/integration/hcl2_original/resource_keyword_attribute.tf diff --git a/test/integration/hcl2_original/smoke.tf b/test/integration/hcl2_original/smoke.tf new file mode 100644 index 00000000..3e10e856 --- /dev/null +++ b/test/integration/hcl2_original/smoke.tf @@ -0,0 +1,93 @@ + +block label1 label2 { + a = 5 + b = 1256.5 + c = 15 + (10 * 12) + d = (-a) + e = ( + a == b + ? 
true : false + ) + f = "${"this is a string"}" + g = 1 == 2 + h = { + k1 = 5, + k2 = 10 + , + "k3" = {k4 = "a"} + (5 + 5) = "d" + k5.attr.attr = "e" + } + i = [ + a, b + , + "c${aaa}", + d, + [1, 2, 3,], + f(a), + provider::func::aa(5) + + ] + j = func( + a, b + , c, + d ... + + ) + k = a.b.5 + l = a.*.b + m = a[*][c].a.*.1 + + block b1 { + a = 1 + } +} + +block multiline_ternary { + foo = ( + bar + ? baz(foo) + : foo == "bar" + ? "baz" + : foo + ) +} + +block multiline_binary_ops { + expr = { + for k, v in local.map_a : k => v + if lookup(local.map_b[v.id + ], "enabled", false) + || ( + contains(local.map_c, v.id) + && contains(local.map_d, v.id) + ) + } +} + +block binary_op_before_unary { + dedup_keys_layer7 = { + for k, v in local.action_keys_layer7 : + k => v + if !contains(keys(local.dedup_keys_layer8), k) + && !contains(keys(local.dedup_keys_layer9), k) + && !contains(keys(local.dedup_keys_layer10), k) + } +} + +block { + route53_forwarding_rule_shares = { + for forwarding_rule_key in keys(var.route53_resolver_forwarding_rule_shares) : + "${forwarding_rule_key}" => { + aws_account_ids = [ + for account_name in var.route53_resolver_forwarding_rule_shares[ + forwarding_rule_key + ].aws_account_names : + module.remote_state_subaccounts.map[account_name].outputs["aws_account_id"] + ] + } + ... 
+ if + substr(bucket_name, 0, 1) == "l" + } +} diff --git a/test/helpers/terraform-config/string_interpolations.tf b/test/integration/hcl2_original/string_interpolations.tf similarity index 68% rename from test/helpers/terraform-config/string_interpolations.tf rename to test/integration/hcl2_original/string_interpolations.tf index 582b4aac..f9ac4e18 100644 --- a/test/helpers/terraform-config/string_interpolations.tf +++ b/test/integration/hcl2_original/string_interpolations.tf @@ -1,6 +1,6 @@ -locals { - simple_interpolation = "prefix:${var.foo}-suffix" - embedded_interpolation = "(long substring without interpolation); ${module.special_constants.aws_accounts["aaa-${local.foo}-${local.bar}"]}/us-west-2/key_foo" +block label1 label3 { + simple_interpolation = "prefix:${var}-suffix" + embedded_interpolation = "(long substring without interpolation); ${"aaa-${local}-${local}"}/us-west-2/key_foo" deeply_nested_interpolation = "prefix1-${"prefix2-${"prefix3-$${foo:bar}"}"}" escaped_interpolation = "prefix:$${aws:username}-suffix" simple_and_escaped = "${"bar"}$${baz:bat}" diff --git a/test/helpers/terraform-config/unicode_strings.tf b/test/integration/hcl2_original/unicode_strings.tf similarity index 100% rename from test/helpers/terraform-config/unicode_strings.tf rename to test/integration/hcl2_original/unicode_strings.tf diff --git a/test/integration/hcl2_reconstructed/floats.tf b/test/integration/hcl2_reconstructed/floats.tf new file mode 100644 index 00000000..23dc46fe --- /dev/null +++ b/test/integration/hcl2_reconstructed/floats.tf @@ -0,0 +1,26 @@ +locals { + simple_float = 123.456 + small_float = 0.123 + large_float = 9876543.21 + negative_float = -42.5 + negative_small = -0.001 + scientific_positive = 1.23e5 + scientific_negative = 9.87e-3 + scientific_large = 6.022e+23 + integer_as_float = 100.0 + float_calculation = 105e+2 * 3.0 / 2.1 + float_comparison = 5e1 > 2.3 ? 
1.0 : 0.0 + float_list = [ + 1.1, + 2.2, + 3.3, + -4.4, + 5.5e2, + ] + float_object = { + pi = 3.14159, + euler = 2.71828, + sqrt2 = 1.41421, + scientific = -123e+2, + } +} diff --git a/test/integration/hcl2_reconstructed/function_objects.tf b/test/integration/hcl2_reconstructed/function_objects.tf new file mode 100644 index 00000000..69dac286 --- /dev/null +++ b/test/integration/hcl2_reconstructed/function_objects.tf @@ -0,0 +1,24 @@ +variable "object" { + type = object({ + key = string, + value = string + }) +} + + +variable "nested" { + type = map(object({ + name = string, + enabled = bool + })) +} + + +variable "multi_arg" { + default = merge({ + a = 1, + b = 2 + }, { + c = 3 + }) +} diff --git a/test/integration/hcl2_reconstructed/function_tuples.tf b/test/integration/hcl2_reconstructed/function_tuples.tf new file mode 100644 index 00000000..0b436631 --- /dev/null +++ b/test/integration/hcl2_reconstructed/function_tuples.tf @@ -0,0 +1,18 @@ +resource "octopusdeploy_process_templated_step" "step" { + parameters = { + "Octopus.Action.Aws.IamCapabilities" = jsonencode([ + "CAPABILITY_AUTO_EXPAND", + "CAPABILITY_IAM", + "CAPABILITY_NAMED_IAM" + ]), + } +} + + +variable "list" { + default = toset([ + "a", + "b", + "c" + ]) +} diff --git a/test/integration/hcl2_reconstructed/nulls.tf b/test/integration/hcl2_reconstructed/nulls.tf new file mode 100644 index 00000000..1e487789 --- /dev/null +++ b/test/integration/hcl2_reconstructed/nulls.tf @@ -0,0 +1,11 @@ +terraform = { + unary = !null, + binary = (a == null), + tuple = [ + null, + 1, + 2, + ], + single = null, + conditional = null ? 
null : null, +} diff --git a/test/integration/hcl2_reconstructed/object_keys.tf b/test/integration/hcl2_reconstructed/object_keys.tf new file mode 100644 index 00000000..002bf6d9 --- /dev/null +++ b/test/integration/hcl2_reconstructed/object_keys.tf @@ -0,0 +1,11 @@ +bar = { + 0 = 0, + "foo" = 1, + baz = 2, + (var.account) = 3, + (format("key_prefix_%s", local.foo)) = 4, + "prefix_${var.account}:${var.user}_suffix" = 5, + 1 + 1 = "two", + (2 + 2) = "four", + format("key_%s", var.name) = "dynamic", +} diff --git a/test/integration/hcl2_reconstructed/operators.tf b/test/integration/hcl2_reconstructed/operators.tf new file mode 100644 index 00000000..323759aa --- /dev/null +++ b/test/integration/hcl2_reconstructed/operators.tf @@ -0,0 +1,15 @@ +locals { + addition_1 = ((a + b) + c) + addition_2 = a + b + addition_3 = (a + b) + eq_before_and = var.env == "prod" && var.debug + and_before_ternary = true && true ? 1 : 0 + mixed_arith_cmp = var.a + var.b * var.c > 10 + full_chain = a + b == c && d || e + left_assoc_sub = a - b - c + left_assoc_mul_div = (a * b) / c + nested_ternary = (a ? b : c) ? 
d : e + unary_precedence = !a && b + neg_precedence = (-a) + b + neg_parentheses = -(a + b) +} diff --git a/test/integration/hcl2_reconstructed/resource_keyword_attribute.tf b/test/integration/hcl2_reconstructed/resource_keyword_attribute.tf new file mode 100644 index 00000000..c9ada660 --- /dev/null +++ b/test/integration/hcl2_reconstructed/resource_keyword_attribute.tf @@ -0,0 +1,8 @@ +resource "custom_provider_resource" "resource_name" { + name = "resource_name" + attribute = "attribute_value" + if = "attribute_value2" + in = "attribute_value3" + for = "attribute_value4" + for_each = "attribute_value5" +} diff --git a/test/integration/hcl2_reconstructed/smoke.tf b/test/integration/hcl2_reconstructed/smoke.tf new file mode 100644 index 00000000..29beb3ac --- /dev/null +++ b/test/integration/hcl2_reconstructed/smoke.tf @@ -0,0 +1,76 @@ +block label1 label2 { + a = 5 + b = 1256.5 + c = 15 + (10 * 12) + d = (-a) + e = (a == b ? true : false) + f = "${"this is a string"}" + g = 1 == 2 + h = { + k1 = 5, + k2 = 10, + "k3" = { + k4 = "a", + }, + (5 + 5) = "d", + k5.attr.attr = "e", + } + i = [ + a, + b, + "c${aaa}", + d, + [ + 1, + 2, + 3, + ], + f(a), + provider::func::aa(5), + ] + j = func(a, b, c, d ... ) + k = a.b.5 + l = a.*.b + m = a[*][c].a.*.1 + + block b1 { + a = 1 + } +} + + +block multiline_ternary { + foo = (bar ? baz(foo) : foo == "bar" ? 
"baz" : foo) +} + + +block multiline_binary_ops { + expr = { + for k, v in local.map_a : + k => v + if lookup(local.map_b[v.id], "enabled", false) || (contains(local.map_c, v.id) && contains(local.map_d, v.id)) + } +} + + +block binary_op_before_unary { + dedup_keys_layer7 = { + for k, v in local.action_keys_layer7 : + k => v + if !contains(keys(local.dedup_keys_layer8), k) && !contains(keys(local.dedup_keys_layer9), k) && !contains(keys(local.dedup_keys_layer10), k) + } +} + + +block { + route53_forwarding_rule_shares = { + for forwarding_rule_key in keys(var.route53_resolver_forwarding_rule_shares) : + "${forwarding_rule_key}" => { + aws_account_ids = [ + for account_name in var.route53_resolver_forwarding_rule_shares[forwarding_rule_key].aws_account_names : + module.remote_state_subaccounts.map[account_name].outputs["aws_account_id"] + ] + } ... + if substr(bucket_name, 0, 1) == "l" + } +} diff --git a/test/integration/hcl2_reconstructed/string_interpolations.tf b/test/integration/hcl2_reconstructed/string_interpolations.tf new file mode 100644 index 00000000..73df4715 --- /dev/null +++ b/test/integration/hcl2_reconstructed/string_interpolations.tf @@ -0,0 +1,9 @@ +block label1 label3 { + simple_interpolation = "prefix:${var}-suffix" + embedded_interpolation = "(long substring without interpolation); ${"aaa-${local}-${local}"}/us-west-2/key_foo" + deeply_nested_interpolation = "prefix1-${"prefix2-${"prefix3-$${foo:bar}"}"}" + escaped_interpolation = "prefix:$${aws:username}-suffix" + simple_and_escaped = "${"bar"}$${baz:bat}" + simple_and_escaped_reversed = "$${baz:bat}${"bar"}" + nested_escaped = "bar-${"$${baz:bat}"}" +} diff --git a/test/integration/hcl2_reconstructed/unicode_strings.tf b/test/integration/hcl2_reconstructed/unicode_strings.tf new file mode 100644 index 00000000..8c4df70e --- /dev/null +++ b/test/integration/hcl2_reconstructed/unicode_strings.tf @@ -0,0 +1,21 @@ +locals { + basic_unicode = "Hello, 世界! 
こんにちは Привет नमस्ते" + unicode_escapes = "© ♥ ♪ ☠ ☺" + emoji_string = "🚀 🌍 🔥 🎉" + rtl_text = "English and العربية text mixed" + complex_unicode = "Python (파이썬) es 很棒的! ♥ αβγδ" + ascii = "ASCII: abc123" + emoji = "Emoji: 🚀🌍🔥🎉" + math = "Math: ∑∫√∞≠≤≥" + currency = "Currency: £€¥₹₽₩" + arrows = "Arrows: ←↑→↓↔↕" + cjk = "CJK: 你好世界안녕하세요こんにちは" + cyrillic = "Cyrillic: Привет мир" + special = "Special: ©®™§¶†‡" + mixed_content = <<-EOT + Line with interpolation: ${var.name} + Line with emoji: 👨‍👩‍👧‍👦 + Line with quotes: "quoted text" + Line with backslash: \escaped + EOT +} diff --git a/test/integration/json_reserialized/floats.json b/test/integration/json_reserialized/floats.json new file mode 100644 index 00000000..db301445 --- /dev/null +++ b/test/integration/json_reserialized/floats.json @@ -0,0 +1,31 @@ +{ + "locals": [ + { + "simple_float": 123.456, + "small_float": 0.123, + "large_float": 9876543.21, + "negative_float": -42.5, + "negative_small": -0.001, + "scientific_positive": "${1.23e5}", + "scientific_negative": "${9.87e-3}", + "scientific_large": "${6.022e+23}", + "integer_as_float": 100.0, + "float_calculation": "${105e+2 * 3.0 / 2.1}", + "float_comparison": "${5e1 > 2.3 ? 
1.0 : 0.0}", + "float_list": [ + 1.1, + 2.2, + 3.3, + -4.4, + "${5.5e2}" + ], + "float_object": { + "pi": 3.14159, + "euler": 2.71828, + "sqrt2": 1.41421, + "scientific": "${-123e+2}" + }, + "__is_block__": true + } + ] +} diff --git a/test/integration/json_reserialized/function_objects.json b/test/integration/json_reserialized/function_objects.json new file mode 100644 index 00000000..86b8b131 --- /dev/null +++ b/test/integration/json_reserialized/function_objects.json @@ -0,0 +1,22 @@ +{ + "variable": [ + { + "\"object\"": { + "type": "${object({key = string, value = string})}", + "__is_block__": true + } + }, + { + "\"nested\"": { + "type": "${map(object({name = string, enabled = bool}))}", + "__is_block__": true + } + }, + { + "\"multi_arg\"": { + "default": "${merge({a = 1, b = 2}, {c = 3})}", + "__is_block__": true + } + } + ] +} diff --git a/test/integration/json_reserialized/function_tuples.json b/test/integration/json_reserialized/function_tuples.json new file mode 100644 index 00000000..6b645728 --- /dev/null +++ b/test/integration/json_reserialized/function_tuples.json @@ -0,0 +1,22 @@ +{ + "resource": [ + { + "\"octopusdeploy_process_templated_step\"": { + "\"step\"": { + "parameters": { + "\"Octopus.Action.Aws.IamCapabilities\"": "${jsonencode([\"CAPABILITY_AUTO_EXPAND\", \"CAPABILITY_IAM\", \"CAPABILITY_NAMED_IAM\"])}" + }, + "__is_block__": true + } + } + } + ], + "variable": [ + { + "\"list\"": { + "default": "${toset([\"a\", \"b\", \"c\"])}", + "__is_block__": true + } + } + ] +} diff --git a/test/integration/json_reserialized/nulls.json b/test/integration/json_reserialized/nulls.json new file mode 100644 index 00000000..9cbdd755 --- /dev/null +++ b/test/integration/json_reserialized/nulls.json @@ -0,0 +1,13 @@ +{ + "terraform": { + "unary": "${!null}", + "binary": "${(a == null)}", + "tuple": [ + "null", + 1, + 2 + ], + "single": "null", + "conditional": "${null ? 
null : null}" + } +} diff --git a/test/integration/json_reserialized/object_keys.json b/test/integration/json_reserialized/object_keys.json new file mode 100644 index 00000000..3146aa52 --- /dev/null +++ b/test/integration/json_reserialized/object_keys.json @@ -0,0 +1,13 @@ +{ + "bar": { + "0": 0, + "\"foo\"": 1, + "baz": 2, + "${(var.account)}": 3, + "${(format(\"key_prefix_%s\", local.foo))}": 4, + "\"prefix_${var.account}:${var.user}_suffix\"": 5, + "${1 + 1}": "\"two\"", + "${(2 + 2)}": "\"four\"", + "${format(\"key_%s\", var.name)}": "\"dynamic\"" + } +} diff --git a/test/integration/json_reserialized/operators.json b/test/integration/json_reserialized/operators.json new file mode 100644 index 00000000..5c611ea7 --- /dev/null +++ b/test/integration/json_reserialized/operators.json @@ -0,0 +1,20 @@ +{ + "locals": [ + { + "addition_1": "${((a + b) + c)}", + "addition_2": "${a + b}", + "addition_3": "${(a + b)}", + "eq_before_and": "${var.env == \"prod\" && var.debug}", + "and_before_ternary": "${true && true ? 1 : 0}", + "mixed_arith_cmp": "${var.a + var.b * var.c > 10}", + "full_chain": "${a + b == c && d || e}", + "left_assoc_sub": "${a - b - c}", + "left_assoc_mul_div": "${(a * b) / c}", + "nested_ternary": "${(a ? b : c) ? 
d : e}", + "unary_precedence": "${!a && b}", + "neg_precedence": "${(-a) + b}", + "neg_parentheses": "${-(a + b)}", + "__is_block__": true + } + ] +} diff --git a/test/integration/json_reserialized/resource_keyword_attribute.json b/test/integration/json_reserialized/resource_keyword_attribute.json new file mode 100644 index 00000000..6826a0b8 --- /dev/null +++ b/test/integration/json_reserialized/resource_keyword_attribute.json @@ -0,0 +1,17 @@ +{ + "resource": [ + { + "\"custom_provider_resource\"": { + "\"resource_name\"": { + "name": "\"resource_name\"", + "attribute": "\"attribute_value\"", + "if": "\"attribute_value2\"", + "in": "\"attribute_value3\"", + "for": "\"attribute_value4\"", + "for_each": "\"attribute_value5\"", + "__is_block__": true + } + } + } + ] +} diff --git a/test/integration/json_reserialized/smoke.json b/test/integration/json_reserialized/smoke.json new file mode 100644 index 00000000..a2382778 --- /dev/null +++ b/test/integration/json_reserialized/smoke.json @@ -0,0 +1,74 @@ +{ + "block": [ + { + "label1": { + "label2": { + "a": 5, + "b": 1256.5, + "c": "${15 + (10 * 12)}", + "d": "${(-a)}", + "e": "${(a == b ? true : false)}", + "f": "\"${\"this is a string\"}\"", + "g": "${1 == 2}", + "h": { + "k1": 5, + "k2": 10, + "\"k3\"": { + "k4": "\"a\"" + }, + "${(5 + 5)}": "\"d\"", + "${k5.attr.attr}": "\"e\"" + }, + "i": [ + "a", + "b", + "\"c${aaa}\"", + "d", + [ + 1, + 2, + 3 + ], + "${f(a)}", + "${provider::func::aa(5)}" + ], + "j": "${func(a, b, c, d ...)}", + "k": "${a.b.5}", + "l": "${a.*.b}", + "m": "${a[*][c].a.*.1}", + "block": [ + { + "b1": { + "a": 1, + "__is_block__": true + } + } + ], + "__is_block__": true + } + } + }, + { + "multiline_ternary": { + "foo": "${(bar ? baz(foo) : foo == \"bar\" ? 
\"baz\" : foo)}", + "__is_block__": true + } + }, + { + "multiline_binary_ops": { + "expr": "${{for k, v in local.map_a : k => v if lookup(local.map_b[v.id], \"enabled\", false) || (contains(local.map_c, v.id) && contains(local.map_d, v.id))}}", + "__is_block__": true + } + }, + { + "binary_op_before_unary": { + "dedup_keys_layer7": "${{for k, v in local.action_keys_layer7 : k => v if !contains(keys(local.dedup_keys_layer8), k) && !contains(keys(local.dedup_keys_layer9), k) && !contains(keys(local.dedup_keys_layer10), k)}}", + "__is_block__": true + } + }, + { + "route53_forwarding_rule_shares": "${{for forwarding_rule_key in keys(var.route53_resolver_forwarding_rule_shares) : \"${forwarding_rule_key}\" => {aws_account_ids = [for account_name in var.route53_resolver_forwarding_rule_shares[forwarding_rule_key].aws_account_names : module.remote_state_subaccounts.map[account_name].outputs[\"aws_account_id\"]]}... if substr(bucket_name, 0, 1) == \"l\"}}", + "__is_block__": true + } + ] +} diff --git a/test/integration/json_reserialized/string_interpolations.json b/test/integration/json_reserialized/string_interpolations.json new file mode 100644 index 00000000..f9df252c --- /dev/null +++ b/test/integration/json_reserialized/string_interpolations.json @@ -0,0 +1,18 @@ +{ + "block": [ + { + "label1": { + "label3": { + "simple_interpolation": "\"prefix:${var}-suffix\"", + "embedded_interpolation": "\"(long substring without interpolation); ${\"aaa-${local}-${local}\"}/us-west-2/key_foo\"", + "deeply_nested_interpolation": "\"prefix1-${\"prefix2-${\"prefix3-$${foo:bar}\"}\"}\"", + "escaped_interpolation": "\"prefix:$${aws:username}-suffix\"", + "simple_and_escaped": "\"${\"bar\"}$${baz:bat}\"", + "simple_and_escaped_reversed": "\"$${baz:bat}${\"bar\"}\"", + "nested_escaped": "\"bar-${\"$${baz:bat}\"}\"", + "__is_block__": true + } + } + } + ] +} diff --git a/test/integration/json_reserialized/unicode_strings.json b/test/integration/json_reserialized/unicode_strings.json 
new file mode 100644 index 00000000..5f8f0095 --- /dev/null +++ b/test/integration/json_reserialized/unicode_strings.json @@ -0,0 +1,21 @@ +{ + "locals": [ + { + "basic_unicode": "\"Hello, \u4e16\u754c! \u3053\u3093\u306b\u3061\u306f \u041f\u0440\u0438\u0432\u0435\u0442 \u0928\u092e\u0938\u094d\u0924\u0947\"", + "unicode_escapes": "\"\u00a9 \u2665 \u266a \u2620 \u263a\"", + "emoji_string": "\"\ud83d\ude80 \ud83c\udf0d \ud83d\udd25 \ud83c\udf89\"", + "rtl_text": "\"English and \u0627\u0644\u0639\u0631\u0628\u064a\u0629 text mixed\"", + "complex_unicode": "\"Python (\ud30c\uc774\uc36c) es \u5f88\u68d2\u7684! \u2665 \u03b1\u03b2\u03b3\u03b4\"", + "ascii": "\"ASCII: abc123\"", + "emoji": "\"Emoji: \ud83d\ude80\ud83c\udf0d\ud83d\udd25\ud83c\udf89\"", + "math": "\"Math: \u2211\u222b\u221a\u221e\u2260\u2264\u2265\"", + "currency": "\"Currency: \u00a3\u20ac\u00a5\u20b9\u20bd\u20a9\"", + "arrows": "\"Arrows: \u2190\u2191\u2192\u2193\u2194\u2195\"", + "cjk": "\"CJK: \u4f60\u597d\u4e16\u754c\uc548\ub155\ud558\uc138\uc694\u3053\u3093\u306b\u3061\u306f\"", + "cyrillic": "\"Cyrillic: \u041f\u0440\u0438\u0432\u0435\u0442 \u043c\u0438\u0440\"", + "special": "\"Special: \u00a9\u00ae\u2122\u00a7\u00b6\u2020\u2021\"", + "mixed_content": "\"<<-EOT\n Line with interpolation: ${var.name}\n Line with emoji: \ud83d\udc68\u200d\ud83d\udc69\u200d\ud83d\udc67\u200d\ud83d\udc66\n Line with quotes: \"quoted text\"\n Line with backslash: \\escaped\n EOT\"", + "__is_block__": true + } + ] +} diff --git a/test/integration/json_serialized/floats.json b/test/integration/json_serialized/floats.json new file mode 100644 index 00000000..db301445 --- /dev/null +++ b/test/integration/json_serialized/floats.json @@ -0,0 +1,31 @@ +{ + "locals": [ + { + "simple_float": 123.456, + "small_float": 0.123, + "large_float": 9876543.21, + "negative_float": -42.5, + "negative_small": -0.001, + "scientific_positive": "${1.23e5}", + "scientific_negative": "${9.87e-3}", + "scientific_large": "${6.022e+23}", + 
"integer_as_float": 100.0, + "float_calculation": "${105e+2 * 3.0 / 2.1}", + "float_comparison": "${5e1 > 2.3 ? 1.0 : 0.0}", + "float_list": [ + 1.1, + 2.2, + 3.3, + -4.4, + "${5.5e2}" + ], + "float_object": { + "pi": 3.14159, + "euler": 2.71828, + "sqrt2": 1.41421, + "scientific": "${-123e+2}" + }, + "__is_block__": true + } + ] +} diff --git a/test/integration/json_serialized/function_objects.json b/test/integration/json_serialized/function_objects.json new file mode 100644 index 00000000..86b8b131 --- /dev/null +++ b/test/integration/json_serialized/function_objects.json @@ -0,0 +1,22 @@ +{ + "variable": [ + { + "\"object\"": { + "type": "${object({key = string, value = string})}", + "__is_block__": true + } + }, + { + "\"nested\"": { + "type": "${map(object({name = string, enabled = bool}))}", + "__is_block__": true + } + }, + { + "\"multi_arg\"": { + "default": "${merge({a = 1, b = 2}, {c = 3})}", + "__is_block__": true + } + } + ] +} diff --git a/test/integration/json_serialized/function_tuples.json b/test/integration/json_serialized/function_tuples.json new file mode 100644 index 00000000..6b645728 --- /dev/null +++ b/test/integration/json_serialized/function_tuples.json @@ -0,0 +1,22 @@ +{ + "resource": [ + { + "\"octopusdeploy_process_templated_step\"": { + "\"step\"": { + "parameters": { + "\"Octopus.Action.Aws.IamCapabilities\"": "${jsonencode([\"CAPABILITY_AUTO_EXPAND\", \"CAPABILITY_IAM\", \"CAPABILITY_NAMED_IAM\"])}" + }, + "__is_block__": true + } + } + } + ], + "variable": [ + { + "\"list\"": { + "default": "${toset([\"a\", \"b\", \"c\"])}", + "__is_block__": true + } + } + ] +} diff --git a/test/integration/json_serialized/nulls.json b/test/integration/json_serialized/nulls.json new file mode 100644 index 00000000..9cbdd755 --- /dev/null +++ b/test/integration/json_serialized/nulls.json @@ -0,0 +1,13 @@ +{ + "terraform": { + "unary": "${!null}", + "binary": "${(a == null)}", + "tuple": [ + "null", + 1, + 2 + ], + "single": "null", + "conditional": 
"${null ? null : null}" + } +} diff --git a/test/integration/json_serialized/object_keys.json b/test/integration/json_serialized/object_keys.json new file mode 100644 index 00000000..3146aa52 --- /dev/null +++ b/test/integration/json_serialized/object_keys.json @@ -0,0 +1,13 @@ +{ + "bar": { + "0": 0, + "\"foo\"": 1, + "baz": 2, + "${(var.account)}": 3, + "${(format(\"key_prefix_%s\", local.foo))}": 4, + "\"prefix_${var.account}:${var.user}_suffix\"": 5, + "${1 + 1}": "\"two\"", + "${(2 + 2)}": "\"four\"", + "${format(\"key_%s\", var.name)}": "\"dynamic\"" + } +} diff --git a/test/integration/json_serialized/operators.json b/test/integration/json_serialized/operators.json new file mode 100644 index 00000000..5c611ea7 --- /dev/null +++ b/test/integration/json_serialized/operators.json @@ -0,0 +1,20 @@ +{ + "locals": [ + { + "addition_1": "${((a + b) + c)}", + "addition_2": "${a + b}", + "addition_3": "${(a + b)}", + "eq_before_and": "${var.env == \"prod\" && var.debug}", + "and_before_ternary": "${true && true ? 1 : 0}", + "mixed_arith_cmp": "${var.a + var.b * var.c > 10}", + "full_chain": "${a + b == c && d || e}", + "left_assoc_sub": "${a - b - c}", + "left_assoc_mul_div": "${(a * b) / c}", + "nested_ternary": "${(a ? b : c) ? 
d : e}", + "unary_precedence": "${!a && b}", + "neg_precedence": "${(-a) + b}", + "neg_parentheses": "${-(a + b)}", + "__is_block__": true + } + ] +} diff --git a/test/integration/json_serialized/resource_keyword_attribute.json b/test/integration/json_serialized/resource_keyword_attribute.json new file mode 100644 index 00000000..6826a0b8 --- /dev/null +++ b/test/integration/json_serialized/resource_keyword_attribute.json @@ -0,0 +1,17 @@ +{ + "resource": [ + { + "\"custom_provider_resource\"": { + "\"resource_name\"": { + "name": "\"resource_name\"", + "attribute": "\"attribute_value\"", + "if": "\"attribute_value2\"", + "in": "\"attribute_value3\"", + "for": "\"attribute_value4\"", + "for_each": "\"attribute_value5\"", + "__is_block__": true + } + } + } + ] +} diff --git a/test/integration/json_serialized/smoke.json b/test/integration/json_serialized/smoke.json new file mode 100644 index 00000000..a2382778 --- /dev/null +++ b/test/integration/json_serialized/smoke.json @@ -0,0 +1,74 @@ +{ + "block": [ + { + "label1": { + "label2": { + "a": 5, + "b": 1256.5, + "c": "${15 + (10 * 12)}", + "d": "${(-a)}", + "e": "${(a == b ? true : false)}", + "f": "\"${\"this is a string\"}\"", + "g": "${1 == 2}", + "h": { + "k1": 5, + "k2": 10, + "\"k3\"": { + "k4": "\"a\"" + }, + "${(5 + 5)}": "\"d\"", + "${k5.attr.attr}": "\"e\"" + }, + "i": [ + "a", + "b", + "\"c${aaa}\"", + "d", + [ + 1, + 2, + 3 + ], + "${f(a)}", + "${provider::func::aa(5)}" + ], + "j": "${func(a, b, c, d ...)}", + "k": "${a.b.5}", + "l": "${a.*.b}", + "m": "${a[*][c].a.*.1}", + "block": [ + { + "b1": { + "a": 1, + "__is_block__": true + } + } + ], + "__is_block__": true + } + } + }, + { + "multiline_ternary": { + "foo": "${(bar ? baz(foo) : foo == \"bar\" ? 
\"baz\" : foo)}", + "__is_block__": true + } + }, + { + "multiline_binary_ops": { + "expr": "${{for k, v in local.map_a : k => v if lookup(local.map_b[v.id], \"enabled\", false) || (contains(local.map_c, v.id) && contains(local.map_d, v.id))}}", + "__is_block__": true + } + }, + { + "binary_op_before_unary": { + "dedup_keys_layer7": "${{for k, v in local.action_keys_layer7 : k => v if !contains(keys(local.dedup_keys_layer8), k) && !contains(keys(local.dedup_keys_layer9), k) && !contains(keys(local.dedup_keys_layer10), k)}}", + "__is_block__": true + } + }, + { + "route53_forwarding_rule_shares": "${{for forwarding_rule_key in keys(var.route53_resolver_forwarding_rule_shares) : \"${forwarding_rule_key}\" => {aws_account_ids = [for account_name in var.route53_resolver_forwarding_rule_shares[forwarding_rule_key].aws_account_names : module.remote_state_subaccounts.map[account_name].outputs[\"aws_account_id\"]]}... if substr(bucket_name, 0, 1) == \"l\"}}", + "__is_block__": true + } + ] +} diff --git a/test/integration/json_serialized/string_interpolations.json b/test/integration/json_serialized/string_interpolations.json new file mode 100644 index 00000000..f9df252c --- /dev/null +++ b/test/integration/json_serialized/string_interpolations.json @@ -0,0 +1,18 @@ +{ + "block": [ + { + "label1": { + "label3": { + "simple_interpolation": "\"prefix:${var}-suffix\"", + "embedded_interpolation": "\"(long substring without interpolation); ${\"aaa-${local}-${local}\"}/us-west-2/key_foo\"", + "deeply_nested_interpolation": "\"prefix1-${\"prefix2-${\"prefix3-$${foo:bar}\"}\"}\"", + "escaped_interpolation": "\"prefix:$${aws:username}-suffix\"", + "simple_and_escaped": "\"${\"bar\"}$${baz:bat}\"", + "simple_and_escaped_reversed": "\"$${baz:bat}${\"bar\"}\"", + "nested_escaped": "\"bar-${\"$${baz:bat}\"}\"", + "__is_block__": true + } + } + } + ] +} diff --git a/test/integration/json_serialized/unicode_strings.json b/test/integration/json_serialized/unicode_strings.json new file 
mode 100644 index 00000000..5f8f0095 --- /dev/null +++ b/test/integration/json_serialized/unicode_strings.json @@ -0,0 +1,21 @@ +{ + "locals": [ + { + "basic_unicode": "\"Hello, \u4e16\u754c! \u3053\u3093\u306b\u3061\u306f \u041f\u0440\u0438\u0432\u0435\u0442 \u0928\u092e\u0938\u094d\u0924\u0947\"", + "unicode_escapes": "\"\u00a9 \u2665 \u266a \u2620 \u263a\"", + "emoji_string": "\"\ud83d\ude80 \ud83c\udf0d \ud83d\udd25 \ud83c\udf89\"", + "rtl_text": "\"English and \u0627\u0644\u0639\u0631\u0628\u064a\u0629 text mixed\"", + "complex_unicode": "\"Python (\ud30c\uc774\uc36c) es \u5f88\u68d2\u7684! \u2665 \u03b1\u03b2\u03b3\u03b4\"", + "ascii": "\"ASCII: abc123\"", + "emoji": "\"Emoji: \ud83d\ude80\ud83c\udf0d\ud83d\udd25\ud83c\udf89\"", + "math": "\"Math: \u2211\u222b\u221a\u221e\u2260\u2264\u2265\"", + "currency": "\"Currency: \u00a3\u20ac\u00a5\u20b9\u20bd\u20a9\"", + "arrows": "\"Arrows: \u2190\u2191\u2192\u2193\u2194\u2195\"", + "cjk": "\"CJK: \u4f60\u597d\u4e16\u754c\uc548\ub155\ud558\uc138\uc694\u3053\u3093\u306b\u3061\u306f\"", + "cyrillic": "\"Cyrillic: \u041f\u0440\u0438\u0432\u0435\u0442 \u043c\u0438\u0440\"", + "special": "\"Special: \u00a9\u00ae\u2122\u00a7\u00b6\u2020\u2021\"", + "mixed_content": "\"<<-EOT\n Line with interpolation: ${var.name}\n Line with emoji: \ud83d\udc68\u200d\ud83d\udc69\u200d\ud83d\udc67\u200d\ud83d\udc66\n Line with quotes: \"quoted text\"\n Line with backslash: \\escaped\n EOT\"", + "__is_block__": true + } + ] +} diff --git a/test/integration/specialized/builder_basic.json b/test/integration/specialized/builder_basic.json new file mode 100644 index 00000000..da62720b --- /dev/null +++ b/test/integration/specialized/builder_basic.json @@ -0,0 +1,63 @@ +{ + "__is_block__": true, + "resource": [ + { + "aws_instance": { + "web": { + "__is_block__": true, + "ami": "\"ami-12345\"", + "instance_type": "\"t2.micro\"", + "count": 2 + } + } + }, + { + "aws_s3_bucket": { + "data": { + "__is_block__": true, + "bucket": "\"my-bucket\"", + 
"acl": "\"private\"" + } + } + }, + { + "aws_instance": { + "nested": { + "__is_block__": true, + "ami": "\"ami-99999\"", + "provisioner": [ + { + "local-exec": { + "__is_block__": true, + "command": "\"echo hello\"" + } + }, + { + "remote-exec": { + "__is_block__": true, + "inline": "[\"puppet apply\"]" + } + } + ] + } + } + } + ], + "variable": [ + { + "instance_type": { + "__is_block__": true, + "default": "\"t2.micro\"", + "description": "\"The instance type\"" + } + } + ], + "locals": [ + { + "__is_block__": true, + "port": 8080, + "enabled": true, + "name": "\"my-app\"" + } + ] +} diff --git a/test/integration/specialized/builder_basic.tf b/test/integration/specialized/builder_basic.tf new file mode 100644 index 00000000..b7ee2131 --- /dev/null +++ b/test/integration/specialized/builder_basic.tf @@ -0,0 +1,38 @@ +resource aws_instance web { + ami = "ami-12345" + instance_type = "t2.micro" + count = 2 +} + + +resource aws_s3_bucket data { + bucket = "my-bucket" + acl = "private" +} + + +resource aws_instance nested { + ami = "ami-99999" + + provisioner local-exec { + command = "echo hello" + } + + + provisioner remote-exec { + inline = ["puppet apply"] + } +} + + +variable instance_type { + default = "t2.micro" + description = "The instance type" +} + + +locals { + port = 8080 + enabled = true + name = "my-app" +} diff --git a/test/integration/specialized/builder_basic_reparsed.json b/test/integration/specialized/builder_basic_reparsed.json new file mode 100644 index 00000000..32e4954d --- /dev/null +++ b/test/integration/specialized/builder_basic_reparsed.json @@ -0,0 +1,64 @@ +{ + "resource": [ + { + "aws_instance": { + "web": { + "ami": "\"ami-12345\"", + "instance_type": "\"t2.micro\"", + "count": 2, + "__is_block__": true + } + } + }, + { + "aws_s3_bucket": { + "data": { + "bucket": "\"my-bucket\"", + "acl": "\"private\"", + "__is_block__": true + } + } + }, + { + "aws_instance": { + "nested": { + "ami": "\"ami-99999\"", + "provisioner": [ + { + 
"local-exec": { + "command": "\"echo hello\"", + "__is_block__": true + } + }, + { + "remote-exec": { + "inline": [ + "\"puppet apply\"" + ], + "__is_block__": true + } + } + ], + "__is_block__": true + } + } + } + ], + "variable": [ + { + "instance_type": { + "default": "\"t2.micro\"", + "description": "\"The instance type\"", + "__is_block__": true + } + } + ], + "locals": [ + { + "port": 8080, + "enabled": "true", + "name": "\"my-app\"", + "__is_block__": true + } + ] +} diff --git a/test/integration/specialized/builder_basic_reserialized.json b/test/integration/specialized/builder_basic_reserialized.json new file mode 100644 index 00000000..364ef0c3 --- /dev/null +++ b/test/integration/specialized/builder_basic_reserialized.json @@ -0,0 +1,62 @@ +{ + "resource": [ + { + "aws_instance": { + "web": { + "ami": "\"ami-12345\"", + "instance_type": "\"t2.micro\"", + "count": 2, + "__is_block__": true + } + } + }, + { + "aws_s3_bucket": { + "data": { + "bucket": "\"my-bucket\"", + "acl": "\"private\"", + "__is_block__": true + } + } + }, + { + "aws_instance": { + "nested": { + "ami": "\"ami-99999\"", + "provisioner": [ + { + "local-exec": { + "command": "\"echo hello\"", + "__is_block__": true + } + }, + { + "remote-exec": { + "inline": "[\"puppet apply\"]", + "__is_block__": true + } + } + ], + "__is_block__": true + } + } + } + ], + "variable": [ + { + "instance_type": { + "default": "\"t2.micro\"", + "description": "\"The instance type\"", + "__is_block__": true + } + } + ], + "locals": [ + { + "port": 8080, + "enabled": "true", + "name": "\"my-app\"", + "__is_block__": true + } + ] +} diff --git a/test/integration/specialized/comments.json b/test/integration/specialized/comments.json new file mode 100644 index 00000000..5d7e6ef4 --- /dev/null +++ b/test/integration/specialized/comments.json @@ -0,0 +1,57 @@ +{ + "resource": [ + { + "\"aws_instance\"": { + "\"web\"": { + "ami": "\"abc-123\"", + "instance_type": "\"t2.micro\"", + "count": "${1 + 2}", + "tags": { + 
"Name": "\"web\"", + "Env": "\"prod\"" + }, + "enabled": "true", + "nested": [ + { + "key": "\"value\"", + "__comments__": [ + { + "value": "comment inside nested block" + } + ], + "__is_block__": true + } + ], + "__comments__": [ + { + "value": "standalone comment inside block" + }, + { + "value": "hash standalone comment" + }, + { + "value": "absorbed standalone after binary_op" + }, + { + "value": "multi-line\n block comment" + } + ], + "__inline_comments__": [ + { + "value": "comment inside object" + }, + { + "value": "inline after value" + } + ], + "__is_block__": true + } + } + } + ], + "__comments__": [ + { + "value": "top-level standalone comment" + } + ] +} diff --git a/test/integration/specialized/comments.tf b/test/integration/specialized/comments.tf new file mode 100644 index 00000000..6755f2d3 --- /dev/null +++ b/test/integration/specialized/comments.tf @@ -0,0 +1,28 @@ +// top-level standalone comment +resource "aws_instance" "web" { + ami = "abc-123" + + // standalone comment inside block + instance_type = "t2.micro" + + # hash standalone comment + count = 1 + 2 + # absorbed standalone after binary_op + + tags = { + Name = "web" + # comment inside object + Env = "prod" # inline after value + } + + /* + multi-line + block comment + */ + enabled = true + + nested { + // comment inside nested block + key = "value" + } +} diff --git a/test/integration/specialized/heredocs.tf b/test/integration/specialized/heredocs.tf new file mode 100644 index 00000000..9fc16498 --- /dev/null +++ b/test/integration/specialized/heredocs.tf @@ -0,0 +1,34 @@ +locals { + simple = < 10}", + "full_chain": "${(((a + b) == c) && d) || e}", + "left_assoc_sub": "${(a - b) - c}", + "left_assoc_mul_div": "${(a * b) / c}", + "nested_ternary": "${(a ? b : c) ? 
d : e}", + "unary_precedence": "${(!a) && b}", + "neg_precedence": "${(-a) + b}", + "neg_parentheses": "${-(a + b)}", + "__is_block__": true + } + ] +} diff --git a/test/integration/specialized/operator_precedence.tf b/test/integration/specialized/operator_precedence.tf new file mode 100644 index 00000000..f8351161 --- /dev/null +++ b/test/integration/specialized/operator_precedence.tf @@ -0,0 +1,15 @@ +locals { + addition_1 = ((a + b) + c) + addition_2 = a + b + addition_3 = (a + b) + eq_before_and = var.env == "prod" && var.debug + and_before_ternary = true && true ? 1 : 0 + mixed_arith_cmp = var.a + var.b * var.c > 10 + full_chain = a + b == c && d || e + left_assoc_sub = a - b - c + left_assoc_mul_div = (a * b) / c + nested_ternary = (a ? b : c) ? d : e + unary_precedence = !a && b + neg_precedence = (-a) + b + neg_parentheses = -(a + b) +} diff --git a/test/integration/specialized/template_directives.json b/test/integration/specialized/template_directives.json new file mode 100644 index 00000000..e1f149d4 --- /dev/null +++ b/test/integration/specialized/template_directives.json @@ -0,0 +1,14 @@ +{ + "basic_if": "\"prefix%{ if var.enabled }yes%{ endif }suffix\"", + "if_else": "\"%{ if var.enabled }yes%{ else }no%{ endif }\"", + "strip_markers": "\"%{~ if var.enabled ~}yes%{~ endif ~}\"", + "strip_partial": "\"%{~ if var.enabled }yes%{ endif ~}\"", + "basic_for": "\"%{ for item in var.list }${item}%{ endfor }\"", + "for_key_value": "\"%{ for k, v in var.map }${k}=${v}%{ endfor }\"", + "issue_247": "\"kms%{ if var.id != \\\"primary\\\" }-${var.id}%{ endif }\"", + "mixed": "\"${var.prefix}%{ if var.suffix }-${var.suffix}%{ endif }\"", + "nested_if": "\"%{ if a }%{ if b }both%{ endif }%{ endif }\"", + "escaped_directive": "\"use %%{literal} not directives\"", + "for_with_interp": "\"%{ for x in var.items }item=${x}, %{ endfor }\"", + "if_strip_else_strip": "\"%{~ if cond ~}a%{~ else ~}b%{~ endif ~}\"" +} diff --git 
a/test/integration/specialized/template_directives.tf b/test/integration/specialized/template_directives.tf new file mode 100644 index 00000000..276cd878 --- /dev/null +++ b/test/integration/specialized/template_directives.tf @@ -0,0 +1,12 @@ +basic_if = "prefix%{ if var.enabled }yes%{ endif }suffix" +if_else = "%{ if var.enabled }yes%{ else }no%{ endif }" +strip_markers = "%{~ if var.enabled ~}yes%{~ endif ~}" +strip_partial = "%{~ if var.enabled }yes%{ endif ~}" +basic_for = "%{ for item in var.list }${item}%{ endfor }" +for_key_value = "%{ for k, v in var.map }${k}=${v}%{ endfor }" +issue_247 = "kms%{ if var.id != \"primary\" }-${var.id}%{ endif }" +mixed = "${var.prefix}%{ if var.suffix }-${var.suffix}%{ endif }" +nested_if = "%{ if a }%{ if b }both%{ endif }%{ endif }" +escaped_directive = "use %%{literal} not directives" +for_with_interp = "%{ for x in var.items }item=${x}, %{ endfor }" +if_strip_else_strip = "%{~ if cond ~}a%{~ else ~}b%{~ endif ~}" diff --git a/test/integration/specialized/template_directives_reconstructed.tf b/test/integration/specialized/template_directives_reconstructed.tf new file mode 100644 index 00000000..276cd878 --- /dev/null +++ b/test/integration/specialized/template_directives_reconstructed.tf @@ -0,0 +1,12 @@ +basic_if = "prefix%{ if var.enabled }yes%{ endif }suffix" +if_else = "%{ if var.enabled }yes%{ else }no%{ endif }" +strip_markers = "%{~ if var.enabled ~}yes%{~ endif ~}" +strip_partial = "%{~ if var.enabled }yes%{ endif ~}" +basic_for = "%{ for item in var.list }${item}%{ endfor }" +for_key_value = "%{ for k, v in var.map }${k}=${v}%{ endfor }" +issue_247 = "kms%{ if var.id != \"primary\" }-${var.id}%{ endif }" +mixed = "${var.prefix}%{ if var.suffix }-${var.suffix}%{ endif }" +nested_if = "%{ if a }%{ if b }both%{ endif }%{ endif }" +escaped_directive = "use %%{literal} not directives" +for_with_interp = "%{ for x in var.items }item=${x}, %{ endfor }" +if_strip_else_strip = "%{~ if cond ~}a%{~ else ~}b%{~ endif ~}" diff 
--git a/test/integration/specialized/template_directives_reserialized.json b/test/integration/specialized/template_directives_reserialized.json new file mode 100644 index 00000000..e1f149d4 --- /dev/null +++ b/test/integration/specialized/template_directives_reserialized.json @@ -0,0 +1,14 @@ +{ + "basic_if": "\"prefix%{ if var.enabled }yes%{ endif }suffix\"", + "if_else": "\"%{ if var.enabled }yes%{ else }no%{ endif }\"", + "strip_markers": "\"%{~ if var.enabled ~}yes%{~ endif ~}\"", + "strip_partial": "\"%{~ if var.enabled }yes%{ endif ~}\"", + "basic_for": "\"%{ for item in var.list }${item}%{ endfor }\"", + "for_key_value": "\"%{ for k, v in var.map }${k}=${v}%{ endfor }\"", + "issue_247": "\"kms%{ if var.id != \\\"primary\\\" }-${var.id}%{ endif }\"", + "mixed": "\"${var.prefix}%{ if var.suffix }-${var.suffix}%{ endif }\"", + "nested_if": "\"%{ if a }%{ if b }both%{ endif }%{ endif }\"", + "escaped_directive": "\"use %%{literal} not directives\"", + "for_with_interp": "\"%{ for x in var.items }item=${x}, %{ endfor }\"", + "if_strip_else_strip": "\"%{~ if cond ~}a%{~ else ~}b%{~ endif ~}\"" +} diff --git a/test/integration/test_cli_subprocess.py b/test/integration/test_cli_subprocess.py new file mode 100644 index 00000000..9de3b4c5 --- /dev/null +++ b/test/integration/test_cli_subprocess.py @@ -0,0 +1,781 @@ +"""Subprocess-based integration tests for hcl2tojson and jsontohcl2 CLIs. + +These tests invoke the CLIs as real external processes via subprocess.run(), +verifying behavior that cannot be tested with mocked sys.argv/stdout/stdin: +real exit codes, stdout/stderr separation, stdin piping, pipe composition +between the two CLIs, and TTY vs pipe default behavior. + +Golden fixtures are reused from test/integration/hcl2_original/, json_serialized/, +hcl2_reconstructed/, and json_reserialized/. 
+""" +# pylint: disable=C0103,C0114,C0115,C0116 + +import json +import os +import subprocess +import sys +import tempfile +from pathlib import Path +from typing import List, Optional +from unittest import TestCase + +INTEGRATION_DIR = Path(__file__).absolute().parent +HCL_DIR = INTEGRATION_DIR / "hcl2_original" +JSON_DIR = INTEGRATION_DIR / "json_serialized" +HCL_RECONSTRUCTED_DIR = INTEGRATION_DIR / "hcl2_reconstructed" +JSON_RESERIALIZED_DIR = INTEGRATION_DIR / "json_reserialized" +PROJECT_ROOT = INTEGRATION_DIR.parent.parent + +_HCL2TOJSON = [sys.executable, "-c", "from cli.hcl_to_json import main; main()"] +_JSONTOHCL2 = [sys.executable, "-c", "from cli.json_to_hcl import main; main()"] + +_TIMEOUT = 30 + + +def _get_suites() -> List[str]: + return sorted(f.stem for f in HCL_DIR.iterdir() if f.is_file()) + + +def _run_hcl2tojson( + *args: str, + stdin: Optional[str] = None, + cwd: Optional[str] = None, +) -> subprocess.CompletedProcess: + return subprocess.run( + _HCL2TOJSON + list(args), + input=stdin, + capture_output=True, + text=True, + timeout=_TIMEOUT, + cwd=cwd or str(PROJECT_ROOT), + check=False, + ) + + +def _run_jsontohcl2( + *args: str, + stdin: Optional[str] = None, + cwd: Optional[str] = None, +) -> subprocess.CompletedProcess: + return subprocess.run( + _JSONTOHCL2 + list(args), + input=stdin, + capture_output=True, + text=True, + timeout=_TIMEOUT, + cwd=cwd or str(PROJECT_ROOT), + check=False, + ) + + +def _write_file(path, content): + with open(path, "w", encoding="utf-8") as f: + f.write(content) + + +# --------------------------------------------------------------------------- +# Exit codes +# --------------------------------------------------------------------------- + + +class TestHcl2ToJsonExitCodes(TestCase): + def test_success_exits_0(self): + result = _run_hcl2tojson(str(HCL_DIR / "nulls.tf")) + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + + def test_file_not_found_exits_4(self): + result = 
_run_hcl2tojson("/nonexistent/path.tf") + self.assertEqual(result.returncode, 4) + + def test_parse_error_exits_2(self): + with tempfile.TemporaryDirectory() as tmpdir: + bad = os.path.join(tmpdir, "bad.tf") + _write_file(bad, "{{{") + result = _run_hcl2tojson(bad) + self.assertEqual(result.returncode, 2) + + def test_skip_partial_exits_1(self): + with tempfile.TemporaryDirectory() as tmpdir: + indir = os.path.join(tmpdir, "in") + outdir = os.path.join(tmpdir, "out") + os.mkdir(indir) + _write_file(os.path.join(indir, "good.tf"), "x = 1\n") + _write_file(os.path.join(indir, "bad.tf"), "{{{") + result = _run_hcl2tojson("-s", indir, "-o", outdir) + self.assertEqual(result.returncode, 1, f"stderr: {result.stderr}") + + def test_ndjson_all_fail_with_skip_exits_2(self): + with tempfile.TemporaryDirectory() as tmpdir: + _write_file(os.path.join(tmpdir, "bad1.tf"), "{{{") + _write_file(os.path.join(tmpdir, "bad2.tf"), "<<<") + result = _run_hcl2tojson("--ndjson", "-s", tmpdir) + self.assertEqual(result.returncode, 2, f"stderr: {result.stderr}") + + def test_directory_without_ndjson_or_output_errors(self): + result = _run_hcl2tojson(str(HCL_DIR)) + self.assertNotEqual(result.returncode, 0) + self.assertIn("--ndjson", result.stderr) + + def test_stdin_skip_bad_input_exits_1(self): + result = _run_hcl2tojson("-s", "-", stdin="{{{") + # Skip mode: bad stdin is skipped gracefully (not exit 2), partial exit + self.assertEqual(result.returncode, 1, f"stderr: {result.stderr}") + + def test_single_file_skip_to_output_exits_1(self): + with tempfile.TemporaryDirectory() as tmpdir: + bad = os.path.join(tmpdir, "bad.tf") + out = os.path.join(tmpdir, "out.json") + _write_file(bad, "{{{") + result = _run_hcl2tojson("-s", bad, "-o", out) + self.assertEqual(result.returncode, 1, f"stderr: {result.stderr}") + self.assertFalse(os.path.exists(out)) + + +class TestJsonToHclExitCodes(TestCase): + def test_success_exits_0(self): + result = _run_jsontohcl2(str(JSON_DIR / "nulls.json")) + 
self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + + def test_file_not_found_exits_4(self): + result = _run_jsontohcl2("/nonexistent/path.json") + self.assertEqual(result.returncode, 4) + + def test_invalid_json_exits_1(self): + with tempfile.TemporaryDirectory() as tmpdir: + bad = os.path.join(tmpdir, "bad.json") + _write_file(bad, "{bad json") + result = _run_jsontohcl2(bad) + self.assertEqual(result.returncode, 1) + + def test_structure_error_exits_2(self): + result = _run_jsontohcl2("--fragment", "-", stdin="[1,2,3]") + self.assertEqual(result.returncode, 2) + + def test_diff_with_differences_exits_5(self): + with tempfile.TemporaryDirectory() as tmpdir: + modified = os.path.join(tmpdir, "modified.json") + _write_file(modified, json.dumps({"x": 999})) + original_tf = str(HCL_RECONSTRUCTED_DIR / "nulls.tf") + result = _run_jsontohcl2("--diff", original_tf, modified) + self.assertEqual(result.returncode, 5, f"stderr: {result.stderr}") + + def test_diff_identical_exits_0(self): + result = _run_jsontohcl2( + "--diff", + str(HCL_RECONSTRUCTED_DIR / "nulls.tf"), + str(JSON_DIR / "nulls.json"), + ) + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + + def test_directory_without_output_errors(self): + result = _run_jsontohcl2(str(JSON_DIR)) + self.assertNotEqual(result.returncode, 0) + self.assertIn("-o", result.stderr) + + +# --------------------------------------------------------------------------- +# Stdout / stderr separation +# --------------------------------------------------------------------------- + + +class TestStdoutStderrSeparation(TestCase): + def test_hcl2tojson_json_on_stdout_progress_on_stderr(self): + fixture = str(HCL_DIR / "nulls.tf") + result = _run_hcl2tojson(fixture) + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + json.loads(result.stdout) # stdout must be valid JSON + self.assertIn("nulls.tf", result.stderr) + + def test_hcl2tojson_quiet_suppresses_stderr(self): + result = 
_run_hcl2tojson("-q", str(HCL_DIR / "nulls.tf")) + self.assertEqual(result.returncode, 0) + self.assertEqual(result.stderr, "") + json.loads(result.stdout) + + def test_hcl2tojson_error_on_stderr_only(self): + result = _run_hcl2tojson("/nonexistent/path.tf") + self.assertNotEqual(result.returncode, 0) + self.assertEqual(result.stdout, "") + self.assertIn("Error:", result.stderr) + + def test_jsontohcl2_hcl_on_stdout_progress_on_stderr(self): + fixture = str(JSON_DIR / "nulls.json") + result = _run_jsontohcl2(fixture) + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + self.assertIn("terraform", result.stdout) + self.assertIn("nulls.json", result.stderr) + + def test_jsontohcl2_quiet_suppresses_stderr(self): + result = _run_jsontohcl2("-q", str(JSON_DIR / "nulls.json")) + self.assertEqual(result.returncode, 0) + self.assertEqual(result.stderr, "") + self.assertIn("terraform", result.stdout) + + +# --------------------------------------------------------------------------- +# Stdin piping +# --------------------------------------------------------------------------- + + +class TestStdinPiping(TestCase): + def test_hcl2tojson_reads_stdin(self): + result = _run_hcl2tojson(stdin="x = 1\n") + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + data = json.loads(result.stdout) + self.assertEqual(data["x"], 1) + + def test_jsontohcl2_reads_stdin(self): + result = _run_jsontohcl2(stdin='{"x": 1}') + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + self.assertIn("x", result.stdout) + self.assertIn("1", result.stdout) + + def test_hcl2tojson_stdin_explicit_dash(self): + result = _run_hcl2tojson("-", stdin="x = 1\n") + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + data = json.loads(result.stdout) + self.assertEqual(data["x"], 1) + + def test_jsontohcl2_stdin_explicit_dash(self): + result = _run_jsontohcl2("-", stdin='{"x": 1}') + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + 
self.assertIn("x", result.stdout) + self.assertIn("1", result.stdout) + + def test_hcl2tojson_stdin_to_output_file(self): + with tempfile.TemporaryDirectory() as tmpdir: + out = os.path.join(tmpdir, "out.json") + result = _run_hcl2tojson("-", "-o", out, stdin="x = 1\n") + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + with open(out, encoding="utf-8") as f: + data = json.load(f) + self.assertEqual(data["x"], 1) + + def test_jsontohcl2_stdin_to_output_file(self): + with tempfile.TemporaryDirectory() as tmpdir: + out = os.path.join(tmpdir, "out.tf") + result = _run_jsontohcl2("-", "-o", out, stdin='{"x": 1}') + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + with open(out, encoding="utf-8") as f: + content = f.read() + self.assertIn("x", content) + self.assertIn("1", content) + + +# --------------------------------------------------------------------------- +# Pipe composition (highest value) +# --------------------------------------------------------------------------- + + +class TestPipeComposition(TestCase): + maxDiff = None + + def test_hcl_to_json_to_hcl_round_trip(self): + """hcl2tojson | jsontohcl2 | hcl2tojson — JSON must match.""" + for suite in _get_suites(): + with self.subTest(suite=suite): + hcl_path = str(HCL_DIR / f"{suite}.tf") + + step1 = _run_hcl2tojson(hcl_path) + self.assertEqual(step1.returncode, 0, f"step1 stderr: {step1.stderr}") + + step2 = _run_jsontohcl2(stdin=step1.stdout) + self.assertEqual(step2.returncode, 0, f"step2 stderr: {step2.stderr}") + + step3 = _run_hcl2tojson(stdin=step2.stdout) + self.assertEqual(step3.returncode, 0, f"step3 stderr: {step3.stderr}") + + json1 = json.loads(step1.stdout) + json3 = json.loads(step3.stdout) + self.assertEqual( + json3, + json1, + f"HCL -> JSON -> HCL -> JSON mismatch for {suite}", + ) + + def test_json_to_hcl_to_json_round_trip(self): + """jsontohcl2 | hcl2tojson — JSON must match reserialized golden.""" + for suite in _get_suites(): + with 
self.subTest(suite=suite): + json_path = str(JSON_DIR / f"{suite}.json") + golden_path = JSON_RESERIALIZED_DIR / f"{suite}.json" + + step1 = _run_jsontohcl2(json_path) + self.assertEqual(step1.returncode, 0, f"step1 stderr: {step1.stderr}") + + step2 = _run_hcl2tojson(stdin=step1.stdout) + self.assertEqual(step2.returncode, 0, f"step2 stderr: {step2.stderr}") + + actual = json.loads(step2.stdout) + expected = json.loads(golden_path.read_text()) + self.assertEqual( + actual, + expected, + f"JSON -> HCL -> JSON mismatch for {suite}", + ) + + def test_round_trip_matches_golden_hcl(self): + """hcl2tojson | jsontohcl2 — HCL must match reconstructed golden.""" + for suite in _get_suites(): + with self.subTest(suite=suite): + hcl_path = str(HCL_DIR / f"{suite}.tf") + golden_path = HCL_RECONSTRUCTED_DIR / f"{suite}.tf" + + step1 = _run_hcl2tojson(hcl_path) + self.assertEqual(step1.returncode, 0, f"step1 stderr: {step1.stderr}") + + step2 = _run_jsontohcl2(stdin=step1.stdout) + self.assertEqual(step2.returncode, 0, f"step2 stderr: {step2.stderr}") + + # The CLI helper appends a trailing newline after conversion + # output, so normalize before comparing with golden files. 
+ expected = golden_path.read_text() + actual = step2.stdout.rstrip("\n") + "\n" + self.assertMultiLineEqual(actual, expected) + + +# --------------------------------------------------------------------------- +# File output (-o flag) +# --------------------------------------------------------------------------- + + +class TestFileOutput(TestCase): + maxDiff = None + + def test_hcl2tojson_single_file_to_output(self): + with tempfile.TemporaryDirectory() as tmpdir: + out_path = os.path.join(tmpdir, "out.json") + fixture = str(HCL_DIR / "nulls.tf") + golden = JSON_DIR / "nulls.json" + + result = _run_hcl2tojson(fixture, "-o", out_path) + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + + with open(out_path, encoding="utf-8") as f: + actual = json.load(f) + expected = json.loads(golden.read_text()) + self.assertEqual(actual, expected) + + def test_jsontohcl2_single_file_to_output(self): + with tempfile.TemporaryDirectory() as tmpdir: + out_path = os.path.join(tmpdir, "out.tf") + fixture = str(JSON_DIR / "nulls.json") + + result = _run_jsontohcl2(fixture, "-o", out_path) + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + + with open(out_path, encoding="utf-8") as f: + content = f.read() + self.assertIn("terraform", content) + + def test_hcl2tojson_directory_to_output_dir(self): + with tempfile.TemporaryDirectory() as tmpdir: + outdir = os.path.join(tmpdir, "out") + result = _run_hcl2tojson(str(HCL_DIR), "-o", outdir) + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + + expected_files = {f"{s}.json" for s in _get_suites()} + actual_files = set(os.listdir(outdir)) + self.assertEqual(actual_files, expected_files) + + def test_jsontohcl2_directory_to_output_dir(self): + with tempfile.TemporaryDirectory() as tmpdir: + outdir = os.path.join(tmpdir, "out") + result = _run_jsontohcl2(str(JSON_DIR), "-o", outdir) + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + + expected_files = {f"{s}.tf" for s 
in _get_suites()} + actual_files = set(os.listdir(outdir)) + self.assertEqual(actual_files, expected_files) + + +# --------------------------------------------------------------------------- +# NDJSON mode +# --------------------------------------------------------------------------- + + +class TestNdjsonSubprocess(TestCase): + def test_ndjson_single_file(self): + result = _run_hcl2tojson("--ndjson", str(HCL_DIR / "nulls.tf")) + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + lines = result.stdout.strip().split("\n") + self.assertEqual(len(lines), 1) + data = json.loads(lines[0]) + self.assertNotIn("__file__", data) + + def test_ndjson_directory(self): + result = _run_hcl2tojson("--ndjson", str(HCL_DIR)) + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + lines = result.stdout.strip().split("\n") + self.assertEqual(len(lines), len(_get_suites())) + for line in lines: + data = json.loads(line) + self.assertIn("__file__", data) + + def test_ndjson_from_stdin(self): + result = _run_hcl2tojson("--ndjson", "-", stdin="x = 1\n") + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + lines = result.stdout.strip().split("\n") + self.assertEqual(len(lines), 1) + data = json.loads(lines[0]) + self.assertEqual(data["x"], 1) + + def test_ndjson_parse_error_structured_json_on_stderr(self): + with tempfile.TemporaryDirectory() as tmpdir: + bad = os.path.join(tmpdir, "bad.tf") + _write_file(bad, "{{{") + result = _run_hcl2tojson("--ndjson", bad) + self.assertEqual(result.returncode, 2) + # NDJSON mode emits structured JSON errors to stderr + # (stderr may also contain the filename progress line) + json_lines = [] + for line in result.stderr.strip().splitlines(): + try: + json_lines.append(json.loads(line)) + except json.JSONDecodeError: + pass + self.assertTrue(json_lines, f"No JSON error on stderr: {result.stderr}") + err = json_lines[0] + self.assertIn("error", err) + self.assertIn("message", err) + + def 
test_ndjson_all_io_fail_with_skip_exits_4(self): + result = _run_hcl2tojson( + "--ndjson", "-s", "/nonexistent/a.tf", "/nonexistent/b.tf" + ) + # All-fail with IO errors should exit 4 (EXIT_IO_ERROR), not 2 + self.assertEqual(result.returncode, 4, f"stderr: {result.stderr}") + + def test_ndjson_only_filter_skips_empty(self): + result = _run_hcl2tojson( + "--ndjson", "--only", "nonexistent_block_type", str(HCL_DIR / "nulls.tf") + ) + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + # No NDJSON line emitted when all data is filtered out + self.assertEqual(result.stdout.strip(), "") + + def test_ndjson_json_indent_warning(self): + result = _run_hcl2tojson( + "--ndjson", "--json-indent", "2", str(HCL_DIR / "nulls.tf") + ) + self.assertEqual(result.returncode, 0) + self.assertIn("ignored", result.stderr.lower()) + + +# --------------------------------------------------------------------------- +# Compact / indent output +# --------------------------------------------------------------------------- + + +class TestCompactOutput(TestCase): + def test_compact_flag_single_line(self): + result = _run_hcl2tojson("--compact", str(HCL_DIR / "nulls.tf")) + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + # Compact output: only a trailing newline, no interior newlines + content = result.stdout.rstrip("\n") + self.assertNotIn("\n", content) + # Truly compact: no spaces after separators (uses ",:" not ", : ") + self.assertNotRegex(content, r'": ') + self.assertNotRegex(content, r", ") + + def test_pipe_default_is_compact(self): + # When stdout is a pipe (not TTY), the default is compact output. + # This code path cannot be tested in unit tests that mock sys.stdout. 
+ result = _run_hcl2tojson(str(HCL_DIR / "nulls.tf")) + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + content = result.stdout.rstrip("\n") + self.assertNotIn("\n", content) + + def test_json_indent_overrides_default(self): + result = _run_hcl2tojson("--json-indent", "2", str(HCL_DIR / "nulls.tf")) + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + # Indented output should have multiple lines + lines = result.stdout.strip().split("\n") + self.assertGreater(len(lines), 1) + + +# --------------------------------------------------------------------------- +# Diff mode +# --------------------------------------------------------------------------- + + +class TestDiffMode(TestCase): + def test_diff_shows_differences(self): + with tempfile.TemporaryDirectory() as tmpdir: + modified = os.path.join(tmpdir, "modified.json") + _write_file(modified, json.dumps({"x": 999})) + original = str(HCL_RECONSTRUCTED_DIR / "nulls.tf") + + result = _run_jsontohcl2("--diff", original, modified) + self.assertEqual(result.returncode, 5, f"stderr: {result.stderr}") + self.assertIn("---", result.stdout) + self.assertIn("+++", result.stdout) + + def test_diff_no_differences(self): + result = _run_jsontohcl2( + "--diff", + str(HCL_RECONSTRUCTED_DIR / "nulls.tf"), + str(JSON_DIR / "nulls.json"), + ) + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + self.assertEqual(result.stdout, "") + + def test_diff_from_stdin(self): + json_text = (JSON_DIR / "nulls.json").read_text() + result = _run_jsontohcl2( + "--diff", str(HCL_RECONSTRUCTED_DIR / "nulls.tf"), "-", stdin=json_text + ) + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + self.assertEqual(result.stdout, "") + + +# --------------------------------------------------------------------------- +# Semantic diff mode +# --------------------------------------------------------------------------- + + +class TestSemanticDiffMode(TestCase): + def 
test_semantic_diff_no_changes_exits_0(self): + """Round-trip HCL→JSON should show no semantic differences.""" + for suite in _get_suites(): + with self.subTest(suite=suite): + hcl_path = str(HCL_DIR / f"{suite}.tf") + json_path = str(JSON_DIR / f"{suite}.json") + result = _run_jsontohcl2("--semantic-diff", hcl_path, json_path) + self.assertEqual( + result.returncode, + 0, + f"Unexpected diff for {suite}:\n{result.stdout}", + ) + self.assertEqual(result.stdout, "") + + def test_semantic_diff_detects_value_change(self): + with tempfile.TemporaryDirectory() as tmpdir: + hcl_path = os.path.join(tmpdir, "original.tf") + json_path = os.path.join(tmpdir, "modified.json") + _write_file(hcl_path, "x = 1\n") + _write_file(json_path, json.dumps({"x": 2})) + + result = _run_jsontohcl2("--semantic-diff", hcl_path, json_path) + self.assertEqual(result.returncode, 5) + self.assertIn("x", result.stdout) + self.assertIn("~", result.stdout) + + def test_semantic_diff_ignores_formatting(self): + """Text diff would show changes; semantic diff should show none.""" + hcl = 'resource "aws_instance" "main" {\n ami = "abc-123"\n}\n' + with tempfile.TemporaryDirectory() as tmpdir: + hcl_path = os.path.join(tmpdir, "original.tf") + json_path = os.path.join(tmpdir, "modified.json") + _write_file(hcl_path, hcl) + + # Convert to JSON first, then semantic-diff against original + step1 = _run_hcl2tojson(hcl_path) + self.assertEqual(step1.returncode, 0, f"step1 stderr: {step1.stderr}") + _write_file(json_path, step1.stdout) + + # Text diff would show formatting noise; semantic diff should be clean + result = _run_jsontohcl2("--semantic-diff", hcl_path, json_path) + self.assertEqual(result.returncode, 0, f"stdout: {result.stdout}") + self.assertEqual(result.stdout, "") + + def test_semantic_diff_json_output(self): + with tempfile.TemporaryDirectory() as tmpdir: + hcl_path = os.path.join(tmpdir, "original.tf") + json_path = os.path.join(tmpdir, "modified.json") + _write_file(hcl_path, "x = 1\n") + 
_write_file(json_path, json.dumps({"x": 2})) + + result = _run_jsontohcl2( + "--semantic-diff", hcl_path, "--diff-json", json_path + ) + self.assertEqual(result.returncode, 5) + entries = json.loads(result.stdout) + self.assertEqual(len(entries), 1) + self.assertEqual(entries[0]["kind"], "changed") + self.assertEqual(entries[0]["path"], "x") + + def test_semantic_diff_from_stdin(self): + with tempfile.TemporaryDirectory() as tmpdir: + hcl_path = os.path.join(tmpdir, "original.tf") + _write_file(hcl_path, "x = 1\n") + + result = _run_jsontohcl2( + "--semantic-diff", hcl_path, "-", stdin=json.dumps({"x": 99}) + ) + self.assertEqual(result.returncode, 5) + self.assertIn("x", result.stdout) + + def test_semantic_diff_file_not_found_exits_4(self): + with tempfile.TemporaryDirectory() as tmpdir: + json_path = os.path.join(tmpdir, "test.json") + _write_file(json_path, '{"x": 1}') + + result = _run_jsontohcl2("--semantic-diff", "/nonexistent.tf", json_path) + self.assertEqual(result.returncode, 4) + self.assertIn("Error:", result.stderr) + + def test_semantic_diff_pipe_composition(self): + """hcl2tojson | modify | jsontohcl2 --semantic-diff — should detect changes.""" + hcl_path = str(HCL_DIR / "nulls.tf") + step1 = _run_hcl2tojson(hcl_path) + self.assertEqual(step1.returncode, 0, f"step1 stderr: {step1.stderr}") + + # Modify one value in the JSON + data = json.loads(step1.stdout) + data["x_injected_key"] = 42 + modified_json = json.dumps(data) + + result = _run_jsontohcl2("--semantic-diff", hcl_path, "-", stdin=modified_json) + self.assertEqual(result.returncode, 5) + self.assertIn("x_injected_key", result.stdout) + + +class TestFragmentMode(TestCase): + def test_fragment_from_stdin(self): + result = _run_jsontohcl2( + "--fragment", "-", stdin='{"cpu": 512, "memory": 1024}' + ) + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + self.assertIn("cpu", result.stdout) + self.assertIn("512", result.stdout) + self.assertIn("memory", result.stdout) + 
self.assertIn("1024", result.stdout) + + def test_fragment_strips_is_block_markers(self): + """hcl2tojson output piped to jsontohcl2 --fragment strips __is_block__.""" + step1 = _run_hcl2tojson(str(HCL_DIR / "nulls.tf")) + self.assertEqual(step1.returncode, 0, f"step1 stderr: {step1.stderr}") + result = _run_jsontohcl2("--fragment", "-", stdin=step1.stdout) + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + self.assertNotIn("__is_block__", result.stdout) + self.assertIn("terraform", result.stdout) + + +# --------------------------------------------------------------------------- +# Stdout buffering with skip +# --------------------------------------------------------------------------- + + +class TestFieldsProjection(TestCase): + def test_fields_does_not_leak_leaf_lists(self): + """--fields should drop leaf list values not in the field set.""" + hcl = ( + 'module "test" {\n' + ' source = "../../modules/test/v1"\n' + " cpu = 1024\n" + " memory = 2048\n" + ' regions = ["us-east-1", "us-west-2"]\n' + ' tags = { env = "prod" }\n' + "}\n" + ) + result = _run_hcl2tojson( + "--only", "module", "--fields", "cpu,memory", stdin=hcl + ) + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + data = json.loads(result.stdout) + block = data["module"][0]['"test"'] + self.assertIn("cpu", block) + self.assertIn("memory", block) + self.assertNotIn("regions", block) + self.assertNotIn("tags", block) + self.assertNotIn("source", block) + + def test_fields_preserves_structural_lists(self): + """--fields should still recurse into block-wrapping lists.""" + hcl = ( + 'resource "aws_instance" "main" {\n' + ' ami = "abc-123"\n' + ' instance_type = "t2.micro"\n' + ' tags = { env = "prod" }\n' + "}\n" + ) + result = _run_hcl2tojson("--fields", "ami", stdin=hcl) + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + data = json.loads(result.stdout) + block = data["resource"][0]['"aws_instance"']['"main"'] + self.assertIn("ami", block) + 
self.assertNotIn("instance_type", block) + self.assertNotIn("tags", block) + + +class TestNonDictJsonRejection(TestCase): + def test_dry_run_list_json_exits_2(self): + result = _run_jsontohcl2("--dry-run", "-", stdin='["a", "b"]') + self.assertEqual(result.returncode, 2) + self.assertIn("Error:", result.stderr) + + def test_dry_run_scalar_json_exits_2(self): + result = _run_jsontohcl2("--dry-run", "-", stdin="42") + self.assertEqual(result.returncode, 2) + self.assertIn("Error:", result.stderr) + + def test_normal_mode_list_json_exits_2(self): + result = _run_jsontohcl2(stdin='["a", "b"]') + self.assertEqual(result.returncode, 2) + self.assertIn("Error:", result.stderr) + + def test_fragment_still_rejects_non_dict(self): + result = _run_jsontohcl2("--fragment", "-", stdin="[1, 2, 3]") + self.assertEqual(result.returncode, 2) + + +class TestStdoutBuffering(TestCase): + def test_skip_no_partial_stdout_on_failure(self): + """With -s, a failed file should not leave partial output on stdout.""" + with tempfile.TemporaryDirectory() as tmpdir: + outdir = os.path.join(tmpdir, "out") + indir = os.path.join(tmpdir, "in") + os.mkdir(indir) + _write_file(os.path.join(indir, "good.tf"), "x = 1\n") + _write_file(os.path.join(indir, "bad.tf"), "{{{") + + result = _run_hcl2tojson("-s", indir, "-o", outdir) + self.assertEqual(result.returncode, 1) + # good.tf should produce output, bad.tf should not + self.assertTrue(os.path.exists(os.path.join(outdir, "good.json"))) + self.assertFalse(os.path.exists(os.path.join(outdir, "bad.json"))) + + +# --------------------------------------------------------------------------- +# Multi-file basename collision +# --------------------------------------------------------------------------- + + +class TestMultiFileCollision(TestCase): + def test_basename_collision_preserves_directory_structure(self): + """Files with same basename from different dirs get separate output paths.""" + with tempfile.TemporaryDirectory() as tmpdir: + dir1 = 
os.path.join(tmpdir, "dir1") + dir2 = os.path.join(tmpdir, "dir2") + outdir = os.path.join(tmpdir, "out") + os.mkdir(dir1) + os.mkdir(dir2) + _write_file(os.path.join(dir1, "main.tf"), "x = 1\n") + _write_file(os.path.join(dir2, "main.tf"), "y = 2\n") + + result = _run_hcl2tojson( + os.path.join(dir1, "main.tf"), + os.path.join(dir2, "main.tf"), + "-o", + outdir, + ) + self.assertEqual(result.returncode, 0, f"stderr: {result.stderr}") + + # Both files should exist in output, not overwrite each other + out_files = [] + for root, _dirs, files in os.walk(outdir): + for fname in files: + out_files.append(os.path.join(root, fname)) + self.assertEqual( + len(out_files), 2, f"Expected 2 output files, got: {out_files}" + ) + + # Each should contain different data + contents = set() + for path in out_files: + with open(path, encoding="utf-8") as fobj: + contents.add(fobj.read()) + self.assertEqual( + len(contents), 2, "Output files should have different content" + ) diff --git a/test/integration/test_round_trip.py b/test/integration/test_round_trip.py new file mode 100644 index 00000000..b78141cb --- /dev/null +++ b/test/integration/test_round_trip.py @@ -0,0 +1,235 @@ +"""Round-trip tests for the HCL2 → JSON → HCL2 pipeline. + +Every test starts from the source HCL files in test/integration/hcl2_original/ +and runs the pipeline forward from there, comparing actuals against expected +outputs at each stage: + +1. HCL → JSON serialization (parse + transform + serialize) +2. JSON → JSON reserialization (serialize + deserialize + reserialize) +3. JSON → HCL reconstruction (serialize + deserialize + format + reconstruct) +4. 
Full round-trip (HCL → JSON → HCL → JSON produces identical JSON) +""" +# pylint: disable=C0103,C0114,C0115,C0116 + +import json +from enum import Enum +from pathlib import Path +from typing import List +from unittest import TestCase + +from hcl2.api import parses_to_tree +from hcl2.deserializer import BaseDeserializer +from hcl2.formatter import BaseFormatter +from hcl2.reconstructor import HCLReconstructor +from hcl2.transformer import RuleTransformer + +INTEGRATION_DIR = Path(__file__).absolute().parent +HCL2_ORIGINAL_DIR = INTEGRATION_DIR / "hcl2_original" + +_STEP_DIRS = { + "hcl2_original": HCL2_ORIGINAL_DIR, + "hcl2_reconstructed": INTEGRATION_DIR / "hcl2_reconstructed", + "json_serialized": INTEGRATION_DIR / "json_serialized", + "json_reserialized": INTEGRATION_DIR / "json_reserialized", +} + +_STEP_SUFFIXES = { + "hcl2_original": ".tf", + "hcl2_reconstructed": ".tf", + "json_serialized": ".json", + "json_reserialized": ".json", +} + + +class SuiteStep(Enum): + ORIGINAL = "hcl2_original" + RECONSTRUCTED = "hcl2_reconstructed" + JSON_SERIALIZED = "json_serialized" + JSON_RESERIALIZED = "json_reserialized" + + +def _get_suites() -> List[str]: + """ + Get a list of the test suites. + The name of a test suite is the name of a file in `test/integration/hcl2_original/` without the .tf suffix. + + Override SUITES to run a specific subset, e.g. SUITES = ["config"] + """ + return SUITES or sorted( + file.stem for file in HCL2_ORIGINAL_DIR.iterdir() if file.is_file() + ) + + +# set this to an arbitrary list of test suites to run, +# e.g. 
`SUITES = ["smoke"]` to run the tests only for `test/integration/hcl2_original/smoke.tf` +SUITES: List[str] = [] + + +def _get_suite_file(suite_name: str, step: SuiteStep) -> Path: + """Return the path for a given suite name and pipeline step.""" + return _STEP_DIRS[step.value] / (suite_name + _STEP_SUFFIXES[step.value]) + + +def _parse_and_serialize(hcl_text: str, options=None) -> dict: + """Parse HCL text and serialize to a Python dict.""" + parsed_tree = parses_to_tree(hcl_text) + rules = RuleTransformer().transform(parsed_tree) + if options: + return rules.serialize(options=options) + return rules.serialize() + + +def _direct_reconstruct(hcl_text: str) -> str: + """Parse HCL text, transform to IR, convert to Lark tree, and reconstruct.""" + parsed_tree = parses_to_tree(hcl_text) + rules = RuleTransformer().transform(parsed_tree) + lark_tree = rules.to_lark() + return HCLReconstructor().reconstruct(lark_tree) + + +def _deserialize_and_reserialize(serialized: dict) -> dict: + """Deserialize a Python dict back through the rule tree and reserialize.""" + deserializer = BaseDeserializer() + formatter = BaseFormatter() + deserialized = deserializer.load_python(serialized) + formatter.format_tree(deserialized) + return deserialized.serialize() + + +def _deserialize_and_reconstruct(serialized: dict) -> str: + """Deserialize a Python dict and reconstruct HCL text.""" + deserializer = BaseDeserializer() + formatter = BaseFormatter() + reconstructor = HCLReconstructor() + deserialized = deserializer.load_python(serialized) + formatter.format_tree(deserialized) + lark_tree = deserialized.to_lark() + return reconstructor.reconstruct(lark_tree) + + +class TestRoundTripSerialization(TestCase): + """Test HCL2 → JSON serialization: parse HCL, transform, serialize, compare with expected JSON.""" + + maxDiff = None + + def test_hcl_to_json(self): + for suite in _get_suites(): + with self.subTest(suite=suite): + hcl_path = _get_suite_file(suite, SuiteStep.ORIGINAL) + json_path = 
_get_suite_file(suite, SuiteStep.JSON_SERIALIZED) + + actual = _parse_and_serialize(hcl_path.read_text()) + expected = json.loads(json_path.read_text()) + + self.assertEqual( + actual, + expected, + f"HCL → JSON serialization mismatch for suite {suite}", + ) + + +class TestDirectReconstruction(TestCase): + """Test HCL2 → IR → HCL2 direct pipeline. + + Parse HCL, transform to IR, convert directly to Lark tree (skipping + serialization to dict), reconstruct HCL, then verify the result + matches the original HCL text exactly. + """ + + maxDiff = None + + def test_direct_reconstruct(self): + for suite in _get_suites(): + with self.subTest(suite=suite): + hcl_path = _get_suite_file(suite, SuiteStep.ORIGINAL) + original_hcl = hcl_path.read_text() + + # Direct: HCL → IR → Lark → HCL + reconstructed_hcl = _direct_reconstruct(original_hcl) + + self.assertMultiLineEqual( + reconstructed_hcl, + original_hcl, + f"Direct reconstruction mismatch for suite {suite}: " + f"HCL → IR → HCL did not match original HCL", + ) + + +class TestRoundTripReserialization(TestCase): + """Test JSON → JSON reserialization. + + Parse HCL, serialize, deserialize, reserialize, compare with expected. + """ + + maxDiff = None + + def test_json_reserialization(self): + for suite in _get_suites(): + with self.subTest(suite=suite): + hcl_path = _get_suite_file(suite, SuiteStep.ORIGINAL) + json_reserialized_path = _get_suite_file( + suite, SuiteStep.JSON_RESERIALIZED + ) + + serialized = _parse_and_serialize(hcl_path.read_text()) + actual = _deserialize_and_reserialize(serialized) + + expected = json.loads(json_reserialized_path.read_text()) + self.assertEqual( + actual, + expected, + f"JSON reserialization mismatch for suite {suite}", + ) + + +class TestRoundTripReconstruction(TestCase): + """Test JSON → HCL reconstruction. + + Parse HCL, serialize, deserialize, format, reconstruct, compare with expected HCL. 
+ """ + + maxDiff = None + + def test_json_to_hcl(self): + for suite in _get_suites(): + with self.subTest(suite=suite): + hcl_path = _get_suite_file(suite, SuiteStep.ORIGINAL) + hcl_reconstructed_path = _get_suite_file(suite, SuiteStep.RECONSTRUCTED) + + serialized = _parse_and_serialize(hcl_path.read_text()) + actual = _deserialize_and_reconstruct(serialized) + + expected = hcl_reconstructed_path.read_text() + self.assertMultiLineEqual( + actual, + expected, + f"HCL reconstruction mismatch for suite {suite}", + ) + + +class TestRoundTripFull(TestCase): + """Test full round-trip: HCL → JSON → HCL → JSON should produce matching JSON.""" + + maxDiff = None + + def test_full_round_trip(self): + for suite in _get_suites(): + with self.subTest(suite=suite): + hcl_path = _get_suite_file(suite, SuiteStep.ORIGINAL) + original_hcl = hcl_path.read_text() + + # Forward: HCL → JSON + serialized = _parse_and_serialize(original_hcl) + + # Reconstruct: JSON → HCL + reconstructed_hcl = _deserialize_and_reconstruct(serialized) + + # Reparse: reconstructed HCL → JSON + reserialized = _parse_and_serialize(reconstructed_hcl) + + self.assertEqual( + reserialized, + serialized, + f"Full round-trip mismatch for suite {suite}: " + f"HCL → JSON → HCL → JSON did not produce identical JSON", + ) diff --git a/test/integration/test_specialized.py b/test/integration/test_specialized.py new file mode 100644 index 00000000..60faf194 --- /dev/null +++ b/test/integration/test_specialized.py @@ -0,0 +1,289 @@ +"""Specialized integration tests for specific features and scenarios. + +Unlike the suite-based round-trip tests, these target individual features +(operator precedence, Builder round-trip) with dedicated golden files +in test/integration/specialized/. 
+""" + +# pylint: disable=C0103,C0114,C0115,C0116 + +import json +from pathlib import Path +from typing import Optional +from unittest import TestCase + +from test.integration.test_round_trip import ( + _parse_and_serialize, + _deserialize_and_reserialize, + _deserialize_and_reconstruct, + _direct_reconstruct, +) + +from hcl2.deserializer import BaseDeserializer, DeserializerOptions +from hcl2.formatter import BaseFormatter +from hcl2.reconstructor import HCLReconstructor +from hcl2.utils import SerializationOptions + +SPECIAL_DIR = Path(__file__).absolute().parent / "specialized" + + +class TestOperatorPrecedence(TestCase): + """Test that parsed expressions correctly represent operator precedence. + + Serializes with force_operation_parentheses=True so that implicit + precedence becomes explicit parentheses in the output. + See: https://github.com/amplify-education/python-hcl2/issues/248 + """ + + maxDiff = None + _OPTIONS = SerializationOptions(force_operation_parentheses=True) + + def test_operator_precedence(self): + hcl_path = SPECIAL_DIR / "operator_precedence.tf" + json_path = SPECIAL_DIR / "operator_precedence.json" + + actual = _parse_and_serialize(hcl_path.read_text(), options=self._OPTIONS) + expected = json.loads(json_path.read_text()) + + self.assertEqual(actual, expected) + + +class TestBuilderRoundTrip(TestCase): + """Test that dicts produced by Builder can be deserialized, reconstructed to + valid HCL, and reparsed back to equivalent dicts. 
+ + Pipeline: Builder.build() → from_dict → reconstruct → HCL text + HCL text → parse → serialize → dict (compare with expected) + """ + + maxDiff = None + + def _load_special(self, name, suffix): + return (SPECIAL_DIR / f"{name}{suffix}").read_text() + + def test_builder_reconstruction(self): + """Builder dict → deserialize → reconstruct → compare with expected HCL.""" + builder_dict = json.loads(self._load_special("builder_basic", ".json")) + actual_hcl = _deserialize_and_reconstruct(builder_dict) + expected_hcl = self._load_special("builder_basic", ".tf") + self.assertMultiLineEqual(actual_hcl, expected_hcl) + + def test_builder_full_round_trip(self): + """Builder dict → reconstruct → reparse → compare with expected JSON.""" + builder_dict = json.loads(self._load_special("builder_basic", ".json")) + reconstructed_hcl = _deserialize_and_reconstruct(builder_dict) + actual = _parse_and_serialize(reconstructed_hcl) + expected = json.loads(self._load_special("builder_basic_reparsed", ".json")) + self.assertEqual(actual, expected) + + def test_builder_reserialization(self): + """Builder dict → deserialize → reserialize → compare with expected dict.""" + builder_dict = json.loads(self._load_special("builder_basic", ".json")) + reserialized = _deserialize_and_reserialize(builder_dict) + expected = json.loads(self._load_special("builder_basic_reserialized", ".json")) + self.assertEqual(reserialized, expected) + + +def _deserialize_and_reconstruct_with_options( + serialized: dict, + deserializer_options: Optional[DeserializerOptions] = None, +) -> str: + """Deserialize a Python dict and reconstruct HCL text with custom options.""" + deserializer = BaseDeserializer(deserializer_options) + formatter = BaseFormatter() + reconstructor = HCLReconstructor() + deserialized = deserializer.load_python(serialized) + formatter.format_tree(deserialized) + lark_tree = deserialized.to_lark() + return reconstructor.reconstruct(lark_tree) + + +class TestTemplateDirectives(TestCase): + 
"""Test template directives (%{if}, %{for}) parsing, serialization, and round-trip. + + Covers: basic if/else/endif, for/endfor, strip markers, escaped quotes in + directive expressions (issue #247), nested directives, and escaped directives. + """ + + maxDiff = None + + def _load_special(self, name, suffix): + return (SPECIAL_DIR / f"{name}{suffix}").read_text() + + def test_hcl_to_json(self): + """HCL with directives -> JSON serialization matches expected.""" + hcl_text = self._load_special("template_directives", ".tf") + actual = _parse_and_serialize(hcl_text) + expected = json.loads(self._load_special("template_directives", ".json")) + self.assertEqual(actual, expected) + + def test_direct_reconstruct(self): + """HCL -> IR -> Lark -> HCL matches original.""" + hcl_text = self._load_special("template_directives", ".tf") + actual = _direct_reconstruct(hcl_text) + self.assertMultiLineEqual(actual, hcl_text) + + def test_json_reserialization(self): + """JSON -> deserialize -> reserialize matches expected.""" + hcl_text = self._load_special("template_directives", ".tf") + serialized = _parse_and_serialize(hcl_text) + actual = _deserialize_and_reserialize(serialized) + expected = json.loads( + self._load_special("template_directives_reserialized", ".json") + ) + self.assertEqual(actual, expected) + + def test_json_to_hcl(self): + """JSON -> deserialize -> reconstruct matches expected HCL.""" + hcl_text = self._load_special("template_directives", ".tf") + serialized = _parse_and_serialize(hcl_text) + actual = _deserialize_and_reconstruct(serialized) + expected = self._load_special("template_directives_reconstructed", ".tf") + self.assertMultiLineEqual(actual, expected) + + def test_full_round_trip(self): + """HCL -> JSON -> HCL -> JSON produces identical JSON.""" + hcl_text = self._load_special("template_directives", ".tf") + serialized = _parse_and_serialize(hcl_text) + reconstructed = _deserialize_and_reconstruct(serialized) + reserialized = 
_parse_and_serialize(reconstructed) + self.assertEqual(reserialized, serialized) + + +class TestCommentSerialization(TestCase): + """Test that comments are correctly classified during HCL → JSON serialization. + + Covers: + - Standalone comments (// and #) at body level → __comments__ + - Standalone comments absorbed by binary_op grammar → __comments__ + - Comments inside expressions (objects) → __inline_comments__ + - Multi-line block comments → __comments__ + - Comments in nested blocks + - Top-level comments + """ + + maxDiff = None + _OPTIONS = SerializationOptions(with_comments=True) + + def test_comment_classification(self): + hcl_path = SPECIAL_DIR / "comments.tf" + json_path = SPECIAL_DIR / "comments.json" + + actual = _parse_and_serialize(hcl_path.read_text(), options=self._OPTIONS) + expected = json.loads(json_path.read_text()) + + self.assertEqual(actual, expected) + + def test_top_level_comments(self): + actual = _parse_and_serialize("// file header\nx = 1\n", options=self._OPTIONS) + self.assertEqual(actual["__comments__"], [{"value": "file header"}]) + + def test_standalone_in_body(self): + actual = _parse_and_serialize( + 'resource "a" "b" {\n # standalone\n x = 1\n}\n', + options=self._OPTIONS, + ) + block = actual["resource"][0]['"a"']['"b"'] + self.assertEqual(block["__comments__"], [{"value": "standalone"}]) + self.assertNotIn("__inline_comments__", block) + + def test_absorbed_after_binary_op(self): + actual = _parse_and_serialize( + "x {\n a = 1 + 2\n # absorbed\n b = 3\n}\n", + options=self._OPTIONS, + ) + block = actual["x"][0] + self.assertIn({"value": "absorbed"}, block["__comments__"]) + self.assertNotIn("__inline_comments__", block) + + def test_inline_after_binary_op(self): + actual = _parse_and_serialize( + "x {\n a = 1 + 2 # inline\n b = 3\n}\n", + options=self._OPTIONS, + ) + block = actual["x"][0] + self.assertEqual(block["__inline_comments__"], [{"value": "inline"}]) + + def test_comment_inside_object(self): + actual = 
_parse_and_serialize( + "x {\n m = {\n # inside\n k = 1\n }\n}\n", + options=self._OPTIONS, + ) + block = actual["x"][0] + self.assertEqual(block["__inline_comments__"], [{"value": "inside"}]) + self.assertNotIn("__comments__", block) + + def test_multiline_block_comment(self): + actual = _parse_and_serialize( + "x {\n /*\n multi\n line\n */\n a = 1\n}\n", + options=self._OPTIONS, + ) + block = actual["x"][0] + self.assertEqual(block["__comments__"], [{"value": "multi\n line"}]) + + def test_no_comments_without_option(self): + actual = _parse_and_serialize( + "// comment\nx = 1\n", + options=SerializationOptions(with_comments=False), + ) + self.assertNotIn("__comments__", actual) + self.assertNotIn("__inline_comments__", actual) + + +class TestHeredocs(TestCase): + """Test heredoc serialization, flattening, restoration, and round-trips. + + Scenarios: + 1. HCL with heredocs → JSON (preserve_heredocs=True) + 2. HCL with heredocs → JSON (preserve_heredocs=False, newlines escaped) + 3. Flattened JSON → HCL (strings_to_heredocs=True restores multiline) + 4. 
Full round-trip: flatten → restore → reparse → reflatten matches + """ + + maxDiff = None + _FLATTEN_OPTIONS = SerializationOptions(preserve_heredocs=False) + + def _load_special(self, name, suffix): + return (SPECIAL_DIR / f"{name}{suffix}").read_text() + + def test_parse_preserves_heredocs(self): + """HCL → JSON with default options preserves heredoc markers.""" + hcl_text = self._load_special("heredocs", ".tf") + actual = _parse_and_serialize(hcl_text) + expected = json.loads(self._load_special("heredocs_preserved", ".json")) + self.assertEqual(actual, expected) + + def test_parse_flattens_heredocs(self): + """HCL → JSON with preserve_heredocs=False escapes newlines in quoted strings.""" + hcl_text = self._load_special("heredocs", ".tf") + actual = _parse_and_serialize(hcl_text, options=self._FLATTEN_OPTIONS) + expected = json.loads(self._load_special("heredocs_flattened", ".json")) + self.assertEqual(actual, expected) + + def test_flattened_to_hcl_restores_heredocs(self): + """Flattened JSON → HCL with strings_to_heredocs=True restores multiline heredocs.""" + flattened = json.loads(self._load_special("heredocs_flattened", ".json")) + d_opts = DeserializerOptions(strings_to_heredocs=True) + actual = _deserialize_and_reconstruct_with_options(flattened, d_opts) + expected = self._load_special("heredocs_restored", ".tf") + self.assertMultiLineEqual(actual, expected) + + def test_flatten_restore_round_trip(self): + """Flatten → restore → reparse → reflatten produces identical flattened JSON.""" + hcl_text = self._load_special("heredocs", ".tf") + + # Forward: HCL → flattened JSON + flattened = _parse_and_serialize(hcl_text, options=self._FLATTEN_OPTIONS) + + # Restore: flattened JSON → HCL with heredocs + d_opts = DeserializerOptions(strings_to_heredocs=True) + restored_hcl = _deserialize_and_reconstruct_with_options(flattened, d_opts) + + # Reflatten: restored HCL → flattened JSON + reflattened = _parse_and_serialize(restored_hcl, options=self._FLATTEN_OPTIONS) + 
+ self.assertEqual( + reflattened, + flattened, + "Flatten → restore → reflatten did not produce identical JSON", + ) diff --git a/test/unit/__init__.py b/test/unit/__init__.py index c497b297..e69de29b 100644 --- a/test/unit/__init__.py +++ b/test/unit/__init__.py @@ -1 +0,0 @@ -"""Unit tests -- tests that verify the code of this egg in isolation""" diff --git a/test/unit/cli/__init__.py b/test/unit/cli/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/test/unit/cli/test_hcl_to_json.py b/test/unit/cli/test_hcl_to_json.py new file mode 100644 index 00000000..458fc430 --- /dev/null +++ b/test/unit/cli/test_hcl_to_json.py @@ -0,0 +1,878 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +import json +import os +import tempfile +from io import StringIO +from unittest import TestCase +from unittest.mock import patch + +from cli.helpers import EXIT_IO_ERROR, EXIT_PARSE_ERROR, EXIT_PARTIAL +from cli.hcl_to_json import main + + +SIMPLE_HCL = "x = 1\n" +SIMPLE_JSON_DICT = {"x": 1} + + +def _write_file(path, content): + with open(path, "w", encoding="utf-8") as f: + f.write(content) + + +def _read_file(path): + with open(path, "r", encoding="utf-8") as f: + return f.read() + + +class TestHclToJson(TestCase): + def test_single_file_to_stdout(self): + with tempfile.TemporaryDirectory() as tmpdir: + hcl_path = os.path.join(tmpdir, "test.tf") + _write_file(hcl_path, SIMPLE_HCL) + + stdout = StringIO() + with patch("sys.argv", ["hcl2tojson", hcl_path]): + with patch("sys.stdout", stdout): + main() + + result = json.loads(stdout.getvalue()) + self.assertEqual(result["x"], 1) + + def test_single_file_to_output(self): + with tempfile.TemporaryDirectory() as tmpdir: + hcl_path = os.path.join(tmpdir, "test.tf") + out_path = os.path.join(tmpdir, "test.json") + _write_file(hcl_path, SIMPLE_HCL) + + with patch("sys.argv", ["hcl2tojson", hcl_path, "-o", out_path]): + main() + + result = json.loads(_read_file(out_path)) + self.assertEqual(result["x"], 1) + + def 
test_single_file_to_stdout_single_trailing_newline(self): + with tempfile.TemporaryDirectory() as tmpdir: + hcl_path = os.path.join(tmpdir, "test.tf") + _write_file(hcl_path, SIMPLE_HCL) + + stdout = StringIO() + with patch("sys.argv", ["hcl2tojson", hcl_path]): + with patch("sys.stdout", stdout): + main() + + output = stdout.getvalue() + self.assertTrue(output.endswith("\n"), "output should end with newline") + self.assertFalse( + output.endswith("\n\n"), + "output should not have double trailing newline", + ) + + def test_stdin(self): + stdout = StringIO() + stdin = StringIO(SIMPLE_HCL) + with patch("sys.argv", ["hcl2tojson", "-"]): + with patch("sys.stdin", stdin), patch("sys.stdout", stdout): + main() + + result = json.loads(stdout.getvalue()) + self.assertEqual(result["x"], 1) + + def test_stdin_single_trailing_newline(self): + stdout = StringIO() + stdin = StringIO(SIMPLE_HCL) + with patch("sys.argv", ["hcl2tojson", "-"]): + with patch("sys.stdin", stdin), patch("sys.stdout", stdout): + main() + + output = stdout.getvalue() + self.assertTrue(output.endswith("\n"), "output should end with newline") + self.assertFalse( + output.endswith("\n\n"), "output should not have double trailing newline" + ) + + def test_directory_mode(self): + with tempfile.TemporaryDirectory() as tmpdir: + in_dir = os.path.join(tmpdir, "input") + out_dir = os.path.join(tmpdir, "output") + os.mkdir(in_dir) + + _write_file(os.path.join(in_dir, "a.tf"), SIMPLE_HCL) + _write_file(os.path.join(in_dir, "b.hcl"), SIMPLE_HCL) + _write_file(os.path.join(in_dir, "readme.txt"), "not hcl") + + with patch("sys.argv", ["hcl2tojson", in_dir, "-o", out_dir]): + main() + + self.assertTrue(os.path.exists(os.path.join(out_dir, "a.json"))) + self.assertTrue(os.path.exists(os.path.join(out_dir, "b.json"))) + self.assertFalse(os.path.exists(os.path.join(out_dir, "readme.json"))) + + result = json.loads(_read_file(os.path.join(out_dir, "a.json"))) + self.assertEqual(result["x"], 1) + + def 
test_with_meta_flag(self): + hcl_block = 'resource "a" "b" {\n x = 1\n}\n' + with tempfile.TemporaryDirectory() as tmpdir: + hcl_path = os.path.join(tmpdir, "test.tf") + _write_file(hcl_path, hcl_block) + + stdout = StringIO() + with patch("sys.argv", ["hcl2tojson", "--with-meta", hcl_path]): + with patch("sys.stdout", stdout): + main() + + result = json.loads(stdout.getvalue()) + self.assertIn("resource", result) + + def test_with_comments_flag(self): + hcl_with_comment = "# a comment\nx = 1\n" + with tempfile.TemporaryDirectory() as tmpdir: + hcl_path = os.path.join(tmpdir, "test.tf") + _write_file(hcl_path, hcl_with_comment) + + stdout = StringIO() + with patch("sys.argv", ["hcl2tojson", "--with-comments", hcl_path]): + with patch("sys.stdout", stdout): + main() + + output = stdout.getvalue() + self.assertIn("comment", output) + + def test_wrap_objects_flag(self): + hcl_input = "x = {\n a = 1\n}\n" + with tempfile.TemporaryDirectory() as tmpdir: + hcl_path = os.path.join(tmpdir, "test.tf") + _write_file(hcl_path, hcl_input) + + stdout_default = StringIO() + stdout_wrapped = StringIO() + with patch("sys.argv", ["hcl2tojson", hcl_path]): + with patch("sys.stdout", stdout_default): + main() + with patch("sys.argv", ["hcl2tojson", "--wrap-objects", hcl_path]): + with patch("sys.stdout", stdout_wrapped): + main() + + default = json.loads(stdout_default.getvalue()) + wrapped = json.loads(stdout_wrapped.getvalue()) + self.assertNotEqual(default["x"], wrapped["x"]) + + def test_wrap_tuples_flag(self): + hcl_input = "x = [1, 2]\n" + with tempfile.TemporaryDirectory() as tmpdir: + hcl_path = os.path.join(tmpdir, "test.tf") + _write_file(hcl_path, hcl_input) + + stdout_default = StringIO() + stdout_wrapped = StringIO() + with patch("sys.argv", ["hcl2tojson", hcl_path]): + with patch("sys.stdout", stdout_default): + main() + with patch("sys.argv", ["hcl2tojson", "--wrap-tuples", hcl_path]): + with patch("sys.stdout", stdout_wrapped): + main() + + default =
json.loads(stdout_default.getvalue()) + wrapped = json.loads(stdout_wrapped.getvalue()) + self.assertNotEqual(default["x"], wrapped["x"]) + + def test_skip_flag(self): + with tempfile.TemporaryDirectory() as tmpdir: + in_dir = os.path.join(tmpdir, "input") + out_dir = os.path.join(tmpdir, "output") + os.mkdir(in_dir) + + _write_file(os.path.join(in_dir, "good.tf"), SIMPLE_HCL) + _write_file(os.path.join(in_dir, "bad.tf"), "this is {{{{ not valid hcl") + + with patch("sys.argv", ["hcl2tojson", "-s", in_dir, "-o", out_dir]): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_PARTIAL) + + self.assertTrue(os.path.exists(os.path.join(out_dir, "good.json"))) + + def test_directory_to_stdout_without_ndjson_errors(self): + """Directory without -o or --ndjson is an error.""" + with tempfile.TemporaryDirectory() as tmpdir: + in_dir = os.path.join(tmpdir, "input") + os.mkdir(in_dir) + _write_file(os.path.join(in_dir, "a.tf"), "a = 1\n") + _write_file(os.path.join(in_dir, "b.tf"), "b = 2\n") + + with patch("sys.argv", ["hcl2tojson", in_dir]): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, 2) + + def test_stdin_default_when_no_args(self): + """No PATH args reads from stdin (like jq).""" + stdout = StringIO() + stdin = StringIO(SIMPLE_HCL) + with patch("sys.argv", ["hcl2tojson"]): + with patch("sys.stdin", stdin), patch("sys.stdout", stdout): + main() + + result = json.loads(stdout.getvalue()) + self.assertEqual(result["x"], 1) + + def test_multiple_files_to_stdout_without_ndjson_errors(self): + """Multiple files without -o or --ndjson is an error.""" + with tempfile.TemporaryDirectory() as tmpdir: + path_a = os.path.join(tmpdir, "a.tf") + path_b = os.path.join(tmpdir, "b.tf") + _write_file(path_a, "a = 1\n") + _write_file(path_b, "b = 2\n") + + with patch("sys.argv", ["hcl2tojson", path_a, path_b]): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, 
2) + + def test_multiple_files_to_output_dir(self): + with tempfile.TemporaryDirectory() as tmpdir: + path_a = os.path.join(tmpdir, "a.tf") + path_b = os.path.join(tmpdir, "b.tf") + out_dir = os.path.join(tmpdir, "out") + _write_file(path_a, "a = 1\n") + _write_file(path_b, "b = 2\n") + + with patch("sys.argv", ["hcl2tojson", path_a, path_b, "-o", out_dir]): + main() + + self.assertTrue(os.path.exists(os.path.join(out_dir, "a.json"))) + self.assertTrue(os.path.exists(os.path.join(out_dir, "b.json"))) + + def test_multiple_files_invalid_path_exits_4(self): + with tempfile.TemporaryDirectory() as tmpdir: + path_a = os.path.join(tmpdir, "a.tf") + out_dir = os.path.join(tmpdir, "out") + os.mkdir(out_dir) + _write_file(path_a, "a = 1\n") + + with patch( + "sys.argv", + ["hcl2tojson", path_a, "/nonexistent.tf", "-o", out_dir], + ): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_IO_ERROR) + + def test_invalid_path_exits_4(self): + with patch("sys.argv", ["hcl2tojson", "/nonexistent/path/foo.tf"]): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_IO_ERROR) + + +class TestSingleFileErrorHandling(TestCase): + def test_skip_error_with_output_file(self): + with tempfile.TemporaryDirectory() as tmpdir: + in_path = os.path.join(tmpdir, "test.tf") + out_path = os.path.join(tmpdir, "out.json") + _write_file(in_path, "this is {{{{ not valid hcl") + + with patch("sys.argv", ["hcl2tojson", "-s", in_path, "-o", out_path]): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_PARTIAL) + + # The partial output file is cleaned up on skipped errors. 
+ self.assertFalse(os.path.exists(out_path)) + + def test_parse_error_exits_2(self): + with tempfile.TemporaryDirectory() as tmpdir: + in_path = os.path.join(tmpdir, "test.tf") + out_path = os.path.join(tmpdir, "out.json") + _write_file(in_path, "this is {{{{ not valid hcl") + + with patch("sys.argv", ["hcl2tojson", in_path, "-o", out_path]): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_PARSE_ERROR) + + def test_skip_error_to_stdout_exits_1(self): + with tempfile.TemporaryDirectory() as tmpdir: + in_path = os.path.join(tmpdir, "test.tf") + _write_file(in_path, "this is {{{{ not valid hcl") + + stdout = StringIO() + with patch("sys.argv", ["hcl2tojson", "-s", in_path]): + with patch("sys.stdout", stdout): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_PARTIAL) + + self.assertEqual(stdout.getvalue(), "") + + def test_skip_stdin_bad_input_exits_1(self): + """With -s, stdin parse errors are skipped and exit code is 1.""" + stdout = StringIO() + stdin = StringIO("this is {{{{ not valid hcl") + with patch("sys.argv", ["hcl2tojson", "-s", "-"]): + with patch("sys.stdin", stdin), patch("sys.stdout", stdout): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_PARTIAL) + self.assertEqual(stdout.getvalue(), "") + + def test_multi_file_stdin_rejected(self): + """Stdin (-) cannot be combined with other file paths.""" + with tempfile.TemporaryDirectory() as tmpdir: + hcl_path = os.path.join(tmpdir, "test.tf") + _write_file(hcl_path, SIMPLE_HCL) + with self.assertRaises(SystemExit) as cm: + with patch("sys.argv", ["hcl2tojson", hcl_path, "-", "-o", tmpdir]): + main() + self.assertEqual(cm.exception.code, 2) # argparse error + + def test_parse_error_to_stdout_exits_2(self): + with tempfile.TemporaryDirectory() as tmpdir: + in_path = os.path.join(tmpdir, "test.tf") + _write_file(in_path, "this is {{{{ not valid hcl") + + stdout = 
StringIO() + with patch("sys.argv", ["hcl2tojson", in_path]): + with patch("sys.stdout", stdout): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_PARSE_ERROR) + + +class TestHclToJsonFlags(TestCase): + def _run_hcl_to_json(self, hcl_content, extra_flags=None): + """Helper: write HCL to a temp file, run main() with flags, return parsed JSON.""" + with tempfile.TemporaryDirectory() as tmpdir: + hcl_path = os.path.join(tmpdir, "test.tf") + _write_file(hcl_path, hcl_content) + + stdout = StringIO() + argv = ["hcl2tojson"] + (extra_flags or []) + [hcl_path] + with patch("sys.argv", argv): + with patch("sys.stdout", stdout): + main() + return json.loads(stdout.getvalue()) + + def test_no_explicit_blocks_flag(self): + hcl = 'resource "a" "b" {\n x = 1\n}\n' + default = self._run_hcl_to_json(hcl) + no_blocks = self._run_hcl_to_json(hcl, ["--no-explicit-blocks"]) + # With explicit blocks, the value is wrapped in a list; without, it may differ + self.assertNotEqual(default, no_blocks) + + def test_no_preserve_heredocs_flag(self): + hcl = "x = < 0) + finally: + os.unlink(f1.name) + os.unlink(f2.name) + + def test_missing_query_with_file_arg_errors(self): + """When user passes only a file path (no query), error instead of hanging on stdin.""" + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as f: + f.write("x = 1\n") + f.flush() + try: + with patch("sys.argv", ["hq", f.name, "--json"]): + with patch("sys.stdin") as mock_stdin: + mock_stdin.isatty.return_value = True + with patch("sys.stderr", new_callable=StringIO): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, 2) + finally: + os.unlink(f.name) + + def test_main_pipe_query(self): + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as f: + f.write("x = {\n a = 1\n b = 2\n}\n") + f.flush() + try: + with patch("sys.argv", ["hq", "x | keys", f.name, "--json"]): + with patch("sys.stdout", 
new_callable=StringIO) as mock_out: + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + data = json.loads(mock_out.getvalue()) + self.assertIsInstance(data, list) + finally: + os.unlink(f.name) + + +class TestExitCodes(TestCase): + """Test distinct exit codes for different error conditions.""" + + def test_success_exits_0(self): + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as f: + f.write("x = 1\n") + f.flush() + try: + with patch("sys.argv", ["hq", "x", f.name, "--value"]): + with patch("sys.stdout", new_callable=StringIO): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + finally: + os.unlink(f.name) + + def test_no_results_exits_1(self): + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as f: + f.write("x = 1\n") + f.flush() + try: + with patch("sys.argv", ["hq", "nonexistent", f.name]): + with patch("sys.stdout", new_callable=StringIO): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_NO_RESULTS) + finally: + os.unlink(f.name) + + def test_parse_error_exits_2(self): + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as f: + f.write("{invalid hcl content\n") + f.flush() + try: + with patch("sys.argv", ["hq", "x", f.name]): + with patch("sys.stdout", new_callable=StringIO): + with patch("sys.stderr", new_callable=StringIO): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_PARSE_ERROR) + finally: + os.unlink(f.name) + + def test_query_syntax_error_exits_3(self): + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as f: + f.write("x = 1\n") + f.flush() + try: + with patch("sys.argv", ["hq", "[[[", f.name]): + with patch("sys.stdout", new_callable=StringIO): + with patch("sys.stderr", new_callable=StringIO): + with self.assertRaises(SystemExit) as cm: + main() + 
self.assertEqual(cm.exception.code, EXIT_QUERY_ERROR) + finally: + os.unlink(f.name) + + def test_io_error_exits_4(self): + with patch("sys.argv", ["hq", "x", "/nonexistent/file.tf"]): + with patch("sys.stdout", new_callable=StringIO): + with patch("sys.stderr", new_callable=StringIO): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_IO_ERROR) + + def test_multi_file_success_masks_parse_error(self): + """If any file produces results, exit 0 even if others fail.""" + with tempfile.NamedTemporaryFile( + mode="w", suffix=".tf", delete=False + ) as good, tempfile.NamedTemporaryFile( + mode="w", suffix=".tf", delete=False + ) as bad: + good.write("x = 1\n") + bad.write("{invalid\n") + good.flush() + bad.flush() + try: + with patch( + "sys.argv", + ["hq", "x", good.name, bad.name, "--value"], + ): + with patch("sys.stdout", new_callable=StringIO): + with patch("sys.stderr", new_callable=StringIO): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + finally: + os.unlink(good.name) + os.unlink(bad.name) + + def test_multi_file_worst_exit_wins(self): + """When no results, worst error code wins.""" + with patch( + "sys.argv", + ["hq", "x", "/nonexistent1.tf", "/nonexistent2.tf"], + ): + with patch("sys.stdout", new_callable=StringIO): + with patch("sys.stderr", new_callable=StringIO): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_IO_ERROR) + + +class TestMultipleFileArgs(TestCase): + def test_two_files(self): + with tempfile.NamedTemporaryFile( + mode="w", suffix=".tf", delete=False + ) as f1, tempfile.NamedTemporaryFile( + mode="w", suffix=".tf", delete=False + ) as f2: + f1.write("x = 1\n") + f2.write("x = 2\n") + f1.flush() + f2.flush() + try: + with patch( + "sys.argv", + ["hq", "x", f1.name, f2.name, "--value"], + ): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + with self.assertRaises(SystemExit) as 
cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + output = mock_out.getvalue() + # Both files should have output with filename prefix + self.assertIn(f1.name, output) + self.assertIn(f2.name, output) + finally: + os.unlink(f1.name) + os.unlink(f2.name) + + def test_dir_and_file_mix(self): + with tempfile.TemporaryDirectory() as tmpdir: + tf_path = os.path.join(tmpdir, "a.tf") + with open(tf_path, "w", encoding="utf-8") as f: + f.write("x = 1\n") + with tempfile.NamedTemporaryFile( + mode="w", suffix=".tf", delete=False + ) as f2: + f2.write("x = 2\n") + f2.flush() + try: + with patch( + "sys.argv", + ["hq", "x", tmpdir, f2.name, "--value"], + ): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + output = mock_out.getvalue() + self.assertIn("1", output) + self.assertIn("2", output) + finally: + os.unlink(f2.name) + + def test_stdin_default_when_no_file(self): + """When no FILE args, defaults to stdin.""" + with patch("sys.argv", ["hq", "x", "--value"]): + with patch("sys.stdin", StringIO("x = 1\n")): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + self.assertIn("1", mock_out.getvalue()) + + +class TestGlobExpansion(TestCase): + def test_glob_pattern_in_file_arg(self): + with tempfile.TemporaryDirectory() as tmpdir: + for name in ("a.tf", "b.tf"): + with open(os.path.join(tmpdir, name), "w", encoding="utf-8") as f: + f.write("x = 1\n") + pattern = os.path.join(tmpdir, "*.tf") + with patch("sys.argv", ["hq", "x", pattern, "--value"]): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + lines = [ + line for line in mock_out.getvalue().strip().split("\n") if line + ] + # Should have results from both 
files + self.assertEqual(len(lines), 2) + + +class TestCompactJson(TestCase): + def test_tty_gets_indented(self): + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as f: + f.write("x = 1\n") + f.flush() + try: + with patch("sys.argv", ["hq", "x", f.name, "--json"]): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + mock_out.isatty = lambda: True + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + output = mock_out.getvalue() + # Indented output has newlines + self.assertIn("\n", output.strip()) + finally: + os.unlink(f.name) + + def test_non_tty_gets_compact(self): + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as f: + f.write("x = 1\n") + f.flush() + try: + with patch("sys.argv", ["hq", "x", f.name, "--json"]): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + mock_out.isatty = lambda: False + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + output = mock_out.getvalue().strip() + # Compact: single line + self.assertEqual(output.count("\n"), 0) + finally: + os.unlink(f.name) + + def test_explicit_indent_overrides(self): + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as f: + f.write("x = {\n a = 1\n}\n") + f.flush() + try: + with patch( + "sys.argv", + ["hq", "x", f.name, "--json", "--json-indent", "4"], + ): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + mock_out.isatty = lambda: False + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + output = mock_out.getvalue() + # Should be indented with 4 spaces + self.assertIn(" ", output) + finally: + os.unlink(f.name) + + +class TestNdjson(TestCase): + def test_single_file_multi_result(self): + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as f: + f.write('variable "a" {}\nvariable "b" {}\n') + f.flush() + try: + 
with patch("sys.argv", ["hq", "variable[*]", f.name, "--ndjson"]): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + lines = [ + l + for l in mock_out.getvalue().strip().split("\n") + if l.strip() + ] + self.assertEqual(len(lines), 2) + # Each line should be valid JSON + for line in lines: + json.loads(line) + finally: + os.unlink(f.name) + + def test_ndjson_with_value_errors(self): + with patch("sys.argv", ["hq", "x", "--ndjson", "--value"]): + with patch("sys.stderr", new_callable=StringIO): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, 2) # argparse error + + def test_ndjson_with_raw_errors(self): + with patch("sys.argv", ["hq", "x", "--ndjson", "--raw"]): + with patch("sys.stderr", new_callable=StringIO): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, 2) # argparse error + + def test_multi_file_ndjson_has_provenance(self): + with tempfile.NamedTemporaryFile( + mode="w", suffix=".tf", delete=False + ) as f1, tempfile.NamedTemporaryFile( + mode="w", suffix=".tf", delete=False + ) as f2: + f1.write("x = 1\n") + f2.write("x = 2\n") + f1.flush() + f2.flush() + try: + with patch( + "sys.argv", + ["hq", "x", f1.name, f2.name, "--ndjson"], + ): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + lines = [ + l + for l in mock_out.getvalue().strip().split("\n") + if l.strip() + ] + self.assertEqual(len(lines), 2) + d1 = json.loads(lines[0]) + d2 = json.loads(lines[1]) + self.assertIn("__file__", d1) + self.assertIn("__file__", d2) + finally: + os.unlink(f1.name) + os.unlink(f2.name) + + +class TestProvenance(TestCase): + def test_multi_file_json_has_file_key(self): + with tempfile.NamedTemporaryFile( + mode="w", suffix=".tf", delete=False + ) as f1, 
tempfile.NamedTemporaryFile( + mode="w", suffix=".tf", delete=False + ) as f2: + f1.write("x = 1\n") + f2.write("x = 2\n") + f1.flush() + f2.flush() + try: + with patch( + "sys.argv", + ["hq", "x", f1.name, f2.name, "--ndjson"], + ): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + lines = [ + l + for l in mock_out.getvalue().strip().split("\n") + if l.strip() + ] + for line in lines: + data = json.loads(line) + self.assertIn("__file__", data) + finally: + os.unlink(f1.name) + os.unlink(f2.name) + + def test_multi_file_json_produces_valid_merged_array(self): + """--json with multiple files must produce a single valid JSON array.""" + with tempfile.NamedTemporaryFile( + mode="w", suffix=".tf", delete=False + ) as f1, tempfile.NamedTemporaryFile( + mode="w", suffix=".tf", delete=False + ) as f2: + f1.write("x = 1\n") + f2.write("y = 2\n") + f1.flush() + f2.flush() + try: + with patch( + "sys.argv", + ["hq", "*", f1.name, f2.name, "--json"], + ): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + raw = mock_out.getvalue().strip() + # Must be valid JSON (single array, not concatenated) + data = json.loads(raw) + self.assertIsInstance(data, list) + self.assertEqual(len(data), 2) + # Each result should have __file__ provenance + for item in data: + self.assertIn("__file__", item) + finally: + os.unlink(f1.name) + os.unlink(f2.name) + + def test_single_file_json_no_provenance(self): + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as f: + f.write("x = 1\n") + f.flush() + try: + with patch("sys.argv", ["hq", "x", f.name, "--ndjson"]): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + data = 
json.loads(mock_out.getvalue().strip()) + self.assertNotIn("__file__", data) + finally: + os.unlink(f.name) + + +class TestWithLocation(TestCase): + def test_with_location_json(self): + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as f: + f.write("x = 1\n") + f.flush() + try: + with patch( + "sys.argv", + ["hq", "x", f.name, "--json", "--with-location"], + ): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + mock_out.isatty = lambda: True + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + data = json.loads(mock_out.getvalue()) + self.assertIn("__file__", data) + self.assertIn("__line__", data) + finally: + os.unlink(f.name) + + def test_with_location_ndjson(self): + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as f: + f.write("x = 1\ny = 2\n") + f.flush() + try: + with patch( + "sys.argv", + ["hq", "*[*]", f.name, "--ndjson", "--with-location"], + ): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + lines = [ + l + for l in mock_out.getvalue().strip().split("\n") + if l.strip() + ] + self.assertTrue(len(lines) >= 2) + for line in lines: + data = json.loads(line) + self.assertIn("__file__", data) + self.assertIn("__line__", data) + finally: + os.unlink(f.name) + + def test_with_location_after_construct(self): + """--with-location preserves line numbers through object construction.""" + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as f: + f.write('resource "aws_instance" "main" {\n ami = "ami-123"\n}\n') + f.flush() + try: + with patch( + "sys.argv", + [ + "hq", + "resource[*] | {ami: .ami}", + f.name, + "--json", + "--with-location", + ], + ): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + mock_out.isatty = lambda: True + with self.assertRaises(SystemExit) as cm: + main() + 
self.assertEqual(cm.exception.code, EXIT_SUCCESS) + data = json.loads(mock_out.getvalue()) + self.assertIn("__file__", data) + self.assertIn("__line__", data) + self.assertIn("ami", data) + finally: + os.unlink(f.name) + + def test_with_location_requires_json(self): + with patch("sys.argv", ["hq", "x", "--with-location", "--value"]): + with patch("sys.stderr", new_callable=StringIO): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, 2) # argparse error + + +class TestWithComments(TestCase): + def test_with_comments_json(self): + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as f: + f.write("# a comment\nx = 1\n") + f.flush() + try: + with patch( + "sys.argv", + ["hq", "x", f.name, "--json", "--with-comments"], + ): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + mock_out.isatty = lambda: True + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + data = json.loads(mock_out.getvalue()) + # The exact format depends on SerializationOptions, + # but the output should be valid JSON + self.assertIsNotNone(data) + finally: + os.unlink(f.name) + + def test_with_comments_requires_json(self): + with patch("sys.argv", ["hq", "x", "--with-comments", "--value"]): + with patch("sys.stderr", new_callable=StringIO): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, 2) # argparse error + + +class TestValueAutoUnwrap(TestCase): + def test_attribute_value_unwrapped(self): + """--value on an attribute should return the inner value, not {key: val}.""" + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as f: + f.write('x = "hello"\n') + f.flush() + try: + with patch("sys.argv", ["hq", "x", f.name, "--value"]): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + output = 
mock_out.getvalue().strip() + # Should not contain the key wrapper + self.assertNotIn("x", output) + # Should contain the value + self.assertIn("hello", output) + finally: + os.unlink(f.name) + + def test_attribute_integer_unwrapped(self): + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as f: + f.write("count = 42\n") + f.flush() + try: + with patch("sys.argv", ["hq", "count", f.name, "--value"]): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + output = mock_out.getvalue().strip() + self.assertEqual(output, "42") + finally: + os.unlink(f.name) + + +class TestOptionalWithSelect(TestCase): + def test_optional_after_select(self): + """? after [select(...)] should work.""" + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as f: + f.write("x = 1\ny = 2\n") + f.flush() + try: + with patch( + "sys.argv", + ["hq", '*[select(.name == "nonexistent")]?', f.name, "--value"], + ): + with patch("sys.stdout", new_callable=StringIO): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + finally: + os.unlink(f.name) + + +class TestProcessFile(TestCase): + """Unit tests for the _process_file worker function.""" + + def test_success(self): + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as f: + f.write("x = 1\n") + f.flush() + try: + args = (f.name, "x", False, "x", True, OutputConfig(output_json=True)) + _fp, code, converted, err = _process_file(args) + self.assertEqual(code, EXIT_SUCCESS) + self.assertIsNone(err) + self.assertEqual(len(converted), 1) + self.assertIn("__file__", converted[0]) + finally: + os.unlink(f.name) + + def test_io_error(self): + args = ( + "/nonexistent.tf", + "x", + False, + "x", + True, + OutputConfig(output_json=True), + ) + _fp, code, _converted, err = _process_file(args) + self.assertEqual(code, EXIT_IO_ERROR) 
+ self.assertIsNotNone(err) + self.assertIsNone(_converted) + + def test_parse_error(self): + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as f: + f.write("{invalid\n") + f.flush() + try: + args = (f.name, "x", False, "x", True, OutputConfig(output_json=True)) + _fp, code, _converted, err = _process_file(args) + self.assertEqual(code, EXIT_PARSE_ERROR) + self.assertIsNotNone(err) + finally: + os.unlink(f.name) + + def test_no_results(self): + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as f: + f.write("x = 1\n") + f.flush() + try: + args = ( + f.name, + "nonexistent", + False, + "nonexistent", + True, + OutputConfig(output_json=True), + ) + _fp, code, converted, err = _process_file(args) + self.assertEqual(code, EXIT_SUCCESS) + self.assertEqual(converted, []) + self.assertIsNone(err) + finally: + os.unlink(f.name) + + +class TestParallelMode(TestCase): + """Integration tests for --jobs parallel mode.""" + + def _make_files(self, tmpdir, count): + """Create count .tf files with x = N.""" + paths = [] + for i in range(count): + path = os.path.join(tmpdir, f"f{i:03d}.tf") + with open(path, "w", encoding="utf-8") as f: + f.write(f"x = {i}\n") + paths.append(path) + return paths + + def test_parallel_json_merged(self): + """Parallel JSON mode produces a valid merged array.""" + with tempfile.TemporaryDirectory() as tmpdir: + paths = self._make_files(tmpdir, 25) + argv = ["hq", "x"] + paths + ["--json", "--json-indent", "0"] + with patch("sys.argv", argv): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + mock_out.isatty = lambda: False + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + data = json.loads(mock_out.getvalue()) + self.assertIsInstance(data, list) + self.assertEqual(len(data), 25) + # Each should have __file__ provenance + for item in data: + self.assertIn("__file__", item) + + def test_parallel_ndjson(self): + """Parallel NDJSON mode 
produces valid per-line JSON.""" + with tempfile.TemporaryDirectory() as tmpdir: + paths = self._make_files(tmpdir, 25) + argv = ["hq", "x"] + paths + ["--ndjson"] + with patch("sys.argv", argv): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + lines = [ + l for l in mock_out.getvalue().strip().split("\n") if l.strip() + ] + self.assertEqual(len(lines), 25) + for line in lines: + data = json.loads(line) + self.assertIn("__file__", data) + + def test_serial_fallback_with_jobs_0(self): + """--jobs 0 forces serial even with many files.""" + with tempfile.TemporaryDirectory() as tmpdir: + paths = self._make_files(tmpdir, 25) + argv = ["hq", "x"] + paths + ["--json", "--json-indent", "0", "--jobs", "0"] + with patch("sys.argv", argv): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + mock_out.isatty = lambda: False + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + data = json.loads(mock_out.getvalue()) + self.assertEqual(len(data), 25) + + def test_serial_for_few_files(self): + """< 20 files stays serial (no pool overhead).""" + with tempfile.TemporaryDirectory() as tmpdir: + paths = self._make_files(tmpdir, 5) + argv = ["hq", "x"] + paths + ["--json", "--json-indent", "0"] + with patch("sys.argv", argv): + with patch("sys.stdout", new_callable=StringIO) as mock_out: + mock_out.isatty = lambda: False + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + data = json.loads(mock_out.getvalue()) + self.assertEqual(len(data), 5) + + def test_serial_for_value_mode(self): + """--value mode stays serial even with many files.""" + with tempfile.TemporaryDirectory() as tmpdir: + paths = self._make_files(tmpdir, 25) + argv = ["hq", "x"] + paths + ["--value"] + with patch("sys.argv", argv): + with patch("sys.stdout", new_callable=StringIO) 
as mock_out: + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_SUCCESS) + lines = [ + l for l in mock_out.getvalue().strip().split("\n") if l.strip() + ] + self.assertEqual(len(lines), 25) diff --git a/test/unit/cli/test_json_to_hcl.py b/test/unit/cli/test_json_to_hcl.py new file mode 100644 index 00000000..f7daaf8f --- /dev/null +++ b/test/unit/cli/test_json_to_hcl.py @@ -0,0 +1,802 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +import json +import os +import tempfile +from io import StringIO +from unittest import TestCase +from unittest.mock import patch + +from cli.helpers import EXIT_DIFF, EXIT_IO_ERROR, EXIT_PARSE_ERROR, EXIT_PARTIAL +from cli.json_to_hcl import main + + +SIMPLE_JSON_DICT = {"x": 1} +SIMPLE_JSON = json.dumps(SIMPLE_JSON_DICT) + +BLOCK_JSON_DICT = {"resource": [{"aws_instance": [{"example": [{"ami": "abc-123"}]}]}]} +BLOCK_JSON = json.dumps(BLOCK_JSON_DICT) + + +def _write_file(path, content): + with open(path, "w", encoding="utf-8") as f: + f.write(content) + + +def _read_file(path): + with open(path, "r", encoding="utf-8") as f: + return f.read() + + +class TestJsonToHcl(TestCase): + def test_single_file_to_stdout(self): + with tempfile.TemporaryDirectory() as tmpdir: + json_path = os.path.join(tmpdir, "test.json") + _write_file(json_path, SIMPLE_JSON) + + stdout = StringIO() + with patch("sys.argv", ["jsontohcl2", json_path]): + with patch("sys.stdout", stdout): + main() + + output = stdout.getvalue().strip() + self.assertIn("x", output) + self.assertIn("1", output) + + def test_single_file_to_output(self): + with tempfile.TemporaryDirectory() as tmpdir: + json_path = os.path.join(tmpdir, "test.json") + out_path = os.path.join(tmpdir, "test.tf") + _write_file(json_path, SIMPLE_JSON) + + with patch("sys.argv", ["jsontohcl2", json_path, "-o", out_path]): + main() + + output = _read_file(out_path) + self.assertIn("x", output) + self.assertIn("1", output) + + def test_stdin(self): + stdout = 
StringIO() + stdin = StringIO(SIMPLE_JSON) + with patch("sys.argv", ["jsontohcl2", "-"]): + with patch("sys.stdin", stdin), patch("sys.stdout", stdout): + main() + + output = stdout.getvalue().strip() + self.assertIn("x", output) + self.assertIn("1", output) + + def test_directory_mode(self): + with tempfile.TemporaryDirectory() as tmpdir: + in_dir = os.path.join(tmpdir, "input") + out_dir = os.path.join(tmpdir, "output") + os.mkdir(in_dir) + + _write_file(os.path.join(in_dir, "a.json"), SIMPLE_JSON) + _write_file(os.path.join(in_dir, "readme.txt"), "not json") + + with patch("sys.argv", ["jsontohcl2", in_dir, "-o", out_dir]): + main() + + self.assertTrue(os.path.exists(os.path.join(out_dir, "a.tf"))) + self.assertFalse(os.path.exists(os.path.join(out_dir, "readme.tf"))) + + def test_indent_flag(self): + with tempfile.TemporaryDirectory() as tmpdir: + json_path = os.path.join(tmpdir, "test.json") + _write_file(json_path, BLOCK_JSON) + + stdout = StringIO() + with patch("sys.argv", ["jsontohcl2", "--indent", "4", json_path]): + with patch("sys.stdout", stdout): + main() + + output = stdout.getvalue() + self.assertIn(" ami", output) + + def test_no_align_flag(self): + hcl_json = json.dumps({"short": 1, "very_long_name": 2}) + with tempfile.TemporaryDirectory() as tmpdir: + json_path = os.path.join(tmpdir, "test.json") + _write_file(json_path, hcl_json) + + stdout = StringIO() + with patch("sys.argv", ["jsontohcl2", "--no-align", json_path]): + with patch("sys.stdout", stdout): + main() + + output = stdout.getvalue() + for line in output.strip().split("\n"): + line = line.strip() + if line.startswith("short"): + self.assertNotIn(" =", line) + + def test_colon_separator_flag(self): + hcl_json = json.dumps({"x": {"a": 1}}) + with tempfile.TemporaryDirectory() as tmpdir: + json_path = os.path.join(tmpdir, "test.json") + _write_file(json_path, hcl_json) + + stdout = StringIO() + with patch("sys.argv", ["jsontohcl2", "--colon-separator", json_path]): + with 
patch("sys.stdout", stdout): + main() + + output = stdout.getvalue() + self.assertIn(":", output) + + def test_skip_flag_on_invalid_json(self): + with tempfile.TemporaryDirectory() as tmpdir: + in_dir = os.path.join(tmpdir, "input") + out_dir = os.path.join(tmpdir, "output") + os.mkdir(in_dir) + + _write_file(os.path.join(in_dir, "good.json"), SIMPLE_JSON) + _write_file(os.path.join(in_dir, "bad.json"), "{not valid json") + + with patch("sys.argv", ["jsontohcl2", "-s", in_dir, "-o", out_dir]): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_PARTIAL) + + self.assertTrue(os.path.exists(os.path.join(out_dir, "good.tf"))) + + def test_skip_stdin_bad_input_exits_1(self): + """With -s, stdin JSON parse errors exit 1 (partial).""" + stdout = StringIO() + stdin = StringIO("{not valid json") + with patch("sys.argv", ["jsontohcl2", "-s", "-"]): + with patch("sys.stdin", stdin), patch("sys.stdout", stdout): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_PARTIAL) + self.assertEqual(stdout.getvalue(), "") + + def test_multi_file_stdin_rejected(self): + """Stdin (-) cannot be combined with other file paths.""" + with tempfile.TemporaryDirectory() as tmpdir: + json_path = os.path.join(tmpdir, "test.json") + _write_file(json_path, SIMPLE_JSON) + with self.assertRaises(SystemExit) as cm: + with patch("sys.argv", ["jsontohcl2", json_path, "-", "-o", tmpdir]): + main() + self.assertEqual(cm.exception.code, 2) # argparse error + + def test_invalid_path_exits_4(self): + with patch("sys.argv", ["jsontohcl2", "/nonexistent/path/foo.json"]): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_IO_ERROR) + + def test_stdin_default_when_no_args(self): + """No PATH args reads from stdin.""" + stdout = StringIO() + stdin = StringIO(SIMPLE_JSON) + with patch("sys.argv", ["jsontohcl2"]): + with patch("sys.stdin", stdin), patch("sys.stdout", stdout): + 
main() + + output = stdout.getvalue().strip() + self.assertIn("x", output) + self.assertIn("1", output) + + def test_multiple_files_to_stdout(self): + with tempfile.TemporaryDirectory() as tmpdir: + path_a = os.path.join(tmpdir, "a.json") + path_b = os.path.join(tmpdir, "b.json") + _write_file(path_a, json.dumps({"a": 1})) + _write_file(path_b, json.dumps({"b": 2})) + + stdout = StringIO() + with patch("sys.argv", ["jsontohcl2", path_a, path_b]): + with patch("sys.stdout", stdout): + main() + + output = stdout.getvalue() + self.assertIn("a", output) + self.assertIn("b", output) + + def test_multiple_files_to_output_dir(self): + with tempfile.TemporaryDirectory() as tmpdir: + path_a = os.path.join(tmpdir, "a.json") + path_b = os.path.join(tmpdir, "b.json") + out_dir = os.path.join(tmpdir, "out") + _write_file(path_a, json.dumps({"a": 1})) + _write_file(path_b, json.dumps({"b": 2})) + + with patch("sys.argv", ["jsontohcl2", path_a, path_b, "-o", out_dir]): + main() + + self.assertTrue(os.path.exists(os.path.join(out_dir, "a.tf"))) + self.assertTrue(os.path.exists(os.path.join(out_dir, "b.tf"))) + + +class TestMutuallyExclusiveModes(TestCase): + def test_diff_and_dry_run_rejected(self): + """Fix #5: --diff and --dry-run cannot be combined.""" + with tempfile.TemporaryDirectory() as tmpdir: + path = os.path.join(tmpdir, "test.json") + _write_file(path, SIMPLE_JSON) + + stderr = StringIO() + with patch( + "sys.argv", + ["jsontohcl2", "--diff", path, "--dry-run", path], + ): + with patch("sys.stderr", stderr): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, 2) + + def test_diff_and_fragment_rejected(self): + """Fix #5: --diff and --fragment cannot be combined.""" + with tempfile.TemporaryDirectory() as tmpdir: + path = os.path.join(tmpdir, "test.json") + _write_file(path, SIMPLE_JSON) + + stderr = StringIO() + with patch( + "sys.argv", + ["jsontohcl2", "--diff", path, "--fragment", path], + ): + with patch("sys.stderr", stderr): 
+ with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, 2) + + def test_dry_run_and_fragment_rejected(self): + """Fix #5: --dry-run and --fragment cannot be combined.""" + with tempfile.TemporaryDirectory() as tmpdir: + path = os.path.join(tmpdir, "test.json") + _write_file(path, SIMPLE_JSON) + + stderr = StringIO() + with patch( + "sys.argv", + ["jsontohcl2", "--dry-run", "--fragment", path], + ): + with patch("sys.stderr", stderr): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, 2) + + +class TestDirectoryWithoutOutput(TestCase): + def test_directory_without_output_errors(self): + """Fix #1: jsontohcl2 dir/ without -o should error, not crash.""" + with tempfile.TemporaryDirectory() as tmpdir: + in_dir = os.path.join(tmpdir, "input") + os.mkdir(in_dir) + _write_file(os.path.join(in_dir, "a.json"), SIMPLE_JSON) + + with patch("sys.argv", ["jsontohcl2", in_dir]): + with self.assertRaises(SystemExit) as cm: + main() + # argparse parser.error() exits with code 2 + self.assertEqual(cm.exception.code, 2) + + +class TestPartialFailureExitCode(TestCase): + def test_directory_skip_exits_1(self): + """Fix #3: directory mode with -s and partial failures should exit 1.""" + with tempfile.TemporaryDirectory() as tmpdir: + in_dir = os.path.join(tmpdir, "input") + out_dir = os.path.join(tmpdir, "output") + os.mkdir(in_dir) + + _write_file(os.path.join(in_dir, "good.json"), SIMPLE_JSON) + _write_file(os.path.join(in_dir, "bad.json"), "{not valid json") + + with patch("sys.argv", ["jsontohcl2", "-s", in_dir, "-o", out_dir]): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_PARTIAL) + + def test_multiple_files_skip_exits_1(self): + """Fix #3: multi-file with -s and partial failures should exit 1.""" + with tempfile.TemporaryDirectory() as tmpdir: + good = os.path.join(tmpdir, "good.json") + bad = os.path.join(tmpdir, "bad.json") + out_dir = 
os.path.join(tmpdir, "out") + _write_file(good, SIMPLE_JSON) + _write_file(bad, "{not valid json") + + with patch("sys.argv", ["jsontohcl2", "-s", good, bad, "-o", out_dir]): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_PARTIAL) + + def test_multiple_files_to_stdout_skip_exits_1(self): + """Fix #3: multi-file to stdout with -s and partial failures should exit 1.""" + with tempfile.TemporaryDirectory() as tmpdir: + good = os.path.join(tmpdir, "good.json") + bad = os.path.join(tmpdir, "bad.json") + _write_file(good, SIMPLE_JSON) + _write_file(bad, "{not valid json") + + stdout = StringIO() + with patch("sys.argv", ["jsontohcl2", "-s", good, bad]): + with patch("sys.stdout", stdout): + with self.assertRaises(SystemExit) as cm: + main() + self.assertEqual(cm.exception.code, EXIT_PARTIAL) + + +class TestJsonToHclFlags(TestCase): + def _run_json_to_hcl(self, json_dict, extra_flags=None): + """Helper: write JSON to a temp file, run main() with flags, return HCL output.""" + with tempfile.TemporaryDirectory() as tmpdir: + json_path = os.path.join(tmpdir, "test.json") + _write_file(json_path, json.dumps(json_dict)) + + stdout = StringIO() + argv = ["jsontohcl2"] + (extra_flags or []) + [json_path] + with patch("sys.argv", argv): + with patch("sys.stdout", stdout): + main() + return stdout.getvalue() + + def test_no_trailing_comma_flag(self): + data = {"x": {"a": 1, "b": 2}} + default = self._run_json_to_hcl(data) + no_comma = self._run_json_to_hcl(data, ["--no-trailing-comma"]) + # Default has trailing commas in objects; without flag it doesn't + self.assertNotEqual(default, no_comma) + + def test_heredocs_to_strings_flag(self): + # Serialized heredocs are quoted strings containing heredoc markers + data = {"x": '"< 0) + + def test_repr(self): + attr = _make_attribute("x", 1) + view = view_for(attr) + r = repr(view) + self.assertIn("AttributeView", r) + + def test_find_by_predicate(self): + from hcl2.query.body import 
DocumentView + + doc = DocumentView.parse("x = 1\ny = 2\n") + found = doc.find_by_predicate(lambda n: hasattr(n, "name") and n.name == "x") + self.assertEqual(len(found), 1) + self.assertEqual(found[0].name, "x") + + def test_find_by_predicate_no_match(self): + from hcl2.query.body import DocumentView + + doc = DocumentView.parse("x = 1\n") + found = doc.find_by_predicate(lambda n: False) + self.assertEqual(len(found), 0) + + def test_walk_rules(self): + from hcl2.query.body import DocumentView + + doc = DocumentView.parse("x = 1\n") + rules = doc.walk_rules() + self.assertTrue(len(rules) > 0) + + def test_to_dict_with_options(self): + from hcl2.query.body import DocumentView + + doc = DocumentView.parse("x = 1\n") + attr = doc.attribute("x") + opts = SerializationOptions(with_meta=False) + result = attr.to_dict(options=opts) + self.assertEqual(result, {"x": 1}) + + def test_view_for_mro_fallback(self): + # Neither ExprTermRule nor its parent ExpressionRule is registered, + # so view_for should fall back to the base NodeView + expr = StubExpression("val") + view = view_for(expr) + self.assertIsInstance(view, NodeView) + self.assertEqual(type(view), NodeView) diff --git a/test/unit/query/test_blocks.py b/test/unit/query/test_blocks.py new file mode 100644 index 00000000..80f3a7ac --- /dev/null +++ b/test/unit/query/test_blocks.py @@ -0,0 +1,124 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.query.body import DocumentView +from hcl2.utils import SerializationOptions + + +class TestBlockView(TestCase): + def test_block_type(self): + doc = DocumentView.parse('resource "aws_instance" "main" {}\n') + block = doc.blocks("resource")[0] + self.assertEqual(block.block_type, "resource") + + def test_labels(self): + doc = DocumentView.parse('resource "aws_instance" "main" {}\n') + block = doc.blocks("resource")[0] + self.assertEqual(block.labels, ["resource", "aws_instance", "main"]) + + def test_name_labels(self): + doc = 
DocumentView.parse('resource "aws_instance" "main" {}\n') + block = doc.blocks("resource")[0] + self.assertEqual(block.name_labels, ["aws_instance", "main"]) + + def test_body(self): + doc = DocumentView.parse('resource "type" "name" {\n ami = "test"\n}\n') + block = doc.blocks("resource")[0] + body = block.body + self.assertIsNotNone(body) + + def test_nested_attribute(self): + doc = DocumentView.parse('resource "type" "name" {\n ami = "test"\n}\n') + block = doc.blocks("resource")[0] + attr = block.attribute("ami") + self.assertIsNotNone(attr) + self.assertEqual(attr.name, "ami") + + def test_nested_blocks(self): + hcl = 'resource "type" "name" {\n provisioner "local-exec" {\n command = "echo"\n }\n}\n' + doc = DocumentView.parse(hcl) + block = doc.blocks("resource")[0] + inner = block.blocks("provisioner") + self.assertEqual(len(inner), 1) + + def test_to_hcl(self): + doc = DocumentView.parse('resource "type" "name" {\n ami = "test"\n}\n') + block = doc.blocks("resource")[0] + hcl = block.to_hcl() + self.assertIn("resource", hcl) + self.assertIn("ami", hcl) + + def test_identifier_label(self): + doc = DocumentView.parse("locals {\n x = 1\n}\n") + block = doc.blocks("locals")[0] + self.assertEqual(block.block_type, "locals") + self.assertEqual(block.name_labels, []) + + def test_attributes_list(self): + doc = DocumentView.parse('resource "type" "name" {\n a = 1\n b = 2\n}\n') + block = doc.blocks("resource")[0] + attrs = block.attributes() + self.assertEqual(len(attrs), 2) + + def test_attributes_filtered(self): + doc = DocumentView.parse('resource "type" "name" {\n a = 1\n b = 2\n}\n') + block = doc.blocks("resource")[0] + attrs = block.attributes("a") + self.assertEqual(len(attrs), 1) + self.assertEqual(attrs[0].name, "a") + + +class TestBlockViewAdjacentComments(TestCase): + """Tests for adjacent comment merging in BlockView.to_dict().""" + + _OPTS = SerializationOptions(with_comments=True) + + def test_adjacent_comments_at_outer_level(self): + doc = 
DocumentView.parse( + '# about resource\nresource "type" "name" {\n x = 1\n}\n' + ) + block = doc.blocks("resource")[0] + result = block.to_dict(options=self._OPTS) + # Adjacent comments go at outer level, alongside the label key + self.assertEqual(result["__comments__"], [{"value": "about resource"}]) + self.assertNotIn("__comments__", result['"type"']['"name"']) + + def test_adjacent_separate_from_inner_comments(self): + doc = DocumentView.parse( + '# adjacent\nresource "type" "name" {\n # inner\n x = 1\n}\n' + ) + block = doc.blocks("resource")[0] + result = block.to_dict(options=self._OPTS) + # Adjacent at outer level + self.assertEqual(result["__comments__"], [{"value": "adjacent"}]) + # Inner stays in body dict under __comments__ + body = result['"type"']['"name"'] + self.assertEqual(body["__comments__"], [{"value": "inner"}]) + + def test_no_comments_without_option(self): + doc = DocumentView.parse('# about\nresource "type" "name" {}\n') + block = doc.blocks("resource")[0] + result = block.to_dict() + self.assertNotIn("__comments__", result) + + def test_no_labels_block_merges_adjacent_and_inner(self): + doc = DocumentView.parse("# about locals\nlocals {\n # inner\n x = 1\n}\n") + block = doc.blocks("locals")[0] + result = block.to_dict(options=self._OPTS) + # No name labels -> body dict IS the top level, so they merge + self.assertEqual( + result["__comments__"], + [{"value": "about locals"}, {"value": "inner"}], + ) + + def test_single_label_block(self): + doc = DocumentView.parse('# about var\nvariable "name" {\n default = 1\n}\n') + block = doc.blocks("variable")[0] + result = block.to_dict(options=self._OPTS) + self.assertEqual(result["__comments__"], [{"value": "about var"}]) + + def test_no_adjacent_comments(self): + doc = DocumentView.parse('resource "type" "name" {\n x = 1\n}\n') + block = doc.blocks("resource")[0] + result = block.to_dict(options=self._OPTS) + self.assertNotIn("__comments__", result) diff --git a/test/unit/query/test_body.py 
b/test/unit/query/test_body.py new file mode 100644 index 00000000..0a8b75bf --- /dev/null +++ b/test/unit/query/test_body.py @@ -0,0 +1,175 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.query.body import DocumentView, BodyView, _collect_leading_comments + + +class TestDocumentView(TestCase): + def test_parse(self): + doc = DocumentView.parse("x = 1\n") + self.assertIsInstance(doc, DocumentView) + + def test_body(self): + doc = DocumentView.parse("x = 1\n") + body = doc.body + self.assertIsInstance(body, BodyView) + + def test_blocks(self): + doc = DocumentView.parse( + 'resource "aws_instance" "main" {\n ami = "test"\n}\n' + ) + blocks = doc.blocks("resource") + self.assertEqual(len(blocks), 1) + self.assertEqual(blocks[0].block_type, "resource") + + def test_blocks_no_filter(self): + doc = DocumentView.parse('resource "a" "b" {}\nvariable "c" {}\n') + blocks = doc.blocks() + self.assertEqual(len(blocks), 2) + + def test_blocks_with_labels(self): + doc = DocumentView.parse( + 'resource "aws_instance" "main" {}\nresource "aws_s3_bucket" "data" {}\n' + ) + blocks = doc.blocks("resource", "aws_instance") + self.assertEqual(len(blocks), 1) + + def test_attributes(self): + doc = DocumentView.parse("x = 1\ny = 2\n") + attrs = doc.attributes() + self.assertEqual(len(attrs), 2) + + def test_attributes_filtered(self): + doc = DocumentView.parse("x = 1\ny = 2\n") + attrs = doc.attributes("x") + self.assertEqual(len(attrs), 1) + + def test_attribute(self): + doc = DocumentView.parse("x = 1\ny = 2\n") + attr = doc.attribute("x") + self.assertIsNotNone(attr) + self.assertEqual(attr.name, "x") + + def test_attribute_missing(self): + doc = DocumentView.parse("x = 1\n") + attr = doc.attribute("missing") + self.assertIsNone(attr) + + def test_parse_file(self): + import os + import tempfile + + with tempfile.NamedTemporaryFile(mode="w", suffix=".tf", delete=False) as tmp: + tmp.write("x = 1\n") + tmp.flush() + try: + doc = 
DocumentView.parse_file(tmp.name) + self.assertIsInstance(doc, DocumentView) + attr = doc.attribute("x") + self.assertIsNotNone(attr) + finally: + os.unlink(tmp.name) + + def test_blocks_label_too_many(self): + doc = DocumentView.parse('resource "type" {}\n') + # Ask for more labels than the block has + blocks = doc.blocks("resource", "type", "extra") + self.assertEqual(len(blocks), 0) + + def test_blocks_label_partial_mismatch(self): + doc = DocumentView.parse('resource "aws_instance" "main" {}\n') + blocks = doc.blocks("resource", "aws_s3_bucket") + self.assertEqual(len(blocks), 0) + + +class TestBodyView(TestCase): + def test_blocks(self): + doc = DocumentView.parse('resource "a" "b" {}\n') + body = doc.body + blocks = body.blocks() + self.assertEqual(len(blocks), 1) + + def test_attributes(self): + doc = DocumentView.parse("x = 1\ny = 2\n") + body = doc.body + attrs = body.attributes() + self.assertEqual(len(attrs), 2) + + +class TestCollectLeadingComments(TestCase): + """Tests for _collect_leading_comments helper.""" + + def _body(self, hcl: str): + doc = DocumentView.parse(hcl) + return doc.body.raw # BodyRule + + def test_comment_before_block(self): + body = self._body('# about resource\nresource "a" "b" {}\n') + # Find the BlockRule child + from hcl2.rules.base import BlockRule + + for child in body.children: + if isinstance(child, BlockRule): + result = _collect_leading_comments(body, child.index) + self.assertEqual(result, [{"value": "about resource"}]) + return + self.fail("No BlockRule found") + + def test_comment_before_attribute(self): + body = self._body("# about x\nx = 1\n") + from hcl2.rules.base import AttributeRule + + for child in body.children: + if isinstance(child, AttributeRule): + result = _collect_leading_comments(body, child.index) + self.assertEqual(result, [{"value": "about x"}]) + return + self.fail("No AttributeRule found") + + def test_stops_at_previous_semantic_sibling(self): + body = self._body("x = 1\n# about y\ny = 2\n") + from 
hcl2.rules.base import AttributeRule + + attrs = [c for c in body.children if isinstance(c, AttributeRule)] + # First attribute (x) — comment before it is empty (only bare newlines) + result_x = _collect_leading_comments(body, attrs[0].index) + self.assertEqual(result_x, []) + # Second attribute (y) — has "about y" above it + result_y = _collect_leading_comments(body, attrs[1].index) + self.assertEqual(result_y, [{"value": "about y"}]) + + def test_bare_newlines_not_collected(self): + body = self._body("\n\nx = 1\n") + from hcl2.rules.base import AttributeRule + + for child in body.children: + if isinstance(child, AttributeRule): + result = _collect_leading_comments(body, child.index) + self.assertEqual(result, []) + return + self.fail("No AttributeRule found") + + def test_multiple_comments_in_order(self): + body = self._body("# first\n# second\nx = 1\n") + from hcl2.rules.base import AttributeRule + + for child in body.children: + if isinstance(child, AttributeRule): + result = _collect_leading_comments(body, child.index) + self.assertEqual(result, [{"value": "first"}, {"value": "second"}]) + return + self.fail("No AttributeRule found") + + def test_comment_between_two_blocks(self): + body = self._body('resource "a" "b" {}\n# about variable\nvariable "c" {}\n') + from hcl2.rules.base import BlockRule + + blocks = [c for c in body.children if isinstance(c, BlockRule)] + self.assertEqual(len(blocks), 2) + # First block: no leading comments + self.assertEqual(_collect_leading_comments(body, blocks[0].index), []) + # Second block: "about variable" + self.assertEqual( + _collect_leading_comments(body, blocks[1].index), + [{"value": "about variable"}], + ) diff --git a/test/unit/query/test_builtins.py b/test/unit/query/test_builtins.py new file mode 100644 index 00000000..1a402688 --- /dev/null +++ b/test/unit/query/test_builtins.py @@ -0,0 +1,108 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.query.body import DocumentView 
+from hcl2.query.builtins import apply_builtin +from hcl2.query.path import QuerySyntaxError, parse_path +from hcl2.query.resolver import resolve_path + + +class TestKeysBuiltin(TestCase): + def test_keys_on_object(self): + doc = DocumentView.parse("x = {\n a = 1\n b = 2\n}\n") + results = resolve_path(doc, parse_path("x")) + keys = apply_builtin("keys", results) + self.assertEqual(len(keys), 1) + # ObjectView keys + self.assertEqual(sorted(keys[0]), ["a", "b"]) + + def test_keys_on_body(self): + doc = DocumentView.parse("x = 1\ny = 2\n") + keys = apply_builtin("keys", [doc.body]) + self.assertEqual(len(keys), 1) + self.assertEqual(keys[0], ["x", "y"]) + + def test_keys_on_document(self): + doc = DocumentView.parse('resource "a" "b" {}\nx = 1\n') + keys = apply_builtin("keys", [doc]) + self.assertEqual(len(keys), 1) + self.assertIn("resource", keys[0]) + self.assertIn("x", keys[0]) + + def test_keys_on_block(self): + doc = DocumentView.parse('resource "aws_instance" "main" {}\n') + blocks = doc.blocks("resource") + keys = apply_builtin("keys", blocks) + self.assertEqual(len(keys), 1) + self.assertEqual(keys[0], ["resource", "aws_instance", "main"]) + + def test_keys_on_dict(self): + keys = apply_builtin("keys", [{"a": 1, "b": 2}]) + self.assertEqual(len(keys), 1) + self.assertEqual(sorted(keys[0]), ["a", "b"]) + + +class TestValuesBuiltin(TestCase): + def test_values_on_object(self): + doc = DocumentView.parse("x = {\n a = 1\n b = 2\n}\n") + results = resolve_path(doc, parse_path("x")) + vals = apply_builtin("values", results) + self.assertEqual(len(vals), 1) + self.assertEqual(len(vals[0]), 2) + + def test_values_on_tuple(self): + doc = DocumentView.parse("x = [1, 2, 3]\n") + results = resolve_path(doc, parse_path("x")) + vals = apply_builtin("values", results) + self.assertEqual(len(vals), 1) + self.assertEqual(len(vals[0]), 3) + + def test_values_on_body(self): + doc = DocumentView.parse("x = 1\ny = 2\n") + vals = apply_builtin("values", [doc.body]) + 
self.assertEqual(len(vals), 1) + self.assertEqual(len(vals[0]), 2) + + def test_values_on_dict(self): + vals = apply_builtin("values", [{"a": 1, "b": 2}]) + self.assertEqual(len(vals), 1) + self.assertEqual(sorted(vals[0]), [1, 2]) + + +class TestLengthBuiltin(TestCase): + def test_length_on_tuple(self): + doc = DocumentView.parse("x = [1, 2, 3]\n") + results = resolve_path(doc, parse_path("x")) + lengths = apply_builtin("length", results) + self.assertEqual(lengths, [3]) + + def test_length_on_object(self): + doc = DocumentView.parse("x = {\n a = 1\n b = 2\n}\n") + results = resolve_path(doc, parse_path("x")) + lengths = apply_builtin("length", results) + self.assertEqual(lengths, [2]) + + def test_length_on_body(self): + doc = DocumentView.parse("x = 1\ny = 2\n") + lengths = apply_builtin("length", [doc.body]) + self.assertEqual(lengths, [2]) + + def test_length_on_node_view(self): + doc = DocumentView.parse("x = 1\n") + results = resolve_path(doc, parse_path("x")) + lengths = apply_builtin("length", results) + self.assertEqual(lengths, [1]) + + def test_length_on_list(self): + lengths = apply_builtin("length", [[1, 2, 3]]) + self.assertEqual(lengths, [3]) + + def test_length_on_string(self): + lengths = apply_builtin("length", ["hello"]) + self.assertEqual(lengths, [5]) + + +class TestUnknownBuiltin(TestCase): + def test_unknown_raises(self): + with self.assertRaises(QuerySyntaxError): + apply_builtin("nope", [1]) diff --git a/test/unit/query/test_containers.py b/test/unit/query/test_containers.py new file mode 100644 index 00000000..b5b4db69 --- /dev/null +++ b/test/unit/query/test_containers.py @@ -0,0 +1,67 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.query.body import DocumentView +from hcl2.query.containers import ObjectView, TupleView +from hcl2.rules.containers import ObjectRule, TupleRule +from hcl2.walk import find_first + + +class TestTupleView(TestCase): + def test_elements(self): + doc = 
DocumentView.parse("x = [1, 2, 3]\n") + attr = doc.attribute("x") + tuple_node = find_first(attr.raw, TupleRule) + self.assertIsNotNone(tuple_node) + tv = TupleView(tuple_node) + self.assertEqual(len(tv), 3) + + def test_getitem(self): + doc = DocumentView.parse("x = [1, 2, 3]\n") + attr = doc.attribute("x") + tuple_node = find_first(attr.raw, TupleRule) + tv = TupleView(tuple_node) + elem = tv[0] + self.assertIsNotNone(elem) + + def test_elements_property(self): + doc = DocumentView.parse("x = [1, 2, 3]\n") + attr = doc.attribute("x") + tuple_node = find_first(attr.raw, TupleRule) + tv = TupleView(tuple_node) + elems = tv.elements + self.assertEqual(len(elems), 3) + + +class TestObjectView(TestCase): + def test_entries(self): + doc = DocumentView.parse("x = {\n a = 1\n b = 2\n}\n") + attr = doc.attribute("x") + obj_node = find_first(attr.raw, ObjectRule) + self.assertIsNotNone(obj_node) + ov = ObjectView(obj_node) + entries = ov.entries + self.assertEqual(len(entries), 2) + + def test_keys(self): + doc = DocumentView.parse("x = {\n a = 1\n b = 2\n}\n") + attr = doc.attribute("x") + obj_node = find_first(attr.raw, ObjectRule) + ov = ObjectView(obj_node) + self.assertEqual(ov.keys, ["a", "b"]) + + def test_get(self): + doc = DocumentView.parse("x = {\n a = 1\n b = 2\n}\n") + attr = doc.attribute("x") + obj_node = find_first(attr.raw, ObjectRule) + ov = ObjectView(obj_node) + val = ov.get("a") + self.assertIsNotNone(val) + + def test_get_missing(self): + doc = DocumentView.parse("x = {\n a = 1\n}\n") + attr = doc.attribute("x") + obj_node = find_first(attr.raw, ObjectRule) + ov = ObjectView(obj_node) + val = ov.get("missing") + self.assertIsNone(val) diff --git a/test/unit/query/test_diff.py b/test/unit/query/test_diff.py new file mode 100644 index 00000000..c1832476 --- /dev/null +++ b/test/unit/query/test_diff.py @@ -0,0 +1,109 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +import json +from unittest import TestCase + +from hcl2.query.diff import DiffEntry, 
diff_dicts, format_diff_json, format_diff_text + + +class TestDiffDicts(TestCase): + def test_identical(self): + d = {"a": 1, "b": "hello"} + self.assertEqual(diff_dicts(d, d), []) + + def test_added_key(self): + left = {"a": 1} + right = {"a": 1, "b": 2} + entries = diff_dicts(left, right) + self.assertEqual(len(entries), 1) + self.assertEqual(entries[0].kind, "added") + self.assertEqual(entries[0].path, "b") + self.assertEqual(entries[0].right, 2) + + def test_removed_key(self): + left = {"a": 1, "b": 2} + right = {"a": 1} + entries = diff_dicts(left, right) + self.assertEqual(len(entries), 1) + self.assertEqual(entries[0].kind, "removed") + self.assertEqual(entries[0].path, "b") + self.assertEqual(entries[0].left, 2) + + def test_changed_value(self): + left = {"a": 1} + right = {"a": 2} + entries = diff_dicts(left, right) + self.assertEqual(len(entries), 1) + self.assertEqual(entries[0].kind, "changed") + self.assertEqual(entries[0].left, 1) + self.assertEqual(entries[0].right, 2) + + def test_nested_change(self): + left = {"a": {"b": 1}} + right = {"a": {"b": 2}} + entries = diff_dicts(left, right) + self.assertEqual(len(entries), 1) + self.assertEqual(entries[0].path, "a.b") + self.assertEqual(entries[0].kind, "changed") + + def test_list_added_element(self): + left = {"items": [1, 2]} + right = {"items": [1, 2, 3]} + entries = diff_dicts(left, right) + self.assertEqual(len(entries), 1) + self.assertEqual(entries[0].path, "items[2]") + self.assertEqual(entries[0].kind, "added") + + def test_list_removed_element(self): + left = {"items": [1, 2, 3]} + right = {"items": [1, 2]} + entries = diff_dicts(left, right) + self.assertEqual(len(entries), 1) + self.assertEqual(entries[0].path, "items[2]") + self.assertEqual(entries[0].kind, "removed") + + def test_empty_dicts(self): + self.assertEqual(diff_dicts({}, {}), []) + + def test_multiple_changes(self): + left = {"a": 1, "b": 2, "c": 3} + right = {"a": 1, "b": 99, "d": 4} + entries = diff_dicts(left, right) + kinds 
= {e.path: e.kind for e in entries} + self.assertEqual(kinds["b"], "changed") + self.assertEqual(kinds["c"], "removed") + self.assertEqual(kinds["d"], "added") + + +class TestFormatDiffText(TestCase): + def test_empty(self): + self.assertEqual(format_diff_text([]), "") + + def test_added(self): + entries = [DiffEntry(path="x", kind="added", right=42)] + text = format_diff_text(entries) + self.assertIn("+ x:", text) + self.assertIn("42", text) + + def test_removed(self): + entries = [DiffEntry(path="x", kind="removed", left="old")] + text = format_diff_text(entries) + self.assertIn("- x:", text) + self.assertIn("'old'", text) + + def test_changed(self): + entries = [DiffEntry(path="x", kind="changed", left=1, right=2)] + text = format_diff_text(entries) + self.assertIn("~ x:", text) + self.assertIn("->", text) + + +class TestFormatDiffJson(TestCase): + def test_json_output(self): + entries = [ + DiffEntry(path="a", kind="added", right=1), + DiffEntry(path="b", kind="removed", left=2), + ] + data = json.loads(format_diff_json(entries)) + self.assertEqual(len(data), 2) + self.assertEqual(data[0]["kind"], "added") + self.assertEqual(data[1]["kind"], "removed") diff --git a/test/unit/query/test_expressions.py b/test/unit/query/test_expressions.py new file mode 100644 index 00000000..5d565497 --- /dev/null +++ b/test/unit/query/test_expressions.py @@ -0,0 +1,90 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.query.body import DocumentView +from hcl2.query.expressions import ConditionalView +from hcl2.query.path import parse_path +from hcl2.query.resolver import resolve_path + + +class TestConditionalView(TestCase): + def _parse(self, hcl): + return DocumentView.parse(hcl) + + def test_conditional_detected(self): + doc = self._parse('x = true ? 
"yes" : "no"\n') + results = resolve_path(doc, parse_path("*..conditional:*")) + self.assertEqual(len(results), 1) + self.assertIsInstance(results[0], ConditionalView) + + def test_condition_property(self): + doc = self._parse('x = true ? "yes" : "no"\n') + results = resolve_path(doc, parse_path("*..conditional:*")) + cond = results[0] + self.assertEqual(cond.condition.to_hcl().strip(), "true") + + def test_true_val_property(self): + doc = self._parse('x = true ? "yes" : "no"\n') + results = resolve_path(doc, parse_path("*..conditional:*")) + cond = results[0] + self.assertEqual(cond.true_val.to_hcl().strip(), '"yes"') + + def test_false_val_property(self): + doc = self._parse('x = true ? "yes" : "no"\n') + results = resolve_path(doc, parse_path("*..conditional:*")) + cond = results[0] + self.assertEqual(cond.false_val.to_hcl().strip(), '"no"') + + def test_resolve_condition_by_path(self): + doc = self._parse('x = true ? "yes" : "no"\n') + results = resolve_path(doc, parse_path("*..conditional:*.condition")) + self.assertEqual(len(results), 1) + self.assertEqual(results[0].to_hcl().strip(), "true") + + def test_resolve_true_val_by_path(self): + doc = self._parse('x = true ? "yes" : "no"\n') + results = resolve_path(doc, parse_path("*..conditional:*.true_val")) + self.assertEqual(len(results), 1) + self.assertEqual(results[0].to_hcl().strip(), '"yes"') + + def test_resolve_false_val_by_path(self): + doc = self._parse('x = true ? "yes" : "no"\n') + results = resolve_path(doc, parse_path("*..conditional:*.false_val")) + self.assertEqual(len(results), 1) + self.assertEqual(results[0].to_hcl().strip(), '"no"') + + def test_type_name(self): + from hcl2.query._base import view_type_name + + doc = self._parse('x = true ? "yes" : "no"\n') + results = resolve_path(doc, parse_path("*..conditional:*")) + self.assertEqual(view_type_name(results[0]), "conditional") + + def test_nested_conditional_in_block(self): + hcl = 'resource "aws" "main" {\n val = var.enabled ? 
"on" : "off"\n}\n' + doc = self._parse(hcl) + results = resolve_path(doc, parse_path("resource..conditional:*")) + self.assertEqual(len(results), 1) + self.assertIsInstance(results[0], ConditionalView) + + def test_pipe_to_condition(self): + from hcl2.query.pipeline import ( + classify_stage, + execute_pipeline, + split_pipeline, + ) + + doc = self._parse('x = true ? "yes" : "no"\n') + stages = [ + classify_stage(s) for s in split_pipeline("*..conditional:* | .condition") + ] + results = execute_pipeline(doc, stages) + self.assertEqual(len(results), 1) + self.assertEqual(results[0].to_hcl().strip(), "true") + + def test_conditional_with_complex_condition(self): + doc = self._parse('x = var.count > 0 ? "some" : "none"\n') + results = resolve_path(doc, parse_path("*..conditional:*.condition")) + self.assertEqual(len(results), 1) + # The condition is a binary op + self.assertIn(">", results[0].to_hcl()) diff --git a/test/unit/query/test_for_exprs.py b/test/unit/query/test_for_exprs.py new file mode 100644 index 00000000..b165acc9 --- /dev/null +++ b/test/unit/query/test_for_exprs.py @@ -0,0 +1,119 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.query.body import DocumentView +from hcl2.query.for_exprs import ForTupleView, ForObjectView +from hcl2.rules.for_expressions import ForTupleExprRule, ForObjectExprRule +from hcl2.walk import find_first + + +class TestForTupleView(TestCase): + def test_iterator_name(self): + doc = DocumentView.parse("x = [for item in var.list : item]\n") + node = find_first(doc.raw, ForTupleExprRule) + self.assertIsNotNone(node) + fv = ForTupleView(node) + self.assertEqual(fv.iterator_name, "item") + + def test_second_iterator_name_none(self): + doc = DocumentView.parse("x = [for item in var.list : item]\n") + node = find_first(doc.raw, ForTupleExprRule) + fv = ForTupleView(node) + self.assertIsNone(fv.second_iterator_name) + + def test_second_iterator_name(self): + doc = DocumentView.parse("x = [for k, 
v in var.map : v]\n") + node = find_first(doc.raw, ForTupleExprRule) + fv = ForTupleView(node) + self.assertEqual(fv.second_iterator_name, "v") + + def test_iterable(self): + doc = DocumentView.parse("x = [for item in var.list : item]\n") + node = find_first(doc.raw, ForTupleExprRule) + fv = ForTupleView(node) + self.assertIsNotNone(fv.iterable) + + def test_value_expr(self): + doc = DocumentView.parse("x = [for item in var.list : item]\n") + node = find_first(doc.raw, ForTupleExprRule) + fv = ForTupleView(node) + self.assertIsNotNone(fv.value_expr) + + def test_no_condition(self): + doc = DocumentView.parse("x = [for item in var.list : item]\n") + node = find_first(doc.raw, ForTupleExprRule) + fv = ForTupleView(node) + self.assertFalse(fv.has_condition) + self.assertIsNone(fv.condition) + + def test_with_condition(self): + doc = DocumentView.parse('x = [for item in var.list : item if item != ""]\n') + node = find_first(doc.raw, ForTupleExprRule) + fv = ForTupleView(node) + self.assertTrue(fv.has_condition) + self.assertIsNotNone(fv.condition) + + +class TestForObjectView(TestCase): + def test_iterator_name(self): + doc = DocumentView.parse("x = {for k, v in var.map : k => v}\n") + node = find_first(doc.raw, ForObjectExprRule) + self.assertIsNotNone(node) + fv = ForObjectView(node) + self.assertEqual(fv.iterator_name, "k") + + def test_key_expr(self): + doc = DocumentView.parse("x = {for k, v in var.map : k => v}\n") + node = find_first(doc.raw, ForObjectExprRule) + fv = ForObjectView(node) + self.assertIsNotNone(fv.key_expr) + + def test_value_expr(self): + doc = DocumentView.parse("x = {for k, v in var.map : k => v}\n") + node = find_first(doc.raw, ForObjectExprRule) + fv = ForObjectView(node) + self.assertIsNotNone(fv.value_expr) + + def test_no_ellipsis(self): + doc = DocumentView.parse("x = {for k, v in var.map : k => v}\n") + node = find_first(doc.raw, ForObjectExprRule) + fv = ForObjectView(node) + self.assertFalse(fv.has_ellipsis) + + def 
test_with_ellipsis(self): + doc = DocumentView.parse("x = {for k, v in var.map : k => v...}\n") + node = find_first(doc.raw, ForObjectExprRule) + fv = ForObjectView(node) + self.assertTrue(fv.has_ellipsis) + + def test_second_iterator_name(self): + doc = DocumentView.parse("x = {for k, v in var.map : k => v}\n") + node = find_first(doc.raw, ForObjectExprRule) + fv = ForObjectView(node) + self.assertEqual(fv.second_iterator_name, "v") + + def test_second_iterator_name_none(self): + doc = DocumentView.parse("x = {for item in var.list : item => item}\n") + node = find_first(doc.raw, ForObjectExprRule) + fv = ForObjectView(node) + self.assertIsNone(fv.second_iterator_name) + + def test_iterable(self): + doc = DocumentView.parse("x = {for k, v in var.map : k => v}\n") + node = find_first(doc.raw, ForObjectExprRule) + fv = ForObjectView(node) + self.assertIsNotNone(fv.iterable) + + def test_no_condition(self): + doc = DocumentView.parse("x = {for k, v in var.map : k => v}\n") + node = find_first(doc.raw, ForObjectExprRule) + fv = ForObjectView(node) + self.assertFalse(fv.has_condition) + self.assertIsNone(fv.condition) + + def test_with_condition(self): + doc = DocumentView.parse('x = {for k, v in var.map : k => v if k != ""}\n') + node = find_first(doc.raw, ForObjectExprRule) + fv = ForObjectView(node) + self.assertTrue(fv.has_condition) + self.assertIsNotNone(fv.condition) diff --git a/test/unit/query/test_functions.py b/test/unit/query/test_functions.py new file mode 100644 index 00000000..42541842 --- /dev/null +++ b/test/unit/query/test_functions.py @@ -0,0 +1,46 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.query.body import DocumentView +from hcl2.query.functions import FunctionCallView +from hcl2.rules.functions import FunctionCallRule +from hcl2.walk import find_first + + +class TestFunctionCallView(TestCase): + def test_name(self): + doc = DocumentView.parse("x = length(var.list)\n") + node = find_first(doc.raw, 
FunctionCallRule) + self.assertIsNotNone(node) + fv = FunctionCallView(node) + self.assertEqual(fv.name, "length") + + def test_args(self): + doc = DocumentView.parse("x = length(var.list)\n") + node = find_first(doc.raw, FunctionCallRule) + fv = FunctionCallView(node) + self.assertEqual(len(fv.args), 1) + + def test_no_args(self): + doc = DocumentView.parse("x = timestamp()\n") + node = find_first(doc.raw, FunctionCallRule) + fv = FunctionCallView(node) + self.assertEqual(len(fv.args), 0) + + def test_no_ellipsis(self): + doc = DocumentView.parse("x = length(var.list)\n") + node = find_first(doc.raw, FunctionCallRule) + fv = FunctionCallView(node) + self.assertFalse(fv.has_ellipsis) + + def test_ellipsis(self): + doc = DocumentView.parse("x = length(var.list...)\n") + node = find_first(doc.raw, FunctionCallRule) + fv = FunctionCallView(node) + self.assertTrue(fv.has_ellipsis) + + def test_multiple_args(self): + doc = DocumentView.parse('x = coalesce(var.a, var.b, "default")\n') + node = find_first(doc.raw, FunctionCallRule) + fv = FunctionCallView(node) + self.assertEqual(len(fv.args), 3) diff --git a/test/unit/query/test_introspect.py b/test/unit/query/test_introspect.py new file mode 100644 index 00000000..075318e9 --- /dev/null +++ b/test/unit/query/test_introspect.py @@ -0,0 +1,90 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.query.body import DocumentView +from hcl2.query.introspect import build_schema, describe_results + + +class TestDescribeResults(TestCase): + def test_describe_block(self): + doc = DocumentView.parse('resource "aws_instance" "main" {}\n') + blocks = doc.blocks("resource") + result = describe_results(blocks) + self.assertIn("results", result) + self.assertEqual(len(result["results"]), 1) + desc = result["results"][0] + self.assertEqual(desc["type"], "BlockView") + self.assertIn("properties", desc) + self.assertIn("methods", desc) + self.assertIn("block_type", desc["summary"]) + + def 
test_describe_attribute(self): + doc = DocumentView.parse("x = 1\n") + attrs = doc.attributes("x") + result = describe_results(attrs) + desc = result["results"][0] + self.assertEqual(desc["type"], "AttributeView") + self.assertIn("name", desc["summary"]) + + def test_describe_primitive(self): + result = describe_results([42]) + desc = result["results"][0] + self.assertEqual(desc["type"], "int") + self.assertIn("42", desc["value"]) + + +class TestBuildSchema(TestCase): + def test_schema_has_views(self): + schema = build_schema() + self.assertIn("views", schema) + self.assertIn("DocumentView", schema["views"]) + self.assertIn("BlockView", schema["views"]) + self.assertIn("AttributeView", schema["views"]) + self.assertIn("NodeView", schema["views"]) + + def test_schema_has_eval_namespace(self): + schema = build_schema() + self.assertIn("eval_namespace", schema) + self.assertIn("builtins", schema["eval_namespace"]) + self.assertIn("variables", schema["eval_namespace"]) + self.assertIn("len", schema["eval_namespace"]["builtins"]) + + def test_schema_view_has_properties(self): + schema = build_schema() + doc_schema = schema["views"]["DocumentView"] + self.assertIn("properties", doc_schema) + self.assertIn("body", doc_schema["properties"]) + + def test_schema_view_has_methods(self): + schema = build_schema() + doc_schema = schema["views"]["DocumentView"] + self.assertIn("methods", doc_schema) + + def test_schema_view_wraps(self): + schema = build_schema() + block_schema = schema["views"]["BlockView"] + self.assertEqual(block_schema["wraps"], "BlockRule") + + def test_schema_nodeview_no_wraps(self): + schema = build_schema() + nv_schema = schema["views"]["NodeView"] + self.assertNotIn("wraps", nv_schema) + + def test_describe_body_view_no_summary(self): + doc = DocumentView.parse("x = 1\n") + result = describe_results([doc.body]) + desc = result["results"][0] + self.assertEqual(desc["type"], "BodyView") + self.assertNotIn("summary", desc) + + def 
test_describe_document_view(self): + doc = DocumentView.parse("x = 1\n") + result = describe_results([doc]) + desc = result["results"][0] + self.assertEqual(desc["type"], "DocumentView") + + def test_schema_static_methods(self): + schema = build_schema() + doc_schema = schema["views"]["DocumentView"] + # DocumentView has parse and parse_file static methods + self.assertIn("static_methods", doc_schema) diff --git a/test/unit/query/test_path.py b/test/unit/query/test_path.py new file mode 100644 index 00000000..eb9415dc --- /dev/null +++ b/test/unit/query/test_path.py @@ -0,0 +1,213 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.query.path import PathSegment, QuerySyntaxError, parse_path + + +class TestParsePath(TestCase): + def test_simple(self): + segments = parse_path("resource") + self.assertEqual(len(segments), 1) + self.assertEqual(segments[0], PathSegment("resource", False, None)) + + def test_dotted(self): + segments = parse_path("resource.aws_instance.main") + self.assertEqual(len(segments), 3) + self.assertEqual(segments[0].name, "resource") + self.assertEqual(segments[1].name, "aws_instance") + self.assertEqual(segments[2].name, "main") + + def test_wildcard(self): + segments = parse_path("*") + self.assertEqual(segments[0], PathSegment("*", False, None)) + + def test_select_all(self): + segments = parse_path("variable[*]") + self.assertEqual(segments[0], PathSegment("variable", True, None)) + + def test_index(self): + segments = parse_path("variable[0]") + self.assertEqual(segments[0], PathSegment("variable", False, 0)) + + def test_complex(self): + segments = parse_path("resource.aws_instance[*].tags") + self.assertEqual(len(segments), 3) + self.assertEqual(segments[0].name, "resource") + self.assertTrue(segments[1].select_all) + self.assertEqual(segments[2].name, "tags") + + def test_empty_raises(self): + with self.assertRaises(QuerySyntaxError): + parse_path("") + + def test_recursive_descent(self): + segments 
= parse_path("a..b") + self.assertEqual(len(segments), 2) + self.assertEqual(segments[0], PathSegment("a", False, None)) + self.assertEqual(segments[1], PathSegment("b", False, None, recursive=True)) + + def test_recursive_with_index(self): + segments = parse_path("resource..tags[*]") + self.assertEqual(len(segments), 2) + self.assertEqual(segments[1].name, "tags") + self.assertTrue(segments[1].recursive) + self.assertTrue(segments[1].select_all) + + def test_recursive_in_middle(self): + segments = parse_path("a.b..c.d") + self.assertEqual(len(segments), 4) + self.assertFalse(segments[0].recursive) + self.assertFalse(segments[1].recursive) + self.assertTrue(segments[2].recursive) + self.assertFalse(segments[3].recursive) + + def test_triple_dot_raises(self): + with self.assertRaises(QuerySyntaxError): + parse_path("a...b") + + def test_recursive_at_end_raises(self): + with self.assertRaises(QuerySyntaxError): + parse_path("a..") + + def test_leading_dot_raises(self): + with self.assertRaises(QuerySyntaxError): + parse_path(".a") + + def test_invalid_segment_raises(self): + with self.assertRaises(QuerySyntaxError): + parse_path("123invalid") + + def test_hyphen_in_name(self): + segments = parse_path("local-exec") + self.assertEqual(segments[0].name, "local-exec") + + def test_index_large(self): + segments = parse_path("items[42]") + self.assertEqual(segments[0].index, 42) + + def test_type_filter(self): + segments = parse_path("function_call:length") + self.assertEqual(len(segments), 1) + self.assertEqual(segments[0].name, "length") + self.assertEqual(segments[0].type_filter, "function_call") + + def test_type_filter_with_index(self): + segments = parse_path("function_call:length[0]") + self.assertEqual(segments[0].name, "length") + self.assertEqual(segments[0].type_filter, "function_call") + self.assertEqual(segments[0].index, 0) + + def test_type_filter_with_wildcard(self): + segments = parse_path("function_call:*[*]") + self.assertEqual(segments[0].name, "*") + 
self.assertEqual(segments[0].type_filter, "function_call") + self.assertTrue(segments[0].select_all) + + def test_type_filter_in_recursive(self): + segments = parse_path("*..function_call:length") + self.assertEqual(len(segments), 2) + self.assertTrue(segments[1].recursive) + self.assertEqual(segments[1].type_filter, "function_call") + self.assertEqual(segments[1].name, "length") + + def test_no_type_filter(self): + segments = parse_path("length") + self.assertIsNone(segments[0].type_filter) + + def test_skip_labels(self): + segments = parse_path("block~") + self.assertEqual(len(segments), 1) + self.assertEqual(segments[0].name, "block") + self.assertTrue(segments[0].skip_labels) + + def test_skip_labels_with_bracket(self): + segments = parse_path("resource~[*]") + self.assertEqual(segments[0].name, "resource") + self.assertTrue(segments[0].skip_labels) + self.assertTrue(segments[0].select_all) + + def test_skip_labels_with_select(self): + segments = parse_path("block~[select(.ami)]") + self.assertEqual(segments[0].name, "block") + self.assertTrue(segments[0].skip_labels) + self.assertIsNotNone(segments[0].predicate) + + def test_skip_labels_in_path(self): + segments = parse_path("block~.ami") + self.assertEqual(len(segments), 2) + self.assertTrue(segments[0].skip_labels) + self.assertFalse(segments[1].skip_labels) + + def test_no_skip_labels_by_default(self): + segments = parse_path("block") + self.assertFalse(segments[0].skip_labels) + + def test_select_with_trailing_star(self): + segments = parse_path("variable[select(.default)][*]") + self.assertEqual(segments[0].name, "variable") + self.assertIsNotNone(segments[0].predicate) + self.assertTrue(segments[0].select_all) + self.assertIsNone(segments[0].index) + + def test_select_with_trailing_index(self): + segments = parse_path("variable[select(.default)][0]") + self.assertEqual(segments[0].name, "variable") + self.assertIsNotNone(segments[0].predicate) + self.assertFalse(segments[0].select_all) + 
self.assertEqual(segments[0].index, 0) + + def test_select_no_trailing_bracket(self): + segments = parse_path("variable[select(.default)]") + self.assertIsNotNone(segments[0].predicate) + self.assertTrue(segments[0].select_all) + self.assertIsNone(segments[0].index) + + def test_optional_suffix(self): + segments = parse_path("x?") + self.assertEqual(len(segments), 1) + self.assertEqual(segments[0].name, "x") + + def test_optional_with_bracket(self): + segments = parse_path("x[*]?") + self.assertEqual(len(segments), 1) + self.assertEqual(segments[0].name, "x") + self.assertTrue(segments[0].select_all) + + def test_optional_after_select(self): + segments = parse_path("*[select(.x)]?") + self.assertEqual(len(segments), 1) + self.assertIsNotNone(segments[0].predicate) + + def test_optional_produces_same_as_without(self): + seg_plain = parse_path("resource") + seg_opt = parse_path("resource?") + self.assertEqual(seg_plain[0].name, seg_opt[0].name) + self.assertEqual(seg_plain[0].select_all, seg_opt[0].select_all) + + def test_escaped_quote_in_string(self): + # Escaped quote inside a quoted string should not terminate it + segments = parse_path('*[select(.name == "a\\"b")]') + self.assertEqual(len(segments), 1) + self.assertIsNotNone(segments[0].predicate) + + # jq compat: .[] as alias for [*] + + def test_jq_iterate_alias(self): + """resource.[] is equivalent to resource[*]""" + segments = parse_path("resource.[]") + expected = parse_path("resource[*]") + self.assertEqual(segments, expected) + + def test_jq_iterate_alias_chained(self): + """a.b.[] normalizes to a.b[*]""" + segments = parse_path("a.b.[]") + self.assertEqual(len(segments), 2) + self.assertEqual(segments[0].name, "a") + self.assertEqual(segments[1].name, "b") + self.assertTrue(segments[1].select_all) + + def test_jq_iterate_alias_with_continuation(self): + """resource.[].tags normalizes to resource[*].tags""" + segments = parse_path("resource.[].tags") + expected = parse_path("resource[*].tags") + 
self.assertEqual(segments, expected) diff --git a/test/unit/query/test_pipeline.py b/test/unit/query/test_pipeline.py new file mode 100644 index 00000000..d961a763 --- /dev/null +++ b/test/unit/query/test_pipeline.py @@ -0,0 +1,332 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.query.body import DocumentView +from hcl2.query.path import QuerySyntaxError +from hcl2.query.pipeline import ( + BuiltinStage, + ConstructStage, + PathStage, + SelectStage, + classify_stage, + execute_pipeline, + split_pipeline, +) + + +class TestSplitPipeline(TestCase): + def test_single_stage(self): + self.assertEqual(split_pipeline("resource"), ["resource"]) + + def test_multi_stage(self): + self.assertEqual( + split_pipeline("resource[*] | .aws_instance | .tags"), + ["resource[*]", ".aws_instance", ".tags"], + ) + + def test_bracket_aware(self): + # A bracketed suffix must not interfere with splitting on the pipe + result = split_pipeline("x[*] | y") + self.assertEqual(result, ["x[*]", "y"]) + + def test_paren_aware(self): + result = split_pipeline("select(.a | .b) | y") + # The pipe inside the parentheses must not split the stage, + # so the two stages are select(.a | .b) and y. + # The grammar does not support pipes in predicates; + # this only exercises depth tracking. + self.assertEqual(len(result), 2) + + def test_quote_aware(self): + result = split_pipeline('"a | b" | y') + self.assertEqual(len(result), 2) + + def test_escaped_quote_in_string(self): + # Escaped quote should not toggle string mode + result = split_pipeline('"a\\"b | c" | y') + self.assertEqual(len(result), 2) + self.assertEqual(result[0], '"a\\"b | c"') + self.assertEqual(result[1], "y") + + def test_empty_stage_error(self): + with self.assertRaises(QuerySyntaxError): + split_pipeline("x | | y") + + def test_trailing_pipe_error(self): + with self.assertRaises(QuerySyntaxError): + split_pipeline("x |") + + def test_leading_pipe_error(self): + with self.assertRaises(QuerySyntaxError): + split_pipeline("| x") + + def 
test_empty_pipeline_error(self): + with self.assertRaises(QuerySyntaxError): + split_pipeline("") + + def test_whitespace_stripped(self): + result = split_pipeline(" x | y ") + self.assertEqual(result, ["x", "y"]) + + +class TestClassifyStage(TestCase): + def test_path_stage(self): + stage = classify_stage("resource.aws_instance") + self.assertIsInstance(stage, PathStage) + self.assertEqual(len(stage.segments), 2) + + def test_builtin_keys(self): + stage = classify_stage("keys") + self.assertIsInstance(stage, BuiltinStage) + self.assertEqual(stage.name, "keys") + + def test_builtin_values(self): + stage = classify_stage("values") + self.assertIsInstance(stage, BuiltinStage) + self.assertEqual(stage.name, "values") + + def test_builtin_length(self): + stage = classify_stage("length") + self.assertIsInstance(stage, BuiltinStage) + self.assertEqual(stage.name, "length") + + def test_select_stage(self): + stage = classify_stage("select(.name)") + self.assertIsInstance(stage, SelectStage) + self.assertIsNotNone(stage.predicate) + + def test_select_with_comparison(self): + stage = classify_stage('select(.name == "foo")') + self.assertIsInstance(stage, SelectStage) + + def test_path_with_wildcard(self): + stage = classify_stage("*[*]") + self.assertIsInstance(stage, PathStage) + + +class TestExecutePipeline(TestCase): + def _make_doc(self, hcl): + return DocumentView.parse(hcl) + + def test_single_stage_identity(self): + doc = self._make_doc("x = 1\n") + stage = classify_stage("x") + results = execute_pipeline(doc, [stage]) + self.assertEqual(len(results), 1) + self.assertEqual(results[0].name, "x") + + def test_multi_stage_chaining(self): + doc = self._make_doc('resource "aws_instance" "main" {\n ami = "test"\n}\n') + # Pipe unwraps blocks to body, so chain with body attributes + stages = [ + classify_stage(s) + for s in split_pipeline("resource.aws_instance.main | .ami") + ] + results = execute_pipeline(doc, stages) + self.assertEqual(len(results), 1) + + def 
test_empty_intermediate(self): + doc = self._make_doc("x = 1\n") + stages = [classify_stage(s) for s in split_pipeline("nonexistent | .foo")] + results = execute_pipeline(doc, stages) + self.assertEqual(len(results), 0) + + def test_pipe_with_wildcard(self): + doc = self._make_doc("x = 1\ny = 2\nz = 3\n") + stages = [classify_stage(s) for s in split_pipeline("*[*] | length")] + results = execute_pipeline(doc, stages) + self.assertEqual(len(results), 3) + # Each attribute has length 1 + self.assertEqual(results, [1, 1, 1]) + + def test_pipe_builtin(self): + doc = self._make_doc("x = {\n a = 1\n b = 2\n}\n") + stages = [classify_stage(s) for s in split_pipeline("x | keys")] + results = execute_pipeline(doc, stages) + self.assertEqual(len(results), 1) + self.assertEqual(sorted(results[0]), ["a", "b"]) + + def test_pipe_select(self): + doc = self._make_doc('variable "a" {\n default = 1\n}\nvariable "b" {}\n') + stages = [ + classify_stage(s) for s in split_pipeline("variable[*] | select(.default)") + ] + results = execute_pipeline(doc, stages) + self.assertEqual(len(results), 1) + + def test_backward_compat_no_pipe(self): + """Single structural path still works through pipeline.""" + doc = self._make_doc('resource "aws_instance" "main" {\n ami = "test"\n}\n') + stages = [classify_stage("resource.aws_instance.main.ami")] + results = execute_pipeline(doc, stages) + self.assertEqual(len(results), 1) + + +class TestPropertyAccessPipeStages(TestCase): + """Test property accessor pipe stages (Option B).""" + + def _run(self, hcl, query): + doc = DocumentView.parse(hcl) + stages = [classify_stage(s) for s in split_pipeline(query)] + return execute_pipeline(doc, stages) + + def test_block_type_property(self): + r = self._run( + 'resource "aws" "x" {\n ami = 1\n}\n', + "resource[*] | .block_type", + ) + self.assertEqual(r, ["resource"]) + + def test_name_labels_property(self): + r = self._run( + 'resource "aws" "x" {\n ami = 1\n}\n', + "resource[*] | .name_labels", + ) + 
self.assertEqual(r, [["aws", "x"]]) + + def test_labels_property(self): + r = self._run( + 'resource "aws" "x" {\n ami = 1\n}\n', + "resource[*] | .labels", + ) + self.assertEqual(r, [["resource", "aws", "x"]]) + + def test_attribute_name_property(self): + r = self._run("x = 1\ny = 2\n", "*[*] | .name") + self.assertEqual(sorted(r), ["x", "y"]) + + def test_function_call_name_property(self): + r = self._run( + 'x = substr("hello", 0, 3)\n', + "*..function_call:*[*] | .name", + ) + self.assertEqual(r, ["substr"]) + + def test_property_then_builtin(self): + """Property access result feeds into a builtin.""" + r = self._run( + 'resource "aws" "x" {\n ami = 1\n}\n', + "resource[*] | .labels | length", + ) + self.assertEqual(r, [3]) + + def test_structural_still_works_after_pipe(self): + """Structural path resolution still works through pipes.""" + r = self._run( + 'resource "aws" "x" {\n ami = "test"\n}\n', + "resource.aws.x | .ami", + ) + self.assertEqual(len(r), 1) + + def test_type_qualifier_filter_in_pipe(self): + """Type qualifier in pipe stage filters by value type.""" + r = self._run( + "a = {x = 1}\nb = [1, 2]\nc = 3\n", + "*[*] | object:*", + ) + self.assertEqual(len(r), 1) + self.assertEqual(type(r[0]).__name__, "ObjectView") + + def test_type_qualifier_tuple_in_pipe(self): + r = self._run( + "a = {x = 1}\nb = [1, 2]\nc = 3\n", + "*[*] | tuple:*", + ) + self.assertEqual(len(r), 1) + self.assertEqual(type(r[0]).__name__, "TupleView") + + +class TestOptionalTolerance(TestCase): + """Test that trailing ? 
is tolerated in pipeline stages.""" + + def test_classify_stage_optional(self): + stage = classify_stage("resource?") + self.assertIsInstance(stage, PathStage) + + def test_classify_stage_optional_with_bracket(self): + stage = classify_stage("x[*]?") + self.assertIsInstance(stage, PathStage) + self.assertTrue(stage.segments[0].select_all) + + def test_classify_builtin_optional(self): + stage = classify_stage("keys?") + self.assertIsInstance(stage, BuiltinStage) + self.assertEqual(stage.name, "keys") + + def test_classify_select_optional(self): + stage = classify_stage("select(.name)?") + # select(.name)? — ? stripped first, then select() detected + self.assertIsInstance(stage, SelectStage) + + def test_brace_aware_split(self): + """Pipes inside braces should not split.""" + result = split_pipeline("x[*] | {source, cpu}") + self.assertEqual(len(result), 2) + self.assertEqual(result[1], "{source, cpu}") + + +class TestConstructStage(TestCase): + """Test object construction ``{field1, field2}`` pipeline stage.""" + + def _run(self, hcl, query): + doc = DocumentView.parse(hcl) + stages = [classify_stage(s) for s in split_pipeline(query)] + return execute_pipeline(doc, stages) + + def test_classify_construct(self): + stage = classify_stage("{source, cpu}") + self.assertIsInstance(stage, ConstructStage) + self.assertEqual(len(stage.fields), 2) + self.assertEqual(stage.fields[0][0], "source") + self.assertEqual(stage.fields[1][0], "cpu") + + def test_classify_construct_renamed(self): + stage = classify_stage("{mod: .source, vcpu: .cpu}") + self.assertIsInstance(stage, ConstructStage) + self.assertEqual(stage.fields[0][0], "mod") + self.assertEqual(stage.fields[1][0], "vcpu") + + def test_execute_construct_shorthand(self): + r = self._run( + 'resource "aws" "x" {\n ami = "test"\n count = 2\n}\n', + "resource.aws.x | {ami, count}", + ) + self.assertEqual(len(r), 1) + self.assertIsInstance(r[0], dict) + self.assertIn("ami", r[0]) + self.assertIn("count", r[0]) + # Values 
should be flat, not nested dicts like {"ami": {"ami": ...}} + self.assertNotIsInstance(r[0]["ami"], dict) + self.assertEqual(r[0]["ami"], '"test"') + self.assertEqual(r[0]["count"], 2) + + def test_execute_construct_renamed(self): + r = self._run( + 'resource "aws" "x" {\n ami = "test"\n}\n', + "resource[*] | {type: .block_type, name: .name_labels}", + ) + self.assertEqual(len(r), 1) + self.assertEqual(r[0]["type"], "resource") + self.assertEqual(r[0]["name"], ["aws", "x"]) + + def test_construct_missing_field(self): + r = self._run( + "x = 1\n", + "x | {value, nonexistent}", + ) + self.assertEqual(len(r), 1) + self.assertIsNone(r[0]["nonexistent"]) + + def test_construct_with_select(self): + r = self._run( + "a = 1\nb = 2\nc = 3\n", + "*[select(.value > 1)] | {name, value}", + ) + self.assertEqual(len(r), 2) + names = sorted(d["name"] for d in r) + self.assertEqual(names, ["b", "c"]) + + def test_construct_with_index(self): + stage = classify_stage("{first: .items[0]}") + self.assertIsInstance(stage, ConstructStage) + self.assertEqual(stage.fields[0][0], "first") diff --git a/test/unit/query/test_predicate.py b/test/unit/query/test_predicate.py new file mode 100644 index 00000000..cc0628e8 --- /dev/null +++ b/test/unit/query/test_predicate.py @@ -0,0 +1,609 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.query.body import DocumentView +from hcl2.query.path import QuerySyntaxError, parse_path +from hcl2.query.predicate import ( + AllExpr, + AndExpr, + AnyExpr, + Comparison, + HasExpr, + NotExpr, + OrExpr, + evaluate_predicate, + parse_predicate, + tokenize, +) +from hcl2.query.resolver import resolve_path + + +class TestTokenize(TestCase): + def test_dot_and_word(self): + tokens = tokenize(".foo") + self.assertEqual(len(tokens), 2) + self.assertEqual(tokens[0].kind, "DOT") + self.assertEqual(tokens[1].kind, "WORD") + self.assertEqual(tokens[1].value, "foo") + + def test_comparison(self): + tokens = tokenize('.name == 
"bar"') + kinds = [t.kind for t in tokens] + self.assertEqual(kinds, ["DOT", "WORD", "OP", "STRING"]) + + def test_number(self): + tokens = tokenize(".x > 42") + self.assertEqual(tokens[2].kind, "OP") + self.assertEqual(tokens[3].kind, "NUMBER") + self.assertEqual(tokens[3].value, "42") + + def test_float_number(self): + tokens = tokenize(".x > 3.14") + self.assertEqual(tokens[3].value, "3.14") + + def test_brackets(self): + tokens = tokenize(".items[0]") + kinds = [t.kind for t in tokens] + self.assertEqual(kinds, ["DOT", "WORD", "LBRACKET", "NUMBER", "RBRACKET"]) + + def test_boolean_keywords(self): + tokens = tokenize(".a and .b or not .c") + words = [t.value for t in tokens if t.kind == "WORD"] + self.assertEqual(words, ["a", "and", "b", "or", "not", "c"]) + + def test_all_operators(self): + for op in ["==", "!=", "<", ">", "<=", ">="]: + tokens = tokenize(f".x {op} 1") + self.assertEqual(tokens[2].kind, "OP") + self.assertEqual(tokens[2].value, op) + + def test_unexpected_char_raises(self): + with self.assertRaises(QuerySyntaxError): + tokenize("@invalid") + + +class TestParsePredicate(TestCase): + def test_existence(self): + pred = parse_predicate(".name") + self.assertIsInstance(pred, Comparison) + self.assertIsNone(pred.operator) + self.assertEqual(pred.accessor.parts, ["name"]) + + def test_equality_string(self): + pred = parse_predicate('.name == "foo"') + self.assertIsInstance(pred, Comparison) + self.assertEqual(pred.operator, "==") + self.assertEqual(pred.value, "foo") + + def test_equality_int(self): + pred = parse_predicate(".count == 5") + self.assertEqual(pred.operator, "==") + self.assertEqual(pred.value, 5) + + def test_less_than(self): + pred = parse_predicate(".count < 10") + self.assertEqual(pred.operator, "<") + self.assertEqual(pred.value, 10) + + def test_boolean_true(self): + pred = parse_predicate(".enabled == true") + self.assertEqual(pred.value, True) + + def test_boolean_false(self): + pred = parse_predicate(".enabled == false") + 
self.assertEqual(pred.value, False) + + def test_null(self): + pred = parse_predicate(".x == null") + self.assertIsNone(pred.value) + + def test_dotted_accessor(self): + pred = parse_predicate(".tags.Name") + self.assertIsInstance(pred, Comparison) + self.assertEqual(pred.accessor.parts, ["tags", "Name"]) + + def test_indexed_accessor(self): + pred = parse_predicate(".items[0]") + self.assertEqual(pred.accessor.parts, ["items"]) + self.assertEqual(pred.accessor.index, 0) + + def test_and(self): + pred = parse_predicate(".a and .b") + self.assertIsInstance(pred, AndExpr) + self.assertEqual(len(pred.children), 2) + + def test_or(self): + pred = parse_predicate(".a or .b") + self.assertIsInstance(pred, OrExpr) + self.assertEqual(len(pred.children), 2) + + def test_not(self): + pred = parse_predicate("not .a") + self.assertIsInstance(pred, NotExpr) + + def test_combined_and_or(self): + pred = parse_predicate(".a and .b or .c") + # Should parse as (.a and .b) or .c due to precedence + self.assertIsInstance(pred, OrExpr) + self.assertIsInstance(pred.children[0], AndExpr) + + def test_empty_raises(self): + with self.assertRaises(QuerySyntaxError): + parse_predicate("") + + def test_no_leading_dot_raises(self): + with self.assertRaises(QuerySyntaxError): + parse_predicate("name") + + def test_extra_tokens_raises(self): + with self.assertRaises(QuerySyntaxError): + parse_predicate('.name == "foo" extra') + + +class TestEvaluatePredicate(TestCase): + def _make_doc(self, hcl): + return DocumentView.parse(hcl) + + def test_existence_true(self): + doc = self._make_doc('variable "a" {\n default = 1\n}\n') + blocks = doc.blocks("variable") + pred = parse_predicate(".default") + self.assertTrue(evaluate_predicate(pred, blocks[0])) + + def test_existence_false(self): + doc = self._make_doc('variable "a" {}\n') + blocks = doc.blocks("variable") + pred = parse_predicate(".default") + self.assertFalse(evaluate_predicate(pred, blocks[0])) + + def test_equality_block_type(self): + doc = 
self._make_doc('resource "aws_instance" "main" {}\n') + blocks = doc.blocks() + pred = parse_predicate('.block_type == "resource"') + self.assertTrue(evaluate_predicate(pred, blocks[0])) + + def test_equality_block_type_mismatch(self): + doc = self._make_doc('resource "aws_instance" "main" {}\n') + blocks = doc.blocks() + pred = parse_predicate('.block_type == "variable"') + self.assertFalse(evaluate_predicate(pred, blocks[0])) + + def test_attribute_name(self): + doc = self._make_doc("x = 1\ny = 2\n") + attrs = doc.body.attributes() + pred = parse_predicate('.name == "x"') + self.assertTrue(evaluate_predicate(pred, attrs[0])) + self.assertFalse(evaluate_predicate(pred, attrs[1])) + + def test_attribute_value(self): + doc = self._make_doc("x = 1\ny = 2\n") + attrs = doc.body.attributes() + pred = parse_predicate(".value == 1") + self.assertTrue(evaluate_predicate(pred, attrs[0])) + self.assertFalse(evaluate_predicate(pred, attrs[1])) + + def test_not_predicate(self): + doc = self._make_doc("x = 1\ny = 2\n") + attrs = doc.body.attributes() + pred = parse_predicate('not .name == "x"') + self.assertFalse(evaluate_predicate(pred, attrs[0])) + self.assertTrue(evaluate_predicate(pred, attrs[1])) + + def test_and_predicate(self): + doc = self._make_doc("x = 1\ny = 2\n") + attrs = doc.body.attributes() + pred = parse_predicate('.name == "x" and .value == 1') + self.assertTrue(evaluate_predicate(pred, attrs[0])) + self.assertFalse(evaluate_predicate(pred, attrs[1])) + + def test_or_predicate(self): + doc = self._make_doc("x = 1\ny = 2\nz = 3\n") + attrs = doc.body.attributes() + pred = parse_predicate('.name == "x" or .name == "y"') + self.assertTrue(evaluate_predicate(pred, attrs[0])) + self.assertTrue(evaluate_predicate(pred, attrs[1])) + self.assertFalse(evaluate_predicate(pred, attrs[2])) + + def test_greater_than(self): + doc = self._make_doc("x = 5\ny = 15\n") + attrs = doc.body.attributes() + pred = parse_predicate(".value > 10") + 
self.assertFalse(evaluate_predicate(pred, attrs[0])) + self.assertTrue(evaluate_predicate(pred, attrs[1])) + + def test_type_accessor_block(self): + doc = self._make_doc('resource "aws_instance" "main" {}\n') + blocks = doc.blocks() + pred = parse_predicate('.type == "block"') + self.assertTrue(evaluate_predicate(pred, blocks[0])) + + def test_type_accessor_attribute(self): + doc = self._make_doc("x = 1\n") + attrs = doc.body.attributes() + pred = parse_predicate('.type == "attribute"') + self.assertTrue(evaluate_predicate(pred, attrs[0])) + + def test_type_accessor_object(self): + doc = self._make_doc("x = {\n a = 1\n}\n") + attr = doc.attribute("x") + # value_node is ExprTerm wrapping ObjectRule + from hcl2.query._base import view_for + from hcl2.rules.expressions import ExprTermRule + + vn = attr.value_node + if isinstance(vn._node, ExprTermRule): + inner = view_for(vn._node.expression) + else: + inner = vn + pred = parse_predicate('.type == "object"') + self.assertTrue(evaluate_predicate(pred, inner)) + + def test_type_accessor_mismatch(self): + doc = self._make_doc("x = 1\n") + attrs = doc.body.attributes() + pred = parse_predicate('.type == "block"') + self.assertFalse(evaluate_predicate(pred, attrs[0])) + + def test_type_accessor_document(self): + doc = self._make_doc("x = 1\n") + pred = parse_predicate('.type == "document"') + self.assertTrue(evaluate_predicate(pred, doc)) + + def test_type_accessor_tuple(self): + doc = self._make_doc("x = [1, 2]\n") + attr = doc.attribute("x") + from hcl2.query._base import view_for + from hcl2.rules.expressions import ExprTermRule + + vn = attr.value_node + if isinstance(vn._node, ExprTermRule): + inner = view_for(vn._node.expression) + else: + inner = vn + pred = parse_predicate('.type == "tuple"') + self.assertTrue(evaluate_predicate(pred, inner)) + + +class TestKeywordComparison(TestCase): + """Test that HCL keywords (true/false/null) compare correctly.""" + + def test_keyword_true_matches_true(self): + doc = 
DocumentView.parse("x = true\n") + attrs = doc.body.attributes() + pred = parse_predicate(".value == true") + self.assertTrue(evaluate_predicate(pred, attrs[0])) + + def test_keyword_true_not_matches_string(self): + doc = DocumentView.parse("x = true\n") + attrs = doc.body.attributes() + pred = parse_predicate('.value == "true"') + self.assertFalse(evaluate_predicate(pred, attrs[0])) + + def test_keyword_false_matches_false(self): + doc = DocumentView.parse("x = false\n") + attrs = doc.body.attributes() + pred = parse_predicate(".value == false") + self.assertTrue(evaluate_predicate(pred, attrs[0])) + + def test_keyword_null_matches_null(self): + doc = DocumentView.parse("x = null\n") + attrs = doc.body.attributes() + pred = parse_predicate(".value == null") + self.assertTrue(evaluate_predicate(pred, attrs[0])) + + def test_conditional_true_val_keyword(self): + doc = DocumentView.parse("x = a == b ? true : false\n") + results = resolve_path(doc, parse_path("*..conditional:*")) + pred = parse_predicate(".true_val == true") + self.assertTrue(evaluate_predicate(pred, results[0])) + + def test_conditional_false_val_keyword(self): + doc = DocumentView.parse("x = a == b ? 
true : false\n") + results = resolve_path(doc, parse_path("*..conditional:*")) + pred = parse_predicate(".false_val == false") + self.assertTrue(evaluate_predicate(pred, results[0])) + + +class TestSelectInPath(TestCase): + """Test [select()] bracket syntax in structural paths.""" + + def test_select_bracket_in_path(self): + doc = DocumentView.parse("x = 1\ny = 2\n") + results = resolve_path(doc, parse_path('*[select(.name == "x")]')) + self.assertEqual(len(results), 1) + self.assertEqual(results[0].name, "x") + + def test_select_bracket_existence(self): + doc = DocumentView.parse('variable "a" {\n default = 1\n}\nvariable "b" {}\n') + results = resolve_path(doc, parse_path("variable[select(.default)]")) + self.assertEqual(len(results), 1) + + def test_select_bracket_no_match(self): + doc = DocumentView.parse("x = 1\ny = 2\n") + results = resolve_path(doc, parse_path('*[select(.name == "z")]')) + self.assertEqual(len(results), 0) + + def test_select_bracket_with_type_qualifier(self): + doc = DocumentView.parse('x = substr("hello", 0, 3)\ny = upper("a")\n') + results = resolve_path(doc, parse_path("*..function_call:*[select(.args[2])]")) + self.assertEqual(len(results), 1) + self.assertEqual(results[0].name, "substr") + + +class TestAccessorBuiltin(TestCase): + """Test ``| builtin`` syntax in predicate accessors.""" + + def test_parse_pipe_length(self): + pred = parse_predicate(".args | length > 2") + self.assertIsInstance(pred, Comparison) + self.assertEqual(pred.accessor.builtin, "length") + self.assertEqual(pred.operator, ">") + self.assertEqual(pred.value, 2) + + def test_parse_pipe_keys(self): + pred = parse_predicate(".tags | keys") + self.assertIsInstance(pred, Comparison) + self.assertEqual(pred.accessor.builtin, "keys") + self.assertIsNone(pred.operator) + + def test_parse_invalid_builtin(self): + with self.assertRaises(QuerySyntaxError): + parse_predicate(".args | bogus") + + def test_tokenize_pipe(self): + tokens = tokenize(".args | length") + kinds = 
[t.kind for t in tokens] + self.assertIn("PIPE", kinds) + + def test_evaluate_length_gt(self): + doc = DocumentView.parse('x = substr("hello", 0, 3)\n') + funcs = resolve_path(doc, parse_path("*..function_call:*")) + func = funcs[0] + pred = parse_predicate(".args | length > 2") + self.assertTrue(evaluate_predicate(pred, func)) + pred2 = parse_predicate(".args | length > 5") + self.assertFalse(evaluate_predicate(pred2, func)) + + def test_evaluate_length_eq(self): + doc = DocumentView.parse('x = substr("hello", 0, 3)\n') + funcs = resolve_path(doc, parse_path("*..function_call:*")) + func = funcs[0] + pred = parse_predicate(".args | length == 3") + self.assertTrue(evaluate_predicate(pred, func)) + + +class TestAnyAll(TestCase): + """Test ``any(accessor; pred)`` and ``all(accessor; pred)``.""" + + def test_parse_any(self): + pred = parse_predicate('any(.elements; .type == "function_call")') + self.assertIsInstance(pred, AnyExpr) + self.assertEqual(pred.accessor.parts, ["elements"]) + self.assertIsInstance(pred.predicate, Comparison) + + def test_parse_all(self): + pred = parse_predicate('all(.items; .name == "x")') + self.assertIsInstance(pred, AllExpr) + self.assertEqual(pred.accessor.parts, ["items"]) + + def test_parse_any_with_boolean_combinators(self): + pred = parse_predicate( + 'any(.elements; .type == "function_call" or .type == "tuple")' + ) + self.assertIsInstance(pred, AnyExpr) + self.assertIsInstance(pred.predicate, OrExpr) + + def test_evaluate_any_true(self): + doc = DocumentView.parse("x = [1, f(a), 3]\n") + tuples = resolve_path(doc, parse_path("*..tuple:*")) + pred = parse_predicate('any(.elements; .type == "function_call")') + self.assertTrue(evaluate_predicate(pred, tuples[0])) + + def test_evaluate_any_false(self): + doc = DocumentView.parse("x = [1, 2, 3]\n") + tuples = resolve_path(doc, parse_path("*..tuple:*")) + pred = parse_predicate('any(.elements; .type == "function_call")') + self.assertFalse(evaluate_predicate(pred, tuples[0])) + + def 
test_evaluate_all_true(self): + doc = DocumentView.parse("x = [1, 2, 3]\n") + tuples = resolve_path(doc, parse_path("*..tuple:*")) + pred = parse_predicate('all(.elements; .type == "node")') + self.assertTrue(evaluate_predicate(pred, tuples[0])) + + def test_evaluate_all_false(self): + doc = DocumentView.parse("x = [1, f(a), 3]\n") + tuples = resolve_path(doc, parse_path("*..tuple:*")) + pred = parse_predicate('all(.elements; .type == "node")') + self.assertFalse(evaluate_predicate(pred, tuples[0])) + + def test_any_on_none_is_false(self): + doc = DocumentView.parse("x = 1\n") + attrs = resolve_path(doc, parse_path("x")) + pred = parse_predicate('any(.nonexistent; .type == "node")') + self.assertFalse(evaluate_predicate(pred, attrs[0])) + + def test_all_on_none_is_true(self): + doc = DocumentView.parse("x = 1\n") + attrs = resolve_path(doc, parse_path("x")) + pred = parse_predicate('all(.nonexistent; .type == "node")') + self.assertTrue(evaluate_predicate(pred, attrs[0])) + + def test_any_with_not(self): + pred = parse_predicate('not any(.elements; .type == "function_call")') + self.assertIsInstance(pred, NotExpr) + self.assertIsInstance(pred.child, AnyExpr) + + +class TestStringFunctions(TestCase): + """Test string functions in predicate accessors.""" + + def test_parse_contains(self): + pred = parse_predicate('.source | contains("docker")') + self.assertIsInstance(pred, Comparison) + self.assertEqual(pred.accessor.builtin, "contains") + self.assertEqual(pred.accessor.builtin_arg, "docker") + + def test_parse_test(self): + pred = parse_predicate('.ami | test("^ami-[0-9]+")') + self.assertIsInstance(pred, Comparison) + self.assertEqual(pred.accessor.builtin, "test") + self.assertEqual(pred.accessor.builtin_arg, "^ami-[0-9]+") + + def test_parse_startswith(self): + pred = parse_predicate('.name | startswith("prod-")') + self.assertIsInstance(pred, Comparison) + self.assertEqual(pred.accessor.builtin, "startswith") + self.assertEqual(pred.accessor.builtin_arg, 
"prod-") + + def test_parse_endswith(self): + pred = parse_predicate('.path | endswith("/api")') + self.assertIsInstance(pred, Comparison) + self.assertEqual(pred.accessor.builtin, "endswith") + self.assertEqual(pred.accessor.builtin_arg, "/api") + + def test_evaluate_contains_true(self): + doc = DocumentView.parse('source = "docker_application_v2"\n') + attrs = doc.body.attributes() + pred = parse_predicate('.value | contains("docker")') + self.assertTrue(evaluate_predicate(pred, attrs[0])) + + def test_evaluate_contains_false(self): + doc = DocumentView.parse('source = "some_module"\n') + attrs = doc.body.attributes() + pred = parse_predicate('.value | contains("docker")') + self.assertFalse(evaluate_predicate(pred, attrs[0])) + + def test_evaluate_test_true(self): + doc = DocumentView.parse('ami = "ami-12345"\n') + attrs = doc.body.attributes() + pred = parse_predicate('.value | test("^ami-[0-9]+")') + self.assertTrue(evaluate_predicate(pred, attrs[0])) + + def test_evaluate_test_false(self): + doc = DocumentView.parse('ami = "xyz-12345"\n') + attrs = doc.body.attributes() + pred = parse_predicate('.value | test("^ami-[0-9]+")') + self.assertFalse(evaluate_predicate(pred, attrs[0])) + + def test_evaluate_startswith_true(self): + doc = DocumentView.parse('name = "prod-api"\n') + attrs = doc.body.attributes() + pred = parse_predicate('.value | startswith("prod-")') + self.assertTrue(evaluate_predicate(pred, attrs[0])) + + def test_evaluate_startswith_false(self): + doc = DocumentView.parse('name = "staging-api"\n') + attrs = doc.body.attributes() + pred = parse_predicate('.value | startswith("prod-")') + self.assertFalse(evaluate_predicate(pred, attrs[0])) + + def test_evaluate_endswith_true(self): + doc = DocumentView.parse('path = "some/path/api"\n') + attrs = doc.body.attributes() + pred = parse_predicate('.value | endswith("api")') + self.assertTrue(evaluate_predicate(pred, attrs[0])) + + def test_evaluate_endswith_false(self): + doc = DocumentView.parse('path 
= "some/path/web"\n') + attrs = doc.body.attributes() + pred = parse_predicate('.value | endswith("api")') + self.assertFalse(evaluate_predicate(pred, attrs[0])) + + def test_contains_on_none_returns_false(self): + doc = DocumentView.parse("x = 1\n") + attrs = doc.body.attributes() + pred = parse_predicate('.nonexistent | contains("x")') + self.assertFalse(evaluate_predicate(pred, attrs[0])) + + def test_test_invalid_regex_raises(self): + doc = DocumentView.parse('x = "hello"\n') + attrs = doc.body.attributes() + pred = parse_predicate('.value | test("[invalid")') + with self.assertRaises(QuerySyntaxError): + evaluate_predicate(pred, attrs[0]) + + def test_combined_contains_and_comparison(self): + doc = DocumentView.parse('source = "docker_app"\ncount = 3\n') + attrs = doc.body.attributes() + pred = parse_predicate('.value | contains("docker") and .name == "source"') + # This should parse as: (.value | contains("docker")) and (.name == "source") + # Actually it parses the contains as a bare accessor result, then "and" + # Let's use a simpler combined test + pred = parse_predicate('.name == "source"') + self.assertTrue(evaluate_predicate(pred, attrs[0])) + + def test_unknown_string_function_raises(self): + with self.assertRaises(QuerySyntaxError): + parse_predicate('.value | bogus("x")') + + +class TestPostfixNot(TestCase): + """Test postfix ``| not`` in predicate accessors.""" + + def test_parse_postfix_not(self): + pred = parse_predicate(".tags | not") + self.assertIsInstance(pred, Comparison) + self.assertEqual(pred.accessor.builtin, "not") + + def test_postfix_not_false_when_exists(self): + doc = DocumentView.parse('resource "aws" "x" {\n tags = {}\n}\n') + blocks = doc.blocks() + pred = parse_predicate(".tags | not") + self.assertFalse(evaluate_predicate(pred, blocks[0])) + + def test_postfix_not_true_when_missing(self): + doc = DocumentView.parse('resource "aws" "x" {}\n') + blocks = doc.blocks() + pred = parse_predicate(".tags | not") + 
self.assertTrue(evaluate_predicate(pred, blocks[0])) + + def test_postfix_not_equivalent_to_prefix(self): + doc = DocumentView.parse('variable "a" {\n default = 1\n}\nvariable "b" {}\n') + blocks = doc.blocks("variable") + # "not .default" and ".default | not" should be equivalent + pred_prefix = parse_predicate("not .default") + pred_postfix = parse_predicate(".default | not") + for block in blocks: + self.assertEqual( + evaluate_predicate(pred_prefix, block), + evaluate_predicate(pred_postfix, block), + ) + + +class TestHasExpr(TestCase): + """Test ``has("key")`` predicate.""" + + def test_parse_has(self): + pred = parse_predicate('has("tags")') + self.assertIsInstance(pred, HasExpr) + self.assertEqual(pred.key, "tags") + + def test_has_true(self): + doc = DocumentView.parse('resource "aws" "x" {\n tags = {}\n}\n') + blocks = doc.blocks() + pred = parse_predicate('has("tags")') + self.assertTrue(evaluate_predicate(pred, blocks[0])) + + def test_has_false(self): + doc = DocumentView.parse('resource "aws" "x" {}\n') + blocks = doc.blocks() + pred = parse_predicate('has("tags")') + self.assertFalse(evaluate_predicate(pred, blocks[0])) + + def test_has_equivalent_to_bare_accessor(self): + doc = DocumentView.parse('variable "a" {\n default = 1\n}\nvariable "b" {}\n') + blocks = doc.blocks("variable") + pred_has = parse_predicate('has("default")') + pred_bare = parse_predicate(".default") + for block in blocks: + self.assertEqual( + evaluate_predicate(pred_has, block), + evaluate_predicate(pred_bare, block), + ) + + def test_has_with_not(self): + doc = DocumentView.parse('resource "aws" "x" {}\n') + blocks = doc.blocks() + pred = parse_predicate('not has("tags")') + self.assertTrue(evaluate_predicate(pred, blocks[0])) diff --git a/test/unit/query/test_resolver.py b/test/unit/query/test_resolver.py new file mode 100644 index 00000000..2b70a2d5 --- /dev/null +++ b/test/unit/query/test_resolver.py @@ -0,0 +1,341 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest 
import TestCase + +from hcl2.query.body import DocumentView +from hcl2.query.path import PathSegment, parse_path +from hcl2.query.resolver import resolve_path + + +class TestResolvePathStructural(TestCase): + def test_simple_attribute(self): + doc = DocumentView.parse("x = 1\n") + results = resolve_path(doc, parse_path("x")) + self.assertEqual(len(results), 1) + self.assertEqual(results[0].name, "x") + + def test_block_type(self): + doc = DocumentView.parse('resource "type" "name" {}\n') + results = resolve_path(doc, parse_path("resource")) + self.assertEqual(len(results), 1) + + def test_block_type_with_label(self): + doc = DocumentView.parse( + 'resource "aws_instance" "main" {\n ami = "test"\n}\n' + ) + results = resolve_path(doc, parse_path("resource.aws_instance")) + self.assertEqual(len(results), 1) + + def test_block_full_path(self): + doc = DocumentView.parse( + 'resource "aws_instance" "main" {\n ami = "test"\n}\n' + ) + results = resolve_path(doc, parse_path("resource.aws_instance.main")) + self.assertEqual(len(results), 1) + + def test_block_attribute(self): + doc = DocumentView.parse( + 'resource "aws_instance" "main" {\n ami = "test"\n}\n' + ) + results = resolve_path(doc, parse_path("resource.aws_instance.main.ami")) + self.assertEqual(len(results), 1) + + def test_wildcard_blocks(self): + doc = DocumentView.parse('resource "a" "b" {}\nvariable "c" {}\n') + results = resolve_path(doc, parse_path("*")) + self.assertEqual(len(results), 2) + + def test_select_all(self): + doc = DocumentView.parse('variable "a" {}\nvariable "b" {}\n') + results = resolve_path(doc, parse_path("variable[*]")) + self.assertEqual(len(results), 2) + + def test_index(self): + doc = DocumentView.parse('variable "a" {}\nvariable "b" {}\n') + results = resolve_path(doc, parse_path("variable[0]")) + self.assertEqual(len(results), 1) + + def test_no_match(self): + doc = DocumentView.parse("x = 1\n") + results = resolve_path(doc, parse_path("nonexistent")) + 
self.assertEqual(len(results), 0) + + def test_empty_segments(self): + doc = DocumentView.parse("x = 1\n") + results = resolve_path(doc, []) + self.assertEqual(len(results), 1) # returns root + + def test_label_mismatch(self): + doc = DocumentView.parse('resource "aws_instance" "main" {}\n') + results = resolve_path(doc, parse_path("resource.aws_s3_bucket")) + self.assertEqual(len(results), 0) + + def test_no_label_block(self): + doc = DocumentView.parse("locals {\n x = 1\n}\n") + results = resolve_path(doc, parse_path("locals.x")) + self.assertEqual(len(results), 1) + + def test_wildcard_labels(self): + doc = DocumentView.parse( + 'resource "aws_instance" "main" {}\nresource "aws_s3_bucket" "data" {}\n' + ) + results = resolve_path(doc, parse_path("resource[*].*")) + self.assertEqual(len(results), 2) + + def test_attribute_unwrap_to_object(self): + doc = DocumentView.parse("x = {\n a = 1\n b = 2\n}\n") + results = resolve_path(doc, parse_path("x.a")) + self.assertEqual(len(results), 1) + + def test_attribute_unwrap_to_object_wildcard(self): + doc = DocumentView.parse("x = {\n a = 1\n b = 2\n}\n") + results = resolve_path(doc, parse_path("x.*")) + self.assertEqual(len(results), 2) + + def test_tuple_select_all(self): + doc = DocumentView.parse("x = [1, 2, 3]\n") + results = resolve_path( + doc, + [ + PathSegment(name="x", select_all=False, index=None), + PathSegment(name="*", select_all=True, index=None), + ], + ) + self.assertEqual(len(results), 3) + + def test_tuple_index(self): + doc = DocumentView.parse("x = [1, 2, 3]\n") + results = resolve_path( + doc, + [ + PathSegment(name="x", select_all=False, index=None), + PathSegment(name="*", select_all=False, index=1), + ], + ) + self.assertEqual(len(results), 1) + + def test_tuple_index_out_of_range(self): + doc = DocumentView.parse("x = [1, 2]\n") + results = resolve_path( + doc, + [ + PathSegment(name="x", select_all=False, index=None), + PathSegment(name="*", select_all=False, index=99), + ], + ) + 
self.assertEqual(len(results), 0) + + def test_tuple_no_match_without_index(self): + doc = DocumentView.parse("x = [1, 2]\n") + results = resolve_path( + doc, + [ + PathSegment(name="x", select_all=False, index=None), + PathSegment(name="foo", select_all=False, index=None), + ], + ) + self.assertEqual(len(results), 0) + + def test_object_key_no_match(self): + doc = DocumentView.parse("x = {\n a = 1\n}\n") + results = resolve_path(doc, parse_path("x.nonexistent")) + self.assertEqual(len(results), 0) + + def test_wildcard_body_includes_attributes(self): + doc = DocumentView.parse("x = 1\ny = 2\n") + results = resolve_path(doc, parse_path("*")) + self.assertEqual(len(results), 2) + + def test_index_out_of_range_on_blocks(self): + doc = DocumentView.parse('variable "a" {}\n') + results = resolve_path(doc, parse_path("variable[99]")) + self.assertEqual(len(results), 0) + + def test_resolve_on_unknown_node_type(self): + doc = DocumentView.parse("x = 1\n") + attr = doc.attribute("x") + value_view = attr.value_node + results = resolve_path( + value_view, [PathSegment(name="foo", select_all=False, index=None)] + ) + self.assertEqual(len(results), 0) + + def test_block_labels_consumed_then_body(self): + doc = DocumentView.parse( + 'resource "aws_instance" "main" {\n ami = "test"\n}\n' + ) + results = resolve_path(doc, parse_path("resource.aws_instance.main.ami")) + self.assertEqual(len(results), 1) + self.assertEqual(results[0].name, "ami") + + +class TestResolveRecursive(TestCase): + def test_recursive_find_nested_attr(self): + hcl = 'resource "type" "name" {\n ami = "test"\n}\n' + doc = DocumentView.parse(hcl) + results = resolve_path(doc, parse_path("resource..ami")) + self.assertEqual(len(results), 1) + self.assertEqual(results[0].name, "ami") + + def test_recursive_deeply_nested(self): + hcl = ( + 'resource "type" "name" {\n' + ' provisioner "local-exec" {\n' + ' command = "echo"\n' + " }\n" + "}\n" + ) + doc = DocumentView.parse(hcl) + results = resolve_path(doc, 
parse_path("resource..command")) + self.assertEqual(len(results), 1) + self.assertEqual(results[0].name, "command") + + def test_recursive_multiple_matches(self): + hcl = ( + 'resource "a" "x" {\n ami = "1"\n}\n' + 'resource "b" "y" {\n ami = "2"\n}\n' + ) + doc = DocumentView.parse(hcl) + results = resolve_path(doc, parse_path("*..ami")) + self.assertEqual(len(results), 2) + + def test_recursive_no_match(self): + hcl = 'resource "type" "name" {\n ami = "test"\n}\n' + doc = DocumentView.parse(hcl) + results = resolve_path(doc, parse_path("resource..nonexistent")) + self.assertEqual(len(results), 0) + + def test_recursive_from_root(self): + hcl = 'resource "type" "name" {\n ami = "test"\n}\n' + doc = DocumentView.parse(hcl) + # ".." from root should search everything + results = resolve_path( + doc, + [PathSegment(name="ami", select_all=False, index=None, recursive=True)], + ) + self.assertEqual(len(results), 1) + + def test_recursive_with_select_all(self): + hcl = ( + 'resource "a" "x" {\n tag = "1"\n}\n' + 'resource "b" "y" {\n tag = "2"\n}\n' + ) + doc = DocumentView.parse(hcl) + results = resolve_path(doc, parse_path("*..tag[*]")) + self.assertEqual(len(results), 2) + + +class TestTypeFilter(TestCase): + def test_recursive_function_call_by_name(self): + hcl = 'x = length(var.list)\ny = upper("hello")\n' + doc = DocumentView.parse(hcl) + results = resolve_path(doc, parse_path("*..function_call:length")) + self.assertEqual(len(results), 1) + self.assertEqual(results[0].name, "length") + + def test_recursive_function_call_wildcard(self): + hcl = 'x = length(var.list)\ny = upper("hello")\n' + doc = DocumentView.parse(hcl) + results = resolve_path(doc, parse_path("*..function_call:*[*]")) + self.assertEqual(len(results), 2) + + def test_type_filter_attribute(self): + hcl = 'resource "a" "b" {}\nx = 1\n' + doc = DocumentView.parse(hcl) + results = resolve_path( + doc, + [ + PathSegment( + name="*", select_all=True, index=None, type_filter="attribute" + ) + ], + ) + 
self.assertEqual(len(results), 1) + self.assertEqual(results[0].name, "x") + + def test_type_filter_block(self): + hcl = 'resource "a" "b" {}\nx = 1\n' + doc = DocumentView.parse(hcl) + results = resolve_path( + doc, + [PathSegment(name="*", select_all=True, index=None, type_filter="block")], + ) + self.assertEqual(len(results), 1) + self.assertEqual(results[0].block_type, "resource") + + def test_type_filter_no_match(self): + hcl = "x = 1\n" + doc = DocumentView.parse(hcl) + results = resolve_path(doc, parse_path("*..function_call:length")) + self.assertEqual(len(results), 0) + + +class TestFunctionCallResolver(TestCase): + def test_function_call_args(self): + hcl = "x = length(var.list)\n" + doc = DocumentView.parse(hcl) + results = resolve_path(doc, parse_path("*..function_call:length")) + self.assertEqual(len(results), 1) + # Navigate to args + args = resolve_path(results[0], parse_path("args")) + self.assertEqual(len(args), 1) + + def test_function_call_args_select_all(self): + hcl = 'x = join(",", var.list)\n' + doc = DocumentView.parse(hcl) + results = resolve_path(doc, parse_path("*..function_call:join")) + self.assertEqual(len(results), 1) + args = resolve_path( + results[0], + [PathSegment(name="args", select_all=True, index=None)], + ) + self.assertEqual(len(args), 2) + + def test_function_call_args_index(self): + hcl = 'x = join(",", var.list)\n' + doc = DocumentView.parse(hcl) + results = resolve_path(doc, parse_path("*..function_call:join")) + self.assertEqual(len(results), 1) + args = resolve_path( + results[0], + [PathSegment(name="args", select_all=False, index=0)], + ) + self.assertEqual(len(args), 1) + + +class TestSkipLabels(TestCase): + """Test the ``~`` (skip labels) operator.""" + + def test_skip_labels_basic(self): + doc = DocumentView.parse( + 'resource "aws_instance" "main" {\n ami = "test"\n}\n' + ) + results = resolve_path(doc, parse_path("resource~.ami")) + self.assertEqual(len(results), 1) + self.assertEqual(results[0].name, "ami") + + 
def test_skip_labels_wildcard(self): + doc = DocumentView.parse( + 'resource "a" "x" {\n ami = 1\n}\nresource "b" "y" {\n ami = 2\n}\n' + ) + results = resolve_path(doc, parse_path("resource~[*]")) + self.assertEqual(len(results), 2) + + def test_skip_labels_with_select(self): + doc = DocumentView.parse('block "a" {\n x = 1\n}\nblock "b" {\n y = 2\n}\n') + results = resolve_path(doc, parse_path("block~[select(.x)]")) + self.assertEqual(len(results), 1) + + def test_skip_labels_delegates_to_body(self): + doc = DocumentView.parse('resource "aws" "main" {\n tags = {}\n}\n') + # Without ~ : need to consume labels + r1 = resolve_path(doc, parse_path("resource.aws.main.tags")) + self.assertEqual(len(r1), 1) + # With ~ : skip labels directly + r2 = resolve_path(doc, parse_path("resource~.tags")) + self.assertEqual(len(r2), 1) + + def test_no_skip_labels_matches_labels(self): + doc = DocumentView.parse('resource "aws_instance" "main" {\n ami = 1\n}\n') + # Without ~, "aws_instance" matches the label + results = resolve_path(doc, parse_path("resource.aws_instance")) + self.assertEqual(len(results), 1) diff --git a/test/unit/query/test_safe_eval.py b/test/unit/query/test_safe_eval.py new file mode 100644 index 00000000..d61e23af --- /dev/null +++ b/test/unit/query/test_safe_eval.py @@ -0,0 +1,144 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.query.safe_eval import ( + UnsafeExpressionError, + safe_eval, + validate_expression, +) + + +class TestValidateExpression(TestCase): + def test_simple_attribute(self): + validate_expression("x.foo") + + def test_method_call(self): + validate_expression("x.blocks('resource')") + + def test_safe_builtin(self): + validate_expression("len(x)") + + def test_lambda(self): + validate_expression("sorted(x, key=lambda b: b.name)") + + def test_comparison(self): + validate_expression("x == 1") + + def test_boolean_ops(self): + validate_expression("x and y or not z") + + def test_subscript(self): + 
validate_expression("x[0]") + + def test_constant(self): + validate_expression("42") + + def test_rejects_import(self): + with self.assertRaises(UnsafeExpressionError): + validate_expression("__import__('os')") + + def test_rejects_exec(self): + with self.assertRaises(UnsafeExpressionError): + validate_expression("exec('code')") + + def test_rejects_eval(self): + with self.assertRaises(UnsafeExpressionError): + validate_expression("eval('code')") + + def test_rejects_comprehension(self): + with self.assertRaises(UnsafeExpressionError): + validate_expression("[x for x in y]") + + def test_syntax_error(self): + with self.assertRaises(UnsafeExpressionError): + validate_expression("def foo(): pass") + + +class TestSafeEval(TestCase): + def test_attribute_access(self): + class Obj: + name = "test_value" + + result = safe_eval("x.name", {"x": Obj()}) + self.assertEqual(result, "test_value") + + def test_method_call(self): + result = safe_eval("x.upper()", {"x": "hello"}) + self.assertEqual(result, "HELLO") + + def test_len(self): + result = safe_eval("len(x)", {"x": [1, 2, 3]}) + self.assertEqual(result, 3) + + def test_sorted(self): + result = safe_eval("sorted(x)", {"x": [3, 1, 2]}) + self.assertEqual(result, [1, 2, 3]) + + def test_sorted_with_key(self): + result = safe_eval( + "sorted(x, key=lambda i: -i)", + {"x": [3, 1, 2]}, + ) + self.assertEqual(result, [3, 2, 1]) + + def test_subscript(self): + result = safe_eval("x[1]", {"x": [10, 20, 30]}) + self.assertEqual(result, 20) + + def test_filter_lambda(self): + result = safe_eval( + "list(filter(lambda i: i > 1, x))", + {"x": [1, 2, 3]}, + ) + self.assertEqual(result, [2, 3]) + + def test_boolean_ops(self): + result = safe_eval("x and y", {"x": True, "y": False}) + self.assertFalse(result) + + def test_comparison(self): + result = safe_eval("x == 42", {"x": 42}) + self.assertTrue(result) + + def test_restricted_no_builtins(self): + with self.assertRaises(Exception): + safe_eval("open('/etc/passwd')", {}) + + def 
test_max_depth(self): + # Build deeply nested attribute access + expr = "x" + ".a" * 25 + with self.assertRaises(UnsafeExpressionError) as ctx: + validate_expression(expr) + self.assertIn("depth", str(ctx.exception)) + + def test_max_node_count(self): + # Build an expression with many nodes via a long list literal + # len([1, 1, ..., 1]) with 210 elements has 210 Constant nodes plus List, Call, Name and Expression, i.e. more than 200 + args = ", ".join(["1"] * 210) + expr = f"len([{args}])" + with self.assertRaises(UnsafeExpressionError) as ctx: + validate_expression(expr) + self.assertIn("node count", str(ctx.exception)) + + def test_rejects_non_attr_non_name_call(self): + # (lambda: 1)() is a Call whose func is a Lambda, not a Name/Attribute + with self.assertRaises(UnsafeExpressionError) as ctx: + validate_expression("(lambda: 1)()") + self.assertIn("Only method calls", str(ctx.exception)) + + def test_rejects_dunder_attribute(self): + with self.assertRaises(UnsafeExpressionError) as ctx: + validate_expression("x.__class__") + self.assertIn("dunder attribute", str(ctx.exception)) + + def test_rejects_dunder_chain(self): + with self.assertRaises(UnsafeExpressionError): + validate_expression("x.__class__.__subclasses__()") + + def test_rejects_getattr(self): + with self.assertRaises(UnsafeExpressionError): + validate_expression("getattr(x, 'y')") + + def test_rejects_hasattr(self): + with self.assertRaises(UnsafeExpressionError): + validate_expression("hasattr(x, 'y')") diff --git a/test/unit/rules/__init__.py b/test/unit/rules/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/test/unit/rules/test_abstract.py b/test/unit/rules/test_abstract.py new file mode 100644 index 00000000..3699ec0e --- /dev/null +++ b/test/unit/rules/test_abstract.py @@ -0,0 +1,179 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from lark import Token, Tree +from lark.tree import Meta + +from hcl2.rules.abstract import LarkToken, LarkRule +from hcl2.utils import 
SerializationOptions, SerializationContext + + +# --- Concrete stubs for testing ABCs --- + + +class ConcreteToken(LarkToken): + @staticmethod + def lark_name() -> str: + return "TEST_TOKEN" + + @property + def serialize_conversion(self): + return str + + +class IntToken(LarkToken): + @staticmethod + def lark_name() -> str: + return "INT_TOKEN" + + @property + def serialize_conversion(self): + return int + + +class ConcreteRule(LarkRule): + @staticmethod + def lark_name() -> str: + return "test_rule" + + def serialize(self, options=SerializationOptions(), context=SerializationContext()): + return "test" + + +# --- Tests --- + + +class TestLarkToken(TestCase): + def test_init_stores_value(self): + token = ConcreteToken("hello") + self.assertEqual(token.value, "hello") + + def test_value_property(self): + token = ConcreteToken(42) + self.assertEqual(token.value, 42) + + def test_set_value(self): + token = ConcreteToken("old") + token.set_value("new") + self.assertEqual(token.value, "new") + + def test_str(self): + token = ConcreteToken("hello") + self.assertEqual(str(token), "hello") + + def test_str_numeric(self): + token = ConcreteToken(42) + self.assertEqual(str(token), "42") + + def test_repr(self): + token = ConcreteToken("hello") + self.assertEqual(repr(token), "") + + def test_to_lark_returns_token(self): + token = ConcreteToken("val") + lark_token = token.to_lark() + self.assertIsInstance(lark_token, Token) + self.assertEqual(lark_token.type, "TEST_TOKEN") + self.assertEqual(lark_token, "val") + + def test_serialize_uses_conversion(self): + token = ConcreteToken("hello") + self.assertEqual(token.serialize(), "hello") + + def test_serialize_int_conversion(self): + token = IntToken("42") + result = token.serialize() + self.assertEqual(result, 42) + self.assertIsInstance(result, int) + + def test_lark_name(self): + self.assertEqual(ConcreteToken.lark_name(), "TEST_TOKEN") + + +class TestLarkRule(TestCase): + def test_init_sets_children(self): + t1 = 
ConcreteToken("a") + t2 = ConcreteToken("b") + rule = ConcreteRule([t1, t2]) + self.assertEqual(rule.children, [t1, t2]) + + def test_init_sets_parent_and_index(self): + t1 = ConcreteToken("a") + t2 = ConcreteToken("b") + rule = ConcreteRule([t1, t2]) + self.assertIs(t1._parent, rule) + self.assertIs(t2._parent, rule) + self.assertEqual(t1._index, 0) + self.assertEqual(t2._index, 1) + + def test_init_skips_none_children_for_parent_index(self): + t1 = ConcreteToken("a") + rule = ConcreteRule([None, t1, None]) + self.assertIs(t1._parent, rule) + self.assertEqual(t1._index, 1) + + def test_init_with_meta(self): + meta = Meta() + rule = ConcreteRule([], meta) + self.assertIs(rule._meta, meta) + + def test_init_without_meta(self): + rule = ConcreteRule([]) + self.assertIsNotNone(rule._meta) + + def test_parent_property(self): + child_rule = ConcreteRule([]) + parent_rule = ConcreteRule([child_rule]) + self.assertIs(child_rule.parent, parent_rule) + + def test_index_property(self): + child_rule = ConcreteRule([]) + ConcreteRule([child_rule]) + self.assertEqual(child_rule.index, 0) + + def test_children_property(self): + t = ConcreteToken("x") + rule = ConcreteRule([t]) + self.assertEqual(rule.children, [t]) + + def test_to_lark_builds_tree(self): + t1 = ConcreteToken("a") + t2 = ConcreteToken("b") + rule = ConcreteRule([t1, t2]) + tree = rule.to_lark() + self.assertIsInstance(tree, Tree) + self.assertEqual(tree.data, "test_rule") + self.assertEqual(len(tree.children), 2) + + def test_to_lark_skips_none_children(self): + t1 = ConcreteToken("a") + rule = ConcreteRule([None, t1, None]) + tree = rule.to_lark() + self.assertEqual(len(tree.children), 1) + self.assertEqual(tree.children[0], "a") + + def test_repr(self): + rule = ConcreteRule([]) + self.assertEqual(repr(rule), "") + + def test_nested_rules(self): + inner = ConcreteRule([ConcreteToken("x")]) + outer = ConcreteRule([inner]) + self.assertIs(inner.parent, outer) + tree = outer.to_lark() + self.assertEqual(tree.data, 
"test_rule") + self.assertEqual(len(tree.children), 1) + self.assertIsInstance(tree.children[0], Tree) + + +class TestLarkElement(TestCase): + def test_set_index(self): + token = ConcreteToken("x") + token.set_index(5) + self.assertEqual(token._index, 5) + + def test_set_parent(self): + token = ConcreteToken("x") + parent = ConcreteRule([]) + token.set_parent(parent) + self.assertIs(token._parent, parent) diff --git a/test/unit/rules/test_base.py b/test/unit/rules/test_base.py new file mode 100644 index 00000000..4dc51f92 --- /dev/null +++ b/test/unit/rules/test_base.py @@ -0,0 +1,315 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.const import IS_BLOCK +from hcl2.rules.base import AttributeRule, BodyRule, StartRule, BlockRule +from hcl2.rules.expressions import ExpressionRule, ExprTermRule +from hcl2.rules.literal_rules import IdentifierRule +from hcl2.rules.strings import StringRule, StringPartRule +from hcl2.rules.tokens import ( + NAME, + EQ, + LBRACE, + RBRACE, + DBLQUOTE, + STRING_CHARS, + NL_OR_COMMENT, +) +from hcl2.rules.whitespace import NewLineOrCommentRule +from hcl2.utils import SerializationOptions, SerializationContext + + +# --- Stubs & helpers --- + + +class StubExpression(ExpressionRule): + """Minimal concrete ExpressionRule that serializes to a fixed value.""" + + def __init__(self, value): + self._stub_value = value + super().__init__([], None) + + def serialize(self, options=SerializationOptions(), context=SerializationContext()): + return self._stub_value + + +def _make_identifier(name): + return IdentifierRule([NAME(name)]) + + +def _make_expr_term(value): + return ExprTermRule([StubExpression(value)]) + + +def _make_string_rule(text): + part = StringPartRule([STRING_CHARS(text)]) + return StringRule([DBLQUOTE(), part, DBLQUOTE()]) + + +def _make_nlc(text): + return NewLineOrCommentRule([NL_OR_COMMENT(text)]) + + +def _make_attribute(name, value): + return AttributeRule([_make_identifier(name), EQ(), 
_make_expr_term(value)]) + + +def _make_block(labels, body_children=None): + """Build a BlockRule with the given labels and body children. + + labels: list of IdentifierRule or StringRule instances + body_children: list of children for the body, or None for empty body + """ + body = BodyRule(body_children or []) + children = list(labels) + [LBRACE(), body, RBRACE()] + return BlockRule(children) + + +# --- AttributeRule tests --- + + +class TestAttributeRule(TestCase): + def test_lark_name(self): + self.assertEqual(AttributeRule.lark_name(), "attribute") + + def test_identifier_property(self): + ident = _make_identifier("name") + attr = AttributeRule([ident, EQ(), _make_expr_term("value")]) + self.assertIs(attr.identifier, ident) + + def test_expression_property(self): + expr_term = _make_expr_term("value") + attr = AttributeRule([_make_identifier("name"), EQ(), expr_term]) + self.assertIs(attr.expression, expr_term) + + def test_serialize(self): + attr = _make_attribute("name", "value") + self.assertEqual(attr.serialize(), {"name": "value"}) + + def test_serialize_int_value(self): + attr = _make_attribute("count", 42) + self.assertEqual(attr.serialize(), {"count": 42}) + + def test_serialize_expression_value(self): + attr = _make_attribute("expr", "${var.x}") + self.assertEqual(attr.serialize(), {"expr": "${var.x}"}) + + +# --- BodyRule tests --- + + +class TestBodyRule(TestCase): + def test_lark_name(self): + self.assertEqual(BodyRule.lark_name(), "body") + + def test_serialize_empty(self): + body = BodyRule([]) + self.assertEqual(body.serialize(), {}) + + def test_serialize_single_attribute(self): + body = BodyRule([_make_attribute("name", "value")]) + self.assertEqual(body.serialize(), {"name": "value"}) + + def test_serialize_multiple_attributes(self): + body = BodyRule([_make_attribute("a", 1), _make_attribute("b", 2)]) + self.assertEqual(body.serialize(), {"a": 1, "b": 2}) + + def test_serialize_single_block(self): + block = 
_make_block([_make_identifier("resource")]) + body = BodyRule([block]) + result = body.serialize() + self.assertIn("resource", result) + self.assertIsInstance(result["resource"], list) + self.assertEqual(len(result["resource"]), 1) + self.assertTrue(result["resource"][0][IS_BLOCK]) + + def test_serialize_multiple_blocks_same_type(self): + block1 = _make_block( + [_make_identifier("resource")], + [_make_attribute("name", "first")], + ) + block2 = _make_block( + [_make_identifier("resource")], + [_make_attribute("name", "second")], + ) + body = BodyRule([block1, block2]) + result = body.serialize() + self.assertEqual(len(result["resource"]), 2) + self.assertEqual(result["resource"][0]["name"], "first") + self.assertEqual(result["resource"][1]["name"], "second") + + def test_serialize_mixed_attributes_and_blocks(self): + attr = _make_attribute("version", "1.0") + block = _make_block([_make_identifier("provider")]) + body = BodyRule([attr, block]) + result = body.serialize() + self.assertEqual(result["version"], "1.0") + self.assertIn("provider", result) + self.assertIsInstance(result["provider"], list) + + def test_serialize_comments_collected(self): + nlc = _make_nlc("# a comment\n") + attr = _make_attribute("x", 1) + body = BodyRule([nlc, attr]) + result = body.serialize(options=SerializationOptions(with_comments=True)) + self.assertIn("__comments__", result) + + def test_serialize_comments_not_collected_without_option(self): + nlc = _make_nlc("# a comment\n") + attr = _make_attribute("x", 1) + body = BodyRule([nlc, attr]) + result = body.serialize(options=SerializationOptions(with_comments=False)) + self.assertNotIn("__comments__", result) + + def test_serialize_bare_newlines_not_collected_as_comments(self): + nlc = _make_nlc("\n") + attr = _make_attribute("x", 1) + body = BodyRule([nlc, attr]) + result = body.serialize(options=SerializationOptions(with_comments=True)) + self.assertNotIn("__comments__", result) + + def 
test_serialize_raises_when_block_name_collides_with_attribute(self): + attr = _make_attribute("resource", "value") + block = _make_block([_make_identifier("resource")]) + body = BodyRule([attr, block]) + with self.assertRaises(RuntimeError): + body.serialize() + + def test_serialize_skips_newline_children(self): + nlc = _make_nlc("\n") + attr = _make_attribute("x", 1) + body = BodyRule([nlc, attr, nlc]) + result = body.serialize() + # NLC children should not appear as keys + keys = [k for k in result.keys() if not k.startswith("__")] + self.assertEqual(keys, ["x"]) + + +# --- StartRule tests --- + + +class TestStartRule(TestCase): + def test_lark_name(self): + self.assertEqual(StartRule.lark_name(), "start") + + def test_body_property(self): + body = BodyRule([]) + start = StartRule([body]) + self.assertIs(start.body, body) + + def test_serialize_delegates_to_body(self): + attr = _make_attribute("key", "val") + body = BodyRule([attr]) + start = StartRule([body]) + self.assertEqual(start.serialize(), body.serialize()) + + def test_serialize_empty_body(self): + start = StartRule([BodyRule([])]) + self.assertEqual(start.serialize(), {}) + + +# --- BlockRule tests --- + + +class TestBlockRule(TestCase): + def test_lark_name(self): + self.assertEqual(BlockRule.lark_name(), "block") + + def test_labels_property_single(self): + ident = _make_identifier("resource") + block = _make_block([ident]) + self.assertEqual(len(block.labels), 1) + self.assertIs(block.labels[0], ident) + + def test_labels_property_two(self): + i1 = _make_identifier("resource") + i2 = _make_identifier("aws_instance") + block = _make_block([i1, i2]) + self.assertEqual(len(block.labels), 2) + self.assertIs(block.labels[0], i1) + self.assertIs(block.labels[1], i2) + + def test_labels_property_three(self): + i1 = _make_identifier("resource") + i2 = _make_identifier("aws_instance") + s3 = _make_string_rule("example") + block = _make_block([i1, i2, s3]) + labels = block.labels + 
self.assertEqual(len(labels), 3) + self.assertIs(labels[0], i1) + self.assertIs(labels[1], i2) + self.assertIs(labels[2], s3) + + def test_body_property(self): + body = BodyRule([]) + ident = _make_identifier("resource") + block = BlockRule([ident, LBRACE(), body, RBRACE()]) + self.assertIs(block.body, body) + + def test_constructor_filters_tokens(self): + """LBRACE and RBRACE should not appear in labels or body.""" + ident = _make_identifier("resource") + body = BodyRule([]) + block = BlockRule([ident, LBRACE(), body, RBRACE()]) + # labels should only contain the identifier + self.assertEqual(len(block.labels), 1) + self.assertIs(block.labels[0], ident) + self.assertIs(block.body, body) + + def test_serialize_single_label_empty_body(self): + block = _make_block([_make_identifier("resource")]) + result = block.serialize() + self.assertEqual(result, {IS_BLOCK: True}) + + def test_serialize_single_label_with_body(self): + block = _make_block( + [_make_identifier("resource")], + [_make_attribute("name", "foo")], + ) + result = block.serialize() + self.assertEqual(result, {"name": "foo", IS_BLOCK: True}) + + def test_serialize_two_labels(self): + block = _make_block( + [_make_identifier("resource"), _make_identifier("aws_instance")], + [_make_attribute("ami", "abc")], + ) + result = block.serialize() + self.assertIn("aws_instance", result) + inner = result["aws_instance"] + self.assertEqual(inner, {"ami": "abc", IS_BLOCK: True}) + + def test_serialize_three_labels(self): + block = _make_block( + [ + _make_identifier("resource"), + _make_identifier("aws_instance"), + _make_string_rule("example"), + ], + [_make_attribute("ami", "abc")], + ) + result = block.serialize() + self.assertIn("aws_instance", result) + inner = result["aws_instance"] + self.assertIn('"example"', inner) + innermost = inner['"example"'] + self.assertEqual(innermost, {"ami": "abc", IS_BLOCK: True}) + + def test_serialize_explicit_blocks_false(self): + block = _make_block( + 
[_make_identifier("resource")], + [_make_attribute("name", "foo")], + ) + opts = SerializationOptions(explicit_blocks=False) + result = block.serialize(options=opts) + self.assertNotIn(IS_BLOCK, result) + self.assertEqual(result, {"name": "foo"}) + + def test_serialize_string_label(self): + block = _make_block( + [_make_identifier("resource"), _make_string_rule("my_label")], + [_make_attribute("x", 1)], + ) + result = block.serialize() + # StringRule serializes with quotes + self.assertIn('"my_label"', result) diff --git a/test/unit/rules/test_containers.py b/test/unit/rules/test_containers.py new file mode 100644 index 00000000..526b0216 --- /dev/null +++ b/test/unit/rules/test_containers.py @@ -0,0 +1,353 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.rules.containers import ( + TupleRule, + ObjectElemKeyRule, + ObjectElemKeyExpressionRule, + ObjectElemRule, + ObjectRule, +) +from hcl2.rules.expressions import ExpressionRule +from hcl2.rules.literal_rules import IdentifierRule, IntLitRule, FloatLitRule +from hcl2.rules.strings import StringRule, StringPartRule +from hcl2.rules.tokens import ( + LSQB, + RSQB, + LBRACE, + RBRACE, + EQ, + COLON, + COMMA, + NAME, + DBLQUOTE, + STRING_CHARS, + IntLiteral, + FloatLiteral, +) +from hcl2.rules.whitespace import NewLineOrCommentRule +from hcl2.rules.tokens import NL_OR_COMMENT +from hcl2.utils import SerializationOptions, SerializationContext + + +# --- Stubs & Helpers --- + + +class StubExpression(ExpressionRule): + """Minimal ExpressionRule that serializes to a fixed value.""" + + def __init__(self, value): + self._stub_value = value + super().__init__([], None) + + def serialize(self, options=SerializationOptions(), context=SerializationContext()): + return self._stub_value + + +def _make_nlc(text): + return NewLineOrCommentRule([NL_OR_COMMENT(text)]) + + +def _make_identifier(name): + return IdentifierRule([NAME(name)]) + + +def _make_string_rule(text): + part = 
StringPartRule([STRING_CHARS(text)]) + return StringRule([DBLQUOTE(), part, DBLQUOTE()]) + + +def _make_object_elem_key(identifier_name): + return ObjectElemKeyRule([_make_identifier(identifier_name)]) + + +def _make_object_elem(key_name, expr_value, sep=None): + key = _make_object_elem_key(key_name) + separator = sep or EQ() + return ObjectElemRule([key, separator, StubExpression(expr_value)]) + + +# --- TupleRule tests --- + + +class TestTupleRule(TestCase): + def test_lark_name(self): + self.assertEqual(TupleRule.lark_name(), "tuple") + + def test_elements_empty_tuple(self): + rule = TupleRule([LSQB(), RSQB()]) + self.assertEqual(rule.elements, []) + + def test_elements_single(self): + expr = StubExpression(1) + rule = TupleRule([LSQB(), expr, RSQB()]) + self.assertEqual(rule.elements, [expr]) + + def test_elements_multiple(self): + e1 = StubExpression(1) + e2 = StubExpression(2) + e3 = StubExpression(3) + rule = TupleRule([LSQB(), e1, COMMA(), e2, COMMA(), e3, RSQB()]) + self.assertEqual(rule.elements, [e1, e2, e3]) + + def test_elements_skips_non_expressions(self): + e1 = StubExpression(1) + e2 = StubExpression(2) + nlc = _make_nlc("\n") + rule = TupleRule([LSQB(), nlc, e1, COMMA(), nlc, e2, RSQB()]) + self.assertEqual(len(rule.elements), 2) + + def test_serialize_default_returns_list(self): + rule = TupleRule( + [LSQB(), StubExpression(1), COMMA(), StubExpression(2), RSQB()] + ) + result = rule.serialize() + self.assertEqual(result, [1, 2]) + + def test_serialize_empty_returns_empty_list(self): + rule = TupleRule([LSQB(), RSQB()]) + self.assertEqual(rule.serialize(), []) + + def test_serialize_single_element(self): + rule = TupleRule([LSQB(), StubExpression(42), RSQB()]) + self.assertEqual(rule.serialize(), [42]) + + def test_serialize_wrap_tuples(self): + rule = TupleRule( + [LSQB(), StubExpression("a"), COMMA(), StubExpression("b"), RSQB()] + ) + opts = SerializationOptions(wrap_tuples=True) + result = rule.serialize(options=opts) + self.assertEqual(result, 
"${[a, b]}") + + def test_serialize_wrap_tuples_empty(self): + rule = TupleRule([LSQB(), RSQB()]) + opts = SerializationOptions(wrap_tuples=True) + result = rule.serialize(options=opts) + self.assertEqual(result, "${[]}") + + def test_serialize_inside_dollar_string(self): + rule = TupleRule([LSQB(), StubExpression("a"), RSQB()]) + ctx = SerializationContext(inside_dollar_string=True) + result = rule.serialize(context=ctx) + # Inside dollar string forces string representation + self.assertEqual(result, "[a]") + + def test_serialize_inside_dollar_string_no_extra_wrap(self): + rule = TupleRule( + [LSQB(), StubExpression("a"), COMMA(), StubExpression("b"), RSQB()] + ) + ctx = SerializationContext(inside_dollar_string=True) + result = rule.serialize(context=ctx) + self.assertEqual(result, "[a, b]") + + def test_serialize_wrap_tuples_inside_dollar_string(self): + rule = TupleRule([LSQB(), StubExpression("x"), RSQB()]) + opts = SerializationOptions(wrap_tuples=True) + ctx = SerializationContext(inside_dollar_string=True) + result = rule.serialize(options=opts, context=ctx) + # Already inside $, so no extra wrapping + self.assertEqual(result, "[x]") + + +# --- ObjectElemKeyRule tests --- + + +class TestObjectElemKeyRule(TestCase): + def test_lark_name(self): + self.assertEqual(ObjectElemKeyRule.lark_name(), "object_elem_key") + + def test_value_property_identifier(self): + ident = _make_identifier("foo") + rule = ObjectElemKeyRule([ident]) + self.assertIs(rule.value, ident) + + def test_serialize_identifier(self): + rule = ObjectElemKeyRule([_make_identifier("my_key")]) + self.assertEqual(rule.serialize(), "my_key") + + def test_serialize_int_lit(self): + rule = ObjectElemKeyRule([IntLitRule([IntLiteral("5")])]) + self.assertEqual(rule.serialize(), "5") + + def test_serialize_float_lit(self): + rule = ObjectElemKeyRule([FloatLitRule([FloatLiteral("3.14")])]) + self.assertEqual(rule.serialize(), "3.14") + + def test_serialize_string(self): + rule = 
ObjectElemKeyRule([_make_string_rule("k3")]) + self.assertEqual(rule.serialize(), '"k3"') + + +# --- ObjectElemKeyExpressionRule tests --- + + +class TestObjectElemKeyExpressionRule(TestCase): + def test_lark_name(self): + self.assertEqual( + ObjectElemKeyExpressionRule.lark_name(), "object_elem_key_expr" + ) + + def test_expression_property(self): + expr = StubExpression("1 + 1") + rule = ObjectElemKeyExpressionRule([expr]) + self.assertIs(rule.expression, expr) + + def test_serialize_bare(self): + rule = ObjectElemKeyExpressionRule([StubExpression("1 + 1")]) + result = rule.serialize() + self.assertEqual(result, "${1 + 1}") + + def test_serialize_inside_dollar_string(self): + rule = ObjectElemKeyExpressionRule([StubExpression("1 + 1")]) + ctx = SerializationContext(inside_dollar_string=True) + result = rule.serialize(context=ctx) + self.assertEqual(result, "1 + 1") + + def test_serialize_function_call(self): + rule = ObjectElemKeyExpressionRule([StubExpression('format("k", v)')]) + result = rule.serialize() + self.assertEqual(result, '${format("k", v)}') + + +# --- ObjectElemRule tests --- + + +class TestObjectElemRule(TestCase): + def test_lark_name(self): + self.assertEqual(ObjectElemRule.lark_name(), "object_elem") + + def test_key_property(self): + key = _make_object_elem_key("foo") + rule = ObjectElemRule([key, EQ(), StubExpression("bar")]) + self.assertIs(rule.key, key) + + def test_expression_property(self): + expr = StubExpression("bar") + rule = ObjectElemRule([_make_object_elem_key("foo"), EQ(), expr]) + self.assertIs(rule.expression, expr) + + def test_serialize_with_eq(self): + rule = _make_object_elem("name", "value") + self.assertEqual(rule.serialize(), {"name": "value"}) + + def test_serialize_with_colon(self): + rule = ObjectElemRule([_make_object_elem_key("k"), COLON(), StubExpression(42)]) + self.assertEqual(rule.serialize(), {"k": 42}) + + def test_serialize_int_value(self): + rule = _make_object_elem("count", 5) + 
self.assertEqual(rule.serialize(), {"count": 5}) + + def test_serialize_string_key(self): + key = ObjectElemKeyRule([_make_string_rule("quoted")]) + rule = ObjectElemRule([key, EQ(), StubExpression("val")]) + self.assertEqual(rule.serialize(), {'"quoted"': "val"}) + + +# --- ObjectRule tests --- + + +class TestObjectRule(TestCase): + def test_lark_name(self): + self.assertEqual(ObjectRule.lark_name(), "object") + + def test_elements_empty(self): + rule = ObjectRule([LBRACE(), RBRACE()]) + self.assertEqual(rule.elements, []) + + def test_elements_single(self): + elem = _make_object_elem("k", "v") + rule = ObjectRule([LBRACE(), elem, RBRACE()]) + self.assertEqual(rule.elements, [elem]) + + def test_elements_multiple(self): + e1 = _make_object_elem("a", 1) + e2 = _make_object_elem("b", 2) + rule = ObjectRule([LBRACE(), e1, e2, RBRACE()]) + self.assertEqual(rule.elements, [e1, e2]) + + def test_elements_skips_non_elem(self): + e1 = _make_object_elem("a", 1) + nlc = _make_nlc("\n") + rule = ObjectRule([LBRACE(), nlc, e1, nlc, RBRACE()]) + self.assertEqual(rule.elements, [e1]) + + def test_serialize_default_returns_dict(self): + rule = ObjectRule( + [ + LBRACE(), + _make_object_elem("k1", "v1"), + _make_object_elem("k2", "v2"), + RBRACE(), + ] + ) + result = rule.serialize() + self.assertEqual(result, {"k1": "v1", "k2": "v2"}) + + def test_serialize_empty_returns_empty_dict(self): + rule = ObjectRule([LBRACE(), RBRACE()]) + self.assertEqual(rule.serialize(), {}) + + def test_serialize_single_element(self): + rule = ObjectRule([LBRACE(), _make_object_elem("x", 42), RBRACE()]) + self.assertEqual(rule.serialize(), {"x": 42}) + + def test_serialize_wrap_objects(self): + rule = ObjectRule( + [ + LBRACE(), + _make_object_elem("k1", "v1"), + _make_object_elem("k2", "v2"), + RBRACE(), + ] + ) + opts = SerializationOptions(wrap_objects=True) + result = rule.serialize(options=opts) + # Result is "{k1 = v1, k2 = v2}" wrapped in ${}, giving ${{...}} + self.assertEqual(result, "${{k1 
= v1, k2 = v2}}") + + def test_serialize_wrap_objects_empty(self): + rule = ObjectRule([LBRACE(), RBRACE()]) + opts = SerializationOptions(wrap_objects=True) + result = rule.serialize(options=opts) + self.assertEqual(result, "${{}}") + + def test_serialize_inside_dollar_string(self): + rule = ObjectRule( + [ + LBRACE(), + _make_object_elem("k", "v"), + RBRACE(), + ] + ) + ctx = SerializationContext(inside_dollar_string=True) + result = rule.serialize(context=ctx) + # Inside dollar string forces string representation + self.assertEqual(result, "{k = v}") + + def test_serialize_inside_dollar_string_no_extra_wrap(self): + rule = ObjectRule( + [ + LBRACE(), + _make_object_elem("a", 1), + _make_object_elem("b", 2), + RBRACE(), + ] + ) + ctx = SerializationContext(inside_dollar_string=True) + result = rule.serialize(context=ctx) + self.assertEqual(result, "{a = 1, b = 2}") + + def test_serialize_wrap_objects_inside_dollar_string(self): + rule = ObjectRule( + [ + LBRACE(), + _make_object_elem("k", "v"), + RBRACE(), + ] + ) + opts = SerializationOptions(wrap_objects=True) + ctx = SerializationContext(inside_dollar_string=True) + result = rule.serialize(options=opts, context=ctx) + self.assertEqual(result, "{k = v}") diff --git a/test/unit/rules/test_directives.py b/test/unit/rules/test_directives.py new file mode 100644 index 00000000..bb2be42e --- /dev/null +++ b/test/unit/rules/test_directives.py @@ -0,0 +1,187 @@ +"""Unit tests for template directive rule classes.""" + +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.rules.directives import ( + TemplateIfStartRule, + TemplateElseRule, + TemplateEndifRule, + TemplateForStartRule, + TemplateEndforRule, + TemplateIfRule, + TemplateForRule, +) +from hcl2.rules.literal_rules import IdentifierRule +from hcl2.rules.strings import StringPartRule +from hcl2.rules.tokens import ( + NAME, + DIRECTIVE_START, + STRIP_MARKER, + RBRACE, + IF, + ELSE, + ENDIF, + FOR, + IN, + ENDFOR, + COMMA, + 
STRING_CHARS, +) + + +class TestTemplateIfStartRule(TestCase): + def test_lark_name(self): + self.assertEqual(TemplateIfStartRule.lark_name(), "template_if_start") + + def test_serialize_basic(self): + cond = IdentifierRule([NAME("cond")]) + rule = TemplateIfStartRule([DIRECTIVE_START(), IF(), cond, RBRACE()]) + self.assertEqual(rule.serialize(), "%{ if cond }") + + def test_serialize_strip_markers(self): + cond = IdentifierRule([NAME("cond")]) + rule = TemplateIfStartRule( + [DIRECTIVE_START(), STRIP_MARKER(), IF(), cond, STRIP_MARKER(), RBRACE()] + ) + self.assertEqual(rule.serialize(), "%{~ if cond ~}") + + def test_condition_property(self): + cond = IdentifierRule([NAME("x")]) + rule = TemplateIfStartRule([DIRECTIVE_START(), IF(), cond, RBRACE()]) + self.assertIs(rule.condition, cond) + + +class TestTemplateElseRule(TestCase): + def test_lark_name(self): + self.assertEqual(TemplateElseRule.lark_name(), "template_else") + + def test_serialize_basic(self): + rule = TemplateElseRule([DIRECTIVE_START(), ELSE(), RBRACE()]) + self.assertEqual(rule.serialize(), "%{ else }") + + def test_serialize_strip_markers(self): + rule = TemplateElseRule( + [DIRECTIVE_START(), STRIP_MARKER(), ELSE(), STRIP_MARKER(), RBRACE()] + ) + self.assertEqual(rule.serialize(), "%{~ else ~}") + + +class TestTemplateEndifRule(TestCase): + def test_lark_name(self): + self.assertEqual(TemplateEndifRule.lark_name(), "template_endif") + + def test_serialize_basic(self): + rule = TemplateEndifRule([DIRECTIVE_START(), ENDIF(), RBRACE()]) + self.assertEqual(rule.serialize(), "%{ endif }") + + +class TestTemplateForStartRule(TestCase): + def test_lark_name(self): + self.assertEqual(TemplateForStartRule.lark_name(), "template_for_start") + + def test_serialize_basic(self): + iterator = IdentifierRule([NAME("item")]) + collection = IdentifierRule([NAME("items")]) + rule = TemplateForStartRule( + [DIRECTIVE_START(), FOR(), iterator, IN(), collection, RBRACE()] + ) + self.assertEqual(rule.serialize(), 
"%{ for item in items }") + + def test_serialize_key_value(self): + key = IdentifierRule([NAME("k")]) + val = IdentifierRule([NAME("v")]) + collection = IdentifierRule([NAME("map")]) + rule = TemplateForStartRule( + [DIRECTIVE_START(), FOR(), key, COMMA(), val, IN(), collection, RBRACE()] + ) + self.assertEqual(rule.serialize(), "%{ for k, v in map }") + + def test_serialize_strip_markers(self): + iterator = IdentifierRule([NAME("x")]) + collection = IdentifierRule([NAME("xs")]) + rule = TemplateForStartRule( + [ + DIRECTIVE_START(), + STRIP_MARKER(), + FOR(), + iterator, + IN(), + collection, + STRIP_MARKER(), + RBRACE(), + ] + ) + self.assertEqual(rule.serialize(), "%{~ for x in xs ~}") + + +class TestTemplateEndforRule(TestCase): + def test_lark_name(self): + self.assertEqual(TemplateEndforRule.lark_name(), "template_endfor") + + def test_serialize_basic(self): + rule = TemplateEndforRule([DIRECTIVE_START(), ENDFOR(), RBRACE()]) + self.assertEqual(rule.serialize(), "%{ endfor }") + + +class TestTemplateIfRule(TestCase): + def test_lark_name(self): + self.assertEqual(TemplateIfRule.lark_name(), "template_if") + + def test_serialize_basic(self): + cond = IdentifierRule([NAME("cond")]) + if_start = TemplateIfStartRule([DIRECTIVE_START(), IF(), cond, RBRACE()]) + body = [StringPartRule([STRING_CHARS("yes")])] + endif = TemplateEndifRule([DIRECTIVE_START(), ENDIF(), RBRACE()]) + rule = TemplateIfRule(if_start, body, None, None, endif) + self.assertEqual(rule.serialize(), "%{ if cond }yes%{ endif }") + + def test_serialize_with_else(self): + cond = IdentifierRule([NAME("cond")]) + if_start = TemplateIfStartRule([DIRECTIVE_START(), IF(), cond, RBRACE()]) + if_body = [StringPartRule([STRING_CHARS("yes")])] + else_rule = TemplateElseRule([DIRECTIVE_START(), ELSE(), RBRACE()]) + else_body = [StringPartRule([STRING_CHARS("no")])] + endif = TemplateEndifRule([DIRECTIVE_START(), ENDIF(), RBRACE()]) + rule = TemplateIfRule(if_start, if_body, else_rule, else_body, endif) + 
self.assertEqual(rule.serialize(), "%{ if cond }yes%{ else }no%{ endif }") + + def test_serialize_strip_markers(self): + cond = IdentifierRule([NAME("c")]) + if_start = TemplateIfStartRule( + [DIRECTIVE_START(), STRIP_MARKER(), IF(), cond, STRIP_MARKER(), RBRACE()] + ) + body = [StringPartRule([STRING_CHARS("x")])] + endif = TemplateEndifRule( + [DIRECTIVE_START(), STRIP_MARKER(), ENDIF(), STRIP_MARKER(), RBRACE()] + ) + rule = TemplateIfRule(if_start, body, None, None, endif) + self.assertEqual(rule.serialize(), "%{~ if c ~}x%{~ endif ~}") + + +class TestTemplateForRule(TestCase): + def test_lark_name(self): + self.assertEqual(TemplateForRule.lark_name(), "template_for") + + def test_serialize_basic(self): + iterator = IdentifierRule([NAME("item")]) + collection = IdentifierRule([NAME("items")]) + for_start = TemplateForStartRule( + [DIRECTIVE_START(), FOR(), iterator, IN(), collection, RBRACE()] + ) + body = [StringPartRule([STRING_CHARS("text")])] + endfor = TemplateEndforRule([DIRECTIVE_START(), ENDFOR(), RBRACE()]) + rule = TemplateForRule(for_start, body, endfor) + self.assertEqual(rule.serialize(), "%{ for item in items }text%{ endfor }") + + def test_serialize_key_value(self): + key = IdentifierRule([NAME("k")]) + val = IdentifierRule([NAME("v")]) + collection = IdentifierRule([NAME("m")]) + for_start = TemplateForStartRule( + [DIRECTIVE_START(), FOR(), key, COMMA(), val, IN(), collection, RBRACE()] + ) + body = [StringPartRule([STRING_CHARS("text")])] + endfor = TemplateEndforRule([DIRECTIVE_START(), ENDFOR(), RBRACE()]) + rule = TemplateForRule(for_start, body, endfor) + self.assertEqual(rule.serialize(), "%{ for k, v in m }text%{ endfor }") diff --git a/test/unit/rules/test_expressions.py b/test/unit/rules/test_expressions.py new file mode 100644 index 00000000..8ab7e8db --- /dev/null +++ b/test/unit/rules/test_expressions.py @@ -0,0 +1,490 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.rules.abstract import 
LarkRule +from hcl2.rules.expressions import ( + ExpressionRule, + ExprTermRule, + ConditionalRule, + BinaryTermRule, + BinaryOpRule, + UnaryOpRule, +) +from hcl2.rules.literal_rules import BinaryOperatorRule +from hcl2.rules.tokens import ( + LPAR, + RPAR, + QMARK, + COLON, + BINARY_OP, + StringToken, +) +from hcl2.utils import SerializationOptions, SerializationContext + + +# --- Stubs & helpers --- + + +class StubExpression(ExpressionRule): + """Minimal concrete ExpressionRule that serializes to a fixed string.""" + + def __init__(self, value, children=None): + self._stub_value = value + super().__init__(children or [], None) + + def serialize(self, options=SerializationOptions(), context=SerializationContext()): + return self._stub_value + + +class NonExpressionRule(LarkRule): + """A rule that is NOT an ExpressionRule, for parent-chain tests.""" + + @staticmethod + def lark_name(): + return "non_expression" + + def serialize(self, options=SerializationOptions(), context=SerializationContext()): + return "non_expr" + + +def _make_expr_term(value): + """Build ExprTermRule wrapping a StubExpression (no parens).""" + return ExprTermRule([StubExpression(value)]) + + +def _make_paren_expr_term(value): + """Build ExprTermRule wrapping a StubExpression in parentheses.""" + return ExprTermRule([LPAR(), StubExpression(value), RPAR()]) + + +def _make_binary_operator(op_str): + """Build BinaryOperatorRule for an operator string.""" + return BinaryOperatorRule([BINARY_OP(op_str)]) + + +def _make_binary_term(op_str, rhs_value): + """Build BinaryTermRule with given operator and RHS value.""" + return BinaryTermRule([_make_binary_operator(op_str), _make_expr_term(rhs_value)]) + + +MINUS_TOKEN = StringToken["MINUS"] # type: ignore[type-arg,name-defined] +NOT_TOKEN = StringToken["NOT"] # type: ignore[type-arg,name-defined] + + +# --- ExprTermRule tests --- + + +class TestExprTermRule(TestCase): + def test_lark_name(self): + self.assertEqual(ExprTermRule.lark_name(), "expr_term") 
+ + def test_construction_without_parens(self): + stub = StubExpression("a") + rule = ExprTermRule([stub]) + self.assertFalse(rule.parentheses) + + def test_construction_without_parens_children_structure(self): + stub = StubExpression("a") + rule = ExprTermRule([stub]) + # children: [None, None, stub, None, None] + self.assertEqual(len(rule.children), 5) + self.assertIsNone(rule.children[0]) + self.assertIsNone(rule.children[1]) + self.assertIs(rule.children[2], stub) + self.assertIsNone(rule.children[3]) + self.assertIsNone(rule.children[4]) + + def test_construction_with_parens(self): + stub = StubExpression("a") + rule = ExprTermRule([LPAR(), stub, RPAR()]) + self.assertTrue(rule.parentheses) + + def test_construction_with_parens_children_structure(self): + stub = StubExpression("a") + lpar = LPAR() + rpar = RPAR() + rule = ExprTermRule([lpar, stub, rpar]) + # children: [LPAR, None, stub, None, RPAR] + self.assertEqual(len(rule.children), 5) + self.assertIs(rule.children[0], lpar) + self.assertIsNone(rule.children[1]) + self.assertIs(rule.children[2], stub) + self.assertIsNone(rule.children[3]) + self.assertIs(rule.children[4], rpar) + + def test_expression_property(self): + stub = StubExpression("a") + rule = ExprTermRule([stub]) + self.assertIs(rule.expression, stub) + + def test_expression_property_with_parens(self): + stub = StubExpression("a") + rule = ExprTermRule([LPAR(), stub, RPAR()]) + self.assertIs(rule.expression, stub) + + def test_serialize_no_parens_delegates_to_inner(self): + rule = _make_expr_term("hello") + self.assertEqual(rule.serialize(), "hello") + + def test_serialize_no_parens_passes_through_int(self): + stub = StubExpression(42) + rule = ExprTermRule([stub]) + self.assertEqual(rule.serialize(), 42) + + def test_serialize_with_parens_wraps_and_dollar(self): + rule = _make_paren_expr_term("a") + result = rule.serialize() + self.assertEqual(result, "${(a)}") + + def test_serialize_with_parens_inside_dollar_string(self): + rule = 
_make_paren_expr_term("a") + ctx = SerializationContext(inside_dollar_string=True) + result = rule.serialize(context=ctx) + # Inside dollar string: wraps in () but NOT in ${} + self.assertEqual(result, "(a)") + + def test_serialize_sets_inside_parentheses_context(self): + """When parenthesized, inner expression should see inside_parentheses=True.""" + seen_context = {} + + class ContextCapture(ExpressionRule): + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ): + seen_context["inside_parentheses"] = context.inside_parentheses + return "x" + + rule = ExprTermRule([LPAR(), ContextCapture([]), RPAR()]) + rule.serialize() + self.assertTrue(seen_context["inside_parentheses"]) + + def test_serialize_no_parens_preserves_inside_parentheses(self): + """Without parens, inside_parentheses passes through from caller context.""" + seen_context = {} + + class ContextCapture(ExpressionRule): + def serialize( + self, options=SerializationOptions(), context=SerializationContext() + ): + seen_context["inside_parentheses"] = context.inside_parentheses + return "x" + + rule = ExprTermRule([ContextCapture([])]) + rule.serialize(context=SerializationContext(inside_parentheses=False)) + self.assertFalse(seen_context["inside_parentheses"]) + + +# --- ConditionalRule tests --- + + +class TestConditionalRule(TestCase): + def _make_conditional(self, cond_val="cond", true_val="yes", false_val="no"): + return ConditionalRule( + [ + StubExpression(cond_val), + QMARK(), + StubExpression(true_val), + COLON(), + StubExpression(false_val), + ] + ) + + def test_lark_name(self): + self.assertEqual(ConditionalRule.lark_name(), "conditional") + + def test_construction_inserts_optional_slots(self): + rule = self._make_conditional() + # Should have 9 children after _insert_optionals at [1, 3, 5, 7] + self.assertEqual(len(rule.children), 9) + + def test_condition_property(self): + cond = StubExpression("cond") + rule = ConditionalRule( + [cond, QMARK(), 
StubExpression("t"), COLON(), StubExpression("f")] + ) + self.assertIs(rule.condition, cond) + + def test_if_true_property(self): + true_expr = StubExpression("yes") + rule = ConditionalRule( + [ + StubExpression("c"), + QMARK(), + true_expr, + COLON(), + StubExpression("f"), + ] + ) + self.assertIs(rule.if_true, true_expr) + + def test_if_false_property(self): + false_expr = StubExpression("no") + rule = ConditionalRule( + [ + StubExpression("c"), + QMARK(), + StubExpression("t"), + COLON(), + false_expr, + ] + ) + self.assertIs(rule.if_false, false_expr) + + def test_serialize_format(self): + rule = self._make_conditional("a", "b", "c") + result = rule.serialize() + self.assertEqual(result, "${a ? b : c}") + + def test_serialize_wraps_in_dollar_string(self): + rule = self._make_conditional("x", "y", "z") + result = rule.serialize() + self.assertTrue(result.startswith("${")) + self.assertTrue(result.endswith("}")) + + def test_serialize_no_double_wrap_inside_dollar_string(self): + rule = self._make_conditional("x", "y", "z") + ctx = SerializationContext(inside_dollar_string=True) + result = rule.serialize(context=ctx) + self.assertEqual(result, "x ? y : z") + + def test_serialize_force_parens_no_parent(self): + """force_operation_parentheses with no parent → no wrapping.""" + rule = self._make_conditional("a", "b", "c") + opts = SerializationOptions(force_operation_parentheses=True) + result = rule.serialize(options=opts) + # No parent, so _wrap_into_parentheses returns unchanged + self.assertEqual(result, "${a ? b : c}") + + def test_serialize_force_parens_with_expression_parent(self): + """force_operation_parentheses with ExpressionRule parent → wraps.""" + rule = self._make_conditional("a", "b", "c") + # Nest inside another expression to set parent + StubExpression("outer", children=[rule]) + opts = SerializationOptions(force_operation_parentheses=True) + result = rule.serialize(options=opts) + self.assertEqual(result, "${(a ? 
b : c)}") + + +# --- BinaryTermRule tests --- + + +class TestBinaryTermRule(TestCase): + def test_lark_name(self): + self.assertEqual(BinaryTermRule.lark_name(), "binary_term") + + def test_construction_inserts_optional(self): + rule = _make_binary_term("+", "b") + # [None, BinaryOperatorRule, None, ExprTermRule] + self.assertEqual(len(rule.children), 4) + self.assertIsNone(rule.children[0]) + self.assertIsNone(rule.children[2]) + + def test_binary_operator_property(self): + op = _make_binary_operator("+") + rhs = _make_expr_term("b") + rule = BinaryTermRule([op, rhs]) + self.assertIs(rule.binary_operator, op) + + def test_expr_term_property(self): + op = _make_binary_operator("+") + rhs = _make_expr_term("b") + rule = BinaryTermRule([op, rhs]) + self.assertIs(rule.expr_term, rhs) + + def test_serialize(self): + rule = _make_binary_term("+", "b") + result = rule.serialize() + self.assertEqual(result, "+ b") + + def test_serialize_equals_operator(self): + rule = _make_binary_term("==", "x") + self.assertEqual(rule.serialize(), "== x") + + def test_serialize_and_operator(self): + rule = _make_binary_term("&&", "y") + self.assertEqual(rule.serialize(), "&& y") + + +# --- BinaryOpRule tests --- + + +class TestBinaryOpRule(TestCase): + def _make_binary_op(self, lhs_val, op_str, rhs_val): + lhs = _make_expr_term(lhs_val) + bt = _make_binary_term(op_str, rhs_val) + return BinaryOpRule([lhs, bt, None]) + + def test_lark_name(self): + self.assertEqual(BinaryOpRule.lark_name(), "binary_op") + + def test_expr_term_property(self): + lhs = _make_expr_term("a") + bt = _make_binary_term("+", "b") + rule = BinaryOpRule([lhs, bt, None]) + self.assertIs(rule.expr_term, lhs) + + def test_binary_term_property(self): + lhs = _make_expr_term("a") + bt = _make_binary_term("+", "b") + rule = BinaryOpRule([lhs, bt, None]) + self.assertIs(rule.binary_term, bt) + + def test_serialize_addition(self): + rule = self._make_binary_op("a", "+", "b") + self.assertEqual(rule.serialize(), "${a + b}") 
+ + def test_serialize_equality(self): + rule = self._make_binary_op("x", "==", "y") + self.assertEqual(rule.serialize(), "${x == y}") + + def test_serialize_and(self): + rule = self._make_binary_op("p", "&&", "q") + self.assertEqual(rule.serialize(), "${p && q}") + + def test_serialize_multiply(self): + rule = self._make_binary_op("a", "*", "b") + self.assertEqual(rule.serialize(), "${a * b}") + + def test_serialize_no_double_wrap_inside_dollar_string(self): + rule = self._make_binary_op("a", "+", "b") + ctx = SerializationContext(inside_dollar_string=True) + result = rule.serialize(context=ctx) + self.assertEqual(result, "a + b") + + def test_serialize_force_parens_no_parent(self): + """No parent → _wrap_into_parentheses returns unchanged.""" + rule = self._make_binary_op("a", "+", "b") + opts = SerializationOptions(force_operation_parentheses=True) + result = rule.serialize(options=opts) + self.assertEqual(result, "${a + b}") + + def test_serialize_force_parens_with_expression_parent(self): + """With ExpressionRule parent → wraps in parens.""" + rule = self._make_binary_op("a", "+", "b") + StubExpression("outer", children=[rule]) + opts = SerializationOptions(force_operation_parentheses=True) + result = rule.serialize(options=opts) + self.assertEqual(result, "${(a + b)}") + + def test_serialize_force_parens_inside_dollar_string_with_parent(self): + """Inside dollar string + parent → parens without extra ${}.""" + rule = self._make_binary_op("a", "+", "b") + StubExpression("outer", children=[rule]) + opts = SerializationOptions(force_operation_parentheses=True) + ctx = SerializationContext(inside_dollar_string=True) + result = rule.serialize(options=opts, context=ctx) + self.assertEqual(result, "(a + b)") + + +# --- UnaryOpRule tests --- + + +class TestUnaryOpRule(TestCase): + def _make_unary(self, op_str, operand_val): + token_cls = MINUS_TOKEN if op_str == "-" else NOT_TOKEN + token = token_cls(op_str) + expr_term = _make_expr_term(operand_val) + return 
UnaryOpRule([token, expr_term]) + + def test_lark_name(self): + self.assertEqual(UnaryOpRule.lark_name(), "unary_op") + + def test_operator_property_minus(self): + rule = self._make_unary("-", "x") + self.assertEqual(rule.operator, "-") + + def test_operator_property_not(self): + rule = self._make_unary("!", "x") + self.assertEqual(rule.operator, "!") + + def test_expr_term_property(self): + expr_term = _make_expr_term("x") + token = MINUS_TOKEN("-") + rule = UnaryOpRule([token, expr_term]) + self.assertIs(rule.expr_term, expr_term) + + def test_serialize_minus(self): + rule = self._make_unary("-", "a") + self.assertEqual(rule.serialize(), "${-a}") + + def test_serialize_not(self): + rule = self._make_unary("!", "flag") + self.assertEqual(rule.serialize(), "${!flag}") + + def test_serialize_no_double_wrap_inside_dollar_string(self): + rule = self._make_unary("-", "x") + ctx = SerializationContext(inside_dollar_string=True) + result = rule.serialize(context=ctx) + self.assertEqual(result, "-x") + + def test_serialize_force_parens_no_parent(self): + rule = self._make_unary("-", "x") + opts = SerializationOptions(force_operation_parentheses=True) + result = rule.serialize(options=opts) + self.assertEqual(result, "${-x}") + + def test_serialize_force_parens_with_expression_parent(self): + rule = self._make_unary("-", "x") + StubExpression("outer", children=[rule]) + opts = SerializationOptions(force_operation_parentheses=True) + result = rule.serialize(options=opts) + self.assertEqual(result, "${(-x)}") + + +# --- ExpressionRule._wrap_into_parentheses tests --- + + +class TestWrapIntoParenthesesMethod(TestCase): + def test_returns_unchanged_when_inside_parentheses(self): + expr = StubExpression("test") + ctx = SerializationContext(inside_parentheses=True) + result = expr._wrap_into_parentheses("${x}", context=ctx) + self.assertEqual(result, "${x}") + + def test_returns_unchanged_when_no_parent(self): + expr = StubExpression("test") + result = 
expr._wrap_into_parentheses("${x}") + self.assertEqual(result, "${x}") + + def test_returns_unchanged_when_parent_not_expression(self): + expr = StubExpression("test") + NonExpressionRule([expr]) + result = expr._wrap_into_parentheses("${x}") + self.assertEqual(result, "${x}") + + def test_wraps_when_parent_is_expression(self): + expr = StubExpression("test") + StubExpression("outer", children=[expr]) + result = expr._wrap_into_parentheses("${x}") + self.assertEqual(result, "${(x)}") + + def test_wraps_plain_string_when_parent_is_expression(self): + expr = StubExpression("test") + StubExpression("outer", children=[expr]) + result = expr._wrap_into_parentheses("a + b") + self.assertEqual(result, "(a + b)") + + def test_expr_term_parent_with_expression_grandparent(self): + """Parent is ExprTermRule, grandparent is ExpressionRule → wraps.""" + inner = StubExpression("test") + expr_term = ExprTermRule([inner]) + # inner is now at expr_term._children[2], parent=expr_term + StubExpression("grandparent", children=[expr_term]) + # expr_term.parent = grandparent (ExpressionRule) + result = inner._wrap_into_parentheses("${x}") + self.assertEqual(result, "${(x)}") + + def test_expr_term_parent_with_non_expression_grandparent(self): + """Parent is ExprTermRule, grandparent is NOT ExpressionRule → no wrap.""" + inner = StubExpression("test") + expr_term = ExprTermRule([inner]) + NonExpressionRule([expr_term]) + result = inner._wrap_into_parentheses("${x}") + self.assertEqual(result, "${x}") + + def test_expr_term_parent_with_no_grandparent(self): + """Parent is ExprTermRule with no parent → no wrap.""" + inner = StubExpression("test") + ExprTermRule([inner]) + result = inner._wrap_into_parentheses("${x}") + self.assertEqual(result, "${x}") diff --git a/test/unit/rules/test_for_expressions.py b/test/unit/rules/test_for_expressions.py new file mode 100644 index 00000000..38cb90ea --- /dev/null +++ b/test/unit/rules/test_for_expressions.py @@ -0,0 +1,416 @@ +# pylint: 
disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.rules.expressions import ExpressionRule +from hcl2.rules.for_expressions import ( + ForIntroRule, + ForCondRule, + ForTupleExprRule, + ForObjectExprRule, +) +from hcl2.rules.literal_rules import IdentifierRule +from hcl2.rules.tokens import ( + NAME, + LSQB, + RSQB, + LBRACE, + RBRACE, + FOR, + IN, + IF, + COMMA, + COLON, + ELLIPSIS, + FOR_OBJECT_ARROW, +) +from hcl2.utils import SerializationOptions, SerializationContext + + +# --- Stubs & helpers --- + + +class StubExpression(ExpressionRule): + """Minimal concrete ExpressionRule that serializes to a fixed string.""" + + def __init__(self, value): + self._stub_value = value + self._last_options = None + super().__init__([], None) + + def serialize(self, options=SerializationOptions(), context=SerializationContext()): + self._last_options = options + return self._stub_value + + +def _make_identifier(name): + return IdentifierRule([NAME(name)]) + + +def _make_for_intro_single(iter_name, iterable_value): + """Build ForIntroRule with a single iterator: for iter_name in iterable :""" + return ForIntroRule( + [ + FOR(), + _make_identifier(iter_name), + IN(), + StubExpression(iterable_value), + COLON(), + ] + ) + + +def _make_for_intro_dual(iter1_name, iter2_name, iterable_value): + """Build ForIntroRule with dual iterators: for iter1, iter2 in iterable :""" + return ForIntroRule( + [ + FOR(), + _make_identifier(iter1_name), + COMMA(), + _make_identifier(iter2_name), + IN(), + StubExpression(iterable_value), + COLON(), + ] + ) + + +def _make_for_cond(value): + """Build ForCondRule: if """ + return ForCondRule([IF(), StubExpression(value)]) + + +# --- ForIntroRule tests --- + + +class TestForIntroRule(TestCase): + def test_lark_name(self): + self.assertEqual(ForIntroRule.lark_name(), "for_intro") + + def test_first_iterator_single(self): + ident = _make_identifier("v") + rule = ForIntroRule([FOR(), ident, IN(), StubExpression("items"), COLON()]) + 
self.assertIs(rule.first_iterator, ident) + + def test_first_iterator_dual(self): + i1 = _make_identifier("k") + i2 = _make_identifier("v") + rule = ForIntroRule( + [FOR(), i1, COMMA(), i2, IN(), StubExpression("items"), COLON()] + ) + self.assertIs(rule.first_iterator, i1) + + def test_second_iterator_none_when_single(self): + rule = _make_for_intro_single("v", "items") + self.assertIsNone(rule.second_iterator) + + def test_second_iterator_present_when_dual(self): + i2 = _make_identifier("v") + rule = ForIntroRule( + [ + FOR(), + _make_identifier("k"), + COMMA(), + i2, + IN(), + StubExpression("items"), + COLON(), + ] + ) + self.assertIs(rule.second_iterator, i2) + + def test_iterable_property(self): + iterable = StubExpression("items") + rule = ForIntroRule([FOR(), _make_identifier("v"), IN(), iterable, COLON()]) + self.assertIs(rule.iterable, iterable) + + def test_serialize_single_iterator(self): + rule = _make_for_intro_single("v", "items") + self.assertEqual(rule.serialize(), "for v in items : ") + + def test_serialize_dual_iterator(self): + rule = _make_for_intro_dual("k", "v", "items") + self.assertEqual(rule.serialize(), "for k, v in items : ") + + def test_children_length(self): + rule = _make_for_intro_single("v", "items") + self.assertEqual(len(rule.children), 12) + + +# --- ForCondRule tests --- + + +class TestForCondRule(TestCase): + def test_lark_name(self): + self.assertEqual(ForCondRule.lark_name(), "for_cond") + + def test_condition_expr_property(self): + cond_expr = StubExpression("cond") + rule = ForCondRule([IF(), cond_expr]) + self.assertIs(rule.condition_expr, cond_expr) + + def test_serialize(self): + rule = _make_for_cond("cond") + self.assertEqual(rule.serialize(), "if cond") + + def test_children_length(self): + rule = _make_for_cond("cond") + self.assertEqual(len(rule.children), 3) + + +# --- ForTupleExprRule tests --- + + +class TestForTupleExprRule(TestCase): + def test_lark_name(self): + self.assertEqual(ForTupleExprRule.lark_name(), 
"for_tuple_expr") + + def test_for_intro_property(self): + intro = _make_for_intro_single("v", "items") + rule = ForTupleExprRule([LSQB(), intro, StubExpression("expr"), RSQB()]) + self.assertIs(rule.for_intro, intro) + + def test_value_expr_property(self): + value_expr = StubExpression("expr") + rule = ForTupleExprRule( + [ + LSQB(), + _make_for_intro_single("v", "items"), + value_expr, + RSQB(), + ] + ) + self.assertIs(rule.value_expr, value_expr) + + def test_condition_none(self): + rule = ForTupleExprRule( + [ + LSQB(), + _make_for_intro_single("v", "items"), + StubExpression("expr"), + RSQB(), + ] + ) + self.assertIsNone(rule.condition) + + def test_condition_present(self): + cond = _make_for_cond("cond") + rule = ForTupleExprRule( + [ + LSQB(), + _make_for_intro_single("v", "items"), + StubExpression("expr"), + cond, + RSQB(), + ] + ) + self.assertIsInstance(rule.condition, ForCondRule) + self.assertIs(rule.condition, cond) + + def test_serialize_without_condition(self): + rule = ForTupleExprRule( + [ + LSQB(), + _make_for_intro_single("v", "items"), + StubExpression("expr"), + RSQB(), + ] + ) + self.assertEqual(rule.serialize(), "${[for v in items : expr]}") + + def test_serialize_with_condition(self): + rule = ForTupleExprRule( + [ + LSQB(), + _make_for_intro_single("v", "items"), + StubExpression("expr"), + _make_for_cond("cond"), + RSQB(), + ] + ) + self.assertEqual(rule.serialize(), "${[for v in items : expr if cond]}") + + def test_serialize_inside_dollar_string(self): + rule = ForTupleExprRule( + [ + LSQB(), + _make_for_intro_single("v", "items"), + StubExpression("expr"), + RSQB(), + ] + ) + ctx = SerializationContext(inside_dollar_string=True) + self.assertEqual(rule.serialize(context=ctx), "[for v in items : expr]") + + +# --- ForObjectExprRule tests --- + + +class TestForObjectExprRule(TestCase): + def test_lark_name(self): + self.assertEqual(ForObjectExprRule.lark_name(), "for_object_expr") + + def test_for_intro_property(self): + intro = 
_make_for_intro_dual("k", "v", "items") + rule = ForObjectExprRule( + [ + LBRACE(), + intro, + StubExpression("key"), + FOR_OBJECT_ARROW(), + StubExpression("value"), + RBRACE(), + ] + ) + self.assertIs(rule.for_intro, intro) + + def test_key_expr_property(self): + key_expr = StubExpression("key") + rule = ForObjectExprRule( + [ + LBRACE(), + _make_for_intro_dual("k", "v", "items"), + key_expr, + FOR_OBJECT_ARROW(), + StubExpression("value"), + RBRACE(), + ] + ) + self.assertIs(rule.key_expr, key_expr) + + def test_value_expr_property(self): + value_expr = StubExpression("value") + rule = ForObjectExprRule( + [ + LBRACE(), + _make_for_intro_dual("k", "v", "items"), + StubExpression("key"), + FOR_OBJECT_ARROW(), + value_expr, + RBRACE(), + ] + ) + self.assertIs(rule.value_expr, value_expr) + + def test_ellipsis_none(self): + rule = ForObjectExprRule( + [ + LBRACE(), + _make_for_intro_dual("k", "v", "items"), + StubExpression("key"), + FOR_OBJECT_ARROW(), + StubExpression("value"), + RBRACE(), + ] + ) + self.assertIsNone(rule.ellipsis) + + def test_ellipsis_present(self): + ellipsis = ELLIPSIS() + rule = ForObjectExprRule( + [ + LBRACE(), + _make_for_intro_dual("k", "v", "items"), + StubExpression("key"), + FOR_OBJECT_ARROW(), + StubExpression("value"), + ellipsis, + RBRACE(), + ] + ) + self.assertIs(rule.ellipsis, ellipsis) + + def test_condition_none(self): + rule = ForObjectExprRule( + [ + LBRACE(), + _make_for_intro_dual("k", "v", "items"), + StubExpression("key"), + FOR_OBJECT_ARROW(), + StubExpression("value"), + RBRACE(), + ] + ) + self.assertIsNone(rule.condition) + + def test_condition_present(self): + cond = _make_for_cond("cond") + rule = ForObjectExprRule( + [ + LBRACE(), + _make_for_intro_dual("k", "v", "items"), + StubExpression("key"), + FOR_OBJECT_ARROW(), + StubExpression("value"), + cond, + RBRACE(), + ] + ) + self.assertIsInstance(rule.condition, ForCondRule) + self.assertIs(rule.condition, cond) + + def test_serialize_basic(self): + rule = 
ForObjectExprRule( + [ + LBRACE(), + _make_for_intro_dual("k", "v", "items"), + StubExpression("key"), + FOR_OBJECT_ARROW(), + StubExpression("value"), + RBRACE(), + ] + ) + self.assertEqual(rule.serialize(), "${{for k, v in items : key => value}}") + + def test_serialize_with_ellipsis(self): + rule = ForObjectExprRule( + [ + LBRACE(), + _make_for_intro_dual("k", "v", "items"), + StubExpression("key"), + FOR_OBJECT_ARROW(), + StubExpression("value"), + ELLIPSIS(), + RBRACE(), + ] + ) + result = rule.serialize() + self.assertIn("...", result) + self.assertEqual(result, "${{for k, v in items : key => value...}}") + + def test_serialize_with_condition(self): + rule = ForObjectExprRule( + [ + LBRACE(), + _make_for_intro_dual("k", "v", "items"), + StubExpression("key"), + FOR_OBJECT_ARROW(), + StubExpression("value"), + _make_for_cond("cond"), + RBRACE(), + ] + ) + result = rule.serialize() + self.assertIn("if cond", result) + self.assertEqual(result, "${{for k, v in items : key => value if cond}}") + + def test_serialize_preserves_caller_options(self): + value_expr = StubExpression("value") + rule = ForObjectExprRule( + [ + LBRACE(), + _make_for_intro_dual("k", "v", "items"), + StubExpression("key"), + FOR_OBJECT_ARROW(), + value_expr, + RBRACE(), + ] + ) + caller_options = SerializationOptions( + with_comments=True, preserve_heredocs=False + ) + rule.serialize(options=caller_options) + # value_expr should receive options with wrap_objects=True but + # all other caller settings preserved + self.assertTrue(value_expr._last_options.wrap_objects) + self.assertTrue(value_expr._last_options.with_comments) + self.assertFalse(value_expr._last_options.preserve_heredocs) diff --git a/test/unit/rules/test_functions.py b/test/unit/rules/test_functions.py new file mode 100644 index 00000000..6d3146c0 --- /dev/null +++ b/test/unit/rules/test_functions.py @@ -0,0 +1,146 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.rules.expressions import 
ExpressionRule +from hcl2.rules.functions import ( + ArgumentsRule, + FunctionCallRule, +) +from hcl2.rules.literal_rules import IdentifierRule +from hcl2.rules.tokens import NAME, COMMA, ELLIPSIS, LPAR, RPAR, StringToken +from hcl2.utils import SerializationOptions, SerializationContext + + +# --- Stubs & helpers --- + + +class StubExpression(ExpressionRule): + """Minimal concrete ExpressionRule that serializes to a fixed value.""" + + def __init__(self, value): + self._stub_value = value + super().__init__([], None) + + def serialize(self, options=SerializationOptions(), context=SerializationContext()): + return self._stub_value + + +def _make_identifier(name): + return IdentifierRule([NAME(name)]) + + +def _make_arguments(values, ellipsis=False): + """Build an ArgumentsRule from a list of stub values. + + values: list of serialization values for StubExpression args + ellipsis: if True, append an ELLIPSIS token + """ + children = [] + for i, val in enumerate(values): + if i > 0: + children.append(COMMA()) + children.append(StubExpression(val)) + if ellipsis: + children.append(ELLIPSIS()) + return ArgumentsRule(children) + + +def _make_function_call(func_names, arg_values=None, ellipsis=False): + """Build a FunctionCallRule. + + func_names: list of identifier name strings (e.g. 
["func"] or ["ns", "mod", "func"]) + arg_values: optional list of stub values for arguments + ellipsis: if True, pass ellipsis to arguments + """ + children = [_make_identifier(name) for name in func_names] + children.append(LPAR()) + if arg_values is not None: + children.append(_make_arguments(arg_values, ellipsis)) + children.append(RPAR()) + return FunctionCallRule(children) + + +# --- ArgumentsRule tests --- + + +class TestArgumentsRule(TestCase): + def test_lark_name(self): + self.assertEqual(ArgumentsRule.lark_name(), "arguments") + + def test_has_ellipsis_false(self): + rule = _make_arguments(["a"]) + self.assertFalse(rule.has_ellipsis) + + def test_has_ellipsis_true(self): + rule = _make_arguments(["a", "b"], ellipsis=True) + self.assertTrue(rule.has_ellipsis) + + def test_arguments_single(self): + rule = _make_arguments(["a"]) + self.assertEqual(len(rule.arguments), 1) + + def test_arguments_multiple(self): + rule = _make_arguments(["a", "b", "c"]) + self.assertEqual(len(rule.arguments), 3) + + def test_serialize_single_arg(self): + rule = _make_arguments(["a"]) + self.assertEqual(rule.serialize(), "a") + + def test_serialize_with_ellipsis(self): + rule = _make_arguments(["a", "b"], ellipsis=True) + self.assertEqual(rule.serialize(), "a, b ...") + + +# --- FunctionCallRule tests --- + + +class TestFunctionCallRule(TestCase): + def test_lark_name(self): + self.assertEqual(FunctionCallRule.lark_name(), "function_call") + + def test_identifiers_single(self): + rule = _make_function_call(["func"]) + self.assertEqual(len(rule.identifiers), 1) + + def test_identifiers_multiple(self): + rule = _make_function_call(["ns", "mod", "func"]) + self.assertEqual(len(rule.identifiers), 3) + + def test_arguments_property_present(self): + rule = _make_function_call(["func"], ["a"]) + self.assertIsInstance(rule.arguments, ArgumentsRule) + + def test_arguments_property_none(self): + rule = _make_function_call(["func"]) + self.assertIsNone(rule.arguments) + + def 
test_serialize_simple_no_args(self): + rule = _make_function_call(["func"]) + self.assertEqual(rule.serialize(), "${func()}") + + def test_serialize_simple_with_args(self): + rule = _make_function_call(["func"], ["a", "b"]) + self.assertEqual(rule.serialize(), "${func(a, b)}") + + def test_serialize_inside_dollar_string(self): + rule = _make_function_call(["func"], ["a"]) + ctx = SerializationContext(inside_dollar_string=True) + self.assertEqual(rule.serialize(context=ctx), "func(a)") + + def test_arguments_with_colons_tokens(self): + """FunctionCallRule with COLONS tokens (provider syntax) should still find arguments.""" + COLONS = StringToken["COLONS"] + children = [ + _make_identifier("provider"), + COLONS("::"), + _make_identifier("func"), + COLONS("::"), + _make_identifier("aa"), + LPAR(), + _make_arguments([5]), + RPAR(), + ] + rule = FunctionCallRule(children) + self.assertIsNotNone(rule.arguments) + self.assertEqual(rule.serialize(), "${provider::func::aa(5)}") diff --git a/test/unit/rules/test_literal_rules.py b/test/unit/rules/test_literal_rules.py new file mode 100644 index 00000000..9a834e14 --- /dev/null +++ b/test/unit/rules/test_literal_rules.py @@ -0,0 +1,126 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.rules.literal_rules import ( + KeywordRule, + IdentifierRule, + IntLitRule, + FloatLitRule, + BinaryOperatorRule, +) +from hcl2.rules.tokens import NAME, BINARY_OP, IntLiteral, FloatLiteral +from hcl2.utils import SerializationContext, SerializationOptions + + +class TestKeywordRule(TestCase): + def test_lark_name(self): + self.assertEqual(KeywordRule.lark_name(), "keyword") + + def test_token_property(self): + token = NAME("true") + rule = KeywordRule([token]) + self.assertIs(rule.token, token) + + def test_serialize(self): + rule = KeywordRule([NAME("true")]) + self.assertEqual(rule.serialize(), "true") + + +class TestIdentifierRule(TestCase): + def test_lark_name(self): + 
self.assertEqual(IdentifierRule.lark_name(), "identifier") + + def test_serialize(self): + rule = IdentifierRule([NAME("my_var")]) + self.assertEqual(rule.serialize(), "my_var") + + def test_token_property(self): + token = NAME("foo") + rule = IdentifierRule([token]) + self.assertIs(rule.token, token) + + +class TestIntLitRule(TestCase): + def test_lark_name(self): + self.assertEqual(IntLitRule.lark_name(), "int_lit") + + def test_serialize_returns_int(self): + rule = IntLitRule([IntLiteral("42")]) + result = rule.serialize() + self.assertEqual(result, 42) + self.assertIsInstance(result, int) + + +class TestFloatLitRule(TestCase): + def test_lark_name(self): + self.assertEqual(FloatLitRule.lark_name(), "float_lit") + + def test_serialize_returns_float(self): + rule = FloatLitRule([FloatLiteral("3.14")]) + result = rule.serialize() + self.assertAlmostEqual(result, 3.14) + self.assertIsInstance(result, float) + + def test_serialize_scientific_notation_as_dollar_string(self): + """Scientific notation is preserved as ${...} to survive dict round-trip.""" + rule = FloatLitRule([FloatLiteral("1.23e5")]) + self.assertEqual(rule.serialize(), "${1.23e5}") + + def test_serialize_scientific_negative_exponent(self): + rule = FloatLitRule([FloatLiteral("9.87e-3")]) + self.assertEqual(rule.serialize(), "${9.87e-3}") + + def test_serialize_scientific_inside_dollar_string(self): + """Inside a dollar string context, return raw value without wrapping.""" + rule = FloatLitRule([FloatLiteral("1.23e5")]) + ctx = SerializationContext(inside_dollar_string=True) + self.assertEqual(rule.serialize(context=ctx), "1.23e5") + + def test_serialize_regular_float_not_wrapped(self): + """Non-scientific floats should remain plain Python floats.""" + rule = FloatLitRule([FloatLiteral("123.456")]) + result = rule.serialize() + self.assertEqual(result, 123.456) + self.assertIsInstance(result, float) + + def test_serialize_scientific_disabled(self): + """With preserve_scientific_notation=False, returns 
plain float.""" + rule = FloatLitRule([FloatLiteral("1.23e5")]) + opts = SerializationOptions(preserve_scientific_notation=False) + result = rule.serialize(options=opts) + self.assertEqual(result, 123000.0) + self.assertIsInstance(result, float) + + +class TestBinaryOperatorRule(TestCase): + def test_lark_name(self): + self.assertEqual(BinaryOperatorRule.lark_name(), "binary_operator") + + def test_serialize_plus(self): + rule = BinaryOperatorRule([BINARY_OP("+")]) + self.assertEqual(rule.serialize(), "+") + + def test_serialize_equals(self): + rule = BinaryOperatorRule([BINARY_OP("==")]) + self.assertEqual(rule.serialize(), "==") + + def test_serialize_and(self): + rule = BinaryOperatorRule([BINARY_OP("&&")]) + self.assertEqual(rule.serialize(), "&&") + + def test_serialize_or(self): + rule = BinaryOperatorRule([BINARY_OP("||")]) + self.assertEqual(rule.serialize(), "||") + + def test_serialize_gt(self): + rule = BinaryOperatorRule([BINARY_OP(">")]) + self.assertEqual(rule.serialize(), ">") + + def test_serialize_multiply(self): + rule = BinaryOperatorRule([BINARY_OP("*")]) + self.assertEqual(rule.serialize(), "*") + + def test_token_property(self): + token = BINARY_OP("+") + rule = BinaryOperatorRule([token]) + self.assertIs(rule.token, token) diff --git a/test/unit/rules/test_strings.py b/test/unit/rules/test_strings.py new file mode 100644 index 00000000..e142de5b --- /dev/null +++ b/test/unit/rules/test_strings.py @@ -0,0 +1,356 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.rules.expressions import ExpressionRule +from hcl2.rules.strings import ( + InterpolationRule, + StringPartRule, + StringRule, + HeredocTemplateRule, + HeredocTrimTemplateRule, +) +from hcl2.rules.tokens import ( + INTERP_START, + RBRACE, + DBLQUOTE, + STRING_CHARS, + ESCAPED_INTERPOLATION, + HEREDOC_TEMPLATE, + HEREDOC_TRIM_TEMPLATE, +) +from hcl2.utils import SerializationOptions, SerializationContext + + +# --- Stubs --- + + +class 
StubExpression(ExpressionRule): + """Minimal ExpressionRule that serializes to a fixed string.""" + + def __init__(self, value): + self._stub_value = value + super().__init__([], None) + + def serialize(self, options=SerializationOptions(), context=SerializationContext()): + return self._stub_value + + +# --- Helpers --- + + +def _make_string_part_chars(text): + return StringPartRule([STRING_CHARS(text)]) + + +def _make_string_part_escaped(text): + return StringPartRule([ESCAPED_INTERPOLATION(text)]) + + +def _make_string_part_interpolation(expr_value): + interp = InterpolationRule([INTERP_START(), StubExpression(expr_value), RBRACE()]) + return StringPartRule([interp]) + + +def _make_string(parts): + """Build StringRule from a list of StringPartRule children.""" + return StringRule([DBLQUOTE(), *parts, DBLQUOTE()]) + + +# --- InterpolationRule tests --- + + +class TestInterpolationRule(TestCase): + def test_lark_name(self): + self.assertEqual(InterpolationRule.lark_name(), "interpolation") + + def test_expression_property(self): + expr = StubExpression("var.name") + rule = InterpolationRule([INTERP_START(), expr, RBRACE()]) + self.assertIs(rule.expression, expr) + + def test_serialize_wraps_in_dollar_string(self): + rule = InterpolationRule([INTERP_START(), StubExpression("var.name"), RBRACE()]) + self.assertEqual(rule.serialize(), "${var.name}") + + def test_serialize_idempotent_if_already_dollar(self): + rule = InterpolationRule([INTERP_START(), StubExpression("${x}"), RBRACE()]) + self.assertEqual(rule.serialize(), "${x}") + + def test_serialize_expression_result(self): + rule = InterpolationRule([INTERP_START(), StubExpression("a + b"), RBRACE()]) + self.assertEqual(rule.serialize(), "${a + b}") + + +# --- StringPartRule tests --- + + +class TestStringPartRule(TestCase): + def test_lark_name(self): + self.assertEqual(StringPartRule.lark_name(), "string_part") + + def test_content_property_string_chars(self): + token = STRING_CHARS("hello") + rule = 
StringPartRule([token]) + self.assertIs(rule.content, token) + + def test_serialize_string_chars(self): + rule = _make_string_part_chars("hello world") + self.assertEqual(rule.serialize(), "hello world") + + def test_serialize_escaped_interpolation(self): + rule = _make_string_part_escaped("$${aws:username}") + self.assertEqual(rule.serialize(), "$${aws:username}") + + def test_serialize_interpolation(self): + rule = _make_string_part_interpolation("var.name") + self.assertEqual(rule.serialize(), "${var.name}") + + def test_content_property_interpolation(self): + interp = InterpolationRule([INTERP_START(), StubExpression("x"), RBRACE()]) + rule = StringPartRule([interp]) + self.assertIs(rule.content, interp) + + +# --- StringRule tests --- + + +class TestStringRule(TestCase): + def test_lark_name(self): + self.assertEqual(StringRule.lark_name(), "string") + + def test_string_parts_property(self): + p1 = _make_string_part_chars("hello") + p2 = _make_string_part_chars(" world") + rule = _make_string([p1, p2]) + self.assertEqual(rule.string_parts, [p1, p2]) + + def test_string_parts_empty(self): + rule = _make_string([]) + self.assertEqual(rule.string_parts, []) + + def test_serialize_plain_string(self): + rule = _make_string([_make_string_part_chars("hello")]) + self.assertEqual(rule.serialize(), '"hello"') + + def test_serialize_empty_string(self): + rule = _make_string([]) + self.assertEqual(rule.serialize(), '""') + + def test_serialize_concatenated_parts(self): + rule = _make_string( + [ + _make_string_part_chars("prefix:"), + _make_string_part_interpolation("var.name"), + _make_string_part_chars("-suffix"), + ] + ) + self.assertEqual(rule.serialize(), '"prefix:${var.name}-suffix"') + + def test_serialize_escaped_and_interpolation(self): + rule = _make_string( + [ + _make_string_part_interpolation("bar"), + _make_string_part_escaped("$${baz:bat}"), + ] + ) + self.assertEqual(rule.serialize(), '"${bar}$${baz:bat}"') + + def test_serialize_only_interpolation(self): 
+ rule = _make_string([_make_string_part_interpolation("x")]) + self.assertEqual(rule.serialize(), '"${x}"') + + def test_serialize_strip_string_quotes(self): + rule = _make_string([_make_string_part_chars("hello")]) + opts = SerializationOptions(strip_string_quotes=True) + self.assertEqual(rule.serialize(opts), "hello") + + def test_serialize_strip_string_quotes_empty(self): + rule = _make_string([]) + opts = SerializationOptions(strip_string_quotes=True) + self.assertEqual(rule.serialize(opts), "") + + def test_serialize_strip_string_quotes_with_interpolation(self): + rule = _make_string( + [ + _make_string_part_chars("prefix:"), + _make_string_part_interpolation("var.name"), + ] + ) + opts = SerializationOptions(strip_string_quotes=True) + self.assertEqual(rule.serialize(opts), "prefix:${var.name}") + + +# --- HeredocTemplateRule tests --- + + +class TestHeredocTemplateRule(TestCase): + def test_lark_name(self): + self.assertEqual(HeredocTemplateRule.lark_name(), "heredoc_template") + + def test_heredoc_property(self): + token = HEREDOC_TEMPLATE("< str: + return "test_inline" + + def serialize(self, options=SerializationOptions(), context=SerializationContext()): + return "test" + + +def _make_nlc(text): + """Helper: build NewLineOrCommentRule from a string.""" + return NewLineOrCommentRule([NL_OR_COMMENT(text)]) + + +# --- Tests --- + + +class TestNewLineOrCommentRule(TestCase): + def test_lark_name(self): + self.assertEqual(NewLineOrCommentRule.lark_name(), "new_line_or_comment") + + def test_serialize_newline(self): + rule = _make_nlc("\n") + self.assertEqual(rule.serialize(), "\n") + + def test_serialize_line_comment(self): + rule = _make_nlc("// this is a comment\n") + self.assertEqual(rule.serialize(), "// this is a comment\n") + + def test_serialize_hash_comment(self): + rule = _make_nlc("# hash comment\n") + self.assertEqual(rule.serialize(), "# hash comment\n") + + def test_to_list_bare_newline_returns_none(self): + rule = _make_nlc("\n") + 
self.assertIsNone(rule.to_list()) + + def test_to_list_line_comment(self): + rule = _make_nlc("// my comment\n") + result = rule.to_list() + self.assertEqual(result, [{"value": "my comment"}]) + + def test_to_list_hash_comment(self): + rule = _make_nlc("# my comment\n") + result = rule.to_list() + self.assertEqual(result, [{"value": "my comment"}]) + + def test_to_list_block_comment(self): + rule = _make_nlc("/* block comment */\n") + result = rule.to_list() + self.assertEqual(result, [{"value": "block comment"}]) + + def test_to_list_line_comment_ending_in_block_close(self): + """A // comment ending in */ should preserve the */ suffix.""" + rule = _make_nlc("// comment ending in */\n") + result = rule.to_list() + self.assertEqual(result, [{"value": "comment ending in */"}]) + + def test_to_list_hash_comment_ending_in_block_close(self): + """A # comment ending in */ should preserve the */ suffix.""" + rule = _make_nlc("# comment ending in */\n") + result = rule.to_list() + self.assertEqual(result, [{"value": "comment ending in */"}]) + + def test_to_list_multiline_block_comment(self): + """A multiline block comment should be a single value.""" + rule = _make_nlc("/* \nline one\nline two\n*/\n") + result = rule.to_list() + self.assertEqual(result, [{"value": "line one\nline two"}]) + + def test_to_list_multiple_comments(self): + rule = _make_nlc("// first\n// second\n") + result = rule.to_list() + self.assertIn({"value": "first"}, result) + self.assertIn({"value": "second"}, result) + + def test_token_property(self): + token = NL_OR_COMMENT("\n") + rule = NewLineOrCommentRule([token]) + self.assertIs(rule.token, token) + + +class TestInlineCommentMixIn(TestCase): + def test_insert_optionals_inserts_none_where_no_comment(self): + + token = NAME("x") + children = [token, NAME("y")] + mixin = ConcreteInlineComment.__new__(ConcreteInlineComment) + mixin._insert_optionals(children, [1]) + # Should have inserted None at index 1, pushing NAME("y") to index 2 + 
self.assertIsNone(children[1]) + self.assertEqual(len(children), 3) + + def test_insert_optionals_leaves_comment_in_place(self): + comment = _make_nlc("// comment\n") + + children = [NAME("x"), comment] + mixin = ConcreteInlineComment.__new__(ConcreteInlineComment) + mixin._insert_optionals(children, [1]) + # Should NOT insert None since index 1 is already a NewLineOrCommentRule + self.assertIs(children[1], comment) + self.assertEqual(len(children), 2) + + def test_insert_optionals_handles_index_error(self): + children = [_make_nlc("\n")] + mixin = ConcreteInlineComment.__new__(ConcreteInlineComment) + mixin._insert_optionals(children, [3]) + # Should insert None at index 3 + self.assertEqual(len(children), 2) + self.assertIsNone(children[1]) + + def test_inline_comments_collects_from_children(self): + comment = _make_nlc("// hello\n") + + rule = ConcreteInlineComment([NAME("x"), comment]) + result = rule.inline_comments() + self.assertEqual(result, [{"value": "hello"}]) + + def test_inline_comments_skips_bare_newlines(self): + newline = _make_nlc("\n") + + rule = ConcreteInlineComment([NAME("x"), newline]) + result = rule.inline_comments() + self.assertEqual(result, []) + + def test_inline_comments_recursive(self): + comment = _make_nlc("// inner\n") + inner = ConcreteInlineComment([comment]) + outer = ConcreteInlineComment([inner]) + result = outer.inline_comments() + self.assertEqual(result, [{"value": "inner"}]) + + def test_inline_comments_empty(self): + + rule = ConcreteInlineComment([NAME("x")]) + result = rule.inline_comments() + self.assertEqual(result, []) diff --git a/test/unit/test_api.py b/test/unit/test_api.py new file mode 100644 index 00000000..c39e6ae4 --- /dev/null +++ b/test/unit/test_api.py @@ -0,0 +1,301 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from io import StringIO +from unittest import TestCase + +from lark.tree import Tree + +from hcl2.api import ( + load, + loads, + dump, + dumps, + parse, + parses, + parse_to_tree, + 
parses_to_tree, + from_dict, + from_json, + reconstruct, + transform, + serialize, + query, +) +from hcl2.deserializer import DeserializerOptions +from hcl2.formatter import FormatterOptions +from hcl2.rules.base import StartRule +from hcl2.utils import SerializationOptions + + +SIMPLE_HCL = "x = 5\n" +SIMPLE_DICT = {"x": 5} + +BLOCK_HCL = 'resource "aws_instance" "example" {\n ami = "abc-123"\n}\n' + + +class TestLoads(TestCase): + def test_simple_attribute(self): + result = loads(SIMPLE_HCL) + self.assertEqual(result["x"], 5) + + def test_returns_dict(self): + result = loads(SIMPLE_HCL) + self.assertIsInstance(result, dict) + + def test_with_serialization_options(self): + result = loads( + SIMPLE_HCL, serialization_options=SerializationOptions(with_comments=False) + ) + self.assertIsInstance(result, dict) + self.assertEqual(result["x"], 5) + + def test_with_meta_option(self): + result = loads( + BLOCK_HCL, serialization_options=SerializationOptions(with_meta=True) + ) + self.assertIn("resource", result) + # Verify the option is accepted and produces a dict with expected content + self.assertIsInstance(result, dict) + + def test_block_parsing(self): + result = loads(BLOCK_HCL) + self.assertIn("resource", result) + + def test_strip_string_quotes(self): + result = loads( + BLOCK_HCL, + serialization_options=SerializationOptions( + strip_string_quotes=True, explicit_blocks=False + ), + ) + resource_list = result["resource"] + self.assertEqual(len(resource_list), 1) + block = resource_list[0] + # Block label should have no surrounding quotes + self.assertIn("aws_instance", block) + inner = block["aws_instance"] + self.assertIn("example", inner) + body = inner["example"] + # Attribute value should have no surrounding quotes + self.assertEqual(body["ami"], "abc-123") + # No __is_block__ marker + self.assertNotIn("__is_block__", body) + + +class TestLoad(TestCase): + def test_from_file(self): + f = StringIO(SIMPLE_HCL) + result = load(f) + self.assertEqual(result["x"], 
5) + + def test_with_serialization_options(self): + f = StringIO(SIMPLE_HCL) + result = load( + f, serialization_options=SerializationOptions(with_comments=False) + ) + self.assertEqual(result["x"], 5) + + +class TestDumps(TestCase): + def test_simple_attribute(self): + result = dumps(SIMPLE_DICT) + self.assertIsInstance(result, str) + self.assertIn("x", result) + self.assertIn("5", result) + + def test_dumps_contains_key_and_value(self): + result = dumps(SIMPLE_DICT) + self.assertIn("x", result) + self.assertIn("5", result) + + def test_roundtrip(self): + result = loads(dumps(SIMPLE_DICT)) + self.assertEqual(result, SIMPLE_DICT) + + def test_with_deserializer_options(self): + result = dumps(SIMPLE_DICT, deserializer_options=DeserializerOptions()) + self.assertIsInstance(result, str) + + def test_with_formatter_options(self): + result = dumps(SIMPLE_DICT, formatter_options=FormatterOptions()) + self.assertIsInstance(result, str) + + +class TestDump(TestCase): + def test_writes_to_file(self): + f = StringIO() + dump(SIMPLE_DICT, f) + output = f.getvalue() + self.assertIn("x", output) + self.assertIn("5", output) + + +class TestParsesToTree(TestCase): + def test_returns_lark_tree(self): + result = parses_to_tree(SIMPLE_HCL) + self.assertIsInstance(result, Tree) + + def test_tree_has_start_rule(self): + result = parses_to_tree(SIMPLE_HCL) + self.assertEqual(result.data, "start") + + +class TestParseToTree(TestCase): + def test_from_file(self): + f = StringIO(SIMPLE_HCL) + result = parse_to_tree(f) + self.assertIsInstance(result, Tree) + + +class TestParses(TestCase): + def test_returns_start_rule(self): + result = parses(SIMPLE_HCL) + self.assertIsInstance(result, StartRule) + + def test_discard_comments_false(self): + hcl = "# comment\nx = 5\n" + result = parses(hcl, discard_comments=False) + serialized = serialize(result) + self.assertIn("__comments__", serialized) + + def test_discard_comments_true(self): + hcl = "# comment\nx = 5\n" + result = parses(hcl, 
discard_comments=True) + serialized = serialize(result) + self.assertNotIn("__comments__", serialized) + + +class TestParse(TestCase): + def test_from_file(self): + f = StringIO(SIMPLE_HCL) + result = parse(f) + self.assertIsInstance(result, StartRule) + + def test_discard_comments(self): + f = StringIO("# comment\nx = 5\n") + result = parse(f, discard_comments=True) + serialized = serialize(result) + self.assertNotIn("__comments__", serialized) + + +class TestTransform(TestCase): + def test_transforms_lark_tree(self): + lark_tree = parses_to_tree(SIMPLE_HCL) + result = transform(lark_tree) + self.assertIsInstance(result, StartRule) + + def test_discard_comments(self): + lark_tree = parses_to_tree("# comment\nx = 5\n") + result = transform(lark_tree, discard_comments=True) + serialized = serialize(result) + self.assertNotIn("__comments__", serialized) + + +class TestSerialize(TestCase): + def test_returns_dict(self): + tree = parses(SIMPLE_HCL) + result = serialize(tree) + self.assertIsInstance(result, dict) + self.assertEqual(result["x"], 5) + + def test_with_options(self): + tree = parses(SIMPLE_HCL) + result = serialize( + tree, serialization_options=SerializationOptions(with_comments=False) + ) + self.assertIsInstance(result, dict) + + def test_none_options_uses_defaults(self): + tree = parses(SIMPLE_HCL) + result = serialize(tree, serialization_options=None) + self.assertEqual(result["x"], 5) + + +class TestFromDict(TestCase): + def test_returns_start_rule(self): + result = from_dict(SIMPLE_DICT) + self.assertIsInstance(result, StartRule) + + def test_roundtrip(self): + tree = from_dict(SIMPLE_DICT) + result = serialize(tree) + self.assertEqual(result["x"], 5) + + def test_without_formatting(self): + result = from_dict(SIMPLE_DICT, apply_format=False) + self.assertIsInstance(result, StartRule) + + def test_with_deserializer_options(self): + result = from_dict(SIMPLE_DICT, deserializer_options=DeserializerOptions()) + self.assertIsInstance(result, StartRule) + 
+ def test_with_formatter_options(self): + result = from_dict(SIMPLE_DICT, formatter_options=FormatterOptions()) + self.assertIsInstance(result, StartRule) + + +class TestFromJson(TestCase): + def test_returns_start_rule(self): + result = from_json('{"x": 5}') + self.assertIsInstance(result, StartRule) + + def test_roundtrip(self): + tree = from_json('{"x": 5}') + result = serialize(tree) + self.assertEqual(result["x"], 5) + + def test_without_formatting(self): + result = from_json('{"x": 5}', apply_format=False) + self.assertIsInstance(result, StartRule) + + +class TestReconstruct(TestCase): + def test_from_start_rule(self): + tree = parses(SIMPLE_HCL) + result = reconstruct(tree) + self.assertIsInstance(result, str) + self.assertIn("x", result) + + def test_from_lark_tree(self): + lark_tree = parses_to_tree(SIMPLE_HCL) + result = reconstruct(lark_tree) + self.assertIsInstance(result, str) + self.assertIn("x", result) + + def test_roundtrip(self): + tree = parses(SIMPLE_HCL) + hcl_text = reconstruct(tree) + reparsed = loads(hcl_text) + self.assertEqual(reparsed["x"], 5) + + +class TestErrorPaths(TestCase): + def test_loads_raises_on_invalid_hcl(self): + with self.assertRaises(Exception): + loads("this is {{{{ not valid hcl") + + def test_dumps_on_non_dict_raises_type_error(self): + with self.assertRaises(TypeError): + dumps("not a dict") + + def test_from_json_raises_on_invalid_json(self): + with self.assertRaises(Exception): + from_json("{not valid json") + + +class TestQuery(TestCase): + def test_query_string(self): + from hcl2.query.body import DocumentView + + result = query(SIMPLE_HCL) + self.assertIsInstance(result, DocumentView) + attr = result.attribute("x") + self.assertIsNotNone(attr) + + def test_query_file_object(self): + from hcl2.query.body import DocumentView + + f = StringIO(SIMPLE_HCL) + result = query(f) + self.assertIsInstance(result, DocumentView) + attr = result.attribute("x") + self.assertIsNotNone(attr) diff --git a/test/unit/test_builder.py 
b/test/unit/test_builder.py index 2ce0cfed..8bcd76c4 100644 --- a/test/unit/test_builder.py +++ b/test/unit/test_builder.py @@ -1,110 +1,153 @@ -# pylint:disable=C0116 - -"""Test building an HCL file from scratch""" - -from pathlib import Path +# pylint: disable=C0103,C0114,C0115,C0116 from unittest import TestCase -import hcl2 -import hcl2.builder - - -HELPERS_DIR = Path(__file__).absolute().parent.parent / "helpers" -HCL2_DIR = HELPERS_DIR / "terraform-config" -JSON_DIR = HELPERS_DIR / "terraform-config-json" -HCL2_FILES = [str(file.relative_to(HCL2_DIR)) for file in HCL2_DIR.iterdir()] - - -class TestBuilder(TestCase): - """Test building a variety of hcl files""" - - # print any differences fully to the console - maxDiff = None - - def test_build_blocks_tf(self): - nested_builder = hcl2.Builder() - nested_builder.block("nested_block_1", ["a"], foo="bar") - nested_builder.block("nested_block_1", ["a", "b"], bar="foo") - nested_builder.block("nested_block_1", foobar="barfoo") - nested_builder.block("nested_block_2", barfoo="foobar") - - builder = hcl2.Builder() - builder.block("block", a=1) - builder.block("block", ["label"], __nested_builder__=nested_builder, b=2) - - self.compare_filenames(builder, "blocks.tf") - - def test_build_escapes_tf(self): - builder = hcl2.Builder() - - builder.block("block", ["block_with_newlines"], a="line1\nline2") - - self.compare_filenames(builder, "escapes.tf") - - def test_locals_embdedded_condition_tf(self): - builder = hcl2.Builder() - - builder.block( - "locals", - terraform={ - "channels": "${(local.running_in_ci ? 
local.ci_channels : local.local_channels)}", - "authentication": [], - "foo": None, - }, +from hcl2.builder import Builder +from hcl2.const import IS_BLOCK + + +class TestBuilderAttributes(TestCase): + def test_empty_builder(self): + b = Builder() + result = b.build() + self.assertIn(IS_BLOCK, result) + self.assertTrue(result[IS_BLOCK]) + + def test_with_attributes(self): + b = Builder({"key": "value", "count": 3}) + result = b.build() + self.assertEqual(result["key"], "value") + self.assertEqual(result["count"], 3) + + def test_is_block_marker_present(self): + b = Builder({"x": 1}) + result = b.build() + self.assertTrue(result[IS_BLOCK]) + + +class TestBuilderBlock(TestCase): + def test_simple_block(self): + b = Builder() + b.block("resource") + result = b.build() + self.assertIn("resource", result) + self.assertEqual(len(result["resource"]), 1) + + def test_block_with_labels(self): + b = Builder() + b.block("resource", labels=["aws_instance", "example"]) + result = b.build() + block_entry = result["resource"][0] + self.assertIn("aws_instance", block_entry) + inner = block_entry["aws_instance"] + self.assertIn("example", inner) + + def test_block_with_attributes(self): + b = Builder() + b.block("resource", labels=["type"], ami="abc-123") + result = b.build() + block = result["resource"][0]["type"] + self.assertEqual(block["ami"], "abc-123") + + def test_multiple_blocks_same_type(self): + b = Builder() + b.block("resource", labels=["type_a"]) + b.block("resource", labels=["type_b"]) + result = b.build() + self.assertEqual(len(result["resource"]), 2) + + def test_multiple_block_types(self): + b = Builder() + b.block("resource") + b.block("data") + result = b.build() + self.assertIn("resource", result) + self.assertIn("data", result) + + def test_block_returns_builder(self): + b = Builder() + child = b.block("resource") + self.assertIsInstance(child, Builder) + + def test_block_child_attributes(self): + b = Builder() + child = b.block("resource", labels=["type"]) + 
child.attributes["nested_key"] = "nested_val" + # Rebuild to pick up the changes + result = b.build() + block = result["resource"][0]["type"] + self.assertEqual(block["nested_key"], "nested_val") + + def test_self_reference_raises(self): + b = Builder() + with self.assertRaises(ValueError): + b.block("resource", __nested_builder__=b) + + +class TestBuilderNestedBlocks(TestCase): + def test_nested_builder(self): + b = Builder() + inner = Builder() + inner.block("provisioner", labels=["local-exec"], command="echo hello") + b.block("resource", labels=["type"], __nested_builder__=inner) + result = b.build() + block = result["resource"][0]["type"] + self.assertIn("provisioner", block) + + def test_nested_blocks_merged(self): + b = Builder() + inner = Builder() + inner.block("sub_block", x=1) + inner.block("sub_block", x=2) + b.block("resource", __nested_builder__=inner) + result = b.build() + block = result["resource"][0] + self.assertEqual(len(block["sub_block"]), 2) + + +class TestBuilderBlockMarker(TestCase): + def test_block_marker_is_is_block(self): + """Verify IS_BLOCK marker is used (not __start_line__/__end_line__).""" + b = Builder({"x": 1}) + result = b.build() + self.assertIn(IS_BLOCK, result) + self.assertTrue(result[IS_BLOCK]) + self.assertNotIn("__start_line__", result) + self.assertNotIn("__end_line__", result) + + def test_nested_blocks_skip_is_block_key(self): + """_add_nested_blocks should skip IS_BLOCK when merging.""" + b = Builder() + inner = Builder() + inner.block("sub", val=1) + b.block("parent", __nested_builder__=inner) + result = b.build() + parent_block = result["parent"][0] + # sub blocks should be present, but IS_BLOCK from inner should not leak as a list + self.assertIn("sub", parent_block) + # IS_BLOCK should be a bool marker, not a list + self.assertTrue(parent_block[IS_BLOCK]) + + +class TestBuilderIntegration(TestCase): + def test_full_document(self): + doc = Builder() + doc.block( + "resource", + labels=["aws_instance", "web"], + 
ami="ami-12345", + instance_type="t2.micro", ) - - self.compare_filenames(builder, "locals_embedded_condition.tf") - - def test_locals_embedded_function_tf(self): - builder = hcl2.Builder() - - function_test = ( - "${var.basename}-${var.forwarder_function_name}_" - '${md5("${var.vpc_id}${data.aws_region.current.name}")}' + doc.block( + "resource", + labels=["aws_s3_bucket", "data"], + bucket="my-bucket", ) - builder.block("locals", function_test=function_test) - - self.compare_filenames(builder, "locals_embedded_function.tf") + result = doc.build() + self.assertEqual(len(result["resource"]), 2) - def test_locals_embedded_interpolation_tf(self): - builder = hcl2.Builder() - - attributes = { - "simple_interpolation": "prefix:${var.foo}-suffix", - "embedded_interpolation": "(long substring without interpolation); " - '${module.special_constants.aws_accounts["aaa-${local.foo}-${local.bar}"]}/us-west-2/key_foo', - "deeply_nested_interpolation": 'prefix1-${"prefix2-${"prefix3-$${foo:bar}"}"}', - "escaped_interpolation": "prefix:$${aws:username}-suffix", - "simple_and_escaped": '${"bar"}$${baz:bat}', - "simple_and_escaped_reversed": '$${baz:bat}${"bar"}', - "nested_escaped": 'bar-${"$${baz:bat}"}', - } - - builder.block("locals", **attributes) - - self.compare_filenames(builder, "string_interpolations.tf") - - def test_provider_function_tf(self): - builder = hcl2.Builder() - - builder.block( - "locals", - name2='${provider::test2::test("a")}', - name3='${test("a")}', - ) + web = result["resource"][0]["aws_instance"]["web"] + self.assertEqual(web["ami"], "ami-12345") + self.assertEqual(web["instance_type"], "t2.micro") - self.compare_filenames(builder, "provider_function.tf") - - def compare_filenames(self, builder: hcl2.Builder, filename: str): - hcl_dict = builder.build() - hcl_ast = hcl2.reverse_transform(hcl_dict) - hcl_content_built = hcl2.writes(hcl_ast) - - hcl_path = (HCL2_DIR / filename).absolute() - with hcl_path.open("r") as hcl_file: - hcl_file_content = 
hcl_file.read() - self.assertMultiLineEqual( - hcl_content_built, - hcl_file_content, - f"file {filename} does not match its programmatically built version.", - ) + data = result["resource"][1]["aws_s3_bucket"]["data"] + self.assertEqual(data["bucket"], "my-bucket") diff --git a/test/unit/test_deserializer.py b/test/unit/test_deserializer.py new file mode 100644 index 00000000..092d3300 --- /dev/null +++ b/test/unit/test_deserializer.py @@ -0,0 +1,574 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.const import IS_BLOCK, COMMENTS_KEY, INLINE_COMMENTS_KEY +from hcl2.deserializer import BaseDeserializer, DeserializerOptions +from hcl2.rules.base import StartRule, BodyRule, BlockRule, AttributeRule +from hcl2.rules.containers import ( + TupleRule, + ObjectRule, + ObjectElemRule, + ObjectElemKeyExpressionRule, +) +from hcl2.rules.expressions import ExprTermRule +from hcl2.rules.literal_rules import IdentifierRule, IntLitRule, FloatLitRule +from hcl2.rules.strings import ( + StringRule, + StringPartRule, + InterpolationRule, + HeredocTemplateRule, + HeredocTrimTemplateRule, +) +from hcl2.rules.tokens import ( + STRING_CHARS, + ESCAPED_INTERPOLATION, + COMMA, + EQ, + COLON, +) + + +# --- helpers --- + + +def _deser(options=None): + return BaseDeserializer(options) + + +# --- DeserializerOptions tests --- + + +class TestDeserializerOptions(TestCase): + def test_defaults(self): + opts = DeserializerOptions() + self.assertFalse(opts.heredocs_to_strings) + self.assertFalse(opts.strings_to_heredocs) + self.assertFalse(opts.object_elements_colon) + self.assertTrue(opts.object_elements_trailing_comma) + + +# --- load_python top-level dispatch --- + + +class TestBaseDeserializerLoadPython(TestCase): + def test_dict_input_produces_start_with_body(self): + d = _deser() + result = d.load_python({"x": 1}) + self.assertIsInstance(result, StartRule) + self.assertIsInstance(result.body, BodyRule) + + def test_dict_body_contains_attribute(self): 
+ d = _deser() + result = d.load_python({"x": 1}) + body = result.body + self.assertEqual(len(body.children), 1) + self.assertIsInstance(body.children[0], AttributeRule) + + def test_list_input_raises_type_error(self): + d = _deser() + with self.assertRaises(TypeError) as cm: + d.load_python([1, 2]) + self.assertIn("list", str(cm.exception)) + + def test_scalar_string_input_raises_type_error(self): + d = _deser() + with self.assertRaises(TypeError) as cm: + d.load_python("hello") + self.assertIn("str", str(cm.exception)) + + def test_scalar_int_input_raises_type_error(self): + d = _deser() + with self.assertRaises(TypeError) as cm: + d.load_python(42) + self.assertIn("int", str(cm.exception)) + + def test_loads_parses_json(self): + d = _deser() + result = d.loads('{"key": 42}') + self.assertIsInstance(result, StartRule) + body = result.body + self.assertEqual(len(body.children), 1) + self.assertIsInstance(body.children[0], AttributeRule) + + +# --- _deserialize_text branches --- + + +class TestDeserializeText(TestCase): + def test_bool_true(self): + d = _deser() + result = d._deserialize_text(True) + self.assertIsInstance(result, IdentifierRule) + self.assertEqual(result.token.value, "true") + + def test_bool_false(self): + d = _deser() + result = d._deserialize_text(False) + self.assertIsInstance(result, IdentifierRule) + self.assertEqual(result.token.value, "false") + + def test_bool_before_int(self): + """bool is subclass of int; ensure True doesn't produce IntLitRule.""" + d = _deser() + result = d._deserialize_text(True) + self.assertNotIsInstance(result, IntLitRule) + self.assertIsInstance(result, IdentifierRule) + + def test_int_value(self): + d = _deser() + result = d._deserialize_text(42) + self.assertIsInstance(result, IntLitRule) + self.assertEqual(result.token.value, 42) + + def test_float_value(self): + d = _deser() + result = d._deserialize_text(3.14) + self.assertIsInstance(result, FloatLitRule) + self.assertEqual(result.token.value, 3.14) + + def 
test_quoted_string(self): + d = _deser() + result = d._deserialize_text('"hello"') + self.assertIsInstance(result, StringRule) + + def test_unquoted_string_identifier(self): + d = _deser() + result = d._deserialize_text("my_var") + self.assertIsInstance(result, IdentifierRule) + self.assertEqual(result.token.value, "my_var") + + def test_expression_string(self): + d = _deser() + result = d._deserialize_text("${var.x}") + self.assertIsInstance(result, ExprTermRule) + + def test_non_string_non_numeric_fallback(self): + """Non-string, non-numeric values get str()-converted to identifier.""" + d = _deser() + result = d._deserialize_text(None) + self.assertIsInstance(result, IdentifierRule) + self.assertEqual(result.token.value, "None") + + def test_zero_int(self): + d = _deser() + result = d._deserialize_text(0) + self.assertIsInstance(result, IntLitRule) + self.assertEqual(result.token.value, 0) + + def test_negative_float(self): + d = _deser() + result = d._deserialize_text(-1.5) + self.assertIsInstance(result, FloatLitRule) + self.assertEqual(result.token.value, -1.5) + + +# --- heredoc handling --- + + +class TestDeserializeHeredocs(TestCase): + def test_preserved_heredoc(self): + d = _deser() + result = d._deserialize_text('"< DictTransformer: - return DictTransformer(with_meta) - - def test_to_string_dollar(self): - string_values = { - '"bool"': "bool", - '"number"': "number", - '"string"': "string", - "${value_1}": "${value_1}", - '"value_2': '${"value_2}', - 'value_3"': '${value_3"}', - '"value_4"': "value_4", - "value_5": "${value_5}", - } - - dict_transformer = self.build_dict_transformer() - - for value, expected in string_values.items(): - actual = dict_transformer.to_string_dollar(value) - - self.assertEqual(actual, expected) diff --git a/test/unit/test_formatter.py b/test/unit/test_formatter.py new file mode 100644 index 00000000..34fadc71 --- /dev/null +++ b/test/unit/test_formatter.py @@ -0,0 +1,829 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from 
unittest import TestCase + +from hcl2.formatter import BaseFormatter, FormatterOptions +from hcl2.rules.base import ( + StartRule, + BodyRule, + BlockRule, + AttributeRule, +) +from hcl2.rules.containers import ( + ObjectRule, + ObjectElemRule, + ObjectElemKeyRule, + TupleRule, +) +from hcl2.rules.expressions import ExprTermRule +from hcl2.rules.for_expressions import ( + ForIntroRule, + ForCondRule, + ForTupleExprRule, + ForObjectExprRule, +) +from hcl2.rules.literal_rules import IdentifierRule +from hcl2.rules.tokens import ( + NAME, + EQ, + LBRACE, + RBRACE, + LSQB, + RSQB, + COMMA, + COLON, + FOR, + IN, + IF, + ELLIPSIS, + FOR_OBJECT_ARROW, +) +from hcl2.rules.whitespace import NewLineOrCommentRule + + +# --- helpers --- + + +def _fmt(options=None): + return BaseFormatter(options) + + +def _make_identifier(name): + return IdentifierRule([NAME(name)]) + + +def _make_expr_term(child): + """Wrap a rule in ExprTermRule.""" + return ExprTermRule([child]) + + +def _make_attribute(name, value_str="val"): + """Build a simple attribute: name = value_str (identifier).""" + return AttributeRule( + [ + _make_identifier(name), + EQ(), + _make_expr_term(_make_identifier(value_str)), + ] + ) + + +def _make_block(labels, body_children=None): + body = BodyRule(body_children or []) + children = list(labels) + [LBRACE(), body, RBRACE()] + return BlockRule(children) + + +def _make_object_elem(key_name, value_name, separator=None): + sep = separator or EQ() + key = ObjectElemKeyRule([_make_identifier(key_name)]) + val = ExprTermRule([_make_identifier(value_name)]) + return ObjectElemRule([key, sep, val]) + + +def _make_object(elems, trailing_commas=True): + children = [LBRACE()] + for elem in elems: + children.append(elem) + if trailing_commas: + children.append(COMMA()) + children.append(RBRACE()) + return ObjectRule(children) + + +def _make_tuple(elements, trailing_commas=True): + children = [LSQB()] + for elem in elements: + children.append(elem) + if trailing_commas: + 
children.append(COMMA()) + children.append(RSQB()) + return TupleRule(children) + + +def _nlc_value(rule): + """Extract the string value from a NewLineOrCommentRule.""" + return rule.token.value + + +# --- FormatterOptions tests --- + + +class TestFormatterOptions(TestCase): + def test_defaults(self): + opts = FormatterOptions() + self.assertEqual(opts.indent_length, 2) + self.assertTrue(opts.open_empty_blocks) + self.assertFalse(opts.open_empty_objects) + self.assertFalse(opts.open_empty_tuples) + self.assertTrue(opts.vertically_align_attributes) + self.assertTrue(opts.vertically_align_object_elements) + + +# --- _build_newline --- + + +class TestBuildNewline(TestCase): + def test_indent_level_zero(self): + f = _fmt() + nl = f._build_newline(0) + self.assertIsInstance(nl, NewLineOrCommentRule) + self.assertEqual(_nlc_value(nl), "\n") + + def test_indent_level_one_default_length(self): + f = _fmt() # indent_length=2 + nl = f._build_newline(1) + self.assertEqual(_nlc_value(nl), "\n ") + + def test_indent_level_two_default_length(self): + f = _fmt() + nl = f._build_newline(2) + self.assertEqual(_nlc_value(nl), "\n ") + + def test_count_two(self): + f = _fmt() + nl = f._build_newline(1, count=2) + self.assertEqual(_nlc_value(nl), "\n\n ") + + def test_custom_indent_length(self): + opts = FormatterOptions(indent_length=4) + f = _fmt(opts) + nl = f._build_newline(1) + self.assertEqual(_nlc_value(nl), "\n ") + + def test_custom_indent_level_two(self): + opts = FormatterOptions(indent_length=4) + f = _fmt(opts) + nl = f._build_newline(2) + self.assertEqual(_nlc_value(nl), "\n ") + + def test_tracks_last_newline(self): + f = _fmt() + nl1 = f._build_newline(0) + self.assertIs(f._last_new_line, nl1) + nl2 = f._build_newline(1) + self.assertIs(f._last_new_line, nl2) + + +# --- _deindent_last_line --- + + +class TestDeindentLastLine(TestCase): + def test_removes_one_indent_level(self): + f = _fmt() # indent_length=2 + f._build_newline(2) # "\n " + f._deindent_last_line() + 
self.assertEqual(f._last_new_line.token.value, "\n ") + + def test_deindent_twice(self): + f = _fmt() + f._build_newline(2) # "\n " + f._deindent_last_line(times=2) + self.assertEqual(f._last_new_line.token.value, "\n") + + def test_noop_when_no_trailing_spaces(self): + f = _fmt() + f._build_newline(0) # "\n" + f._deindent_last_line() + # Should not change since there are no trailing spaces + self.assertEqual(f._last_new_line.token.value, "\n") + + +# --- format_body_rule --- + + +class TestFormatBodyRule(TestCase): + def test_empty_body_no_children(self): + f = _fmt() + body = BodyRule([]) + # Body with no parent (not inside StartRule) — leading newline is + # added then immediately popped since there are no real children. + f.format_body_rule(body, 0) + self.assertEqual(len(body._children), 0) + + def test_body_with_single_attribute(self): + f = _fmt() + attr = _make_attribute("name") + body = BodyRule([attr]) + # Need a parent so it's not in StartRule context + block = _make_block([_make_identifier("test")]) + block._children[-2] = body # replace the empty body + body._parent = block + + f.format_body_rule(body, 1) + # Should have: newline, attr, (final newline removed by pop) + nlc_children = [ + c for c in body._children if isinstance(c, NewLineOrCommentRule) + ] + self.assertGreaterEqual(len(nlc_children), 1) + # The attribute should still be in children + attr_children = [c for c in body._children if isinstance(c, AttributeRule)] + self.assertEqual(len(attr_children), 1) + + def test_body_inside_start_no_leading_newline(self): + f = _fmt() + attr = _make_attribute("name") + body = BodyRule([attr]) + _start = StartRule([body]) + f.format_body_rule(body, 0) + # First child should be the attribute, not a newline (since in_start=True) + self.assertIsInstance(body._children[0], AttributeRule) + + def test_body_with_attribute_and_block(self): + f = _fmt() + attr = _make_attribute("version") + inner_block = _make_block([_make_identifier("provider")]) + body = 
BodyRule([attr, inner_block]) + _start = StartRule([body]) + + f.format_body_rule(body, 0) + # Should contain attr, block, and various newlines + attr_count = sum(1 for c in body._children if isinstance(c, AttributeRule)) + block_count = sum(1 for c in body._children if isinstance(c, BlockRule)) + self.assertEqual(attr_count, 1) + self.assertEqual(block_count, 1) + + +# --- format_block_rule --- + + +class TestFormatBlockRule(TestCase): + def test_nonempty_block_closing_newline(self): + f = _fmt() + block = _make_block( + [_make_identifier("resource")], + [_make_attribute("name")], + ) + _start = StartRule([BodyRule([block])]) + f.format_block_rule(block, indent_level=1) + # Last child should be RBRACE; second-to-last should be a newline + self.assertIsInstance(block.children[-1], RBRACE) + self.assertIsInstance(block.children[-2], NewLineOrCommentRule) + + def test_empty_block_open_true(self): + opts = FormatterOptions(open_empty_blocks=True) + f = _fmt(opts) + block = _make_block([_make_identifier("resource")]) + _start = StartRule([BodyRule([block])]) + + f.format_block_rule(block, indent_level=1) + # Should insert a double-newline before RBRACE + nlc_before_rbrace = block.children[-2] + self.assertIsInstance(nlc_before_rbrace, NewLineOrCommentRule) + # count=2 means two newlines + self.assertTrue(_nlc_value(nlc_before_rbrace).startswith("\n\n")) + + def test_empty_block_open_false(self): + opts = FormatterOptions(open_empty_blocks=False) + f = _fmt(opts) + block = _make_block([_make_identifier("resource")]) + _start = StartRule([BodyRule([block])]) + + f.format_block_rule(block, indent_level=1) + # Should NOT insert newline before RBRACE + nlc_children = [ + c for c in block.children if isinstance(c, NewLineOrCommentRule) + ] + # Only the body formatting newlines, but no double-newline insertion + has_double_nl = any(_nlc_value(c).startswith("\n\n") for c in nlc_children) + self.assertFalse(has_double_nl) + + +# --- format_tuple_rule --- + + +class 
TestFormatTupleRule(TestCase): + def test_nonempty_tuple_newlines(self): + f = _fmt() + elem1 = _make_expr_term(_make_identifier("a")) + elem2 = _make_expr_term(_make_identifier("b")) + tup = _make_tuple([elem1, elem2]) + + f.format_tuple_rule(tup, indent_level=1) + # Should have newlines after LSQB and after each COMMA + nlc_count = sum(1 for c in tup._children if isinstance(c, NewLineOrCommentRule)) + self.assertGreaterEqual(nlc_count, 2) + + def test_empty_tuple_default_no_newlines(self): + f = _fmt() # open_empty_tuples=False by default + tup = _make_tuple([], trailing_commas=False) + + original_len = len(tup.children) + f.format_tuple_rule(tup, indent_level=1) + # No newlines should be inserted + self.assertEqual(len(tup.children), original_len) + + def test_empty_tuple_open_true(self): + opts = FormatterOptions(open_empty_tuples=True) + f = _fmt(opts) + tup = _make_tuple([], trailing_commas=False) + + f.format_tuple_rule(tup, indent_level=1) + # Should insert a double-newline + nlc_children = [c for c in tup.children if isinstance(c, NewLineOrCommentRule)] + self.assertEqual(len(nlc_children), 1) + self.assertTrue(_nlc_value(nlc_children[0]).startswith("\n\n")) + + def test_deindent_on_last_line(self): + f = _fmt() + elem = _make_expr_term(_make_identifier("a")) + tup = _make_tuple([elem]) + + f.format_tuple_rule(tup, indent_level=1) + # The last newline should have been deindented + last_nlc = f._last_new_line + # At indent_level=1 with length 2, deindented means "\n" (no spaces) + self.assertEqual(_nlc_value(last_nlc), "\n") + + +# --- format_object_rule --- + + +class TestFormatObjectRule(TestCase): + def test_nonempty_object_newlines(self): + f = _fmt() + elem = _make_object_elem("key", "val") + obj = _make_object([elem]) + + f.format_object_rule(obj, indent_level=1) + nlc_count = sum(1 for c in obj._children if isinstance(c, NewLineOrCommentRule)) + # Should have newlines after LBRACE, after elements, before RBRACE + self.assertGreaterEqual(nlc_count, 2) 
+ + def test_empty_object_open_true(self): + opts = FormatterOptions(open_empty_objects=True) + f = _fmt(opts) + obj = _make_object([], trailing_commas=False) + + f.format_object_rule(obj, indent_level=1) + nlc_children = [c for c in obj.children if isinstance(c, NewLineOrCommentRule)] + self.assertEqual(len(nlc_children), 1) + self.assertTrue(_nlc_value(nlc_children[0]).startswith("\n\n")) + + def test_empty_object_open_false(self): + opts = FormatterOptions(open_empty_objects=False) + f = _fmt(opts) + obj = _make_object([], trailing_commas=False) + + original_len = len(obj.children) + f.format_object_rule(obj, indent_level=1) + self.assertEqual(len(obj.children), original_len) + + def test_deindent_last_line(self): + f = _fmt() + elem = _make_object_elem("key", "val") + obj = _make_object([elem]) + + f.format_object_rule(obj, indent_level=1) + last_nlc = f._last_new_line + self.assertEqual(_nlc_value(last_nlc), "\n") + + def test_multiple_elements_get_newlines_between(self): + f = _fmt() + elem1 = _make_object_elem("a", "x") + elem2 = _make_object_elem("b", "y") + obj = _make_object([elem1, elem2], trailing_commas=False) + + f.format_object_rule(obj, indent_level=1) + # Should have newlines between the elements + nlc_count = sum(1 for c in obj._children if isinstance(c, NewLineOrCommentRule)) + self.assertGreaterEqual( + nlc_count, 3 + ) # after LBRACE, between elems, before RBRACE + + +# --- format_expression dispatch --- + + +class TestFormatExpression(TestCase): + def test_object_delegates(self): + f = _fmt() + elem = _make_object_elem("key", "val") + obj = _make_object([elem]) + expr = _make_expr_term(obj) + + f.format_expression(expr, indent_level=1) + # Object should have been formatted (newlines inserted) + nlc_count = sum(1 for c in obj._children if isinstance(c, NewLineOrCommentRule)) + self.assertGreater(nlc_count, 0) + + def test_tuple_delegates(self): + f = _fmt() + inner = _make_expr_term(_make_identifier("a")) + tup = _make_tuple([inner]) + expr = 
_make_expr_term(tup) + + f.format_expression(expr, indent_level=1) + nlc_count = sum(1 for c in tup._children if isinstance(c, NewLineOrCommentRule)) + self.assertGreater(nlc_count, 0) + + def test_nested_expr_term_recursive(self): + f = _fmt() + obj = _make_object([_make_object_elem("k", "v")]) + inner_expr = _make_expr_term(obj) + outer_expr = _make_expr_term(inner_expr) + + f.format_expression(outer_expr, indent_level=1) + nlc_count = sum(1 for c in obj._children if isinstance(c, NewLineOrCommentRule)) + self.assertGreater(nlc_count, 0) + + +# --- vertical alignment --- + + +class TestVerticalAlignment(TestCase): + def test_align_attributes_pads_eq(self): + f = _fmt() + attr_short = _make_attribute("a", "x") + attr_long = _make_attribute("long_name", "y") + body = BodyRule([attr_short, attr_long]) + + f._vertically_align_attributes_in_body(body) + # "a" has length 1, "long_name" has length 9, diff is 8 + eq_short = attr_short.children[1] + eq_long = attr_long.children[1] + self.assertEqual(len(eq_short.value) - len(eq_long.value), 8) + + def test_non_attribute_breaks_sequence(self): + f = _fmt() + attr1 = _make_attribute("x", "a") + block = _make_block([_make_identifier("blk")]) + attr2 = _make_attribute("yy", "b") + body = BodyRule([attr1, block, attr2]) + + f._vertically_align_attributes_in_body(body) + # attr1 is in its own group (length 1), attr2 in its own group (length 2) + # No cross-group padding: each group aligns independently + eq1 = attr1.children[1] + eq2 = attr2.children[1] + # Both should have no extra padding (single-element groups) + self.assertEqual(eq1.value.strip(), "=") + self.assertEqual(eq2.value.strip(), "=") + + def test_align_object_elems_pads_separator(self): + f = _fmt() + elem_short = _make_object_elem("a", "x") + elem_long = _make_object_elem("long_key", "y") + obj = _make_object([elem_short, elem_long], trailing_commas=False) + + f._vertically_align_object_elems(obj) + sep_short = elem_short.children[1] + sep_long = 
elem_long.children[1] + # "a" serializes to length 1, "long_key" to length 8, diff is 7 + self.assertGreater(len(sep_short.value), len(sep_long.value)) + + def test_colon_separator_extra_space(self): + f = _fmt() + elem = _make_object_elem("key", "val", separator=COLON()) + obj = _make_object([elem], trailing_commas=False) + + f._vertically_align_object_elems(obj) + sep = elem.children[1] + # Single element: spaces_to_add=0, but COLON gets +1 + self.assertTrue(sep.value.endswith(":")) + self.assertEqual(sep.value, " :") + + +# --- indent_length customization --- + + +class TestIndentLength(TestCase): + def test_indent_length_4(self): + opts = FormatterOptions(indent_length=4) + f = _fmt(opts) + nl = f._build_newline(1) + self.assertEqual(_nlc_value(nl), "\n ") + + def test_indent_length_4_level_2(self): + opts = FormatterOptions(indent_length=4) + f = _fmt(opts) + nl = f._build_newline(2) + self.assertEqual(_nlc_value(nl), "\n ") + + def test_deindent_with_indent_length_4(self): + opts = FormatterOptions(indent_length=4) + f = _fmt(opts) + f._build_newline(2) # "\n " + f._deindent_last_line() + self.assertEqual(f._last_new_line.token.value, "\n ") + + def test_format_body_uses_indent_length(self): + opts = FormatterOptions(indent_length=4) + f = _fmt(opts) + attr = _make_attribute("name") + body = BodyRule([attr]) + block = _make_block([_make_identifier("test")]) + block._children[-2] = body + body._parent = block + + f.format_body_rule(body, 1) + nlc_children = [ + c for c in body._children if isinstance(c, NewLineOrCommentRule) + ] + # At least one newline should have 4 spaces of indent + has_4_space = any(" " in _nlc_value(c) for c in nlc_children) + self.assertTrue(has_4_space) + + +# --- format_tree entry point --- + + +class TestFormatTree(TestCase): + def test_format_tree_with_start_rule(self): + f = _fmt() + attr = _make_attribute("key") + body = BodyRule([attr]) + start = StartRule([body]) + + f.format_tree(start) + # Should have processed the body 
(attribute is first child since in_start) + self.assertIsInstance(body._children[0], AttributeRule) + + def test_format_tree_with_non_start_rule_noop(self): + f = _fmt() + body = BodyRule([]) + # Passing a non-StartRule should be a no-op + f.format_tree(body) + self.assertEqual(len(body._children), 0) + + def test_full_format_start_with_block(self): + f = _fmt() + attr = _make_attribute("ami", "abc") + block = _make_block( + [_make_identifier("resource")], + [attr], + ) + body = BodyRule([block]) + start = StartRule([body]) + + f.format_tree(start) + # Block should have a closing newline before RBRACE + self.assertIsInstance(block.children[-1], RBRACE) + self.assertIsInstance(block.children[-2], NewLineOrCommentRule) + + +# --- for-expression helpers --- + + +def _make_for_intro(iterable_name="items", iterator_name="item"): + """Build a simple for_intro: for iterator_name in iterable_name :""" + return ForIntroRule( + [ + FOR(), + _make_identifier(iterator_name), + IN(), + _make_expr_term(_make_identifier(iterable_name)), + COLON(), + ] + ) + + +def _make_for_cond(condition_name="cond"): + """Build a for_cond: if condition_name""" + return ForCondRule( + [ + IF(), + _make_expr_term(_make_identifier(condition_name)), + ] + ) + + +def _make_for_tuple_expr(value_name="val", condition=None): + """Build a for_tuple_expr: [for item in items : value_name]""" + children = [ + LSQB(), + _make_for_intro(), + _make_expr_term(_make_identifier(value_name)), + ] + if condition is not None: + children.append(condition) + children.append(RSQB()) + return ForTupleExprRule(children) + + +def _make_for_object_expr(key_name="k", value_name="v", ellipsis=False, condition=None): + """Build a for_object_expr: {for item in items : key_name => value_name}""" + children = [ + LBRACE(), + _make_for_intro(), + _make_expr_term(_make_identifier(key_name)), + FOR_OBJECT_ARROW(), + _make_expr_term(_make_identifier(value_name)), + ] + if ellipsis: + children.append(ELLIPSIS()) + if condition is 
not None: + children.append(condition) + children.append(RBRACE()) + return ForObjectExprRule(children) + + +# --- format_fortupleexpr --- + + +class TestFormatForTupleExpr(TestCase): + def test_basic_no_condition_no_spurious_newline(self): + """No condition → index 5 should be None, no spurious blank line.""" + f = _fmt() + expr = _make_for_tuple_expr() + f.format_fortupleexpr(expr, indent_level=1) + + self.assertIsNone(expr.children[5]) + for idx in [1, 3, 7]: + self.assertIsInstance(expr.children[idx], NewLineOrCommentRule) + + def test_basic_no_condition_deindents_closing(self): + """Last newline (before ]) should be deindented.""" + f = _fmt() + expr = _make_for_tuple_expr() + f.format_fortupleexpr(expr, indent_level=1) + + last_nl = expr.children[7] + self.assertEqual(_nlc_value(last_nl), "\n") + + def test_with_condition_newline_before_if(self): + """With condition → index 5 should be a newline before `if`.""" + f = _fmt() + cond = _make_for_cond() + expr = _make_for_tuple_expr(condition=cond) + f.format_fortupleexpr(expr, indent_level=1) + + self.assertIsInstance(expr.children[5], NewLineOrCommentRule) + for idx in [1, 3, 7]: + self.assertIsInstance(expr.children[idx], NewLineOrCommentRule) + + def test_with_condition_deindents_closing(self): + """Even with condition, last newline (before ]) is deindented.""" + f = _fmt() + cond = _make_for_cond() + expr = _make_for_tuple_expr(condition=cond) + f.format_fortupleexpr(expr, indent_level=1) + + last_nl = expr.children[7] + self.assertEqual(_nlc_value(last_nl), "\n") + + def test_nested_value_object_formatting(self): + """Value expression containing an object should be formatted recursively.""" + f = _fmt() + obj = _make_object([_make_object_elem("k", "v")]) + children = [ + LSQB(), + _make_for_intro(), + _make_expr_term(obj), + RSQB(), + ] + expr = ForTupleExprRule(children) + + f.format_fortupleexpr(expr, indent_level=1) + + nlc_count = sum(1 for c in obj._children if isinstance(c, NewLineOrCommentRule)) + 
self.assertGreater(nlc_count, 0) + + def test_for_intro_iterable_formatting(self): + """ForIntroRule's iterable expression should be formatted recursively.""" + f = _fmt() + obj = _make_object([_make_object_elem("k", "v")]) + intro = ForIntroRule( + [ + FOR(), + _make_identifier("item"), + IN(), + _make_expr_term(obj), + COLON(), + ] + ) + children = [LSQB(), intro, _make_expr_term(_make_identifier("val")), RSQB()] + expr = ForTupleExprRule(children) + + f.format_fortupleexpr(expr, indent_level=1) + + nlc_count = sum(1 for c in obj._children if isinstance(c, NewLineOrCommentRule)) + self.assertGreater(nlc_count, 0) + + +# --- format_forobjectexpr --- + + +class TestFormatForObjectExpr(TestCase): + def test_basic_no_condition_no_ellipsis(self): + """No condition, no ellipsis → indices 6, 8, 10 should be None.""" + f = _fmt() + expr = _make_for_object_expr() + f.format_forobjectexpr(expr, indent_level=1) + + self.assertIsNone(expr.children[6]) + self.assertIsNone(expr.children[8]) + self.assertIsNone(expr.children[10]) + for idx in [1, 3, 12]: + self.assertIsInstance(expr.children[idx], NewLineOrCommentRule) + + def test_basic_deindents_closing(self): + """Last newline (before }) should be deindented.""" + f = _fmt() + expr = _make_for_object_expr() + f.format_forobjectexpr(expr, indent_level=1) + + last_nl = expr.children[12] + self.assertEqual(_nlc_value(last_nl), "\n") + + def test_with_condition_newline_before_if(self): + """With condition → index 10 should be a newline before `if`.""" + f = _fmt() + cond = _make_for_cond() + expr = _make_for_object_expr(condition=cond) + f.format_forobjectexpr(expr, indent_level=1) + + self.assertIsInstance(expr.children[10], NewLineOrCommentRule) + self.assertIsNone(expr.children[6]) + self.assertIsNone(expr.children[8]) + + def test_with_condition_deindents_closing(self): + """Even with condition, last newline (before }) is deindented.""" + f = _fmt() + cond = _make_for_cond() + expr = _make_for_object_expr(condition=cond) + 
f.format_forobjectexpr(expr, indent_level=1) + + last_nl = expr.children[12] + self.assertEqual(_nlc_value(last_nl), "\n") + + def test_with_ellipsis_and_condition(self): + """With ellipsis and condition → index 10 is newline, 6/8 cleared.""" + f = _fmt() + cond = _make_for_cond() + expr = _make_for_object_expr(ellipsis=True, condition=cond) + f.format_forobjectexpr(expr, indent_level=1) + + self.assertIsInstance(expr.children[9], ELLIPSIS) + self.assertIsInstance(expr.children[10], NewLineOrCommentRule) + self.assertIsNone(expr.children[6]) + self.assertIsNone(expr.children[8]) + + def test_nested_value_tuple_formatting(self): + """Value expression containing a tuple should be formatted recursively.""" + f = _fmt() + inner_tup = _make_tuple([_make_expr_term(_make_identifier("a"))]) + children = [ + LBRACE(), + _make_for_intro(), + _make_expr_term(_make_identifier("k")), + FOR_OBJECT_ARROW(), + _make_expr_term(inner_tup), + RBRACE(), + ] + expr = ForObjectExprRule(children) + + f.format_forobjectexpr(expr, indent_level=1) + + nlc_count = sum( + 1 for c in inner_tup._children if isinstance(c, NewLineOrCommentRule) + ) + self.assertGreater(nlc_count, 0) + + def test_for_cond_expression_formatting(self): + """ForCondRule's condition expression should be formatted recursively.""" + f = _fmt() + obj = _make_object([_make_object_elem("k", "v")]) + cond = ForCondRule([IF(), _make_expr_term(obj)]) + expr = _make_for_object_expr(condition=cond) + + f.format_forobjectexpr(expr, indent_level=1) + + nlc_count = sum(1 for c in obj._children if isinstance(c, NewLineOrCommentRule)) + self.assertGreater(nlc_count, 0) + + +# --- alignment idempotency --- + + +class TestAlignmentIdempotency(TestCase): + """Alignment must not double-pad when applied multiple times (#7).""" + + def test_attribute_alignment_does_not_double_pad(self): + """Running _vertically_align_attributes_in_body twice produces same padding.""" + f = _fmt() + attr_short = _make_attribute("a", "x") + attr_long = 
_make_attribute("long_name", "y") + body = BodyRule([attr_short, attr_long]) + + f._vertically_align_attributes_in_body(body) + eq_val_first = attr_short.children[1].value + + f._vertically_align_attributes_in_body(body) + eq_val_second = attr_short.children[1].value + + self.assertEqual(eq_val_first, eq_val_second) + + def test_object_elem_alignment_does_not_double_pad(self): + """Running _vertically_align_object_elems twice produces same padding.""" + f = _fmt() + elem_short = _make_object_elem("a", "x") + elem_long = _make_object_elem("long_key", "y") + obj = _make_object([elem_short, elem_long], trailing_commas=False) + + f._vertically_align_object_elems(obj) + sep_val_first = elem_short.children[1].value + + f._vertically_align_object_elems(obj) + sep_val_second = elem_short.children[1].value + + self.assertEqual(sep_val_first, sep_val_second) diff --git a/test/unit/test_hcl2_syntax.py b/test/unit/test_hcl2_syntax.py deleted file mode 100644 index 96113df3..00000000 --- a/test/unit/test_hcl2_syntax.py +++ /dev/null @@ -1,193 +0,0 @@ -# pylint:disable=C0114,C0116,C0103,W0612 - -import string # pylint:disable=W4901 # https://stackoverflow.com/a/16651393 -from unittest import TestCase - -from test.helpers.hcl2_helper import Hcl2Helper - -from lark import UnexpectedToken, UnexpectedCharacters - - -class TestHcl2Syntax(Hcl2Helper, TestCase): - """Test parsing individual elements of HCL2 syntax""" - - def test_argument(self): - syntax = self.build_argument("identifier", '"expression"') - result = self.load_to_dict(syntax) - self.assertDictEqual(result, {"identifier": "expression"}) - - def test_identifier_starts_with_digit(self): - for i in range(0, 10): - argument = self.build_argument(f"{i}id") - with self.assertRaises(UnexpectedToken) as e: - self.load_to_dict(argument) - assert ( - f"Unexpected token Token('DECIMAL', '{i}') at line 1, column 1" - in str(e) - ) - - def test_identifier_starts_with_special_chars(self): - chars = string.punctuation.replace("_", "") 
- for i in chars: - argument = self.build_argument(f"{i}id") - with self.assertRaises((UnexpectedToken, UnexpectedCharacters)) as e: - self.load_to_dict(argument) - - def test_identifier_contains_special_chars(self): - chars = string.punctuation.replace("_", "").replace("-", "") - for i in chars: - argument = self.build_argument(f"identifier{i}") - with self.assertRaises((UnexpectedToken, UnexpectedCharacters)) as e: - self.load_to_dict(argument) - - def test_identifier(self): - argument = self.build_argument("_-__identifier_-1234567890-_") - self.load_to_dict(argument) - - def test_block_no_labels(self): - block = """ - block { - } - """ - result = self.load_to_dict(block) - self.assertDictEqual(result, {"block": [{}]}) - - def test_block_single_label(self): - block = """ - block "label" { - } - """ - result = self.load_to_dict(block) - self.assertDictEqual(result, {"block": [{"label": {}}]}) - - def test_block_multiple_labels(self): - block = """ - block "label1" "label2" "label3" { - } - """ - result = self.load_to_dict(block) - self.assertDictEqual( - result, {"block": [{"label1": {"label2": {"label3": {}}}}]} - ) - - def test_unary_operation(self): - operations = [ - ("identifier = -10", {"identifier": -10}), - ("identifier = !true", {"identifier": "${!true}"}), - ] - for hcl, dict_ in operations: - result = self.load_to_dict(hcl) - self.assertDictEqual(result, dict_) - - def test_tuple(self): - tuple_ = """tuple = [ - identifier, - "string", 100, - true == false, - 5 + 5, function(), - ]""" - result = self.load_to_dict(tuple_) - self.assertDictEqual( - result, - { - "tuple": [ - "${identifier}", - "string", - 100, - "${true == false}", - "${5 + 5}", - "${function()}", - ] - }, - ) - - def test_object(self): - object_ = """object = { - key1: identifier, key2: "string", key3: 100, - key4: true == false // comment - key5: 5 + 5, key6: function(), - key7: value == null ? 
1 : 0 - }""" - result = self.load_to_dict(object_) - self.assertDictEqual( - result, - { - "object": { - "key1": "${identifier}", - "key2": "string", - "key3": 100, - "key4": "${true == false}", - "key5": "${5 + 5}", - "key6": "${function()}", - "key7": "${value == null ? 1 : 0}", - } - }, - ) - - def test_function_call_and_arguments(self): - calls = { - "r = function()": {"r": "${function()}"}, - "r = function(arg1, arg2)": {"r": "${function(arg1, arg2)}"}, - """r = function( - arg1, arg2, - arg3, - ) - """: { - "r": "${function(arg1, arg2, arg3)}" - }, - } - - for call, expected in calls.items(): - result = self.load_to_dict(call) - self.assertDictEqual(result, expected) - - def test_index(self): - indexes = { - "r = identifier[10]": {"r": "${identifier[10]}"}, - "r = identifier.20": { - "r": "${identifier[2]}" - }, # TODO debug why `20` is parsed to `2` - """r = identifier["key"]""": {"r": '${identifier["key"]}'}, - """r = identifier.key""": {"r": "${identifier.key}"}, - } - for call, expected in indexes.items(): - result = self.load_to_dict(call) - self.assertDictEqual(result, expected) - - def test_e_notation(self): - literals = { - "var = 3e4": {"var": "${3e4}"}, - "var = 3.5e5": {"var": "${3.5e5}"}, - "var = -3e6": {"var": "${-3e6}"}, - "var = -2.3e4": {"var": "${-2.3e4}"}, - "var = -5e-2": {"var": "${-5e-2}"}, - "var = -6.1e-3": {"var": "${-6.1e-3}"}, - } - for actual, expected in literals.items(): - result = self.load_to_dict(actual) - self.assertDictEqual(result, expected) - - def test_null(self): - identifier = "var = null" - - expected = {"var": None} - - result = self.load_to_dict(identifier) - self.assertDictEqual(result, expected) - - def test_expr_term_parenthesis(self): - literals = { - "a = 1 * 2 + 3": {"a": "${1 * 2 + 3}"}, - "b = 1 * (2 + 3)": {"b": "${1 * (2 + 3)}"}, - "c = (1 * (2 + 3))": {"c": "${(1 * (2 + 3))}"}, - "conditional = value == null ? 1 : 0": { - "conditional": "${value == null ? 1 : 0}" - }, - "conditional = (value == null ? 
1 : 0)": { - "conditional": "${(value == null ? 1 : 0)}" - }, - } - - for actual, expected in literals.items(): - result = self.load_to_dict(actual) - self.assertDictEqual(result, expected) diff --git a/test/unit/test_load.py b/test/unit/test_load.py deleted file mode 100644 index f9be8845..00000000 --- a/test/unit/test_load.py +++ /dev/null @@ -1,57 +0,0 @@ -""" Test parsing a variety of hcl files""" - -import json -from pathlib import Path -from unittest import TestCase - -from hcl2.parser import PARSER_FILE, parser -import hcl2 - - -HELPERS_DIR = Path(__file__).absolute().parent.parent / "helpers" -HCL2_DIR = HELPERS_DIR / "terraform-config" -JSON_DIR = HELPERS_DIR / "terraform-config-json" -HCL2_FILES = [str(file.relative_to(HCL2_DIR)) for file in HCL2_DIR.iterdir()] - - -class TestLoad(TestCase): - """Test parsing a variety of hcl files""" - - # print any differences fully to the console - maxDiff = None - - def test_load_terraform(self): - """Test parsing a set of hcl2 files and force recreating the parser file""" - - # create a parser to make sure that the parser file is created - parser() - - # delete the parser file to force it to be recreated - PARSER_FILE.unlink() - for hcl_path in HCL2_FILES: - yield self.check_terraform, hcl_path - - def test_load_terraform_from_cache(self): - """Test parsing a set of hcl2 files from a cached parser file""" - for hcl_path in HCL2_FILES: - yield self.check_terraform, hcl_path - - def check_terraform(self, hcl_path_str: str): - """Loads a single hcl2 file, parses it and compares with the expected json""" - hcl_path = (HCL2_DIR / hcl_path_str).absolute() - json_path = JSON_DIR / hcl_path.relative_to(HCL2_DIR).with_suffix(".json") - if not json_path.exists(): - assert ( - False - ), f"Expected json equivalent of the hcl file doesn't exist {json_path}" - - with hcl_path.open("r") as hcl_file, json_path.open("r") as json_file: - try: - hcl2_dict = hcl2.load(hcl_file) - except Exception as exc: - assert False, f"failed to 
tokenize terraform in `{hcl_path_str}`: {exc}" - - json_dict = json.load(json_file) - self.assertDictEqual( - hcl2_dict, json_dict, f"\n\nfailed comparing {hcl_path_str}" - ) diff --git a/test/unit/test_load_with_meta.py b/test/unit/test_load_with_meta.py deleted file mode 100644 index b081844e..00000000 --- a/test/unit/test_load_with_meta.py +++ /dev/null @@ -1,23 +0,0 @@ -"""Test parsing hcl files with meta parameters""" - -import json -from pathlib import Path -from unittest import TestCase - -import hcl2 - -TEST_WITH_META_DIR = Path(__file__).absolute().parent.parent / "helpers" / "with-meta" -TF_FILE_PATH = TEST_WITH_META_DIR / "data_sources.tf" -JSON_FILE_PATH = TEST_WITH_META_DIR / "data_sources.json" - - -class TestLoadWithMeta(TestCase): - """Test parsing hcl files with meta parameters""" - - def test_load_terraform_meta(self): - """Test load() with with_meta flag set to true.""" - with TF_FILE_PATH.open("r") as tf_file, JSON_FILE_PATH.open("r") as json_file: - self.assertDictEqual( - json.load(json_file), - hcl2.load(tf_file, with_meta=True), - ) diff --git a/test/unit/test_postlexer.py b/test/unit/test_postlexer.py new file mode 100644 index 00000000..07a8164b --- /dev/null +++ b/test/unit/test_postlexer.py @@ -0,0 +1,150 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +"""Unit tests for hcl2.postlexer. + +Tests parse real HCL2 snippets through the full pipeline to verify that the +postlexer correctly handles newlines before binary operators and QMARK. 
+""" + +from unittest import TestCase + +from lark import Token + +from hcl2.api import loads +from hcl2.postlexer import OPERATOR_TYPES, PostLexer + + +class TestMergeNewlinesIntoOperators(TestCase): + """Test _merge_newlines_into_operators at the token-stream level.""" + + def _run(self, tokens): + """Run the postlexer pass and return a list of tokens.""" + return list(PostLexer()._merge_newlines_into_operators(iter(tokens))) + + def test_no_newlines_passes_through(self): + tokens = [Token("NAME", "a"), Token("PLUS", "+"), Token("NAME", "b")] + result = self._run(tokens) + self.assertEqual(len(result), 3) + self.assertEqual(result[1].type, "PLUS") + self.assertEqual(str(result[1]), "+") + + def test_newline_before_operator_is_merged(self): + tokens = [ + Token("NAME", "a"), + Token("NL_OR_COMMENT", "\n "), + Token("PLUS", "+"), + Token("NAME", "b"), + ] + result = self._run(tokens) + self.assertEqual(len(result), 3) + self.assertEqual(result[1].type, "PLUS") + self.assertEqual(str(result[1]), "\n +") + + def test_newline_before_non_operator_is_preserved(self): + tokens = [ + Token("NAME", "a"), + Token("NL_OR_COMMENT", "\n"), + Token("NAME", "b"), + ] + result = self._run(tokens) + self.assertEqual(len(result), 3) + self.assertEqual(result[1].type, "NL_OR_COMMENT") + + def test_consecutive_newlines_first_yielded(self): + tokens = [ + Token("NL_OR_COMMENT", "\n"), + Token("NL_OR_COMMENT", "\n "), + Token("PLUS", "+"), + ] + result = self._run(tokens) + self.assertEqual(len(result), 2) + self.assertEqual(result[0].type, "NL_OR_COMMENT") + self.assertEqual(str(result[0]), "\n") + self.assertEqual(result[1].type, "PLUS") + self.assertEqual(str(result[1]), "\n +") + + def test_trailing_newline_is_yielded(self): + tokens = [Token("NAME", "a"), Token("NL_OR_COMMENT", "\n")] + result = self._run(tokens) + self.assertEqual(len(result), 2) + self.assertEqual(result[1].type, "NL_OR_COMMENT") + + def test_all_operator_types_are_merged(self): + for op_type in 
sorted(OPERATOR_TYPES): + with self.subTest(op_type=op_type): + tokens = [ + Token("NL_OR_COMMENT", "\n"), + Token(op_type, "x"), + ] + result = self._run(tokens) + self.assertEqual(len(result), 1) + self.assertEqual(result[0].type, op_type) + self.assertTrue(str(result[0]).startswith("\n")) + + def test_minus_not_in_operator_types(self): + self.assertNotIn("MINUS", OPERATOR_TYPES) + + +class TestMultilineOperatorParsing(TestCase): + """Test that HCL2 snippets with multiline operators parse correctly.""" + + def test_multiline_ternary(self): + hcl = 'x = (\n a\n ? "yes"\n : "no"\n)\n' + result = loads(hcl) + self.assertEqual(result["x"], '${(a ? "yes" : "no")}') + + def test_multiline_logical_or(self): + hcl = "x = (\n a\n || b\n)\n" + result = loads(hcl) + self.assertEqual(result["x"], "${(a || b)}") + + def test_multiline_logical_and(self): + hcl = "x = (\n a\n && b\n)\n" + result = loads(hcl) + self.assertEqual(result["x"], "${(a && b)}") + + def test_multiline_equality(self): + hcl = "x = (\n a\n == b\n)\n" + result = loads(hcl) + self.assertEqual(result["x"], "${(a == b)}") + + def test_multiline_not_equal(self): + hcl = "x = (\n a\n != b\n)\n" + result = loads(hcl) + self.assertEqual(result["x"], "${(a != b)}") + + def test_multiline_comparison(self): + hcl = "x = (\n a\n >= b\n)\n" + result = loads(hcl) + self.assertEqual(result["x"], "${(a >= b)}") + + def test_multiline_addition(self): + hcl = "x = (\n a\n + b\n)\n" + result = loads(hcl) + self.assertEqual(result["x"], "${(a + b)}") + + def test_multiline_multiplication(self): + hcl = "x = (\n a\n * b\n)\n" + result = loads(hcl) + self.assertEqual(result["x"], "${(a * b)}") + + def test_multiline_chained_operators(self): + hcl = "x = (\n a\n && b\n && c\n)\n" + result = loads(hcl) + self.assertEqual(result["x"], "${(a && b && c)}") + + def test_multiline_nested_ternary(self): + hcl = 'x = (\n a\n ? b\n : c == "d"\n ? "e"\n : f\n)\n' + result = loads(hcl) + self.assertEqual(result["x"], '${(a ? 
b : c == "d" ? "e" : f)}') + + def test_minus_on_new_line_is_separate_attribute(self): + """MINUS is excluded from merging — newline before - starts a new statement.""" + hcl = "a = 1\nb = -2\n" + result = loads(hcl) + self.assertEqual(result["a"], 1) + self.assertIn("b", result) + + def test_single_line_operators_still_work(self): + hcl = "x = a + b\n" + result = loads(hcl) + self.assertEqual(result["x"], "${a + b}") diff --git a/test/unit/test_reconstruct_ast.py b/test/unit/test_reconstruct_ast.py deleted file mode 100644 index b9545def..00000000 --- a/test/unit/test_reconstruct_ast.py +++ /dev/null @@ -1,112 +0,0 @@ -""" Test reconstructing hcl files""" - -import json -from pathlib import Path -from unittest import TestCase - -import hcl2 - - -HELPERS_DIR = Path(__file__).absolute().parent.parent / "helpers" -HCL2_DIR = HELPERS_DIR / "terraform-config" -HCL2_FILES = [str(file.relative_to(HCL2_DIR)) for file in HCL2_DIR.iterdir()] -JSON_DIR = HELPERS_DIR / "terraform-config-json" - - -class TestReconstruct(TestCase): - """Test reconstructing a variety of hcl files""" - - # print any differences fully to the console - maxDiff = None - - def test_write_terraform(self): - """Test reconstructing a set of hcl2 files, to make sure they parse to the same structure""" - for hcl_path in HCL2_FILES: - yield self.check_terraform, hcl_path - - def test_write_terraform_exact(self): - """ - Test reconstructing a set of hcl2 files, to make sure they - reconstruct exactly the same, including whitespace. - """ - - # the reconstruction process is not precise, so some files do not - # reconstruct their whitespace exactly the same, but they are - # syntactically equivalent. This list is a target for further - # improvements to the whitespace handling of the reconstruction - # algorithm. 
- inexact_files = [ - # the reconstructor loses commas on the last element in an array, - # even if they're in the input file - "iam.tf", - "variables.tf", - # the reconstructor doesn't preserve indentation within comments - # perfectly - "multiline_expressions.tf", - # the reconstructor doesn't preserve the line that a ternary is - # broken on. - "route_table.tf", - ] - - for hcl_path in HCL2_FILES: - if hcl_path not in inexact_files: - yield self.check_whitespace, hcl_path - - def check_terraform(self, hcl_path_str: str): - """ - Loads a single hcl2 file, parses it, reconstructs it, - parses the reconstructed file, and compares with the expected json - """ - hcl_path = (HCL2_DIR / hcl_path_str).absolute() - json_path = JSON_DIR / hcl_path.relative_to(HCL2_DIR).with_suffix(".json") - with hcl_path.open("r") as hcl_file, json_path.open("r") as json_file: - hcl_file_content = hcl_file.read() - try: - hcl_ast = hcl2.parses(hcl_file_content) - except Exception as exc: - assert False, f"failed to tokenize terraform in `{hcl_path_str}`: {exc}" - - try: - hcl_reconstructed = hcl2.writes(hcl_ast) - except Exception as exc: - assert ( - False - ), f"failed to reconstruct terraform in `{hcl_path_str}`: {exc}" - - try: - hcl2_dict = hcl2.loads(hcl_reconstructed) - except Exception as exc: - assert ( - False - ), f"failed to tokenize terraform in file reconstructed from `{hcl_path_str}`: {exc}" - - json_dict = json.load(json_file) - self.assertDictEqual( - hcl2_dict, - json_dict, - f"failed comparing {hcl_path_str} with reconstructed version", - ) - - def check_whitespace(self, hcl_path_str: str): - """Tests that the reconstructed file matches the original file exactly.""" - hcl_path = (HCL2_DIR / hcl_path_str).absolute() - with hcl_path.open("r") as hcl_file: - hcl_file_content = hcl_file.read() - try: - hcl_ast = hcl2.parses(hcl_file_content) - except Exception as exc: - assert False, f"failed to tokenize terraform in `{hcl_path_str}`: {exc}" - - try: - hcl_reconstructed = 
hcl2.writes(hcl_ast) - except Exception as exc: - assert ( - False - ), f"failed to reconstruct terraform in `{hcl_path_str}`: {exc}" - - self.assertMultiLineEqual( - hcl_reconstructed, - hcl_file_content, - f"file {hcl_path_str} does not match its reconstructed version \ - exactly. this is usually whitespace related.", - ) diff --git a/test/unit/test_reconstruct_dict.py b/test/unit/test_reconstruct_dict.py deleted file mode 100644 index a65e8429..00000000 --- a/test/unit/test_reconstruct_dict.py +++ /dev/null @@ -1,88 +0,0 @@ -""" Test reconstructing hcl files""" - -import json -import traceback -from pathlib import Path -from unittest import TestCase - -import hcl2 - - -HELPERS_DIR = Path(__file__).absolute().parent.parent / "helpers" -HCL2_DIR = HELPERS_DIR / "terraform-config" -HCL2_FILES = [str(file.relative_to(HCL2_DIR)) for file in HCL2_DIR.iterdir()] -JSON_DIR = HELPERS_DIR / "terraform-config-json" - - -class TestReconstruct(TestCase): - """Test reconstructing a variety of hcl files""" - - # print any differences fully to the console - maxDiff = None - - def test_write_terraform(self): - """Test reconstructing a set of hcl2 files, to make sure they parse to the same structure""" - - # the reconstruction process is not precise, so some files do not - # reconstruct any embedded HCL expressions exactly the same. this - # list captures those, and should be manually inspected regularly to - # ensure that files remain syntactically equivalent - inexact_files = [ - # one level of interpolation is stripped from this file during - # reconstruction, since we don't have a way to distinguish it from - # a complex HCL expression. 
the output parses to the same value - # though - "multi_level_interpolation.tf", - ] - - for hcl_path in HCL2_FILES: - if hcl_path not in inexact_files: - yield self.check_terraform, hcl_path - - def check_terraform(self, hcl_path_str: str): - """ - Loads a single hcl2 file, parses it, reconstructs it, - parses the reconstructed file, and compares with the expected json - """ - hcl_path = (HCL2_DIR / hcl_path_str).absolute() - json_path = JSON_DIR / hcl_path.relative_to(HCL2_DIR).with_suffix(".json") - with hcl_path.open("r") as hcl_file, json_path.open("r") as json_file: - try: - hcl2_dict_correct = hcl2.load(hcl_file) - except Exception as exc: - raise RuntimeError( - f"failed to tokenize 'correct' terraform in " - f"`{hcl_path_str}`: {traceback.format_exc()}" - ) from exc - - json_dict = json.load(json_file) - - try: - hcl_ast = hcl2.reverse_transform(json_dict) - except Exception as exc: - raise RuntimeError( - f"failed to reverse transform HCL from " - f"`{json_path.name}`: {traceback.format_exc()}" - ) from exc - - try: - hcl_reconstructed = hcl2.writes(hcl_ast) - except Exception as exc: - raise RuntimeError( - f"failed to reconstruct terraform from AST from " - f"`{json_path.name}`: {traceback.format_exc()}" - ) from exc - - try: - hcl2_dict_reconstructed = hcl2.loads(hcl_reconstructed) - except Exception as exc: - raise RuntimeError( - f"failed to tokenize 'reconstructed' terraform from AST from " - f"`{json_path.name}`: {exc}, \n{hcl_reconstructed}" - ) from exc - - self.assertDictEqual( - hcl2_dict_reconstructed, - hcl2_dict_correct, - f"failed comparing {hcl_path_str} with reconstructed version from {json_path.name}", - ) diff --git a/test/unit/test_reconstructor.py b/test/unit/test_reconstructor.py new file mode 100644 index 00000000..e9f46900 --- /dev/null +++ b/test/unit/test_reconstructor.py @@ -0,0 +1,1084 @@ +# pylint: disable=C0103,C0114,C0115,C0116,C0302 +"""Unit tests for hcl2.reconstructor.""" + +from unittest import TestCase + +from lark 
import Tree, Token + +from hcl2.reconstructor import HCLReconstructor +from hcl2.rules.base import BlockRule, AttributeRule, BodyRule, StartRule +from hcl2.rules.containers import ( + ObjectRule, + ObjectElemRule, + ObjectElemKeyRule, + TupleRule, +) +from hcl2.rules.expressions import ( + BinaryTermRule, + ExprTermRule, + ConditionalRule, + UnaryOpRule, + BinaryOpRule, +) +from hcl2.rules.for_expressions import ( + ForIntroRule, + ForCondRule, + ForTupleExprRule, + ForObjectExprRule, +) +from hcl2.rules.literal_rules import BinaryOperatorRule, IdentifierRule +from hcl2.rules.strings import StringRule +from hcl2.rules.tokens import ( + NAME, + NL_OR_COMMENT, + EQ, + LBRACE, + RBRACE, + LSQB, + RSQB, + COMMA, + COLON, + QMARK, + FOR, + IN, + IF, + ELLIPSIS, + FOR_OBJECT_ARROW, + DBLQUOTE, + STRING_CHARS, + BINARY_OP, +) +from hcl2.rules.whitespace import NewLineOrCommentRule + + +# --- helpers --- + + +def _r(): + """Create a fresh HCLReconstructor.""" + return HCLReconstructor() + + +def _make_identifier(name): + return IdentifierRule([NAME(name)]) + + +def _make_expr_term(child): + return ExprTermRule([child]) + + +def _make_nlc(value): + """Build a NewLineOrCommentRule with the given string value.""" + return NewLineOrCommentRule([NL_OR_COMMENT(value)]) + + +def _make_attribute(name, value_str="val"): + return AttributeRule( + [ + _make_identifier(name), + EQ(), + _make_expr_term(_make_identifier(value_str)), + ] + ) + + +def _make_string(text): + return StringRule([DBLQUOTE(), STRING_CHARS(text), DBLQUOTE()]) + + +def _make_block(type_name, labels=None, body_children=None): + body = BodyRule(body_children or []) + children = [_make_identifier(type_name)] + for label in labels or []: + children.append(label) + children += [LBRACE(), body, RBRACE()] + return BlockRule(children) + + +def _make_object_elem(key_name, value_name, separator=None): + sep = separator or EQ() + key = ObjectElemKeyRule([_make_identifier(key_name)]) + val = 
ExprTermRule([_make_identifier(value_name)]) + return ObjectElemRule([key, sep, val]) + + +def _make_object(elems, trailing_commas=True): + children = [LBRACE()] + for elem in elems: + children.append(elem) + if trailing_commas: + children.append(COMMA()) + children.append(RBRACE()) + return ObjectRule(children) + + +def _make_tuple(elements, trailing_commas=True): + children = [LSQB()] + for elem in elements: + children.append(elem) + if trailing_commas: + children.append(COMMA()) + children.append(RSQB()) + return TupleRule(children) + + +def _to_lark(rule): + """Convert a LarkElement tree to a Lark Tree for the reconstructor.""" + return rule.to_lark() + + +def _reconstruct(rule, postproc=None): + """Helper: convert rule to Lark tree and reconstruct to string.""" + r = _r() + return r.reconstruct(_to_lark(rule), postproc=postproc) + + +# --- HCLReconstructor basic behavior --- + + +class TestReconstructorResetState(TestCase): + def test_reset_clears_all_state(self): + r = _r() + r._last_was_space = False + r._current_indent = 3 + r._last_token_name = "NAME" + r._last_rule_name = "identifier" + r._reset_state() + self.assertTrue(r._last_was_space) + self.assertEqual(r._current_indent, 0) + self.assertIsNone(r._last_token_name) + self.assertIsNone(r._last_rule_name) + + def test_reconstruct_resets_state_each_call(self): + r = _r() + tree1 = Tree( + "start", + [Tree("body", [Token("NAME", "a"), Token("EQ", "="), Token("NAME", "b")])], + ) + tree2 = Tree("start", [Tree("body", [Token("NAME", "x")])]) + r.reconstruct(tree1) + result = r.reconstruct(tree2) + # Second call should not be affected by state from first + self.assertEqual(result, "x\n") + + +class TestReconstructorTrailingNewline(TestCase): + def test_result_ends_with_newline(self): + tree = Tree("start", [Tree("body", [Token("NAME", "x")])]) + r = _r() + result = r.reconstruct(tree) + self.assertTrue(result.endswith("\n")) + + def test_empty_body_returns_empty_string(self): + tree = Tree("start", 
[Tree("body", [])]) + r = _r() + result = r.reconstruct(tree) + self.assertEqual(result, "") + + def test_already_has_newline_not_doubled(self): + tree = Tree("start", [Tree("body", [Token("NL_OR_COMMENT", "\n")])]) + r = _r() + result = r.reconstruct(tree) + self.assertEqual(result, "\n") + + +class TestReconstructorPostproc(TestCase): + def test_postproc_applied(self): + tree = Tree("start", [Tree("body", [Token("NAME", "hello")])]) + r = _r() + result = r.reconstruct(tree, postproc=lambda s: s.upper()) + self.assertEqual(result, "HELLO\n") + + def test_postproc_none_is_noop(self): + tree = Tree("start", [Tree("body", [Token("NAME", "hello")])]) + r = _r() + result = r.reconstruct(tree) + self.assertEqual(result, "hello\n") + + +# --- Space insertion: tokens --- + + +class TestSpaceBeforeToken(TestCase): + def test_no_space_at_beginning(self): + """First token should not get a leading space.""" + r = _r() + token = Token("NAME", "x") + self.assertFalse(r._should_add_space_before(token)) + + def test_no_space_when_last_was_space(self): + r = _r() + r._last_was_space = True + r._last_token_name = "NAME" + token = Token("NAME", "y") + self.assertFalse(r._should_add_space_before(token)) + + def test_space_before_lbrace_in_block(self): + r = _r() + r._last_was_space = False + r._last_token_name = "NAME" + token = Token("LBRACE", "{") + self.assertTrue( + r._should_add_space_before(token, parent_rule_name=BlockRule.lark_name()) + ) + + def test_no_space_before_lbrace_outside_block(self): + r = _r() + r._last_was_space = False + r._last_token_name = "NAME" + token = Token("LBRACE", "{") + self.assertFalse(r._should_add_space_before(token, parent_rule_name="object")) + + def test_no_space_default_for_unmatched_token(self): + r = _r() + r._last_was_space = False + r._last_token_name = "LPAR" + self.assertFalse(r._should_add_space_before(Token("RBRACE", "}"), None)) + + +class TestSpaceAroundEq(TestCase): + def test_space_before_eq(self): + r = _r() + r._last_was_space = 
False + r._last_token_name = "NAME" + token = Token("EQ", "=") + self.assertTrue(r._should_add_space_before(token)) + + def test_space_after_eq(self): + r = _r() + r._last_was_space = False + r._last_token_name = "EQ" + token = Token("NAME", "value") + self.assertTrue(r._should_add_space_before(token)) + + +class TestSpaceAroundBinaryOps(TestCase): + def test_space_before_binary_op(self): + r = _r() + r._last_was_space = False + r._last_token_name = "NAME" + for op in [ + "PLUS", + "MINUS", + "ASTERISK", + "SLASH", + "DOUBLE_EQ", + "NEQ", + "LT", + "GT", + "LEQ", + "GEQ", + "PERCENT", + "DOUBLE_AMP", + "DOUBLE_PIPE", + ]: + token = Token(op, "+") + self.assertTrue( + r._should_add_space_before(token), + f"Expected space before {op}", + ) + + def test_space_after_binary_op(self): + r = _r() + r._last_was_space = False + for op in ["PLUS", "MINUS", "DOUBLE_EQ"]: + r._last_token_name = op + token = Token("NAME", "x") + self.assertTrue( + r._should_add_space_before(token), + f"Expected space after {op}", + ) + + def test_no_space_in_unary_op(self): + r = _r() + r._last_was_space = False + r._last_token_name = "MINUS" + token = Token("NAME", "x") + self.assertFalse( + r._should_add_space_before(token, parent_rule_name=UnaryOpRule.lark_name()) + ) + + +class TestSpaceAroundConditional(TestCase): + def test_space_before_qmark(self): + r = _r() + r._last_was_space = False + r._last_token_name = "NAME" + token = Token("QMARK", "?") + self.assertTrue( + r._should_add_space_before( + token, parent_rule_name=ConditionalRule.lark_name() + ) + ) + + def test_space_before_colon(self): + r = _r() + r._last_was_space = False + r._last_token_name = "NAME" + token = Token("COLON", ":") + self.assertTrue( + r._should_add_space_before( + token, parent_rule_name=ConditionalRule.lark_name() + ) + ) + + def test_space_after_qmark(self): + r = _r() + r._last_was_space = False + r._last_token_name = "QMARK" + token = Token("NAME", "x") + self.assertTrue( + r._should_add_space_before( + 
token, parent_rule_name=ConditionalRule.lark_name() + ) + ) + + def test_space_after_colon(self): + r = _r() + r._last_was_space = False + r._last_token_name = "COLON" + token = Token("NAME", "x") + self.assertTrue( + r._should_add_space_before( + token, parent_rule_name=ConditionalRule.lark_name() + ) + ) + + def test_no_space_qmark_outside_conditional(self): + r = _r() + r._last_was_space = False + r._last_token_name = "NAME" + self.assertFalse(r._should_add_space_before(Token("QMARK", "?"), None)) + + +class TestSpaceAroundComma(TestCase): + def test_space_after_comma_before_name(self): + r = _r() + r._last_was_space = False + r._last_token_name = "COMMA" + token = Token("NAME", "x") + self.assertTrue(r._should_add_space_before(token)) + + def test_no_space_after_comma_before_rsqb(self): + r = _r() + r._last_was_space = False + r._last_token_name = "COMMA" + token = Token("RSQB", "]") + self.assertFalse(r._should_add_space_before(token)) + + def test_no_space_after_comma_before_nl(self): + r = _r() + r._last_was_space = False + r._last_token_name = "COMMA" + token = Token("NL_OR_COMMENT", "\n") + self.assertFalse(r._should_add_space_before(token)) + + +class TestSpaceAroundForKeywords(TestCase): + def test_space_before_for(self): + r = _r() + r._last_was_space = False + r._last_token_name = "LSQB" + token = Token("FOR", "for") + self.assertTrue(r._should_add_space_before(token)) + + def test_space_before_in(self): + r = _r() + r._last_was_space = False + r._last_token_name = "NAME" + token = Token("IN", "in") + self.assertTrue(r._should_add_space_before(token)) + + def test_space_before_if(self): + r = _r() + r._last_was_space = False + r._last_token_name = "NAME" + token = Token("IF", "if") + self.assertTrue(r._should_add_space_before(token)) + + def test_space_after_for(self): + r = _r() + r._last_was_space = False + r._last_token_name = "FOR" + token = Token("NAME", "x") + self.assertTrue(r._should_add_space_before(token)) + + def test_space_after_in(self): + 
r = _r() + r._last_was_space = False + r._last_token_name = "IN" + token = Token("NAME", "items") + self.assertTrue(r._should_add_space_before(token)) + + def test_space_after_if(self): + r = _r() + r._last_was_space = False + r._last_token_name = "IF" + token = Token("NAME", "cond") + self.assertTrue(r._should_add_space_before(token)) + + def test_no_space_after_for_before_nl(self): + r = _r() + r._last_was_space = False + r._last_token_name = "FOR" + token = Token("NL_OR_COMMENT", "\n") + self.assertFalse(r._should_add_space_before(token)) + + def test_no_space_after_in_before_nl(self): + r = _r() + r._last_was_space = False + r._last_token_name = "IN" + self.assertFalse(r._should_add_space_before(Token("NL_OR_COMMENT", "\n"))) + + def test_no_space_after_if_before_nl(self): + r = _r() + r._last_was_space = False + r._last_token_name = "IF" + self.assertFalse(r._should_add_space_before(Token("NL_OR_COMMENT", "\n"))) + + +class TestSpaceAroundForObjectArrow(TestCase): + def test_space_before_arrow(self): + r = _r() + r._last_was_space = False + r._last_token_name = "NAME" + token = Token("FOR_OBJECT_ARROW", "=>") + self.assertTrue(r._should_add_space_before(token)) + + def test_space_after_arrow(self): + r = _r() + r._last_was_space = False + r._last_token_name = "FOR_OBJECT_ARROW" + token = Token("NAME", "v") + self.assertTrue(r._should_add_space_before(token)) + + +class TestSpaceAroundEllipsis(TestCase): + def test_space_before_ellipsis(self): + r = _r() + r._last_was_space = False + r._last_token_name = "NAME" + token = Token("ELLIPSIS", "...") + self.assertTrue(r._should_add_space_before(token)) + + def test_space_after_ellipsis(self): + r = _r() + r._last_was_space = False + r._last_token_name = "ELLIPSIS" + token = Token("NAME", "x") + self.assertTrue(r._should_add_space_before(token)) + + +class TestSpaceColonInForIntro(TestCase): + def test_space_before_colon_in_for_intro(self): + r = _r() + r._last_was_space = False + r._last_token_name = "NAME" + token 
= Token("COLON", ":") + self.assertTrue( + r._should_add_space_before(token, parent_rule_name=ForIntroRule.lark_name()) + ) + + def test_no_space_colon_outside_for_intro_and_conditional(self): + r = _r() + r._last_was_space = False + r._last_token_name = "LPAR" + self.assertFalse(r._should_add_space_before(Token("COLON", ":"), None)) + + +# --- Space insertion: tree nodes --- + + +class TestSpaceBeforeTree(TestCase): + def test_space_between_labels_in_block(self): + """Space between identifier labels within a block.""" + r = _r() + r._last_was_space = False + r._last_token_name = "NAME" + r._last_rule_name = IdentifierRule.lark_name() + tree = Tree(IdentifierRule.lark_name(), [Token("NAME", "label2")]) + self.assertTrue( + r._should_add_space_before(tree, parent_rule_name=BlockRule.lark_name()) + ) + + def test_space_between_string_and_identifier_in_block(self): + r = _r() + r._last_was_space = False + r._last_token_name = "DBLQUOTE" + r._last_rule_name = StringRule.lark_name() + tree = Tree(IdentifierRule.lark_name(), [Token("NAME", "label")]) + self.assertTrue( + r._should_add_space_before(tree, parent_rule_name=BlockRule.lark_name()) + ) + + def test_no_space_between_labels_outside_block(self): + r = _r() + r._last_was_space = False + r._last_token_name = "NAME" + r._last_rule_name = IdentifierRule.lark_name() + tree = Tree(IdentifierRule.lark_name(), [Token("NAME", "x")]) + self.assertFalse(r._should_add_space_before(tree, parent_rule_name="attribute")) + + def test_no_space_for_non_label_tree_in_block(self): + r = _r() + r._last_was_space = False + r._last_token_name = "NAME" + r._last_rule_name = StringRule.lark_name() + self.assertFalse( + r._should_add_space_before( + Tree("expr_term", []), parent_rule_name=BlockRule.lark_name() + ) + ) + + def test_space_after_qmark_before_tree_in_conditional(self): + r = _r() + r._last_was_space = False + r._last_token_name = "QMARK" + tree = Tree("expr_term", [Token("NAME", "x")]) + self.assertTrue( + 
r._should_add_space_before( + tree, parent_rule_name=ConditionalRule.lark_name() + ) + ) + + def test_space_after_colon_before_tree_in_conditional(self): + r = _r() + r._last_was_space = False + r._last_token_name = "COLON" + tree = Tree("expr_term", [Token("NAME", "x")]) + self.assertTrue( + r._should_add_space_before( + tree, parent_rule_name=ConditionalRule.lark_name() + ) + ) + + def test_no_space_after_other_token_before_tree_in_conditional(self): + r = _r() + r._last_was_space = False + r._last_token_name = "LPAR" + self.assertFalse( + r._should_add_space_before( + Tree("expr_term", []), parent_rule_name=ConditionalRule.lark_name() + ) + ) + + def test_space_after_colon_before_tree_in_for_tuple_expr(self): + r = _r() + r._last_was_space = False + r._last_token_name = "COLON" + tree = Tree("expr_term", [Token("NAME", "x")]) + self.assertTrue( + r._should_add_space_before( + tree, parent_rule_name=ForTupleExprRule.lark_name() + ) + ) + + def test_space_after_colon_before_tree_in_for_object_expr(self): + r = _r() + r._last_was_space = False + r._last_token_name = "COLON" + tree = Tree("expr_term", [Token("NAME", "x")]) + self.assertTrue( + r._should_add_space_before( + tree, parent_rule_name=ForObjectExprRule.lark_name() + ) + ) + + def test_no_space_after_colon_before_nlc_in_for_expr(self): + r = _r() + r._last_was_space = False + r._last_token_name = "COLON" + tree = Tree("new_line_or_comment", [Token("NL_OR_COMMENT", "\n")]) + self.assertFalse( + r._should_add_space_before( + tree, parent_rule_name=ForTupleExprRule.lark_name() + ) + ) + + def test_no_space_after_colon_outside_for_expr(self): + r = _r() + r._last_was_space = False + r._last_token_name = "COLON" + self.assertFalse(r._should_add_space_before(Tree("expr_term", []), None)) + + def test_no_space_default_for_unmatched_tree(self): + r = _r() + r._last_was_space = False + r._last_token_name = "LPAR" + self.assertFalse(r._should_add_space_before(Tree("body", []), None)) + + +# --- _reconstruct_token 
--- + + +class TestReconstructToken(TestCase): + def test_simple_token(self): + r = _r() + token = Token("NAME", "hello") + result = r._reconstruct_token(token) + self.assertEqual(result, "hello") + + def test_updates_last_token_name(self): + r = _r() + token = Token("NAME", "hello") + r._reconstruct_token(token) + self.assertEqual(r._last_token_name, "NAME") + + def test_updates_last_was_space_for_newline(self): + r = _r() + token = Token("NL_OR_COMMENT", "\n") + r._reconstruct_token(token) + self.assertTrue(r._last_was_space) + + def test_updates_last_was_space_trailing_space(self): + r = _r() + r._reconstruct_token(Token("NAME", "x ")) + self.assertTrue(r._last_was_space) + + def test_updates_last_was_space_false(self): + r = _r() + token = Token("NAME", "hello") + r._reconstruct_token(token) + self.assertFalse(r._last_was_space) + + def test_space_prepended_when_needed(self): + r = _r() + r._last_was_space = False + r._last_token_name = "NAME" + token = Token("EQ", "=") + result = r._reconstruct_token(token) + self.assertEqual(result, " =") + + def test_empty_token_skips_last_was_space_update(self): + r = _r() + initial = r._last_was_space + r._reconstruct_token(Token("NAME", "")) + self.assertEqual(r._last_was_space, initial) + + +# --- _reconstruct_node --- + + +class TestReconstructNode(TestCase): + def test_token_returns_list_of_one(self): + r = _r() + token = Token("NAME", "x") + result = r._reconstruct_node(token) + self.assertIsInstance(result, list) + self.assertEqual(len(result), 1) + self.assertEqual(result[0], "x") + + def test_tree_returns_list(self): + r = _r() + tree = Tree("identifier", [Token("NAME", "x")]) + result = r._reconstruct_node(tree) + self.assertIsInstance(result, list) + self.assertEqual("".join(result), "x") + + def test_fallback_non_tree_non_token(self): + r = _r() + result = r._reconstruct_node(42) + self.assertEqual(result, ["42"]) + + +# --- _reconstruct_tree --- + + +class TestReconstructTree(TestCase): + def 
test_simple_tree(self): + r = _r() + tree = Tree("identifier", [Token("NAME", "myvar")]) + result = r._reconstruct_tree(tree) + self.assertEqual("".join(result), "myvar") + + def test_updates_last_rule_name(self): + r = _r() + tree = Tree("identifier", [Token("NAME", "x")]) + r._reconstruct_tree(tree) + self.assertEqual(r._last_rule_name, "identifier") + + def test_nested_tree(self): + r = _r() + inner = Tree("identifier", [Token("NAME", "x")]) + outer = Tree("expr_term", [inner]) + result = r._reconstruct_tree(outer) + self.assertEqual("".join(result), "x") + + def test_unary_op_no_space_between_op_and_operand(self): + r = _r() + tree = Tree( + UnaryOpRule.lark_name(), + [ + Token("MINUS", "-"), + Tree("expr_term", [Token("NAME", "x")]), + ], + ) + result = r._reconstruct_tree(tree) + self.assertEqual("".join(result), "-x") + + def test_empty_tree_returns_empty_list(self): + r = _r() + tree = Tree("body", []) + result = r._reconstruct_tree(tree) + self.assertEqual(result, []) + + def test_updates_last_was_space_for_trailing_newline(self): + r = _r() + tree = Tree("new_line_or_comment", [Token("NL_OR_COMMENT", "\n")]) + r._reconstruct_tree(tree) + self.assertTrue(r._last_was_space) + + def test_updates_last_was_space_for_trailing_non_space(self): + r = _r() + tree = Tree("identifier", [Token("NAME", "x")]) + r._reconstruct_tree(tree) + self.assertFalse(r._last_was_space) + + def test_tree_prepends_space_for_block_labels(self): + r = _r() + r._last_was_space = False + r._last_token_name = "NAME" + r._last_rule_name = IdentifierRule.lark_name() + tree = Tree(IdentifierRule.lark_name(), [Token("NAME", "b")]) + result = r._reconstruct_tree(tree, parent_rule_name=BlockRule.lark_name()) + text = "".join(result) + self.assertEqual(text, " b") + + +# --- End-to-end reconstruction via LarkElement.to_lark() --- + + +class TestReconstructAttribute(TestCase): + def test_simple_attribute(self): + attr = _make_attribute("name", "value") + body = BodyRule([attr]) + start = 
StartRule([body]) + result = _reconstruct(start) + self.assertEqual(result, "name = value\n") + + +class TestReconstructBlock(TestCase): + def test_empty_block(self): + block = _make_block("resource") + body = BodyRule([block]) + start = StartRule([body]) + result = _reconstruct(start) + self.assertEqual(result, "resource {}\n") + + def test_block_with_string_label(self): + block = _make_block("resource", labels=[_make_string("aws_instance")]) + body = BodyRule([block]) + start = StartRule([body]) + result = _reconstruct(start) + self.assertIn('resource "aws_instance"', result) + self.assertIn("{}", result) + + def test_block_with_identifier_labels(self): + block = _make_block( + "resource", + labels=[_make_identifier("aws_instance"), _make_string("example")], + ) + body = BodyRule([block]) + start = StartRule([body]) + result = _reconstruct(start) + self.assertIn("resource", result) + self.assertIn("aws_instance", result) + self.assertIn('"example"', result) + + def test_block_with_body(self): + attr = _make_attribute("ami", "abc") + nlc = _make_nlc("\n ") + nlc2 = _make_nlc("\n") + block = _make_block("resource", body_children=[nlc, attr, nlc2]) + body = BodyRule([block]) + start = StartRule([body]) + result = _reconstruct(start) + self.assertIn("resource {", result) + self.assertIn("ami = abc", result) + self.assertIn("}", result) + + +class TestReconstructConditional(TestCase): + def test_conditional_expression(self): + # condition ? true_val : false_val + cond = ConditionalRule( + [ + _make_expr_term(_make_identifier("enabled")), + QMARK(), + _make_expr_term(_make_identifier("yes")), + COLON(), + _make_expr_term(_make_identifier("no")), + ] + ) + attr = AttributeRule([_make_identifier("result"), EQ(), _make_expr_term(cond)]) + body = BodyRule([attr]) + start = StartRule([body]) + result = _reconstruct(start) + self.assertIn("enabled ? 
yes : no", result) + + +class TestReconstructUnaryOp(TestCase): + def test_negation(self): + unary = UnaryOpRule( + [ + BinaryOperatorRule([BINARY_OP("-")]), + _make_expr_term(_make_identifier("x")), + ] + ) + attr = AttributeRule([_make_identifier("val"), EQ(), _make_expr_term(unary)]) + body = BodyRule([attr]) + start = StartRule([body]) + result = _reconstruct(start) + self.assertIn("-x", result) + # Should NOT have a space between - and x + self.assertNotIn("- x", result) + + +class TestReconstructBinaryOp(TestCase): + def test_addition_raw_lark_tree(self): + """Test binary op spacing using raw Lark tokens (as the parser produces).""" + r = _r() + # Raw Lark tree with PLUS token type (as the Lark parser produces) + tree = Tree( + "start", + [ + Tree( + "body", + [ + Tree( + "attribute", + [ + Tree("identifier", [Token("NAME", "sum")]), + Token("EQ", "="), + Tree( + "binary_op", + [ + Tree( + "expr_term", + [ + Tree( + "identifier", [Token("NAME", "a")] + ), + ], + ), + Tree( + "binary_term", + [ + Tree( + "binary_operator", + [Token("PLUS", "+")], + ), + Tree( + "expr_term", + [ + Tree( + "identifier", + [Token("NAME", "b")], + ), + ], + ), + ], + ), + Tree( + "new_line_or_comment", + [Token("NL_OR_COMMENT", "\n")], + ), + ], + ), + ], + ), + ], + ), + ], + ) + result = r.reconstruct(tree) + self.assertIn("a + b", result) + + def test_addition_via_to_lark(self): + """Test binary op via LarkElement.to_lark() produces valid output.""" + binary = BinaryOpRule( + [ + _make_expr_term(_make_identifier("a")), + BinaryTermRule( + [ + BinaryOperatorRule([BINARY_OP("+")]), + _make_expr_term(_make_identifier("b")), + ] + ), + _make_nlc("\n"), + ] + ) + attr = AttributeRule([_make_identifier("sum"), EQ(), _make_expr_term(binary)]) + body = BodyRule([attr]) + start = StartRule([body]) + result = _reconstruct(start) + self.assertIn("sum", result) + self.assertIn("a", result) + self.assertIn("+", result) + self.assertIn("b", result) + + +class 
TestReconstructForTupleExpr(TestCase): + def test_basic_for_tuple(self): + intro = ForIntroRule( + [ + FOR(), + _make_identifier("item"), + IN(), + _make_expr_term(_make_identifier("items")), + COLON(), + ] + ) + expr = ForTupleExprRule( + [ + LSQB(), + intro, + _make_expr_term(_make_identifier("item")), + RSQB(), + ] + ) + attr = AttributeRule([_make_identifier("result"), EQ(), _make_expr_term(expr)]) + body = BodyRule([attr]) + start = StartRule([body]) + result = _reconstruct(start) + self.assertIn("for item in items :", result) + self.assertIn("item", result) + self.assertIn("[", result) + self.assertIn("]", result) + + def test_for_tuple_with_condition(self): + intro = ForIntroRule( + [ + FOR(), + _make_identifier("item"), + IN(), + _make_expr_term(_make_identifier("items")), + COLON(), + ] + ) + cond = ForCondRule([IF(), _make_expr_term(_make_identifier("item"))]) + expr = ForTupleExprRule( + [ + LSQB(), + intro, + _make_expr_term(_make_identifier("item")), + cond, + RSQB(), + ] + ) + attr = AttributeRule([_make_identifier("result"), EQ(), _make_expr_term(expr)]) + body = BodyRule([attr]) + start = StartRule([body]) + result = _reconstruct(start) + self.assertIn("if item", result) + + +class TestReconstructForObjectExpr(TestCase): + def test_basic_for_object(self): + intro = ForIntroRule( + [ + FOR(), + _make_identifier("k"), + COMMA(), + _make_identifier("v"), + IN(), + _make_expr_term(_make_identifier("items")), + COLON(), + ] + ) + expr = ForObjectExprRule( + [ + LBRACE(), + intro, + _make_expr_term(_make_identifier("k")), + FOR_OBJECT_ARROW(), + _make_expr_term(_make_identifier("v")), + RBRACE(), + ] + ) + attr = AttributeRule([_make_identifier("result"), EQ(), _make_expr_term(expr)]) + body = BodyRule([attr]) + start = StartRule([body]) + result = _reconstruct(start) + self.assertIn("for k, v in items :", result) + self.assertIn("k => v", result) + + def test_for_object_with_ellipsis(self): + intro = ForIntroRule( + [ + FOR(), + _make_identifier("k"), + 
IN(), + _make_expr_term(_make_identifier("items")), + COLON(), + ] + ) + expr = ForObjectExprRule( + [ + LBRACE(), + intro, + _make_expr_term(_make_identifier("k")), + FOR_OBJECT_ARROW(), + _make_expr_term(_make_identifier("v")), + ELLIPSIS(), + RBRACE(), + ] + ) + attr = AttributeRule([_make_identifier("result"), EQ(), _make_expr_term(expr)]) + body = BodyRule([attr]) + start = StartRule([body]) + result = _reconstruct(start) + self.assertIn("...", result) + + +class TestReconstructTuple(TestCase): + def test_inline_tuple(self): + tup = _make_tuple( + [ + _make_expr_term(_make_identifier("a")), + _make_expr_term(_make_identifier("b")), + ] + ) + attr = AttributeRule([_make_identifier("list"), EQ(), _make_expr_term(tup)]) + body = BodyRule([attr]) + start = StartRule([body]) + result = _reconstruct(start) + self.assertIn("[a, b,]", result) + + +class TestReconstructObject(TestCase): + def test_inline_object(self): + obj = _make_object( + [_make_object_elem("key", "val")], + ) + attr = AttributeRule([_make_identifier("obj"), EQ(), _make_expr_term(obj)]) + body = BodyRule([attr]) + start = StartRule([body]) + result = _reconstruct(start) + self.assertIn("key = val,", result) + self.assertIn("{", result) + self.assertIn("}", result) + + +class TestReconstructMultipleAttributes(TestCase): + def test_two_attributes_with_newlines(self): + attr1 = _make_attribute("a", "1") + attr2 = _make_attribute("b", "2") + nlc = _make_nlc("\n") + body = BodyRule([attr1, nlc, attr2]) + start = StartRule([body]) + result = _reconstruct(start) + self.assertIn("a = 1", result) + self.assertIn("b = 2", result) + lines = result.strip().split("\n") + self.assertEqual(len(lines), 2) + + +class TestReconstructString(TestCase): + def test_quoted_string(self): + s = _make_string("hello world") + attr = AttributeRule([_make_identifier("greeting"), EQ(), _make_expr_term(s)]) + body = BodyRule([attr]) + start = StartRule([body]) + result = _reconstruct(start) + self.assertIn('"hello world"', 
result) diff --git a/test/unit/test_utils.py b/test/unit/test_utils.py new file mode 100644 index 00000000..01954113 --- /dev/null +++ b/test/unit/test_utils.py @@ -0,0 +1,142 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.utils import ( + SerializationOptions, + SerializationContext, + is_dollar_string, + to_dollar_string, + unwrap_dollar_string, + wrap_into_parentheses, +) + + +class TestSerializationOptions(TestCase): + def test_default_values(self): + opts = SerializationOptions() + self.assertTrue(opts.with_comments) + self.assertFalse(opts.with_meta) + self.assertFalse(opts.wrap_objects) + self.assertFalse(opts.wrap_tuples) + self.assertTrue(opts.explicit_blocks) + self.assertTrue(opts.preserve_heredocs) + self.assertFalse(opts.force_operation_parentheses) + + def test_custom_values(self): + opts = SerializationOptions( + with_comments=False, + with_meta=True, + force_operation_parentheses=True, + ) + self.assertFalse(opts.with_comments) + self.assertTrue(opts.with_meta) + self.assertTrue(opts.force_operation_parentheses) + + +class TestSerializationContext(TestCase): + def test_default_values(self): + ctx = SerializationContext() + self.assertFalse(ctx.inside_dollar_string) + self.assertFalse(ctx.inside_parentheses) + + def test_replace_returns_new_instance(self): + ctx = SerializationContext() + new_ctx = ctx.replace(inside_dollar_string=True) + self.assertIsNot(ctx, new_ctx) + self.assertFalse(ctx.inside_dollar_string) + self.assertTrue(new_ctx.inside_dollar_string) + + def test_modify_mutates_and_restores(self): + ctx = SerializationContext() + self.assertFalse(ctx.inside_dollar_string) + + with ctx.modify(inside_dollar_string=True): + self.assertTrue(ctx.inside_dollar_string) + + self.assertFalse(ctx.inside_dollar_string) + + def test_modify_restores_on_exception(self): + ctx = SerializationContext() + + with self.assertRaises(ValueError): + with ctx.modify(inside_dollar_string=True, inside_parentheses=True): + 
self.assertTrue(ctx.inside_dollar_string) + self.assertTrue(ctx.inside_parentheses) + raise ValueError("test") + + self.assertFalse(ctx.inside_dollar_string) + self.assertFalse(ctx.inside_parentheses) + + def test_modify_multiple_fields(self): + ctx = SerializationContext() + with ctx.modify(inside_dollar_string=True, inside_parentheses=True): + self.assertTrue(ctx.inside_dollar_string) + self.assertTrue(ctx.inside_parentheses) + self.assertFalse(ctx.inside_dollar_string) + self.assertFalse(ctx.inside_parentheses) + + +class TestIsDollarString(TestCase): + def test_valid_dollar_string(self): + self.assertTrue(is_dollar_string("${x}")) + + def test_nested_dollar_string(self): + self.assertTrue(is_dollar_string("${a + b}")) + + def test_plain_string(self): + self.assertFalse(is_dollar_string("foo")) + + def test_incomplete_prefix(self): + self.assertFalse(is_dollar_string("${")) + + def test_non_string_input(self): + self.assertFalse(is_dollar_string(42)) + self.assertFalse(is_dollar_string(None)) + + def test_empty_dollar_string(self): + self.assertTrue(is_dollar_string("${}")) + + def test_dollar_without_brace(self): + self.assertFalse(is_dollar_string("$x}")) + + def test_missing_closing_brace(self): + self.assertFalse(is_dollar_string("${x")) + + +class TestToDollarString(TestCase): + def test_wraps_plain_string(self): + self.assertEqual(to_dollar_string("x"), "${x}") + + def test_idempotent_on_dollar_string(self): + self.assertEqual(to_dollar_string("${x}"), "${x}") + + def test_wraps_empty(self): + self.assertEqual(to_dollar_string(""), "${}") + + def test_wraps_expression(self): + self.assertEqual(to_dollar_string("a + b"), "${a + b}") + + +class TestUnwrapDollarString(TestCase): + def test_strips_wrapping(self): + self.assertEqual(unwrap_dollar_string("${x}"), "x") + + def test_noop_on_plain_string(self): + self.assertEqual(unwrap_dollar_string("foo"), "foo") + + def test_strips_complex_expression(self): + self.assertEqual(unwrap_dollar_string("${a + b}"), "a 
+ b") + + +class TestWrapIntoParentheses(TestCase): + def test_plain_string(self): + self.assertEqual(wrap_into_parentheses("x"), "(x)") + + def test_dollar_string(self): + self.assertEqual(wrap_into_parentheses("${x}"), "${(x)}") + + def test_expression_string(self): + self.assertEqual(wrap_into_parentheses("a + b"), "(a + b)") + + def test_dollar_expression(self): + self.assertEqual(wrap_into_parentheses("${a + b}"), "${(a + b)}") diff --git a/test/unit/test_walk.py b/test/unit/test_walk.py new file mode 100644 index 00000000..ec81bb48 --- /dev/null +++ b/test/unit/test_walk.py @@ -0,0 +1,167 @@ +# pylint: disable=C0103,C0114,C0115,C0116 +from unittest import TestCase + +from hcl2.rules.base import AttributeRule, BlockRule, BodyRule, StartRule +from hcl2.rules.expressions import ExpressionRule, ExprTermRule +from hcl2.rules.literal_rules import IdentifierRule +from hcl2.rules.tokens import NAME, EQ, LBRACE, RBRACE, NL_OR_COMMENT +from hcl2.rules.whitespace import NewLineOrCommentRule +from hcl2.utils import SerializationOptions, SerializationContext +from hcl2.walk import ( + ancestors, + find_all, + find_by_predicate, + find_first, + walk, + walk_rules, + walk_semantic, +) + + +class StubExpression(ExpressionRule): + def __init__(self, value): + self._stub_value = value + super().__init__([], None) + + def serialize(self, options=SerializationOptions(), context=SerializationContext()): + return self._stub_value + + +def _make_identifier(name): + return IdentifierRule([NAME(name)]) + + +def _make_expr_term(value): + return ExprTermRule([StubExpression(value)]) + + +def _make_nlc(text): + return NewLineOrCommentRule([NL_OR_COMMENT(text)]) + + +def _make_attribute(name, value): + return AttributeRule([_make_identifier(name), EQ(), _make_expr_term(value)]) + + +def _make_block(labels, body_children=None): + body = BodyRule(body_children or []) + children = list(labels) + [LBRACE(), body, RBRACE()] + return BlockRule(children) + + +class TestWalk(TestCase): + def 
test_walk_single_node(self): + attr = _make_attribute("x", 1) + nodes = list(walk(attr)) + self.assertIn(attr, nodes) + self.assertTrue(len(nodes) > 1) + + def test_walk_skips_none(self): + attr = _make_attribute("x", 1) + nodes = list(walk(attr)) + self.assertTrue(all(n is not None for n in nodes)) + + def test_walk_includes_tokens(self): + from hcl2.rules.abstract import LarkToken + + attr = _make_attribute("x", 1) + nodes = list(walk(attr)) + has_token = any(isinstance(n, LarkToken) for n in nodes) + self.assertTrue(has_token) + + +class TestWalkRules(TestCase): + def test_only_rules(self): + from hcl2.rules.abstract import LarkRule, LarkToken + + attr = _make_attribute("x", 1) + rules = list(walk_rules(attr)) + for r in rules: + self.assertIsInstance(r, LarkRule) + self.assertNotIsInstance(r, LarkToken) + + +class TestWalkSemantic(TestCase): + def test_no_whitespace(self): + nlc = _make_nlc("\n") + body = BodyRule([nlc, _make_attribute("x", 1)]) + rules = list(walk_semantic(body)) + for r in rules: + self.assertNotIsInstance(r, NewLineOrCommentRule) + + def test_finds_attribute(self): + body = BodyRule([_make_attribute("x", 1)]) + rules = list(walk_semantic(body)) + self.assertTrue(any(isinstance(r, AttributeRule) for r in rules)) + + +class TestFindAll(TestCase): + def test_finds_all_attributes(self): + body = BodyRule([_make_attribute("x", 1), _make_attribute("y", 2)]) + start = StartRule([body]) + attrs = list(find_all(start, AttributeRule)) + self.assertEqual(len(attrs), 2) + + def test_finds_nested(self): + BodyRule([_make_attribute("inner", 1)]) # unused but creates parent refs + block = _make_block( + [_make_identifier("resource")], [_make_attribute("outer", 2)] + ) + outer_body = BodyRule([block]) + start = StartRule([outer_body]) + attrs = list(find_all(start, AttributeRule)) + self.assertEqual(len(attrs), 1) # only outer, inner is in block's body + + def test_finds_blocks(self): + block = _make_block([_make_identifier("resource")]) + body = 
BodyRule([block]) + start = StartRule([body]) + blocks = list(find_all(start, BlockRule)) + self.assertEqual(len(blocks), 1) + + +class TestFindFirst(TestCase): + def test_finds_first(self): + body = BodyRule([_make_attribute("x", 1), _make_attribute("y", 2)]) + start = StartRule([body]) + attr = find_first(start, AttributeRule) + self.assertIsNotNone(attr) + self.assertEqual(attr.identifier.serialize(), "x") + + def test_returns_none(self): + body = BodyRule([]) + start = StartRule([body]) + result = find_first(start, AttributeRule) + self.assertIsNone(result) + + +class TestFindByPredicate(TestCase): + def test_predicate(self): + attr1 = _make_attribute("x", 1) + attr2 = _make_attribute("y", 2) + body = BodyRule([attr1, attr2]) + found = list( + find_by_predicate( + body, + lambda n: isinstance(n, AttributeRule) + and n.identifier.serialize() == "x", + ) + ) + self.assertEqual(len(found), 1) + self.assertIs(found[0], attr1) + + +class TestAncestors(TestCase): + def test_parent_chain(self): + attr = _make_attribute("x", 1) + body = BodyRule([attr]) + start = StartRule([body]) + chain = list(ancestors(attr)) + self.assertEqual(chain[0], body) + self.assertEqual(chain[1], start) + + def test_empty_for_root(self): + body = BodyRule([]) + start = StartRule([body]) + chain = list(ancestors(start)) + self.assertEqual(len(chain), 0) diff --git a/tree-to-hcl2-reconstruction.md b/tree-to-hcl2-reconstruction.md deleted file mode 100644 index 1a5f83dc..00000000 --- a/tree-to-hcl2-reconstruction.md +++ /dev/null @@ -1,248 +0,0 @@ -# Writing HCL2 from Python - -Version 6 of this library supports reconstructing HCL files directly from -Python. This guide details how the reconstruction process takes place. 
See -also: [Limitations](#limitations) - -There are three major phases: - -- [Building a Python Dictionary](#building-a-python-dictionary) -- [Building an AST](#building-an-ast) -- [Reconstructing the file from the AST](#reconstructing-the-file-from-the-ast) - -## Example - -To create the `example.tf` file with the following content: - -```terraform -resource "aws_s3_bucket" "bucket" { - bucket = "bucket_id" - force_destroy = true -} -``` - -You can use the `hcl2.Builder` class like so: - -```python -import hcl2 - -example = hcl2.Builder() - -example.block( - "resource", - ["aws_s3_bucket", "bucket"], - bucket="bucket_id", - force_destroy=True, -) - -example_dict = example.build() -example_ast = hcl2.reverse_transform(example_dict) -example_file = hcl2.writes(example_ast) - -print(example_file) -# resource "aws_s3_bucket" "bucket" { -# bucket = "bucket_id" -# force_destroy = true -# } -# -``` - -This demonstrates a couple of different phases of the process worth mentioning. - -### Building a Python dictionary - -The `hcl2.Builder` class produces a dictionary that should be identical to the -output of `hcl2.load(example_file, with_meta=True)`. The `with_meta` keyword -argument is important here. HCL "blocks" in the Python dictionary are -identified by the presence of `__start_line__` and `__end_line__` metadata -within them. The `Builder` class handles adding that metadata. If that metadata -is missing, the `hcl2.reconstructor.HCLReverseTransformer` class fails to -identify what is a block and what is just an attribute with an object value. 
-Without that metadata, this dictionary: - -```python -{ - "resource": [ - { - "aws_s3_bucket": { - "bucket": { - "bucket": "bucket_id", - "force_destroy": True, - # "__start_line__": -1, - # "__end_line__": -1, - } - } - } - ] -} -``` - -Would produce this HCL output: - -```terraform -resource = [{ - aws_s3_bucket = { - bucket = { - bucket = "bucket_id" - force_destroy = true - } - } -}] -``` - -(This output parses to the same datastructure, but isn't formatted in blocks -as desired by the user. Therefore, using the `Builder` class is recommended.) - -### Building an AST - -The `hcl2.reconstructor.HCLReconstructor` class operates on an "abstract -syntax tree" (`hcl2.AST` or `Lark.Tree`, they're the same.) To produce this AST -from scratch in Python, use `hcl2.reverse_transform(hcl_dict)`, and to produce -this AST from an existing HCL file, use `hcl2.parse(hcl_file)`. - -You can also build these ASTs manually, if you want more control over the -generated HCL output. If you do this, though, make sure the AST you generate is -valid within the `hcl2.lark` grammar. - -Here's an example, which would add a "tags" element to that `example.tf` file -mentioned above. 
- -```python -from copy import deepcopy -from lark import Token, Tree -import hcl2 - - -def build_tags_tree(base_indent: int = 0) -> Tree: - # build Tree representing following HCL2 structure - # tags = { - # Name = "My bucket" - # Environment = "Dev" - # } - return Tree('attribute', [ - Tree('identifier', [ - Token('NAME', 'tags') - ]), - Token('EQ', '='), - Tree('expr_term', [ - Tree('object', [ - Tree('new_line_or_comment', [ - Token('NL_OR_COMMENT', '\n' + ' ' * (base_indent + 1)), - ]), - Tree('object_elem', [ - Tree('identifier', [ - Token('NAME', 'Name') - ]), - Token('EQ', '='), - Tree('expr_term', [ - Token('STRING_LIT', '"My bucket"') - ]) - ]), - Tree('new_line_and_or_comma', [ - Tree('new_line_or_comment', [ - Token('NL_OR_COMMENT', '\n' + ' ' * (base_indent + 1)), - ]), - ]), - Tree('object_elem', [ - Tree('identifier', [ - Token('NAME', 'Environment') - ]), - Token('EQ', '='), - Tree('expr_term', [ - Token('STRING_LIT', '"Dev"') - ]) - ]), - Tree('new_line_and_or_comma', [ - Tree('new_line_or_comment', [ - Token('NL_OR_COMMENT', '\n' + ' ' * base_indent), - ]), - ]), - ]), - ]) - ]) - - -def is_bucket_block(tree: Tree) -> bool: - # check whether given Tree represents `resource "aws_s3_bucket" "bucket"` - try: - return tree.data == 'block' and tree.children[2].value == '"bucket"' - except IndexError: - return False - - -def insert_tags(tree: Tree, indent: int = 0) -> Tree: - # Insert tags tree and adjust surrounding whitespaces to match indentation - new_children = [*tree.children.copy(), build_tags_tree(indent)] - # add indentation before tags tree - new_children[len(tree.children) - 1] = Tree('new_line_or_comment', [ - Token('NL_OR_COMMENT', '\n ') - ]) - # move closing bracket to the new line - new_children.append( - Tree('new_line_or_comment', [ - Token('NL_OR_COMMENT', '\n') - ]) - ) - return Tree(tree.data, new_children) - - -def process_token(node: Token, indent=0): - # Print details of this token and return its copy - print(f'[{indent}] 
(token)\t|', ' ' * indent, node.type, node.value) - return deepcopy(node) - - -def process_tree(node: Tree, depth=0) -> Tree: - # Recursively iterate over tree's children - # the depth parameter represents recursion depth, - # it's used to deduce indentation for printing tree and for adjusting whitespace after adding tags - new_children = [] - print(f'[{depth}] (tree)\t|', ' ' * depth, node.data) - for child in node.children: - if isinstance(child, Tree): - if is_bucket_block(child): - block_children = child.children.copy() - # this child is the Tree representing block's actual body - block_children[3] = insert_tags(block_children[3], depth) - # replace original Tree with new one including the modified body - child = Tree(child.data, block_children) - - new_children.append(process_tree(child, depth + 1)) - - else: - new_children.append(process_token(child, depth + 1)) - - return Tree(node.data, new_children) - - -def main(): - tree = hcl2.parse(open('example.tf')) - new_tree = process_tree(tree) - reconstructed = hcl2.writes(new_tree) - open('example_reconstructed.tf', 'w').write(reconstructed) - - -if __name__ == "__main__": - main() - -``` - -### Reconstructing the file from the AST - -Once the AST has been generated, you can convert it back to valid HCL using -`hcl2.writes(ast)`. In the above example, that conversion is done in the -`main()` function. - -## Limitations - -- Some formatting choices are impossible to specify via `hcl2.Builder()` and - require manual intervention of the AST produced after the `reverse_transform` - step. - -- Most notably, this means it's not possible to generate files containing - comments (both inline and block comments) - -- Even when parsing a file directly and writing it back out, some formatting - information may be lost due to Terminals discarded during the parsing process. - The reconstructed output should still parse to the same dictionary at the end - of the day though.