Yet Another YAML AST - programmatically transform YAML, preserving whitespace and comments
Programmatically edit YAML at the AST level, so re-serializing doesn't introduce extraneous changes. Preserves:
- Comments
- Whitespace (including trailing spaces)
- Quote styles (`'`, `"`, or none)
  - By default, will switch between `'` and `"` if the other is added to a string (as a literal character)
- Block scalar indicators (`|`, `|-`, `|+`)
- Other formatting choices (e.g. indentation)
Other libraries (e.g. ruamel.yaml) make formatting changes when serializing. yaya avoids this by:
- Parsing YAML to get the AST (with position information)
- Applying modifications only to specific values or subtrees
- Leaving everything else untouched
When adding values or subtrees, it also tries to mimic neighboring formatting. It supports dict-like access and path-based navigation.
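To make "mimic neighboring formatting" concrete, here is a minimal sketch of one way a new key could inherit a sibling's indentation, colon spacing, and line ending. `insert_scalar_after` is a hypothetical helper written for illustration, not part of yaya's API, and it only handles siblings whose value fits on one line.

```python
import re


def insert_scalar_after(text: str, sibling_key: str, new_key: str, new_value: str) -> str:
    """Insert `new_key: new_value` right after the line holding `sibling_key`.

    Hypothetical helper (not yaya's implementation): it copies the sibling's
    indentation, the gap after its colon, and its line ending.
    """
    lines = text.splitlines(keepends=True)
    pattern = re.compile(r"^([ \t]*)" + re.escape(sibling_key) + r":([ \t]*)")
    for i, line in enumerate(lines):
        m = pattern.match(line)
        if not m:
            continue
        indent, gap = m.group(1), m.group(2) or " "
        eol = "\r\n" if line.endswith("\r\n") else "\n"
        new_line = f"{indent}{new_key}:{gap}{new_value}"
        if line.endswith("\n"):
            lines.insert(i + 1, new_line + eol)
        else:
            # Sibling is the last line and has no newline: keep it that way.
            lines[i] = line + eol
            lines.insert(i + 1, new_line)
        return "".join(lines)
    raise KeyError(sibling_key)


src = "jobs:\n    test:\n        runs-on:  ubuntu\n        steps: []\n"
print(insert_scalar_after(src, "runs-on", "timeout-minutes", "30"))
```

The inserted line picks up the eight-space indent and the double space after the colon from `runs-on`, so the addition looks as if it had been written by the file's original author.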
Install:

    pip install lossless-yaml

Basic usage:

    from yaya import YAYA

    # Load a YAML file
    doc = YAYA.load('.github/workflows/test.yaml')

    # Simple string replacement in all values
    doc.replace_in_values('src/marin', 'lib/marin/src/marin')

    # Regex-based replacement
    doc.replace_in_values_regex(r'\buv sync(?! --package)', 'uv sync --package myapp')

    doc.save()

Navigate and assert:

    # Navigate using paths
    runs_on = doc.get_path("jobs.test.runs-on")
    step_name = doc.get_path("jobs.test.steps[0].name")

    # Or dict-like access
    runs_on = doc["jobs"]["test"]["runs-on"]

    # Assert values before making changes
    doc.assert_value("on", ["push"])
    doc.assert_absent("jobs.test.defaults")
    doc.assert_present("jobs.test.steps")

Replace values:

    # Replace a simple value
    doc.replace_key("jobs.test.runs-on", "ubuntu-22.04")

    # Replace a list item
    doc.replace_key("build.commands[1]", "uv sync --package marin --frozen")

    # Replace with a complex structure
    doc.replace_key("on", {
        "push": {
            "branches": ["main"],
            "paths": ["lib/**", "uv.lock"]
        },
        "pull_request": {
            "paths": ["lib/**", "uv.lock"]
        }
    })

    doc.save()

Add keys:

    # Add key after another (maintains order)
    doc.add_key_after("jobs.test.runs-on", "defaults", {
        "run": {
            "working-directory": "lib/myapp"
        }
    })

    # Add or replace (convenience method)
    doc.ensure_key("jobs.test.timeout-minutes", 30)

    doc.save()

Delete keys:

    # Delete a key
    doc.delete_key("build.mkdocs")  # Returns True if deleted, False if not found

    # Delete nested key
    doc.delete_key("jobs.test.defaults.run.shell")

    # Delete multiple keys
    doc.delete_key("build.python")
    doc.delete_key("build.obsolete")

    doc.save()

Given this YAML file:

    # Production config
    database:
      host: prod-db-1.example.com
      port: 5432

This code:

    doc = YAYA.load('config.yaml')
    doc.replace_in_values('prod-db-1', 'prod-db-2')
    doc.save()

Produces exactly:

    # Production config
    database:
      host: prod-db-2.example.com
      port: 5432

No reformatting. No comment loss. Just the change you made.
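One way to check a claim like this in your own tests is to diff the file before and after an edit and confirm that only the intended lines appear. The sketch below uses only the standard library's `difflib`. The `after` string is produced with `str.replace` as a stand-in for the saved file, and the two-space indentation of the sample is an assumption.

```python
import difflib

before = (
    "# Production config\n"
    "database:\n"
    "  host: prod-db-1.example.com\n"
    "  port: 5432\n"
)
# Stand-in for the file contents after the edit has been saved.
after = before.replace("prod-db-1", "prod-db-2")


def changed_lines(old: str, new: str) -> list[str]:
    """Return only the added/removed lines of a unified diff (no context)."""
    diff = list(difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="", n=0))
    # Skip the two file-header lines, then drop the @@ hunk markers.
    return [line for line in diff[2:] if not line.startswith("@@")]


for line in changed_lines(before, after):
    print(line)
```

If a tool had reformatted the file, extra `-`/`+` pairs would show up for the comment or the untouched `port` line.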
How it works:
- Parse YAML with ruamel.yaml to get the AST and position information
- Convert line/column positions to byte offsets
- Track modifications as you change values
- Apply byte-level replacements when saving, leaving everything else untouched
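The offset conversion and byte-level replacement steps can be sketched as follows. This is an illustration under stated assumptions, not yaya's actual code: positions are taken to be 0-based (line, character column) pairs, the file is UTF-8, edits do not overlap, and the names `line_starts`, `to_byte_offset`, and `apply_edits` are hypothetical.

```python
def line_starts(data: bytes) -> list[int]:
    """Byte offset at which each line begins."""
    starts = [0]
    for i, b in enumerate(data):
        if b == 0x0A:  # '\n'
            starts.append(i + 1)
    return starts


def to_byte_offset(data: bytes, starts: list[int], line: int, col: int) -> int:
    """Convert a 0-based (line, character column) position to a byte offset."""
    begin = starts[line]
    end = starts[line + 1] if line + 1 < len(starts) else len(data)
    # Columns count characters, so re-encode the prefix to count bytes.
    prefix = data[begin:end].decode("utf-8")[:col]
    return begin + len(prefix.encode("utf-8"))


def apply_edits(data: bytes, edits: list[tuple[int, int, bytes]]) -> bytes:
    """Apply (start, end, replacement) edits back to front so earlier offsets stay valid."""
    out = bytearray(data)
    for start, end, new in sorted(edits, key=lambda e: e[0], reverse=True):
        out[start:end] = new
    return bytes(out)


data = "# café\nhost: prod-db-1\nport: 5432  \n".encode("utf-8")
starts = line_starts(data)
start = to_byte_offset(data, starts, 1, 6)   # first character of the value
end = to_byte_offset(data, starts, 1, 15)    # one past its last character
result = apply_edits(data, [(start, end, b"prod-db-2")])
print(result.decode("utf-8"))
```

Everything outside the replaced span, including the comment and the trailing spaces after `5432`, is copied through unchanged because it is never re-serialized.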
Features:
- Byte-for-byte preservation of unchanged content
- String replacement (literal and regex)
- Path-based navigation (`jobs.test.steps[0].name`)
- Replace values or subtrees (scalars, dicts, lists, list items)
- Add keys with proper positioning
- Delete keys while preserving surrounding content
- Assertions for validation (`assert_value`, `assert_present`, `assert_absent`)
- Comment preservation
- Block scalar support
- Flow and block style handling
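Path strings such as `jobs.test.steps[0].name` combine dotted keys with bracketed list indices. As a rough sketch of how such a path can be tokenized and resolved against plain dicts and lists (the functions `parse_path` and `lookup` are hypothetical, and keys that themselves contain dots or brackets are not handled):

```python
import re

# Either a bracketed list index like [0] or a run of key characters.
_TOKEN = re.compile(r"\[(\d+)\]|([^.\[\]]+)")


def parse_path(path: str) -> list:
    """Split 'a.b[0].c' into ['a', 'b', 0, 'c']."""
    tokens = []
    pos = 0
    while pos < len(path):
        if path[pos] == ".":
            pos += 1
            continue
        m = _TOKEN.match(path, pos)
        if not m:
            raise ValueError(f"bad path at position {pos}: {path!r}")
        tokens.append(int(m.group(1)) if m.group(1) is not None else m.group(2))
        pos = m.end()
    return tokens


def lookup(data, path: str):
    """Walk nested dicts and lists following the parsed path."""
    node = data
    for token in parse_path(path):
        node = node[token]
    return node


workflow = {"jobs": {"test": {"runs-on": "ubuntu-22.04", "steps": [{"name": "Checkout"}]}}}
print(parse_path("jobs.test.steps[0].name"))
print(lookup(workflow, "jobs.test.runs-on"))
```

Hyphenated keys like `runs-on` pass through untouched because only `.`, `[`, and `]` act as separators.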
Limitations:
- Binary data not supported
- Adding keys only supports `add_key_after` currently (not arbitrary positions)
ruamel.yaml is excellent for round-trip YAML editing and preserves most formatting. However, its output is not byte-for-byte identical to the input:
| Feature | ruamel.yaml | yaya |
|---|---|---|
| Preserves comments | ✅ | ✅ |
| Preserves most whitespace | ✅ | ✅ |
| Byte-for-byte identical | ❌ | ✅ |
| Trailing whitespace | ❌ | ✅ |
| Block scalar indicators | ❌ (computes new ones) | ✅ |
yaya uses ruamel.yaml under the hood but takes a different approach: instead of serializing the AST back to YAML, it modifies the original bytes directly.
License: MIT
Issues and pull requests welcome!