Commit 558c9cf

Merge remote-tracking branch 'origin/main' into refactor-discovery-runner-pool
2 parents 89bcb96 + bfcf295 commit 558c9cf

6 files changed: +473 -175 lines changed

docs-starlight/package-lock.json

Lines changed: 6 additions & 2 deletions

docs-starlight/public/d2/docs/03-features/15-cas-0.svg

Lines changed: 173 additions & 0 deletions

docs-starlight/src/content/docs/03-features/15-cas.md

Lines changed: 0 additions & 51 deletions
This file was deleted.
Lines changed: 172 additions & 0 deletions
@@ -0,0 +1,172 @@
---
title: Content Addressable Store (CAS)
description: Learn how Terragrunt supports deduplication of content using a Content Addressable Store (CAS).
slug: docs/features/cas
sidebar:
  order: 15
---

import FileTree from '@components/vendored/starlight/FileTree.astro';

Terragrunt supports a Content Addressable Store (CAS) to deduplicate content across multiple Terragrunt configurations. This feature is still experimental and not recommended for general production usage.

The CAS is used to speed up both catalog cloning and OpenTofu/Terraform source cloning by avoiding redundant downloads of Git repositories.

To use the CAS, you will need to enable the [cas](/docs/reference/experiments/#cas) experiment.

## Usage

When you enable the `cas` experiment, Terragrunt will automatically use the CAS when cloning any compatible source (Git repositories).

### Catalog Usage

```hcl
# root.hcl

catalog {
  urls = [
    "git@github.com:acme/modules.git"
  ]
}
```

### OpenTofu/Terraform Source Usage

```hcl
# terragrunt.hcl

terraform {
  source = "git@github.com:acme/infrastructure-modules.git//vpc?ref=v1.0.0"
}
```

When Terragrunt clones a repository with the CAS enabled and the repository is not yet in the CAS, it clones the repository from the original URL and stores it in the CAS for future use.

When generating a repository from the CAS, Terragrunt hard links entries from the CAS into the new repository. This allows Terragrunt to deduplicate content across multiple repositories.

If hard linking fails because the operating system or host does not support hard links, Terragrunt falls back to copying the content from the CAS.

## Storage

The CAS is stored in the `~/.cache/terragrunt/cas` directory. This directory can be safely deleted at any time, as Terragrunt will automatically regenerate the CAS as needed.

Avoid deleting only part of the CAS directory, as that might result in partially cloned repositories and unexpected behavior.

## How it works

Terragrunt's CAS uses a content-addressable storage model to deduplicate repository content from Git clones, which saves disk space and improves performance. Each Git object is identified by its SHA-1 hash, allowing identical content to be shared across multiple cloned repositories and repeated clones.

### Content Addressing

CAS uses Git's native content addressing scheme, in which each object is uniquely identified by its SHA-1 hash. This means:

- **Identical content** across different repositories shares the same hash
- **Same commit hash** always represents the same content
- **Storage is partitioned** by the first two characters of the hash (e.g., `ab/abc123...`)
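
To make the first bullet concrete, here is a minimal sketch of how Git derives a blob's SHA-1 object ID from its bytes. It illustrates Git's scheme only. The function name `git_blob_hash` is hypothetical, and the source does not say whether Terragrunt computes these hashes itself or reads them from Git.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    # Git hashes a short header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Identical bytes produce the same ID, whichever repository they came from.
repo_a_file = b"hello\n"
repo_b_file = b"hello\n"
assert git_blob_hash(repo_a_file) == git_blob_hash(repo_b_file)
```

Because the ID depends only on the content, two repositories that ship the same file map to the same store entry.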

### Storage Structure

The CAS store is organized in a partitioned structure to optimize file system performance:

<FileTree>

- ~/.cache/terragrunt/cas/store/
  - ab/
    - abc123...xyz (blob)
    - abc123...xyz.lock (lock file)
    - abd456...xyz (tree)
  - cd/
    - cd7890...xyz (blob)
    - cd7890...xyz.lock (lock file)
  - ...

</FileTree>

Each content object is stored at `{hash[:2]}/{hash}`, where the first two characters create a partition directory. This prevents having thousands of files in a single directory, which can degrade file system performance.
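
The `{hash[:2]}/{hash}` layout can be sketched in a few lines of Python. The helper name `cas_object_path` and its argument names are assumptions made for illustration and are not taken from Terragrunt's code.

```python
from pathlib import PurePosixPath


def cas_object_path(store_root: str, object_hash: str) -> PurePosixPath:
    """Build the partitioned path for an object: <root>/<first two chars>/<hash>."""
    if len(object_hash) < 2:
        raise ValueError("object hash must be at least two characters long")
    # The two-character prefix names the partition directory.
    return PurePosixPath(store_root) / object_hash[:2] / object_hash


example = cas_object_path("~/.cache/terragrunt/cas/store", "abc123")
print(example)
```

With two hexadecimal characters as the prefix, objects spread across at most 256 partition directories.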

### Clone Flow

When Terragrunt needs to clone a repository using the CAS, it does the following, depending on whether the content is already in the CAS:

#### Cold Clones

For cold clones, where the content is not already in the CAS:

1. Terragrunt resolves the Git reference (branch/tag) to a commit hash
2. The tree related to the commit hash is not found in the CAS
3. Terragrunt clones the repository to a temporary directory
4. All blobs and trees required to reproduce the repository are extracted
5. Content is stored in the CAS, partitioned by hash prefix
6. The tree structure is read from the CAS and hard links are created to the target directory

#### Warm Clones

For warm clones, where the content is already in the CAS:

1. Terragrunt resolves the Git reference to a commit hash
2. CAS checks if the content exists
3. The tree structure is read directly from the CAS
4. Hard links are created from CAS to the target directory
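
The cold and warm paths differ only in whether a fetch happens first. The sketch below models that with an in-memory dictionary standing in for the on-disk store. Everything in it is hypothetical (`clone_via_cas`, `fake_fetch`, and keying the tree by commit hash), and it assumes the Git reference has already been resolved to a commit hash.

```python
from typing import Callable, Dict, Tuple

Tree = Dict[str, str]  # file path -> blob hash
Fetched = Tuple[Tree, Dict[str, bytes]]  # tree, plus blob hash -> content


def clone_via_cas(
    store: dict, commit_hash: str, fetch: Callable[[str], Fetched]
) -> Tuple[str, Dict[str, bytes]]:
    """Return ("cold" or "warm", files) for the requested commit."""
    kind = "warm"
    if commit_hash not in store:
        # Cold clone: fetch from the remote, then store every object.
        kind = "cold"
        tree, blobs = fetch(commit_hash)
        store.update(blobs)
        store[commit_hash] = tree
    # Both paths end the same way: read the tree from the store and
    # "link" each entry into the target (modelled here as a lookup).
    tree = store[commit_hash]
    files = {path: store[blob_hash] for path, blob_hash in tree.items()}
    return kind, files


fetch_calls = []


def fake_fetch(commit_hash: str) -> Fetched:
    # Stand-in for "git clone to a temporary directory, then extract".
    fetch_calls.append(commit_hash)
    return {"main.tf": "abc123"}, {"abc123": b"# vpc module\n"}


store: dict = {}
first = clone_via_cas(store, "123abc", fake_fetch)
second = clone_via_cas(store, "123abc", fake_fetch)
```

Here the second call never reaches `fake_fetch`, which mirrors a warm clone skipping the network entirely.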

#### Flow Diagram

```d2
direction: down

# Source
git_repo: "Git Repository\n\ngit@github.com:acme/modules.git?ref=v1.0.0" {
  shape: cylinder
}

# Decision Point
check_cas: "In CAS?\n\nhash = 123abc..." {
  shape: diamond
}

# First Clone Path (Content Not in CAS)
clone_store: "Clone & Store\n(git clone → extract → store)" {
  shape: rectangle
}

# Subsequent Clone Path (Content Already in CAS)
read_cas: "Read from CAS\n\n123abc..." {
  shape: rectangle
}

# Link Step
link_step: "Link to Targets\n\nblob abc123... main.tf\nblob cd7890... variables.tf" {
  shape: rectangle
}

# Linked Targets
linked_target1: "Linked Target\n\n.terragrunt-cache/.../main.tf -->\n~/.cache/terragrunt/cas/store/ab/abc123..." {
  shape: rectangle
}

linked_target2: "Linked Target\n\n.terragrunt-cache/.../variables.tf -->\n~/.cache/terragrunt/cas/store/cd/cd7890..." {
  shape: rectangle
}

# Flow
git_repo -> check_cas
check_cas -> clone_store
check_cas -> read_cas
clone_store -> read_cas
read_cas -> link_step
link_step -> linked_target1
link_step -> linked_target2
```

### Deduplication Mechanism

CAS achieves deduplication through hard links, which allow multiple files to share the same physical space on disk and avoid duplicated content in repositories cloned by Terragrunt.

- **Hard Links**: When the same content is requested multiple times, CAS creates hard links from the read-only store to each target directory
- **Automatic Fallback**: If hard linking fails (e.g., cross-filesystem boundaries, operating system limitations), CAS automatically falls back to copying the content instead
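
The link-then-copy behavior described above can be sketched as follows. This is a rough illustration rather than Terragrunt's implementation: `link_or_copy` is a hypothetical helper, and it treats any `OSError` from `os.link` as a reason to fall back, which is an assumption.

```python
import os
import shutil
import tempfile


def link_or_copy(cas_entry: str, target: str) -> str:
    """Place cas_entry at target and return "linked" or "copied"."""
    try:
        os.link(cas_entry, target)
        return "linked"
    except OSError:
        # For example, EXDEV when the store and the target directory
        # live on different filesystems.
        shutil.copyfile(cas_entry, target)
        return "copied"


with tempfile.TemporaryDirectory() as tmp:
    entry = os.path.join(tmp, "abc123")
    target = os.path.join(tmp, "main.tf")
    with open(entry, "w") as f:
        f.write("# vpc module\n")

    outcome = link_or_copy(entry, target)
    # A hard link shares the store entry's inode; a copy gets its own.
    same_inode = os.stat(entry).st_ino == os.stat(target).st_ino
    with open(target) as f:
        target_text = f.read()
```

Either way the target ends up with the same bytes. Only the linked case avoids using extra disk space.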

### Performance Benefits

CAS improves performance in two ways:

- **Faster Subsequent Clones**: Once content is in CAS, subsequent clones skip the network download and Git clone operations entirely
- **Reduced Disk Usage**: Hard links share the same inode, so duplicate content only consumes disk space once, regardless of how many Terragrunt clones use the file
