|
| 1 | +--- |
| 2 | +title: Content Addressable Store (CAS) |
| 3 | +description: Learn how Terragrunt supports deduplication of content using a Content Addressable Store (CAS). |
| 4 | +slug: docs/features/cas |
| 5 | +sidebar: |
| 6 | + order: 15 |
| 7 | +--- |
| 8 | + |
| 9 | +import FileTree from '@components/vendored/starlight/FileTree.astro'; |
| 10 | + |
| 11 | +Terragrunt supports a Content Addressable Store (CAS) to deduplicate content across multiple Terragrunt configurations. This feature is still experimental and not recommended for general production usage. |
| 12 | + |
| 13 | +The CAS is used to speed up both catalog cloning and OpenTofu/Terraform source cloning by avoiding redundant downloads of Git repositories. |
| 14 | + |
| 15 | +To use the CAS, you will need to enable the [cas](/docs/reference/experiments/#cas) experiment. |
| 16 | + |
| 17 | +## Usage |
| 18 | + |
| 19 | +When you enable the `cas` experiment, Terragrunt will automatically use the CAS when cloning any compatible source (Git repositories). |
| 20 | + |
| 21 | +### Catalog Usage |
| 22 | + |
| 23 | +```hcl |
| 24 | +# root.hcl |
| 25 | +
|
| 26 | +catalog { |
| 27 | + urls = [ |
| 28 | + "[email protected]:acme/modules.git" |
| 29 | + ] |
| 30 | +} |
| 31 | +``` |
| 32 | + |
| 33 | +### OpenTofu/Terraform Source Usage |
| 34 | + |
| 35 | +```hcl |
| 36 | +# terragrunt.hcl |
| 37 | +
|
| 38 | +terraform { |
| 39 | + source = "[email protected]:acme/infrastructure-modules.git//vpc?ref=v1.0.0" |
| 40 | +} |
| 41 | +``` |
| 42 | + |
| 43 | +When the CAS is enabled and a repository is not yet in the CAS, Terragrunt clones the repository from the original URL and stores it in the CAS for future use. |
| 44 | + |
| 45 | +When generating a repository from the CAS, Terragrunt will hard link entries from the CAS to the new repository. This allows Terragrunt to deduplicate content across multiple repositories. |
| 46 | + |
| 47 | +If hard linking fails because the operating system or host does not support hard links, Terragrunt falls back to copying the content from the CAS. |
| 48 | + |
| 49 | +## Storage |
| 50 | + |
| 51 | +The CAS is stored in the `~/.cache/terragrunt/cas` directory. This directory can be safely deleted at any time, as Terragrunt will automatically regenerate the CAS as needed. |
| 52 | + |
| 53 | +Avoid deleting only part of the CAS directory, as that can leave partially cloned repositories behind and cause unexpected behavior. |
| 54 | + |
| 55 | +## How it works |
| 56 | + |
| 57 | +Terragrunt's CAS uses a content-addressable storage model to deduplicate repository content from Git clones, which saves disk space and improves performance. Each Git object is identified by its SHA-1 hash, allowing identical content to be shared across multiple cloned repositories and repeated clones. |
| 58 | + |
| 59 | +### Content Addressing |
| 60 | + |
| 61 | +The CAS uses Git's native content addressing scheme, where each object is uniquely identified by its SHA-1 hash. This means: |
| 62 | + |
| 63 | +- **Identical content** across different repositories shares the same hash |
| 64 | +- **Same commit hash** always represents the same content |
| 65 | +- **Storage is partitioned** by the first two characters of the hash (e.g., `ab/abc123...`) |
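
The properties above follow from how Git derives an object ID. As a minimal sketch (the function name `git_blob_hash` is hypothetical and not part of Terragrunt), a blob's hash is the SHA-1 of a short header followed by the file content, so identical content yields the same hash regardless of which repository it came from:

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    # Git hashes "blob <size>\0" followed by the raw bytes of the file.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The same bytes found in two different repositories hash identically.
from_repo_a = git_blob_hash(b'variable "name" {}\n')
from_repo_b = git_blob_hash(b'variable "name" {}\n')
different = git_blob_hash(b'variable "other" {}\n')

print(from_repo_a == from_repo_b)  # identical content, identical hash
print(from_repo_a == different)    # different content, different hash
```

Whether Terragrunt computes these hashes itself or reads them from Git is not stated here; the sketch only illustrates why identical content maps to a single entry.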
| 66 | + |
| 67 | +### Storage Structure |
| 68 | + |
| 69 | +The CAS store is organized in a partitioned structure to optimize file system performance: |
| 70 | + |
| 71 | +<FileTree> |
| 72 | + |
| 73 | +- ~/.cache/terragrunt/cas/store/ |
| 74 | + - ab/ |
| 75 | + - abc123...xyz (blob) |
| 76 | + - abc123...xyz.lock (lock file) |
| 77 | + - abd456...xyz (tree) |
| 78 | + - cd/ |
| 79 | + - cd7890...xyz (blob) |
| 80 | + - cd7890...xyz.lock (lock file) |
| 81 | + - ... |
| 82 | + |
| 83 | +</FileTree> |
| 84 | + |
| 85 | +Each content object is stored at `{hash[:2]}/{hash}`, where the first two characters create a partition directory. This prevents having thousands of files in a single directory, which can degrade file system performance. |
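
The `{hash[:2]}/{hash}` layout can be sketched as a small path helper. The names `object_path` and `lock_path` are hypothetical, and the `.lock` suffix is taken from the file tree shown above; this is an illustration of the layout, not Terragrunt's implementation:

```python
from pathlib import PurePosixPath


def object_path(store_root: PurePosixPath, object_hash: str) -> PurePosixPath:
    """Map an object hash to its partitioned location in the store."""
    if len(object_hash) < 3:
        raise ValueError("object hash is too short to partition")
    # The first two characters of the hash name the partition directory.
    return store_root / object_hash[:2] / object_hash


def lock_path(store_root: PurePosixPath, object_hash: str) -> PurePosixPath:
    """Location of the lock file that sits next to an object."""
    return store_root / object_hash[:2] / (object_hash + ".lock")


store = PurePosixPath("~/.cache/terragrunt/cas/store")
example = object_path(store, "abc123")
example_lock = lock_path(store, "abc123")
print(example)
print(example_lock)
```

With hexadecimal hashes, a two-character prefix gives at most 256 partition directories, which keeps each directory listing small.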
| 86 | + |
| 87 | +### Clone Flow |
| 88 | + |
| 89 | +When Terragrunt clones a repository using the CAS, the steps it takes depend on whether the content is already in the CAS: |
| 90 | + |
| 91 | +#### Cold Clones |
| 92 | + |
| 93 | +For cold clones, where the content is not already in the CAS: |
| 94 | + |
| 95 | +1. Terragrunt resolves the Git reference (branch/tag) to a commit hash |
| 96 | +2. Terragrunt looks up the tree for that commit hash in the CAS and does not find it |
| 97 | +3. Terragrunt clones the repository to a temporary directory |
| 98 | +4. All blobs and trees required to reproduce the repository are extracted |
| 99 | +5. Content is stored in the CAS, partitioned by hash prefix |
| 100 | +6. The tree structure is read from the CAS and hard links are created to the target directory |
| 101 | + |
| 102 | +#### Warm Clones |
| 103 | + |
| 104 | +For warm clones, where the content is already in the CAS: |
| 105 | + |
| 106 | +1. Terragrunt resolves the Git reference to a commit hash |
| 107 | +2. CAS checks if the content exists |
| 108 | +3. The tree structure is read directly from the CAS |
| 109 | +4. Hard links are created from CAS to the target directory |
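
The cold and warm paths above can be sketched as a toy model. Everything here is an assumption made for illustration: `cas_clone`, the `fetch` callback that stands in for `git clone`, and the JSON manifest that stands in for a Git tree object are all hypothetical and do not reflect Terragrunt's internals.

```python
import hashlib
import json
import os
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Dict


def blob_hash(content: bytes) -> str:
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def object_path(store: Path, object_hash: str) -> Path:
    return store / object_hash[:2] / object_hash


def cas_clone(commit: str, store: Path, target: Path,
              fetch: Callable[[], Dict[str, bytes]]) -> str:
    """Materialize `commit` into `target`; return "cold" or "warm"."""
    tree_file = object_path(store, commit)
    mode = "warm"
    if not tree_file.exists():
        # Cold path: fetch the content, then store blobs and the tree.
        mode = "cold"
        tree = {}
        for rel_path, content in fetch().items():
            digest = blob_hash(content)
            blob = object_path(store, digest)
            blob.parent.mkdir(parents=True, exist_ok=True)
            if not blob.exists():
                blob.write_bytes(content)
            tree[rel_path] = digest
        tree_file.parent.mkdir(parents=True, exist_ok=True)
        tree_file.write_text(json.dumps(tree))
    # Both paths: read the tree from the store and link into the target.
    for rel_path, digest in json.loads(tree_file.read_text()).items():
        dst = target / rel_path
        dst.parent.mkdir(parents=True, exist_ok=True)
        try:
            os.link(object_path(store, digest), dst)
        except OSError:
            shutil.copy2(object_path(store, digest), dst)
    return mode


fetch_calls = []


def fake_fetch() -> Dict[str, bytes]:
    fetch_calls.append(1)
    return {"main.tf": b"# main\n", "variables.tf": b"# vars\n"}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    first = cas_clone("123abc", root / "store", root / "a", fake_fetch)
    second = cas_clone("123abc", root / "store", root / "b", fake_fetch)
    second_main = (root / "b" / "main.tf").read_bytes()

print(first, second, len(fetch_calls))
```

In this sketch the second call never invokes `fake_fetch`, which mirrors how a warm clone skips the network download.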
| 110 | + |
| 111 | +#### Flow Diagram |
| 112 | + |
| 113 | +```d2 |
| 114 | +direction: down |
| 115 | +
|
| 116 | +# Source |
| 117 | +git_repo: "Git Repository\n\[email protected]:acme/modules.git?ref=v1.0.0" { |
| 118 | + shape: cylinder |
| 119 | +} |
| 120 | +
|
| 121 | +# Decision Point |
| 122 | +check_cas: "In CAS?\n\nhash = 123abc..." { |
| 123 | + shape: diamond |
| 124 | +} |
| 125 | +
|
| 126 | +# First Clone Path (Content Not in CAS) |
| 127 | +clone_store: "Clone & Store\n(git clone → extract → store)" { |
| 128 | + shape: rectangle |
| 129 | +} |
| 130 | +
|
| 131 | +# Subsequent Clone Path (Content Already in CAS) |
| 132 | +read_cas: "Read from CAS\n\n123abc..." { |
| 133 | + shape: rectangle |
| 134 | +} |
| 135 | +
|
| 136 | +# Link Step |
| 137 | +link_step: "Link to Targets\n\nblob abc123... main.tf\nblob cd7890... variables.tf" { |
| 138 | + shape: rectangle |
| 139 | +} |
| 140 | +
|
| 141 | +# Linked Targets |
| 142 | +linked_target1: "Linked Target\n\n.terragrunt-cache/.../main.tf -->\n~/.cache/terragrunt/cas/store/ab/abc123..." { |
| 143 | + shape: rectangle |
| 144 | +} |
| 145 | +
|
| 146 | +linked_target2: "Linked Target\n\n.terragrunt-cache/.../variables.tf -->\n~/.cache/terragrunt/cas/store/cd/cd7890..." { |
| 147 | + shape: rectangle |
| 148 | +} |
| 149 | +
|
| 150 | +# Flow |
| 151 | +git_repo -> check_cas |
| 152 | +check_cas -> clone_store |
| 153 | +check_cas -> read_cas |
| 154 | +clone_store -> read_cas |
| 155 | +read_cas -> link_step |
| 156 | +link_step -> linked_target1 |
| 157 | +link_step -> linked_target2 |
| 158 | +``` |
| 159 | + |
| 160 | +### Deduplication Mechanism |
| 161 | + |
| 162 | +The CAS achieves deduplication through hard links, which let multiple file paths share the same data on disk. This avoids duplicated content in repositories cloned by Terragrunt. |
| 163 | + |
| 164 | +- **Hard Links**: When the same content is requested multiple times, CAS creates hard links from the read-only store to each target directory |
| 165 | +- **Automatic Fallback**: If hard linking fails (e.g., cross-filesystem boundaries, operating system limitations), CAS automatically falls back to copying the content instead |
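
A minimal sketch of the link-then-copy fallback, assuming a hypothetical helper named `link_or_copy`. The `link` parameter exists only so the fallback can be exercised here by injecting a function that fails the way a cross-filesystem link would:

```python
import errno
import os
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def link_or_copy(src: Path, dst: Path, link: Callable = os.link) -> str:
    """Hard link src to dst, copying instead if linking fails."""
    try:
        link(src, dst)
        return "hardlink"
    except OSError:
        # For example EXDEV when src and dst are on different filesystems.
        shutil.copy2(src, dst)
        return "copy"


def failing_link(src, dst):
    """Simulates a host where hard links across devices are rejected."""
    raise OSError(errno.EXDEV, "Invalid cross-device link")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    src = root / "blob"
    src.write_bytes(b"# main\n")

    linked = link_or_copy(src, root / "linked.tf")
    copied = link_or_copy(src, root / "copied.tf", link=failing_link)

    # A hard link is the same file as the source; a copy is not.
    linked_is_same_file = os.path.samefile(src, root / "linked.tf")
    copied_is_same_file = os.path.samefile(src, root / "copied.tf")
    copied_content = (root / "copied.tf").read_bytes()

print(linked, copied)
```

Either way the target ends up with the same bytes; only the hard-linked path avoids consuming additional disk space.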
| 166 | + |
| 167 | +### Performance Benefits |
| 168 | + |
| 169 | +The CAS improves performance in two ways: |
| 170 | + |
| 171 | +- **Faster Subsequent Clones**: Once content is in CAS, subsequent clones skip the network download and Git clone operations entirely |
| 172 | +- **Reduced Disk Usage**: Hard links share the same inode, so duplicate content consumes disk space only once, no matter how many Terragrunt clones use the file |