Commit 660995a

chore: return the current storage class and get more informations on storage class in dry-run mode (#59)
1 parent 59f4151 commit 660995a

3 files changed: 246 additions, 78 deletions
scripts/README_storage_tier.md

Lines changed: 194 additions & 57 deletions
@@ -4,32 +4,87 @@
 
 The `change_storage_tier.py` script allows you to change the storage tier (storage class) of S3 objects referenced in a STAC item. This is useful for optimizing storage costs by moving data to different storage tiers based on access patterns.
 
+## Requirements
+
+The script requires the following Python packages:
+- `boto3` - AWS SDK for Python (S3 operations)
+- `httpx` - HTTP client (fetching STAC items)
+- `botocore` - AWS core functionality
+- `uv` - Python package installer and runner
+
+All dependencies are managed via `uv` and will be automatically installed when running the script.
+
+## Environment Setup
+
+### Credentials for OVH Cloud Storage
+
+Configure your credentials to access OVH cloud storage using one of these methods:
+
+```bash
+# Option 1: Environment variables
+export AWS_ACCESS_KEY_ID="your-access-key"
+export AWS_SECRET_ACCESS_KEY="your-secret-key"
+
+# Option 2: AWS CLI configuration
+aws configure
+```
+
+### S3 Endpoint (Optional)
+
+If using a custom S3-compatible service:
+
+```bash
+export AWS_ENDPOINT_URL="https://s3.de.io.cloud.ovh.net"
+```
+
+Or specify via command line:
+
+```bash
+uv run python scripts/change_storage_tier.py \
+    --s3-endpoint https://s3.de.io.cloud.ovh.net \
+    ...
+```
+
+### Define STAC Item ID
+
+For easier command execution, define the STAC item ID as a variable:
+
+```bash
+ITEM_ID="S2B_MSIL2A_20250730T113319_N0511_R080_T29UQP_20250730T135754"
+```
+
 ## Usage
 
 ### Basic Usage
 
+Run the script using the STAC item ID variable defined in the setup:
+
 ```bash
-python scripts/change_storage_tier.py \
-    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/ITEM_ID \
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
     --storage-class GLACIER
 ```
 
 ### Dry Run
 
-Test the script without making actual changes:
+Test the script without making actual changes. Dry-run mode will:
+- Query and display the current storage class of each object
+- Show what changes would be made
+- Display storage class distribution statistics
+- Not modify any objects
 
 ```bash
-python scripts/change_storage_tier.py \
-    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/ITEM_ID \
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
     --storage-class GLACIER \
     --dry-run
 ```
 
 ### With Custom S3 Endpoint
 
 ```bash
-python scripts/change_storage_tier.py \
-    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/ITEM_ID \
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
     --storage-class GLACIER \
     --s3-endpoint https://s3.de.io.cloud.ovh.net
 ```
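The README describes two ways to supply a custom endpoint: the `--s3-endpoint` flag and the `AWS_ENDPOINT_URL` environment variable, with the variable used when the flag is not provided. The following is a minimal sketch of that precedence, not the script's actual code; the function name `resolve_s3_endpoint` and its signature are hypothetical.

```python
import os
from typing import Mapping, Optional


def resolve_s3_endpoint(
    cli_endpoint: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Hypothetical helper: choose the S3 endpoint to use.

    Precedence assumed here: explicit --s3-endpoint value first,
    then AWS_ENDPOINT_URL, otherwise None (the SDK default endpoint).
    """
    if cli_endpoint:
        return cli_endpoint
    # Treat an empty variable the same as an unset one.
    return env.get("AWS_ENDPOINT_URL") or None


if __name__ == "__main__":
    # With no flag given, the environment variable (if set) is used.
    print(resolve_s3_endpoint(None))
```

The resolved value could then be passed as `endpoint_url` when creating a boto3 S3 client; a value of `None` leaves the SDK's default endpoint in place.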
@@ -40,28 +95,28 @@ Only change storage class for specific parts of the Zarr store:
 
 ```bash
 # Only process reflectance data
-python scripts/change_storage_tier.py \
-    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/ITEM_ID \
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
     --storage-class GLACIER \
     --include-pattern "measurements/reflectance/*"
 
 # Process multiple subdirectories
-python scripts/change_storage_tier.py \
-    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/ITEM_ID \
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
     --storage-class GLACIER \
     --include-pattern "measurements/*" \
     --include-pattern "quality/*"
 
 # Exclude metadata files
-python scripts/change_storage_tier.py \
-    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/ITEM_ID \
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
     --storage-class GLACIER \
     --exclude-pattern "*.zattrs" \
     --exclude-pattern "*.zmetadata"
 
 # Only process 60m resolution data
-python scripts/change_storage_tier.py \
-    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/ITEM_ID \
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
     --storage-class GLACIER \
     --include-pattern "*/r60m/*"
 ```
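To make the include/exclude examples above more concrete, here is a small sketch of how such filtering could behave. It assumes, without confirmation from the script itself, that patterns are shell-style globs matched with Python's `fnmatch` against each object key relative to the Zarr store root, that a key must match at least one include pattern when any are given, and that any exclude match drops the key. The helper name `should_process` and the sample keys are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Sequence


def should_process(
    key: str,
    include: Sequence[str] = (),
    exclude: Sequence[str] = (),
) -> bool:
    """Hypothetical filter: decide whether an object key is selected.

    Assumed semantics: if include patterns are given, the key must match
    at least one of them; a match on any exclude pattern rejects the key.
    """
    if include and not any(fnmatchcase(key, pattern) for pattern in include):
        return False
    return not any(fnmatchcase(key, pattern) for pattern in exclude)


if __name__ == "__main__":
    # Illustrative keys only; real Zarr stores will differ.
    keys = [
        "measurements/reflectance/r10m/b02/0.0",
        "measurements/reflectance/r60m/b01/0.0",
        "quality/mask/r60m/0.0",
        "measurements/reflectance/.zattrs",
        ".zmetadata",
    ]
    selected = [k for k in keys if should_process(k, include=["*/r60m/*"])]
    print(selected)
```

Note that in `fnmatch` the `*` wildcard also matches `/`, so under this assumption `measurements/*` selects keys at any depth below `measurements/`.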
@@ -113,14 +168,14 @@ This script can be integrated into your data pipeline workflow after the registr
 
 ```bash
 # 1. Convert
-python scripts/convert_v1_s2.py \
+uv run python scripts/convert_v1_s2.py \
     --source-url SOURCE_URL \
     --collection COLLECTION \
     --s3-output-bucket BUCKET \
     --s3-output-prefix PREFIX
 
 # 2. Register
-python scripts/register_v1.py \
+uv run python scripts/register_v1.py \
     --source-url SOURCE_URL \
     --collection COLLECTION \
     --stac-api-url STAC_API \
@@ -130,28 +185,12 @@ python scripts/register_v1.py \
     --s3-output-prefix PREFIX
 
 # 3. Change storage tier (optional)
-python scripts/change_storage_tier.py \
-    --stac-item-url STAC_ITEM_URL \
+ITEM_ID="your-item-id"
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
     --storage-class GLACIER
 ```
 
-## Environment Variables
-
-The script uses the following environment variables if set:
-
-- `AWS_ENDPOINT_URL` - S3 endpoint URL (if not provided via `--s3-endpoint`)
-- `AWS_ACCESS_KEY_ID` - AWS access key
-- `AWS_SECRET_ACCESS_KEY` - AWS secret key
-- `AWS_DEFAULT_REGION` - AWS region
-- `LOG_LEVEL` - Logging level (default: INFO)
-
-## Requirements
-
-The script requires the following Python packages:
-- `boto3` - AWS SDK for Python (S3 operations)
-- `httpx` - HTTP client (fetching STAC items)
-- `botocore` - AWS core functionality
-
 ## Error Handling
 
 The script handles various error conditions:
@@ -173,59 +212,157 @@ The script provides detailed logging at different levels:
 
 Set the `LOG_LEVEL` environment variable to control verbosity:
 ```bash
-LOG_LEVEL=DEBUG python scripts/change_storage_tier.py ...
+LOG_LEVEL=DEBUG uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
+    --storage-class GLACIER
 ```
 
 ## Examples
 
-### Archive old data to GLACIER
+### Setup
+
+First, define the STAC item ID:
 
 ```bash
-python scripts/change_storage_tier.py \
-    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/S2A_MSIL2A_20250831T103701_N0511_R008_T31TFL_20250831T145420 \
-    --storage-class GLACIER
+ITEM_ID="S2A_MSIL2A_20250831T103701_N0511_R008_T31TFL_20250831T145420"
 ```
 
-### Restore data from GLACIER to STANDARD
+### Check current storage class distribution
+
+Use dry-run to see the current storage classes without making changes:
 
 ```bash
-python scripts/change_storage_tier.py \
-    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/S2A_MSIL2A_20250831T103701_N0511_R008_T31TFL_20250831T145420 \
-    --storage-class STANDARD
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
+    --storage-class GLACIER \
+    --dry-run
 ```
 
-### Use high-performance storage
+Output example:
+```
+Summary for S2A_MSIL2A_20250831T103701_N0511_R008_T31TFL_20250831T145420:
+Total objects: 1500
+Skipped (filtered): 0
+Processed: 1500
+Succeeded: 1500
+Failed: 0
+
+Current storage class distribution:
+GLACIER: 300 objects (20.0%)
+STANDARD: 1200 objects (80.0%)
+(DRY RUN)
+```
+
+### Preview changes for specific data subset
+
+Test what would happen when archiving only 60m resolution data:
 
 ```bash
-python scripts/change_storage_tier.py \
-    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/S2A_MSIL2A_20250831T103701_N0511_R008_T31TFL_20250831T145420 \
-    --storage-class EXPRESS_ONEZONE
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
+    --storage-class GLACIER \
+    --include-pattern "*/r60m/*" \
+    --dry-run
+```
+
+### Check storage distribution for reflectance data only
+
+```bash
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
+    --storage-class GLACIER \
+    --include-pattern "measurements/reflectance/*" \
+    --dry-run
+```
+
+### Preview excluding metadata files
+
+```bash
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
+    --storage-class GLACIER \
+    --exclude-pattern "*.zattrs" \
+    --exclude-pattern "*.zmetadata" \
+    --dry-run
 ```
 
 ### Archive only reflectance data to GLACIER
 
 ```bash
-python scripts/change_storage_tier.py \
-    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/S2A_MSIL2A_20250831T103701_N0511_R008_T31TFL_20250831T145420 \
+# First, preview the changes
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
+    --storage-class GLACIER \
+    --include-pattern "measurements/reflectance/*" \
+    --dry-run
+
+# Then apply the changes
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
     --storage-class GLACIER \
     --include-pattern "measurements/reflectance/*"
 ```
 
 ### Archive all measurement data except 10m resolution
 
 ```bash
-python scripts/change_storage_tier.py \
-    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/S2A_MSIL2A_20250831T103701_N0511_R008_T31TFL_20250831T145420 \
+# Preview first
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
+    --storage-class GLACIER \
+    --include-pattern "measurements/*" \
+    --exclude-pattern "*/r10m/*" \
+    --dry-run
+
+# Apply changes
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
     --storage-class GLACIER \
     --include-pattern "measurements/*" \
     --exclude-pattern "*/r10m/*"
 ```
 
-### Test filtering with dry-run
+### Archive old data to GLACIER
 
 ```bash
-python scripts/change_storage_tier.py \
-    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/S2A_MSIL2A_20250831T103701_N0511_R008_T31TFL_20250831T145420 \
+# Preview the changes
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
     --storage-class GLACIER \
-    --include-pattern "measurements/reflectance/r60m/*" \
     --dry-run
+
+# Apply the changes
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
+    --storage-class GLACIER
+```
+
+### Restore data from GLACIER to STANDARD
+
+```bash
+# Preview the changes
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
+    --storage-class STANDARD \
+    --dry-run
+
+# Apply the changes
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
+    --storage-class STANDARD
+```
+
+### Use high-performance storage
+
+```bash
+# Preview the changes
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
+    --storage-class EXPRESS_ONEZONE \
+    --dry-run
+
+# Apply the changes
+uv run python scripts/change_storage_tier.py \
+    --stac-item-url https://api.explorer.eopf.copernicus.eu/stac/collections/sentinel-2-l2a/items/$ITEM_ID \
+    --storage-class EXPRESS_ONEZONE
+```
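The dry-run output example in this README reports a "Current storage class distribution" with counts and percentages. As a rough illustration of how such a summary could be computed, the sketch below tallies a list of storage class names. It is not taken from the script: the function name `summarize_storage_classes` is hypothetical, and mapping a missing class to `STANDARD` is an assumption based on AWS S3 omitting the storage class header for Standard objects; an S3-compatible service may behave differently.

```python
from collections import Counter
from typing import Iterable, List, Optional


def summarize_storage_classes(classes: Iterable[Optional[str]]) -> List[str]:
    """Hypothetical helper: format a per-class count and percentage.

    A value of None is counted as STANDARD (assumption, see lead-in).
    Returns one formatted line per storage class, sorted by name.
    """
    counts = Counter(name or "STANDARD" for name in classes)
    total = sum(counts.values())
    lines = []
    for name in sorted(counts):
        share = 100.0 * counts[name] / total
        lines.append(f"{name}: {counts[name]} objects ({share:.1f}%)")
    return lines


if __name__ == "__main__":
    # Same proportions as the README's output example.
    observed = ["GLACIER"] * 300 + [None] * 1200
    for line in summarize_storage_classes(observed):
        print(line)
```

In a real run the input would come from querying each object's current storage class, which is what the README says dry-run mode does before reporting the distribution.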
