Commit 0d9eb75

Author: Dan Gil
Fix pre-commit issues: import order and EOF
1 parent d0462cc commit 0d9eb75

6 files changed: +194 -5 lines changed

Lines changed: 192 additions & 0 deletions
@@ -0,0 +1,192 @@
<!-- bdcc8fc1-5a9f-4b9b-a057-5e06b697beac 0568a9a9-d98d-469d-ace5-dd2c44b3f5c5 -->
# Robust Dependency Extraction System

## Overview

Transform the dependency extraction script to be resilient against repo structure changes through configuration-based sources, file discovery, validation, and comprehensive maintenance documentation.

## Implementation Steps

### 1. Create Configuration File (`config.yaml`)

Create `scripts_extract_dependencies/config.yaml` with:

- Component definitions (trtllm, vllm, sglang, operator, shared)
- Source file patterns using glob patterns and fallback locations
- Baseline dependency count
- GitHub repository settings

Structure:

```yaml
github:
  repo: "ai-dynamo/dynamo"
  branch: "main"

baseline:
  dependency_count: 251

components:
  trtllm:
    dockerfiles:
      - "container/Dockerfile.trtllm"
      - "containers/Dockerfile.trtllm" # fallback
    scripts: []

  vllm:
    dockerfiles:
      - "container/Dockerfile.vllm"
    scripts:
      - "container/deps/vllm/install_vllm.sh"

  sglang:
    dockerfiles:
      - "container/Dockerfile.sglang"

  operator:
    dockerfiles:
      - "deploy/cloud/operator/Dockerfile"
    go_modules:
      - "deploy/cloud/operator/go.mod"

  shared:
    dockerfiles:
      - "container/Dockerfile"
    requirements:
      - pattern: "container/deps/requirements*.txt"
        exclude: []
    pyproject:
      - "pyproject.toml"
      - "benchmarks/pyproject.toml"
```

### 2. Add Configuration Loader

Modify `extract_dependency_versions.py`:

- Add `load_config()` method to DependencyExtractor class
- Support YAML parsing (add pyyaml to dependencies if not present, or use json as fallback)
- Validate configuration structure
- Merge CLI args with config file settings
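
The loader described in this step could be sketched roughly as follows. This is an illustration under stated assumptions, not the script's actual code: it is written as standalone functions rather than a `DependencyExtractor` method, the required top-level keys (`github`, `components`) are taken from the config structure in step 1, and the helper names `parse_config_text` and `validate_config` and the `cli_overrides` parameter are hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional

try:
    import yaml  # PyYAML is optional; JSON is the fallback
except ImportError:
    yaml = None

# Assumed minimum structure, based on the config.yaml layout in step 1.
REQUIRED_KEYS = ("github", "components")


def parse_config_text(text: str) -> Dict[str, Any]:
    """Parse config text as YAML when PyYAML is installed, otherwise as JSON."""
    data = yaml.safe_load(text) if yaml is not None else json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("config root must be a mapping")
    return data


def validate_config(config: Dict[str, Any]) -> None:
    """Raise ValueError when the config lacks the assumed top-level structure."""
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError(f"config is missing required keys: {missing}")
    if not isinstance(config["components"], dict):
        raise ValueError("'components' must map component names to settings")


def load_config(path: Path, cli_overrides: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Load, validate, and merge CLI overrides into the 'github' section."""
    config = parse_config_text(Path(path).read_text(encoding="utf-8"))
    validate_config(config)
    github = dict(config["github"])
    for key, value in (cli_overrides or {}).items():
        if value is not None:  # None means "not given on the command line"
            github[key] = value
    config["github"] = github
    return config
```

Letting CLI values win over file values only when they are explicitly set keeps the config file as the default source of truth while still allowing one-off overrides.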

### 3. Implement File Discovery

Add new methods to DependencyExtractor:

- `discover_files(patterns: List[str]) -> List[Path]`: Find files matching patterns with fallbacks
- `validate_critical_files() -> Dict[str, bool]`: Check if critical files exist
- `find_file_alternatives(base_pattern: str) -> Optional[Path]`: Try common variations

Update `extract_all()` to:

- Use config-driven file discovery instead of hardcoded paths
- Try multiple location patterns before failing
- Report missing files with suggestions
- Continue processing other components even if one fails
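
A minimal sketch of the discovery behaviour described above, using `pathlib` globbing. Several things here are assumptions: the explicit `repo_root` parameter (the planned method takes only `patterns`), the choice to collect matches from every pattern rather than stopping at the first hit, and the hypothetical `discover_all` helper. For simplicity it also assumes each file group is a plain list of pattern strings, unlike the `pattern`/`exclude` mappings under `requirements`.

```python
from pathlib import Path
from typing import Dict, List, Tuple


def discover_files(repo_root: Path, patterns: List[str]) -> List[Path]:
    """Collect existing files for every pattern, in order, without duplicates.

    A fallback entry costs nothing when the primary path exists, and a list
    of several wanted files (e.g. two pyproject.toml paths) is handled too.
    """
    found: List[Path] = []
    for pattern in patterns:
        for match in sorted(repo_root.glob(pattern)):
            if match.is_file() and match not in found:
                found.append(match)
    return found


def discover_all(
    repo_root: Path, components: Dict[str, Dict[str, List[str]]]
) -> Tuple[Dict[str, List[Path]], List[str]]:
    """Discover files per component; record misses instead of raising."""
    discovered: Dict[str, List[Path]] = {}
    missing: List[str] = []
    for name, groups in components.items():
        files: List[Path] = []
        for kind, patterns in groups.items():
            matches = discover_files(repo_root, patterns)
            if patterns and not matches:
                missing.append(f"{name}.{kind}: none of {patterns} found")
            files.extend(matches)
        discovered[name] = files  # other components continue regardless
    return discovered, missing
```

Returning the misses as data, rather than raising on the first one, is what lets the caller keep processing the remaining components and decide later whether a missing file is fatal.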

### 4. Enhanced Error Handling

Add comprehensive error tracking:

- Track missing files separately from extraction errors
- Collect warnings for unversioned dependencies
- Generate summary report of extraction success/failures
- Add `--strict` mode that fails on missing files vs. warning mode (default)

Add new summary sections:

```
Extraction Summary:
  Files Processed: 15/18
  Files Missing: 3
    - container/deps/requirements.standard.txt (optional)
    - ...
  Components:
    trtllm: ✓ Complete
    vllm: ⚠ Partial (missing install script)
    ...
```
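
One way the tracking and summary could fit together is sketched below. The `ExtractionReport` class and its field names are hypothetical, and the rule that extraction errors always fail while missing files fail only under `--strict` is an assumption about how the two modes would differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ExtractionReport:
    """Keeps missing files, extraction errors, and warnings in separate lists."""

    processed: List[str] = field(default_factory=list)
    missing: List[str] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)
    warnings: List[str] = field(default_factory=list)
    component_status: Dict[str, str] = field(default_factory=dict)

    def exit_code(self, strict: bool) -> int:
        """Assumed policy: errors always fail; missing files fail only if strict."""
        if self.errors or (strict and self.missing):
            return 1
        return 0

    def render(self) -> str:
        """Format the report in the layout of the summary example above."""
        total = len(self.processed) + len(self.missing)
        lines = [
            "Extraction Summary:",
            f"  Files Processed: {len(self.processed)}/{total}",
            f"  Files Missing: {len(self.missing)}",
        ]
        lines += [f"    - {path}" for path in self.missing]
        lines.append("  Components:")
        lines += [
            f"    {name}: {status}"
            for name, status in self.component_status.items()
        ]
        return "\n".join(lines)
```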

### 5. Create Maintenance Documentation

Create `scripts_extract_dependencies/MAINTENANCE.md`:

**Sections:**

- How to add new components (step-by-step)
- How to add new file types (requirements, dockerfiles, etc.)
- How to update file paths when repo structure changes
- How to update extraction patterns for new file formats
- Troubleshooting guide for common issues
- Config file reference documentation
- How to update baseline count
- Testing checklist before committing changes

### 6. Add Validation & Testing

Add `--validate` mode:

- Check config file syntax
- Verify all configured paths exist
- Test extraction patterns without writing output
- Report configuration issues

Add `--dry-run` mode:

- Show what files would be processed
- Display discovered files
- Skip actual extraction
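
The two modes could be wired into the command line along these lines. This is a sketch with assumptions: the `--config` option and the `select_mode` helper are hypothetical, and making `--validate` and `--dry-run` mutually exclusive is a design choice the plan does not state.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Extract dependency versions")
    # Hypothetical option; the plan only says CLI args merge with the config.
    parser.add_argument("--config", default="config.yaml", help="path to the config file")
    parser.add_argument("--strict", action="store_true", help="fail when a configured file is missing")
    # Assumed to be exclusive: both modes skip writing output, for different reasons.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--validate", action="store_true", help="check config and paths, write nothing")
    mode.add_argument("--dry-run", action="store_true", help="list files that would be processed")
    return parser


def select_mode(argv: Optional[List[str]] = None) -> str:
    """Return 'validate', 'dry-run', or 'extract' for the given arguments."""
    args = build_parser().parse_args(argv)
    if args.validate:
        return "validate"
    if args.dry_run:
        return "dry-run"
    return "extract"
```

For example, `select_mode(["--dry-run", "--strict"])` selects the dry-run path, while an empty argument list falls through to normal extraction.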

### 7. Update README

Update `scripts_extract_dependencies/README.md`:

- Add section on configuration file
- Document file discovery behavior
- Explain how to handle missing files
- Add troubleshooting section
- Link to MAINTENANCE.md
- Add examples for common maintenance tasks

### 8. Add Version Detection Improvements

Enhance extraction methods:

- Better regex patterns for version strings
- Support more version specifier formats (>=, ~=, ^, etc.)
- Extract versions from comments if present
- Add heuristics to guess versions from Git tags/branches when "latest" is used
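
As an illustration of broader specifier support, the sketch below splits a requirements-style line into a package name and a list of `(operator, version)` pairs. The function name `parse_requirement` and both regexes are hypothetical and deliberately simple: URL requirements and option lines such as `-r other.txt` are not handled, and `^` is accepted only because the plan lists it, even though it is not a standard pip operator.

```python
import re
from typing import List, Optional, Tuple

# Longer operators come first so "===" is not read as "==" followed by "=".
SPEC_RE = re.compile(r"(===|==|~=|!=|<=|>=|<|>|\^)\s*([A-Za-z0-9_.*+!-]+)")
NAME_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def parse_requirement(line: str) -> Optional[Tuple[str, List[Tuple[str, str]]]]:
    """Split 'name>=1.0,<2.0' into ('name', [('>=', '1.0'), ('<', '2.0')])."""
    line = line.split("#", 1)[0]  # drop trailing comments
    line = line.split(";", 1)[0].strip()  # drop environment markers
    if not line:
        return None
    match = NAME_RE.match(line)
    if match is None:
        return None  # option lines such as "-r other.txt" land here
    rest = line[match.end():]
    return match.group(1), SPEC_RE.findall(rest)
```

An empty specifier list, as returned for a bare `requests` line, is the signal that could feed the unversioned-dependency warnings from step 4.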

## Files to Create/Modify

**New Files:**

- `scripts_extract_dependencies/config.yaml` - Configuration
- `scripts_extract_dependencies/MAINTENANCE.md` - Maintenance guide

**Modified Files:**

- `scripts_extract_dependencies/extract_dependency_versions.py` - Add config loading, discovery, validation
- `scripts_extract_dependencies/README.md` - Add config documentation, update examples

## Expected Outcomes

After implementation:

1. Script survives file moves - uses discovery patterns
2. Easy to add new components - edit config.yaml
3. Clear error messages - shows what's missing and where to look
4. Maintainable - documentation guides future updates
5. Validated - catches config errors before extraction
6. Flexible - multiple fallback locations, graceful degradation

### To-dos

- [ ] Create config.yaml with component definitions, file patterns, and settings
- [ ] Add configuration loading and validation to DependencyExtractor class
- [ ] Implement file discovery with glob patterns and fallback locations
- [ ] Add comprehensive error tracking and reporting with strict/warning modes
- [ ] Create MAINTENANCE.md with guides for adding components, updating paths, troubleshooting
- [ ] Add --validate and --dry-run modes for testing configuration
- [ ] Update README.md with configuration documentation and troubleshooting
- [ ] Enhance version extraction with better patterns and heuristics

.github/.DS_Store

6 KB
Binary file not shown.

.github/reports/.DS_Store

6 KB
Binary file not shown.

.github/reports/README.md

Lines changed: 0 additions & 1 deletion
@@ -128,4 +128,3 @@ python3 .github/workflows/extract_dependency_versions.py --help
 - ⚙️ [Configuration](../workflows/extract_dependency_versions_config.yaml)
 - 📋 [Nightly Workflow](../workflows/dependency-extraction-nightly.yml)
 - 📸 [Release Workflow](../workflows/dependency-extraction-release.yml)
-

.github/workflows/extract_dependency_versions.py

Lines changed: 2 additions & 3 deletions
@@ -27,12 +27,12 @@
 
 import argparse
 import csv
-from datetime import datetime
 import glob as glob_module
 import json
 import re
+from datetime import datetime
 from pathlib import Path
-from typing import List, Dict, Tuple, Optional, Set
+from typing import Dict, List, Optional, Set, Tuple
 
 try:
     import yaml
@@ -1758,4 +1758,3 @@ def main():
 
 if __name__ == "__main__":
     main()
-

.github/workflows/extract_dependency_versions_config.yaml

Lines changed: 0 additions & 1 deletion
@@ -170,4 +170,3 @@ extraction:
 
   go_mod:
     skip_indirect: false # Set to true to skip indirect dependencies
-
