|
| 1 | +# Secure Mode Plugin |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +The Secure Mode plugin (`plugins/secure-mode/`) implements security validation for SageMaker deployments. It restricts model configurations to a safe subset of options and validates files before models load. |
| 6 | + |
| 7 | +## When to Work on This |
| 8 | + |
| 9 | +- Adding new security controls for SageMaker requirements |
| 10 | +- Expanding the properties allowlist for new features |
| 11 | +- Fixing security vulnerabilities in model loading |
| 12 | +- Updating file format validation rules |
| 13 | +- SageMaker platform integration changes |
| 14 | + |
| 15 | +## Architecture |
| 16 | + |
| 17 | +**Event-Driven Design:** |
| 18 | +- Plugin registers `SecureModeModelServerListener` on startup |
| 19 | +- Listener intercepts `onModelConfigured()` events |
| 20 | +- Validation runs before model loads |
| 21 | +- Throws `IllegalConfigurationException` on any violation |
| 22 | + |
| 23 | +**Key Components:** |
| 24 | +- `SecureModePlugin` - Entry point, registers listener when `SAGEMAKER_SECURE_MODE=true` |
| 25 | +- `SecureModeModelServerListener` - Event listener that triggers validation |
| 26 | +- `SecureModeUtils` - Core validation logic (properties, options, file scanning) |
| 27 | +- `SecureModeAllowList` - Defines allowlisted properties and Python executables |
| 28 | + |
| 29 | +## Security Controls |
| 30 | + |
| 31 | +### Control Types |
| 32 | + |
| 33 | +1. **DISALLOW_REQUIREMENTS_TXT** - Blocks `requirements.txt` in model directory |
| 34 | +2. **DISALLOW_PICKLE_FILES** - Only allows Safetensors (blocks .bin, .pt, .pth, .ckpt, .pkl) |
| 35 | +3. **DISALLOW_TRUST_REMOTE_CODE** - Prevents `trust_remote_code=true` |
| 36 | +4. **DISALLOW_CUSTOM_INFERENCE_SCRIPTS** - Only allows `djl_python.*` entrypoints |
| 37 | +5. **DISALLOW_CHAT_TEMPLATE** - Blocks Jinja templates in `tokenizer_config.json` |
| 38 | + |
| 39 | +### Validation Layers |
| 40 | + |
| 41 | +**Layer 1: Properties Allowlist** |
| 42 | +- Validates all properties against `PROPERTIES_ALLOWLIST` |
| 43 | +- ~100 properties explicitly allowed |
| 44 | +- Any other property throws exception |
| 45 | + |
| 46 | +**Layer 2: Options Check** |
| 47 | +- Validates specific options based on enabled controls |
| 48 | +- Checks both properties file and environment variables |
| 49 | +- Validates Python executable path |
| 50 | + |
| 51 | +**Layer 3: File Scanning** |
| 52 | +- Recursively scans `SAGEMAKER_UNTRUSTED_CHANNELS` directories |
| 53 | +- Checks file extensions and content based on controls |
| 54 | +- Parses JSON files for forbidden fields |
| 55 | + |
| 56 | +## Environment Variables |
| 57 | + |
| 58 | +**Platform-Controlled (SageMaker only):** |
| 59 | +- `SAGEMAKER_SECURE_MODE` - Enable/disable (true/false) |
| 60 | +- `SAGEMAKER_SECURITY_CONTROLS` - Comma-separated control list |
| 61 | +- `SAGEMAKER_UNTRUSTED_CHANNELS` - Comma-separated paths to scan |
| 62 | + |
| 63 | +**User-Configurable (validated by plugin):** |
| 64 | +- `OPTION_*` - Model options (validated against allowlist) |
| 65 | +- `DJL_ENTRY_POINT` - Entrypoint (must be `djl_python.*`) |
| 66 | + |
| 67 | +## Development Guidelines |
| 68 | + |
| 69 | +### Adding New Allowlisted Properties |
| 70 | + |
| 71 | +1. Add to `SecureModeAllowList.PROPERTIES_ALLOWLIST` |
| 72 | +2. Document in README |
| 73 | +3. Add test case in `SecureModePluginTest` |
| 74 | + |
| 75 | +```java |
| 76 | +public static final Set<String> PROPERTIES_ALLOWLIST = Set.of( |
| 77 | + // ... existing properties |
| 78 | + "option.new_property" // Add here |
| 79 | +); |
| 80 | +``` |
| 81 | + |
| 82 | +### Adding New Security Controls |
| 83 | + |
| 84 | +1. Define constant in `SecureModeUtils`: |
| 85 | +```java |
| 86 | +static final String NEW_CONTROL = "DISALLOW_NEW_FEATURE"; |
| 87 | +``` |
| 88 | + |
| 89 | +2. Add validation logic in `checkOptions()` or `scanForbiddenFiles()` |
| 90 | +3. Update README with control description |
| 91 | +4. Add comprehensive test cases |
| 92 | + |
| 93 | +### Testing Approach |
| 94 | + |
| 95 | +**Unit Tests (`SecureModePluginTest`):** |
| 96 | +- Mock environment variables with `MockedStatic<Utils>` |
| 97 | +- Create temporary files in `build/mock/` directories |
| 98 | +- Test each control independently |
| 99 | +- Test valid configurations that should pass |
| 100 | +- Test environment variable vs properties file |
| 101 | + |
| 102 | +**Test Pattern:** |
| 103 | +```java |
| 104 | +@Test(expectedExceptions = IllegalConfigurationException.class) |
| 105 | +void testNewControl() throws IOException, ModelException { |
| 106 | + mockSecurityEnv( |
| 107 | + SecureModeUtils.NEW_CONTROL, |
| 108 | + TEST_MODEL_DIR.resolve("file.txt"), |
| 109 | + "content" |
| 110 | + ); |
| 111 | +} |
| 112 | +``` |
| 113 | + |
| 114 | +### Common Patterns |
| 115 | + |
| 116 | +**Checking Properties:** |
| 117 | +```java |
| 118 | +String value = prop.getProperty("option.key"); |
| 119 | +if (value != null && !isAllowed(value)) { |
| 120 | + throw new IllegalConfigurationException("Message"); |
| 121 | +} |
| 122 | +``` |
| 123 | + |
| 124 | +**Checking Environment Variables:** |
| 125 | +```java |
| 126 | +String value = Utils.getenv("ENV_VAR", Utils.getenv("FALLBACK")); |
| 127 | +if (value != null && !isAllowed(value)) { |
| 128 | + throw new IllegalConfigurationException("Message"); |
| 129 | +} |
| 130 | +``` |
| 131 | + |
| 132 | +**File Scanning:** |
| 133 | +```java |
| 134 | +try (Stream<Path> stream = Files.walk(dir)) { |
| 135 | + for (Path p : stream.collect(Collectors.toList())) { |
| 136 | + if (isForbidden(p)) { |
| 137 | + throw new IllegalConfigurationException("Message"); |
| 138 | + } |
| 139 | + } |
| 140 | +} |
| 141 | +``` |
| 142 | + |
| 143 | +## Integration Points |
| 144 | + |
| 145 | +### Model Loading Flow |
| 146 | + |
| 147 | +``` |
| 148 | +ModelServer.registerModel() |
| 149 | + ↓ |
| 150 | +EventManager.notifyListeners(MODEL_CONFIGURED) |
| 151 | + ↓ |
| 152 | +SecureModeModelServerListener.onModelConfigured() |
| 153 | + ↓ |
| 154 | +SecureModeUtils.validateSecurity() |
| 155 | + ↓ |
| 156 | +[Validation passes] → Model loads |
| 157 | +[Validation fails] → IllegalConfigurationException → Model rejected |
| 158 | +``` |
| 159 | + |
| 160 | +### Adapter Loading Flow |
| 161 | + |
| 162 | +``` |
| 163 | +Adapter.newInstance() [COMMON PATH for both static and dynamic] |
| 164 | + ↓ |
| 165 | +Adapter.validateAdapterSecurity() |
| 166 | + ↓ |
| 167 | +[Validation passes] → Adapter created → register() → modelInfo.registerAdapter() |
| 168 | +[Validation fails] → IllegalConfigurationException → Adapter NOT created → NOT in LIST API |
| 169 | +``` |
| 170 | + |
| 171 | +**Static Adapters (Workflow Config):** |
| 172 | +``` |
| 173 | +AdapterWorkflowFunction.prepare() |
| 174 | + ↓ |
| 175 | +Adapter.newInstance() → validateAdapterSecurity() |
| 176 | + ↓ |
| 177 | +[Fails] → Workflow initialization fails → Model doesn't load |
| 178 | +``` |
| 179 | + |
| 180 | +**Dynamic Adapters (API):** |
| 181 | +``` |
| 182 | +POST /models/{model}/adapters |
| 183 | + ↓ |
| 184 | +AdapterManagementRequestHandler.handleRegisterAdapter() |
| 185 | + ↓ |
| 186 | +Adapter.newInstance() → validateAdapterSecurity() |
| 187 | + ↓ |
| 188 | +[Fails] → HTTP error response → Adapter NOT registered |
| 189 | +``` |
| 190 | + |
| 191 | +### Engine Filtering |
| 192 | + |
| 193 | +Only Python and MPI engines are validated: |
| 194 | +```java |
| 195 | +String engine = modelInfo.getEngineName(); |
| 196 | +if (!"Python".equals(engine) && !"MPI".equals(engine)) { |
| 197 | + logger.info("Skipping security check for engine: {}", engine); |
| 198 | + return; |
| 199 | +} |
| 200 | +``` |
| 201 | + |
| 202 | +### SageMaker Platform |
| 203 | + |
| 204 | +- SageMaker sets environment variables in container |
| 205 | +- Plugin automatically activates when `SAGEMAKER_SECURE_MODE=true` |
| 206 | +- No user configuration required |
| 207 | +- Validation happens before model serves any requests |
| 208 | + |
| 209 | +## Troubleshooting |
| 210 | + |
| 211 | +### Common Issues |
| 212 | + |
| 213 | +**"Security Controls environment variable is not set"** |
| 214 | +- `SAGEMAKER_SECURITY_CONTROLS` must be set when secure mode is enabled |
| 215 | +- SageMaker platform should set this automatically |
| 216 | + |
| 217 | +**"Property X is prohibited from being set in Secure Mode"** |
| 218 | +- Property not in allowlist |
| 219 | +- Add to `PROPERTIES_ALLOWLIST` if it should be allowed |
| 220 | + |
| 221 | +**"Custom entrypoint is prohibited in Secure Mode"** |
| 222 | +- Entrypoint must start with `djl_python.` |
| 223 | +- Check `option.entryPoint`, `DJL_ENTRY_POINT`, `OPTION_ENTRYPOINT` |
| 224 | +- Ensure no `model.py` file exists |
| 225 | + |
| 226 | +**"Pickle-based files found in directory"** |
| 227 | +- Convert model to Safetensors format |
| 228 | +- Remove .bin, .pt, .pth, .ckpt, .pkl files from untrusted directories |
| 229 | + |
| 230 | +### Debugging |
| 231 | + |
| 232 | +Enable debug logging: |
| 233 | +```properties |
| 234 | +# log4j2.xml |
| 235 | +<Logger name="ai.djl.serving.plugins.securemode" level="debug"/> |
| 236 | +``` |
| 237 | + |
| 238 | +Check environment variables: |
| 239 | +```bash |
| 240 | +echo $SAGEMAKER_SECURE_MODE |
| 241 | +echo $SAGEMAKER_SECURITY_CONTROLS |
| 242 | +echo $SAGEMAKER_UNTRUSTED_CHANNELS |
| 243 | +``` |
| 244 | + |
| 245 | +## Performance Considerations |
| 246 | + |
| 247 | +- File scanning is recursive and may be slow for large directories |
| 248 | +- Validation runs once per model at load time (not per request) |
| 249 | +- Properties validation is fast (Set lookup) |
| 250 | +- JSON parsing only for `tokenizer_config.json` files |
| 251 | + |
| 252 | +## Security Best Practices |
| 253 | + |
| 254 | +1. **Minimize allowlist** - Only add properties that are truly needed |
| 255 | +2. **Validate early** - Fail fast before model loads |
| 256 | +3. **Clear error messages** - Help users understand what's blocked and why |
| 257 | +4. **Test thoroughly** - Each control should have positive and negative tests |
| 258 | +5. **Document changes** - Update README when modifying allowlist or controls |
| 259 | + |
| 260 | +## Adapter Security (NEW) |
| 261 | + |
| 262 | +### Implementation |
| 263 | + |
| 264 | +Adapter validation is implemented in `wlm/src/main/java/ai/djl/serving/wlm/Adapter.java`: |
| 265 | + |
| 266 | +```java |
| 267 | +public static <I, O> Adapter<I, O> newInstance(...) { |
| 268 | + // Validate adapter security FIRST |
| 269 | + validateAdapterSecurity(src); |
| 270 | + |
| 271 | + // Then create adapter |
| 272 | + return new PyAdapter(...); |
| 273 | +} |
| 274 | + |
| 275 | +private static void validateAdapterSecurity(String adapterPath) { |
| 276 | + // Check for model.py, requirements.txt, pickle files |
| 277 | +} |
| 278 | +``` |
| 279 | + |
| 280 | +### What Gets Validated |
| 281 | + |
| 282 | +When adapters are loaded (statically or dynamically): |
| 283 | + |
| 284 | +1. **DISALLOW_CUSTOM_INFERENCE_SCRIPTS** - Blocks `model.py` in adapter directory |
| 285 | +2. **DISALLOW_REQUIREMENTS_TXT** - Blocks `requirements.txt` in adapter directory |
| 286 | +3. **DISALLOW_PICKLE_FILES** - Blocks `.bin`, `.pt`, `.pth`, `.ckpt`, `.pkl` files (recursively) |
| 287 | + |
| 288 | +### Testing |
| 289 | + |
| 290 | +Comprehensive tests in `SecureModeAdapterTest.java`: |
| 291 | +- Adapter with `model.py` → blocked |
| 292 | +- Adapter with `requirements.txt` → blocked |
| 293 | +- Adapter with pickle files → blocked |
| 294 | +- Adapter with `.safetensors` → allowed |
| 295 | +- Multiple violations → fails on first |
| 296 | +- Secure mode disabled → all allowed |
| 297 | + |
| 298 | +### Key Points |
| 299 | + |
| 300 | +- **Single validation point**: `Adapter.newInstance()` is the common path for all adapter loading |
| 301 | +- **Fail fast**: Validation happens before adapter object is created |
| 302 | +- **Hard fail**: Exception propagates, adapter never registered |
| 303 | +- **Clean state**: Failed adapters don't appear in LIST API |
| 304 | + |
| 305 | +## Related Code |
| 306 | + |
| 307 | +- `wlm/src/main/java/ai/djl/serving/wlm/Adapter.java` - Adapter creation and validation |
| 308 | +- `wlm/src/main/java/ai/djl/serving/wlm/ModelInfo.java` - Model and adapter registry |
| 309 | +- `serving/src/main/java/ai/djl/serving/http/AdapterManagementRequestHandler.java` - Dynamic adapter API |
| 310 | +- `serving/src/main/java/ai/djl/serving/workflow/function/AdapterWorkflowFunction.java` - Static adapter loading |
| 311 | +- `serving/src/main/java/ai/djl/serving/wlm/util/EventManager.java` - Event system |
| 312 | +- `serving/src/main/java/ai/djl/serving/http/IllegalConfigurationException.java` - Exception type |
| 313 | +- `engines/python/setup/djl_python/adapter_manager_mixin.py` - Python adapter management |
| 314 | +- `engines/python/` - Python engine that secure mode validates |
| 315 | + |
| 316 | +## References |
| 317 | + |
| 318 | +- Plugin system: `serving/docs/plugin_management.md` |
| 319 | +- Model configuration: `serving/docs/configuration.md` |
| 320 | +- SageMaker integration: AWS SageMaker documentation |
0 commit comments