# Summary of Changes
## Problem Fixed
PLC2 crashed at startup when it attempted a Modbus TCP write to PLC1 before PLC1 was ready. The write raised `ConnectionRefusedError`, and the unhandled exception crashed the container.
## Files Changed
### 1. `tools/compile_ir.py` (CRITICAL FIX)
**Location:** `render_plc_rules()` function (lines 17-37 before the change, lines 17-46 after)
**Changes:**
- Added `import time` to generated PLC logic files
- Added `_safe_callback()` function with retry logic (up to 30 attempts, 0.2 s apart, roughly 6 s in total)
- Modified `_write()` to call `_safe_callback(cbs[key])` instead of direct `cbs[key]()`
**Impact:** All generated PLC logic files now include a safe callback wrapper that prevents crashes caused by connection failures.
### 2. `build_scenario.py` (NEW FILE)
**Purpose:** Deterministic scenario builder that uses the correct Python venv
**Features:**
- Uses `sys.executable` to ensure correct Python interpreter
- Orchestrates: configuration.json → IR → logic/*.py → validation
- Creates complete scenario directory at `outputs/scenario_run/`
- Validates all generated files
**Usage:**
```bash
.venv/bin/python3 build_scenario.py --out outputs/scenario_run --overwrite
```
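The interpreter-pinning idea behind `sys.executable` can be sketched as follows. This is a minimal illustration, not the actual contents of `build_scenario.py`; the helper name `run_step` and the inline child command are hypothetical.

```python
import subprocess
import sys
from typing import List


def run_step(args: List[str]) -> str:
    """Run a Python child process with the same interpreter as the parent.

    Hypothetical helper: sys.executable is the path of the running
    interpreter, so a build started with .venv/bin/python3 spawns its
    children in the same venv rather than whatever python3 is first on PATH.
    """
    result = subprocess.run(
        [sys.executable, *args],
        check=True,           # raise CalledProcessError on a non-zero exit
        capture_output=True,  # collect stdout/stderr instead of inheriting
        text=True,            # decode output as str
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # The child reports which interpreter it is running under.
    print(run_step(["-c", "import sys; print(sys.executable)"]))
```

Passing `check=True` means a failing step stops the build immediately instead of letting later steps run against stale output.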
### 3. `test_simlab.sh` (NEW FILE)
**Purpose:** Interactive ICS-SimLab test launcher
**Usage:**
```bash
./test_simlab.sh
```
### 4. `diagnose_runtime.sh` (NEW FILE)
**Purpose:** Diagnostic script to check scenario files and Docker state
**Usage:**
```bash
./diagnose_runtime.sh
```
### 5. `RUNTIME_FIX.md` (NEW FILE)
**Purpose:** Complete documentation of the fix, testing procedures, and troubleshooting
## Testing Commands
### Build Scenario
```bash
.venv/bin/python3 build_scenario.py --out outputs/scenario_run --overwrite
```
### Verify Fix
```bash
# Should show _safe_callback function
grep -A5 "_safe_callback" outputs/scenario_run/logic/plc2.py
```
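The same check can be made structurally rather than textually. The sketch below is an assumption about how one might verify a generated file, not part of the repository; the function name `uses_safe_callback` is hypothetical. It parses the source with `ast` and confirms that `_write()` calls `_safe_callback()`.

```python
import ast


def uses_safe_callback(source: str) -> bool:
    """Return True if the module defines _safe_callback and _write calls it."""
    tree = ast.parse(source)
    # Collect top-level function definitions by name.
    funcs = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}
    if "_safe_callback" not in funcs or "_write" not in funcs:
        return False
    # Look for a call of the form _safe_callback(...) anywhere inside _write.
    for node in ast.walk(funcs["_write"]):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "_safe_callback"
        ):
            return True
    return False


if __name__ == "__main__":
    fixed = (
        "def _safe_callback(cb):\n    cb()\n\n"
        "def _write(cbs, key):\n    _safe_callback(cbs[key])\n"
    )
    unfixed = "def _write(cbs, key):\n    cbs[key]()\n"
    print(uses_safe_callback(fixed), uses_safe_callback(unfixed))
```

To apply it to a real build, read the file first, for example with `pathlib.Path("outputs/scenario_run/logic/plc2.py").read_text()`, and pass the result to the function.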
### Run ICS-SimLab
```bash
cd ~/projects/ICS-SimLab-main/curtin-ics-simlab
sudo ./start.sh ~/projects/ics-simlab-config-gen_claude/outputs/scenario_run
```
### Monitor PLC2 Logs
```bash
# Find container name
sudo docker ps | grep plc2
# View logs (there should be no "Exception in thread" errors)
sudo docker logs <plc2_container_name> -f
```
### Stop ICS-SimLab
```bash
cd ~/projects/ICS-SimLab-main/curtin-ics-simlab
sudo ./stop.sh
```
## Expected Runtime Behavior
### Before Fix
```
PLC2 container:
Exception in thread Thread-1:
Traceback (most recent call last):
...
ConnectionRefusedError: [Errno 111] Connection refused
[Container crashes]
```
### After Fix (Success Case)
```
PLC2 container:
[Silent retries for up to ~6 seconds]
[Normal operation once PLC1 is ready]
[No exceptions, no crashes]
```
### After Fix (PLC1 Never Starts)
```
PLC2 container:
WARNING: Callback failed after 30 attempts: [Errno 111] Connection refused
[Container continues running]
[Retries on the next write that changes the register value]
```
## Code Diff
### tools/compile_ir.py
```python
# BEFORE (lines 17-37):
def render_plc_rules(plc_name: str, rules: List[object]) -> str:
    lines = []
    lines.append('"""\n')
    lines.append(f"PLC logic for {plc_name}: IR-compiled rules.\n\n")
    lines.append("Autogenerated by ics-simlab-config-gen (IR compiler).\n")
    lines.append('"""\n\n')
    lines.append("from typing import Any, Callable, Dict\n\n\n")
    lines.append("def _get_float(regs: Dict[str, Any], key: str, default: float = 0.0) -> float:\n")
    lines.append("    try:\n")
    lines.append("        return float(regs[key]['value'])\n")
    lines.append("    except Exception:\n")
    lines.append("        return float(default)\n\n\n")
    lines.append("def _write(out_regs: Dict[str, Any], cbs: Dict[str, Callable[[], None]], key: str, value: int) -> None:\n")
    lines.append("    if key not in out_regs:\n")
    lines.append("        return\n")
    lines.append("    cur = out_regs[key].get('value', None)\n")
    lines.append("    if cur == value:\n")
    lines.append("        return\n")
    lines.append("    out_regs[key]['value'] = value\n")
    lines.append("    if key in cbs:\n")
    lines.append("        cbs[key]()\n\n\n")  # <-- CRASHES HERE


# AFTER (lines 17-46):
def render_plc_rules(plc_name: str, rules: List[object]) -> str:
    lines = []
    lines.append('"""\n')
    lines.append(f"PLC logic for {plc_name}: IR-compiled rules.\n\n")
    lines.append("Autogenerated by ics-simlab-config-gen (IR compiler).\n")
    lines.append('"""\n\n')
    lines.append("import time\n")  # <-- ADDED
    lines.append("from typing import Any, Callable, Dict\n\n\n")
    lines.append("def _get_float(regs: Dict[str, Any], key: str, default: float = 0.0) -> float:\n")
    lines.append("    try:\n")
    lines.append("        return float(regs[key]['value'])\n")
    lines.append("    except Exception:\n")
    lines.append("        return float(default)\n\n\n")
    # ADDED: Safe callback wrapper
    lines.append("def _safe_callback(cb: Callable[[], None], retries: int = 30, delay: float = 0.2) -> None:\n")
    lines.append("    \"\"\"Invoke callback with retry logic to handle startup race conditions.\"\"\"\n")
    lines.append("    for attempt in range(retries):\n")
    lines.append("        try:\n")
    lines.append("            cb()\n")
    lines.append("            return\n")
    lines.append("        except Exception as e:\n")
    lines.append("            if attempt == retries - 1:\n")
    lines.append("                print(f\"WARNING: Callback failed after {retries} attempts: {e}\")\n")
    lines.append("                return\n")
    lines.append("            time.sleep(delay)\n\n\n")
    lines.append("def _write(out_regs: Dict[str, Any], cbs: Dict[str, Callable[[], None]], key: str, value: int) -> None:\n")
    lines.append("    if key not in out_regs:\n")
    lines.append("        return\n")
    lines.append("    cur = out_regs[key].get('value', None)\n")
    lines.append("    if cur == value:\n")
    lines.append("        return\n")
    lines.append("    out_regs[key]['value'] = value\n")
    lines.append("    if key in cbs:\n")
    lines.append("        _safe_callback(cbs[key])\n\n\n")  # <-- NOW SAFE
```
## Validation Checklist
- [x] Fix implemented in `tools/compile_ir.py`
- [x] Build script created (`build_scenario.py`)
- [x] Build script uses correct venv (`sys.executable`)
- [x] Generated files include `_safe_callback()`
- [x] Generated files call `_safe_callback(cbs[key])` not `cbs[key]()`
- [x] Only uses stdlib (`time.sleep`)
- [x] Never raises from callbacks
- [x] Preserves PLC logic contract (no signature changes)
- [x] Test scripts created
- [x] Documentation created
## Next Steps
1. Run `./diagnose_runtime.sh` to verify scenario files
2. Run `./test_simlab.sh` to start ICS-SimLab
3. Monitor PLC2 logs for crashes (should see none)
4. Verify callbacks eventually succeed once PLC1 is ready