# Summary of Changes

## Problem Fixed

PLC2 crashed at startup when it attempted a Modbus TCP write to PLC1 before PLC1 was ready. The refused connection raised `ConnectionRefusedError`, which crashed the container.

## Files Changed

### 1. `tools/compile_ir.py` (CRITICAL FIX)

**Location:** Lines 17-37 in `render_plc_rules()` function

**Changes:**
- Added `import time` to generated PLC logic files
- Added `_safe_callback()` function with retry logic (up to 30 attempts, 0.2 s apart, roughly 6 s in total)
- Modified `_write()` to call `_safe_callback(cbs[key])` instead of calling `cbs[key]()` directly

**Impact:** All generated PLC logic files now include a safe callback wrapper that prevents crashes from connection failures.

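
For reference, the helper emitted into each generated logic file looks roughly like the following once rendered. The `_safe_callback` body follows the diff in this document; `make_flaky_callback` is a hypothetical stand-in for a Modbus write that is refused until PLC1 is up, and the delay is shortened so the demo finishes quickly.

```python
import time
from typing import Callable, List


def _safe_callback(cb: Callable[[], None], retries: int = 30, delay: float = 0.2) -> None:
    """Invoke callback with retry logic to handle startup race conditions."""
    for attempt in range(retries):
        try:
            cb()
            return
        except Exception as e:
            if attempt == retries - 1:
                print(f"WARNING: Callback failed after {retries} attempts: {e}")
                return
            time.sleep(delay)


def make_flaky_callback(failures: int, calls: List[int]) -> Callable[[], None]:
    """Hypothetical callback: refuses the first `failures` calls, then succeeds."""
    def cb() -> None:
        calls.append(1)
        if len(calls) <= failures:
            raise ConnectionRefusedError(111, "Connection refused")
    return cb


calls: List[int] = []
# Short delay for the demo only; the generated default is 0.2 s.
_safe_callback(make_flaky_callback(failures=3, calls=calls), delay=0.01)
print(f"callback succeeded after {len(calls)} attempts")
```

The three refused calls are swallowed and the fourth goes through, which mirrors PLC2 waiting for PLC1 to come up.
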
### 2. `build_scenario.py` (NEW FILE)

**Purpose:** Deterministic scenario builder that uses correct Python venv

**Features:**
- Uses `sys.executable` to ensure correct Python interpreter
- Orchestrates: configuration.json → IR → logic/*.py → validation
- Creates complete scenario directory at `outputs/scenario_run/`
- Validates all generated files

**Usage:**
```bash
.venv/bin/python3 build_scenario.py --out outputs/scenario_run --overwrite
```

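
The `sys.executable` point can be illustrated with a minimal sketch. This summary does not show how `build_scenario.py` actually launches its steps, so the child command below is a hypothetical placeholder; the only claim is that passing `sys.executable` to `subprocess.run` starts the child with the same interpreter (and therefore the same venv) as the parent.

```python
import subprocess
import sys

# Hypothetical child step: a real build would pass a script path and its
# arguments here instead of the inline "-c" program.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.executable)"],
    capture_output=True,
    text=True,
    check=True,  # raise CalledProcessError if the child exits non-zero
)
child_interpreter = result.stdout.strip()
print(f"parent: {sys.executable}")
print(f"child:  {child_interpreter}")
```

Calling a bare `python3` instead would resolve through `PATH` and could pick up a system interpreter that lacks the project's dependencies.
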
### 3. `test_simlab.sh` (NEW FILE)

**Purpose:** Interactive ICS-SimLab test launcher

**Usage:**
```bash
./test_simlab.sh
```

### 4. `diagnose_runtime.sh` (NEW FILE)

**Purpose:** Diagnostic script to check scenario files and Docker state

**Usage:**
```bash
./diagnose_runtime.sh
```

### 5. `RUNTIME_FIX.md` (NEW FILE)

**Purpose:** Complete documentation of the fix, testing procedures, and troubleshooting

## Testing Commands

### Build Scenario
```bash
.venv/bin/python3 build_scenario.py --out outputs/scenario_run --overwrite
```

### Verify Fix
```bash
# Should show _safe_callback function
grep -A5 "_safe_callback" outputs/scenario_run/logic/plc2.py
```

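
As a stricter alternative to `grep`, the check can be done structurally. The sketch below is an assumption about how one might verify the output, not part of the shipped scripts: `uses_safe_callback` is a hypothetical helper, and `SAMPLE` is an inline stub standing in for the contents of `outputs/scenario_run/logic/plc2.py`.

```python
import ast


def uses_safe_callback(source: str) -> bool:
    """Return True if `source` defines _safe_callback and _write calls it."""
    tree = ast.parse(source)
    funcs = {n.name: n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)}
    if "_safe_callback" not in funcs or "_write" not in funcs:
        return False
    # Look for a direct call to _safe_callback(...) anywhere inside _write.
    return any(
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "_safe_callback"
        for node in ast.walk(funcs["_write"])
    )


# Inline stub used only for this demo.
SAMPLE = '''
def _safe_callback(cb, retries=30, delay=0.2):
    cb()

def _write(out_regs, cbs, key, value):
    if key in cbs:
        _safe_callback(cbs[key])
'''

print(uses_safe_callback(SAMPLE))
```

To check a real file, read it with `pathlib.Path(...).read_text()` and pass the text to the helper.
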
### Run ICS-SimLab
```bash
cd ~/projects/ICS-SimLab-main/curtin-ics-simlab
sudo ./start.sh ~/projects/ics-simlab-config-gen_claude/outputs/scenario_run
```

### Monitor PLC2 Logs
```bash
# Find container name
sudo docker ps | grep plc2

# View logs (look for: NO "Exception in thread" errors)
sudo docker logs <plc2_container_name> -f
```

### Stop ICS-SimLab
```bash
cd ~/projects/ICS-SimLab-main/curtin-ics-simlab
sudo ./stop.sh
```

## Expected Runtime Behavior

### Before Fix
```
PLC2 container:
Exception in thread Thread-1:
Traceback (most recent call last):
...
ConnectionRefusedError: [Errno 111] Connection refused
[Container crashes]
```

### After Fix (Success Case)
```
PLC2 container:
[Silent retries for ~6 seconds]
[Normal operation once PLC1 is ready]
[No exceptions, no crashes]
```

### After Fix (PLC1 Never Starts)
```
PLC2 container:
WARNING: Callback failed after 30 attempts: [Errno 111] Connection refused
[Container continues running]
[Retries on next write attempt]
```

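
The worst-case wait can be checked with a small simulation. This is a sketch under stated assumptions: `safe_callback` copies the control flow of the generated `_safe_callback` but takes an injected `sleep` function (a hypothetical parameter added only so the demo does not actually wait), and `always_refused` stands in for a write to a PLC1 that never starts.

```python
from typing import Callable, List


def safe_callback(cb: Callable[[], None], sleep: Callable[[float], None],
                  retries: int = 30, delay: float = 0.2) -> None:
    """Same control flow as the generated _safe_callback, with sleep injected."""
    for attempt in range(retries):
        try:
            cb()
            return
        except Exception as e:
            if attempt == retries - 1:
                print(f"WARNING: Callback failed after {retries} attempts: {e}")
                return
            sleep(delay)


def always_refused() -> None:
    """Hypothetical callback for a peer that never accepts connections."""
    raise ConnectionRefusedError(111, "Connection refused")


sleeps: List[float] = []
# Record each requested delay instead of actually sleeping.
safe_callback(always_refused, sleep=sleeps.append)
total_wait = sum(sleeps)
print(f"{len(sleeps)} sleeps, about {total_wait:.1f} s of waiting")
```

Because the last attempt returns without sleeping, the helper waits 29 times rather than 30, so the budget is about 5.8 s plus whatever time the connection attempts themselves take. The call never raises, which is why the container keeps running.
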
## Code Diff

### tools/compile_ir.py

```python
# BEFORE (lines 17-37):
def render_plc_rules(plc_name: str, rules: List[object]) -> str:
    lines = []
    lines.append('"""\n')
    lines.append(f"PLC logic for {plc_name}: IR-compiled rules.\n\n")
    lines.append("Autogenerated by ics-simlab-config-gen (IR compiler).\n")
    lines.append('"""\n\n')
    lines.append("from typing import Any, Callable, Dict\n\n\n")
    lines.append("def _get_float(regs: Dict[str, Any], key: str, default: float = 0.0) -> float:\n")
    lines.append("    try:\n")
    lines.append("        return float(regs[key]['value'])\n")
    lines.append("    except Exception:\n")
    lines.append("        return float(default)\n\n\n")
    lines.append("def _write(out_regs: Dict[str, Any], cbs: Dict[str, Callable[[], None]], key: str, value: int) -> None:\n")
    lines.append("    if key not in out_regs:\n")
    lines.append("        return\n")
    lines.append("    cur = out_regs[key].get('value', None)\n")
    lines.append("    if cur == value:\n")
    lines.append("        return\n")
    lines.append("    out_regs[key]['value'] = value\n")
    lines.append("    if key in cbs:\n")
    lines.append("        cbs[key]()\n\n\n")  # <-- CRASHES HERE

# AFTER (lines 17-46):
def render_plc_rules(plc_name: str, rules: List[object]) -> str:
    lines = []
    lines.append('"""\n')
    lines.append(f"PLC logic for {plc_name}: IR-compiled rules.\n\n")
    lines.append("Autogenerated by ics-simlab-config-gen (IR compiler).\n")
    lines.append('"""\n\n')
    lines.append("import time\n")  # <-- ADDED
    lines.append("from typing import Any, Callable, Dict\n\n\n")
    lines.append("def _get_float(regs: Dict[str, Any], key: str, default: float = 0.0) -> float:\n")
    lines.append("    try:\n")
    lines.append("        return float(regs[key]['value'])\n")
    lines.append("    except Exception:\n")
    lines.append("        return float(default)\n\n\n")
    # ADDED: Safe callback wrapper
    lines.append("def _safe_callback(cb: Callable[[], None], retries: int = 30, delay: float = 0.2) -> None:\n")
    lines.append("    \"\"\"Invoke callback with retry logic to handle startup race conditions.\"\"\"\n")
    lines.append("    for attempt in range(retries):\n")
    lines.append("        try:\n")
    lines.append("            cb()\n")
    lines.append("            return\n")
    lines.append("        except Exception as e:\n")
    lines.append("            if attempt == retries - 1:\n")
    lines.append("                print(f\"WARNING: Callback failed after {retries} attempts: {e}\")\n")
    lines.append("                return\n")
    lines.append("            time.sleep(delay)\n\n\n")
    lines.append("def _write(out_regs: Dict[str, Any], cbs: Dict[str, Callable[[], None]], key: str, value: int) -> None:\n")
    lines.append("    if key not in out_regs:\n")
    lines.append("        return\n")
    lines.append("    cur = out_regs[key].get('value', None)\n")
    lines.append("    if cur == value:\n")
    lines.append("        return\n")
    lines.append("    out_regs[key]['value'] = value\n")
    lines.append("    if key in cbs:\n")
    lines.append("        _safe_callback(cbs[key])\n\n\n")  # <-- NOW SAFE
```

## Validation Checklist

- [x] Fix implemented in `tools/compile_ir.py`
- [x] Build script created (`build_scenario.py`)
- [x] Build script uses correct venv (`sys.executable`)
- [x] Generated files include `_safe_callback()`
- [x] Generated files call `_safe_callback(cbs[key])` not `cbs[key]()`
- [x] Only uses stdlib (`time.sleep`)
- [x] Never raises from callbacks
- [x] Preserves PLC logic contract (no signature changes)
- [x] Test scripts created
- [x] Documentation created

## Next Steps

1. Run `./diagnose_runtime.sh` to verify scenario files
2. Run `./test_simlab.sh` to start ICS-SimLab
3. Monitor PLC2 logs for crashes (should see none)
4. Verify callbacks eventually succeed once PLC1 is ready