# Summary of Changes

## Problem Fixed

PLC2 crashed at startup when it attempted a Modbus TCP write to PLC1 before PLC1 was ready. The write raised `ConnectionRefusedError`, which crashed the container.

## Files Changed

### 1. `tools/compile_ir.py` (CRITICAL FIX)

**Location:** Lines 17-37 in `render_plc_rules()` function

**Changes:**
- Added `import time` to generated PLC logic files
- Added `_safe_callback()` function with retry logic (up to 30 attempts, 0.2 s apart, about 6 s in total)
- Modified `_write()` to call `_safe_callback(cbs[key])` instead of calling `cbs[key]()` directly

**Impact:** All generated PLC logic files now include a safe callback wrapper that prevents crashes caused by connection failures.

### 2. `build_scenario.py` (NEW FILE)

**Purpose:** Deterministic scenario builder that uses the correct Python venv

**Features:**
- Uses `sys.executable` to ensure the correct Python interpreter
- Orchestrates: `configuration.json` → IR → `logic/*.py` → validation
- Creates a complete scenario directory at `outputs/scenario_run/`
- Validates all generated files

**Usage:**
```bash
.venv/bin/python3 build_scenario.py --out outputs/scenario_run --overwrite
```

### 3. `test_simlab.sh` (NEW FILE)

**Purpose:** Interactive ICS-SimLab test launcher

**Usage:**
```bash
./test_simlab.sh
```

### 4. `diagnose_runtime.sh` (NEW FILE)

**Purpose:** Diagnostic script that checks scenario files and Docker state

**Usage:**
```bash
./diagnose_runtime.sh
```

### 5. `RUNTIME_FIX.md` (NEW FILE)

**Purpose:** Complete documentation of the fix, testing procedures, and troubleshooting

## Testing Commands

### Build Scenario
```bash
.venv/bin/python3 build_scenario.py --out outputs/scenario_run --overwrite
```

### Verify Fix
```bash
# Should show _safe_callback function
grep -A5 "_safe_callback" outputs/scenario_run/logic/plc2.py
```

### Run ICS-SimLab
```bash
cd ~/projects/ICS-SimLab-main/curtin-ics-simlab
sudo ./start.sh ~/projects/ics-simlab-config-gen_claude/outputs/scenario_run
```

### Monitor PLC2 Logs
```bash
# Find container name
sudo docker ps | grep plc2

# View logs (there should be NO "Exception in thread" errors)
sudo docker logs -f <container_name>
```

### Stop ICS-SimLab
```bash
cd ~/projects/ICS-SimLab-main/curtin-ics-simlab
sudo ./stop.sh
```

## Expected Runtime Behavior

### Before Fix
```
PLC2 container:
  Exception in thread Thread-1:
  Traceback (most recent call last):
    ...
  ConnectionRefusedError: [Errno 111] Connection refused
  [Container crashes]
```

### After Fix (Success Case)
```
PLC2 container:
  [Silent retries for ~6 seconds]
  [Normal operation once PLC1 is ready]
  [No exceptions, no crashes]
```

### After Fix (PLC1 Never Starts)
```
PLC2 container:
  WARNING: Callback failed after 30 attempts: [Errno 111] Connection refused
  [Container continues running]
  [Retries on next write attempt]
```

## Code Diff

### tools/compile_ir.py

```python
# BEFORE (lines 17-37):
def render_plc_rules(plc_name: str, rules: List[object]) -> str:
    lines = []
    lines.append('"""\n')
    lines.append(f"PLC logic for {plc_name}: IR-compiled rules.\n\n")
    lines.append("Autogenerated by ics-simlab-config-gen (IR compiler).\n")
    lines.append('"""\n\n')
    lines.append("from typing import Any, Callable, Dict\n\n\n")
    lines.append("def _get_float(regs: Dict[str, Any], key: str, default: float = 0.0) -> float:\n")
    lines.append("    try:\n")
    lines.append("        return float(regs[key]['value'])\n")
    lines.append("    except Exception:\n")
    lines.append("        return float(default)\n\n\n")
    lines.append("def _write(out_regs: Dict[str, Any], cbs: Dict[str, Callable[[], None]], key: str, value: int) -> None:\n")
    lines.append("    if key not in out_regs:\n")
    lines.append("        return\n")
    lines.append("    cur = out_regs[key].get('value', None)\n")
    lines.append("    if cur == value:\n")
    lines.append("        return\n")
    lines.append("    out_regs[key]['value'] = value\n")
    lines.append("    if key in cbs:\n")
    lines.append("        cbs[key]()\n\n\n")  # <-- CRASHES HERE

# AFTER (lines 17-46):
def render_plc_rules(plc_name: str, rules: List[object]) -> str:
    lines = []
    lines.append('"""\n')
    lines.append(f"PLC logic for {plc_name}: IR-compiled rules.\n\n")
    lines.append("Autogenerated by ics-simlab-config-gen (IR compiler).\n")
    lines.append('"""\n\n')
    lines.append("import time\n")  # <-- ADDED
    lines.append("from typing import Any, Callable, Dict\n\n\n")
    lines.append("def _get_float(regs: Dict[str, Any], key: str, default: float = 0.0) -> float:\n")
    lines.append("    try:\n")
    lines.append("        return float(regs[key]['value'])\n")
    lines.append("    except Exception:\n")
    lines.append("        return float(default)\n\n\n")
    # ADDED: Safe callback wrapper
    lines.append("def _safe_callback(cb: Callable[[], None], retries: int = 30, delay: float = 0.2) -> None:\n")
    lines.append("    \"\"\"Invoke callback with retry logic to handle startup race conditions.\"\"\"\n")
    lines.append("    for attempt in range(retries):\n")
    lines.append("        try:\n")
    lines.append("            cb()\n")
    lines.append("            return\n")
    lines.append("        except Exception as e:\n")
    lines.append("            if attempt == retries - 1:\n")
    lines.append("                print(f\"WARNING: Callback failed after {retries} attempts: {e}\")\n")
    lines.append("                return\n")
    lines.append("            time.sleep(delay)\n\n\n")
    lines.append("def _write(out_regs: Dict[str, Any], cbs: Dict[str, Callable[[], None]], key: str, value: int) -> None:\n")
    lines.append("    if key not in out_regs:\n")
    lines.append("        return\n")
    lines.append("    cur = out_regs[key].get('value', None)\n")
    lines.append("    if cur == value:\n")
    lines.append("        return\n")
    lines.append("    out_regs[key]['value'] = value\n")
    lines.append("    if key in cbs:\n")
    lines.append("        _safe_callback(cbs[key])\n\n\n")  # <-- NOW SAFE
```

## Validation Checklist

- [x] Fix implemented in `tools/compile_ir.py`
- [x] Build script created (`build_scenario.py`)
- [x] Build script uses correct venv (`sys.executable`)
- [x] Generated files include `_safe_callback()`
- [x] Generated files call `_safe_callback(cbs[key])` not `cbs[key]()`
- [x] Only uses stdlib (`time.sleep`)
- [x] Never raises from callbacks
- [x] Preserves PLC logic contract (no signature changes)
- [x] Test scripts created
- [x] Documentation created

## Next Steps

1. Run `./diagnose_runtime.sh` to verify scenario files
2. Run `./test_simlab.sh` to start ICS-SimLab
3. Monitor PLC2 logs for crashes (should see none)
4. Verify callbacks eventually succeed once PLC1 is ready
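
The retry behavior can also be checked locally, without Docker or a real Modbus peer. The sketch below copies the `_safe_callback()` body from the generated code and drives it with `FlakyPeer`, a hypothetical stand-in for PLC1 that is not part of the repository. It refuses a configurable number of calls before accepting one. The shortened `delay` and `retries` values are chosen only to keep the run fast.

```python
import time
from typing import Callable


def _safe_callback(cb: Callable[[], None], retries: int = 30, delay: float = 0.2) -> None:
    """Same body as the wrapper emitted by tools/compile_ir.py: retry cb(), never raise."""
    for attempt in range(retries):
        try:
            cb()
            return
        except Exception as e:
            if attempt == retries - 1:
                print(f"WARNING: Callback failed after {retries} attempts: {e}")
                return
            time.sleep(delay)


class FlakyPeer:
    """Hypothetical stand-in for PLC1: refuses the first `fail_first` calls."""

    def __init__(self, fail_first: int) -> None:
        self.fail_first = fail_first
        self.calls = 0    # total attempts seen
        self.writes = 0   # attempts that succeeded

    def write(self) -> None:
        self.calls += 1
        if self.calls <= self.fail_first:
            # Same exception type PLC2 saw while PLC1 was still starting.
            raise ConnectionRefusedError(111, "Connection refused")
        self.writes += 1


# Success case: the peer becomes ready after 3 refused attempts,
# so the 4th attempt goes through and the wrapper returns quietly.
peer = FlakyPeer(fail_first=3)
_safe_callback(peer.write, retries=30, delay=0.01)

# Failure case: the peer never becomes ready. The wrapper prints a
# warning after the last attempt and returns instead of raising.
dead_peer = FlakyPeer(fail_first=10**9)
_safe_callback(dead_peer.write, retries=5, delay=0.01)
```

In the failure case the wrapper sleeps only between attempts, so the default settings wait 29 × 0.2 s (about 5.8 s) plus the time spent in the failed calls, which matches the roughly 6 seconds quoted above.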