<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Clw-Llm-Crash on ErrorVault — Developer Error Code Dictionary</title><link>https://errorvault.dev/tags/clw-llm-crash/</link><description>Recent content in Clw-Llm-Crash on ErrorVault — Developer Error Code Dictionary</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Mon, 13 Apr 2026 22:24:49 +0800</lastBuildDate><atom:link href="https://errorvault.dev/tags/clw-llm-crash/feed.xml" rel="self" type="application/rss+xml"/><item><title>Fix clw-llm-crash: OpenClaw LLM inference crashes during model execution</title><link>https://errorvault.dev/openclaw/openclaw-clw-llm-crash-inference-failure/</link><pubDate>Mon, 13 Apr 2026 22:24:49 +0800</pubDate><guid>https://errorvault.dev/openclaw/openclaw-clw-llm-crash-inference-failure/</guid><description>&lt;h2 id="1-symptoms">1. Symptoms&lt;/h2>
&lt;p>The &lt;code>clw-llm-crash&lt;/code> error in OpenClaw manifests as an abrupt termination of the LLM inference process. Common indicators include:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Core dump or segmentation fault&lt;/strong>: Process exits with signal 11 (SIGSEGV) during model loading or token generation.&lt;/li>
&lt;li>&lt;strong>Log output&lt;/strong>:
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#282a36;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-fallback" data-lang="fallback">&lt;span style="display:flex;">&lt;span>[ERROR] clw-llm-crash: LLM backend failed at tensor allocation (line 1423, llm_engine.cpp)
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>[FATAL] GPU context lost: CUDA error 700 (cudaErrorIllegalAddress)
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>Aborted (core dumped)
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;/li>
&lt;li>&lt;strong>High resource usage spike&lt;/strong>: VRAM usage spikes to 100%, followed by a CUDA out-of-memory condition or a driver reset; on systems where host RAM is also exhausted, the kernel OOM killer may terminate the process.&lt;/li>
&lt;li>&lt;strong>Reproducible on specific models&lt;/strong>: Crashes consistently with quantized GGUF models (e.g., Llama-3-8B-Q4_K_M.gguf) but not unquantized ones.&lt;/li>
&lt;li>&lt;strong>Platform-specific&lt;/strong>: More frequent on NVIDIA GPUs with CUDA 11.x; AMD ROCm users report HIP kernel panics.&lt;/li>
&lt;/ul>
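&lt;p>The VRAM spike described above can often be confirmed before attaching a debugger. A quick check on NVIDIA systems (standard &lt;code>nvidia-smi&lt;/code> driver tooling; ROCm users can substitute &lt;code>rocm-smi&lt;/code>):&lt;/p>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#282a36;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-fallback" data-lang="fallback">&lt;span style="display:flex;">&lt;span># Poll GPU memory once per second while reproducing the crash
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 1
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span># Check whether the kernel OOM killer fired or the driver logged an Xid error
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>dmesg | grep -iE &amp;#39;out of memory|xid&amp;#39;
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>
&lt;p>NVIDIA &lt;code>Xid&lt;/code> entries in &lt;code>dmesg&lt;/code> indicate driver-level GPU errors and usually accompany a context loss like CUDA error 700.&lt;/p>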
&lt;p>Stack traces typically point to &lt;code>llm_engine.cpp&lt;/code> or &lt;code>claw_cuda_backend.cu&lt;/code> in the OpenClaw source tree. Attach &lt;code>gdb&lt;/code> (or &lt;code>cuda-gdb&lt;/code> for device-side faults) for deeper inspection.&lt;/p></description></item></channel></rss>