# Fix clw-gpu-not-found: OpenClaw fails to detect a compatible GPU device


## 1. Symptoms

The clw-gpu-not-found error in OpenClaw manifests when the library attempts to initialize an OpenCL context targeting a GPU device but fails to detect any compatible hardware. This typically halts program execution during device enumeration or context creation.

Common symptoms include:

```
[ERROR] clw-gpu-not-found: No OpenCL GPU devices available on platform.
CLW_ERROR: clwGetGpuDevice() returned CLW_NO_GPU_FOUND (code: -1001)
Failed to create GPU context: clw-gpu-not-found. Falling back to CPU (performance degraded).
```


Programs that rely on GPU acceleration through OpenClaw (compute kernels, machine learning inference, scientific simulations) will crash, log the error and fall back to CPU mode, or exit with a non-zero code. On Linux, `dmesg` may show related kernel driver messages, such as a GPU module that failed to load, while ICD loader warnings appear in the application's own log output. Windows Event Viewer may log ICD-related failures under "Application" events. The performance drop is noticeable: GPU-bound kernels can run 10-100x slower on the CPU fallback, depending on the workload.

In verbose mode (e.g., `CLW_LOG_LEVEL=debug`), additional diagnostics appear:

```
[DEBUG] Platforms found: 1 (Intel CPU only)
[DEBUG] GPU devices on platform 0: 0
[ERROR] clw-gpu-not-found
```


This error blocks parallel compute workloads, causing timeouts in HPC pipelines or reduced FPS in graphics/rendering apps using OpenClaw.

## 2. Root Cause

OpenClaw is a thin C/C++ wrapper around the OpenCL 1.2/2.x APIs, providing functions like `clwGetPlatforms()`, `clwGetGpuDevice()`, and `clwCreateContext()`. The `clw-gpu-not-found` error (internal code -1001, surfaced as `CLW_NO_GPU_FOUND` or `CLW_DEVICE_NOT_FOUND`) triggers when:

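1. **No OpenCL-compatible GPU hardware**: The machine has no discrete GPU (NVIDIA/AMD), the GPU is disabled in the BIOS, or an integrated GPU (e.g., Intel UHD) has no compute runtime installed and therefore exposes no OpenCL device.

2. **Missing or incompatible drivers**:
   - NVIDIA: display driver installed without its OpenCL ICD (some distributions package the ICD separately).
   - AMD: ROCm or the legacy APP SDK installed without the OpenCL runtime.
   - Intel: GPU compute runtime absent; the oneAPI OpenCL runtime on its own typically exposes only a CPU device.
   - Vendor-agnostic: PoCL (typically CPU-only) installed but no GPU ICD.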

3. **ICD loader misconfiguration**: The OpenCL Installable Client Driver (ICD) registry (`/etc/OpenCL/vendors/` on Linux, `HKLM\SOFTWARE\Khronos\OpenCL\Vendors` on Windows) has no entry pointing to the GPU driver: a `.icd` file on Linux, a registry value on Windows.

4. **Platform/device filtering**: Code calls `clwGetGpuDevice()` without a fallback, ignoring CPU or accelerator devices.

5. **Runtime environment**: Docker containers without `--gpus all`, WSL2 without GPU passthrough, or virtual machines without GPU acceleration.

6. **Hardware state**: GPU powered off, PCIe slot disabled, or overheating/throttled.

Use `clinfo` to diagnose:

```
Number of platforms: 1
Platform Name: Intel(R) OpenCL Platform
Devices: 1 (CPU only)
```


Zero GPU entries confirm the issue. Kernel logs (`dmesg` or `journalctl -k`) and `nvidia-smi` show whether a GPU driver is actually loaded.
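One way to automate this check is to scan captured `clinfo` output for device-type lines. The sketch below assumes the human-readable `clinfo` layout, where each device prints a `Device Type` field on its own line; `countGpuDevices` is a hypothetical helper name and is not part of clinfo or OpenClaw.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Count the devices that report a GPU type in text captured from `clinfo`.
// Assumption: every device prints one line containing "Device Type",
// followed by its type (for example GPU or CPU).
std::size_t countGpuDevices(const std::string& clinfoText) {
    const std::string field = "Device Type";
    std::istringstream in(clinfoText);
    std::string line;
    std::size_t gpus = 0;
    while (std::getline(in, line)) {
        const std::size_t pos = line.find(field);
        if (pos == std::string::npos) continue;  // not a device-type line
        // Search only the value part, after the field name.
        if (line.find("GPU", pos + field.size()) != std::string::npos) ++gpus;
    }
    return gpus;
}
```

Feeding it the contents of a file produced by `clinfo > clinfo.txt` and getting zero back corresponds to the CPU-only situation described above.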

## 3. Step-by-Step Fix

Fix `clw-gpu-not-found` by verifying hardware/drivers, then updating code for robust device selection.

### Step 1: Verify GPU Hardware
Run `lspci | grep -Ei 'vga|3d|display'` (Linux), `dxdiag` (Windows), or `system_profiler SPDisplaysDataType` (macOS). On Linux, a detected GPU looks like this:

```
03:00.0 VGA compatible controller: NVIDIA Corporation GA104 [GeForce RTX 3070]
```

If no GPU is listed, enable it in the BIOS/UEFI or install the hardware.

### Step 2: Install OpenCL Drivers
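- **NVIDIA (Linux)**:

  ```bash
  # Ubuntu/Debian
  sudo apt update
  sudo apt install nvidia-driver-535 nvidia-opencl-icd ocl-icd-opencl-dev
  ```

  Package names and the driver branch (535 here) vary by distribution and release. Reboot, then verify with `nvidia-smi`.

- **AMD (Linux)**:

  ```bash
  # Requires AMD's ROCm package repository to be configured
  sudo apt install rocm-opencl-runtime
  ```

- **Intel (Linux)**:

  ```bash
  # Intel GPU compute runtime, packaged in recent Ubuntu/Debian releases
  sudo apt install intel-opencl-icd
  ```

  The oneAPI package `intel-oneapi-runtime-opencl` from Intel's apt repository (`https://apt.repos.intel.com/oneapi`) provides Intel's CPU runtime for OpenCL, so on its own it does not add a GPU device.

- **Windows**: Install the current graphics driver from the GPU vendor (NVIDIA driver, AMD Software, or Intel graphics driver), which ships the OpenCL ICD. The CUDA Toolkit or Intel oneAPI adds OpenCL headers and libraries for development.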

### Step 3: Install clinfo and Validate

```bash
sudo apt install clinfo
clinfo | grep -E "Number of platforms|Platform Name|Device Name|Device Type"
```

Expected (abbreviated):

```
Number of platforms: 2
Platform Name: NVIDIA CUDA
Device Name: NVIDIA GeForce RTX 3070
Device Type: GPU
```


Update ICD if needed:

```bash
# Linux: ensure /etc/OpenCL/vendors/nvidia.icd exists
echo "libnvidia-opencl.so.1" | sudo tee /etc/OpenCL/vendors/nvidia.icd
```
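To double-check what an ICD file actually points at, a small helper can read it back. This is a minimal sketch, assuming the file holds the vendor library name or path on its first non-blank line, as in the `nvidia.icd` example above; `readIcdLibrary` is a hypothetical name, not an OpenClaw or OpenCL function.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Return the vendor library named by an ICD file, or an empty string if the
// file is missing or contains only blank lines.
// Assumption: the first non-blank line is the library name or path.
std::string readIcdLibrary(const std::string& icdPath) {
    std::ifstream in(icdPath);
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos) continue;  // skip blank lines
        const std::size_t last = line.find_last_not_of(" \t\r");
        return line.substr(first, last - first + 1);  // trimmed entry
    }
    return "";
}
```

An empty result for `/etc/OpenCL/vendors/nvidia.icd` means the file is missing or blank. A non-empty result can then be compared against the libraries the dynamic linker knows about, for example with `ldconfig -p`.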


### Step 4: Update OpenClaw Code
Modify device selection to enumerate all devices and prefer GPU.
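Independent of the OpenClaw calls, the selection policy this step aims for can be sketched on its own: look at every device from every platform, take the first GPU if there is one, and only then fall back to a CPU. The `DeviceKind` enum, `DeviceInfo` struct, and `pickDevice` function are hypothetical illustrations, not OpenClaw types.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for whatever the device query returns.
enum class DeviceKind { Gpu, Cpu, Other };

struct DeviceInfo {
    std::string name;
    DeviceKind kind;
};

// Return the index of the preferred device: the first GPU if any device in
// the list is a GPU, otherwise the first CPU, otherwise -1.
int pickDevice(const std::vector<DeviceInfo>& devices) {
    int firstCpu = -1;
    for (std::size_t i = 0; i < devices.size(); ++i) {
        if (devices[i].kind == DeviceKind::Gpu) return static_cast<int>(i);
        if (devices[i].kind == DeviceKind::Cpu && firstCpu < 0) {
            firstCpu = static_cast<int>(i);  // remember it, keep looking for a GPU
        }
    }
    return firstCpu;
}
```

The point of scanning the full list before deciding is that a CPU found on the first platform must not win over a GPU found on a later one.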

**Before:**
```cpp
#include <openclaw/clw.h>
#include <cstdio>

int main() {
    clw_int status;
    clw_device gpu;

    status = clwGetGpuDevice(0, &gpu);  // Assumes a GPU exists at index 0
    if (status != CLW_SUCCESS) {
        fprintf(stderr, "clw-gpu-not-found\n");
        return -1;
    }

    clw_context ctx;
    clwCreateContext(&ctx, 1, &gpu);
    // ...
}
```

This fails if no GPU is present.

**After:**
```cpp
#include <openclaw/clw.h>
#include <initializer_list>
#include <iostream>
#include <vector>

// Assumes every clw* call below returns a clw_int status, like clwGetPlatformCount().
int main() {
    // Enumerate platforms
    clw_uint num_platforms = 0;
    if (clwGetPlatformCount(&num_platforms) != CLW_SUCCESS || num_platforms == 0) {
        std::cerr << "No OpenCL platforms available" << std::endl;
        return -1;
    }
    std::vector<clw_platform> platforms(num_platforms);
    if (clwGetPlatforms(platforms.data(), num_platforms) != CLW_SUCCESS) return -1;

    // Pass 1 collects GPUs from every platform. Pass 2 (CPU) runs only if
    // no platform exposed a GPU, so a CPU never shadows a GPU found later.
    std::vector<clw_device> devices;
    bool using_gpu = false;
    for (auto type : {CLW_DEVICE_TYPE_GPU, CLW_DEVICE_TYPE_CPU}) {
        for (clw_platform platform : platforms) {
            clw_uint num_devices = 0;
            if (clwGetDeviceCount(platform, type, &num_devices) != CLW_SUCCESS ||
                num_devices == 0) {
                continue;
            }
            std::vector<clw_device> found(num_devices);
            if (clwGetDevices(platform, type, found.data(), num_devices) == CLW_SUCCESS) {
                devices.insert(devices.end(), found.begin(), found.end());
            }
        }
        if (!devices.empty()) {
            using_gpu = (type == CLW_DEVICE_TYPE_GPU);
            break;
        }
    }

    if (devices.empty()) {
        std::cerr << "clw-gpu-not-found: No devices available" << std::endl;
        return -1;
    }

    clw_device selected = devices[0];  // First GPU if any, otherwise first CPU
    clw_context ctx;
    if (clwCreateContext(&ctx, 1, &selected) != CLW_SUCCESS) {
        std::cerr << "Failed to create context" << std::endl;
        return -1;
    }

    std::cout << "Using device: " << (using_gpu ? "GPU" : "CPU (fallback)") << std::endl;
    // Proceed with kernels...
    clwReleaseContext(ctx);
    return 0;
}
```

Compile: `g++ -o app app.cpp -lopenclaw -lOpenCL` (libraries go after the source file, with `-lopenclaw` ahead of the `-lOpenCL` it wraps).

### Step 5: Environment Fixes

- **Docker**: `docker run --gpus all -it your-image`
- **WSL2**: Install the NVIDIA driver for WSL on the Windows host.

## 4. Verification

1. Rerun `clinfo` and confirm that a GPU is listed with an OpenCL version of 1.2 or higher:

   ```
   Platform Profile: FULL_PROFILE
   Platform Version: OpenCL 3.0 CUDA 12.2.140
   Name: NVIDIA GeForce RTX 3070
   ```

2. Execute the updated code and expect output starting with "Using device: GPU" and no errors.

3. Monitor with `nvidia-smi -l 1` or `rocm-smi`; GPU utilization should be above 0% while kernels run.

4. Run an OpenClaw-specific test:

   ```cpp
   // Minimal test snippet
   clw_uint count = 0;
   clwGetGpuDeviceCount(&count);
   printf("GPU count: %u\n", count);  // Should be >0
   ```

5. Stress test: run a matrix-multiply kernel 100 times and benchmark FPS/throughput against the CPU fallback.

If it still fails, trace library loads on Linux with `strace -e trace=open,openat ./app`.

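## 5. Common Pitfalls

- **Driver version mismatch**: Keep the driver, OpenCL ICD, and OpenClaw/OpenCL headers on matching releases (for example, CUDA 12.2 with the 535 driver branch). Holding the driver package with `sudo apt-mark hold nvidia-driver-535` keeps an upgrade from splitting them.

- **ICD priority**: Multiple vendors installed? Platform order depends on how the ICD loader reads `/etc/OpenCL/vendors/`, so code that takes platform 0 may land on a CPU-only platform. Select the GPU platform explicitly instead of relying on order, and check which ICD files are present:

  ```bash
  # Pitfall: only PoCL installed
  ls /etc/OpenCL/vendors/  # Should list nvidia.icd, not just pocl.icd
  ```

- **Code assumes a single GPU**: `clwGetGpuDevice(0)` fails on multi-GPU systems if the index is wrong. Always enumerate.

- **Headless servers**: No X11? OpenCL compute does not need a running X server, but the NVIDIA device nodes (`/dev/nvidia*`) must exist; running `nvidia-smi` once or enabling `nvidia-persistenced` typically creates them.

- **Virtualization**: Desktop hypervisors such as VirtualBox and VMware Workstation generally do not pass a physical GPU through to the guest; use PCI passthrough (VFIO on KVM) instead.

- **macOS**: OpenCL has been deprecated since macOS 10.14 (Mojave), so the error may persist on later releases. Apple's recommended replacement is Metal.

- **Overlooking iGPUs**: Intel Arc/UHD graphics often support OpenCL 3.0; test with `clinfo`.

- **Permissions**: `sudo usermod -a -G render $USER` grants non-root access to the GPU device nodes. Log out and back in for the group change to take effect.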

⚠️ Unverified on ARM (e.g., Apple Silicon). MoltenVK is a Vulkan-on-Metal layer and does not provide OpenCL.

## 6. Related Errors

- **clw-platform-not-found**: No OpenCL platforms at all. Fix: install the ICD loader (`ocl-icd-libopencl1`).

- **clw-context-create-failed**: GPU found but context allocation fails (out of memory). Check VRAM with `nvidia-smi`.

- **clw-device-init-error**: Device selected but initialization fails (e.g., invalid properties). Use `CLW_CONTEXT_PLATFORM`.

| Error | Similarity | Quick Fix |
|---|---|---|
| clw-gpu-not-found | 100% (this article) | Drivers + enumeration |
| clw-context-create-failed | 80% | Increase VRAM limits |
| clw-no-opencl | 60% | Install libOpenCL |

Cross-reference these entries for chained failures, e.g., a platform error that precedes the GPU error.

