No conversation
Pick or create a chat to begin
OFFLINE
Ready.

Browser-only LLM chat

Your messages never leave this device: the model runs entirely in your browser via WebGPU. The first load downloads the model weights once and caches them; after that, the app works fully offline.
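The download-once-then-cache behavior described above can be sketched as a cache-first loader. This is an illustrative assumption, not this app's actual code; the `WeightCache` interface and function names are hypothetical stand-ins for however the app wraps the browser's Cache API.

```typescript
// Hypothetical sketch of a cache-first weight loader (names are assumptions).
interface WeightCache {
  match(url: string): Promise<Uint8Array | undefined>;
  put(url: string, bytes: Uint8Array): Promise<void>;
}

async function loadWeights(
  url: string,
  cache: WeightCache,
  download: (url: string) => Promise<Uint8Array>,
): Promise<Uint8Array> {
  const hit = await cache.match(url);
  if (hit) return hit;               // cached: later loads need no network at all
  const bytes = await download(url); // first load: fetch weights over the network
  await cache.put(url, bytes);       // store them so the next session is offline
  return bytes;
}
```

In a real browser the `WeightCache` role would typically be played by the Cache API (`caches.open(...)` plus `cache.match`/`cache.put`), which persists across sessions and so makes the second visit work offline.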

System prompt (per conversation)

WebGPU not detected

This app runs a real LLM directly inside your browser using WebGPU. Your browser doesn't appear to expose a WebGPU adapter, so on-device inference isn't available right now.

Requirements
  • Chrome / Edge 113+ on desktop (or Chrome 121+ on Android, behind a flag)
  • Hardware GPU with ~2 GB free VRAM (more for larger models)
  • HTTPS or localhost (WebGPU is blocked on plain http://)
  • If needed, enable the flag: chrome://flags/#enable-unsafe-webgpu
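The "WebGPU adapter" check the message refers to can be sketched as two steps: the `navigator.gpu` object must exist (it is absent in older browsers and on plain http://), and `requestAdapter()` must resolve to a non-null adapter (it returns null when no usable GPU is available). The helper names below are illustrative, not this app's actual code; `NavigatorLike` is a hypothetical type so the sketch stays self-contained.

```typescript
// Hypothetical feature-detection sketch (names are assumptions).
type NavigatorLike = { gpu?: { requestAdapter(): Promise<unknown | null> } };

// Step 1: is the WebGPU API exposed at all?
function hasWebGPUApi(nav: NavigatorLike): boolean {
  return typeof nav.gpu?.requestAdapter === "function";
}

// Step 2: does the browser actually hand us an adapter?
async function getAdapterOrNull(nav: NavigatorLike): Promise<unknown | null> {
  if (!hasWebGPUApi(nav)) return null; // API missing: old browser or insecure origin
  return nav.gpu!.requestAdapter();    // may still resolve to null (no usable GPU)
}
```

In a page you would pass the real `navigator`; a null result from either step is what triggers a "WebGPU not detected" screen like this one.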

You can still try the UI: Demo Mode runs a tiny rule-based echo bot, so the chat is fully navigable without GPU support.
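A rule-based echo bot of the kind Demo Mode describes can be sketched in a few lines. This is a minimal illustration under my own assumptions, not the app's actual demo logic; the function name and replies are made up.

```typescript
// Hypothetical sketch of a tiny rule-based echo bot (not the app's real code).
function demoReply(userMessage: string): string {
  const text = userMessage.trim();
  if (text === "") {
    return "Say something and I'll echo it back."; // rule: empty input
  }
  if (/^(hi|hello|hey)\b/i.test(text)) {
    return "Hello! (Demo Mode: no model is running.)"; // rule: greeting
  }
  return `You said: "${text}"`; // default rule: echo
}
```

Because the replies come from fixed rules rather than a model, the whole chat flow can be exercised with no GPU, no network, and no downloaded weights.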