GRM-2.6-Opus Zero

Text-only GRM-2.6-Opus deployment for ZeroGPU with 4-bit loading, thinking controls, and streaming chat.

Optimized for ZeroGPU usage: text-only chat, NF4 4-bit quantization, bounded context, and shorter default generation lengths for better queue behavior.

Model: OrionLLM/GRM-2.6-Opus
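
As a rough illustration of the NF4 loading and bounded-context settings described above, here is a minimal Python sketch. It is not the Space's actual code. The limits `MAX_CONTEXT_TOKENS` and `DEFAULT_MAX_NEW_TOKENS`, the helpers `build_nf4_config`, `count_tokens_naive`, and `trim_history`, and the choice of bfloat16 as the compute dtype are all assumptions made for illustration. Only `BitsAndBytesConfig` and its `load_in_4bit`, `bnb_4bit_quant_type`, and `bnb_4bit_compute_dtype` arguments come from the `transformers` library.

```python
from typing import Callable, Dict, List

# Hypothetical limits: the real values used by this deployment are not stated.
MAX_CONTEXT_TOKENS = 4096
DEFAULT_MAX_NEW_TOKENS = 512


def build_nf4_config():
    """Build a 4-bit NF4 quantization config.

    Imports are deferred so the rest of this sketch runs without a GPU stack;
    calling this function requires torch, transformers, and bitsandbytes.
    """
    import torch
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumed compute dtype
    )


def count_tokens_naive(text: str) -> int:
    # Stand-in for a real tokenizer: counts whitespace-separated words.
    return len(text.split())


def trim_history(
    messages: List[Dict[str, str]],
    budget: int,
    count: Callable[[str], int] = count_tokens_naive,
) -> List[Dict[str, str]]:
    """Keep the most recent messages whose combined token count fits the budget."""
    kept: List[Dict[str, str]] = []
    used = 0
    # Walk from newest to oldest and stop at the first message that overflows.
    for message in reversed(messages):
        cost = count(message["content"])
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

In a setup like this, the prompt budget would typically be `MAX_CONTEXT_TOKENS - DEFAULT_MAX_NEW_TOKENS`, so that the trimmed history plus the generated reply stays within the context bound. A real deployment would pass the model tokenizer's length function as `count` instead of the word-count stand-in.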
