Release: Cerebras, published Dec 8, 2025

Cerebras/vscode-cerebras-chat v0.1.18

v0.1.18 - Rate Limit Optimization & Model Updates

Repository: Cerebras/vscode-cerebras-chat

Tag: v0.1.18

Published: 2025-12-08T21:35:19Z

Prerelease: no

Release notes:

Features

  • Default max_completion_tokens is now a conservative 8192 to prevent premature rate limiting
  • The Cerebras rate limiter estimates quota up front from the requested max_completion_tokens, not from the tokens actually generated
  • Lower defaults preserve rate limit headroom for agentic tools, which issue many requests in quick succession
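
To illustrate why the lower default matters, here is a minimal sketch of the arithmetic, assuming a limiter that charges each request its prompt tokens plus the full requested max_completion_tokens. The tokens-per-minute quota, the prompt size, and the function names are hypothetical and are not taken from the extension or the Cerebras API.

```typescript
// Hypothetical model of a limiter that reserves the full requested
// completion budget up front, regardless of how many tokens are generated.
function estimatedCharge(promptTokens: number, maxCompletionTokens: number): number {
  return promptTokens + maxCompletionTokens;
}

// How many such requests fit into one rate-limit window.
function requestsPerWindow(
  tokensPerMinute: number,
  promptTokens: number,
  maxCompletionTokens: number,
): number {
  return Math.floor(tokensPerMinute / estimatedCharge(promptTokens, maxCompletionTokens));
}

// Assumed values, for illustration only.
const TOKENS_PER_MINUTE = 200000;
const PROMPT_TOKENS = 2000;

const withLargeBudget = requestsPerWindow(TOKENS_PER_MINUTE, PROMPT_TOKENS, 65536);
const withConservativeDefault = requestsPerWindow(TOKENS_PER_MINUTE, PROMPT_TOKENS, 8192);

console.log(withLargeBudget);         // 2
console.log(withConservativeDefault); // 19
```

Under these assumed numbers, requesting the full 65536-token budget exhausts the window after two requests even if each reply is short, while the 8192 default leaves room for 19.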

Fixes

  • Update llama-3.3-70b: maxInputTokens to 131072, maxOutputTokens to 65536
  • Update qwen-3-235b-a22b-instruct-2507: maxOutputTokens to 40960
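
The limits above could be combined with the 8192 default roughly as follows. This is a sketch only: the table shape, the constant, and the `resolveMaxCompletionTokens` helper are hypothetical names, and the extension's actual data structures may differ. Only the numeric limits come from these release notes.

```typescript
// Hypothetical shape for per-model limits; field names mirror the release notes.
interface ModelLimits {
  maxInputTokens?: number; // omitted where the release notes do not state it
  maxOutputTokens: number;
}

const MODEL_LIMITS: Record<string, ModelLimits> = {
  "llama-3.3-70b": { maxInputTokens: 131072, maxOutputTokens: 65536 },
  "qwen-3-235b-a22b-instruct-2507": { maxOutputTokens: 40960 },
};

// Conservative default from this release.
const DEFAULT_MAX_COMPLETION_TOKENS = 8192;

// Use the caller's value if given, otherwise the default, and never
// exceed the model's maxOutputTokens when the model is known.
function resolveMaxCompletionTokens(modelId: string, requested?: number): number {
  const wanted = requested ?? DEFAULT_MAX_COMPLETION_TOKENS;
  const limits = MODEL_LIMITS[modelId];
  return limits ? Math.min(wanted, limits.maxOutputTokens) : wanted;
}

console.log(resolveMaxCompletionTokens("llama-3.3-70b"));                          // 8192
console.log(resolveMaxCompletionTokens("qwen-3-235b-a22b-instruct-2507", 100000)); // 40960
```

Keeping the default separate from the per-model ceiling lets a caller opt into longer outputs explicitly while ordinary requests stay cheap for the rate limiter.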

Notability

notability 2.0/10

Routine VS Code extension update