Cerebras

Cerebras provides ultra-fast inference on its custom wafer-scale hardware, exposed through an OpenAI-compatible API.

Install

pip install synapsekit[openai]

Because Cerebras exposes an OpenAI-compatible API, the integration requires the openai package.

Usage

import asyncio

from synapsekit import LLMConfig
from synapsekit.llm.cerebras import CerebrasLLM

llm = CerebrasLLM(LLMConfig(
    model="llama3.1-70b",
    api_key="csk-...",
))

async def main():
    # Stream tokens as they arrive
    async for token in llm.stream("What is RAG?"):
        print(token, end="", flush=True)

asyncio.run(main())

Available models

Model           ID
Llama 3.1 8B    llama3.1-8b
Llama 3.1 70B   llama3.1-70b
Llama 3.3 70B   llama-3.3-70b

See the full list at inference-docs.cerebras.ai.

Function calling

CerebrasLLM supports tool use via call_with_tools:

result = await llm.call_with_tools(messages, tools)
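Since the API is OpenAI-compatible, the messages and tool definitions presumably follow the OpenAI chat-completions format; the exact shapes call_with_tools accepts, and the get_weather tool below, are assumptions for illustration. A minimal sketch of building the payloads:

```python
# Hypothetical payloads in OpenAI chat-completions format.
# The exact structures call_with_tools expects are an assumption.
messages = [
    {"role": "user", "content": "What's the weather in Tokyo?"},
]

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool, for illustration
            "description": "Look up the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                },
                "required": ["city"],
            },
        },
    },
]

# result = await llm.call_with_tools(messages, tools)
```

If the model decides to call a tool, the result should carry the tool name and JSON arguments in OpenAI's tool_calls shape, which your code then executes and feeds back as a follow-up message.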

Custom base URL

To target any other OpenAI-compatible endpoint, pass base_url:

llm = CerebrasLLM(config, base_url="http://localhost:8000/v1")

Parameters

Parameter   Description
model       Cerebras model ID
api_key     Your Cerebras API key
base_url    Custom API base URL (default: https://api.cerebras.ai/v1)