What does this fork signal mean?

SiliconFlow forked siliconflow/litellm (forked from BerriAI/litellm). This fork signal points to upstream code the lab may be inspecting, patching, or building on. High-signal details: repo siliconflow/litellm · parent BerriAI/litellm · Routine self-fork, no traction.. onlylabs links this event to 1 captured evidence page and 6 related fork signals.

SiliconFlow Fork: siliconflow/litellm

Captured source

source ↗

GitHub/github.com/siliconflow/litellm

siliconflow/litellm repository metadata

Source ↗

published Oct 13, 2025seen Jun 5captured Jun 11http 200method plain

siliconflow/litellm

Description: Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]

License: NOASSERTION

Stars: 0

Forks: 0

Open issues: 0

Created: 2025-10-13T03:40:09Z

Pushed: 2025-10-20T08:41:19Z

Default branch: main

Fork: yes

Parent repository: BerriAI/litellm

Archived: no

README:

🚅 LiteLLM

Call all LLM APIs using the OpenAI format [Bedrock, Huggingface, VertexAI, TogetherAI, Azure, OpenAI, Groq etc.]

LiteLLM Proxy Server (LLM Gateway) | Hosted Proxy (Preview) | Enterprise Tier

LiteLLM manages:

Translate inputs to provider's completion, embedding, and image_generation endpoints
Consistent output, text responses will always be available at ['choices'][0]['message']['content']
Retry/fallback logic across multiple deployments (e.g. Azure/OpenAI) - Router
Set Budgets & Rate limits per project, api key, model LiteLLM Proxy Server (LLM Gateway)

**Jump to LiteLLM Proxy (LLM Gateway) Docs**

**Jump to Supported LLM Providers**

🚨 Stable Release: Use docker images with the -stable tag. These have undergone 12 hour load tests, before being published. More information about the release cycle here

Support for more providers. Missing a provider or LLM Platform, raise a feature request.

Usage (Docs)

> [!IMPORTANT] > LiteLLM v1.0.0 now requires openai>=1.0.0. Migration guide here > LiteLLM v1.40.14+ now requires pydantic>=2.0.0. No changes required.

pip install litellm

from litellm import completion
import os

## set ENV variables
os.environ["OPENAI_API_KEY"] = "your-openai-key"
os.environ["ANTHROPIC_API_KEY"] = "your-anthropic-key"

messages = [{ "content": "Hello, how are you?","role": "user"}]

# openai call
response = completion(model="openai/gpt-4o", messages=messages)

# anthropic call
response = completion(model="anthropic/claude-sonnet-4-20250514", messages=messages)
print(response)

Response (OpenAI Format)

{
"id": "chatcmpl-1214900a-6cdd-4148-b663-b5e2f642b4de",
"created": 1751494488,
"model": "claude-sonnet-4-20250514",
"object": "chat.completion",
"system_fingerprint": null,
"choices": [
{
"finish_reason": "stop",
"index": 0,
"message": {
"content": "Hello! I'm doing well, thank you for asking. I'm here and ready to help with whatever you'd like to discuss or work on. How are you doing today?",
"role": "assistant",
"tool_calls": null,
"function_call": null
}
}
],
"usage": {
"completion_tokens": 39,
"prompt_tokens": 13,
"total_tokens": 52,
"completion_tokens_details": null,
"prompt_tokens_details": {
"audio_tokens": null,
"cached_tokens": 0
},
"cache_creation_input_tokens": 0,
"cache_read_input_tokens": 0
}
}

Call any model supported by a provider, with model=/. There might be provider-specific details here, so refer to provider docs for more information

Async (Docs)

from litellm import acompletion
import asyncio

async def test_get_response():
user_message = "Hello, how are you?"
messages = [{"content": user_message, "role": "user"}]
response = await acompletion(model="openai/gpt-4o", messages=messages)
return response

response = asyncio.run(test_get_response())
print(response)

Streaming (Docs)

liteLLM supports streaming the model response back, pass stream=True to get a streaming iterator in response. Streaming is supported for all models (Bedrock, Huggingface, TogetherAI, Azure, OpenAI, etc.)

from litellm import completion
response = completion(model="openai/gpt-4o", messages=messages, stream=True)
for part in response:
print(part.choices[0].delta.content or "")

# claude sonnet 4
response = completion('anthropic/claude-sonnet-4-20250514', messages, stream=True)
for part in response:
print(part)

Response chunk (OpenAI Format)

{
"id": "chatcmpl-fe575c37-5004-4926-ae5e-bfbc31f356ca",
"created": 1751494808,
"model": "claude-sonnet-4-20250514",
"object": "chat.completion.chunk",
"system_fingerprint": null,
"choices": [
{
"finish_reason": null,
"index": 0,
"delta": {
"provider_specific_fields": null,
"content": "Hello",
"role": "assistant",
"function_call": null,
"tool_calls": null,
"audio": null
},
"logprobs": null
}
],
"provider_specific_fields": null,
"stream_options": null,
"citations": null
}

Logging Observability (Docs)

LiteLLM exposes pre defined callbacks to send data to Lunary, MLflow, Langfuse, DynamoDB, s3 Buckets, Helicone, Promptlayer, Traceloop, Athina, Slack

from litellm import completion

## set env variables for logging tools (when using MLflow, no API key set up is required)
os.environ["LUNARY_PUBLIC_KEY"] = "your-lunary-public-key"
os.environ["HELICONE_API_KEY"] = "your-helicone-auth-key"
os.environ["LANGFUSE_PUBLIC_KEY"] = ""
os.environ["LANGFUSE_SECRET_KEY"] = ""
os.environ["ATHINA_API_KEY"] = "your-athina-api-key"

os.environ["OPENAI_API_KEY"] = "your-openai-key"

# set callbacks
litellm.success_callback = ["lunary", "mlflow", "langfuse", "athina", "helicone"] # log input/output to lunary, langfuse, supabase, athina, helicone etc

#openai call
response = completion(model="openai/gpt-4o", messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}])

LiteLLM Proxy Server (LLM Gateway) - (Docs)

Track spend + Load Balance across multiple projects

Hosted Proxy (Preview)

The proxy provides:

1. Hooks for auth 2. Hooks for logging...

Excerpt shown — open the source for the full document.

Notability

notability 1.0/10

Routine self-fork, no traction.