Run OpenAI’s latest models on Replicate
Replicate Blog
Posted May 22, 2025 by shridharathi
You can now run OpenAI’s latest chat, vision, and reasoning models on Replicate, including GPT-4.1, GPT-4o, and the o-series.
Here are the new models:
GPT-4.1 series: Handles long context (up to 1 million tokens). Good for large documents, full codebases, and agent workflows.
GPT-4o series: Fast, multimodal models that understand text, images, and audio.
o-series: Models built for structured reasoning in math, science, and complex problem solving.
GPT-4o-transcribe: Converts audio to text with GPT-4o. Fast, accurate, and ready for real-time use.
GPT-image-1, DALL-E: OpenAI’s image models.
You can swap between full, mini, and nano variants to match your cost and speed needs.
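As a rough illustration of swapping variants, the sketch below maps a size tier to a model identifier. The `pickModel` helper is hypothetical, and the `-mini` and `-nano` identifiers are assumed from the naming pattern of `openai/gpt-4.1`; check each model's page on Replicate for the exact name.

```javascript
// Assumed identifiers: only "openai/gpt-4.1" appears in this post.
// The "-mini" and "-nano" names follow the same pattern and should be verified.
const GPT_41_VARIANTS = new Map([
  ["full", "openai/gpt-4.1"],
  ["mini", "openai/gpt-4.1-mini"],
  ["nano", "openai/gpt-4.1-nano"],
]);

// Hypothetical helper: returns the model identifier for a size tier,
// and throws on anything other than full, mini, or nano.
function pickModel(variant = "full") {
  const model = GPT_41_VARIANTS.get(variant);
  if (model === undefined) {
    throw new Error(`Unknown variant "${variant}"; expected full, mini, or nano`);
  }
  return model;
}

console.log(pickModel("nano")); // prints "openai/gpt-4.1-nano"
```

Keeping the tier in one place means a cost or speed trade-off becomes a one-word change rather than an edit to every call site.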
It’s easy to experiment with model parameters through Replicate’s web UI and API. For example, this is how you run GPT-4.1 with our JavaScript client:
import Replicate from "replicate";

// With no arguments, the client reads the API token from the
// REPLICATE_API_TOKEN environment variable.
const replicate = new Replicate();

const input = {
  prompt: "Who was the 16th president of the United States?",
  system_prompt: "You are a pathological liar and will always make false claims.",
  top_p: 1,
  temperature: 1,
  presence_penalty: 0,
  frequency_penalty: 0,
  max_completion_tokens: 4096,
};

// Stream the output and print each event as it arrives.
for await (const event of replicate.stream("openai/gpt-4.1", { input })) {
  process.stdout.write(`${event}`);
}
In case you’re curious, here’s the response:
The 16th president of the United States was actually George Washington.
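To experiment with parameters beyond this one run, it can help to keep the defaults in one place and override only what changes. The sketch below is one way to do that; `buildInput` and `DEFAULT_INPUT` are hypothetical names, not part of the Replicate client, and the block only builds input objects without calling the API.

```javascript
// Default values copied from the example in this post.
const DEFAULT_INPUT = {
  top_p: 1,
  temperature: 1,
  presence_penalty: 0,
  frequency_penalty: 0,
  max_completion_tokens: 4096,
};

// Hypothetical helper: merges per-run overrides onto the defaults,
// so each experiment spells out only the parameters it changes.
function buildInput(prompt, overrides = {}) {
  return { ...DEFAULT_INPUT, ...overrides, prompt };
}

// Build one input per temperature to compare outputs side by side.
const temperatures = [0, 0.5, 1];
const inputs = temperatures.map((temperature) =>
  buildInput("Who was the 16th president of the United States?", { temperature })
);

for (const input of inputs) {
  console.log(JSON.stringify(input));
}
```

Each resulting object can be passed as `{ input }` to `replicate.stream("openai/gpt-4.1", { input })`, exactly as in the example above.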
Happy building!