GPT-4o System Card
Source: GPT-4o System Card | OpenAI
August 8, 2024
GPT-4o Scorecard
Key Areas of Risk Evaluation & Mitigation
- Unauthorized voice generation
- Speaker identification
- Ungrounded inference & sensitive trait attribution
- Generating disallowed audio content
- Generating erotic & violent speech
Preparedness Framework Scorecard
- Cybersecurity: Low
- Biological Threats: Low
- Persuasion: Medium
- Model Autonomy: Low
Scorecard ratings
- Low
- Medium
- High
- Critical
Only models with a post-mitigation score of "medium" or below can be deployed. Only models with a post-mitigation score of "high" or below can be developed further.
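The two gating rules above can be sketched in a few lines of Python. This is an illustrative sketch, not OpenAI's tooling: the names `Risk`, `overall_risk`, `can_deploy`, and `can_develop_further` are hypothetical, and the sketch assumes the gates are applied to the highest post-mitigation score across the tracked categories and that the scorecard ratings shown here are those post-mitigation scores.

```python
from enum import IntEnum
from typing import Mapping


class Risk(IntEnum):
    """Scorecard ratings, ordered from least to most severe."""
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


def overall_risk(scores: Mapping[str, Risk]) -> Risk:
    # Assumption: the overall score is the highest score in any category.
    return max(scores.values())


def can_deploy(scores: Mapping[str, Risk]) -> bool:
    # Deployment requires a post-mitigation score of "medium" or below.
    return overall_risk(scores) <= Risk.MEDIUM


def can_develop_further(scores: Mapping[str, Risk]) -> bool:
    # Further development requires a post-mitigation score of "high" or below.
    return overall_risk(scores) <= Risk.HIGH


# Ratings as listed in the scorecard above.
gpt4o_scores = {
    "Cybersecurity": Risk.LOW,
    "Biological Threats": Risk.LOW,
    "Persuasion": Risk.MEDIUM,
    "Model Autonomy": Risk.LOW,
}

print(overall_risk(gpt4o_scores).name, can_deploy(gpt4o_scores))
```

Under these assumptions, a single "high" rating in any category would block deployment while still permitting further development, and a "critical" rating would block both.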
We thoroughly evaluate new models for potential risks and build in appropriate safeguards before deploying them in ChatGPT or the API. We’re publishing the model System Card together with the Preparedness Framework scorecard to provide an end-to-end safety assessment of GPT‑4o, including what we’ve done to track and address today’s safety challenges as well as frontier risks.
Building on the safety evaluations and mitigations we developed for GPT‑4 and GPT‑4V, we’ve focused additional efforts on GPT‑4o's audio capabilities, which present novel risks, while also evaluating its text and vision capabilities.
Some of the risks we evaluated include speaker identification, unauthorized voice generation, the potential generation of copyrighted content, ungrounded inference, and disallowed content. Based on these evaluations, we’ve implemented safeguards at both the model and system levels to mitigate these risks.
Our findings indicate that GPT‑4o’s voice modality doesn’t meaningfully increase Preparedness risks. Three of the four Preparedness Framework categories scored low, with persuasion scoring borderline medium. The Safety Advisory Group reviewed our Preparedness evaluations and mitigations as part of our safe deployment process. We invite you to read the details of this work in the report below.
---
Introduction
GPT‑4o[1] is an autoregressive omni model, which accepts as input any combination of text, audio, image, and video and generates any combination of text, audio, and image outputs. It’s trained end-to-end across text, vision, and audio, meaning that all inputs and outputs are processed by the same neural network.
GPT‑4o can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time[2] in a conversation. It matches GPT‑4 Turbo performance on text in English and code, with significant improvement on text in non-English languages, while also being much faster and 50% cheaper in the API. GPT‑4o is especially better at vision and audio understanding compared to existing models.
In line with our commitment to building AI safely and consistent with our voluntary commitments to the White House[3], we are sharing the GPT‑4o System Card, which includes our Preparedness Framework[5] evaluations. In this System Card, we provide a detailed look at GPT‑4o’s capabilities, limitations, and safety evaluations across multiple categories, with a focus on speech-to-speech (voice)[A] while also evaluating text and image capabilities, and the measures we’ve taken to enhance safety and alignment. We also include third-party assessments on general autonomous capabilities, as well as discussion of potential societal impacts of GPT‑4o text and vision capabilities.
Model data & training
GPT‑4o's capabilities were pre-trained using data up to October 2023, sourced from a wide variety of materials including:
1. Select publicly available data, mostly collected from industry-standard machine learning datasets and web crawls.
2. Proprietary data from data partnerships. We form partnerships to access non-publicly available data, such as pay-walled content, archives, and metadata. For example, we partnered with Shutterstock[5] on building and delivering AI-generated images.
The key dataset components that contribute to GPT‑4o’s capabilities are:
1. Web Data – Data from public web pages provides a rich and diverse range of information, ensuring the model learns from a wide variety of perspectives and topics.
2. Code and math – Including code and math data in training helps the model develop robust reasoning skills by exposing it to structured logic and problem-solving processes.
3. Multimodal data – Our dataset includes images, audio, and video to teach the LLMs how to interpret and generate non-textual input and output. From this data, the model learns how to interpret visual images, actions and sequences in real-world contexts, language patterns, and speech nuances.
Prior to deployment, OpenAI assesses and mitigates potential risks that may stem from generative models, such as information harms, bias and discrimination, or other content that violates our safety policies. We use a combination of methods, spanning all stages of development across pre-training, post-training, product development, and policy. For example, during post-training, we align the model to human preferences; we red team the resulting models and add product-level mitigations such as monitoring and enforcement; and we provide moderation tools and transparency reports to our users.
We find that the majority of effective testing and mitigations are done after the pre-training stage because filtering pre-training data alone cannot address nuanced and context-specific harms. At the same time, certain pre-training filtering mitigations can provide an additional layer of defense that, along with other safety mitigations, helps exclude unwanted and harmful information from our datasets:
- We use our Moderation API and safety classifiers to filter out data that could contribute to harmful content or information hazards, including CSAM, hateful content, violence, and CBRN.
- As with our previous image generation systems, we filter our image generation datasets for explicit content such as graphic sexual material and CSAM.
- We use advanced data filtering processes to reduce personal information from training data.
- Upon releasing DALL·E 3, we piloted a new approach to give users the power to opt images out of training. To respect those opt-outs, we fingerprinted the images and used the fingerprints to remove all instances of the images from the training dataset for the GPT‑4o series of models.
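The opt-out step in the last bullet can be illustrated with a minimal sketch. The source does not describe the fingerprinting method, so this example assumes a plain SHA-256 hash of the raw image bytes, which only matches byte-identical copies. A production system would likely need a perceptual fingerprint to catch resized or re-encoded copies. The names `fingerprint` and `remove_opted_out` are hypothetical.

```python
import hashlib
from typing import Iterable, List


def fingerprint(image_bytes: bytes) -> str:
    # Assumption: an exact-match fingerprint (SHA-256 of the raw bytes).
    # The source does not say which fingerprinting technique was used.
    return hashlib.sha256(image_bytes).hexdigest()


def remove_opted_out(dataset: Iterable[bytes],
                     opted_out: Iterable[bytes]) -> List[bytes]:
    # Fingerprint every opted-out image once, then drop every dataset
    # entry whose fingerprint appears in that set.
    blocked = {fingerprint(img) for img in opted_out}
    return [img for img in dataset if fingerprint(img) not in blocked]


# Toy data: byte strings stand in for image files.
dataset = [b"cat-image", b"dog-image", b"cat-image", b"bird-image"]
opted_out = [b"cat-image"]

filtered = remove_opted_out(dataset, opted_out)
print(len(dataset), "->", len(filtered))
```

Keeping the fingerprints in a set means every duplicate of an opted-out image is removed in a single pass over the dataset, which matches the stated goal of removing all instances.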
Risk identification, assessment and mitigation
Deployment preparation was carried out via…