OpenAI, published Jul 21, 2023

Moving AI governance forward

July 21, 2023

OpenAI and other leading labs reinforce AI safety, security and trustworthiness through voluntary commitments.

Illustration: Justin Jay Wang × DALL·E

OpenAI and other leading AI labs are making a set of voluntary commitments to reinforce the safety, security and trustworthiness of AI technology and our services. This process, coordinated by the White House, is an important step in advancing meaningful and effective AI governance, both in the US and around the world.

As part of our mission to build safe and beneficial AGI, we will continue to pilot and refine concrete governance practices specifically tailored to highly capable foundation models like the ones that we produce. We will also continue to invest in research in areas that can help inform regulation, such as techniques for assessing potentially dangerous capabilities in AI models.

“Policymakers around the world are considering new laws for highly capable AI systems. Today’s commitments contribute specific and concrete practices to that ongoing discussion. This announcement is part of our ongoing collaboration with governments, civil society organizations and others around the world to advance AI governance,” said Anna Makanju, VP of Global Affairs.

Voluntary AI commitments

The following is a list of commitments that companies are making to promote the safe, secure, and transparent development and use of AI technology. These voluntary commitments are consistent with existing laws and regulations, and designed to advance a generative AI legal and policy regime. Companies intend these voluntary commitments to remain in effect until regulations covering substantially the same issues come into force. Individual companies may make additional commitments beyond those included here.

Scope: Where commitments mention particular models, they apply only to generative models that are overall more powerful than the current industry frontier (e.g. models that are overall more powerful than any currently released models, including GPT‑4, Claude 2, PaLM 2, Titan and, in the case of image generation, DALL-E 2).

Safety

1) Commit to internal and external red-teaming of models or systems in areas including misuse, societal risks, and national security concerns, such as bio, cyber, and other safety areas.

Companies making this commitment understand that robust red-teaming is essential for building successful products, ensuring public confidence in AI, and guarding against significant national security threats. Model safety and capability evaluations, including red teaming, are an open area of scientific inquiry, and more work remains to be done. Companies commit to advancing this area of research, and to developing a multi-faceted, specialized, and detailed red-teaming regime, including drawing on independent domain experts, for all major public releases of new models within scope. In designing the regime, they will ensure that they give significant attention to the following:

  • Bio, chemical, and radiological risks, such as the ways in which systems can lower barriers to entry for weapons development, design, acquisition, or use
  • Cyber capabilities, such as the ways in which systems can aid vulnerability discovery, exploitation, or operational use, bearing in mind that such capabilities could also have useful defensive applications and might be appropriate to include in a system
  • The effects of system interaction and tool use, including the capacity to control physical systems
  • The capacity for models to make copies of themselves or “self-replicate”
  • Societal risks, such as bias and discrimination

To support these efforts, companies making this commitment commit to advancing ongoing research in AI safety, including on the interpretability of AI systems’ decision-making processes and on increasing the robustness of AI systems against misuse. Similarly, companies commit to publicly disclosing their red-teaming and safety procedures in their transparency reports (described below).

2) Work toward information sharing among companies and governments regarding trust and safety risks, dangerous or emergent capabilities, and attempts to circumvent safeguards

Companies making this commitment recognize the importance of information sharing, common standards, and best practices for red-teaming and advancing the trust and safety of AI. They commit to establish or join a forum or mechanism through which they can develop, advance, and adopt shared standards and best practices for frontier AI safety, such as the NIST AI Risk Management Framework or future standards related to red-teaming, safety, and societal risks. The forum or mechanism can facilitate the sharing of information on advances in frontier capabilities and emerging risks and threats, such as attempts to circumvent safeguards, and can facilitate the development of technical working groups on priority areas of concern. In this work, companies will engage closely with governments, including the U.S. government, civil society, and academia, as appropriate.

Security

3) Invest in cybersecurity and insider threat safeguards to protect proprietary and unreleased model weights

Companies making this commitment will treat unreleased AI model weights for models in scope as core intellectual property for their business, especially with regard to cybersecurity and insider threat risks. This includes limiting access to model weights to those whose job function requires it and establishing a robust insider threat detection program consistent with protections provided for their most valuable intellectual property and trade secrets. In addition, it requires storing and working with the weights in an appropriately secure environment to reduce the risk of unsanctioned release.

4) Incent third-party discovery and reporting of issues and vulnerabilities

Companies making this commitment recognize that AI systems may continue to have weaknesses and vulnerabilities even after robust red-teaming. They commit to establishing for systems within scope bounty systems, contests, or prizes to incent the responsible disclosure of weaknesses, such as unsafe behaviors, or to include AI systems in their existing bug bounty programs.

Trust

5) Develop and deploy mechanisms that enable users to understand if audio or visual content is AI-generated, including robust provenance,…

Excerpt shown — open the source for the full document.