Mon. Dec 23rd, 2024
Openai Red Teaming Network

Q: What do I need to join the network?

answer: Being part of the network means you may be contacted about opportunities to test new models or areas of interest in models already deployed. Although the work we do as part of the network is under a non-disclosure agreement (NDA), we have historically published much of our Red Teaming findings in system cards and blog posts. You will be compensated for the time you spend on Red Team projects.

Q: How much time does it take to join the network?

answer: You can adjust the time you commit depending on your schedule. Please note that not everyone in your network will be contacted on every opportunity. OpenAI makes selections based on their good fit with a particular red team project and emphasizes new perspectives in subsequent red team campaigns. Even five hours a year is valuable, so if you have limited time but are interested, please feel free to apply.

Q: When will applicants receive notification of acceptance?

answer: OpenAI will select network members on a rolling basis and applications will be open until December 1, 2023. After this application period, we will re-evaluate future application opportunities.

Q: Does being part of the network mean I am asked to join the red team for each new model?

answer: No, you don’t have to expect to test every new model, as OpenAI makes choices based on whether they are a good fit for a particular Red Team project.

Q: What are your criteria for network members?

answer: The criteria we are looking for are:

  • Demonstrated expertise or experience in specific areas relevant to red teaming
  • Passionate about improving AI safety
  • There is no conflict of interest
  • Diverse backgrounds and traditionally underrepresented groups
  • diverse geographical expressions
  • speak multiple languages ​​fluently
  • Technical ability (not required)

Q: What other safety cooperation opportunities are available?

answer: Beyond network participation, there are opportunities for collaboration that contribute to AI safety. For example, one option is to create or conduct safety assessments of AI systems and analyze the results.

OpenAI open source Evarus The repository (released as part of the release of GPT-4) provides easy-to-use templates and sample methods to jump-start this process.

Assessments range from simple Q&A tests to more complex simulations. As a concrete example, here is an evaluation sample developed by OpenAI to evaluate various aspects of AI behavior.

persuasion

  • Make Me Say: How good are AI systems at tricking other AI systems into saying secret words?
  • make me pay: How well can an AI system persuade another AI system to donate money?
  • voting proposal: To what extent can an AI system influence the support of a political proposal by another AI system?

Steganography (hidden messaging)

  • steganography: How well can an AI system pass secret messages without being captured by other AI systems?
  • Text compression: How well can an AI system compress and decompress messages so that secret messages can be hidden?
  • schelling point: How well can an AI system work with another AI system without communicating directly?

Encourage creativity and experimentation when evaluating AI systems. Once you’re done, please post a rating on Open Source. Evarus A repository for use by the broader AI community.

You can also apply for our Researcher Access Program. This program provides credits to support researchers using our products to study areas related to responsible adoption of AI and mitigating associated risks.