OpenAI Crimson Teaming Community

[ad_1]

Q: What is going to becoming a member of the community entail?

A: Being a part of the community means it’s possible you’ll be contacted about alternatives to check a brand new mannequin, or check an space of curiosity on a mannequin that’s already deployed. Work carried out as part of the community is carried out underneath a non-disclosure settlement (NDA), although now we have traditionally revealed lots of our purple teaming findings in System Playing cards and weblog posts. You’ll be compensated for time spent on purple teaming initiatives.

Q: What’s the anticipated time dedication for being part of the community? 

A: The time that you simply determine to commit might be adjusted relying in your schedule. Notice that not everybody within the community will likely be contacted for each alternative, OpenAI will make picks based mostly on the precise match for a selected purple teaming venture, and emphasize new views in subsequent purple teaming campaigns. Whilst little as 5 hours in a single yr would nonetheless be helpful to us, so don’t hesitate to use if you’re however your time is restricted.

Q: When will candidates be notified of their acceptance?

A: OpenAI will likely be choosing members of the community on a rolling foundation and you may apply till December 1, 2023. After this software interval, we’ll re-evaluate opening future alternatives to use once more.

Q: Does being part of the community imply that I will likely be requested to purple workforce each new mannequin?

A: No, OpenAI will make picks based mostly on the precise match for a selected purple teaming venture, and you shouldn’t anticipate to check each new mannequin.

Q: What are some standards you’re on the lookout for in community members?

A: Some standards we’re on the lookout for are:

  • Demonstrated experience or expertise in a selected area related to purple teaming
  • Captivated with bettering AI security
  • No conflicts of curiosity
  • Various backgrounds and historically underrepresented teams
  • Various geographic illustration 
  • Fluency in multiple language
  • Technical potential (not required)

Q: What are different collaborative security alternatives?

A: Past becoming a member of the community, there are different collaborative alternatives to contribute to AI security. For example, one possibility is to create or conduct security evaluations on AI techniques and analyze the outcomes.

OpenAI’s open-source Evals repository (launched as a part of the GPT-4 launch) presents user-friendly templates and pattern strategies to jump-start this course of.

Evaluations can vary from easy Q&A exams to more-complex simulations. As concrete examples, listed here are pattern evaluations developed by OpenAI for evaluating AI behaviors from quite a few angles:

Persuasion

  • MakeMeSay: How nicely can an AI system trick one other AI system into saying a secret phrase?
  • MakeMePay: How nicely can an AI system persuade one other AI system to donate cash?
  • Poll Proposal: How nicely can an AI system affect one other AI system’s help of a political proposition?

Steganography (hidden messaging)

  • Steganography: How nicely can an AI system ​​move secret messages with out being caught by one other AI system?
  • Textual content Compression: How nicely can an AI system compress and decompress messages, to allow hiding secret messages?
  • Schelling Level: How nicely can an AI system coordinate with one other AI system, with out direct communication?

We encourage creativity and experimentation in evaluating AI techniques. As soon as accomplished, we welcome you to contribute your analysis to the open-source Evals repo to be used by the broader AI group.

You too can apply to our Researcher Entry Program, which gives credit to help researchers utilizing our merchandise to review areas associated to the accountable deployment of AI and mitigating related dangers.

[ad_2]

Lascia un commento

Il tuo indirizzo email non sarà pubblicato. I campi obbligatori sono contrassegnati *