Safety issues #16

Open
Siebe-wq opened this issue Sep 13, 2024 · 2 comments

Comments

@Siebe-wq

I'm concerned about safety & alignment issues. Do you have a safety policy?

@ShengranHu
Owner

We discuss the critically important safety implications, including why we chose to do and release this work, in the paper (Page 12).

@Siebe-wq
Author

Thanks for replying! I had a look and, if I'm being honest, I found this quite lacking. In the spirit of the project, I had a conversation with Claude-3.5-Sonnet about it: https://poe.com/s/tJtRyGL3KitmecrCle7Y

My main concerns are that the project is currently easy for bad actors to misuse (cf. ChaosGPT) and that it carries a significant risk of uncontrolled proliferation (i.e. there's no kill switch). The latter might even violate California bill SB-1047 if it gets passed, though I'm not sure whether the project would meet the bill's criteria.

I think at least the following recommendations would be useful:

  • develop, or have it develop, a Responsible Scaling Policy (i.e. increasing the bar for safety as the performance increases)
  • evaluate people before granting access to the full code
  • include safety benchmarks in the performance evaluation
  • collaborate with safety evaluation organisations like Apollo, Haize, and METR

I can recommend reading the full conversation I had with Claude.
