Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Ongoing] Knowledge base additions #142

Open
jphall663 opened this issue Oct 27, 2023 · 7 comments
Open

[Ongoing] Knowledge base additions #142

jphall663 opened this issue Oct 27, 2023 · 7 comments
Assignees

Comments

@jphall663
Copy link
Owner

jphall663 commented Oct 27, 2023

@jphall663
Copy link
Owner Author

jphall663 commented Oct 30, 2023

@jphall663
Copy link
Owner Author

jphall663 commented Oct 30, 2023

@datherton09
Copy link
Collaborator

All added. Waiting on EO. Decided to go ahead and add the "Intellectual property" page because I could still imagine it being a useful resource/portal (especially considering the USTPO falls under it, and that contains a specific resource we link to).

@jphall663
Copy link
Owner Author

jphall663 commented Oct 31, 2023

@jphall663 jphall663 changed the title Things to add for next week (week of 10/30) [Ongoing] Knowledge base additions Jan 12, 2024
@jphall663 jphall663 reopened this Jan 12, 2024
@datherton09
Copy link
Collaborator

datherton09 commented Feb 21, 2024

[ALL ADDED, 2/21/2024]

benchmarks:

https://wavesbench.github.io/
https://github.com/huggingface/evaluate
https://github.com/AI-secure/DecodingTrust
https://docs.google.com/spreadsheets/u/1/d/e/2PACX-1vQObeTxvXtOs--zd98qG2xBHHuTTJOyNISBJPthZFr3at2LCrs3rcv73d4of1A78JV2eLuxECFXJY43/pubhtml
https://safetyprompts.com/
python software:

https://github.com/lilacai/lilac
official guidance:

https://www.ohchr.org/sites/default/files/documents/issues/business/b-tech/taxonomy-GenAI-Human-Rights-Harms.pdf
community resources:

https://www.hackerone.com/vulnerability-and-security-testing-blog
https://www.synack.com/wp-content/uploads/2022/09/Crowdsourced-Security-Landscape-Government.pdf
CSET stuff (just double check we reference somehow):
-- https://cset.georgetown.edu/article/translating-ai-risk-management-into-practice/
-- https://cset.georgetown.edu/publication/repurposing-the-wheel/
-- https://cset.georgetown.edu/publication/adding-structure-to-ai-harm/
-- https://cset.georgetown.edu/article/understanding-ai-harms-an-overview/
-- https://cset.georgetown.edu/publication/ai-incident-collection-an-observational-study-of-the-great-ai-experiment/
https://www.scsp.ai/wp-content/uploads/2023/11/SCSP_JHU-HCAI-Framework-Nov-6.pdf
https://openai.com/research/building-an-early-warning-system-for-llm-aided-biological-threat-creation
https://c2pa.org/
https://aiverifyfoundation.sg/downloads/Cataloguing_LLM_Evaluations.pdf
https://partnershiponai.org/modeldeployment/
https://cdn.openai.com/openai-preparedness-framework-beta.pdf

https://dominiquesheltonleipzig.com/country-legislation-frameworks/

red-teaming section:

https://www.hackerone.com/thought-leadership/ai-safety-red-teaming
https://cset.georgetown.edu/article/what-does-ai-red-teaming-actually-mean/

@jphall663
Copy link
Owner Author

jphall663 commented Mar 15, 2024

Red teaming -- but do we want to start hosting papers?

HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal (2024)
Mantas Mazeika, Long Phan, Xuwang Yin, Andy Zou, Zifan Wang, Norman Mu, Elham Sakhaee, Nathaniel Li, Steven Basart, Bo Li, David Forsyth, Dan Hendryckshttps://arxiv.org/pdf/2402.04249.pdf

Red-Teaming for Generative AI: Silver Bullet or Security Theater?
Michael Feffer, Anusha Sinha, Zachary C. Lipton, Hoda Heidarihttps://arxiv.org/pdf/2401.15897.pdf

Red Teaming Game: A Game-Theoretic Framework for Red Teaming Language Models
Chengdong Ma, Ziran Yang, Minquan Gao, Hai Ci, Jun Gao, Xuehai Pan, Yaodong Yanghttps://arxiv.org/pdf/2310.00322.pdf

Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment (2023)https://arxiv.org/pdf/2308.09662.pdf

Language Model Unalignment: Parametric Red-Teaming to Expose Hidden Harms and Biases
Rishabh Bhardwaj, Soujanya Poriahttps://arxiv.org/pdf/2310.14303.pdf

@jphall663
Copy link
Owner Author

jphall663 commented Mar 15, 2024

GAI Critiques:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants