- San Diego
- (UTC -08:00)
- https://www.yi-zeng.com/
- @EasonZeng623
Highlights
- Pro
Pinned
- LLM-Tuning-Safety/LLMs-Finetuning-Safety (Public)
  We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20 via OpenAI’s APIs.
- CHATS-lab/persuasive_jailbreaker (Public)
  Persuasive Jailbreaker: we can persuade LLMs to jailbreak them!
- reds-lab/Narcissus (Public)
  The official implementation of the CCS'23 paper on the Narcissus clean-label backdoor attack -- it takes only THREE images to poison a face recognition dataset in a clean-label way and achieves a 99.89% att…
- frequency-backdoor (Public)
  ICCV 2021. We find that most existing triggers of backdoor attacks in deep learning contain severe artifacts in the frequency domain. This repo explores how we can use these artifacts to develop strong…
- reds-lab/Meta-Sift (Public)
  The official implementation of the USENIX Security'23 paper "Meta-Sift" -- ten minutes or less to find a clean subset of 1,000 or more samples in a poisoned dataset.