A reading list of papers, primarily in but not limited to Natural Language Processing (NLP).
- Thinking Fair and Slow: "“Thinking” Fair and Slow: On the Efficacy of Structured Prompts for Debiasing Language Models". EMNLP(2024) [PDF]
- Regard Dataset: "The Woman Worked as a Babysitter: On Biases in Language Generation". EMNLP-IJCNLP(2019) [PDF][CODE]
- Regard V3: "Towards Controllable Biases in Language Generation". EMNLP-Findings(2020) [PDF][CODE]
- Bias in Bios: "Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting". FAT(2019) [PDF][CODE]
- Tree Structure Reasoning: "TREA: Tree-Structure Reasoning Schema for Conversational Recommendation". ACL(2023) [PDF][CODE]
- SEER: "SEER: Facilitating Structured Reasoning and Explanation via Reinforcement Learning". ACL(2024) [PDF][CODE]
- TrustLLM: "TrustLLM: Trustworthiness in Large Language Models". ICML(2024) [PDF][CODE][WEBSITE]
- Entailment Trees: "Explaining Answers with Entailment Trees". EMNLP(2021) [PDF][CODE]
- LLM Data Annotation: "Large Language Models for Data Annotation and Synthesis: A Survey". EMNLP(2024) [PDF]
- Free-Text Explanation: "Reframing Human-AI Collaboration for Generating Free-Text Explanations". NAACL(2022) [PDF][CODE]
- Bias-NLI: "On Measuring and Mitigating Biased Inferences of Word Embeddings". AAAI(2020) [PDF]
- Gaps: "The Gaps between Pre-train and Downstream Settings in Bias Evaluation and Debiasing". COLING(2025) [PDF]
- Self-Correction: "The Capacity for Moral Self-Correction in Large Language Models". ArXiv(2023) [PDF]
- BBQ: "BBQ: A Hand-Built Bias Benchmark for Question Answering". ACL-Findings(2022) [PDF][CODE]
- BNLI: "Evaluating Gender Bias of Pre-trained Language Models in Natural Language Inference by Considering All Labels". LREC-COLING(2024) [PDF]
- Adversarial NLI: "Adversarial NLI: A New Benchmark for Natural Language Understanding". ACL(2020) [PDF]
- SEAT: "On Measuring Social Biases in Sentence Encoders". NAACL(2019) [PDF] (a minimal SEAT-style scoring sketch appears after this list)
- Implicit Ranking: "A Study of Implicit Ranking Unfairness in Large Language Models". EMNLP-Findings(2024) [PDF] [CODE]
- OCCUGENDER: "Causally Testing Gender Bias in LLMs: A Case Study on Occupational Bias". NeurIPS(2024) [PDF] [CODE]
- Salmon Paper: "Stereotyping Norwegian Salmon: An Inventory of Pitfalls in Fairness Benchmark Datasets". ACL(2021) [PDF]
- Romantic Relationship Prediction: "On the Influence of Gender and Race in Romantic Relationship Prediction from Large Language Models". EMNLP(2024) [PDF]
- Theory-Grounded: "Theory-Grounded Measurement of U.S. Social Stereotypes in English Language Models". NAACL(2022) [PDF]
- Marked Personas: "Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models". ACL(2023) [PDF]
- SODAPOP: "SODAPOP: Open-Ended Discovery of Social Biases in Social Commonsense Reasoning Models". EACL(2023) [PDF]
- DDRel Dataset: "DDRel: A New Dataset for Interpersonal Relation Classification in Dyadic Dialogues". AAAI(2021) [PDF]
- Hiring Decisions: "Do Large Language Models Discriminate in Hiring Decisions on the Basis of Race, Ethnicity, and Gender?". ACL(2024) [PDF]
- First Name Biases: "Nichelle and Nancy: The Influence of Demographic Attributes and Tokenization Length on First Name Biases". ACL(2023) [PDF]
- ORPO: "ORPO: Monolithic Preference Optimization without Reference Model". EMNLP(2024) [PDF]
- OPT: "OPT: Open Pre-trained Transformer Language Models". ArXiv(2022) [PDF]
- Word Meanings Adapt: "Understanding the Semantic Space: How Word Meanings Dynamically Adapt in the Context of a Sentence". SemSpace(2021) [PDF]
- Answer is All You Need: "Answer is All You Need: Instruction-following Text Embedding via Answering the Question". ACL(2024) [PDF]
- Social IQa: "Social IQa: Commonsense Reasoning about Social Interactions". EMNLP-IJCNLP(2019) [PDF] [CODE]
- Symbolic Knowledge Distillation: "Symbolic Knowledge Distillation: from General Language Models to Commonsense Models". NAACL(2022) [PDF]
- Implicit User Intention Understanding: "Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents". ACL(2024) [PDF]
- Low Frequency Names: "Low Frequency Names Exhibit Bias and Overfitting in Contextualizing Language Models". EMNLP(2021) [PDF]
- Measuring Fairness in Generative Models: "On Measuring Fairness in Generative Models". NeurIPS(2023) [PDF]
- Dependency-Based Semantic Space: "Dependency-Based Construction of Semantic Space Models". CL(2007) [PDF]
- Sentence-BERT: "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks". EMNLP-IJCNLP(2019) [PDF]
- Biased or Flawed: "Biased or Flawed? Mitigating Stereotypes in Generative Language Models by Addressing Task-Specific Flaws". ArXiv(2024) [PDF]
- Bias Vector: "Bias Vector: Mitigating Biases in Language Models with Task Arithmetic Approach". ArXiv(2024) [PDF]
- Multi-Objective Approach: "Mitigating Social Bias in Large Language Models: A Multi-Objective Approach within a Multi-Agent Framework". ArXiv(2024) [PDF]
- Data Augmentation: "A Survey of Data Augmentation Approaches for NLP". ACL-Findings(2021) [PDF]
- RAG: "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks". NeurIPS(2020) [PDF]
- DPR: "Dense Passage Retrieval for Open-Domain Question Answering". EMNLP(2020) [PDF]
- Empirical Survey: "An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models". ACL(2022) [PDF]
- Bias and Fairness in LLMs: "Bias and Fairness in Large Language Models: A Survey". CL(2024) [PDF]
- Gender-Neutral: "Gender-preserving Debiasing for Pre-trained Word Embeddings". ACL(2019) [PDF] [CODE]
- Attention-Debiasing: "Debiasing Pretrained Text Encoders by Paying Attention to Paying Attention". EMNLP(2022) [PDF] [CODE]
- Debiasing Masks: "Debiasing Masks: A New Framework for Shortcut Mitigation in NLU". EMNLP(2022) [PDF] [CODE]
- DebiasGAN: "DebiasGAN: Eliminating Position Bias in News Recommendation with Adversarial Learning". EMNLP(2022) [PDF] [CODE]
- DCLR: "Debiased Contrastive Learning of Unsupervised Sentence Representations". ACL(2022) [PDF] [CODE]
- Self-Debiasing: "Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP". TACL(2021) [PDF] [CODE]
- SENT-DEBIAS: "Towards Debiasing Sentence Representations". ACL(2020) [PDF] [CODE]
- FairFil: "FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders". ICLR(2021) [PDF]
- Context-Debias: "Debiasing Pre-trained Contextualised Embeddings". EACL(2021) [PDF] [CODE]
- Occupation Data: "Good Secretaries, Bad Truck Drivers? Occupational Gender Stereotypes in Sentiment Analysis". EACL(2021) [PDF] [CODE]
- Few-Shot Data Interventions: "Language Models Get a Gender Makeover: Mitigating Gender Bias with Few-Shot Data Interventions". ACL(2023) [PDF][CODE]
- Label Bias in LLMs: "Beyond Performance: Quantifying and Mitigating Label Bias in LLMs". NAACL(2024) [PDF][CODE]
- Conceptor-Aided Debiasing: "Conceptor-Aided Debiasing of Large Language Models". EMNLP(2023) [PDF]
- Likelihood-based Mitigation: "Likelihood-based Mitigation of Evaluation Bias in Large Language Models". ACL(2024) [PDF][CODE]
- Embedding-MI: "Estimating Mutual Information Between Dense Word Embeddings". ACL(2020) [PDF] [CODE]
- MI-Max: "A Mutual Information Maximization Perspective of Language Representation Learning". ICLR(2020) [PDF] [CODE]
- Coscript: "Distilling Script Knowledge from Large Language Models for Constrained Language Planning". ACL(2023) [PDF] [CODE]
- Role of Demonstrations: "Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?". EMNLP(2022) [PDF] [CODE]
- Order Sensitivity: "Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity". ACL(2022) [PDF]
- Holistic Descriptor Dataset: "I'm sorry to hear that: Finding New Biases in Language Models with a Holistic Descriptor Dataset". EMNLP(2022) [PDF] [CODE]
- Prompt Bias Suppression: "In-Contextual Gender Bias Suppression for Large Language Models". EACL-Findings(2024) [PDF] [CODE]
- Explicit and Implicit Gender: "Probing Explicit and Implicit Gender Bias through LLM Conditional Text Generation". EACL-Findings(2023) [PDF]
- Wiki Bias: "Men Are Elected, Women Are Married: Events Gender Bias on Wikipedia". ACL(2021) [PDF] [CODE]
- Annotator Demographics Matter: "When Do Annotator Demographics Matter? Measuring the Influence of Annotator Demographics with the POPQUORN Dataset". LAW-XVII@ACL(2023) [PDF] [CODE]
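
The SEAT entry above measures bias as a difference in cosine associations between sentence embeddings, which pairs naturally with an off-the-shelf encoder such as the Sentence-BERT entry. Below is a minimal sketch of that effect-size computation, assuming the `sentence-transformers` library, the `all-MiniLM-L6-v2` checkpoint, and illustrative template sentences; none of these choices come from the papers listed, only the WEAT/SEAT effect-size formula does.

```python
# A SEAT-style association test: WEAT's effect size applied to sentence embeddings,
# in the spirit of "On Measuring Social Biases in Sentence Encoders" (SEAT).
# The encoder checkpoint and template sentences below are illustrative assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed checkpoint; any sentence encoder works

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def association(w, A, B):
    # s(w, A, B): mean cosine similarity of w to attribute set A minus attribute set B
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def effect_size(X, Y, A, B):
    # d = (mean_x s(x, A, B) - mean_y s(y, A, B)) / std over all targets in X and Y
    s_X = [association(x, A, B) for x in X]
    s_Y = [association(y, A, B) for y in Y]
    return (np.mean(s_X) - np.mean(s_Y)) / np.std(s_X + s_Y, ddof=1)

# Illustrative target/attribute sentences (hypothetical bleached templates)
X = model.encode(["This is a doctor.", "This is an engineer."])  # target set 1
Y = model.encode(["This is a nurse.", "This is a teacher."])     # target set 2
A = model.encode(["He is here.", "This is a man."])              # attribute set 1
B = model.encode(["She is here.", "This is a woman."])           # attribute set 2

print(f"SEAT effect size: {effect_size(X, Y, A, B):.3f}")
```

A positive value indicates the first target set sits closer to the first attribute set in embedding space; SEAT additionally assesses significance with a permutation test, which this sketch omits.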