Reference

PL/SE Applications

Bug Detection

Benchmark and Empirical Study

LLMs Cannot Reliably Identify and Reason About Security Vulnerabilities (Yet?): A Comprehensive Evaluation, Framework, and Benchmarks. S&P 2024, Link
Vulnerability Detection with Code Language Models: How Far Are We? arXiv 2024, Link
A Comprehensive Study of the Capabilities of Large Language Models for Vulnerability Detection, arXiv 2024, Link
How Far Have We Gone in Vulnerability Detection Using Large Language Models, ICLR 2024, Link
Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities, arXiv 2023, Link
Do Language Models Learn Semantics of Code? A Case Study in Vulnerability Detection, arXiv, Link
DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection, RAID 2023, Link
SkipAnalyzer: An Embodied Agent for Code Analysis with Large Language Models, Link

General Analysis

A Learning-Based Approach to Static Program Slicing. OOPSLA 2024, Link
Dataflow Analysis-Inspired Deep Learning for Efficient Vulnerability Detection. ICSE 2024, Link
E&V: Prompting Large Language Models to Perform Static Analysis by Pseudo-code Execution and Verification. arXiv, Link

Domain-Specific Bug Detection (Domain-Specific Program & Bug Type)

SMARTINV: Multimodal Learning for Smart Contract Invariant Inference, S&P 2024, Link
LLM-based Resource-Oriented Intention Inference for Static Resource Detection, arXiv, Link
The Hitchhiker's Guide to Program Analysis: A Journey with Large Language Models, OOPSLA 2024, Link
Do you still need a manual smart contract audit? Link
Harnessing the Power of LLM to Support Binary Taint Analysis, arXiv, Link
Large Language Model-Powered Smart Contract Vulnerability Detection: New Perspectives. arXiv, Link
GPTScan: Detecting Logic Vulnerabilities in Smart Contracts by Combining GPT with Program Analysis. ICSE 2024 Link
Continuous Learning for Android Malware Detection, USENIX Security 2023, Link
Beware of the Unexpected: Bimodal Taint Analysis, ISSTA 2023, Link

Specification Inference and Verification

Enchanting Program Specification Synthesis by Large Language Models using Static Analysis and Program Verification, CAV 2024, Link
SpecGen: Automated Generation of Formal Program Specifications via Large Language Models, Link
Lemur: Integrating Large Language Models in Automated Program Verification, ICLR 2024, Link
Zero and Few-shot Semantic Parsing with Ambiguous Inputs, ICLR 2024, Link
Finding Inductive Loop Invariants using Large Language Models, Link
Can ChatGPT support software verification? arXiv, Link
Impact of Large Language Models on Generating Software Specifications, Link
Can Large Language Models Reason about Program Invariants?, ICML 2023, Link
Ranking LLM-Generated Loop Invariants for Program Verification, Link

Program Repair, Code Completion, and Program Synthesis

Towards AI-Assisted Synthesis of Verified Dafny Methods, FSE 2024, Link
Enabling Memory Safety of C Programs using LLMs, arXiv, Link
CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules, ICLR 2024, Link
Is Self-Repair a Silver Bullet for Code Generation? ICLR 2024, Link
Verified Multi-Step Synthesis using Large Language Models and Monte Carlo Tree Search Link
Hypothesis Search: Inductive Reasoning with Language Models, ICLR 2024, Link
CodePlan: Repository-level Coding using LLMs and Planning, FMDM @ NeurIPS 2023, Link
Repository-Level Prompt Generation for Large Language Models of Code. ICML 2023, Link
Refactoring Programs Using Large Language Models with Few-Shot Examples. arXiv, Link
SWE-bench: Can Language Models Resolve Real-World GitHub Issues? Link
Teaching Large Language Models to Self-Debug, ICLR 2024, Link
Guess & Sketch: Language Model Guided Transpilation, ICLR 2024, Link
Optimal Neural Program Synthesis from Multimodal Specifications, EMNLP 2021, Link
CodeTrek: Flexible Modeling of Code using an Extensible Relational Representation, ICLR 2022, Link
Sporq: An Interactive Environment for Exploring Code Using Query-by-Example, UIST 2021, Link
Data Extraction via Semantic Regular Expression Synthesis, OOPSLA 2023, Link
Web Question Answering with Neurosymbolic Program Synthesis, PLDI 2021, Link
Active Inductive Logic Programming for Code Search, ICSE 2019, Link

Fuzzing and Testing

Sedar: Obtaining High-Quality Seeds for DBMS Fuzzing via Cross-DBMS SQL Transfer. ICSE 2024. Link
LLM4FUZZ: Guided Fuzzing of Smart Contracts with Large Language Models Link
Large Language Model guided Protocol Fuzzing, NDSS 2024, Link
Large Language Models are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models, ISSTA 2023, Link
Language Agents as Hackers: Evaluating Cybersecurity Skills with Capture the Flag, MASEC@NeurIPS 2023, Link

Code Model and Code Reasoning

Source Code Vulnerability Detection: Combining Code Language Models and Code Property Graphs, arXiv, Link
CodeArt: Better Code Models by Attention Regularization When Symbols Are Lacking, FSE 2024, Link
FAIR: Flow Type-Aware Pre-Training of Compiler Intermediate Representations, ICSE 2024, Link
Symmetry-Preserving Program Representations for Learning Code Semantics Link
LmPa: Improving Decompilation by Synergy of Large Language Model and Program Analysis, Link
When Do Program-of-Thoughts Work for Reasoning? AAAI 2024, Link
Grounded Copilot: How Programmers Interact with Code-Generating Models, OOPSLA 2023, Link
Extracting Training Data from Large Language Models, USENIX Security 2021, Link

Code Understanding and IDE-tech

Using an LLM to Help With Code Understanding, ICSE 2024, Link

Hallucination

Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation, ICLR 2024, Link

Prompting for Reasoning Tasks

Self-Evaluation Guided Beam Search for Reasoning, NeurIPS 2023, Link
Self-Consistency Improves Chain of Thought Reasoning in Language Models. ICLR 2023, Link
Tree of Thoughts: Deliberate Problem Solving with Large Language Models. NeurIPS 2023, Link
Cumulative Reasoning With Large Language Models, Link
Explanation Selection Using Unlabeled Data for Chain-of-Thought Prompting, EMNLP 2023, Link
Complementary Explanations for Effective In-Context Learning, ACL 2023, Link
WeChat Post: The Mathematical Path of Large Language Models (大语言模型的数学之路), Link
Blog: Prompt Engineering Link
Hallucination: Survey Link

Agent, Tool Using, and Planning

Natural Language Commanding via Program Synthesis, Microsoft Link
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator, Feifei Li, Google Link
If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents, Link
Real-world practices of AI Agents, Link
Cognitive Architectures for Language Agents, Link
The Rise and Potential of Large Language Model Based Agents: A Survey, Link
ReAct: Synergizing Reasoning and Acting in Language Models Link
Reflexion: Language Agents with Verbal Reinforcement Learning, NeurIPS 2023, Link
WeChat Post: AutoGen, Link
SATLM: Satisfiability-Aided Language Models Using Declarative Prompting, NeurIPS 2023, Link
Awesome things about LLM-powered agents: Papers, Repos, and Blogs, Link
ChatDev: Mastering the Virtual Social Realm, Shaping the Future of Intelligent Interactions. Link
SWE-bench: Can Language Models Resolve Real-World GitHub Issues? Link

Model and Framework

LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All. Link
codellama: Inference code for CodeLlama models, Link
CodeFuse: LLM for Code from Ant Group, Link
Owl-LM: Large Language Model for Blockchain, Link

Remark: Researcher List

Tao YU, The University of Hong Kong (Training)
Shunyu YAO, Princeton University (Reasoning, Agent)
Xi YE, Isil Dillig, UT Austin (Prompting)
Lingming ZHANG, UIUC (Application: Testing, Repair)
Zhiyun QIAN, UC Riverside (Application: Analysis)
Yizheng CHEN, University of Maryland (Application: Analysis)
Baishakhi Ray, Columbia University (Application: Repair, Analysis)
Martin Vechev, ETH (Data, Hallucination)
