This repository contains a curated list of awesome research papers, datasets and tools for applying machine learning techniques to compilers and program optimisation.
- Machine Learning in Compiler Optimisation - Zheng Wang and Michael O'Boyle, Proceedings of the IEEE, 2018
- A survey on compiler autotuning using machine learning - Ashouri, Amir H., William Killian, John Cavazos, Gianluca Palermo, and Cristina Silvano, ACM Computing Surveys (CSUR), 2018
- A survey of machine learning for big code and naturalness - Allamanis, Miltiadis, Earl T. Barr, Premkumar Devanbu, and Charles Sutton, ACM Computing Surveys (CSUR), 2018
- A Collaborative Filtering Approach for the Automatic Tuning of Compiler Optimisations - Cereda, Stefano, Gianluca Palermo, Paolo Cremonesi, and Stefano Doni, LCTES 2020.
- Autophase: Compiler phase-ordering for hls with deep reinforcement learning. Qijing Huang, Ameer Haj-Ali, William Moses, John Xiang, Ion Stoica, Krste Asanovic, John Wawrzynek. FCCM 2019.
- Micomp: Mitigating the compiler phase-ordering problem using optimization sub-sequences and machine learning - Amir H. Ashouri, Andrea Bignoli, Gianluca Palermo, Cristina Silvano, Sameer Kulkarni, and John Cavazos. ACM Transactions on Architecture and Code Optimization (TACO) 2017.
- Learning to superoptimize programs - Rudy Bunel, Alban Desmaison, M. Pawan Kumar, Philip H.S. Torr, Pushmeet Kohli. ICLR 2017
- Mitigating the compiler optimization phase-ordering problem using machine learning - Sameer Kulkarni and John Cavazos, OOPSLA 2012
- MILEPOST GCC: machine learning based research compiler - Grigori Fursin, Cupertino Miranda, Olivier Temam, Mircea Namolaru, Elad Yom-Tov, Ayal Zaks, Bilha Mendelson et al., 2008
- Rapidly selecting good compiler optimizations using performance counters - John Cavazos, Grigori Fursin, Felix Agakov, Edwin Bonilla, Michael FP O'Boyle, and Olivier Temam. CGO 2007.
- Using machine learning to focus iterative optimization - Agakov, Felix, Edwin Bonilla, John Cavazos, Björn Franke, Grigori Fursin, Michael FP O'Boyle, John Thomson, Marc Toussaint, and Christopher KI Williams. CGO 2006.
- NeuroVectorizer: end-to-end vectorization with deep reinforcement learning - Ameer Haj-Ali, Nesreen K. Ahmed, Ted Willke, Yakun Sophia Shao, Krste Asanovic, and Ion Stoica. CGO 2020.
- Compiler Auto-Vectorization with Imitation Learning - Charith Mendis, Cambridge Yang, Yewen Pu, Saman P. Amarasinghe, Michael Carbin. NeurIPS 2019.
- Learning to schedule straight-line code - J. Eliot B. Moss, Paul E. Utgoff, John Cavazos, Doina Precup, Darko Stefanovic, Carla E. Brodley, and David Scheeff. NeurIPS 1998.
- TVM: An automated end-to-end optimizing compiler for deep learning - Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Haichen Shen, Meghan Cowan et al., OSDI 2018
- Cobayn: Compiler autotuning framework using bayesian networks - Amir Hossein Ashouri, Giovanni Mariani, Gianluca Palermo, Eunjung Park, John Cavazos, and Cristina Silvano, ACM Transactions on Architecture and Code Optimization (TACO), 2016.
- Autotuning algorithmic choice for input sensitivity - Yufei Ding, Jason Ansel, Kalyan Veeramachaneni, Xipeng Shen, Una-May O'Reilly, and Saman Amarasinghe. PLDI 2015
- Fast: A fast stencil autotuning framework based on an optimal-solution space model - Yulong Luo, Guangming Tan, Zeyao Mo, and Ninghui Sun. ACM Transactions on Architecture and Code Optimization (TACO), 2015.
- GPU performance and power tuning using regression trees - Wenhao Jia, Elba Garza, Kelly A. Shaw, and Margaret Martonosi. SC 2015.
- Opentuner: An extensible framework for program autotuning - Jason Ansel, Shoaib Kamil, Kalyan Veeramachaneni, Jonathan Ragan-Kelley, Jeffrey Bosboom, Una-May O'Reilly, and Saman Amarasinghe, PACT 2014
- Code Mapping in Heterogeneous Platforms Using Deep Learning and LLVM-IR - Francesco Barchi, Gianvito Urgese, Enrico Macii, and Andrea Acquaviva. DAC 2019.
- Improving spark application throughput via memory aware task co-location: A mixture of experts approach - Vicent Sanz Marco, Ben Taylor, Barry Porter, and Zheng Wang. Middleware 2017.
- Quasar: resource-efficient and QoS-aware cluster management - Christina Delimitrou and Christos Kozyrakis. ASPLOS 2014.
- Automatic and portable mapping of data parallel programs to opencl for gpu-based heterogeneous systems - Zheng Wang, Dominik Grewe, and Michael O'Boyle. ACM Transactions on Architecture and Code Optimization (TACO), 2014.
- Integrating profile-driven parallelism detection and machine-learning-based mapping - Zheng Wang, Georgios Tournavitis, Björn Franke, and Michael FP O'Boyle. ACM Transactions on Architecture and Code Optimization (TACO), 2014.
- Smart multi-task scheduling for OpenCL programs on CPU/GPU heterogeneous platforms - Yuan Wen, Zheng Wang, and Michael FP O'Boyle. HiPC 2015.
- Smart, adaptive mapping of parallelism in the presence of external workload - Murali Krishna Emani, Zheng Wang, and Michael O'Boyle. CGO 2013.
- Partitioning streaming parallelism for multi-cores: a machine learning based approach - Zheng Wang and Michael O'Boyle. PACT 2010.
- Qilin: exploiting parallelism on heterogeneous multiprocessors with adaptive mapping - Chi-Keung Luk, Sunpyo Hong, and Hyesoon Kim. MICRO 2009.
- Mapping parallelism to multi-cores: a machine learning based approach - Zheng Wang and Michael O'Boyle. PPoPP 2009.
- Bridging the gap between deep learning and sparse matrix format selection - Yue Zhao, Jiajia Li, Chunhua Liao and Xipeng Shen. PPoPP 2018.
- Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines - Jonathan Ragan-Kelley, Connelly Barnes, Andrew Adams, Sylvain Paris, Frédo Durand, and Saman Amarasinghe, PLDI 2013.
- PetaBricks: a language and compiler for algorithmic choice - Jason Ansel, Cy Chan, Yee Lok Wong, Marek Olszewski, Qin Zhao, Alan Edelman, and Saman Amarasinghe, PLDI 2009.
- Learning to Optimize Halide with Tree Search and Random Programs - Andrew Adams, Karima Ma, Luke Anderson, Riyadh Baghdadi, Tzu-Mao Li, Michael Gharbi, Benoit Steiner, Steven Johnson, Kayvon Fatahalian, Frédo Durand, Jonathan Ragan-Kelley. ACM Trans Graph, 38(4), 2019.
- Ithemal: Accurate, portable and fast basic block throughput estimation using deep neural networks - Charith Mendis, Alex Renda, Saman Amarasinghe, and Michael Carbin. ICML 2019.
- Absinthe: Learning an Analytical Performance Model to Fuse and Tile Stencil Codes in One Shot - Tobias Gysi, Tobias Grosser, and Torsten Hoefler. PACT 2019.
- Compiler-based graph representations for deep learning models of code - Alexander Brauckmann, Andrés Goens, Sebastian Ertel, and Jeronimo Castrillon. CC 2020.
- code2seq: Generating sequences from structured representations of code - Uri Alon, Shaked Brody, Omer Levy, and Eran Yahav. ICLR 2019.
- code2vec: Learning distributed representations of code - Uri Alon, Meital Zilberstein, Omer Levy, and Eran Yahav. POPL 2019.
- Neural Code Comprehension: A Learnable Representation of Code Semantics - Tal Ben-Nun, Alice Shoshana Jakobovits, and Torsten Hoefler. NeurIPS 2018.
- End-to-end deep learning of optimization heuristics - Chris Cummins, Pavlos Petoumenos, Zheng Wang, and Hugh Leather. PACT 2017.
- Using graph-based program characterization for predictive modeling - Eunjung Park, John Cavazos, and Marco A. Alvarez. CGO 2011.
- Automatic feature generation for machine learning based optimizing compilation - Hugh Leather, Edwin Bonilla, and Michael O'Boyle. CGO 2009.
- Synthesizing benchmarks for predictive modeling - Chris Cummins, Pavlos Petoumenos, Zheng Wang, and Hugh Leather. CGO 2017.
- Minimizing the cost of iterative compilation with active learning - William Ogilvie, Pavlos Petoumenos, Zheng Wang, and Hugh Leather. CGO 2017.
- Saman Amarasinghe, Compiler 2.0: Using Machine Learning to Modernize Compiler Technology. LCTES 2020.
- programl - LLVM and XLA IR program representation for machine learning.
- NeuroVectorizer - Using deep reinforcement learning (RL) to predict optimal vectorization compiler pragmas (paper).
- TVM - Open Deep Learning Compiler Stack for cpu, gpu and specialized accelerators (paper; slides).
- clgen - Benchmark generator using LSTMs (paper).
- OpenTuner - Framework for building domain-specific multi-objective program autotuners (paper; slides)
- BHive - A Benchmark Suite and Measurement Framework for Validating x86-64 Basic Block Performance Models (paper).
- cBench - 32 C benchmarks with datasets and driver scripts.
- DeepDataFlow - 469k LLVM-IR files and 8.6B data-flow analysis labels for classification.
- devmap - 650 OpenCL benchmark features and CPU/GPU classification labels.
- ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI
- Architectural Support for Programming Languages and Operating Systems, ASPLOS
- ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP
- International Symposium on Code Generation and Optimization, CGO
- International Conference on Parallel Architectures and Compilation Techniques, PACT
- Object-oriented Programming, Systems, Languages, and Applications, OOPSLA
- International Conference on Compiler Construction, CC
- European Conference on Computer Systems, EuroSys
- International Conference on Supercomputing, ICS
- ACM Symposium on Parallelism in Algorithms and Architectures, SPAA
- International Conference on High Performance and Embedded Architectures and Compilers, HiPEAC
- International Conference on Virtual Execution Environments, VEE
- International Conference on Languages, Compilers and Tools for Embedded Systems, LCTES
- International Conference on Computing Frontiers, CF
- International Parallel and Distributed Processing Symposium, IPDPS
- International Conference for High Performance Computing, Networking, Storage, and Analysis, SC
- IEEE/ACM International Symposium on Microarchitecture, MICRO
- International Conference on Compilers, Architectures, and Synthesis for Embedded Systems, CASES
- USENIX Annual Technical Conference, ATC
- USENIX Symposium on Operating Systems Design and Implementation, OSDI
- International Conference on High Performance Computing, Data and Analytics, HiPC
- International Conference on Parallel Processing, ICPP
- International Middleware Conference, Middleware
- European Conference on Parallel Processing, Euro-Par
- Machine Learning and Programming Languages Workshop, MAPL
- Languages and Compilers for Parallel Computing, LCPC
- International Conference on Learning Representations, ICLR
- Conference on Machine Learning and Systems, MLSys
See Contributions.md. TL;DR: send me (@zwang4) a pull request.