📄 Foundation Models

Paper List

TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts2024.07.03

Ruida Wang, Jipeng Zhang, Yizhen Jia, Rui Pan, Shizhe Diao, etc


Pedestrian 3D Shape Understanding for Person Re-Identification via Multi-View Learning2024.07.01

Zaiyang Yu, Lusi Li, Jinlong Xie, Changshuo Wang, Weijun Li, etc . - 【IEEE transactions on circuits and systems for video technology (Print)】


Token Erasure as a Footprint of Implicit Vocabulary Items in LLMs2024.06.28

Sheridan Feucht, David Atkinson, Byron C. Wallace, David Bau


OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding2024.06.27

Tao Zhang, Xiangtai Li, Hao Fei, Haobo Yuan, Shengqiong Wu, etc


Fundamental Problems With Model Editing: How Should Rational Belief Revision Work in LLMs?2024.06.27

Peter Hase, Thomas Hofweber, Xiang Zhou, Elias Stengel-Eskin, Mohit Bansal


Efficient World Models with Context-Aware Tokenization2024.06.27

Vincent Micheli, Eloi Alonso, François Fleuret


The Remarkable Robustness of LLMs: Stages of Inference?2024.06.27

Vedang Lad, Wes Gurnee, Max Tegmark


ResumeAtlas: Revisiting Resume Classification with Large-Scale Datasets and Large Language Models2024.06.26

Ahmed Heakl, Youssef Mohamed, Noran Mohamed, Ali Sharkaway, A. Zaky


AITTI: Learning Adaptive Inclusive Token for Text-to-Image Generation2024.06.18

Xinyu Hou, Xiaoming Li, Chen Change Loy


Unveiling Encoder-Free Vision-Language Models2024.06.17

Haiwen Diao, Yufeng Cui, Xiaotong Li, Yueze Wang, Huchuan Lu, etc


RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content2024.06.17

João Monteiro, Pierre-Andre Noel, Étienne Marcotte, Sai Rajeswar, Valentina Zantedeschi, etc


LLaNA: Large Language and NeRF Assistant2024.06.17

Andrea Amaduzzi, Pierluigi Zama Ramirez, Giuseppe Lisanti, Samuele Salti, Luigi Di Stefano


VideoLLM-online: Online Video Large Language Model for Streaming Video2024.06.17

Joya Chen, Zhaoyang Lv, Shiwei Wu, Kevin Qinghong Lin, Chenan Song, etc


System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models2024.06.17

Sam Ade Jacobs, Masahiro Tanaka, Chengming Zhang, Minjia Zhang, Reza Yazdani Aminabadi, etc . - 【ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing】


VEGA: Learning Interleaved Image-Text Comprehension in Vision-Language Large Models2024.06.14

Chenyu Zhou, Mengdan Zhang, Peixian Chen, Chaoyou Fu, Yunhang Shen, etc


EFM3D: A Benchmark for Measuring Progress Towards 3D Egocentric Foundation Models2024.06.14

Julian Straub, Daniel DeTone, Tianwei Shen, Nan Yang, Chris Sweeney, etc


ContraSolver: Self-Alignment of Language Models by Resolving Internal Preference Contradictions2024.06.13

Xu Zhang, Xunjian Yin, Xiaojun Wan


Advancing High Resolution Vision-Language Models in Biomedicine2024.06.12

Zekai Chen, Arda Pekis, Kevin Brown


Enhancing End-to-End Autonomous Driving with Latent World Model2024.06.12

Yingyan Li, Lue Fan, Jiawei He, Yu-Quan Wang, Yuntao Chen, etc


Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing2024.06.12

Zhangchen Xu, Fengqing Jiang, Luyao Niu, Yuntian Deng, R. Poovendran, etc


Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams2024.06.12

Haoji Zhang, Yiqin Wang, Yansong Tang, Yong Liu, Jiashi Feng, etc


Vision Model Pre-training on Interleaved Image-Text Data via Latent Compression Learning2024.06.11

Chenyu Yang, Xizhou Zhu, Jinguo Zhu, Weijie Su, Junjie Wang, etc


3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination2024.06.07

Jianing Yang, Xuweiyi Chen, Nikhil Madaan, Madhavan Iyengar, Shengyi Qian, etc


Improving Alignment and Robustness with Circuit Breakers2024.06.06

Andy Zou, Long Phan, Justin Wang, Derek Duenas, Maxwell Lin, etc


Self-Play with Adversarial Critic: Provable and Scalable Offline Alignment for Language Models2024.06.06

Xiang Ji, Sanjeev Kulkarni, Mengdi Wang, Tengyang Xie


Verbalized Machine Learning: Revisiting Machine Learning with Language Models2024.06.06

Tim Z. Xiao, Robert Bamler, Bernhard Schölkopf, Weiyang Liu


Seq1F1B: Efficient Sequence-Level Pipeline Parallelism for Large Language Model Training2024.06.05

Ao Sun, Weilin Zhao, Xu Han, Cheng Yang, Zhiyuan Liu, etc


SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining2024.06.04

Andi Han, Jiaxiang Li, Wei Huang, Mingyi Hong, Akiko Takeda, etc


Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality2024.05.31

Tri Dao, Albert Gu . - 【arXiv.org】


Spectrum-Aware Parameter Efficient Fine-Tuning for Diffusion Models2024.05.31

Xinxi Zhang, Song Wen, Ligong Han, Felix Juefei-Xu, Akash Srivastava, etc . - 【arXiv.org】


StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond2024.05.31

Pengyuan Lyu, Yulin Li, Hao Zhou, Weihong Ma, Xingyu Wan, etc . - 【arXiv.org】


Self-Augmented Preference Optimization: Off-Policy Paradigms for Language Model Alignment2024.05.31

Yueqin Yin, Zhendong Wang, Yujia Xie, Weizhu Chen, Mingyuan Zhou . - 【arXiv.org】


LACIE: Listener-Aware Finetuning for Confidence Calibration in Large Language Models2024.05.31

Elias Stengel-Eskin, Peter Hase, Mohit Bansal . - 【arXiv.org】


Graph External Attention Enhanced Transformer2024.05.31

Jianqing Liang, Min Chen, Jiye Liang . - 【arXiv.org】


Self-Exploring Language Models: Active Preference Elicitation for Online Alignment2024.05.29

Shenao Zhang, Donghan Yu, Hiteshi Sharma, Ziyi Yang, Shuohang Wang, etc . - 【arXiv.org】


Visualizing the loss landscape of Self-supervised Vision Transformer2024.05.28

Youngwan Lee, Jeffrey Willette, Jonghee Kim, Sung Ju Hwang . - 【arXiv.org】


Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models2024.05.24

Byung-Kwan Lee, Chae Won Kim, Beomchan Park, Yong Man Ro . - 【arXiv.org】


Disease-informed Adaptation of Vision-Language Models2024.05.24

Jiajin Zhang, Ge Wang, M. Kalra, P. Yan . - 【arXiv.org】


VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks2024.05.24

Yang Li, Shaobo Han, Shihao Ji . - 【arXiv.org】


DAGER: Exact Gradient Inversion for Large Language Models2024.05.24

Ivo Petrov, D. I. Dimitrov, Maximilian Baader, Mark Niklas Müller, Martin T. Vechev . - 【arXiv.org】


Exploring Alignment in Shared Cross-lingual Spaces2024.05.23

Basel Mousi, Nadir Durrani, Fahim Dalvi, Majd Hawasly, Ahmed Abdelali . - 【arXiv.org】


Spectral Adapter: Fine-Tuning in Spectral Space2024.05.22

Fangzhao Zhang, Mert Pilanci . - 【arXiv.org】


Large Language Models are Biased Reinforcement Learners2024.05.19

William M. Hayes, Nicolas Yax, Stefano Palminteri . - 【arXiv.org】


Libra: Building Decoupled Vision System on Large Language Models2024.05.16

Yifan Xu, Xiaoshan Yang, Y. Song, Changsheng Xu . - 【arXiv.org】


Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model2024.05.16

Zheng Gu, Shiyuan Yang, Jing Liao, Jing Huo, Yang Gao . - 【arXiv.org】


Improving Transformers with Dynamically Composable Multi-Head Attention2024.05.14

Da Xiao, Qingye Meng, Shengping Li, Xingyuan Yuan . - 【arXiv.org】


Efficient Vision-Language Pre-training by Cluster Masking2024.05.14

Zihao Wei, Zixuan Pan, Andrew Owens . - 【arXiv.org】


Linearizing Large Language Models2024.05.10

Jean-Pierre Mercat, Igor Vasiljevic, Sedrick Scott Keh, Kushal Arora, Achal Dave, etc . - 【arXiv.org】


LLM-Generated Black-box Explanations Can Be Adversarially Helpful2024.05.10

R. Ajwani, Shashidhar Reddy Javaji, Frank Rudzicz, Zining Zhu . - 【arXiv.org】


Vision Mamba: A Comprehensive Survey and Taxonomy2024.05.07

Xiao Liu, Chenxu Zhang, Lei Zhang . - 【arXiv.org】


Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak Attacks2024.05.07

Georgios Pantazopoulos, Amit Parekh, Malvina Nikandrou, Alessandro Suglia . - 【SAFETY4CONVAI】


vAttention: Dynamic Memory Management for Serving LLMs without PagedAttention2024.05.07

Ramya Prabhu, Ajay Nayak, Jayashree Mohan, R. Ramjee, Ashish Panwar . - 【arXiv.org】


Single and Multi-Hop Question-Answering Datasets for Reticular Chemistry with GPT-4-Turbo2024.05.03

Nakul Rampal, Kaiyu Wang, Matthew Burigana, Lingxiang Hou, Juri Al-Johani, etc . - 【arXiv.org】


NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment2024.05.02

Gerald Shen, Zhilin Wang, Olivier Delalleau, Jiaqi Zeng, Yi Dong, etc . - 【arXiv.org】


Self-Play Preference Optimization for Language Model Alignment2024.05.01

Yue Wu, Zhiqing Sun, Huizhuo Yuan, Kaixuan Ji, Yiming Yang, etc . - 【arXiv.org】


When Quantization Affects Confidence of Large Language Models?2024.05.01

Irina Proskurina, Luc Brun, Guillaume Metzler, Julien Velcin . - 【arXiv.org】


RST-LoRA: A Discourse-Aware Low-Rank Adaptation for Long Document Abstractive Summarization2024.05.01

Dongqi Pu, Vera Demberg . - 【North American Chapter of the Association for Computational Linguistics】


Investigating Automatic Scoring and Feedback using Large Language Models2024.05.01

G. Katuka, Alexander Gain, Yen-Yun Yu . - 【arXiv.org】


CultiVerse: Towards Cross-Cultural Understanding for Paintings with Large Language Model2024.05.01

Wei Zhang, Wong Kam-Kwai, Biying Xu, Yiwen Ren, Yuhuai Li, etc . - 【arXiv.org】


Lost in Recursion: Mining Rich Event Semantics in Knowledge Graphs2024.04.25

Florian Plötzky, Niklas Kiehne, Wolf-Tilo Balke . - 【Web Science Conference】


Make Your LLM Fully Utilize the Context2024.04.25

Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou . - 【arXiv.org】


Unifying Asynchronous Logics for Hyperproperties2024.04.25

A. Bombardelli, L. Bozzelli, César Sánchez, Stefano Tonetta . - 【arXiv.org】


A Survey on Visual Mamba2024.04.24

Hanwei Zhang, Ying Zhu, Dan Wang, Lijun Zhang, Tianxiang Chen, etc . - 【Applied Sciences】


The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models2024.04.24

Hannah Rose Kirk, Alexander Whitefield, Paul Röttger, Andrew M. Bean, Katerina Margatina, etc . - 【arXiv.org】


Re-Thinking Inverse Graphics With Large Language Models2024.04.23

Peter Kulits, Haiwen Feng, Weiyang Liu, Victoria Abrevaya, Michael J. Black . - 【arXiv.org】


Revisiting Unnaturalness for Automated Program Repair in the Era of Large Language Models2024.04.23

Aidan Z. H. Yang, Sophia Kolak, Vincent J. Hellendoorn, Ruben Martins, Claire Le Goues . - 【arXiv.org】


Does Instruction Tuning Make LLMs More Consistent?2024.04.23

Constanza Fierro, Jiaang Li, Anders Søgaard . - 【arXiv.org】


SkinGEN: an Explainable Dermatology Diagnosis-to-Generation Framework with Interactive Vision-Language Models2024.04.23

Bo Lin, Yingjing Xu, Xuanwen Bao, Zhou Zhao, Zuyong Zhang, etc . - 【arXiv.org】


Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs2024.04.19

Clemencia Siro, Mohammad Aliannejadi, M. de Rijke . - 【arXiv.org】


Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs2024.04.19

Biyang Guo, He Wang, Wenyilin Xiao, Hong Chen, Zhuxin Lee, etc . - 【arXiv.org】


When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes2024.04.18

Asaf Yehudai, Elron Bandel . - 【arXiv.org】


Omniview-Tuning: Boosting Viewpoint Invariance of Vision-Language Pre-training Models2024.04.18

Shouwei Ruan, Yinpeng Dong, Hanqing Liu, Yao Huang, Hang Su, etc . - 【arXiv.org】


V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning2024.04.18

Hang Hua, Yunlong Tang, Chenliang Xu, Jiebo Luo . - 【arXiv.org】


Moving Object Segmentation: All You Need Is SAM (and Flow)2024.04.18

Junyu Xie, Charig Yang, Weidi Xie, Andrew Zisserman . - 【arXiv.org】


Quantifying Multilingual Performance of Large Language Models Across Languages2024.04.17

Zihao Li, Yucheng Shi, Zirui Liu, Fan Yang, Ninghao Liu, etc . - 【arXiv.org】


Self-Supervised Visual Preference Alignment2024.04.16

Ke Zhu, Liang Zhao, Zheng Ge, Xiangyu Zhang . - 【arXiv.org】


RecurrentGemma: Moving Past Transformers for Efficient Open Language Models2024.04.11

Aleksandar Botev, Soham De, Samuel L Smith, Anushan Fernando, George Muraru, etc


Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding2024.04.11

Yiwen Tang, Jiaming Liu, Dong Wang, Zhigang Wang, Shanghang Zhang, etc . - 【arXiv.org】


OpenBias: Open-set Bias Detection in Text-to-Image Generative Models2024.04.11

Moreno D'Incà, E. Peruzzo, Massimiliano Mancini, Dejia Xu, Vidit Goel, etc


Halu-NLP at SemEval-2024 Task 6: MetaCheckGPT - A Multi-task Hallucination Detection using LLM uncertainty and meta-models2024.04.10

Rahul Mehta, Andrew Hoblitzell, Jack O’Keefe, Hyeju Jang, Vasudeva Varma . - 【International Workshop on Semantic Evaluation】


Scaling Up Video Summarization Pretraining with Large Language Models2024.04.04

Dawit Mureja Argaw, Seunghyun Yoon, Fabian Caba Heilbron, Hanieh Deilamsalehy, Trung Bui, etc


Linear Attention Sequence Parallelism2024.04.03

Weigao Sun, Zhen Qin, Dong Li, Xuyang Shen, Yu Qiao, etc . - 【arXiv.org】


Pre-trained Vision and Language Transformers Are Few-Shot Incremental Learners2024.04.02

Keon-Hee Park, Kyungwoo Song, Gyeong-Moon Park


Fault detection of complicated processes based on an enhanced transformer network with graph attention mechanism2024.04.01

Yuping Cao, Xiaoguang Tang, Xiaogang Deng, Ping Wang . - 【Chemical engineering research & design】


WavLLM: Towards Robust and Adaptive Speech Large Language Model2024.03.31

Shujie Hu, Long Zhou, Shujie Liu, Sanyuan Chen, Hongkun Hao, etc . - 【arXiv.org】


Extensive Self-Contrast Enables Feedback-Free Language Model Alignment2024.03.31

Xiao Liu, Xixuan Song, Yuxiao Dong, Jie Tang . - 【arXiv.org】


MTLoRA: A Low-Rank Adaptation Approach for Efficient Multi-Task Learning2024.03.29

Ahmed A. Agiza, Marina Neseem, S. Reda . - 【arXiv.org】


ReALM: Reference Resolution As Language Modeling2024.03.29

Joel Ruben Antony Moniz, Soundarya Krishnan, Melis Ozyildirim, Prathamesh Saraf, Halim Cagri Ates, etc . - 【arXiv.org】


RSMamba: Remote Sensing Image Classification with State Space Model2024.03.28

Keyan Chen, Bo-Ying Chen, Chenyang Liu, Wenyuan Li, Zhengxia Zou, etc . - 【arXiv.org】


DreamLIP: Language-Image Pre-training with Long Captions2024.03.25

Kecheng Zheng, Yifei Zhang, Wei Wu, Fan Lu, Shuailei Ma, etc


Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model2024.03.20

Peng Zhou, Jianmin Wang, Chunyan Li, Zixu Wang, Yiping Liu, etc


MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric2024.03.12

Haokun Lin, Haoli Bai, Zhili Liu, Lu Hou, Muyi Sun, etc . - 【arXiv.org】


VideoMamba: State Space Model for Efficient Video Understanding2024.03.11

Kunchang Li, Xinhao Li, Yi Wang, Yinan He, Yali Wang, etc


Mamba4Rec: Towards Efficient Sequential Recommendation with Selective State Space Models2024.03.06

Chengkai Liu, Jianghao Lin, Jianling Wang, Hanzhou Liu, James Caverlee


Semantics-enhanced Cross-modal Masked Image Modeling for Vision-Language Pre-training2024.03.01

Haowei Liu, Yaya Shi, Haiyang Xu, Chunfen Yuan, Qinghao Ye, etc . - 【International Conference on Language Resources and Evaluation】


Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models2024.02.29

Soham De, Samuel L. Smith, Anushan Fernando, Aleksandar Botev, George-Cristian Muraru, etc


LeMo-NADe: Multi-Parameter Neural Architecture Discovery with LLMs2024.02.28

Md Hafizur Rahman, Prabuddha Chakraborty


LoRA-SP: Streamlined Partial Parameter Adaptation for Resource-Efficient Fine-Tuning of Large Language Models2024.02.28

Yichao Wu, Yafei Xiang, Shuning Huo, Yulu Gong, Penghao Liang


GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning2024.02.26

Aivin V. Solatorio . - 【arXiv.org】


Set the Clock: Temporal Alignment of Pretrained Language Models2024.02.26

Bowen Zhao, Zander Brumbaugh, Yizhong Wang, Hanna Hajishirzi, Noah A. Smith . - 【arXiv.org】


Generative Pretrained Hierarchical Transformer for Time Series Forecasting2024.02.26

Zhiding Liu, Jiqian Yang, Mingyue Cheng, Yucong Luo, Zhi Li


GROUNDHOG: Grounding Large Language Models to Holistic Segmentation2024.02.26

Yichi Zhang, Ziqiao Ma, Xiaofeng Gao, Suhaila Shakiah, Qiaozi Gao, etc


Genie: Generative Interactive Environments2024.02.23

Jake Bruce, Michael Dennis, Ashley Edwards, Jack Parker-Holder, Yuge Shi, etc


Self-evolving Autoencoder Embedded Q-Network2024.02.18

J. Senthilnath, Bangjian Zhou, Zhen Wei Ng, Deeksha Aggarwal, Rajdeep Dutta, etc . - 【arXiv.org】


Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation2024.02.15

Huizhuo Yuan, Zixiang Chen, Kaixuan Ji, Quanquan Gu . - 【arXiv.org】


Efficient Stagewise Pretraining via Progressive Subnetworks2024.02.08

Abhishek Panigrahi, Nikunj Saunshi, Kaifeng Lyu, Sobhan Miryoosefi, Sashank J. Reddi, etc . - 【arXiv.org】


ConvLoRA and AdaBN based Domain Adaptation via Self-Training2024.02.07

Sidra Aleem, J. Dietlmeier, Eric Arazo, Suzanne Little . - 【arXiv.org】


LoTR: Low Tensor Rank Weight Adaptation2024.02.02

Daniel Bershatsky, Daria Cherniuk, Talgat Daulbaev, A. Mikhalev, Ivan Oseledets . - 【arXiv.org】


UPDP: A Unified Progressive Depth Pruner for CNN and Vision Transformer2024.01.12

Ji Liu, Dehua Tang, Yuanxian Huang, Li Zhang, Xiaocheng Zeng, etc . - 【arXiv.org】


MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation2024.01.09

Weimin Wang, Jiawei Liu, Zhijie Lin, Jiangqiao Yan, Shuo Chen, etc . - 【arXiv.org】


Task Oriented Dialogue as a Catalyst for Self-Supervised Automatic Speech Recognition2024.01.04

David M. Chan, Shalini Ghosh, Hitesh Tulsiani, A. Rastrow, Björn Hoffmeister . - 【arXiv.org】


Instruct-Imagen: Image Generation with Multi-modal Instruction2024.01.03

Hexiang Hu, Kelvin C.K. Chan, Yu-Chuan Su, Wenhu Chen, Yandong Li, etc . - 【arXiv.org】


EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI2023.12.26

Tai Wang, Xiaohan Mao, Chenming Zhu, Runsen Xu, Ruiyuan Lyu, etc . - 【arXiv.org】


Time is Encoded in the Weights of Finetuned Language Models2023.12.20

Kai Nylund, Suchin Gururangan, Noah A. Smith


Photorealistic Video Generation with Diffusion Models2023.12.11

Agrim Gupta, Lijun Yu, Kihyuk Sohn, Xiuye Gu, Meera Hahn, etc


Mamba: Linear-Time Sequence Modeling with Selective State Spaces2023.12.01

Albert Gu, Tri Dao


Minimizing Factual Inconsistency and Hallucination in Large Language Models2023.11.23

I. Muneeswaran, Shreya Saxena, Siva Prasad, M. V. S. Prakash, Advaith Shankar, etc . - 【arXiv.org】


White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is?2023.11.22

Yaodong Yu, Sam Buchanan, Druv Pai, Tianzhe Chu, Ziyang Wu, etc . - 【arXiv.org】


Learning skillful medium-range global weather forecasting.2023.11.14

Remi Lam, Alvaro Sanchez-Gonzalez, Matthew Willson, Peter Wirnsberger, Meire Fortunato, etc . - 【Science】


Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding2023.11.14

Peng Jin, Ryuichi Takanobu, Caiwan Zhang, Xiaochun Cao, Li Yuan . - 【arXiv.org】


SpectralGPT: Spectral Foundation Model2023.11.13

D. Hong, Bing Zhang, Xuyang Li, Yuxuan Li, Chenyu Li, etc . - 【arXiv.org】


Social Motion Prediction with Cognitive Hierarchies2023.11.08

Wentao Zhu, Jason Qin, Yuke Lou, Hang Ye, Xiaoxuan Ma, etc . - 【arXiv.org】


Pre-training LLMs using human-like development data corpus2023.11.08

Khushi Bhardwaj, Raj Sanjay Shah, Sashank Varma . - 【arXiv.org】


mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration2023.11.07

Qinghao Ye, Haiyang Xu, Jiabo Ye, Mingshi Yan, Anwen Hu, etc . - 【arXiv.org】


Scalable and Transferable Black-Box Jailbreaks for Language Models via Persona Modulation2023.11.06

Rusheb Shah, Quentin Feuillade--Montixi, Soroush Pour, Arush Tagade, Stephen Casper, etc . - 【arXiv.org】


Ziya2: Data-centric Learning is All LLMs Need2023.11.06

Ruyi Gan, Ziwei Wu, Renliang Sun, Junyu Lu, Xiaojun Wu, etc . - 【arXiv.org】


Levels of AGI: Operationalizing Progress on the Path to AGI2023.11.04

Meredith Ringel Morris, Jascha Narain Sohl-Dickstein, Noah Fiedel, T. Warkentin, Allan Dafoe, etc . - 【arXiv.org】


CodeFusion: A Pre-trained Diffusion Model for Code Generation2023.10.26

Mukul Singh, J. Cambronero, Sumit Gulwani, Vu Le, Carina Negreanu, etc . - 【arXiv.org】


3D-GPT: Procedural 3D Modeling with Large Language Models2023.10.19

Chunyi Sun, Junlin Han, Weijian Deng, Xinlong Wang, Zishan Qin, etc . - 【arXiv.org】


The Foundation Model Transparency Index2023.10.19

Rishi Bommasani, Kevin Klyman, Shayne Longpre, Sayash Kapoor, Nestor Maslej, etc


Language Models Represent Space and Time2023.10.03

Wes Gurnee, Max Tegmark


ChatMap: Large Language Model Interaction with Cartographic Data2023.09.28

Eren Unlu . - 【arXiv.org】


Effective Distillation of Table-based Reasoning Ability from LLMs2023.09.22

Bohao Yang, Chen Tang, Kangning Zhao, Chenghao Xiao, Chenghua Lin . - 【arXiv.org】


Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions2023.09.18

Yevgen Chebotar, Q. Vuong, A. Irpan, Karol Hausman, F. Xia, etc . - 【arXiv.org】


Replacing softmax with ReLU in Vision Transformers2023.09.15

Mitchell Wortsman, Jaehoon Lee, Justin Gilmer, Simon Kornblith . - 【arXiv.org】


ZGaming: Zero-Latency 3D Cloud Gaming by Image Prediction2023.09.01

Jiangkai Wu, Yu Guan, Qi Mao, Yong Cui, Zongming Guo, etc . - 【Proceedings of the ACM SIGCOMM 2023 Conference】


Explaining Vision and Language through Graphs of Events in Space and Time2023.08.29

Mihai Masala, Nicolae Cudlenco, Traian Rebedea, Marius Leordeanu . - 【arXiv.org】


PE-MED: Prompt Enhancement for Interactive Medical Image Segmentation2023.08.26

Ao Chang, Xing Tao, Xin Yang, Yuhao Huang, Xinrui Zhou, etc . - 【arXiv.org】


SkipcrossNets: Adaptive Skip-cross Fusion for Road Detection2023.08.24

Xinyu Zhang, Yan Gong, Zhiwei Li, Xinchen Gao, Dafeng Jin, etc . - 【arXiv.org】


SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding2023.08.21

Tianyu Yu, Chengyue Jiang, Chao Lou, Shen Huang, Xiaobin Wang, etc . - 【arXiv.org】


Prompt Switch: Efficient CLIP Adaptation for Text-Video Retrieval2023.08.15

Chaorui Deng, Qi Chen, Pengda Qin, Dave Zhenyu Chen, Qi Wu . - 【arXiv.org】


VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use2023.08.12

Yonatan Bitton, Hritik Bansal, Jack Hessel, Rulin Shao, Wanrong Zhu, etc . - 【arXiv.org】


Accelerating LLM Inference with Staged Speculative Decoding2023.08.08

Benjamin Spector, Chris Re . - 【arXiv.org】


3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment2023.08.08

Ziyu Zhu, Xiaojian Ma, Yixin Chen, Zhidong Deng, Siyuan Huang, etc . - 【arXiv.org】


Food-500 Cap: A Fine-Grained Food Caption Benchmark for Evaluating Vision-Language Models2023.08.06

Zheng Ma, Mianzhi Pan, Wen-Lan Wu, Ka Leong Cheng, Jianbing Zhang, etc . - 【arXiv.org】


Pre-Trained Large Language Models for Industrial Control2023.08.06

Lei Song, Chuheng Zhang, Li Zhao, Jiang Bian . - 【arXiv.org】


Training Large-scale Foundation Models on Emerging AI Chips2023.08.04

Aashiq Muhamed, Christian Bock, R. Solanki, Youngsuk Park, Yida Wang, etc . - 【Knowledge Discovery and Data Mining】


FLatten Transformer: Vision Transformer using Focused Linear Attention2023.08.01

Dongchen Han, Xuran Pan, Yizeng Han, Shiji Song, Gao Huang . - 【arXiv.org】


Med-Flamingo: a Multimodal Medical Few-shot Learner2023.07.27

Michael Moor, Qian Huang, Shirley Wu, Michihiro Yasunaga, C. Zakka, etc . - 【arXiv.org】


Universal and Transferable Adversarial Attacks on Aligned Language Models2023.07.27

Andy Zou, Zifan Wang, J. Z. Kolter, Matt Fredrikson . - 【arXiv.org】


CARTIER: Cartographic lAnguage Reasoning Targeted at Instruction Execution for Robots2023.07.21

Nikhil Kakodkar, D. Rivkin, Bobak H. Baghi, F. Hogan, Gregory Dudek . - 【arXiv.org】


MasterKey: Automated Jailbreak Across Multiple Large Language Model Chatbots2023.07.16

Gelei Deng, Yi Liu, Yuekang Li, Kailong Wang, Ying Zhang, etc


Mitigating the Learning Bias towards Repetition by Self-Contrastive Training for Open-Ended Generation2023.07.04

Jian Guan, Minlie Huang . - 【Annual Meeting of the Association for Computational Linguistics】


Kosmos-2: Grounding Multimodal Large Language Models to the World2023.06.26

Zhiliang Peng, Wenhui Wang, Li Dong, Y. Hao, Shaohan Huang, etc . - 【arXiv.org】


AudioPaLM: A Large Language Model That Can Speak and Listen2023.06.22

Paul K. Rubenstein, Chulayuth Asawaroengchai, D. Nguyen, Ankur Bapna, Zalán Borsos, etc . - 【arXiv.org】


Unleashing the AI revolution: exploring the capabilities and challenges of large language models and text‐to‐image AI programs2023.06.17

A. Youssef . - 【Ultrasound in Obstetrics and Gynecology】


PIXIU: A Large Language Model, Instruction Data and Evaluation Benchmark for Finance2023.06.08

Qianqian Xie, Weiguang Han, Xiao Zhang, Yanzhao Lai, Min Peng, etc . - 【arXiv.org】


M3Exam: A Multilingual, Multimodal, Multilevel Benchmark for Examining Large Language Models2023.06.08

Wenxuan Zhang, Sharifah Mahani Aljunied, Chang Gao, Yew Ken Chia, Lidong Bing . - 【arXiv.org】


Simple and Controllable Music Generation2023.06.08

Jade Copet, Felix Kreuk, Itai Gat, Tal Remez, David Kant, etc . - 【arXiv.org】


LLM-Blender: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion2023.06.05

Dongfu Jiang, Xiang Ren, Bill Yuchen Lin . - 【arXiv.org】


DiffRate: Differentiable Compression Rate for Efficient Vision Transformers2023.05.29

Mengzhao Chen, Wenqi Shao, Peng Xu, Mingbao Lin, Kaipeng Zhang, etc . - 【arXiv.org】


Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers2023.05.25

Sotiris Anagnostidis, Dario Pavllo, L. Biggio, Lorenzo Noci, Aurélien Lucchi, etc . - 【arXiv.org】


On Degrees of Freedom in Defining and Testing Natural Language Understanding2023.05.24

Saku Sugawara, Shun Tsugita


Structural Ambiguity and its Disambiguation in Language Model Based Parsers: the Case of Dutch Clause Relativization2023.05.24

Gijs Wijnholds, Michael Moortgat


Mitigating Temporal Misalignment by Discarding Outdated Facts2023.05.24

Michael J.Q. Zhang, Eunsol Choi


On the Generalization of Diffusion Model2023.05.24

Mingyang Yi, Jiacheng Sun, Zhenguo Li


Vision + Language Applications: A Survey2023.05.24

Yutong Zhou, Nobutaka Shimada


Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective2023.05.24

Guhao Feng, Yuntian Gu, Bohang Zhang, Haotian Ye, Di He, etc


Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets2023.05.24

Brandon Smith, Miguel Farinha, Siobhan Mackenzie Hall, Hannah Rose Kirk, Aleksandar Shtedritski, etc


Unit-based Speech-to-Speech Translation Without Parallel Data2023.05.24

Anuj Diwan, Anirudh Srinivasan, David F. Harwath, Eunsol Choi


AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation2023.05.24

Rongjie Huang, Huadai Liu, Xize Cheng, Yi Ren, Linjun Li, etc


A Neural Space-Time Representation for Text-to-Image Personalization2023.05.24

Yuval Alaluf, Elad Richardson, Gal Metzer, Daniel Cohen-Or


Peek Across: Improving Multi-Document Modeling via Cross-Document Question-Answering2023.05.24

Avi Caciularu, Matthew E. Peters, Jacob Goldberger, Ido Dagan, Arman Cohan


SAMScore: A Semantic Structural Similarity Metric for Image Translation Evaluation2023.05.24

Yunxiang Li, Meixu Chen, Wenxuan Yang, Kai Wang, Jun Ma, etc


Context-Aware Transformer Pre-Training for Answer Sentence Selection2023.05.24

Luca Di Liello, Siddhant Garg, Alessandro Moschitti


A Tale of Two Features: Stable Diffusion Complements DINO for Zero-Shot Semantic Correspondence2023.05.24

Junyi Zhang, Charles Herrmann, Junhwa Hur, Luisa Polania Cabrera, Varun Jampani, etc


Visual Programming for Text-to-Image Generation and Evaluation2023.05.24

Jaemin Cho, Abhay Zala, Mohit Bansal


Towards Foundation Models for Relational Databases [Vision Paper]2023.05.24

Liane Vogel, Benjamin Hilprecht, Carsten Binnig


MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation2023.05.24

Marco Bellagente, Manuel Brack, Hannah Teufel, Felix Friedrich, Björn Deiseroth, etc


ViTMatte: Boosting Image Matting with Pretrained Plain Vision Transformers2023.05.24

Jingfeng Yao, Xinggang Wang, Shusheng Yang, Baoyuan Wang


Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model2023.05.24

Zirui Liu, Guanchu Wang, Shaochen Zhong, Zhaozhuo Xu, Daochen Zha, etc


LMs with a Voice: Spoken Language Modeling beyond Speech Tokens2023.05.24

Eliya Nachmani, Alon Levkovitch, Julian Salazar, Chulayuth Asawaroengchai, Soroosh Mariooryad, etc


Robust Classification via a Single Diffusion Model2023.05.24

Huanran Chen, Yinpeng Dong, Zhengyi Wang, Xiao Yang, Chengqi Duan, etc


Multi-modal Machine Learning for Vehicle Rating Predictions Using Image, Text, and Parametric Data2023.05.24

Hanqi Su, Binyang Song, Faez Ahmed


L-CAD: Language-based Colorization with Any-level Descriptions2023.05.24

Zheng Chang, Shuchen Weng, Pei Zhang, Yu Li, Si Li, etc


DiffBlender: Scalable and Composable Multimodal Text-to-Image Diffusion Models2023.05.24

Sungnyun Kim, Junsoo Lee, Kibeom Hong, Daesik Kim, Namhyuk Ahn


Pre-training Multi-party Dialogue Models with Latent Discourse Inference2023.05.24

Yiyang Li, Xinting Huang, Wei Bi, Hai Zhao


Deceptive-NeRF: Enhancing NeRF Reconstruction using Pseudo-Observations from Diffusion Models2023.05.24

Xinhang Liu, Shiu-hong Kao, Jiaben Chen, Yu-Wing Tai, Chi-Keung Tang


Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator2023.05.24

Ziwei He, Meng Yang, Minwei Feng, Jingcheng Yin, Xinbing Wang, etc


CSTS: Conditional Semantic Textual Similarity2023.05.24

Ameet Deshpande, Carlos E. Jimenez, Howard Chen, Vishvak S. Murahari, Victoria Graf, etc


STAR: Boosting Low-Resource Event Extraction by Structure-to-Text Data Generation with Large Language Models2023.05.24

Mingyu Derek Ma, Xiaoxuan Wang, Po-Nien Kung, P. Jeffrey Brantingham, Nanyun Peng, etc


Contrastive Learning of Sentence Embeddings from Scratch2023.05.24

Junlei Zhang, Zhenzhong Lan, Junxian He


Meta-Learning Online Adaptation of Language Models2023.05.24

Nathan J. Hu, Eric Mitchell, Christopher D. Manning, Chelsea Finn


Who Wrote this Code? Watermarking for Code Generation2023.05.24

Taehyun Lee, Seokhee Hong, Jaewoo Ahn, Ilgee Hong, Hwaran Lee, etc


Reasoning over Hierarchical Question Decomposition Tree for Explainable Question Answering2023.05.24

Jiajie Zhang, Shulin Cao, Tingjia Zhang, Xin Lv, Jiaxin Shi, etc


Understanding Arithmetic Reasoning in Language Models using Causal Mediation Analysis2023.05.24

Alessandro Stolfo, Yonatan Belinkov, Mrinmaya Sachan


Ranger: A Toolkit for Effect-Size Based Multi-Task Evaluation2023.05.24

Mete Sertkan, Sophia Althammer, Sebastian Hofstätter


Ghostbuster: Detecting Text Ghostwritten by Large Language Models2023.05.24

Vivek Verma, Eve Fleisig, Nicholas Tomlin, Dan Klein


Generating Faithful Synthetic Data with Large Language Models: A Case Study in Computational Social Science2023.05.24

Veniamin Veselovsky, Manoel Horta Ribeiro, Akhil Arora, Martin Josifoski, Ashton Anderson, etc


Active Learning for Natural Language Generation2023.05.24

Yotam Perlitz, Ariel Gera, Michal Shmueli-Scheuer, Dafna Sheinwald, Noam Slonim, etc


SmartTrim: Adaptive Tokens and Parameters Pruning for Efficient Vision-Language Models2023.05.24

Zekun Wang, Jingchang Chen, Wangchunshu Zhou, Ming Liu, Bing Qin


How to Distill your BERT: An Empirical Study on the Impact of Weight Initialisation and Distillation Objectives2023.05.24

Xinpeng Wang, Leonie Weissweiler, Hinrich Schütze, Barbara Plank


ChatAgri: Exploring Potentials of ChatGPT on Cross-linguistic Agricultural Text Classification2023.05.24

Biao Zhao, Weiqiang Jin, Javier Del Ser, Guang Yang


Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models2023.05.24

Gen Luo, Yiyi Zhou, Tianhe Ren, Shengxin Chen, Xiaoshuai Sun, etc


Measuring Faithful and Plausible Visual Grounding in VQA2023.05.24

Daniel Reich, Felix Putze, Tanja Schultz


Unlocking Temporal Question Answering for Large Language Models Using Code Execution2023.05.24

Xingxuan Li, Liying Cheng, Qingyu Tan, Hwee Tou Ng, Shafiq Joty, etc


Bactrian-X: A Multilingual Replicable Instruction-Following Model with Low-Rank Adaptation2023.05.24

Haonan Li, Fajri Koto, Minghao Wu, Alham Fikri Aji, Timothy Baldwin


Injecting Knowledge into Biomedical Pre-trained Models via Polymorphism and Synonymous Substitution2023.05.24

Hongbo Zhang, Xiang Wan, Benyou Wang


LLMDet: A Large Language Models Detection Tool2023.05.24

Kangxi Wu, Liang Pang, Huawei Shen, Xueqi Cheng, Tat-Seng Chua


The Art of SOCRATIC QUESTIONING: Zero-shot Multimodal Reasoning with Recursive Thinking and Self-Questioning2023.05.24

Jingyuan Qi, Zhiyang Xu, Ying Shen, Minqian Liu, Di Jin, etc


Reasoning with Language Model is Planning with World Model2023.05.24

Shibo Hao, Yi Gu, Haodi Ma, Joshua Jiahua Hong, Zhen Wang, etc


MuLER: Detailed and Scalable Reference-based Evaluation2023.05.24

Taelin Karidi, Leshem Choshen, Gal Patel, Omri Abend


Large Language Models are Effective Table-to-Text Generators, Evaluators, and Feedback Providers2023.05.24

Yilun Zhao, Haowei Zhang, Shengyun Si, Linyong Nan, Xiangru Tang, etc


Non-adversarial Robustness of Deep Learning Methods for Computer Vision2023.05.24

Gorana Gojić, Vladimir Vincan, Ognjen Kundačina, Dragiša Mišković, Dinu Dragan


Improving Factuality of Abstractive Summarization without Sacrificing Summary Quality2023.05.24

Tanay Dixit, Fei Wang, Muhao Chen


Sampling-based Uncertainty Estimation for an Instance Segmentation Network2023.05.24

Florian Heidecker, Ahmad El-Khateeb, Bernhard Sick


OverPrompt: Enhancing ChatGPT Capabilities through an Efficient In-Context Learning Approach2023.05.24

Jiazheng Li, Runcong Zhao, Yulan He, Lin Gui


MMNet: Multi-Mask Network for Referring Image Segmentation2023.05.24

Yichen Yan, Xingjian He, Wenxuan Wan, Jing Liu


Tricking LLMs into Disobedience: Understanding, Analyzing, and Preventing Jailbreaks2023.05.24

Abhinav Rao, Sachin Vashistha, Atharva Naik, Somak Aditya, Monojit Choudhury


Editing Commonsense Knowledge in GPT2023.05.24

Anshita Gupta, Debanjan Mondal, Akshay Krishna Sheshadri, Wenlong Zhao, Xiang Lorraine Li, etc


Focus Your Attention (with Adaptive IIR Filters)2023.05.24

Shahar Lutati, Itamar Zimerman, Lior Wolf


Cross-lingual Data Augmentation for Document-grounded Dialog Systems in Low Resource Languages2023.05.24

Qi Gou, Zehua Xia, Wen-Hau Du


Trade-Offs Between Fairness and Privacy in Language Modeling2023.05.24

Cleo Matzken, Steffen Eger, Ivan Habernal


Frugal Prompting for Dialog Models2023.05.24

Bishal Santra, Sakya Basak, Abhinandan De, Manish Gupta, Pawan Goyal


Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Planning2023.05.24

L. Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati


M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection2023.05.24

Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, etc


PIVOINE: Instruction Tuning for Open-world Information Extraction2023.05.24

Keming Lu, Xiaoman Pan, Kaiqiang Song, Hongming Zhang, Dong Yu, etc


Text encoders are performance bottlenecks in contrastive vision-language models2023.05.24

Amita Kamath, Jack Hessel, Kai-Wei Chang


HARD: Hard Augmentations for Robust Distillation2023.05.24

Arne F. Nix, Max F. Burg, Fabian H Sinz


Privacy Implications of Retrieval-Based Language Models2023.05.24

Yangsibo Huang, Samyak Gupta, Zexuan Zhong, Kai Li, Danqi Chen


Interpretable by Design Visual Question Answering2023.05.24

Xingyu Fu, Ben Zhou, Sihao Chen, Mark Yatskar, D. Roth


Leveraging GPT-4 for Automatic Translation Post-Editing2023.05.24

Vikas Raunak, Amr Sharaf, Hany Hassan Awadallah, Arul Menezes


ClusterLLM: Large Language Models as a Guide for Text Clustering2023.05.24

Yuwei Zhang, Zihan Wang, Jingbo Shang


CAR: Conceptualization-Augmented Reasoner for Zero-Shot Commonsense Question Answering2023.05.24

Weiqi Wang, Tianqing Fang, Wenxuan Ding, Baixuan Xu, Xin Liu, etc


Pre-RMSNorm and Pre-CRMSNorm Transformers: Equivalent and Efficient Pre-LN Transformers2023.05.24

Zixuan Jiang, Jiaqi Gu, Hanqing Zhu, D. Pan


SWAMP: Sparse Weight Averaging with Multiple Particles for Iterative Magnitude Pruning2023.05.24

Moonseok Choi, Hyungi Lee, Giung Nam, Juho Lee


Predicting Token Impact Towards Efficient Vision Transformer2023.05.24

Hong Wang, Su Yang, Xiaoke Huang, Weishan Zhang


ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation2023.05.24

Chenyang Le, Yao Qian, Long Zhou, Shujie Liu, Michael Zeng, etc


NuScenes-QA: A Multi-modal Visual Question Answering Benchmark for Autonomous Driving Scenario2023.05.24

Tianwen Qian, Jingjing Chen, Linhai Zhuo, Yang Jiao, Yu-Gang Jiang


Towards Few-shot Entity Recognition in Document Images: A Graph Neural Network Approach Robust to Image Manipulation2023.05.24

Prashant Krishnan, Zilong Wang, Yangkun Wang, Jingbo Shang


Pre-training Intent-Aware Encoders for Zero- and Few-Shot Intent Classification2023.05.24

Mujeen Sung, James Gung, Elman Mansimov, Nikolaos Pappas, Raphael Shu, etc


Machine Reading Comprehension using Case-based Reasoning2023.05.24

Dung Thai, Dhruv Agarwal, Mudit Chaudhary, R. Das, M. Zaheer, etc


Debiasing Made State-of-the-art: Revisiting the Simple Seed-based Weak Supervision for Text Classification2023.05.24

Chengyu Dong, Zihan Wang, Jingbo Shang


Text Conditional Alt-Text Generation for Twitter Images2023.05.24

Nikita Srivatsan, Sofia Samaniego, Omar Florez, Taylor Berg-Kirkpatrick


A Controllable QA-based Framework for Decontextualization2023.05.24

Benjamin Newman, Luca Soldaini, Raymond Fok, Arman Cohan, Kyle Lo


SSD-2: Scaling and Inference-time Fusion of Diffusion Language Models2023.05.24

Xiaochuang Han, Sachin Kumar, Yulia Tsvetkov, Marjan Ghazvininejad


Dual Path Transformer with Partition Attention2023.05.24

Zhengkai Jiang, Liang Liu, Jiangning Zhang, Yabiao Wang, Mingang Chen, etc


UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning2023.05.24

Ahmed Masry, Parsa Kavehzadeh, Xuan Long Do, Enamul Hoque, Shafiq Joty


SUVR: A Search-Based Approach to Unsupervised Visual Representation Learning2023.05.24

Yizhan Xu, Chih-Yao Chen, Cheng Li . - 【IEEE International Conference on Acoustics, Speech, and Signal Processing】


ChatFace: Chat-Guided Real Face Editing via Diffusion Latent Space Manipulation2023.05.24

Dongxu Yue, Qin Guo, Munan Ning, Jiaxi Cui, Yuesheng Zhu, etc


Trusting Your Evidence: Hallucinate Less with Context-aware Decoding2023.05.24

Weijia Shi, Xiaochuang Han, M. Lewis, Yulia Tsvetkov, Luke Zettlemoyer, etc


BinaryViT: Towards Efficient and Accurate Binary Vision Transformers2023.05.24

Junrui Xiao, Zhikai Li, Lianwei Yang, Qingyi Gu


In-Context Demonstration Selection with Cross Entropy Difference2023.05.24

Dan Iter, Reid Pryzant, Ruochen Xu, Shuohang Wang, Yang Liu, etc


GlobalBench: A Benchmark for Global Progress in Natural Language Processing2023.05.24

Y. Song, Catherine Cui, Simran Khanuja, Pengfei Liu, Fahim Faisal, etc


The student becomes the master: Matching GPT3 on Scientific Factual Error Correction2023.05.24

Dhananjay Ashok, Atharva Kulkarni, Hai Pham, Barnabás Póczos


PruMUX: Augmenting Data Multiplexing with Model Compression2023.05.24

Yushan Su, Vishvak S. Murahari, Karthik Narasimhan, Kai Li


Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts2023.05.24

Sheng Shen, Le Hou, Yanqi Zhou, Nan Du, Shayne Longpre, etc


AdvFunMatch: When Consistent Teaching Meets Adversarial Robustness2023.05.24

Zihui Wu, Haichang Gao, Bingqian Zhou, Ping Wang


SELFOOD: Self-Supervised Out-Of-Distribution Detection via Learning to Rank2023.05.24

Dheeraj Mekala, Adithya Samavedhi, Chengyu Dong, Jingbo Shang


A Causal View of Entity Bias in (Large) Language Models2023.05.24

Fei Wang, Wenjie Mo, Yiwei Wang, Wenxuan Zhou, Muhao Chen


Fusion-in-T5: Unifying Document Ranking Signals for Improved Information Retrieval2023.05.24

S. Yu, Chenghao Fan, Chenyan Xiong, David Jin, Zhiyuan Liu, etc


Emergent inabilities? Inverse scaling over the course of pretraining2023.05.24

James A. Michaelov, B. Bergen


Optimal Linear Subspace Search: Learning to Construct Fast and High-Quality Schedulers for Diffusion Models2023.05.24

Zhongjie Duan, Chengyu Wang, Cen Chen, Jun Huang, Weining Qian


T1: Scaling Diffusion Probabilistic Fields to High-Resolution on Unified Visual Modalities2023.05.24

Kangfu Mei, Mo Zhou, Vishal M. Patel


InteractiveIE: Towards Assessing the Strength of Human-AI Collaboration in Improving the Performance of Information Extraction2023.05.24

Ishani Mondal, Michelle Yuan, N Anandhavelu, Aparna Garimella, Francis Ferraro, etc


Dealing with Cross-Task Class Discrimination in Online Continual Learning2023.05.24

Yiduo Guo, Bing Liu, Dongyan Zhao


Denoising Bottleneck with Mutual Information Maximization for Video Multimodal Fusion2023.05.24

Shaoxiang Wu, Damai Dai, Ziwei Qin, Tianyu Liu, Binghuai Lin, etc


A Joint Time-frequency Domain Transformer for Multivariate Time Series Forecasting2023.05.24

Yushu Chen, Shengzhuo Liu, Jinzhe Yang, Hao Jing, Wenlai Zhao, etc


Meta-review Generation with Checklist-guided Iterative Introspection2023.05.24

Qi Zeng, Mankeerat S. Sidhu, Hou Pong Chan, Lu Wang, Heng Ji


Reinforcement Learning finetuned Vision-Code Transformer for UI-to-Code Generation2023.05.24

Davit Soselia, Khalid Saifullah, Tianyi Zhou


CMOT: Cross-modal Mixup via Optimal Transport for Speech Translation2023.05.24

Yan Zhou, Qingkai Fang, Yang Feng


KNN-LM Does Not Improve Open-ended Text Generation2023.05.24

Shufan Wang, Yixiao Song, Andrew Drozdov, Aparna Garimella, Varun Manjunatha, etc


Abductive Commonsense Reasoning Exploiting Mutually Exclusive Explanations2023.05.24

Wenting Zhao, Justin T. Chiu, Claire Cardie, Alexander M. Rush


Connecting the Dots: What Graph-Based Text Representations Work Best for Text Classification using Graph Neural Networks?2023.05.23

Margarita Bugueno, Gerard de Melo


Adversarial Defenses via Vector Quantization2023.05.23

Zhiyi Dong, Yongyi Mao


Language Models with Rationality2023.05.23

Nora Kassner, Oyvind Tafjord, Ashish Sabharwal, Kyle Richardson, Hinrich Schütze, etc


A Trip Towards Fairness: Bias and De-Biasing in Large Language Models2023.05.23

Leonardo Ranaldi, Elena Sofia Ruzzetti, Davide Venditti, Dario Onorati, Fabio Massimo Zanzotto


Question Answering as Programming for Solving Time-Sensitive Questions2023.05.23

Xinyu Zhu, Cheng Yang, Bei Chen, Siheng Li, Jian-Guang Lou, etc


PaD: Program-aided Distillation Specializes Large Models in Reasoning2023.05.23

Xuekai Zhu, Biqing Qi, Kaiyan Zhang, Xingwei Long, Bowen Zhou


Aligning Large Language Models through Synthetic Feedback2023.05.23

Sungdong Kim, Sanghwan Bae, Jamin Shin, Soyoung Kang, Donghyun Kwak, etc


LogicLLM: Exploring Self-supervised Logic-enhanced Training for Large Language Models2023.05.23

Fangkai Jiao, Zhiyang Teng, Shafiq Joty, Bosheng Ding, Aixin Sun, etc


ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models2023.05.23

Z. Chen, Kun Zhou, Beichen Zhang, Zheng Gong, Wayne Xin Zhao, etc


ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer2023.05.22

Huadai Liu, Rongjie Huang, Xuan Lin, Wenqiang Xu, Maozong Zheng, etc . - 【arXiv.org】


DADA: Dialect Adaptation via Dynamic Aggregation of Linguistic Rules2023.05.22

Yanchen Liu, William Held, Diyi Yang


Knowledge-Retrieval Task-Oriented Dialog Systems with Semi-Supervision2023.05.22

Yucheng Cai, Hong Liu, Zhijian Ou, Y. Huang, Junlan Feng


Sentence Representations via Gaussian Embedding2023.05.22

Shohei Yoda, Hayato Tsukagoshi, Ryohei Sasano, Koichi Takeda


LM-Switch: Lightweight Language Model Conditioning in Word Embedding Space2023.05.22

Chi Han, Jialiang Xu, Manling Li, Y. Fung, Chenkai Sun, etc


MacLaSa: Multi-Aspect Controllable Text Generation via Efficient Sampling from Compact Latent Space2023.05.22

Hanxing Ding, Liang Pang, Z. Wei, Huawei Shen, Xueqi Cheng, etc


Enhancing Cross-lingual Natural Language Inference by Soft Prompting with Multilingual Verbalizer2023.05.22

Shuang Li, Xuming Hu, Aiwei Liu, Yawen Yang, Fukun Ma, etc


A Benchmark on Extremely Weakly Supervised Text Classification: Reconcile Seed Matching and Prompting Approaches2023.05.22

Zihan Wang, Tianle Wang, Dheeraj Mekala, Jingbo Shang


Keeping Up with the Language Models: Robustness-Bias Interplay in NLI Data and Models2023.05.22

Ioana Baldini, Chhavi Yadav, Payel Das, K. Varshney


To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis2023.05.22

Fuzhao Xue, Yao Fu, Wangchunshu Zhou, Zangwei Zheng, Yang You


Multi-Task Instruction Tuning of LLaMa for Specific Scenarios: A Preliminary Study on Writing Assistance2023.05.22

Yue Zhang, Leyang Cui, Deng Cai, Xinting Huang, Tao Fang, etc


InheritSumm: A General, Versatile and Compact Summarizer by Distilling from GPT2023.05.22

Yichong Xu, Ruochen Xu, Dan Iter, Yang Liu, Shuo Wang, etc


Making Language Models Better Tool Learners with Execution Feedback2023.05.22

Shuofei Qiao, Honghao Gui, Huajun Chen, Ningyu Zhang


GPT-SW3: An Autoregressive Language Model for the Nordic Languages2023.05.22

Ariel Ekgren, Amaru Cuba Gyllensten, F. Stollenwerk, Joey Ohman, Tim Isbister, etc


ExplainCPE: A Free-text Explanation Benchmark of Chinese Pharmacist Examination2023.05.22

Dongfang Li, Jindi Yu, Baotian Hu, Zhenran Xu, Min Zhang


Infor-Coef: Information Bottleneck-based Dynamic Token Downsampling for Compact and Efficient language model2023.05.21

Wenxin Tan


Contrastive Learning with Logic-driven Data Augmentation for Logical Reasoning over Text2023.05.21

Qiming Bao, Alex Yuxuan Peng, Zhenyun Deng, Wanjun Zhong, Neset Tan, etc


Retrieving Texts based on Abstract Descriptions2023.05.21

Shauli Ravfogel, Valentina Pyatkin, Amir D. N. Cohen, Avshalom Manevich, Yoav Goldberg


Pruning Pre-trained Language Models with Principled Importance and Self-regularization2023.05.21

Siyu Ren, Kenny Q. Zhu


Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers2023.05.21

Linyuan Gong, Chenyan Xiong, Xiaodong Liu, Payal Bajaj, Yiqing Xie, etc


Movie101: A New Movie Understanding Benchmark2023.05.20

Zihao Yue, Qi Zhang, Anwen Hu, Liang Zhang, Ziheng Wang, etc


The Scope of ChatGPT in Software Engineering: A Thorough Investigation2023.05.20

Wei Ma, Shangqing Liu, Wenhan Wang, Qiang Hu, Ye Liu, etc


Pointwise Mutual Information Based Metric and Decoding Strategy for Faithful Generation in Document Grounded Dialogs2023.05.20

Yatin Nandwani, Vineet Kumar, Dinesh Raghu, Sachindra Joshi, L. Lastras


Logic-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning2023.05.20

Liangming Pan, Alon Albalak, Xinyi Wang, William Yang Wang


LogiCoT: Logical Chain-of-Thought Instruction-Tuning Data Collection with GPT-4 2023.05.20

Hanmeng Liu, Zhiyang Teng, Leyang Cui, Chaoli Zhang, Qiji Zhou, etc


Late-Constraint Diffusion Guidance for Controllable Image Synthesis2023.05.19

Chang Liu, Dong Liu . - 【arXiv.org】


PASTS: Progress-Aware Spatio-Temporal Transformer Speaker For Vision-and-Language Navigation2023.05.19

Liuyi Wang, Chengju Liu, Zongtao He, Shu Li, Qingqing Yan, etc . - 【arXiv.org】


Self-QA: Unsupervised Knowledge Guided Language Model Alignment2023.05.19

Xuanyu Zhang, Qing Yang


SelfzCoT: a Self-Prompt Zero-shot CoT from Semantic-level to Code-level for a Better Utilization of LLMs2023.05.19

IokTong Lei, ZhiDong Deng . - 【arXiv.org】


Self-Agreement: A Framework for Fine-tuning Language Models to Find Agreement among Diverse Opinions2023.05.19

Shiyao Ding, Takayuki Ito . - 【arXiv.org】


MaGIC: Multi-modality Guided Image Completion2023.05.19

Yongsheng Yu, Hao Wang, Tiejian Luo, Hengrui Fan, Libo Zhang . - 【arXiv.org】


Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization2023.05.19

Mengqi Huang, Zhendong Mao, Zhuowei Chen, Yongdong Zhang . - 【arXiv.org】


BOLT: Fast Energy-based Controlled Text Generation with Tunable Biases2023.05.19

Xin Liu, Muhammad Khalifa, Lu Wang


STOAT: Structured Data to Analytical Text With Controls2023.05.19

Deepanway Ghosal, Preksha Nema, A. Raghuveer . - 【arXiv.org】


Decouple knowledge from paramters for plug-and-play language modeling2023.05.19

Xin Cheng, Yankai Lin, Xiuying Chen, Dongyan Zhao, Rui Yan . - 【arXiv.org】


Enhancing Personalized Dialogue Generation with Contrastive Latent Variables: Combining Sparse and Dense Persona2023.05.19

Yihong Tang, Bo Wang, Miao Fang, Dongming Zhao, Kun Huang, etc . - 【arXiv.org】


XuanYuan 2.0: A Large Chinese Financial Chat Model with Hundreds of Billions Parameters2023.05.19

Xuanyu Zhang, Qing Yang, Dongliang Xu


Controlling the Extraction of Memorized Data from Large Language Models via Prompt-Tuning2023.05.19

Mustafa Safa Ozdayi, Charith S. Peris, Jack G. M. FitzGerald, Christophe Dupuy, Jimit Majmudar, etc . - 【arXiv.org】


RCOT: Detecting and Rectifying Factual Inconsistency in Reasoning by Reversing Chain-of-Thought2023.05.19

Tianci Xue, Ziqi Wang, Zhenhailong Wang, Chi Han, Pengfei Yu, etc . - 【arXiv.org】


LLM Itself Can Read and Generate CXR Images2023.05.19

Suhyeon Lee, Won Jun Kim, Jong-Chul Ye . - 【arXiv.org】


Post Hoc Explanations of Language Models Can Improve Language Models2023.05.19

Satyapriya Krishna, Jiaqi Ma, Dylan Slack, Asma Ghandeharioun, etc . - 【arXiv.org】


Federated Foundation Models: Privacy-Preserving and Collaborative Learning for Large Models2023.05.19

Sixing Yu, J. P. Muñoz, A. Jannesari . - 【arXiv.org】


Do Models Really Learn to Follow Instructions? An Empirical Study of Instruction Tuning2023.05.19

Po-Nien Kung, Nanyun Peng . - 【arXiv.org】


AutoTrial: Prompting Language Models for Clinical Trial Design2023.05.19

Zifeng Wang, Cao Xiao, Jimeng Sun . - 【arXiv.org】


Democratized Diffusion Language Model2023.05.18

Nikita Balagansky, Daniil Gavrilov . - 【arXiv.org】


Ahead-of-Time P-Tuning2023.05.18

Daniil Gavrilov, Nikita Balagansky . - 【arXiv.org】


VideoFactory: Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation2023.05.18

Wenjing Wang, Huan Yang, Zixi Tuo, Huiguo He, Junchen Zhu, etc . - 【arXiv.org】


TextDiffuser: Diffusion Models as Text Painters2023.05.18

Jingye Chen, Yupan Huang, Tengchao Lv, Lei Cui, Qifeng Chen, etc . - 【arXiv.org】


LDM3D: Latent Diffusion Model for 3D 2023.05.18

Gabriela Ben Melech Stan, Diana Wofk, Scottie Fox, Alex Redden, Will Saxton, etc . - 【arXiv.org】


Catch-Up Distillation: You Only Need to Train Once for Accelerating Sampling2023.05.18

Shitong Shao, Xu Dai, Shouyi Yin, Lujun Li, Huanran Chen, etc . - 【arXiv.org】


Adversarial Amendment is the Only Force Capable of Transforming an Enemy into a Friend2023.05.18

Chong Yu, Tao Chen, Zhongxue Gan . - 【arXiv.org】


Boost Vision Transformer with GPU-Friendly Sparsity and Quantization2023.05.18

Chong Yu, Tao Chen, Zhongxue Gan, Jiayuan Fan . - 【arXiv.org】


Zero-Day Backdoor Attack against Text-to-Image Diffusion Models via Personalization2023.05.18

Yihao Huang, Qing Guo, Felix Juefei-Xu . - 【arXiv.org】


Tuned Contrastive Learning2023.05.18

Chaitanya Animesh, Manmohan Chandraker . - 【arXiv.org】


Content-based Unrestricted Adversarial Attack2023.05.18

Zhaoyu Chen, Bo Li, Shuang Wu, Kaixun Jiang, Shouhong Ding, etc . - 【arXiv.org】


SimOAP: Improve Coherence and Consistency in Persona-based Dialogue Generation via Over-sampling and Post-evaluation2023.05.18

Junkai Zhou, Liang Pang, Huawei Shen, Xueqi Cheng . - 【arXiv.org】


How does the task complexity of masked pretraining objectives affect downstream performance?2023.05.18

Atsuki Yamaguchi, Hiroaki Ozaki, Terufumi Morishita, Gaku Morio, Yasuhiro Sogawa . - 【arXiv.org】


Ditto: A Simple and Efficient Approach to Improve Sentence Embeddings2023.05.18

Qian Chen, Wen Wang, Qinglin Zhang, Siqi Zheng, Chong Deng, etc . - 【arXiv.org】


ReGen: Zero-Shot Text Classification via Training Data Generation with Progressive Dense Retrieval2023.05.18

Yue Yu, Yuchen Zhuang, Rongzhi Zhang, Yu Meng, Jiaming Shen, etc . - 【arXiv.org】


LIMA: Less Is More for Alignment2023.05.18

Chunting Zhou, Pengfei Liu, Puxin Xu, Srini Iyer, Jiao Sun, etc . - 【arXiv.org】


Efficient Prompting via Dynamic In-Context Learning2023.05.18

Wangchunshu Zhou, Yuchen Jiang, Ryan Cotterell, Mrinmaya Sachan . - 【arXiv.org】


SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities2023.05.18

Dong Zhang, Shimin Li, Xin Zhang, Jun Zhan, P. Wang, etc . - 【arXiv.org】


The Web Can Be Your Oyster for Improving Large Language Models2023.05.18

Junyi Li, Tianyi Tang, Wayne Xin Zhao, Jingyuan Wang, J. Nie, etc . - 【arXiv.org】


TOME: A Two-stage Approach for Model-based Retrieval2023.05.18

Ruiyang Ren, Wayne Xin Zhao, J. Liu, Huaqin Wu, Ji-rong Wen, etc . - 【arXiv.org】


Inverted Non-maximum Suppression for more Accurate and Neater Face Detection2023.05.17

Lian Liu, Liguo Zhou . - 【arXiv.org】


Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models2023.05.17

Songwei Ge, Seungjun Nah, Guilin Liu, Tyler Poon, Andrew Tao, etc . - 【arXiv.org】


G-Adapter: Towards Structure-Aware Parameter-Efficient Transfer Learning for Graph Transformer Networks2023.05.17

Anchun Gui, Jinqiang Ye, Han Xiao . - 【arXiv.org】


When Gradient Descent Meets Derivative-Free Optimization: A Match Made in Black-Box Scenario2023.05.17

Chengcheng Han, Liqing Cui, Renyu Zhu, J. Wang, Nuo Chen, etc . - 【arXiv.org】


AD-KD: Attribution-Driven Knowledge Distillation for Language Model Compression2023.05.17

Siyue Wu, Hongzhan Chen, Xiaojun Quan, Qifan Wang, Rui Wang . - 【arXiv.org】


CooK: Empowering General-Purpose Language Models with Modular and Collaborative Knowledge2023.05.17

Shangbin Feng, Weijia Shi, Yuyang Bai, Vidhisha Balachandran, Tianxing He, etc . - 【arXiv.org】


SLiC-HF: Sequence Likelihood Calibration with Human Feedback2023.05.17

Yao Zhao, Rishabh Joshi, Tianqi Liu, Misha Khalman, Mohammad Saleh, etc . - 【arXiv.org】


LeTI: Learning to Generate from Textual Interactions2023.05.17

Xingyao Wang, Hao Peng, Reyhaneh Jabbarvand, Heng Ji . - 【arXiv.org】


M3KE: A Massive Multi-Level Multi-Subject Knowledge Evaluation Benchmark for Chinese Large Language Models2023.05.17

Chuang Liu, Renren Jin, Yuqi Ren, Linhao Yu, Tianyu Dong, etc . - 【arXiv.org】


Reprompting: Automated Chain-of-Thought Prompt Inference Through Gibbs Sampling2023.05.17

Weijia Xu, Andrzej Banburski-Fahey, N. Jojic . - 【arXiv.org】


On Dataset Transferability in Active Learning for Transformers2023.05.16

Fran Jelenić, Josip Jukić, Nina Drobac, Jan Šnajder . - 【arXiv.org】


X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages2023.05.07

Feilong Chen, Minglun Han, Haozhi Zhao, Qingyang Zhang, Jing Shi, etc . - 【arXiv.org】


Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision2023.05.04

Zhiqing Sun, Yikang Shen, Qinhong Zhou, Hongxin Zhang, Zhenfang Chen, etc . - 【arXiv.org】


AutoML-GPT: Automatic Machine Learning with GPT2023.05.04

Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mi Zhou . - 【arXiv.org】


Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes2023.05.03

Cheng-Yu Hsieh, Chun-Liang Li, Chih-Kuan Yeh, Hootan Nakhost, Yasuhisa Fujii, etc . - 【arXiv.org】


Unlimiformer: Long-Range Transformers with Unlimited Length Input2023.05.02

Amanda Bertsch, Uri Alon, Graham Neubig, Matthew R. Gormley . - 【arXiv.org】


Transfer Visual Prompt Generator across LLMs2023.05.02

Ao Zhang, Hao Fei, Yuan Yao, Wei Ji, Li Li, etc . - 【arXiv.org】


Improving Grounded Language Understanding in a Collaborative Environment by Interacting with Agents Through Help Feedback2023.04.21

Nikhil Mehta, Milagro Teruel, Patricio Figueroa Sanz, Xinwei Deng, A. Awadallah, etc


Segment Anything Model for Medical Image Analysis: an Experimental Study2023.04.20

Maciej A. Mazurowski, Haoyu Dong, Han Gu, Jichen Yang, N. Konz, etc


Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models2023.04.19

Pan Lu, Baolin Peng, Hao Cheng, Michel Galley, Kai-Wei Chang, etc


Accuracy of Segment-Anything Model (SAM) in medical image segmentation tasks2023.04.18

Sheng He, Rina Bao, Jingpeng Li, P. Grant, Yangming Ou


When SAM Meets Medical Images: An Investigation of Segment Anything Model (SAM) on Multi-phase Liver Tumor Segmentation2023.04.17

Chuanfei Hu, Xinde Li


The Segment Anything foundation model achieves favorable brain tumor autosegmentation accuracy on MRI to support radiotherapy treatment planning2023.04.16

F. Putz, Johanna Grigo, T. Weissmann, P. Schubert, D. Hoefler, etc


Deep learning universal crater detection using Segment Anything Model (SAM)2023.04.16

I. Giannakis, A. Bhardwaj, L. Sam, G. Leontidis . - 【arXiv.org】


Segment Anything Model (SAM) for Digital Pathology: Assess Zero-shot Segmentation on Whole Slide Imaging2023.04.09

Ruining Deng, C. Cui, Quan Liu, Tianyuan Yao, L. W. Remedios, etc . - 【arXiv.org】


TagGPT: Large Language Models are Zero-shot Multimodal Taggers2023.04.06


Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling2023.04.03

Stella Biderman, Hailey Schoelkopf, Quentin Anthony, Herbie Bradley, Kyle O'Brien, etc


BloombergGPT: A Large Language Model for Finance2023.03.30

Shijie Wu, Ozan Irsoy, Steven Lu, Vadim Dabravolski, Mark Dredze, etc


Scaling Expert Language Models with Unsupervised Domain Discovery2023.03.24

Suchin Gururangan, Margaret Li, Mike Lewis, Weijia Shi, Tim Althoff, etc


Sparks of Artificial General Intelligence: Early experiments with GPT-4 2023.03.22

Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, etc


CoLT5: Faster Long-Range Transformers with Conditional Computation2023.03.17

J. Ainslie, Tao Lei, Michiel de Jong, Santiago Ontañón, Siddhartha Brahma, etc . - 【ArXiv】


Meet in the Middle: A New Pre-training Paradigm2023.03.13

A. Nguyen, Nikos Karampatziakis, Weizhu Chen . - 【ArXiv】


High-throughput Generative Inference of Large Language Models with a Single GPU2023.03.13

Ying Sheng, Lianmin Zheng, Binhang Yuan, Zhuohan Li, Max Ryabinin, etc . - 【ArXiv】


Retinexformer: One-stage Retinex-based Transformer for Low-light Image Enhancement2023.03.12

Yuanhao Cai, Hao Bian, Jing Lin, Haoqian Wang, R. Timofte, etc . - 【arXiv.org】


Stabilizing Transformer Training by Preventing Attention Entropy Collapse2023.03.11

Shuangfei Zhai, T. Likhomanenko, Etai Littwin, Dan Busbridge, Jason Ramapuram, etc . - 【ArXiv】


An Overview on Language Models: Recent Developments and Outlook2023.03.10

Chen Wei, Yun Cheng Wang, Bin Wang, C.-C. Jay Kuo . - 【ArXiv】


Foundation Models for Decision Making: Problems, Methods, and Opportunities2023.03.07

Sherry Yang, Ofir Nachum, Yilun Du, Jason Wei, P. Abbeel, etc . - 【ArXiv】


How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding2023.03.07

Yuchen Li, Yuan-Fang Li, Andrej Risteski . - 【ArXiv】


LLaMA: Open and Efficient Foundation Language Models2023.02.27

Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, etc . - 【ArXiv】


Complex QA and language models hybrid architectures, Survey2023.02.17

Xavier Daull, P. Bellot, Emmanuel Bruno, Vincent Martin, Elisabeth Murisasco . - 【arXiv.org】


OVO: One-shot Vision Transformer Search with Online distillation2022.12.28

Zimian Wei, H. Pan, Xin-Yi Niu, Dongsheng Li . - 【arXiv.org】


Self-Instruct: Aligning Language Model with Self Generated Instructions2022.12.20

Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, etc . - 【ArXiv】


Solving Math Word Problem via Cooperative Reasoning induced Language Models2022.10.28

Xinyu Zhu, Junjie Wang, Lin Zhang, Yuxiang Zhang, Ruyi Gan, etc . - 【ArXiv】


Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback2022.04.12

Yuntao Bai, Andy Jones, Kamal Ndousse, Amanda Askell, Anna Chen, etc . - 【ArXiv】


PaLM: Scaling Language Modeling with Pathways2022.04.05

Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, etc . - 【ArXiv】


Training language models to follow instructions with human feedback2022.03.04

Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, etc . - 【ArXiv】


LoRA: Low-Rank Adaptation of Large Language Models2021.06.17

Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, etc . - 【International Conference on Learning Representations】


Transformers in Vision: A Survey2021.01.04

Salman Hameed Khan, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, F. Khan, etc . - 【ACM Computing Surveys】


Unsupervised embedding of trajectories captures the latent structure of mobility2020.12.04

Dakota S. Murray, Jisung Yoon, Sadamori Kojaku, R. Costas, Woo-Sung Jung, etc . - 【arXiv.org】


Multi-News: A Large-Scale Multi-Document Summarization Dataset and Abstractive Hierarchical Model2019.06.04

Alexander R. Fabbri, Irene Li, Tianwei She, Suyi Li, Dragomir R. Radev . - 【Annual Meeting of the Association for Computational Linguistics】


Social Boundaries of Appropriate Speech in HCI: A Politeness Perspective2018.07.01

L. Clark


Steering the conversation: A linguistic exploration of natural language interactions with a digital assistant during simulated driving.2017.09.01

D. Large, L. Clark, Annie Quandt, G. Burnett, L. Skrypchuk . - 【Applied Ergonomics】


Teaching Machines to Read and Comprehend2015.06.10

K. Hermann, Tomáš Kočiský, Edward Grefenstette, Lasse Espeholt, Will Kay, etc . - 【NIPS】


From discourse structures to text summaries

D. Marcu . - 【Workshop On Intelligent Scalable Text Summarization】


Language Models are Unsupervised Multitask Learners

Alec Radford, Jeff Wu, Rewon Child, D. Luan, Dario Amodei, etc


SmartMoE: Efficiently Training Sparsely-Activated Models through Combining Offline and Online Parallelization

Mingshu Zhai, Jiaao He, Zixuan Ma, Zan Zong, Runqing Zhang, etc . - 【USENIX Annual Technical Conference】


LightVLP: A Lightweight Vision-Language Pre-training via Gated Interactive Masked AutoEncoders

Xingwu Sun, Zhen Yang, Ruobing Xie, Fengzong Lian, Zhanhui Kang, etc . - 【International Conference on Language Resources and Evaluation】


MetaCheckGPT - A Multi-task Hallucination Detector Using LLM Uncertainty and Meta-models

Rahul Mehta, Andrew Hoblitzell, Jack O’keefe, Hyeju Jang, Vasudeva Varma . - 【arXiv.org】


Entropy-Regularized Token-Level Policy Optimization for Large Language Models

Muning Wen, Cheng Deng, Jun Wang, Weinan Zhang, Ying Wen . - 【arXiv.org】


L4Q: Parameter Efficient Quantization-Aware Training on Large Language Models via LoRA-wise LSQ

Hyesung Jeon, Yulhwa Kim, Jae-Joon Kim . - 【arXiv.org】


PointNAT: Large-Scale Point Cloud Semantic Segmentation via Neighbor Aggregation With Transformer

Ziyin Zeng, Huan Qiu, Jian Zhou, Z. Dong, Jinsheng Xiao, etc . - 【IEEE Transactions on Geoscience and Remote Sensing】
