GitHub - JiuTian-VL/JiuTian-FALCON: Official repository of "FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers"

FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers

Renshan Zhang¹, Rui Shao¹†, Gongwei Chen¹, Weili Guan¹, Kaiwen Zhou², Liqiang Nie¹†

¹Harbin Institute of Technology, Shenzhen
²Huawei Noah's Ark Lab
†Corresponding author

If you find this work useful for your research, please kindly cite our paper and star our repo.

Updates

[01/2025] arXiv paper released.

Introduction

This is the GitHub repository for FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers. In this work, we propose the FALCON model, which introduces a novel visual register technique to simultaneously address visual redundancy and fragmentation in the high-resolution visual encoding of multimodal large language models (MLLMs).

The framework of the proposed FALCON model:

🔥 Details will be released. Stay tuned.

Citation

If you find this work useful for your research, please kindly cite our paper:

@misc{zhang2025falcon,
      title={FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers}, 
      author={Renshan Zhang and Rui Shao and Gongwei Chen and Kaiwen Zhou and Weili Guan and Liqiang Nie},
      year={2025},
      eprint={2501.16297},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2501.16297}, 
}
