
Issues with Running MaskCLIP++ Demo #3

Closed · oneHFR opened this issue Jan 7, 2025 · 4 comments


oneHFR commented Jan 7, 2025

Hi! I have followed the official installation instructions to set up the environment and downloaded all the necessary weights.

Environment Specifications:

  • torch.__version__: 2.5.1+cu124
  • torch.version.cuda: 12.4
  • torch.backends.cudnn.version: 90100
  • Python: 3.10.16
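
These values can be checked with a quick snippet like the following (a minimal sketch; the commented outputs are the values reported above):

import torch

print(torch.__version__)                # 2.5.1+cu124
print(torch.version.cuda)               # 12.4
print(torch.backends.cudnn.version())   # 90100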

Installation Steps Followed:

  1. Installation: Completed by following the official installation guide.

  2. Preparations:

    • Datasets: Prepared as per [Preparing Datasets for MaskCLIP++](datasets/README.md).
    • Pretrained CLIP Models: Downloaded automatically from Hugging Face.
    • Mask Generators: Downloaded the required mask generator models manually using the provided URLs and placed them in the specified paths.
    • Fine-tuned Weights: Downloaded the checkpoint fine-tuned on the COCO-Stuff dataset.
  3. Demo Usage:

    config="configs/coco-stuff/eva-clip-vit-l-14-336/maft-l/maskclippp_coco-stuff_eva-clip-vit-l-14-336_wtext_maft-l_ens.yaml"
    ckpt="output/ckpts/maskclippp/maskclippp_coco-stuff_eva-clip-vit-l-14-336_wtext.pth"
    python demo/app.py \
        --config-file $config \
        --opts \
        MODEL.WEIGHTS $ckpt \
        MODEL.MASK_FORMER.TEST.PANOPTIC_ON False \
        MODEL.MASK_FORMER.TEST.INSTANCE_ON False \
        MODEL.MASK_FORMER.TEST.SEMANTIC_ON True

    Here is the exact command I ran:

    python demo/app.py --config-file configs/coco-stuff/eva-clip-vit-l-14-336/maft-l/maskclippp_coco-stuff_eva-clip-vit-l-14-336_wtext_maft-l_ens.yaml --opts MODEL.WEIGHTS output/ckpts/maskclippp/maskclippp_coco-stuff_eva-clip-vit-l-14-336_wtext.pth MODEL.MASK_FORMER.TEST.PANOPTIC_ON False MODEL.MASK_FORMER.TEST.INSTANCE_ON False MODEL.MASK_FORMER.TEST.SEMANTIC_ON True

Issue:

During execution, I encountered several issues, mostly related to library version mismatches. I was able to resolve some of them with minor code modifications. For example, I removed the return type annotation as shown below:

# Original function definition with a return type annotation
def extract_features(self, inputs: PaddedList) -> Dict[str, Tensor]:

# Modified function definition with the return type annotation removed
def extract_features(self, inputs: PaddedList):
    if self._finetune_none:
        self.eval()
        with torch.no_grad():
            return self._extract_features(inputs.images)
    else:
        return self._extract_features(inputs.images)

However, I am now stuck at the following error:

[01/08 03:13:38 detectron2]: Predefined Classes: ['cocostuff']
User Classes: []
Available features: dict_keys(['stage1_f', 'stage2_f', 'stage3_f', 'stage4_f', 'input_f', 'test_t_embs_f', 'num_synonyms'])
Required features: ['stage2_f', 'stage3_f', 'stage4_f']

/root/miniconda3/envs/mcp/lib/python3.10/site-packages/torch/functional.py:534: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3595.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]

Traceback (most recent call last):
  File "/root/autodl-tmp/MaskCLIPpp/demo/app.py", line 90, in process_image
    predictions, visualized_output = meta_demo.run_on_image(bgr_image)
  File "/root/autodl-tmp/MaskCLIPpp/demo/predictor.py", line 220, in run_on_image
    predictions = self.predictor(image)
  File "/root/autodl-tmp/MaskCLIPpp/demo/detectron2/detectron2/engine/defaults.py", line 351, in __call__
    predictions = self.model([inputs])[0]
  File "/root/autodl-tmp/MaskCLIPpp/demo/../maskclippp/maskclippp.py", line 623, in forward
    encode_dict.update(self.visual_encoder(images, masks=valid_masks_by_imgs))  # List(B) of Q,D
  File "/root/autodl-tmp/MaskCLIPpp/demo/../maskclippp/vencoder/eva_clip_vit.py", line 382, in extract_features
    return self._extract_features(inputs.images, masks)
  File "/root/autodl-tmp/MaskCLIPpp/demo/../maskclippp/vencoder/eva_clip_vit.py", line 338, in _extract_features
    attn_biases, areas = self._masks_to_attn_biases(masks, curr_grid_size)
  File "/root/autodl-tmp/MaskCLIPpp/demo/../maskclippp/vencoder/eva_clip_vit.py", line 310, in _masks_to_attn_biases
    attn_bias[:, :, :Q, -hw:].copy_(down_mask)
RuntimeError: output with shape [3, 16, 1, 1064] doesn't match the broadcast shape [3, 16, 3, 1064]

Current Progress:

I have modified parts of the code to address compatibility issues. Currently, the error occurs at the following line:

attn_bias[:, :, :Q, -hw:].copy_(down_mask)
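
The mismatch can be reproduced in isolation using only the shapes reported in the traceback (a minimal sketch; variable names mirror the failing line but are otherwise illustrative):

import torch

# Shapes taken from the RuntimeError above: the destination slice has room for
# only Q=1 queries, while the source mask carries Q=3, so copy_ cannot broadcast.
attn_bias = torch.zeros(3, 16, 1, 1064)   # destination with Q = 1
down_mask = torch.zeros(3, 16, 3, 1064)   # downsampled masks with Q = 3
Q, hw = down_mask.shape[2], down_mask.shape[3]
attn_bias[:, :, :Q, -hw:].copy_(down_mask)
# RuntimeError: output with shape [3, 16, 1, 1064] doesn't match the broadcast shape [3, 16, 3, 1064]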

Question:

Is completing the above modifications actually sufficient to run the demo, or are more configurations needed? Running it with the command shown above leads to a series of errors; each one makes me suspect that some step is missing, e.g., a mask argument that is never passed in, or feature dimensions that do not match and cannot be propagated further.

Could you please advise on what additional installations or configurations are required to successfully run the demo?



@ashun989 (Collaborator)

@oneHFR,

Thank you for providing such a detailed description of the issue. My current suggestions are as follows:

  1. There are no issues with your installation steps or preparations. Per the installation guide, the PyTorch environment I am using is 2.3.1+cu121. I haven't verified compatibility with 2.5.1+cu124, so I recommend reinstalling the environment accordingly.

  2. The error message indicates that the number of candidate masks generated during inference is Q=1 or Q=3, whereas under a correct configuration this value is typically Q=100 or Q=250. Since you have modified the code, I cannot definitively pinpoint where the issue lies.

  3. I've updated the code to simplify the launch command for the Gradio demo. If you reinstall the environment, run the latest code, and still encounter the issue, feel free to reach out here or via email with your inputs and outputs.


oneHFR commented Jan 16, 2025

Thank you so much for your very effective and helpful response. I have updated my environment based on your suggestions, and the details are as follows:

  • PyTorch: 2.3.0
  • CUDA: 12.1
  • Python: 3.12 (Ubuntu 22.04)

After updating my environment, I also pulled the latest code from the GitHub repository, and I'm happy to report that the demo now runs successfully! I want to express my appreciation for the outstanding work you and the team have done. The project is impressive, and it's exciting to see such an innovative and well-implemented solution!

However, I have one more question regarding the PyTorch version. Must the environment use PyTorch 2.3.0 with CUDA 12.1 (torch==2.3.0+cu121), or would other versions also work? I would like to try your design in my current experiments, but my existing environment uses:

  • PyTorch: 1.12.1
  • CUDA: 11.3

Would running MaskCLIPpp in this older environment completely break compatibility, or is there some flexibility in using the setup I currently have?

Looking forward to your insights, and once again, great job on the project! 🎉👍👏

@ashun989 (Collaborator)

@oneHFR
I'm delighted to hear that our work may be helpful for your research.
Here are the answers regarding the environment issues:

  1. The main reason the code requires PyTorch > 2.0 is that, when I modified the eva_clip code, I used scaled_dot_product_attention (sdpa) to replace the dependency on xformers.
  2. In the latest code, I added a conditional check: when the PyTorch version is below 2.0, a vanilla attention implementation is used instead (see the sketch after this list). I have already tested the training and inference processes with PyTorch 1.12.1 + CUDA 11.3, and so far no issues have been found.
  3. Since dependent libraries may not remain forward-compatible as they evolve, we prefer to pin explicit versions in the installation instructions. We hope you can understand this.
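
A version guard of the kind described in point 2 might look like the following (an illustrative sketch, not the repository's actual code; the function name and signature are made up for the example):

import torch.nn.functional as F

# Use the fused sdpa kernel when available (PyTorch >= 2.0); otherwise fall
# back to a vanilla attention implementation that runs on e.g. PyTorch 1.12.1.
_HAS_SDPA = hasattr(F, "scaled_dot_product_attention")

def attention(q, k, v, attn_bias=None):
    if _HAS_SDPA:
        return F.scaled_dot_product_attention(q, k, v, attn_mask=attn_bias)
    # Vanilla path: softmax(q @ k^T / sqrt(d) + bias) @ v
    scale = q.shape[-1] ** -0.5
    attn = (q @ k.transpose(-2, -1)) * scale
    if attn_bias is not None:
        attn = attn + attn_bias
    return attn.softmax(dim=-1) @ v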


oneHFR commented Jan 19, 2025

@ashun989

Thank you so much for your detailed and thoughtful responses! I really appreciate how thoroughly you've explained the environment compatibility issues and provided flexible solutions. Your work on this project is truly impressive: the code is well-structured and the documentation is exceptionally clear. It's really cool that you added support for PyTorch 1.12.1 too; that's super helpful! 😊

Really appreciate all your help and the awesome work you're doing! Keep rocking! 🚀✨

@oneHFR oneHFR closed this as completed Jan 19, 2025