forked from facebookresearch/segment-anything
Showing 18 changed files with 360 additions and 61 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
name: ci | ||
on: | ||
push: | ||
branches: | ||
- master | ||
- main | ||
permissions: | ||
contents: write | ||
jobs: | ||
deploy: | ||
runs-on: ubuntu-latest | ||
steps: | ||
- uses: actions/checkout@v3 | ||
- uses: actions/setup-python@v4 | ||
with: | ||
python-version: 3.x | ||
- uses: actions/cache@v3 | ||
with: | ||
key: mkdocs-material-${{ github.ref }} | ||
path: .cache | ||
restore-keys: | | ||
mkdocs-material- | ||
- run: pip install mkdocs-material | ||
- run: pip install mkdocs-jupyter | ||
- run: pip install jieba | ||
- run: pip install mkdocs-git-revision-date-localized-plugin | ||
- run: mkdocs gh-deploy --force |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,39 +8,41 @@ | |
|
||
![SAM design](assets/model_diagram.png?raw=true) | ||
|
||
The **Segment Anything Model (SAM)** produces high quality object masks from input prompts such as points or boxes, and it can be used to generate masks for all objects in an image. It has been trained on a [dataset](https://segment-anything.com/dataset/index.html) of 11 million images and 1.1 billion masks, and has strong zero-shot performance on a variety of segmentation tasks. | ||
**Segment Anything Model (SAM)** 从诸如点或框之类的输入提示生成高质量的对象掩码,可用于为图像中的所有对象生成掩码。它已经在包含1100万张图像和11亿个掩码的[数据集](https://segment-anything.com/dataset/index.html)上进行了训练,并在各种分割任务上具有强大的零样本zero-shot性能。 | ||
|
||
<p float="left"> | ||
<img src="assets/masks1.png?raw=true" width="37.25%" /> | ||
<img src="assets/masks2.jpg?raw=true" width="61.5%" /> | ||
</p> | ||
|
||
## Installation | ||
|
||
The code requires `python>=3.8`, as well as `pytorch>=1.7` and `torchvision>=0.8`. Please follow the instructions [here](https://pytorch.org/get-started/locally/) to install both PyTorch and TorchVision dependencies. Installing both PyTorch and TorchVision with CUDA support is strongly recommended. | ||
## 安装 | ||
|
||
Install Segment Anything: | ||
代码要求`python>=3.8`,以及`pytorch>=1.7`和`torchvision>=0.8`。请按照[这里](https://pytorch.org/get-started/locally/)的说明安装PyTorch和TorchVision的依赖项。强烈建议安装支持CUDA的PyTorch和TorchVision。 | ||
|
||
``` | ||
pip install git+https://github.com/facebookresearch/segment-anything.git | ||
``` | ||
--- | ||
|
||
or clone the repository locally and install with | ||
克隆存储库并在本地安装: | ||
|
||
``` | ||
git clone git@github.com:facebookresearch/segment-anything.git | ||
cd segment-anything; pip install -e . | ||
git clone https://github.com/facebookresearch/segment-anything.git | ||
cd segment-anything | ||
pip install -e . | ||
``` | ||
|
||
The following optional dependencies are necessary for mask post-processing, saving masks in COCO format, the example notebooks, and exporting the model in ONNX format. `jupyter` is also required to run the example notebooks. | ||
> It is recommended to fork the repository to your own account first and then clone the fork | ||
--- | ||
|
||
以下是必要的可选依赖项,用于掩码后处理、以COCO格式保存掩码、示例jupyter笔记本以及将模型导出为ONNX格式。运行示例jupyter笔记本还需要`jupyter`。 | ||
|
||
``` | ||
pip install opencv-python pycocotools matplotlib onnxruntime onnx | ||
``` | ||
|
||
## <a name="GettingStarted"></a>Getting Started | ||
## 入门 | ||
|
||
First download a [model checkpoint](#model-checkpoints). Then the model can be used in just a few lines to get masks from a given prompt: | ||
首先下载一个[模型检查点](#模型检查点)。然后,可以使用以下几行代码从给定提示中获取掩码: | ||
|
||
``` | ||
from segment_anything import SamPredictor, sam_model_registry | ||
|
@@ -50,7 +52,7 @@ predictor.set_image(<your_image>) | |
masks, _, _ = predictor.predict(<input_prompts>) | ||
``` | ||
|
||
or generate masks for an entire image: | ||
或者为整个图像生成掩码: | ||
|
||
``` | ||
from segment_anything import SamAutomaticMaskGenerator, sam_model_registry | ||
|
@@ -59,53 +61,60 @@ mask_generator = SamAutomaticMaskGenerator(sam) | |
masks = mask_generator.generate(<your_image>) | ||
``` | ||
|
||
Additionally, masks can be generated for images from the command line: | ||
此外,可以使用命令行为图像生成掩码: | ||
|
||
``` | ||
python scripts/amg.py --checkpoint <path/to/checkpoint> --model-type <model_type> --input <image_or_folder> --output <path/to/output> | ||
``` | ||
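
When the command-line script has to be driven from another Python program, the invocation above can be assembled as an argument list. The following is only a minimal sketch: the helper name `build_amg_command` and the example paths are hypothetical, and only the script path and the four flags are taken from the command shown above.

```python
import shlex
import sys
from typing import List


def build_amg_command(checkpoint: str, model_type: str,
                      input_path: str, output_path: str) -> List[str]:
    """Assemble the argument list for scripts/amg.py (hypothetical helper)."""
    return [
        sys.executable,          # the Python interpreter currently running
        "scripts/amg.py",        # script path as shown in the README command
        "--checkpoint", checkpoint,
        "--model-type", model_type,
        "--input", input_path,
        "--output", output_path,
    ]


if __name__ == "__main__":
    # Example values are placeholders, not files shipped with the repository.
    cmd = build_amg_command("sam_vit_b_01ec64.pth", "vit_b", "images/", "masks/")
    print(shlex.join(cmd))
    # To actually run it from the repository root:
    #   import subprocess; subprocess.run(cmd, check=True)
```

Passing a list rather than a single shell string avoids quoting problems when paths contain spaces.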
|
||
See the example notebooks on [using SAM with prompts](/notebooks/predictor_example.ipynb) and [automatically generating masks](/notebooks/automatic_mask_generator_example.ipynb) for more details. | ||
有关更多详细信息,请参阅[使用提示生成掩码](https://eanyang7.github.io/segment-anything/notebooks/predictor_example/)和[自动生成对象掩码](https://eanyang7.github.io/segment-anything/notebooks/automatic_mask_generator_example/)的示例笔记本。 | ||
|
||
<p float="left"> | ||
<img src="assets/notebook1.png?raw=true" width="49.1%" /> | ||
<img src="assets/notebook2.png?raw=true" width="48.9%" /> | ||
</p> | ||
|
||
## ONNX Export | ||
|
||
SAM's lightweight mask decoder can be exported to ONNX format so that it can be run in any environment that supports ONNX runtime, such as in-browser as showcased in the [demo](https://segment-anything.com/demo). Export the model with | ||
## ONNX导出 | ||
|
||
SAM的轻量级掩码解码器可以导出为ONNX格式,以便在支持ONNX运行时的任何环境中运行,例如在[演示](https://segment-anything.com/demo)中展示的浏览器中。使用以下命令导出模型: | ||
|
||
``` | ||
python scripts/export_onnx_model.py --checkpoint <path/to/checkpoint> --model-type <model_type> --output <path/to/output> | ||
``` | ||
|
||
See the [example notebook](https://github.com/facebookresearch/segment-anything/blob/main/notebooks/onnx_model_example.ipynb) for details on how to combine image preprocessing via SAM's backbone with mask prediction using the ONNX model. It is recommended to use the latest stable version of PyTorch for ONNX export. | ||
请参阅[示例笔记本](https://eanyang7.github.io/segment-anything/notebooks/onnx_model_example/)以了解如何通过SAM的骨干进行图像预处理,然后使用ONNX模型进行掩码预测的详细信息。建议使用PyTorch的最新稳定版本进行ONNX导出。 | ||
|
||
### Web demo | ||
### Web演示 | ||
|
||
The `demo/` folder has a simple one page React app which shows how to run mask prediction with the exported ONNX model in a web browser with multithreading. Please see [`demo/README.md`](https://github.com/facebookresearch/segment-anything/blob/main/demo/README.md) for more details. | ||
`demo/`文件夹中有一个简单的单页React应用程序,展示了如何在支持多线程的Web浏览器中使用导出的ONNX模型运行掩码预测。请查看[`demo/README.md`](https://github.com/facebookresearch/segment-anything/blob/main/demo/README.md)以获取更多详细信息。 | ||
|
||
## <a name="Models"></a>Model Checkpoints | ||
## 模型检查点 | ||
|
||
Three versions of the model are available with different backbone sizes. These models can be instantiated by running | ||
提供了三个模型版本,具有不同的骨干大小。可以通过运行以下代码实例化这些模型: | ||
|
||
``` | ||
from segment_anything import sam_model_registry | ||
sam = sam_model_registry["<model_type>"](checkpoint="<path/to/checkpoint>") | ||
``` | ||
|
||
Click the links below to download the checkpoint for the corresponding model type. | ||
单击下面的链接下载相应模型类型的检查点。 | ||
|
||
- **`default`或`vit_h`:[ViT-H SAM模型。](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth)** | ||
- `vit_l`:[ViT-L SAM模型。](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_l_0b3195.pth) | ||
- `vit_b`:[ViT-B SAM模型。](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth) | ||
|
||
- **`default` or `vit_h`: [ViT-H SAM model.](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth)** | ||
- `vit_l`: [ViT-L SAM model.](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_l_0b3195.pth) | ||
- `vit_b`: [ViT-B SAM model.](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth) | ||
> b: base, the base model | ||
> | ||
> l: large, the larger model | ||
> | ||
> h: huge, the largest model | ||
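
The mapping between model types and checkpoint files listed above can be kept in one place, so that the `<model_type>` key and the downloaded file stay in sync. This is a sketch under assumptions: the dictionary and the helper `checkpoint_filename` are hypothetical names, while the URLs are copied from the list above.

```python
import posixpath
from urllib.parse import urlparse

# URLs copied from the checkpoint list in the README.
CHECKPOINT_URLS = {
    "vit_h": "https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth",
    "vit_l": "https://dl.fbaipublicfiles.com/segment_anything/sam_vit_l_0b3195.pth",
    "vit_b": "https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth",
}
# The README lists "default" as an alias for the ViT-H model.
CHECKPOINT_URLS["default"] = CHECKPOINT_URLS["vit_h"]


def checkpoint_filename(model_type: str) -> str:
    """Return the file name a checkpoint would have after download (hypothetical helper)."""
    try:
        url = CHECKPOINT_URLS[model_type]
    except KeyError:
        raise ValueError(f"unknown model type: {model_type!r}") from None
    return posixpath.basename(urlparse(url).path)


if __name__ == "__main__":
    for name in ("default", "vit_l", "vit_b"):
        print(name, "->", checkpoint_filename(name))
    # A download and load step could then look like this (not executed here):
    #   urllib.request.urlretrieve(CHECKPOINT_URLS["vit_b"], checkpoint_filename("vit_b"))
    #   sam = sam_model_registry["vit_b"](checkpoint=checkpoint_filename("vit_b"))
```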
## Dataset | ||
## 数据集 | ||
|
||
See [here](https://ai.facebook.com/datasets/segment-anything/) for an overview of the dataset. The dataset can be downloaded [here](https://ai.facebook.com/datasets/segment-anything-downloads/). By downloading the datasets you agree that you have read and accepted the terms of the SA-1B Dataset Research License. | ||
请参阅[此处](https://ai.facebook.com/datasets/segment-anything/)以获取有关数据集的概述。可以在[此处](https://ai.facebook.com/datasets/segment-anything-downloads/)下载数据集。通过下载数据集,您同意已阅读并接受了SA-1B数据集研究许可条款。 | ||
|
||
We save masks per image as a json file. It can be loaded as a dictionary in python in the below format. | ||
每个图像的掩码保存为json文件。它可以在以下格式的Python字典中加载。 | ||
|
||
```python | ||
{ | ||
|
@@ -114,52 +123,38 @@ We save masks per image as a json file. It can be loaded as a dictionary in pyth | |
} | ||
|
||
image_info { | ||
"image_id" : int, # Image id | ||
"width" : int, # Image width | ||
"height" : int, # Image height | ||
"file_name" : str, # Image filename | ||
"image_id" : int, # 图像id | ||
"width" : int, # 图像宽度 | ||
"height" : int, # 图像高度 | ||
"file_name" : str, # 图像文件名 | ||
} | ||
|
||
annotation { | ||
"id" : int, # Annotation id | ||
"segmentation" : dict, # Mask saved in COCO RLE format. | ||
"bbox" : [x, y, w, h], # The box around the mask, in XYWH format | ||
"area" : int, # The area in pixels of the mask | ||
"predicted_iou" : float, # The model's own prediction of the mask's quality | ||
"stability_score" : float, # A measure of the mask's quality | ||
"crop_box" : [x, y, w, h], # The crop of the image used to generate the mask, in XYWH format | ||
"point_coords" : [[x, y]], # The point coordinates input to the model to generate the mask | ||
"id" : int, # 注释id | ||
"segmentation" : dict, # 以COCO RLE格式保存的掩码。 | ||
"bbox" : [x, y, w, h], # 掩码周围的框,以XYWH格式表示 | ||
"area" : int, # 掩码的像素面积 | ||
"predicted_iou" : float, # 模型对掩码质量的自身预测 | ||
"stability_score" : float, # 掩码质量的度量 | ||
"crop_box" : [x, y, w, h], # 用于生成掩码的图像的裁剪,以XYWH格式表示 | ||
"point_coords" : [[x, y]], # 输入模型生成掩码的点坐标 | ||
} | ||
``` | ||
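
To illustrate how the `annotation` fields above might be consumed, the sketch below converts an XYWH `bbox` to corner coordinates and keeps only annotations whose quality scores pass a threshold. The helper names, the threshold values, and the sample records are assumptions made for illustration; only the field names come from the format above.

```python
from typing import Dict, List, Tuple


def xywh_to_xyxy(box: List[float]) -> Tuple[float, float, float, float]:
    """Convert an [x, y, w, h] box into (x_min, y_min, x_max, y_max)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


def filter_annotations(annotations: List[Dict],
                       min_iou: float = 0.9,
                       min_stability: float = 0.9) -> List[Dict]:
    """Keep annotations whose predicted_iou and stability_score both reach the thresholds.

    The default thresholds are arbitrary illustration values.
    """
    return [
        ann for ann in annotations
        if ann["predicted_iou"] >= min_iou and ann["stability_score"] >= min_stability
    ]


if __name__ == "__main__":
    # Made-up records that follow the documented field names.
    sample = [
        {"id": 1, "bbox": [10, 20, 30, 40], "area": 900,
         "predicted_iou": 0.95, "stability_score": 0.97},
        {"id": 2, "bbox": [0, 0, 5, 5], "area": 20,
         "predicted_iou": 0.60, "stability_score": 0.99},
    ]
    for ann in filter_annotations(sample):
        print(ann["id"], xywh_to_xyxy(ann["bbox"]))
```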
|
||
Image ids can be found in sa_images_ids.txt which can be downloaded using the above [link](https://ai.facebook.com/datasets/segment-anything-downloads/) as well. | ||
图像ID可以在`sa_images_ids.txt`中找到,可以使用上述[链接](https://ai.facebook.com/datasets/segment-anything-downloads/)下载。 | ||
|
||
To decode a mask in COCO RLE format into binary: | ||
要将COCO RLE格式的掩码解码为二进制: | ||
|
||
``` | ||
from pycocotools import mask as mask_utils | ||
mask = mask_utils.decode(annotation["segmentation"]) | ||
``` | ||
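
For intuition about what `mask_utils.decode` produces, the sketch below decodes the uncompressed form of COCO RLE in pure Python. It assumes `counts` is a list of run lengths that alternate between background and foreground, starting with background, laid out in column-major order, with `size` given as `[height, width]`. It does not handle the compressed string form of `counts`, for which `pycocotools` should be used as shown above. The function name is hypothetical.

```python
from typing import Dict, List


def decode_uncompressed_rle(rle: Dict) -> List[List[int]]:
    """Decode an uncompressed COCO RLE dict into a height x width list of 0/1 values."""
    height, width = rle["size"]
    flat: List[int] = []
    value = 0  # runs alternate 0, 1, 0, 1, ... starting with background
    for run in rle["counts"]:
        flat.extend([value] * run)
        value = 1 - value
    if len(flat) != height * width:
        raise ValueError("run lengths do not add up to height * width")
    # Pixels are stored column by column, so pixel (row, col) sits at col * height + row.
    return [[flat[col * height + row] for col in range(width)]
            for row in range(height)]


if __name__ == "__main__":
    # Tiny made-up example: a 2 x 3 mask.
    example = {"size": [2, 3], "counts": [1, 2, 3]}
    for row in decode_uncompressed_rle(example):
        print(row)
```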
|
||
See [here](https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocotools/mask.py) for more instructions to manipulate masks stored in RLE format. | ||
|
||
## License | ||
|
||
The model is licensed under the [Apache 2.0 license](LICENSE). | ||
|
||
## Contributing | ||
|
||
See [contributing](CONTRIBUTING.md) and the [code of conduct](CODE_OF_CONDUCT.md). | ||
|
||
## Contributors | ||
|
||
The Segment Anything project was made possible with the help of many contributors (alphabetical): | ||
|
||
Aaron Adcock, Vaibhav Aggarwal, Morteza Behrooz, Cheng-Yang Fu, Ashley Gabriel, Ahuva Goldstand, Allen Goodman, Sumanth Gurram, Jiabo Hu, Somya Jain, Devansh Kukreja, Robert Kuo, Joshua Lane, Yanghao Li, Lilian Luong, Jitendra Malik, Mallika Malhotra, William Ngan, Omkar Parkhi, Nikhil Raina, Dirk Rowe, Neil Sejoor, Vanessa Stark, Bala Varadarajan, Bram Wasti, Zachary Winstrom | ||
请参阅[此处](https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocotools/mask.py)以获取有关如何操作以RLE格式存储的掩码的更多说明。 | ||
|
||
## Citing Segment Anything | ||
## 引用Segment Anything | ||
|
||
If you use SAM or SA-1B in your research, please use the following BibTeX entry. | ||
如果您在研究中使用SAM或SA-1B,请使用以下BibTeX条目。 | ||
|
||
``` | ||
@article{kirillov2023segany, | ||
|