Expand the model zoo (example model set) #700
Comments
Examples (To be updated): |
I will start working on MobileNetV2 and InceptionNetV3. |
I will work on Mask RCNN |
Hi, guys, @Shashankwer , @Alvinnyk , thanks for your help. Before you start, please check this Jira: https://issues.apache.org/jira/projects/SINGA/issues/SINGA-509?filter=allopenissues It lists the models that require support; for some of them, SINGA may currently lack the necessary operators. So I hope you:
|
I will try to work on the ShuffleNet |
Hi, may I ask how we get the Amazon AWS link to download the model itself? For example, for ResNet, the link is https://s3.amazonaws.com/onnx-model-zoo/resnet/resnet18v1/resnet18v1.tar.gz. How do we get the link for the others? If I copy the link from ONNX's GitHub, the download link is a GitHub link, not an Amazon AWS link. Thank you! |
For the model conversion, do we also need to train the model on SINGA after conversion from ONNX and produce a comparison matrix? For example, super_resolution via a subpixel convolution layer was trained on BSD 300. The model is originally in PyTorch and converts well into SINGA through ONNX. It gives similar results for an image across PyTorch, onnxruntime and SINGA. Do we need to compare the model's performance across these platforms on the dataset it was trained on, or is SINGA compatibility all we are checking at the moment? Thanks and Regards, |
There is no need to retrain the model. |
https://github.com/onnx/models |
Hi, @agnesnatasya , it's not necessary to use the AWS download link. As you said, for the image classification models, I found the download link from the backend test cases here. But for other models, you can use the download link within the GitHub repo directly. |
For this version of SINGA, we don't need to consider training (or retraining); we just need to test that we can run the model without errors and that the result is correct. In the next version (what we are doing now), the ONNX of SINGA will support retraining. So for now, we just need to do three things:
|
@joddiy Thank you for the reply. I found the download link and have completed SqueezeNet. May I ask how I can test whether my conversion is correct? Thank you! |
You can use this code to verify the model using its test data set:
|
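A minimal sketch of the kind of check discussed here, assuming the converted model's output and the reference output from the model's test data set have already been loaded as flat lists of floats. The helper name `outputs_match` and the tolerance values are illustrative assumptions, not part of SINGA or ONNX.

```python
import math


def outputs_match(actual, expected, rel_tol=1e-3, abs_tol=1e-5):
    """Return True if two flat sequences of floats agree within tolerance."""
    if len(actual) != len(expected):
        return False
    return all(
        math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, e in zip(actual, expected)
    )


# Hypothetical values standing in for real model outputs.
reference = [0.10, 0.70, 0.20]
converted = [0.10001, 0.69999, 0.20000]
print(outputs_match(converted, reference))
```

A tolerance-based comparison is used because different runtimes rarely produce bit-identical floating point results.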
Hi, can anyone verify whether the attached notebook would be sufficient for SuperResolutionNet? Thanks and Regards, |
Hi, for Shufflenetv2 the conversion of the average pooling layer from ONNX fails with the stack trace below:
ValueError Traceback (most recent call last)
3 frames
/usr/local/lib/python3.7/site-packages/singa/sonnx.py in _onnx_model_to_singa_net(cls, model, init_inputs, device, opset_version)
/usr/local/lib/python3.7/site-packages/singa/sonnx.py in _onnx_node_to_singa_op(cls, onnx_node, inputs, opset_version)
/usr/local/lib/python3.7/site-packages/singa/sonnx.py in _create_max_avg_pool(cls, onnx_node, inputs, opset_version)
ValueError: Not implemented yet for count_include_pad or ceil_mode.
Kindly let us know the further steps to be taken. Thanks and Regards, |
@Shashankwer I got the same error too. There is also a "not implemented" error when I try to run ShufflenetV1. In addition, I got a "floating point exception: 8" when loading GPT2. Does anyone know what causes this? |
Hi @agnesnatasya , for ShufflenetV1 the issue is that it requires a CUDA GPU. It uses grouped 1x1 convolution, which does not seem to be supported by the CPU module and is written only for CUDA. It identifies the device from get_default_device as '-1' and GPU devices as starting from 0. For me it failed on the check condition Thanks, |
@Shashankwer Oh, I see, thank you for that! |
@joddiy Hi, can I ask, for the other models that are not in ONNX, how do we find a reliable reference? I looked at the papers, and they usually only discuss the general idea and comparisons between models, not the details of the operations in every layer. If I look at https://modelzoo.co/, there are a lot of different implementations. How do we choose the one to use as a reference? Thank you! |
Typically, when authors publish a paper, they usually have to release their code at the same time. So if you cannot find the code on GitHub, you can try emailing the authors to ask for it. |
@Shashankwer @agnesnatasya , sorry for the late reply, I saw this just now. Yes, as @agnesnatasya has said, some features in ONNX have not been implemented in SINGA. For these errors, please skip this model. I guess we will implement these features soon. |
Hi @joddiy, for Shufflenetv2 the model conversion is failing due to the use of ceil_mode. Although the value is set to false (0) in the ONNX model, it still fails at a condition. I have opened a new issue for this. If we can split the condition checking
to
Shufflenetv2 can be successfully converted to singa. |
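The idea of splitting the check can be sketched as follows. This is not the actual sonnx.py code: `check_pool_attrs` and the plain dict standing in for the ONNX node attributes are hypothetical. The point is to raise only when an unsupported option is actually enabled, rather than whenever the attribute is present.

```python
def check_pool_attrs(attrs):
    """Raise only when an unsupported pooling option is actually enabled.

    `attrs` is a plain dict standing in for the ONNX node attributes
    (an assumption made for this sketch).
    """
    # A missing attribute is treated as 0 (disabled), the ONNX default.
    if attrs.get("ceil_mode", 0) != 0:
        raise ValueError("Not implemented yet for ceil_mode.")
    if attrs.get("count_include_pad", 0) != 0:
        raise ValueError("Not implemented yet for count_include_pad.")


# ceil_mode is present but set to 0, so the check passes.
check_pool_attrs({"ceil_mode": 0, "kernel_shape": [3, 3]})
```

Checking each attribute separately also lets the error message name the option that caused the failure.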
Thanks, @Shashankwer , you can open a PR update like this:
|
Hi @joddiy , do we need to prepare only the model architecture, or train it as well? For training, what dataset should we include? (Some models are trained on ImageNet, where the dataset can be huge.) |
Sure, all this work has been done in this PR #703 . If you are interested in it, you can check the new SONNX code. |
Hi @joddiy I tried to implement InceptionV1, but it involves LRN, so it is not yet supported. I tried to implement InceptionV2 but I received this error
May I ask if there is any other model (outside ONNX) that I could implement? Thank you! |
Thanks @agnesnatasya , I'll test InceptionV2 again to check the error. And thank you all for your contribution! @Shashankwer @agnesnatasya @Alvinnyk For the next plan, I list some necessary operators we have to implement:
Type 1
Type 2
Type 3
Type 4
TopK c++
Type 1 means the operators have been implemented in autograd.py; we only need to add them to sonnx.py. So the plan is:
I know it's a little difficult at first, so please don't rush. Just read the code carefully first, and if you have any questions, feel free to ask me. By the way, please add your code to the dev branch. And please skip the frontend code in sonnx.py and only read the backend code. I'm working on upgrading the frontend. |
Hi @joddiy, the Type 1 operator Floor is not present in autograd.py. Can we implement it in autograd.py, taking the implementation of Ceil as the reference? Also, Equal is already implemented in the existing code. Thanks, |
Yes. I think you can implement it following Ceil implementation.
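As a rough illustration of what a Floor operator involves, here is a standalone sketch with a forward and a backward pass on flat Python lists. It does not use SINGA's autograd API; the class and method names are assumptions chosen only to mirror the usual forward/backward pattern.

```python
import math


class Floor:
    """Elementwise floor with a zero gradient, on flat lists of floats."""

    def forward(self, x):
        # Round every element down to the nearest integer value.
        return [float(math.floor(v)) for v in x]

    def backward(self, dy):
        # floor is piecewise constant, so its derivative is 0
        # almost everywhere; the incoming gradient is discarded.
        return [0.0 for _ in dy]


op = Floor()
print(op.forward([1.7, -0.2, 3.0]))
print(op.backward([1.0, 1.0, 1.0]))
```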
|
Yes, @Shashankwer , you can create a new operator in |
Hi, Exp operator is already implemented in autograd.py Thanks and Regards, |
Hi, @Shashankwer , the Type 1 means the operators have been implemented in autograd.py, we only need to add it into the sonnx.py. You can follow my PR to see how to add an operator to sonnx.py. By the way, you should test the @Shashankwer @Alvinnyk @agnesnatasya
|
Hi, just one question: does Apache SINGA's CTensor support advanced indexing the way NumPy does? Thanks, |
Not yet. I guess for this operator, you can use tensor.to_numpy(tensor.from_raw_tensor(CTensor)) to convert it to a NumPy array and do the indexing there for now. We will consider implementing it at the C++ end later. |
Hi, Will work on RoiAlign. Also can GatherElements be implemented similar to ScatterElements? Thanks and Regards, |
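For reference, the semantics of ONNX GatherElements on a 2-D input can be sketched in plain Python as below. The function name `gather_elements_2d` and the nested-list representation are assumptions for illustration; this is not SINGA code.

```python
def gather_elements_2d(data, indices, axis=0):
    """Gather values from a 2-D nested list, following ONNX GatherElements.

    The output has the same shape as `indices`:
      axis=0: out[i][j] = data[indices[i][j]][j]
      axis=1: out[i][j] = data[i][indices[i][j]]
    """
    if axis not in (0, 1):
        raise ValueError("axis must be 0 or 1 for a 2-D input")
    out = []
    for i, index_row in enumerate(indices):
        row = []
        for j, k in enumerate(index_row):
            row.append(data[k][j] if axis == 0 else data[i][k])
        out.append(row)
    return out


data = [[1, 2], [3, 4]]
indices = [[0, 0], [1, 0]]
print(gather_elements_2d(data, indices, axis=1))
```

ScatterElements writes values to the indexed positions, while GatherElements reads from them, so the two index computations mirror each other.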
Thanks @Shashankwer . I just saw your PR for ScatterElements. I guess you can implement NonMaxSuppression first; it's easier than RoiAlign. RoiAlign needs some operators we cannot support yet, for example, crop. But NonMaxSuppression is straightforward now. You can follow this one: https://github.com/pytorch/vision/blob/master/torchvision/ops/boxes.py As for RoiAlign, we have found an implementation on GitHub, and we're discussing whether to implement it in Python or C++. |
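A minimal pure-Python sketch of greedy non-maximum suppression, assuming boxes are given as (x1, y1, x2, y2) tuples. The function names are illustrative. The sketch does not cover the extra inputs of the ONNX NonMaxSuppression operator, such as max_output_boxes_per_class, score_threshold and center_point_box.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold):
    """Return indices of kept boxes, in order of decreasing score."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any
        # higher-scoring box that has already been kept.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores, iou_threshold=0.5))
```

The loop is quadratic in the number of boxes, which is fine for a reference check but would need vectorizing for real workloads.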
Thanks for the reference will try implementing NonMaxSuppression first |
SINGA has multiple example models at http://singa.apache.org/docs/examples/
Some are implemented from scratch and some are converted from ONNX, which has a bigger model zoo https://github.com/onnx/models.
The task is to convert more onnx models and implement some popular (and interesting) models that are not in onnx model zoo.
Here are some reference model zoos https://modelzoo.co/, https://gluon-nlp.mxnet.io/model_zoo/index.html