[MXNET-108] Adding BilinearResize2D and AdaptiveAvgPool2d operators #9688
Conversation
src/operator/bilinear_upsample.cu
Outdated
template<typename Dtype, typename Acctype>
__global__ void caffe_gpu_interp2_kernel(const int n,
    const Acctype rheight, const Acctype rwidth,
    const DeviceTensor<Dtype, 4> data1, DeviceTensor<Dtype, 4> data2) {
Use mshadow Tensor instead of DeviceTensor
src/operator/bilinear_upsample-inl.h
Outdated
CHECK_EQ(outputs.size(), 1U);
mshadow::Stream<xpu> *s = ctx.get_stream<xpu>();
MSHADOW_REAL_TYPE_SWITCH_EX(inputs[0].type_flag_, DType, AccReal, {
  SpatialUpSamplingBilinearUpdateGradInput<xpu, DType, AccReal>(s, inputs, outputs);
check for req
src/operator/bilinear_upsample-inl.h
Outdated
namespace op {

struct BilinearSampleParam : public dmlc::Parameter<BilinearSampleParam> {
  int out_height;
use tuple scale
is data channel first or channel last?
This operator mainly supports fractional scale ratios (the input and output sizes can be arbitrary), so it is more convenient to use the output size instead of a scale.
We would prefer a rewrite based on MXNet utilities instead of copy-paste if possible. The current code doesn't use OpenMP for CPU parallelization; if you use the kernel launch it will be handled automatically.
src/operator/bilinear_upsample.cc
Outdated
DType *data2 = gradOutput.data_ptr();
channels = nbatch * channels;

// special case: same-size matching grids
This should be handled at the top level by passing through to the identity compute function.
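From the user's side, the expected behavior of that special case might look like this sketch (using the contrib.BilinearResize2D name the op ends up with later in this PR):

```python
import mxnet as mx

x = mx.nd.random.uniform(shape=(1, 3, 8, 8))
# Resizing to the input's own size should act as a pass-through
y = mx.nd.contrib.BilinearResize2D(x, height=8, width=8)
assert (x == y).asnumpy().all()
```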
src/operator/bilinear_upsample.cc
Outdated
DMLC_REGISTER_PARAMETER(BilinearSampleParam);

NNVM_REGISTER_OP(BilinearUpsample2D)
Does it handle upsampling only, or downsampling too?
It also supports downsampling, since the output size can be an arbitrary integer.
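For example (a minimal sketch with the eventual contrib op name and height/width arguments):

```python
import mxnet as mx

x = mx.nd.random.uniform(shape=(1, 3, 224, 224))
# Upsample with a fractional scale ratio: 224 -> 300 (ratio ~1.34)
up = mx.nd.contrib.BilinearResize2D(x, height=300, width=300)
# Downsample: 224 -> 100
down = mx.nd.contrib.BilinearResize2D(x, height=100, width=100)
print(up.shape, down.shape)  # (1, 3, 300, 300) (1, 3, 100, 100)
```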
For these bilinear-sampling-related ops, could we implement them by directly calling mx.sym.BilinearSampler? I feel that the implementations are mostly the same.
How would we define the receptive field of this operator?
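For reference, a rough sketch of expressing the same resize with existing stock ops (an identity affine transform expanded into a dense grid via GridGenerator, then fed to BilinearSampler; the equivalence is approximate since border handling differs):

```python
import mxnet as mx

data = mx.sym.Variable('data')     # (N, C, H, W)
affine = mx.sym.Variable('affine') # (N, 6) identity transform: [1, 0, 0, 0, 1, 0]
# GridGenerator turns the affine matrix into a sampling grid at the target size
grid = mx.sym.GridGenerator(data=affine, transform_type='affine',
                            target_shape=(32, 32))
resized = mx.sym.BilinearSampler(data=data, grid=grid)
```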
Quick note: you can also run "make lint" to test the linting locally.
I think it's more appropriate to call it BilinearResize.
Agreed. I will make the changes.
I think it would be good if we could pass a reference symbol as a target shape, for example like mx.sym.Crop, where you can crop according to the shape of a second symbol. This would be useful for variable input sizes.
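Something along these lines (a sketch of the mx.sym.Crop pattern being referenced, following the FCN-style usage; the second input only contributes its shape):

```python
import mxnet as mx

data = mx.sym.Variable('data')  # feature map to crop
ref = mx.sym.Variable('ref')    # symbol whose spatial shape is the target
# Crop `data` to the spatial size of `ref`; only ref's shape is used
out = mx.sym.Crop(*[data, ref])
```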
DMLC_DECLARE_FIELD(out_height).set_range(1, 1000)
  .describe("output height");
DMLC_DECLARE_FIELD(out_width).set_range(1, 1000)
  .describe("output width");
Simply use height and width as argument names.
Addressed :)
Resize the 2D input, using bilinear interpolation.

Expected input is a 4-dimensional array (batch x channel x height x width), and the output
has the shape (batch x channel x out_height x out_width).
We consistently use NCHW to describe such layouts elsewhere, so consider using it here too to make the doc less verbose.
Hi @cjolivier01, the CI server seems to have stopped working.
Hi @szha @piiswrong @cjolivier01, I have made the changes following the suggestions. Let me know if you have further comments. Thanks!
LGTM
Thanks @cjolivier01! Could you approve the request?
.describe(R"code( | ||
Applies a 2D adaptive average pooling over an input signal composed of several input planes. | ||
|
||
The output size is (N x C x output_size x output_size), for any input (NCHW). |
In PyTorch, output_size can be either a single int or a tuple of 2 ints.
Also, why do you indent here? How would it render?
Both an int and a tuple of 2 ints are allowed now.
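For example (a sketch using the contrib op name this PR registers):

```python
import mxnet as mx

x = mx.nd.random.uniform(shape=(1, 3, 32, 32))
# Square output from a single int ...
a = mx.nd.contrib.AdaptiveAvgPooling2D(x, output_size=1)       # (1, 3, 1, 1)
# ... or a rectangular output from a tuple of 2 ints
b = mx.nd.contrib.AdaptiveAvgPooling2D(x, output_size=(4, 6))  # (1, 3, 4, 6)
```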
int output_size;
DMLC_DECLARE_PARAMETER(AdaptiveAvgPoolParam) {
  DMLC_DECLARE_FIELD(output_size).set_range(1, 1000)
    .describe("output size");
expand the doc
int width;
DMLC_DECLARE_PARAMETER(BilinearSampleParam) {
  DMLC_DECLARE_FIELD(height).set_range(1, 1000)
    .describe("output height");
same
@zhanghang1989 any updates on this?
@zhanghang1989, can you please provide a code example of using BilinearResize2D in a Symbol-based fully convolutional network to do ratio-based upscaling (x2, x3) when the shape is unknown/dynamic?
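A minimal NDArray-style sketch of that approach (the target size is computed eagerly from the input's shape, which is why it needs the shape to be known up front):

```python
import mxnet as mx

x = mx.nd.random.uniform(shape=(1, 3, 50, 75))
h, w = x.shape[2], x.shape[3]
# x2 upscaling: the shape is read eagerly, so this only works
# when the input size is known at this point
y = mx.nd.contrib.BilinearResize2D(x, height=h * 2, width=w * 2)
```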
@zhanghang1989 That's an NDArray way of doing it. I was asking about the Symbol way and shape inference.
I am not familiar with the Symbol API. Sorry about that.
…pache#9688) * bilinear upsample from PYTorch * fix cpu backward * fix indent, add req * fix lint * fix lint * lint * handle req * add adaptive avg pooling operator * rename to bilinear resize * fix name * change assertion * rm unused var * refactor using mshadow tensor * rm devicetensor, only using mshadow * add docs * naming * merge * Revert "merge" This reverts commit a2a809a. * add unit test for BilinearResize2D and AdaptiveAvgPool2D * for test in python2, cast into float * mv function inside * link docs * address the comments * lint * add back private () * correct lint * decleare var * link params docs * fix bug * mv to contrib and upodate docs * contrib header * change include path for contrib * lint * register to contrib * lint * rename width, height, docs * rename param * Patch1 (#1) * two shapes input * docs * typo * lint * lint
Description
Add operators: BilinearResize2D and AdaptiveAvgPool2D
Link to JIRA issue MXNET-108
Checklist
Essentials
make lint
Changes
Comments
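A short sketch of the two new operators used together, a pyramid-pooling-style branch (pool to a fixed grid, then resize back; op names per this PR's final contrib registration, shapes illustrative):

```python
import mxnet as mx

x = mx.nd.random.uniform(shape=(1, 256, 60, 60))
# Pool to a fixed grid regardless of input size ...
pooled = mx.nd.contrib.AdaptiveAvgPooling2D(x, output_size=6)
# ... then resize back to the original spatial size
branch = mx.nd.contrib.BilinearResize2D(pooled, height=60, width=60)
print(branch.shape)  # (1, 256, 60, 60)
```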