
[MXNET-108] Adding BilinearResize2D and AdaptiveAvgPool2d operators #9688

Merged (47 commits, Apr 9, 2018)

Conversation

@zhanghang1989 (Contributor) commented Feb 3, 2018

Description

Add operators:

  1. BilinearResize2D
  2. AdaptiveAvgPooling2d

Link to JIRA issue MXNET-108.

Checklist

Essentials

  • Passed code style checking (make lint)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
  • To my best knowledge, examples are either not affected by this change or have been fixed to be compatible with this change

Changes

  • BilinearResize2D
  • AdaptiveAvgPool2D
  • docs
  • unit test

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

template<typename Dtype, typename Acctype>
__global__ void caffe_gpu_interp2_kernel(const int n,
const Acctype rheight, const Acctype rwidth,
const DeviceTensor<Dtype, 4> data1, DeviceTensor<Dtype, 4> data2) {
Contributor:

Use mshadow Tensor instead of device tensor

CHECK_EQ(outputs.size(), 1U);
mshadow::Stream<xpu> *s = ctx.get_stream<xpu>();
MSHADOW_REAL_TYPE_SWITCH_EX(inputs[0].type_flag_, DType, AccReal, {
SpatialUpSamplingBilinearUpdateGradInput<xpu, DType, AccReal>(s, inputs, outputs);
Contributor:

check for req

namespace op {

struct BilinearSampleParam : public dmlc::Parameter<BilinearSampleParam> {
int out_height;
Contributor:

use tuple scale

Contributor:

Is the data channel-first or channel-last?

Contributor Author:

This operator mainly supports fractional scale ratios (input and output sizes can be arbitrary), so it is more convenient to use the output size instead of a scale.
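
For example, a minimal sketch of a fractional (1.75x) resize; this assumes the operator's final contrib name, BilinearResize2D:

import mxnet as mx
x = mx.nd.ones(shape=(2, 3, 4, 4))
# 4 -> 7 is a fractional 1.75x scale; any integer output size works
y = mx.nd.contrib.BilinearResize2D(x, height=7, width=7)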

@piiswrong (Contributor):

We prefer a rewrite based on mxnet utilities instead of copy-paste if possible.

The current code doesn't use OpenMP for CPU parallelization. If you use kernel launch, it will be handled automatically.

DType *data2 = gradOutput.data_ptr();
channels = nbatch * channels;

// special case: same-size matching grids
@piiswrong (Contributor) commented Feb 3, 2018:

This should be handled at the top level by passing through to identity compute function


DMLC_REGISTER_PARAMETER(BilinearSampleParam);

NNVM_REGISTER_OP(BilinearUpsample2D)
Contributor:

Does it handle upsampling only, or downsampling too?

Contributor Author:

It also supports downsampling, since the output size can be an arbitrary integer.
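
As a minimal sketch (again assuming the final contrib name), downsampling is just a smaller output size:

import mxnet as mx
x = mx.nd.ones(shape=(2, 3, 8, 8))
# output smaller than input: the same operator downsamples
y = mx.nd.contrib.BilinearResize2D(x, height=3, width=3)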

Member:

For these bilinear-sampling related ops, could we implement them by directly calling mx.sym.BilinearSampler? I feel that the implementations are mostly the same.
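
For reference, a resize can in principle be expressed with the existing samplers by generating an affine sampling grid; a minimal sketch, where the identity affine parameters (1, 0, 0, 0, 1, 0) would be supplied by the user:

import mxnet as mx
data = mx.sym.Variable('data')      # input of shape (N, C, H, W)
affine = mx.sym.Variable('affine')  # (N, 6) affine parameters, e.g. identity (1, 0, 0, 0, 1, 0)
# GridGenerator builds an (N, 2, 5, 5) sampling grid; BilinearSampler interpolates at those points
grid = mx.sym.GridGenerator(data=affine, transform_type='affine', target_shape=(5, 5))
out = mx.sym.BilinearSampler(data=data, grid=grid)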

@chinakook (Contributor):

How can we define the receptive field of this operator?

@zhanghang1989 (Contributor Author) commented Feb 6, 2018, replying to @chinakook:

import mxnet as mx
x1 = mx.nd.ones(shape=(2,3,4,4))
y1 = mx.nd.contrib.BilinearResize2D(x1, height=5, width=5)

@marcoabreu (Contributor):

Quick note: You can also run "make lint" to test the linting locally.

On Feb 7, 2018, @zhanghang1989 changed the title from "bilinear upsample from PyTorch" to "Adapt operators from PyTorch, will keep adding".
@piiswrong (Contributor):

I think it's more appropriate to call it BilinearResize.

@zhanghang1989 (Contributor Author):

Agree. I will make the changes.

@ascust commented Feb 18, 2018:

I think it would be good if we could pass a reference symbol as a target shape, as in mx.sym.Crop, where you can crop according to the shape of a second symbol. This would be useful for variable input sizes.
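
The mx.sym.Crop pattern being referenced looks like the sketch below; a hypothetical BilinearResize2D counterpart would accept a second symbol the same way (the variable names here are illustrative):

import mxnet as mx
x = mx.sym.Variable('x')
ref = mx.sym.Variable('ref')
# Crop takes a second symbol and crops x to ref's spatial shape, resolved at bind time
y = mx.sym.Crop(x, ref, num_args=2)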

DMLC_DECLARE_FIELD(out_height).set_range(1, 1000)
.describe("output height");
DMLC_DECLARE_FIELD(out_width).set_range(1, 1000)
.describe("output width");
@szha (Member) commented Mar 22, 2018:

Simply use height and width as argument names.

Contributor Author:

Addressed :)

Resize the 2D input, using bilinear interpolation.

Expected input is a 4-dimensional array (batch x channel x height x width), and the output has the shape (batch x channel x out_height x out_width).
@szha (Member) commented Mar 22, 2018:

We consistently use NCHW to describe such layouts elsewhere, so consider using it here too to make the doc less verbose.

@zhanghang1989 (Contributor Author):

Hi @cjolivier01, the CI server seems to have stopped working.

@zhanghang1989 (Contributor Author):

Hi @szha @piiswrong @cjolivier01, I have made the changes following the suggestions. Let me know if you have further comments. Thanks!

@cjolivier01 (Member) commented Mar 23, 2018:

LGTM

@zhanghang1989 (Contributor Author):

Thanks @cjolivier01 ! Could you approve the request?

.describe(R"code(
Applies a 2D adaptive average pooling over an input signal composed of several input planes.

The output size is (N x C x output_size x output_size), for any input (NCHW).
Contributor:

In PyTorch, output_size can be either a single int or a tuple of 2 ints.

Contributor:

Also why do you indent here? How would it render?

Contributor Author:

An int or a tuple of 2 ints is allowed now.
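
A minimal sketch of both forms, assuming the operator's contrib name AdaptiveAvgPooling2D:

import mxnet as mx
x = mx.nd.ones(shape=(2, 3, 8, 8))
# a single int gives square output: (2, 3, 1, 1), i.e. global average pooling
y1 = mx.nd.contrib.AdaptiveAvgPooling2D(x, output_size=1)
# a tuple of 2 ints sets height and width independently: (2, 3, 4, 2)
y2 = mx.nd.contrib.AdaptiveAvgPooling2D(x, output_size=(4, 2))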

int output_size;
DMLC_DECLARE_PARAMETER(AdaptiveAvgPoolParam) {
DMLC_DECLARE_FIELD(output_size).set_range(1, 1000)
.describe("output size");
Contributor:

expand the doc

int width;
DMLC_DECLARE_PARAMETER(BilinearSampleParam) {
DMLC_DECLARE_FIELD(height).set_range(1, 1000)
.describe("output height");
Contributor:

same

@anirudh2290 (Member):

@zhanghang1989 any updates on this?

@crazyleg:

@zhanghang1989, can you please provide a code example of using BilinearResize2D in a Symbol-based fully convolutional network to do ratio-based upscaling (x2, x3) when the shape is unknown/dynamic?

@zhanghang1989 (Contributor Author), replying to @crazyleg:

import mxnet as mx
x1 = mx.nd.ones(shape=(2,3,4,4))
y1 = mx.nd.contrib.BilinearResize2D(x1, height=5, width=5)

@crazyleg:

@zhanghang1989 That's the NDArray way of doing it. I was asking about the Symbol way and shape inference.

@zhanghang1989 (Contributor Author):

I am not familiar with the Symbol API. Sorry about that.
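
For reference, a minimal Symbol-mode sketch with a fixed output size (this does not cover ratio-based resizing with dynamic shapes, which the operator's height/width parameters cannot express):

import mxnet as mx
x = mx.sym.Variable('data')
# output size must be fixed at graph construction time
y = mx.sym.contrib.BilinearResize2D(x, height=8, width=8)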

rahul003 pushed a commit to rahul003/mxnet that referenced this pull request Jun 4, 2018
…pache#9688)

* bilinear upsample from PYTorch

* fix cpu backward

* fix indent, add req

* fix lint

* fix lint

* lint

* handle req

* add adaptive avg pooling operator

* rename to bilinear resize

* fix name

* change assertion

* rm unused var

* refactor using mshadow tensor

* rm devicetensor, only using mshadow

* add docs

* naming

* merge

* Revert "merge"

This reverts commit a2a809a.

* add unit test for BilinearResize2D and AdaptiveAvgPool2D

* for test in python2, cast into float

* mv function inside

* link docs

* address the comments

* lint

* add back private ()

*  correct lint

* decleare var

* link params docs

* fix bug

* mv to contrib and upodate docs

* contrib header

* change include path for contrib

* lint

* register to contrib

* lint

* rename width, height, docs

* rename param

* Patch1 (#1)

* two shapes input

* docs

* typo

* lint

* lint
zheng-da pushed a commit to zheng-da/incubator-mxnet that referenced this pull request Jun 28, 2018
…pache#9688)
