Rnn Relu refactoring #2541
base: develop
Conversation
…kimych/MIOpen into rnn_training_forward_refactor
8a6beb3 to b230467
b230467 to e9f20d5
@shurale-nkn @JehandadKhan please review
Partial review
@kikimych Is it time to review? I see this is still a draft.
src/ocl/rnnocl.cpp
Outdated
const int m = direction == rnn_direction::Forward ? fbatches.at(time)
              : time == 0                         ? rbatches.at(0)
                                                  : rbatches.at(time - 1);
const int cur_time = direction == rnn_direction::Forward ? time : seq_len - 1 - time;
@kikimych
I remember you said that you would make a structure for direct and reverse access to elements.
Readability
auto batch_id_abs = time == seq_len - 1 ? bacc_per_time[directed_time.cur_time()] : bacc_per_time[directed_time.cur_time()] + batches.at(directed_time.next_time());
vs
auto batch_id_abs = time == seq_len - 1 ? bacc_per_time[cur_time] : bacc_per_time[cur_time] + batches.at(next_time);
for example
The question was about all the code where bacc_per_time or batches is used.
Why not a class with an API like bacc_per_time.at(time, direction)?
auto batch_id_abs = time == seq_len - 1 ? bacc_per_time[cur_time] : bacc_per_time[cur_time] + batches.at(next_time)
vs
auto batch_id_abs = time == seq_len - 1 ? bacc_per_time(time, direction) : bacc_per_time(time, direction) + batches.at(time + 1, direction)
So? What conclusion should we draw from this single line?
How this line would look was already quite obvious.
After two weeks, neither version is obvious at all.
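For concreteness, a minimal standalone sketch of the directed-time helper debated above, assuming the rnn_direction enum from this PR; DirectedTime and its member names are hypothetical, not the PR's actual code:

```cpp
#include <cassert>

enum rnn_direction
{
    Forward  = 0,
    Backward = 1
};

// Hypothetical helper: resolves logical step -> physical time index once,
// so call sites stop repeating the `seq_len - 1 - time` arithmetic.
struct DirectedTime
{
    int time;
    int seq_len;
    rnn_direction direction;

    int cur_time() const
    {
        return direction == rnn_direction::Forward ? time : seq_len - 1 - time;
    }
    // Physical index of the next logical step (time + 1 in iteration order).
    int next_time() const
    {
        return direction == rnn_direction::Forward ? time + 1 : seq_len - 2 - time;
    }
};

int main()
{
    const DirectedTime t{2, /*seq_len=*/5, Backward};
    assert(t.cur_time() == 2);  // 5 - 1 - 2
    assert(t.next_time() == 1); // 5 - 2 - 2
}
```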
src/ocl/rnnocl.cpp
Outdated
RNNTensorPaddingConverter::ConvertTensorData(
    handle, dyDesc[0], in_n, dy, packedDYIn, true);
auto propagate_dhx = [*this, seqLen, &propagate_dhx_prev, &dhx](int layer) {
*this: copy-capturing the entire current object? Really?
&dhx: a reference to a pointer?
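A standalone sketch of the two capture issues raised here; Widget and its members are hypothetical and illustrate general C++ capture semantics, not MIOpen code:

```cpp
#include <cstdio>

struct Widget
{
    int state = 0;

    void demo(int* dhx)
    {
        // [*this] (C++17) copies the entire enclosing object into the
        // closure; for a large class that is an expensive, silent copy.
        auto by_copy = [*this] { std::printf("%d\n", state); };

        // [this] captures only the pointer: cheap, sees live state, but the
        // lambda must not outlive the object.
        auto by_this = [this] { std::printf("%d\n", state); };

        // [&dhx] would capture a reference to the pointer variable itself:
        // an extra indirection plus a dangling risk once dhx leaves scope.
        // Capturing the pointer by value is just as cheap and safer.
        auto by_val = [dhx] { std::printf("%p\n", static_cast<void*>(dhx)); };

        by_copy();
        by_this();
        by_val();
    }
};

int main()
{
    int buf = 0;
    Widget{}.demo(&buf);
}
```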
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
src/ocl/rnnocl.cpp
Outdated
reserveSpace,
&activDesc,
propagate_hidden_output,
propagate_hidden_prev](int layer, int time, int direction) {
So there is no copy of the captured members? ❔
format failed
@@ -2528,6 +3104,34 @@ void RNNDescriptor::RNNForwardTrainingPackedTensors(
    }
    return;
}

if((rnnMode == miopenRNNRELU || rnnMode == miopenRNNTANH) && !use_dropout &&
   inputMode != miopenRNNskip && !(miopen::IsDisabled(ENV(MIOPEN_RNNFWD_exp))))
why not miopenRNNskip?
src/include/miopen/rnn_util.hpp
Outdated
int first_layer_offset() const { return (in_vec_sz + h_vec_sz) * weight_stride; }
};

struct RNNOffsets
not fixed
src/include/miopen/rnn_util.hpp
Outdated
enum rnn_direction
{
    Forward  = 0,
    Backward = 1
};
It's mostly a question for @atamazov: do we use enum class, or have plans to switch to enum class everywhere?
Let's prefer enum class. But switching to it everywhere is impossible IIRC.
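For reference, a sketch of the scoped-enum variant being suggested, assuming nothing else in the PR changes:

```cpp
// Scoped-enum form of the declaration above. Enumerators must be qualified
// and no longer convert implicitly to int, which is the point of the change.
enum class rnn_direction
{
    Forward  = 0,
    Backward = 1
};

// Call sites compare against the qualified name:
//   if(direction == rnn_direction::Forward) { ... }
// and any integer use becomes explicit:
//   const int idx = static_cast<int>(rnn_direction::Backward);
```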
    Backward = 1
};

struct RnnBatches
struct X
{
    int at(int time, RnnDirection direction) const
    {
        return direction == RnnDirection::Forward ? batches[time]
                                                  : batches[(batches.size() - 1) - time];
    }
    // ...
    std::vector<int> batches;
};
This holds for any direction:
.next(time, direction) == .at(time + 1, direction)
.prev(time, direction) == .at(time - 1, direction)
so next_time() and prev_time() are redundant functions.
Explanation:
FWD next_time: cur_time(time, direction) + 1
               => (time + 1)
BWD next_time: cur_time(time, direction) - 1
               => (batches.size() - time - 1) - 1
               => (batches.size() - (time + 1) - 1)
BWD prev_time: cur_time(time, direction) + 1
               => (batches.size() - time - 1) + 1
               => (batches.size() - (time - 1) - 1)
next() and prev() can be used as syntactic sugar if you want, but then remove the mixed use of +1 and next() in the code.
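Putting the thread together, a self-contained sketch of the proposed accessor; RnnBatches and the at/next/prev names follow the snippets above, while the scoped RnnDirection (per the preference stated earlier) and the demo values are illustrative:

```cpp
#include <cstdio>
#include <vector>

enum class RnnDirection
{
    Forward  = 0,
    Backward = 1
};

// Direction-aware view over per-time batch counts: Forward indexes from the
// front, Backward from the back, so call sites never spell out
// size() - 1 - time themselves.
struct RnnBatches
{
    int at(int time, RnnDirection direction) const
    {
        return direction == RnnDirection::Forward
                   ? batches.at(time)
                   : batches.at(batches.size() - 1 - time);
    }
    // Pure sugar over at(); holds for either direction.
    int next(int time, RnnDirection direction) const { return at(time + 1, direction); }
    int prev(int time, RnnDirection direction) const { return at(time - 1, direction); }

    std::vector<int> batches;
};

int main()
{
    const RnnBatches b{{4, 3, 2, 1}};
    std::printf("%d\n", b.at(1, RnnDirection::Forward));    // 3
    std::printf("%d\n", b.at(1, RnnDirection::Backward));   // batches[2] == 2
    std::printf("%d\n", b.next(0, RnnDirection::Backward)); // same as at(1, Backward)
}
```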
CI reports a merge conflict in rnnocl.cpp that cannot be resolved automatically.
Extracted the ReLU|Tanh activation-function code from RNNForwardTrainingPackedTensors into RNNForwardTrainingTanhRelu, and from RNNBackwardDataPackedTensors into RNNBackwardDataPackedTensorsRelu.