Thank you for sharing this great work!
Q1: Where should the pair-wise distillation loss be applied: only at the end of the encoder (e.g., the 1/16 · HW feature map of ResNet), or at every scale of the encoder (1/16, 1/8, 1/4, ...)? (A rough sketch of the loss I have in mind follows the questions below.)
Q2: Does pair-wise distillation still work when the teacher's and the student's encoders have different downsample rates (e.g., the student downsamples the input by 1/8 while the teacher downsamples by 1/16), or when they have different decoder structures?
Q3: Can this method be used to distill from VNL into an architecture like FastDepth (which differs from the VNL student in its decoder), given that the VNL student may have a heavy decoder?
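To make Q1/Q2 concrete, here is a minimal sketch of the pair-wise similarity distillation loss as I understand it (not the authors' code; the `feat_s`/`feat_t` names and the bilinear alignment step are my own assumptions). Because the loss compares N × N pixel-affinity matrices, differing channel counts between teacher and student are not a problem, and a resize step can bridge different downsample rates:

```python
import torch
import torch.nn.functional as F

def pairwise_distillation_loss(feat_s, feat_t):
    """Sketch of a pair-wise (affinity) distillation loss.

    feat_s: student feature map, shape (B, C_s, H_s, W_s)
    feat_t: teacher feature map, shape (B, C_t, H_t, W_t)
    Channel counts may differ; the affinity matrices are N x N,
    so only the spatial resolutions need to be aligned.
    """
    # Align spatial resolution if the downsample rates differ (Q2).
    if feat_s.shape[2:] != feat_t.shape[2:]:
        feat_s = F.interpolate(feat_s, size=feat_t.shape[2:],
                               mode="bilinear", align_corners=False)

    def affinity(feat):
        b, c, h, w = feat.shape
        feat = feat.view(b, c, h * w)                 # (B, C, N)
        feat = F.normalize(feat, p=2, dim=1)          # unit-norm pixel vectors
        return torch.bmm(feat.transpose(1, 2), feat)  # (B, N, N) cosine sims

    # Squared L2 distance between teacher and student affinity matrices.
    return F.mse_loss(affinity(feat_s), affinity(feat_t))
```

Note that the affinity matrix has N² = (H·W)² entries, so applying this at shallower scales (1/8, 1/4) gets memory-heavy quickly; subsampling pixel pairs or pooling to a fixed grid would be a workaround.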
I'm also confused about the distillation losses for the other two tasks, especially the pixel-wise loss.
The pixel-wise loss in the paper is defined for the segmentation task as a KL divergence, which is obviously not suitable for the depth task.
I really wonder how the pixel-wise loss is implemented, even though the author explains that it doesn't work for the depth task.
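For reference, my understanding of the segmentation-style pixel-wise loss is a per-pixel KL divergence between the class distributions of teacher and student. A sketch under that assumption (the temperature `T` is my own addition, not something confirmed by the paper):

```python
import torch.nn.functional as F

def pixelwise_kd_loss(logits_s, logits_t, T=1.0):
    """Per-pixel KL divergence between teacher and student class
    distributions, as used for segmentation distillation.

    logits_s, logits_t: (B, num_classes, H, W). This requires a
    categorical distribution per pixel, which is why it does not
    transfer directly to a single-channel regression output like depth.
    """
    p_t = F.softmax(logits_t / T, dim=1)          # teacher soft targets
    log_p_s = F.log_softmax(logits_s / T, dim=1)  # student log-probs
    kl = F.kl_div(log_p_s, p_t, reduction="none").sum(dim=1)  # (B, H, W)
    return kl.mean() * (T * T)  # rescale keeps gradient magnitude comparable
```

For depth, a per-pixel regression loss (e.g., L1 on log-depth between student and teacher predictions) would seem the natural analogue, though whether that actually helps is exactly what the author's comment puts in doubt.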