yufei
December 1, 2015 5:37 PM
J. F. Henriques, R. Caseiro, P. Martins, J. Batista
Observation:
- Such sets of samples are riddled with redundancies
- any overlapping pixels are constrained to be the same.
Because the samples are cyclic shifts of one base patch, the data matrix is circulant. We can diagonalize it with the Discrete Fourier Transform, reducing both storage and computation by several orders of magnitude.
KCF=ridge regression + Circulant data
ridge regression
<script type="math/tex; mode=display" id="MathJax-Element-133">
\min_\mathbb w \sum_i(f(\mathbb x_i) -y_i)^2+\lambda ||\mathbb w||^2
</script>
<script type="math/tex; mode=display" id="MathJax-Element-134">
\mathbb w =(X^TX + \lambda I)^{-1}X^Ty
</script>
Circulant data
<script type="math/tex; mode=display" id="MathJax-Element-135">
C(u)v= \mathcal F^{-1}(\mathcal F(u)\odot\mathcal F(v))
</script>
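The identity above can be checked numerically on a small example. The sketch below assumes the convention that the k-th column of C(u) is u cyclically shifted down by k-1 samples, which makes C(u)v a circular convolution; if the rows are taken as the shifts instead, a complex conjugate appears on F(u). The vectors are arbitrary illustrative values.

```matlab
% Circulant matrix whose k-th column is u cyclically shifted down by k-1
u = [1; 2; 3; 4];
v = [0.5; -1; 2; 0];
n = numel(u);
C = zeros(n);
for k = 1:n
    C(:, k) = circshift(u, k - 1);
end

direct  = C * v;                          % explicit matrix-vector product
via_fft = real(ifft(fft(u) .* fft(v)));   % same product in the Fourier domain
max_err = max(abs(direct - via_fft));     % should be at rounding-error level
```

The explicit product needs the full n-by-n matrix, while the Fourier version only ever stores the generating vector u.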
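Convolution theorem: convolution in the spatial domain == element-wise product in the frequency domain
1. Convolution theorem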
Given:
<script type="math/tex; mode=display" id="MathJax-Element-138"> corr(\mathbf x, \mathbf x') = \mathcal F(\mathbf x)\odot \mathcal F(\mathbf x') </script>
0 | <script type="math/tex" id="MathJax-Element-139">x'_1</script> | <script type="math/tex" id="MathJax-Element-140">x'_2</script> | <script type="math/tex" id="MathJax-Element-141">x'_3</script> | 0 |
---|---|---|---|---|
0 | <script type="math/tex" id="MathJax-Element-142">x_1</script> | <script type="math/tex" id="MathJax-Element-143">x_2</script> | <script type="math/tex" id="MathJax-Element-144">x_3</script> | 0 |
<script type="math/tex" id="MathJax-Element-145">x_1</script> | <script type="math/tex" id="MathJax-Element-146">x_2</script> | <script type="math/tex" id="MathJax-Element-147">x_3</script> | 0 | 0 |
<script type="math/tex" id="MathJax-Element-148">x_2</script> | <script type="math/tex" id="MathJax-Element-149">x_3</script> | 0 | 0 | <script type="math/tex" id="MathJax-Element-150">x_1</script> |
<script type="math/tex" id="MathJax-Element-151">x_3</script> | 0 | 0 | <script type="math/tex" id="MathJax-Element-152">x_1</script> | <script type="math/tex" id="MathJax-Element-153">x_2</script> |
0 | 0 | <script type="math/tex" id="MathJax-Element-154">x_1</script> | <script type="math/tex" id="MathJax-Element-155">x_2</script> | <script type="math/tex" id="MathJax-Element-156">x_3</script> |
<script type="math/tex; mode=display" id="MathJax-Element-157"> corr(\mathbf x, \mathbf y) = \mathcal F( \tilde {\mathbf x})\odot \mathcal F(\tilde{\mathbf y}) </script>
2. Training
<script type="math/tex; mode=display" id="MathJax-Element-158">
\mathcal F({\mathbf w}) = \frac{\mathcal F(\mathbf x) \odot \mathcal F(\mathbf y)}{\mathcal F(\mathbf x) \odot \mathcal F(\mathbf x) + \lambda }
</script>
3. Testing
<script type="math/tex; mode=display" id="MathJax-Element-159">
\mathbf y'=\mathcal F^{-1}(\mathcal F(\mathbf w) \odot \mathcal F(\mathbf x') )
</script>
<script type="math/tex; mode=display" id="MathJax-Element-160"> [p_x,p_y] = \arg \max\mathbf y' </script>
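A minimal 1-D sketch of the training and testing steps, under stated assumptions: the signal, the regularization weight and the 3-sample displacement are made-up values, and the data matrix is taken to hold x shifted right by i samples in row i+1. The complex conjugates that the formulas above leave implicit are written out here; where they fall depends on that shift convention.

```matlab
% Toy 1-D check of the training and testing formulas (illustrative values)
x = [0 1 3 7 4 2 1 0]';               % training signal
n = numel(x);
lambda = 1e-2;                        % regularization weight (assumed)

% Gaussian-shaped target: 1 at zero shift, decaying with circular distance
k = (0:n-1)';
y = exp(-0.5 * min(k, n - k).^2);

% Explicit data matrix: row i+1 is x cyclically shifted right by i samples
X = zeros(n);
for i = 0:n-1
    X(i+1, :) = circshift(x, i)';
end
w_direct = (X' * X + lambda * eye(n)) \ (X' * y);   % closed-form ridge solution

% Same solution from element-wise operations on DFTs
xf = fft(x);
wf = xf .* fft(y) ./ (conj(xf) .* xf + lambda);
w_fft = real(ifft(wf));
max_err = max(abs(w_direct - w_fft));

% Testing: correlate the filter with a sample shifted right by 3 samples
z = circshift(x, 3);
response = real(ifft(fft(z) .* conj(wf)));
[~, idx] = max(response);
shift = idx - 1;                      % estimated displacement in samples
```

Under these assumptions the two solutions agree to rounding error and the response peaks at the applied displacement, which is the 1-D analogue of taking the arg max over the response map.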
4. Kernel
<script type="math/tex; mode=display" id="MathJax-Element-161">
\kappa(\mathbf x, \mathbf x') = h(||\mathbf x - \mathbf x'||^2)
=h(||\mathbf x||^2 +||\mathbf x'||^2 - 2\mathbf x^T\mathbf x' )
</script>
<script type="math/tex; mode=display" id="MathJax-Element-162">
=h(||\mathbf x||^2 +||\mathbf x'||^2 - 2\mathcal F^{-1} (\mathcal F({\mathbf x}) \odot \mathcal F({\mathbf x'}) ))
</script>
5. Kernel ridge regression
<script type="math/tex; mode=display" id="MathJax-Element-163">
\alpha = (\kappa+ \lambda I)^{-1}\mathbf y
</script>
Solutions:
- Training:
<script type="math/tex; mode=display" id="MathJax-Element-164"> \alpha = \mathcal F^{-1}(\frac{\mathcal F(\mathbf y)}{\mathcal F(\kappa (\mathbf x, \mathbf x)) + \lambda}) </script>
- Prediction:
<script type="math/tex; mode=display" id="MathJax-Element-165"> \hat {\mathbf y} = \mathcal F^{-1}( \mathcal F(\kappa(\mathbf x, \mathbf x')) \odot \mathcal F(\alpha)) </script>
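The two solutions can be tried on a 1-D toy signal before reading the full 2-D listings. This is only a sketch: the signal and the 2-sample displacement are invented, the helper `kcorr` is a hypothetical name, and sigma and lambda reuse the values that appear in the listings below. `kcorr(x, z)` evaluates the Gaussian kernel between z and every cyclic shift of x in one pass.

```matlab
sigma = 0.5;  lambda = 1e-4;             % same values as in the listings below
x = [0 1 3 7 4 2 1 0]' / 7;              % toy 1-D template (illustrative)
n = numel(x);
d = min((0:n-1)', n - (0:n-1)');         % circular distance from zero shift
y = exp(-0.5 * d.^2);                    % desired response, peak at zero shift

% kcorr(x, z): element i+1 is kappa(z, x shifted right by i), for all i at once
kcorr = @(x, z) exp(-max(0, x' * x + z' * z ...
    - 2 * real(ifft(fft(z) .* conj(fft(x))))) / (sigma^2 * numel(x)));

alphaf = fft(y) ./ (fft(kcorr(x, x)) + lambda);        % training
z = circshift(x, 2);                                   % target moved by 2 samples
response = real(ifft(alphaf .* fft(kcorr(x, z))));     % prediction
[~, idx] = max(response);
shift = idx - 1;                                       % estimated displacement
```

Training and prediction each cost a few FFTs of length n; no kernel matrix is ever formed or inverted.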
Training:
- $\mathcal F(\mathbf y)$
% expected response
state.window = floor(state.size*2.5);
sz = fliplr(floor(state.window/state.cell_size));
[rs, cs] = ndgrid((1:sz(1)) - bitshift(sz(1)+1,-1), (1:sz(2)) - bitshift(sz(2)+1,-1));
y = exp(-0.5 / output_sigma^2 * (rs.^2 + cs.^2));
y = circshift(y, bitshift(sz+2,-1));
state.yf = fft2(y);
- <script type="math/tex" id="MathJax-Element-166">\mathcal F(\kappa(\mathbf x, \mathbf x))</script>
x = get_region(im, state);
x = double(fhog(single(x) / 255, state.cell_size,9)); x(:,:,end) = [];
x = bsxfun(@times, x, state.cos_window);
xf = fft2(x);
kf = dense_gauss_kernel(xf,xf);
function kf = dense_gauss_kernel(xf, yf)
sigma = 0.5;
N = size(xf,1) * size(xf,2);
xx = xf(:)' * xf(:) / N; %squared norm of x
yy = yf(:)' * yf(:) / N; %squared norm of y
%cross-correlation term in Fourier domain
xyf = xf .* conj(yf);
xy = sum(real(ifft2(xyf)), 3); %to spatial domain
kf = fft2(exp(-1 / sigma^2 * max(0, (xx + yy - 2 * xy) / numel(xf))));
end
- <script type="math/tex" id="MathJax-Element-167">\alpha</script>
new_alphaf = state.yf ./(kf + 0.0001);
- <script type="math/tex" id="MathJax-Element-168">\mathcal F(\kappa(\mathbf x, \mathbf x'))</script>
% next frame
x = get_region(im, state);
x = double(fhog(single(x) / 255, state.cell_size,9)); x(:,:,end) = [];
x = bsxfun(@times, x, state.cos_window);
xf = fft2(x);
kf = dense_gauss_kernel( xf, state.z); % z is template
state.z = (1 - interp_factor) * state.z + interp_factor * xf;
- <script type="math/tex" id="MathJax-Element-169">[p_x,p_y] = \arg \max \hat{\mathbf y},</script>
response = real(ifft2(state.alphaf .* kf));
[yc, xc] = find(response == max(response(:)), 1);
Experiments:
- dataset: OTB 50 videos
- algorithms: MOSSE, TLD, CT, STRUCK,
Conclusions
Contributions:
- establishes the connection between Ridge Regression with cyclically shifted samples and classical correlation filters.
- proposes closed-form solutions to compute kernels at all cyclic shifts.
- extends the original work to deal with multiple channels.
Beyond:
- Scale ?
- Loss functions: hinge loss?, exponential loss?
- … …
Martin Danelljan, Gustav Häger, Fahad Shahbaz Khan and Michael Felsberg.
Matlab
scale = pyramid + correlation filter
First estimate the target's spatial position, then estimate the target's scale.
step1: The Translation Filter:
- object size : <script type="math/tex" id="MathJax-Element-170">P\times R</script>
- scales: <script type="math/tex" id="MathJax-Element-171">n \in \{ \lfloor -\frac{s-1}{2} \rfloor, \cdots,\lfloor \frac{s-1}{2} \rfloor \}</script>
- patch <script type="math/tex" id="MathJax-Element-172">J_n</script> size: <script type="math/tex" id="MathJax-Element-173">a^n P \times a^nR</script>, where a is the scale increment factor
Step1: scale settings
nScales= 33; % number of scale levels (denoted "S" in the paper)
scale_step = 1.02; % Scale increment factor (denoted "a" in the paper)
% scale factors
ss = 1:nScales;
scaleFactors = scale_step.^(ceil(nScales/2) - ss);
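As a quick worked check of these settings: the exponent `ceil(nScales/2) - ss` runs from 16 down to -16, so the factors are ordered from the largest patch to the smallest, with the unchanged scale in the middle. The variable names `largest`, `middle` and `smallest` below are only for illustration.

```matlab
nScales = 33;
scale_step = 1.02;
ss = 1:nScales;
scaleFactors = scale_step.^(ceil(nScales/2) - ss);

largest  = scaleFactors(1);      % scale_step^16, about 1.37
middle   = scaleFactors(17);     % scale_step^0, exactly 1 (no scale change)
smallest = scaleFactors(end);    % scale_step^-16, about 0.73
```

With these values the scale filter can see the target from roughly 27% smaller to 37% larger than its current size in a single frame.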
**Step2: expected scale response <script type="math/tex" id="MathJax-Element-174">\mathbf y</script>**
% desired scale filter output (gaussian shaped), bandwidth proportional to
% number of scales
scale_sigma = nScales/sqrt(33) * scale_sigma_factor;
ss = (1:nScales) - ceil(nScales/2);
ys = exp(-0.5 * (ss.^2) / scale_sigma^2);
ysf = single(fft(ys));
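Step3: extracting scale patches
- Scale the target size (width and height) and extract the image patch at the corresponding position;
- Resize the patch to the template size so that it can be compared with the template;
- Extract HOG features from the new patch;
- Flatten the HOG features into a column vector, then multiply by the Hanning window.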
function out = get_scale_sample(im, pos, base_target_sz, scaleFactors, scale_window, scale_model_sz)
% out = get_scale_sample(im, pos, base_target_sz, scaleFactors, scale_window, scale_model_sz)
%
% Extracts a sample for the scale filter at the current
% location and scale.
nScales = length(scaleFactors);
for s = 1:nScales
patch_sz = floor(base_target_sz * scaleFactors(s));
xs = floor(pos(2)) + (1:patch_sz(2)) - floor(patch_sz(2)/2);
ys = floor(pos(1)) + (1:patch_sz(1)) - floor(patch_sz(1)/2);
% check for out-of-bounds coordinates, and set them to the values at
% the borders
xs(xs < 1) = 1;
ys(ys < 1) = 1;
xs(xs > size(im,2)) = size(im,2);
ys(ys > size(im,1)) = size(im,1);
% extract image
im_patch = im(ys, xs, :);
% resize image to model size
im_patch_resized = mexResize(im_patch, scale_model_sz, 'auto');
% extract scale features
temp_hog = fhog(single(im_patch_resized), 4);
temp = temp_hog(:,:,1:31);
if s == 1
out = zeros(numel(temp), nScales, 'single');
end
% window
out(:,s) = temp(:) * scale_window(s);
end
**Step4: compute <script type="math/tex" id="MathJax-Element-175">\alpha</script>**
xs = get_scale_sample(im, pos, base_target_sz, currentScaleFactor * scaleFactors, scale_window, scale_model_sz);
% calculate the scale filter update
xsf = fft(xs,[],2);
new_sf_num = bsxfun(@times, ysf, conj(xsf));
new_sf_den = sum(xsf .* conj(xsf), 1);
**Step5: Prediction <script type="math/tex" id="MathJax-Element-176">\mathbf y'</script>**
% extract the test sample feature map for the scale filter
xs = get_scale_sample(im, pos, base_target_sz, currentScaleFactor * scaleFactors, scale_window, scale_model_sz);
% calculate the correlation response of the scale filter
xsf = fft(xs,[],2);
scale_response = real(ifft(sum(sf_num .* xsf, 1) ./ (sf_den + lambda)));
Step6: estimate the target scale
% find the maximum scale response
recovered_scale = find(scale_response == max(scale_response(:)), 1);
% update the scale
currentScaleFactor = currentScaleFactor * scaleFactors(recovered_scale);
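Steps 4 to 6 can be run end to end on synthetic data to see how the peak of the scale response maps to a scale factor. In this sketch the feature matrix is a made-up stand-in for the output of get_scale_sample (4 feature rows, one column per scale level), and the values of `scale_sigma_factor` and `lambda` are assumptions, not taken from the listings above.

```matlab
nScales = 33;  scale_step = 1.02;
scale_sigma_factor = 1/4;             % assumed value
lambda = 1e-2;                        % assumed value
scaleFactors = scale_step.^(ceil(nScales/2) - (1:nScales));

% desired response: Gaussian centered on the middle scale level
scale_sigma = nScales/sqrt(33) * scale_sigma_factor;
ss = (1:nScales) - ceil(nScales/2);
ysf = fft(exp(-0.5 * (ss.^2) / scale_sigma^2));

% toy stand-in for get_scale_sample: 4 feature rows, one column per scale
xs = exp(-abs(bsxfun(@minus, 1:nScales, 5 * (1:4)')) / 2);

% training (Step4)
xsf = fft(xs, [], 2);
sf_num = bsxfun(@times, ysf, conj(xsf));
sf_den = sum(xsf .* conj(xsf), 1);

% prediction (Step5) on a sample whose columns moved by 3 scale levels
xsf_test = fft(circshift(xs, [0 3]), [], 2);
scale_response = real(ifft(sum(sf_num .* xsf_test, 1) ./ (sf_den + lambda)));

% scale estimate (Step6)
recovered_scale = find(scale_response == max(scale_response(:)), 1);
scale_change = scaleFactors(recovered_scale);
```

On the unshifted training sample the peak sits at level 17, where the factor is 1. Moving the columns by 3 levels moves the peak to level 20, and the corresponding factor is 1.02^-3, about 0.94.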
- dataset: **28 sequences** [1] annotated with the scale variation attribute
- algorithms:
- learning discriminative correlation filters based on a scale pyramid representation.
- translation = 2D correlation filter, scale = 1D correlation filter
pyramid + correlation filter
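- If the spatial position is inaccurate, the scale estimate will also be badly off;
- Good results are achieved even without using a kernel;
- Compared with earlier methods (NCC), the only change is that the desired response goes from an impulse to a Gaussian or Laplacian distribution;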
Yang Li, Jianke Zhu
- Problem: scale estimation
- Idea:
- multi-scale search: resample the patch at several scales and compare each one with the template.
- feature : hog + cn
- Codes
search_size = [1 0.985 0.99 0.995 1.005 1.01 1.015];%
for i=1:size(search_size,2)
tmp_sz = floor((target_sz * (1 + padding))*search_size(i));
param0 = [pos(2), pos(1), tmp_sz(2)/window_sz(2), 0,...
tmp_sz(1)/window_sz(2)/(window_sz(1)/window_sz(2)),0];
param0 = affparam2mat(param0);
patch = uint8(warpimg(double(im), param0, window_sz));
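zf = fft2(get_features(patch, features, cell_size, cos_window,w2c));
% kernel correlation between the test sample and the learned model; this
% line is missing from the excerpt, and gaussian_correlation, model_xf and
% kernel.sigma are assumed names following the KCF reference code
kzf = gaussian_correlation(zf, model_xf, kernel.sigma);
response(:,:,i) = real(ifft2(model_alphaf .* kzf));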
end
[vert_delta,tmp, horiz_delta] = find(response == max(response(:)), 1);
szid = floor((tmp-1)/(size(cos_window,2)))+1;
horiz_delta = tmp - ((szid -1)* size(cos_window,2));
if vert_delta > size(zf,1) / 2,
vert_delta = vert_delta - size(zf,1);
end
if horiz_delta > size(zf,2) / 2, %same for horizontal axis
horiz_delta = horiz_delta - size(zf,2);
end
tmp_sz = floor((target_sz * (1 + padding))*search_size(szid));
current_size = tmp_sz(2)/window_sz(2);
pos = pos + current_size*cell_size * [vert_delta - 1, horiz_delta - 1];
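The index arithmetic after `find` is easier to follow on a tiny array. When `find` is asked for row and column indices on a 3-D array, it folds the second and third dimensions into a single column index, which the code above then splits back into a scale index and a horizontal position. The sketch below plants a peak at a known place and recovers it; the array sizes and the names `n_rows`, `n_cols`, `n_scales`, `vert` and `horiz` are illustrative.

```matlab
n_rows = 4;  n_cols = 5;  n_scales = 3;
response = zeros(n_rows, n_cols, n_scales);
response(2, 4, 3) = 1;                       % planted peak: row 2, col 4, scale 3

% with two index outputs, find folds dimensions 2 and 3 into one column index
[vert, tmp] = find(response == max(response(:)), 1);
szid  = floor((tmp - 1) / n_cols) + 1;       % scale index
horiz = tmp - (szid - 1) * n_cols;           % column within that scale
```

Here `tmp` comes out as 4 + (3 - 1) * 5 = 14, and the two formulas split it back into scale 3 and column 4. In the tracker code `n_cols` corresponds to `size(cos_window,2)`.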
Yang Li, Jianke Zhu, Steven C.H. Hoi
Kaihua Zhang, Lei Zhang, Qingshan Liu, David Zhang, and Ming-Hsuan Yang
European Conference on Computer Vision (ECCV 2014), pp. 127-141, Zurich, Switzerland, September, 2014.
Matlab
[Project](http://www.cvl.isy.liu.se/research/objrec/visualtracking/regvistrack/index.html)
Martin Danelljan, Fahad Shahbaz Khan, Michael Felsberg and Joost van de Weijer.
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014 (Oral).
Matlab
Chao Ma, Xiaokang Yang, Chongyang Zhang, and Ming-Hsuan Yang,
Project
[1] Yi Wu, Jongwoo Lim, and Ming-Hsuan Yang. Online object tracking: A benchmark. In CVPR, 2013.