This repository was archived by the owner on Mar 1, 2025. It is now read-only.
Adapting code for regression #192
Open
Description
I am trying to adapt the Chinese handwriting code to a regression task with a binary image as input (very sparse; at most 0.5% non-zero pixels), and I am having trouble understanding some of the nomenclature and the role played by spatial_size.
Here is a sample of what my input looks like (it is a signal represented in binary form, with each non-zero pixel carrying 2 features, say current and voltage values).
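To make the setup concrete, here is a minimal plain-Python sketch (the helper name `to_sparse` is my own, not from the library) of how I turn the dense grid into the sparse locations/features form, assuming each active pixel carries two feature values:

```python
def to_sparse(grid):
    """Convert a dense H x W grid of (current, voltage) pairs into
    a sparse representation: a list of active coordinates and a
    parallel list of per-site feature vectors (2 features each).
    A pixel counts as active when either of its values is non-zero."""
    locations, features = [], []
    for y, row in enumerate(grid):
        for x, (current, voltage) in enumerate(row):
            if current != 0 or voltage != 0:
                locations.append([y, x])
                features.append([current, voltage])
    return locations, features

# Tiny 2 x 3 example with a single active site at (0, 1):
grid = [
    [(0.0, 0.0), (1.5, 0.2), (0.0, 0.0)],
    [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)],
]
locs, feats = to_sparse(grid)
print(locs, feats)  # -> [[0, 1]] [[1.5, 0.2]]
```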
My target output is a single scalar, and this line:

```python
self.spatial_size = self.sparseModel.input_spatial_size(torch.LongTensor([1]))
```

gives me a spatial_size of [11, 11], but my input image dimensions are (2000, 40) (I plan to vary the 40).
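If I understand correctly, input_spatial_size walks backwards through the network: for each convolution/pooling layer with filter (or pool) size f and stride s, the input size needed to produce a given output size is (output - 1) * s + f, while submanifold convolutions leave the spatial size unchanged. A plain-Python sketch of that backward pass (the layer list below is illustrative, not my actual network):

```python
def input_spatial_size(output_size, layers):
    """Backward pass over (filter_size, stride) layers, listed in
    forward order: the input size needed for `size` outputs of a
    layer is (size - 1) * stride + filter_size."""
    size = output_size
    for filter_size, stride in reversed(layers):
        size = (size - 1) * stride + filter_size
    return size

# Five 3x3 stride-1 convolutions: an output of size 1 would
# need an 11x11 input, matching the [11, 11] I am seeing.
print(input_spatial_size(1, [(3, 1)] * 5))  # -> 11
```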
My doubts are:
- Should the number of max-pooling layers and SC (submanifold convolution) layers be chosen and arranged so that the line of code above yields my input dimensions?
- In the following code, does dimension mean its value must be set to 2 for images and 3 for point clouds?

```python
class InputLayer(Module):
    def __init__(self, dimension, spatial_size, mode=3):
```
- In the following code, does nInputPlanes refer to the number of features each non-zero pixel/entry in the coords vector carries (e.g. 3 for RGB values, 2 in my case)?

```python
def SparseVggNet(dimension, nInputPlanes, layers):
```
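To state my assumption concretely: I take the features tensor to have shape (number_of_active_sites, nInputPlanes), so with 2 features per active pixel I would pass nInputPlanes=2. A plain-Python sanity check under that assumption (the helper is hypothetical, for illustration only):

```python
def check_features(features, n_input_planes):
    """Check that every active site carries exactly n_input_planes
    feature values (e.g. 3 for RGB, 2 for my current/voltage pairs)."""
    return all(len(row) == n_input_planes for row in features)

# Three active sites, each with a (current, voltage) pair:
features = [[0.8, 1.2], [0.1, 3.4], [2.2, 0.0]]
print(check_features(features, 2))  # -> True
print(check_features(features, 3))  # -> False
```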
Any help or comments would be greatly appreciated!