This repository was archived by the owner on Mar 1, 2025. It is now read-only.

Adapting code for regression #192

Open
@logicatcore

Description


I am trying to adapt the Chinese handwriting example code to a regression task whose input is a binary image (very sparse, at most 0.5% non-zero pixels), and I am having trouble understanding some of the nomenclature and the role played by spatial_size.

Here is a sample of what my input looks like (it is a signal represented in binary form, with each non-zero pixel carrying 2 features, e.g. current and voltage values):
[image: sample of the binary input]

My target output should be a single scalar, and this:

self.spatial_size = self.sparseModel.input_spatial_size(torch.LongTensor([1]))

gives me a spatial_size of [11, 11], but my input image dimensions are (2000, 40) (I plan to vary the 40).
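As far as I can tell, input_spatial_size walks backwards from the requested output size through the layer stack, so the value it returns is dictated by the layers you stacked, not by your data. Here is a minimal pure-Python sketch of that arithmetic (the layer list below is hypothetical, not the actual Chinese-handwriting network, and this is a stand-in for the library's internals, not its real implementation):

```python
# Hypothetical sketch: invert the usual conv/pool output-size formula
#   out = (in - filter_size) // stride + 1
# layer by layer, from the last layer back to the first.
def input_spatial_size(out_size, layers):
    """Input spatial size required so that a stack of
    (filter_size, stride) layers produces `out_size`."""
    size = out_size
    for filter_size, stride in reversed(layers):
        size = (size - 1) * stride + filter_size
    return size

# Example: alternating conv(filter 3, stride 1) and pool(2, stride 2)
# blocks, asking for a 1x1 output, as in the snippet above:
layers = [(3, 1), (2, 2), (3, 1), (2, 2), (3, 1), (2, 2), (3, 1)]
print(input_spatial_size(1, layers))  # -> 38
```

Under this reading, the answer to the first doubt below would be yes: if you want the required input size to come out as (2000, 40), the filter sizes, strides, and number of pooling layers have to be chosen so the backward calculation lands there (or the input has to be padded up to whatever size the stack demands).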

My doubts are:

  • Should the number of max-pooling layers and submanifold convolution layers be picked/arranged so that the calculation in the above line of code yields my input dimensions?
  • In the following code, does dimension imply its value must be set to 2 for images and 3 for point clouds?
class InputLayer(Module):
    def __init__(self, dimension, spatial_size, mode=3):
  • In the following code, does nInputPlanes refer to the number of features a non-zero pixel/entry in the coords vector has (e.g. 3 for RGB values, 2 in my case)?
def SparseVggNet(dimension, nInputPlanes, layers):
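For concreteness, here is how I picture the input being assembled under that reading of nInputPlanes (a plain-Python stand-in with made-up coordinates and feature values, not actual SparseConvNet calls): dimension = 2 because the sites live on a 2-D grid, and nInputPlanes = 2 because each active site carries two feature channels.

```python
# Assumed interpretation: dimension = rank of the coordinate grid,
# nInputPlanes = number of feature channels per active site.
dimension = 2              # 2-D image grid
nInputPlanes = 2           # (current, voltage) per active pixel
spatial_size = (2000, 40)  # full image dimensions

# Each active (non-zero) pixel contributes one coordinate tuple of
# length `dimension` and one feature vector of length `nInputPlanes`.
coords = [(0, 5), (37, 12), (1999, 39)]           # made-up active sites
features = [[0.10, 3.3], [0.22, 3.1], [0.05, 2.9]]  # made-up [current, voltage]

assert all(len(c) == dimension for c in coords)
assert all(len(f) == nInputPlanes for f in features)
assert len(coords) == len(features)
```

If this interpretation is right, a 0.5%-dense (2000, 40) image would mean roughly 400 active sites, each with a length-2 feature vector.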

Any help or comments would be very helpful!!
