v1.0.5 -> optional input clamping
JacksonBurns committed Jun 20, 2024
1 parent 8f4f731 commit 75ed783
Showing 3 changed files with 18 additions and 2 deletions.
README.md (3 changes: 2 additions & 1 deletion)
@@ -53,7 +53,7 @@ See the `examples` and `benchmarks` directories to see how to run training - the
There are four distinct steps in `fastprop` that define its framework:
1. Featurization - transform the input molecules (as SMILES strings) into an array of molecular descriptors which are saved
2. Preprocessing - clean the descriptors by removing or imputing missing values then rescaling the remainder
-3. Training - send the processed input to the neural network, which is a simple FNN (sequential fully-connected layers with an activation function between)
+3. Training - send the processed input to the neural network, which is a simple FNN (sequential fully-connected layers with an activation function between), optionally limiting the inputs to +/-3 standard deviations to aid in extrapolation
4. Prediction - save the trained model for future use

## Configurable Parameters
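
A minimal sketch of the standardize-then-clamp idea behind steps 2 and 3, in plain PyTorch; the descriptor values, `means`, and `stds` below are placeholders, not fastprop internals:

```python
import torch

# Placeholder descriptors for three molecules; in fastprop these come from the featurization step.
descriptors = torch.tensor([[0.2, 51.0], [0.3, 4.0], [0.1, 7.0]])
means, stds = descriptors.mean(dim=0), descriptors.std(dim=0)
standardized = (descriptors - means) / stds

# An unseen molecule with an extreme descriptor value standardizes far outside the training range...
new = (torch.tensor([[0.9, 400.0]]) - means) / stds
# ...so clamping to +/-3 standard deviations keeps it in a range the FNN has actually seen.
print(torch.clamp(new, min=-3.0, max=3.0))  # tensor([[3., 3.]])
```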
@@ -76,6 +76,7 @@ There are four distinct steps in `fastprop` that define its framework:
_and_
- Number of FNN layers (default 2; repeated fully connected layers of hidden size)
- Hidden Size: number of neurons per FNN layer (default 1800)
+- Clamp Input: Enable/Disable input clamp to +/-3 to aid in extrapolation (default False).

_or_
- Hyperparameter optimization: runs hyperparameter optimization identify the optimal number of layers and hidden size
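
A plain-PyTorch sketch of how these knobs fit together, mirroring the layer-building loop in `fastprop/model.py` below; `build_fnn` is a hypothetical helper, the ReLU activation is an assumption (the shown hunk stops before the activation line), and the clamp is guarded by `clamp_input` here to reflect the option's `False` default:

```python
from collections import OrderedDict

import torch

from fastprop.model import ClampN  # class added in this commit


def build_fnn(input_size: int, hidden_size: int = 1800, fnn_layers: int = 2, clamp_input: bool = False) -> torch.nn.Sequential:
    # Hypothetical helper mirroring the OrderedDict loop in fastprop/model.py below.
    layers = OrderedDict()
    if clamp_input:  # optional input clamping, new in v1.0.5
        layers["clamp"] = ClampN(3)
    for i in range(fnn_layers):
        layers[f"lin{i+1}"] = torch.nn.Linear(input_size if i == 0 else hidden_size, hidden_size)
        if fnn_layers == 1 or i < (fnn_layers - 1):  # no output activation, unless single layer
            layers[f"act{i+1}"] = torch.nn.ReLU()  # assumed activation
    return torch.nn.Sequential(layers)


print(build_fnn(input_size=16, clamp_input=True))
```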
fastprop/model.py (15 changes: 15 additions & 0 deletions)
@@ -18,6 +18,18 @@
logger = init_logger(__name__)


+class ClampN(torch.nn.Module):
+    def __init__(self, n: float) -> None:
+        super().__init__()
+        self.n = n
+
+    def forward(self, batch: torch.Tensor):
+        return torch.clamp(batch, min=-self.n, max=self.n)
+
+    def extra_repr(self) -> str:
+        return f"n={self.n}"
+
+
class fastprop(pl.LightningModule):
    def __init__(
        self,
@@ -27,6 +39,7 @@ def __init__(
        num_tasks: int = 1,
        learning_rate: float = 0.001,
        fnn_layers: int = 2,
+        clamp_input: bool = False,
        problem_type: Literal["regression", "binary", "multiclass", "multilabel"] = "regression",
        target_names: List[str] = [],
        feature_means: Optional[torch.Tensor] = None,
@@ -43,6 +56,7 @@ def __init__(
            num_tasks (int, optional): Number of distinct tasks. Defaults to 1.
            learning_rate (float, optional): Learning rate for SGD. Defaults to 0.001.
            fnn_layers (int, optional): Number of hidden layers. Defaults to 2.
+            clamp_input (bool, optional): Clamp inputs to +/-3 to aid extrapolation. Defaults to False.
            problem_type (Literal["regression", "binary", "multiclass", "multilabel"], optional): Type of training task. Defaults to "regression".
            target_names (list[str], optional): Names for targets in dataset, blank for simple integer names. Defaults to [].
            feature_means (Optional[torch.Tensor], optional): Means for scaling features in regression. Defaults to None.
@@ -63,6 +77,7 @@ def __init__(

        # fully-connected nn
        layers = OrderedDict()
+        layers["clamp"] = ClampN(3)
        for i in range(fnn_layers):
            layers[f"lin{i+1}"] = torch.nn.Linear(input_size if i == 0 else hidden_size, hidden_size)
            if fnn_layers == 1 or i < (fnn_layers - 1):  # no output activation, unless single layer
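
A short usage sketch of the new `ClampN` module on its own, assuming fastprop v1.0.5 is installed so the class can be imported from `fastprop.model`:

```python
import torch

from fastprop.model import ClampN  # class added in this commit

clamp = ClampN(3)
print(clamp)  # ClampN(n=3), via extra_repr

x = torch.tensor([[0.5, -7.0, 2.0, 9.0]])  # second and fourth features fall outside +/-3
print(clamp(x))  # tensor([[ 0.5000, -3.0000,  2.0000,  3.0000]])
```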
pyproject.toml (2 changes: 1 addition & 1 deletion)
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

[project]
name = "fastprop"
version = "1.0.4"
version = "1.0.5"
authors = [
    { name = "Jackson Burns" },
]