feat(mnist): implement adaptive neural network with dynamic architecture #54
Conversation
This commit introduces a new adaptive MNIST neural network demo that automatically optimizes its architecture and training parameters.

Key features:
- Dynamic network complexity adjustment based on training performance
- Automatic device selection (CPU/MPS) with benchmarking
- Adaptive batch size optimization
- Hardware-specific optimizations using PyTorch 2.0+
- Comprehensive performance visualization
- Advanced regularization techniques

Technical details:
- Implements AdaptiveNeuralNet with dynamic layer management
- Adds AdaptiveLearningSystem for automated optimization
- Includes performance monitoring and visualization tools
- Supports automatic device selection and benchmarking
- Implements adaptive learning rate and regularization

Testing: Manual testing completed with the MNIST dataset
Performance: Achieves >98% accuracy with automatic optimization
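The automatic device selection mentioned above corresponds to the benchmark_devices method listed in the class diagram further down. As a rough, illustrative sketch only (the timing loop and names here are assumptions, not the code in this PR), such a benchmark might look like:

import time
import torch

def benchmark_devices(model, sample_input, num_iterations=50):
    """Time a few forward passes on each available device and pick the fastest."""
    candidates = ["cpu"]
    if torch.backends.mps.is_available():
        candidates.append("mps")
    timings = {}
    for device in candidates:
        m = model.to(device)
        x = sample_input.to(device)
        start = time.perf_counter()
        with torch.no_grad():
            for _ in range(num_iterations):
                m(x)
        timings[device] = time.perf_counter() - start
    return min(timings, key=timings.get)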
Reviewer's Guide by Sourcery

This PR implements an adaptive neural network system for MNIST digit classification that automatically optimizes its architecture and training parameters. The implementation uses PyTorch and features dynamic network complexity adjustment, hardware-specific optimizations, and comprehensive performance monitoring. The system is built around two main classes: AdaptiveNeuralNet for the neural network architecture and AdaptiveLearningSystem for training optimization.

Sequence diagram for adaptive training process

sequenceDiagram
actor User
participant Main
participant AdaptiveLearningSystem
participant AdaptiveNeuralNet
User->>Main: Run adaptive MNIST demo
Main->>AdaptiveLearningSystem: Initialize with model and data loaders
AdaptiveLearningSystem->>AdaptiveNeuralNet: Move model to optimal device
AdaptiveLearningSystem->>AdaptiveNeuralNet: Compile model if supported
loop Train for each epoch
AdaptiveLearningSystem->>AdaptiveNeuralNet: Train one epoch
AdaptiveLearningSystem->>AdaptiveNeuralNet: Evaluate model
AdaptiveLearningSystem->>AdaptiveNeuralNet: Adapt model if needed
end
AdaptiveLearningSystem->>Main: Return training results
Main->>User: Display training progress and results
Class diagram for AdaptiveNeuralNet and AdaptiveLearningSystem

classDiagram
class AdaptiveNeuralNet {
+int input_size
+ModuleList layers
+dict training_history
+float dropout_rate
+float learning_rate
+int current_complexity
+__init__(int input_size, int initial_hidden_size)
+forward(Tensor x) Tensor
+add_complexity()
+add_regularization()
}
class AdaptiveLearningSystem {
+AdaptiveNeuralNet model
+str device
+int optimal_batch_size
+int plateau_threshold
+float improvement_threshold
+int max_complexity
+__init__(AdaptiveNeuralNet model, DataLoader train_loader, DataLoader test_loader)
+benchmark_devices(Module model, int num_iterations) str
+find_optimal_batch_size() int
+update_dataloader(Dataset dataset, bool train) DataLoader
+train_epoch() (float, float)
+calculate_loss() float
+evaluate() float
+check_plateau(list accuracies, float threshold) bool
+adapt_model(int epoch)
+train(int epochs) (list, list, list)
+plot_training_progress()
}
AdaptiveLearningSystem --> AdaptiveNeuralNet
File-Level Changes
Hey @leonvanbokhorst - I've reviewed your changes - here's some feedback:
Overall Comments:
- Please add comprehensive unit tests for the adaptation logic and core functionality (one possible starting point is sketched below); manual testing alone is insufficient at this level of complexity.
- Include documentation of performance benchmarks and test results to validate the adaptation strategy effectiveness.
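As a minimal pytest-style sketch of what such a test could look like, assuming the plateau check is (or is refactored into) a function that depends only on its arguments (this mirrors the logic under review rather than importing the PR's class directly):

def check_plateau(accuracies, threshold):
    # A plateau means every recent improvement is smaller than the threshold.
    recent_improvements = [
        accuracies[i + 1] - accuracies[i] for i in range(len(accuracies) - 1)
    ]
    return all(abs(imp) < threshold for imp in recent_improvements)

def test_detects_plateau_on_flat_accuracy():
    assert check_plateau([97.0, 97.01, 97.0, 97.02, 97.01], threshold=0.1)

def test_no_plateau_on_improving_accuracy():
    assert not check_plateau([90.0, 92.0, 94.0, 96.0, 98.0], threshold=0.1)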
Here's what I looked at during the review
- 🟡 General issues: 3 issues found
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟡 Complexity: 2 issues found
- 🟢 Documentation: all looks good
if self.model.current_complexity < self.max_complexity:
    self.model.add_complexity()
    # Smaller learning rate increase
    self.model.learning_rate *= 1.1
suggestion (performance): Consider using a more sophisticated learning rate adjustment strategy
The current fixed multipliers (1.1 for complexity increase, 0.98 for decay) could lead to unstable training. Consider implementing a learning rate scheduler like ReduceLROnPlateau or CosineAnnealingLR for more stable adaptation.
self.scheduler = optim.lr_scheduler.ReduceLROnPlateau(self.optimizer, mode='min', factor=0.1, patience=5)
self.model.learning_rate = self.optimizer.param_groups[0]['lr']
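For this to take effect, the scheduler would also need to be stepped once per epoch with the monitored metric; a hedged sketch of that step (variable names assumed from the surrounding class):

# inside the per-epoch loop, after evaluation
val_loss = self.calculate_loss()
self.scheduler.step(val_loss)
self.model.learning_rate = self.optimizer.param_groups[0]["lr"]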
raise ValueError("No Linear layer found in network")

current_hidden_size = last_linear.in_features
new_hidden_size = current_hidden_size * 2
issue (performance): Add memory safety checks when increasing network complexity
Doubling the hidden size could cause out-of-memory errors on GPU/MPS devices. Consider adding a try-except block and fallback mechanism when memory allocation fails.
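One possible shape for that fallback, assuming the adaptation happens where add_complexity() is called (the rollback step is illustrative, not part of the PR):

try:
    self.model.add_complexity()
    self.model.learning_rate *= 1.1
except RuntimeError as exc:
    # Allocation failed (e.g. MPS/GPU out of memory): keep the current
    # architecture and fall back to a gentler learning-rate decay instead.
    print(f"Skipping complexity increase, allocation failed: {exc}")
    for param_group in self.optimizer.param_groups:
        param_group["lr"] *= 0.98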
print("\nBenchmarking batch sizes:")
for batch in batch_sizes:
    batched_input = sample_input.repeat(batch, 1, 1, 1)
issue: Add error handling for batch size testing
Wrap batch size testing in try-except blocks to gracefully handle out-of-memory errors and skip unsupported batch sizes.
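A hedged sketch of how that could look around the benchmarking loop (timing details omitted; variable names follow the snippet above):

print("\nBenchmarking batch sizes:")
for batch in batch_sizes:
    try:
        batched_input = sample_input.repeat(batch, 1, 1, 1)
        with torch.no_grad():
            model(batched_input)
    except RuntimeError as exc:
        # Out-of-memory (or otherwise unsupported) batch sizes are skipped
        # rather than aborting the whole search.
        print(f"Skipping batch size {batch}: {exc}")
        continue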
]
return all(abs(imp) < threshold for imp in recent_improvements)

def adapt_model(self, epoch):
issue (complexity): Consider extracting the adaptation logic into a dedicated strategy class to improve code organization.
The adapt_model method contains complex nested conditionals that make the adaptation logic hard to follow. Consider extracting this into a separate AdaptationStrategy class:
class AdaptationStrategy:
    def __init__(self, max_complexity=4, accuracy_threshold=98.0):
        self.max_complexity = max_complexity
        self.accuracy_threshold = accuracy_threshold

    def should_adapt(self, current_acc, avg_recent_acc, epoch, complexity):
        return (
            current_acc < self.accuracy_threshold
            and current_acc <= avg_recent_acc
            and epoch > complexity * 4
        )

    def get_adaptation(self, model, optimizer, current_complexity):
        if current_complexity < self.max_complexity:
            return self.increase_complexity(model, optimizer)
        return self.adjust_learning_rate(optimizer)

    def increase_complexity(self, model, optimizer):
        model.add_complexity()
        model.learning_rate *= 1.1
        return optim.Adam(
            model.parameters(),
            lr=model.learning_rate,
            weight_decay=1e-5
        )

    def adjust_learning_rate(self, optimizer):
        for param_group in optimizer.param_groups:
            param_group["lr"] *= 0.98
        return optimizer
This simplifies the adapt_model method to:
def adapt_model(self, epoch):
    if len(self.test_accuracies) < 5:
        return

    recent_accuracies = self.test_accuracies[-5:]
    current_acc = self.test_accuracies[-1]
    avg_recent_acc = sum(recent_accuracies) / len(recent_accuracies)

    if self.strategy.should_adapt(current_acc, avg_recent_acc,
                                  epoch, self.model.current_complexity):
        self.optimizer = self.strategy.get_adaptation(
            self.model, self.optimizer, self.model.current_complexity)
        self.adaptation_points.append((epoch, "Adapted Model"))
This improves maintainability by:
- Separating adaptation rules from execution
- Making thresholds and strategies configurable
- Reducing nesting depth
- Making the adaptation logic easier to test
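Wiring the strategy in would then be a one-line addition to the constructor (a sketch; the attribute name is assumed):

# in AdaptiveLearningSystem.__init__
self.strategy = AdaptationStrategy(max_complexity=self.max_complexity)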
    x = layer(x)
return x

def add_complexity(self):
issue (complexity): Consider using a pre-initialized maximum architecture with neuron activation control instead of dynamically rebuilding layers.
The dynamic layer rebuilding in add_complexity() introduces unnecessary complexity. Consider using a simpler activation-based approach with a pre-initialized maximum architecture:
class AdaptiveNeuralNet(nn.Module):
    def __init__(self, input_size=784, max_hidden_size=512):
        super().__init__()
        self.input_size = input_size
        self.flatten = nn.Flatten()
        # Initialize maximum architecture but only activate part initially
        self.hidden_layers = nn.ModuleList([
            nn.Sequential(
                nn.Linear(input_size, max_hidden_size),
                nn.BatchNorm1d(max_hidden_size),
                nn.ReLU()
            )
        ])
        self.output = nn.Linear(max_hidden_size, 10)
        self.active_hidden = max_hidden_size // 8  # Start with smaller size

    def forward(self, x):
        x = self.flatten(x)
        for layer in self.hidden_layers:
            x = layer(x)
            # Only use the active portion of neurons: zero out the inactive
            # ones so the fixed-size output layer still receives
            # max_hidden_size features (slicing here would break its shape)
            mask = torch.zeros_like(x)
            mask[:, :self.active_hidden] = 1.0
            x = x * mask
        return self.output(x)

    def add_complexity(self):
        # Simply activate more neurons
        self.active_hidden = min(
            self.active_hidden * 2,
            self.hidden_layers[0][0].out_features
        )
This approach:
- Maintains adaptivity while being more maintainable
- Eliminates complex layer rebuilding
- Reduces potential for errors
- Makes the code more predictable
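For example, with max_hidden_size=512 the active width would grow 64 → 128 → 256 → 512 across successive add_complexity() calls and then stay capped; a quick illustrative check (using the sketch above):

net = AdaptiveNeuralNet(input_size=784, max_hidden_size=512)
print(net.active_hidden)   # 64
net.add_complexity()
print(net.active_hidden)   # 128
net.add_complexity()
net.add_complexity()
net.add_complexity()
print(net.active_hidden)   # 512, capped at max_hidden_size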
Summary by Sourcery
Implement an adaptive neural network for MNIST digit classification that dynamically adjusts its architecture and training parameters based on performance. The system includes features such as automatic device selection, adaptive batch size optimization, and hardware-specific optimizations. Performance visualization tools are also integrated to monitor training progress.