
feat(mnist): implement adaptive neural network with dynamic architecture #54

Merged
leonvanbokhorst merged 1 commit into main from adaptive-misnt on Nov 15, 2024

Conversation


@leonvanbokhorst leonvanbokhorst commented Nov 15, 2024

This commit introduces a new adaptive MNIST neural network demo that automatically
optimizes its architecture and training parameters. Key features include:

  • Dynamic network complexity adjustment based on training performance
  • Automatic device selection (CPU/MPS) with benchmarking
  • Adaptive batch size optimization
  • Hardware-specific optimizations using PyTorch 2.0+
  • Comprehensive performance visualization
  • Advanced regularization techniques

Technical details:

  • Implements AdaptiveNeuralNet with dynamic layer management
  • Adds AdaptiveLearningSystem for automated optimization
  • Includes performance monitoring and visualization tools
  • Supports automatic device selection and benchmarking
  • Implements adaptive learning rate and regularization

Testing: Manual testing completed with MNIST dataset
Performance: Achieves >98% accuracy with automatic optimization

Summary by Sourcery

Implement an adaptive neural network for MNIST digit classification that dynamically adjusts its architecture and training parameters based on performance. The system includes features such as automatic device selection, adaptive batch size optimization, and hardware-specific optimizations. Performance visualization tools are also integrated to monitor training progress.

New Features:

  • Introduce an adaptive MNIST neural network demo that automatically optimizes its architecture and training parameters.

Enhancements:

  • Implement dynamic network complexity adjustment based on training performance.
  • Add automatic device selection and benchmarking for optimal hardware usage.
  • Incorporate adaptive batch size optimization for improved training efficiency.
  • Utilize hardware-specific optimizations using PyTorch 2.0+ for enhanced performance.
  • Include comprehensive performance visualization tools.

Tests:

  • Conduct manual testing with the MNIST dataset to ensure functionality and performance.

sourcery-ai bot commented Nov 15, 2024

Reviewer's Guide by Sourcery

This PR implements an adaptive neural network system for MNIST digit classification that automatically optimizes its architecture and training parameters. The implementation uses PyTorch and features dynamic network complexity adjustment, hardware-specific optimizations, and comprehensive performance monitoring. The system is built around two main classes: AdaptiveNeuralNet for the neural network architecture and AdaptiveLearningSystem for training optimization.

Sequence diagram for adaptive training process

sequenceDiagram
    actor User
    participant Main
    participant AdaptiveLearningSystem
    participant AdaptiveNeuralNet
    User->>Main: Run adaptive MNIST demo
    Main->>AdaptiveLearningSystem: Initialize with model and data loaders
    AdaptiveLearningSystem->>AdaptiveNeuralNet: Move model to optimal device
    AdaptiveLearningSystem->>AdaptiveNeuralNet: Compile model if supported
    loop Train for each epoch
        AdaptiveLearningSystem->>AdaptiveNeuralNet: Train one epoch
        AdaptiveLearningSystem->>AdaptiveNeuralNet: Evaluate model
        AdaptiveLearningSystem->>AdaptiveNeuralNet: Adapt model if needed
    end
    AdaptiveLearningSystem->>Main: Return training results
    Main->>User: Display training progress and results

Class diagram for AdaptiveNeuralNet and AdaptiveLearningSystem

classDiagram
    class AdaptiveNeuralNet {
        +int input_size
        +ModuleList layers
        +dict training_history
        +float dropout_rate
        +float learning_rate
        +int current_complexity
        +__init__(int input_size, int initial_hidden_size)
        +forward(Tensor x) Tensor
        +add_complexity()
        +add_regularization()
    }
    class AdaptiveLearningSystem {
        +AdaptiveNeuralNet model
        +str device
        +int optimal_batch_size
        +int plateau_threshold
        +float improvement_threshold
        +int max_complexity
        +__init__(AdaptiveNeuralNet model, DataLoader train_loader, DataLoader test_loader)
        +benchmark_devices(Module model, int num_iterations) str
        +find_optimal_batch_size() int
        +update_dataloader(Dataset dataset, bool train) DataLoader
        +train_epoch() (float, float)
        +calculate_loss() float
        +evaluate() float
        +check_plateau(list accuracies, float threshold) bool
        +adapt_model(int epoch)
        +train(int epochs) (list, list, list)
        +plot_training_progress()
    }
    AdaptiveLearningSystem --> AdaptiveNeuralNet
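
Based on the constructor and method signatures in the class diagram, the demo's entry point presumably wires the two classes together roughly like this. A minimal sketch only: the import path, batch sizes, hidden size, and epoch count are illustrative and not taken from the PR.

# Minimal wiring sketch inferred from the class diagram above; the real
# pocs/adaptive_mnist_demo.py may structure this differently.
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Assumed import path; the classes live in the PR's demo script.
from pocs.adaptive_mnist_demo import AdaptiveNeuralNet, AdaptiveLearningSystem

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),  # standard MNIST mean/std
])
train_ds = datasets.MNIST("data", train=True, download=True, transform=transform)
test_ds = datasets.MNIST("data", train=False, download=True, transform=transform)

train_loader = DataLoader(train_ds, batch_size=64, shuffle=True)
test_loader = DataLoader(test_ds, batch_size=256)

model = AdaptiveNeuralNet(input_size=28 * 28, initial_hidden_size=128)
system = AdaptiveLearningSystem(model, train_loader, test_loader)

# Per the diagram, train() returns three lists (losses, accuracies, adaptation points).
losses, accuracies, adaptations = system.train(epochs=20)
system.plot_training_progress()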

File-Level Changes

Change: Implementation of adaptive neural network architecture
Files: pocs/adaptive_mnist_demo.py
Details:
  • Created base neural network with dynamic layer management
  • Added complexity increase mechanism that doubles hidden layer size
  • Implemented adaptive regularization with adjustable dropout rate
  • Added performance history tracking for adaptation decisions

Change: Implementation of adaptive learning system with hardware optimization
Files: pocs/adaptive_mnist_demo.py
Details:
  • Added automatic device selection between CPU and MPS (see the benchmark sketch after this table)
  • Implemented batch size optimization through benchmarking
  • Added PyTorch 2.0+ compilation optimization support
  • Created parallel data loading with pinned memory optimization

Change: Implementation of training and adaptation logic
Files: pocs/adaptive_mnist_demo.py
Details:
  • Added plateau detection for architecture adaptation
  • Implemented dynamic learning rate adjustment
  • Created comprehensive training loop with metrics tracking
  • Added visualization system for training progress

Change: Added data augmentation and optimization configurations
Files: pocs/adaptive_mnist_demo.py
Details:
  • Implemented MNIST dataset loading with transformations
  • Added random affine transformations for training data
  • Configured CPU thread optimization
  • Enabled cuDNN benchmarking
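
The automatic CPU/MPS selection listed above boils down to timing forward passes on each candidate device. A rough sketch of what such a benchmark could look like; the function name follows benchmark_devices in the class diagram, but the timing loop itself is illustrative rather than the PR's actual code.

import time
import torch
import torch.nn as nn

def benchmark_devices(model: nn.Module, num_iterations: int = 50) -> str:
    """Time dummy forward passes on each available device and return the fastest one."""
    candidates = ["cpu"]
    if torch.backends.mps.is_available():
        candidates.append("mps")

    sample = torch.randn(64, 1, 28, 28)  # MNIST-shaped dummy batch
    timings = {}
    with torch.no_grad():
        for device in candidates:
            dev_model = model.to(device)
            dev_input = sample.to(device)
            dev_model(dev_input)  # warm-up so one-time setup costs are excluded
            start = time.perf_counter()
            for _ in range(num_iterations):
                dev_model(dev_input)
            timings[device] = time.perf_counter() - start
    return min(timings, key=timings.get)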

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time. You can also use
    this command to specify where the summary should be inserted.

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.


@leonvanbokhorst leonvanbokhorst self-assigned this Nov 15, 2024
@leonvanbokhorst leonvanbokhorst added the documentation (Improvements or additions to documentation) and enhancement (New feature or request) labels Nov 15, 2024
@leonvanbokhorst leonvanbokhorst added this to the Phase 1 milestone Nov 15, 2024
@leonvanbokhorst leonvanbokhorst merged commit ee64284 into main Nov 15, 2024
1 check failed
@leonvanbokhorst leonvanbokhorst deleted the adaptive-misnt branch November 15, 2024 16:28

@sourcery-ai sourcery-ai bot left a comment

Hey @leonvanbokhorst - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Please add comprehensive unit tests for the adaptation logic and core functionality; manual testing alone is insufficient for this complexity level (a sketch of one possible test follows below).
  • Include documentation of performance benchmarks and test results to validate the adaptation strategy effectiveness.
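
A hedged illustration of what one such test could look like, assuming the constructor signature from the class diagram above and that add_complexity rebuilds the hidden layers while keeping the 10-class output head; the import path is a guess.

# Hypothetical pytest-style check; assumes AdaptiveNeuralNet flattens 28x28 inputs
# and that add_complexity preserves the 10-class output head.
import torch
from pocs.adaptive_mnist_demo import AdaptiveNeuralNet  # assumed import path

def test_add_complexity_preserves_io_shape():
    model = AdaptiveNeuralNet(input_size=784, initial_hidden_size=64)
    model.eval()  # avoid BatchNorm running-stat updates during the check
    x = torch.randn(8, 1, 28, 28)
    before = model(x)
    model.add_complexity()
    model.eval()
    after = model(x)
    # The adapted network must still map an MNIST batch to 10 class logits.
    assert before.shape == after.shape == (8, 10)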
Here's what I looked at during the review
  • 🟡 General issues: 3 issues found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 2 issues found
  • 🟢 Documentation: all looks good


if self.model.current_complexity < self.max_complexity:
    self.model.add_complexity()
    # Smaller learning rate increase
    self.model.learning_rate *= 1.1

suggestion (performance): Consider using a more sophisticated learning rate adjustment strategy

The current fixed multipliers (1.1 for complexity increase, 0.98 for decay) could lead to unstable training. Consider implementing a learning rate scheduler like ReduceLROnPlateau or CosineAnnealingLR for more stable adaptation.

                    self.scheduler = optim.lr_scheduler.ReduceLROnPlateau(self.optimizer, mode='min', factor=0.1, patience=5)
                    self.model.learning_rate = self.optimizer.param_groups[0]['lr']
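
For the scheduler to take effect it also has to be stepped with the monitored metric once per epoch. A minimal sketch using the method names from the class diagram above, not necessarily the PR's exact training loop:

# Illustrative epoch loop; train_epoch and calculate_loss follow the class diagram.
for epoch in range(epochs):
    train_loss, train_acc = self.train_epoch()
    val_loss = self.calculate_loss()
    self.scheduler.step(val_loss)  # ReduceLROnPlateau reacts to the monitored metric
    # Keep the model's bookkeeping copy of the learning rate in sync.
    self.model.learning_rate = self.optimizer.param_groups[0]["lr"]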

raise ValueError("No Linear layer found in network")

current_hidden_size = last_linear.in_features
new_hidden_size = current_hidden_size * 2

issue (performance): Add memory safety checks when increasing network complexity

Doubling the hidden size could cause out-of-memory errors on GPU/MPS devices. Consider adding a try-except block and fallback mechanism when memory allocation fails.
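
One way that guard could look, as a rough sketch: safely_add_complexity, the dummy probe batch, and the MPS cache call are illustrative additions, not code from this PR.

import torch

def safely_add_complexity(system):
    """Grow the network, but skip the growth if the device cannot allocate it.

    Sketch only: a complete version would also roll the architecture back to its
    previous size when the probe fails after add_complexity() has already run.
    """
    try:
        system.model.add_complexity()
        system.model.to(system.device)
        # Probe with one dummy forward pass so an allocation failure surfaces
        # here rather than in the middle of the next training epoch.
        with torch.no_grad():
            system.model(torch.randn(2, 1, 28, 28, device=system.device))
        return True
    except RuntimeError as err:
        if "out of memory" not in str(err).lower():
            raise
        if system.device == "mps":
            torch.mps.empty_cache()  # PyTorch 2.0+: release cached MPS allocations
        print("Skipping complexity increase: not enough device memory")
        return False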


print("\nBenchmarking batch sizes:")
for batch in batch_sizes:
batched_input = sample_input.repeat(batch, 1, 1, 1)

issue: Add error handling for batch size testing

Wrap batch size testing in try-except blocks to gracefully handle out-of-memory errors and skip unsupported batch sizes.
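
A sketch of that guard; batch_sizes, sample_input, self.model, and self.device stand in for the PR's actual benchmarking method, and time/torch are assumed to be imported at module level.

# Drop-in replacement for the quoted loop above (illustrative, not the PR's code).
results = {}
print("\nBenchmarking batch sizes:")
for batch in batch_sizes:
    try:
        batched_input = sample_input.repeat(batch, 1, 1, 1).to(self.device)
        with torch.no_grad():
            start = time.perf_counter()
            self.model(batched_input)
        results[batch] = time.perf_counter() - start
    except RuntimeError as err:
        if "out of memory" not in str(err).lower():
            raise
        print(f"  batch size {batch}: skipped (out of memory)")
        break  # larger batch sizes will only need more memory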

        ]
        return all(abs(imp) < threshold for imp in recent_improvements)

    def adapt_model(self, epoch):

issue (complexity): Consider extracting the adaptation logic into a dedicated strategy class to improve code organization.

The adapt_model method contains complex nested conditionals that make the adaptation logic hard to follow. Consider extracting this into a separate AdaptationStrategy class:

import torch.optim as optim

class AdaptationStrategy:
    def __init__(self, max_complexity=4, accuracy_threshold=98.0):
        self.max_complexity = max_complexity
        self.accuracy_threshold = accuracy_threshold

    def should_adapt(self, current_acc, avg_recent_acc, epoch, complexity):
        return (
            current_acc < self.accuracy_threshold
            and current_acc <= avg_recent_acc
            and epoch > complexity * 4
        )

    def get_adaptation(self, model, optimizer, current_complexity):
        if current_complexity < self.max_complexity:
            return self.increase_complexity(model, optimizer)
        return self.adjust_learning_rate(optimizer)

    def increase_complexity(self, model, optimizer):
        model.add_complexity()
        model.learning_rate *= 1.1
        return optim.Adam(
            model.parameters(),
            lr=model.learning_rate,
            weight_decay=1e-5
        )

    def adjust_learning_rate(self, optimizer):
        for param_group in optimizer.param_groups:
            param_group["lr"] *= 0.98
        return optimizer

This simplifies the adapt_model method to:

def adapt_model(self, epoch):
    if len(self.test_accuracies) < 5:
        return

    recent_accuracies = self.test_accuracies[-5:]
    current_acc = self.test_accuracies[-1]
    avg_recent_acc = sum(recent_accuracies) / len(recent_accuracies)

    if self.strategy.should_adapt(current_acc, avg_recent_acc, 
                                epoch, self.model.current_complexity):
        self.optimizer = self.strategy.get_adaptation(
            self.model, self.optimizer, self.model.current_complexity)
        self.adaptation_points.append((epoch, "Adapted Model"))

This improves maintainability by:

  1. Separating adaptation rules from execution
  2. Making thresholds and strategies configurable
  3. Reducing nesting depth
  4. Making the adaptation logic easier to test
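
If this refactor were adopted, the strategy would presumably be created once when the learning system is initialized; a hypothetical wiring sketch:

# Hypothetical lines inside AdaptiveLearningSystem.__init__ if the refactor is adopted.
self.strategy = AdaptationStrategy(max_complexity=4, accuracy_threshold=98.0)
self.adaptation_points = []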

            x = layer(x)
        return x

    def add_complexity(self):

issue (complexity): Consider using a pre-initialized maximum architecture with neuron activation control instead of dynamically rebuilding layers.

The dynamic layer rebuilding in add_complexity() introduces unnecessary complexity. Consider using a simpler activation-based approach with a pre-initialized maximum architecture:

import torch
import torch.nn as nn

class AdaptiveNeuralNet(nn.Module):
    def __init__(self, input_size=784, max_hidden_size=512):
        super().__init__()
        self.input_size = input_size
        self.flatten = nn.Flatten()

        # Initialize maximum architecture but only activate part initially
        self.hidden_layers = nn.ModuleList([
            nn.Sequential(
                nn.Linear(input_size, max_hidden_size),
                nn.BatchNorm1d(max_hidden_size),
                nn.ReLU()
            )
        ])
        self.output = nn.Linear(max_hidden_size, 10)
        self.active_hidden = max_hidden_size // 8  # Start with smaller size

    def forward(self, x):
        x = self.flatten(x)
        # Only use the active portion of each layer: zero out the inactive
        # neurons instead of slicing, so the feature dimension stays compatible
        # with the fixed-size output layer.
        for layer in self.hidden_layers:
            x = layer(x)
            mask = torch.zeros_like(x)
            mask[:, :self.active_hidden] = 1
            x = x * mask
        return self.output(x)

    def add_complexity(self):
        # Simply activate more neurons
        self.active_hidden = min(
            self.active_hidden * 2,
            self.hidden_layers[0][0].out_features
        )

This approach:

  1. Maintains adaptivity while being more maintainable
  2. Eliminates complex layer rebuilding
  3. Reduces potential for errors
  4. Makes the code more predictable
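
As a usage illustration of the sketch above, with its defaults of max_hidden_size=512 and an initial active width of 512 // 8 = 64:

# The active width doubles on each adaptation and then stays capped at the
# pre-allocated maximum of 512 units.
model = AdaptiveNeuralNet()
print(model.active_hidden)  # 64
model.add_complexity()
print(model.active_hidden)  # 128
model.add_complexity()
model.add_complexity()
model.add_complexity()
print(model.active_hidden)  # 512 (capped)

The trade-off is that parameters for the full 512-unit layer are allocated up front, so capacity grows while memory use stays constant.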
