Skip to content

trungle14/AWS-Sagemaker-ML-Pipeline

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 

Repository files navigation

Machine Learning Model - Cloud Computing and AWS

The abalone dataset has been used to predict the age of abalone from physical measurements. The age of abalone is determined by cutting the shell through the cone, staining it, and counting the number of rings through a microscope - - a boring and time-consuming task. Other measurements, which are easier to obtain, are used to predict the age. Further information, such as weather patterns and location(hence food availability) may be required to solve the problem.\

More information about the dataset is at: https:// archive.ics.uci.edu/ml/datasets/abalone

A copy of the data is available through sagemaker sample data files at s3:

// sagemaker-sample-files/datasets/tabular/uci_abalone/abalone.csv

Data Collection Collect the raw data from various sources (e.g., databases, CSV files on S3).

Data Preparation

Preprocess the data (cleaning, normalization, feature engineering). Split the data into training, validation, and test sets. Select a suitable machine learning algorithm for the problem at hand. Configure SageMaker Training Job

Set up the training job in SageMaker, specifying: The algorithm or bring your own model Resource requirements (instance type, count) Input and output data locations (usually S3 paths) Hyperparameters

Train the Model

Execute the training job in SageMaker. Monitor the training process through SageMaker console or CloudWatch.

Model Evaluation

Evaluate the trained model's performance using the test dataset. If the performance is unsatisfactory, return to step 3, 4, or 5 as needed. Model Deployment

Create a SageMaker model by specifying the S3 location of the trained model artifacts and the Docker container for inference. Deploy the model to a SageMaker endpoint for real-time or batch predictions. Monitor and Update

Monitor the model's performance in production using SageMaker monitoring tools. Update the model as necessary by retraining with new data or adjusting hyperparameters.

The process concludes when the model is successfully deployed and operating as expected.

Importance Notice

Iterative Process: Model development in SageMaker is iterative. Based on model evaluation results, you may need to adjust preprocessing steps, reselect algorithms, or retune hyperparameters.

Automation: Many steps can be automated using SageMaker's built-in features, such as Automatic Model Tuning for hyperparameter optimization, and SageMaker Pipelines for automating and orchestrating the entire ML workflow.

Integration: SageMaker integrates with other AWS services, like S3 for data storage, CloudWatch for monitoring, and IAM for access control, providing a comprehensive and secure ML lifecycle environment.

Screenshot 2024-02-07 at 11 03 11

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published