diff --git a/contents/ml_systems/ml_systems.qmd b/contents/ml_systems/ml_systems.qmd
index d677bc43..9eeda438 100644
--- a/contents/ml_systems/ml_systems.qmd
+++ b/contents/ml_systems/ml_systems.qmd
@@ -12,7 +12,7 @@ Resources: [Slides](#sec-ml-systems-resource), [Videos](#sec-ml-systems-resource
 
 Machine learning (ML) systems, built on the foundation of computing systems, hold the potential to transform our world. These systems, with their specialized roles and real-time computational capabilities, represent a critical junction where data and computation meet on a micro-scale. They are specifically tailored to optimize performance, energy usage, and spatial efficiency—key factors essential for the successful implementation of ML systems.
 
-As this chapter progresses, we will explore embedded systems' complex and fascinating world. We'll gain insights into their structural design and operational features and understand their key role in powering ML applications. Starting with the basics of microcontroller units, we will examine the interfaces and peripherals that improve their functionalities. This chapter is designed to be a comprehensive guide elucidating the nuanced aspects of embedded systems within the ML systems framework.
+As this chapter progresses, we will explore the complex and fascinating world of ML systems. We'll gain insights into their structural design and operational features and understand their key role in powering ML applications. Beginning with the spectrum of deployment options, we will examine how Cloud ML, Edge ML, and TinyML differ in their capabilities and constraints. This chapter is designed to be a comprehensive guide to the nuanced aspects of ML systems.
 
 ::: {.callout-tip}
@@ -24,19 +24,39 @@ As this chapter progresses, we will explore embedded systems' complex and fascin
 
 * Examine the interfaces, power management, and real-time operating characteristics essential for efficient ML systems alongside energy efficiency, reliability, and security considerations.
 
-* Investigate the distinctions, benefits, challenges, and use cases for Cloud ML, Edge ML, and TinyML, emphasizing selecting the appropriate machine learning approach based on specific application needs and the evolving landscape of embedded systems in machine learning.
+* Investigate the distinctions, benefits, challenges, and use cases for Cloud ML, Edge ML, and TinyML, emphasizing how to select the appropriate machine learning approach based on specific application needs and the evolving landscape of ML systems.
 
 :::
 
 ## Introduction
 
-ML is rapidly evolving, with new paradigms reshaping how models are developed, trained, and deployed. One such paradigm is embedded machine learning, which is experiencing significant innovation driven by the proliferation of smart sensors, edge devices, and microcontrollers. Embedded machine learning refers to the integration of machine learning algorithms into the hardware of a device, enabling real-time data processing and analysis without relying on cloud connectivity. This chapter explores the landscape of embedded machine learning, covering the key approaches of Cloud ML, Edge ML, and TinyML (@fig-cloud-edge-tinyml-comparison).
+ML is rapidly evolving, with new paradigms reshaping how models are developed, trained, and deployed. The field is experiencing significant innovation driven by advancements in hardware, software, and algorithmic techniques. These developments are enabling machine learning to be applied in diverse settings, from large-scale cloud infrastructures to edge devices and even tiny, resource-constrained environments.
+
+Modern machine learning systems span a spectrum of deployment options, each with its own set of characteristics and use cases. At one end, we have cloud-based ML, which leverages powerful centralized computing resources for complex, data-intensive tasks. Moving along the spectrum, we encounter edge ML, which brings computation closer to the data source for reduced latency and improved privacy. At the far end, we find TinyML, which enables machine learning on extremely low-power devices with severe memory and processing constraints.
+
+This chapter explores the landscape of contemporary machine learning systems, covering the key approaches of Cloud ML, Edge ML, and TinyML (@fig-cloud-edge-tinyml-comparison). We'll examine the unique characteristics, advantages, and challenges of each approach, as well as the emerging trends and technologies that are shaping the future of machine learning deployment.
 
 ![Cloud vs. Edge vs. TinyML: The Spectrum of Distributed Intelligence. Source: ABI Research -- TinyML.](images/png/cloud-edge-tiny.png){#fig-cloud-edge-tinyml-comparison}
 
-ML began with Cloud ML, where powerful servers in the cloud were used to train and run large ML models. However, as the need for real-time, low-latency processing grew, Edge ML emerged, bringing inference capabilities closer to the data source on edge devices such as smartphones. The latest development in this progression is TinyML, which enables ML models to run on extremely resource-constrained microcontrollers and small embedded systems. TinyML allows for on-device inference without relying on connectivity to the cloud or edge, opening up new possibilities for intelligent, battery-operated devices.
+The evolution of machine learning systems can be seen as a progression from centralized to distributed computing paradigms:
+
+1. **Cloud ML:** Initially, ML was predominantly cloud-based. Powerful servers in data centers were used to train and run large ML models. This approach leverages vast computational resources and storage capacities, enabling the development of complex models trained on massive datasets. Cloud ML excels at tasks requiring extensive processing power and is ideal for applications where real-time responsiveness isn't critical.
+
+2. **Edge ML:** As the need for real-time, low-latency processing grew, Edge ML emerged. This paradigm brings inference capabilities closer to the data source, typically on edge devices such as smartphones, smart cameras, or IoT gateways. Edge ML reduces latency, enhances privacy by keeping data local, and can operate with intermittent cloud connectivity. It's particularly useful for applications requiring quick responses or handling sensitive data.
+
+3. **TinyML:** The latest development in this progression is TinyML, which enables ML models to run on extremely resource-constrained microcontrollers and small embedded systems. TinyML allows for on-device inference without relying on connectivity to the cloud or edge, opening up new possibilities for intelligent, battery-operated devices. This approach is crucial for applications where size, power consumption, and cost are critical factors.
+
+Each of these paradigms has its own strengths and is suited to different use cases:
+
+- Cloud ML remains essential for tasks requiring massive computational power or large-scale data analysis.
+- Edge ML is ideal for applications needing low-latency responses or local data processing.
+- TinyML enables AI capabilities in small, power-efficient devices, expanding the reach of ML to new domains.
+
+The progression from Cloud to Edge to TinyML reflects a broader trend in computing towards more distributed, localized processing. This evolution is driven by the need for faster response times, improved privacy, reduced bandwidth usage, and the ability to operate in environments with limited or no connectivity.
+
+@fig-vMLsizes illustrates the key differences between Cloud ML, Edge ML, and TinyML in terms of hardware, latency, connectivity, power requirements, and model complexity. As we move from Cloud to Edge to TinyML, we see a dramatic reduction in available resources, which presents significant challenges for deploying sophisticated machine learning models.
 
-@fig-vMLsizes shows the key differences between Cloud ML, Edge ML, and TinyML in terms of hardware, latency, connectivity, power requirements, and model complexity. This significant disparity in available resources poses challenges when attempting to deploy deep learning models on microcontrollers, as these models often require substantial memory and storage. For instance, widely used deep learning models such as ResNet-50 exceed the resource limits of microcontrollers by a factor of around 100, while more efficient models like MobileNet-V2 still surpass these constraints by a factor of approximately 20. Even when quantized to use 8-bit integers (int8) for reduced memory usage, MobileNetV2 requires more than 5 times the memory typically available on a microcontroller, making it difficult to fit the model on these tiny devices.
+This resource disparity becomes particularly apparent when attempting to deploy deep learning models on microcontrollers, the primary hardware platform for TinyML. These tiny devices have severely constrained memory and storage capacities, which are often insufficient for conventional deep learning models. The back-of-envelope sketch following @fig-vMLsizes puts these constraints into perspective.
 
 ![From cloud GPUs to microcontrollers: Navigating the memory and storage landscape across computing devices. Source: [@lin2023tiny]](./images/jpg/cloud_mobile_tiny_sizes.jpg){#fig-vMLsizes}
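+
+To make this disparity concrete, here is a minimal back-of-envelope sketch comparing the weight storage of two well-known vision models against a microcontroller's capacity. The parameter counts are approximate published figures, and the 1 MB flash budget is an illustrative assumption about a "typical" microcontroller; real devices vary, and the figures in @lin2023tiny also account for activation memory at runtime.
+
+```python
+# Rough weight-storage estimates (illustrative assumptions, not
+# measurements). Activation memory, the ML runtime, and application
+# code add further pressure beyond what is shown here.
+
+MCU_FLASH = 1 * 1024**2          # assumed flash budget: 1 MB
+
+models = {
+    "ResNet-50":    25.6e6,      # ~25.6 M parameters
+    "MobileNet-V2":  3.4e6,      # ~3.4 M parameters
+}
+
+for name, params in models.items():
+    fp32_bytes = params * 4      # float32: 4 bytes per weight
+    int8_bytes = params * 1      # int8: 1 byte per weight
+    print(f"{name}: fp32 ~{fp32_bytes / 1e6:.1f} MB "
+          f"({fp32_bytes / MCU_FLASH:.0f}x the flash budget), "
+          f"int8 ~{int8_bytes / 1e6:.1f} MB "
+          f"({int8_bytes / MCU_FLASH:.1f}x)")
+```
+
+Even under these simple assumptions, ResNet-50's float32 weights alone exceed the assumed flash budget by roughly two orders of magnitude, and even an int8-quantized MobileNet-V2 still does not fit, before accounting for activations or any application code. This is why TinyML depends on purpose-built models and aggressive system-level optimization rather than shrinking server-class architectures.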