CodeCarbon can be automatically integrated with Comet for experiment tracking and visualization. Comet provides data scientists with powerful tools to track, compare, explain, and reproduce their experiments. Now, with CodeCarbon you can easily track the carbon footprint of your jobs along with your training metrics, hyperparameters, dataset samples, artifacts, and more.
To get started with the Comet-CodeCarbon integration, make sure you have comet-ml installed:
It is hard to quantify the entirety of computing emissions, because there are many factors in play, notably the life-cycle emissions of computing infrastructure. We therefore only focus on the direct emissions produced by running the actual code, but recognize that there is much work to be done to improve this estimation.
Create a virtual environment using conda for easier management of dependencies and packages.
To install conda, follow the instructions on the official conda website.
Carbon dioxide (CO₂) emissions, expressed as kilograms of CO₂-equivalents [CO₂eq], are the product of two main factors:
C = Carbon Intensity of the electricity consumed for computation: quantified as g of CO₂ emitted per kilowatt-hour of electricity.
@@ -116,14 +116,14 @@
Carbon dioxide emissions (CO₂eq) can then be calculated as C * E.
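As a quick worked illustration of this product, the minimal sketch below multiplies a carbon intensity by an energy figure. The function name and the sample numbers are hypothetical and are not taken from CodeCarbon; the division by 1000 only converts grams to kilograms.

```python
def emissions_kg(carbon_intensity_g_per_kwh: float, energy_kwh: float) -> float:
    """Return CO2eq emissions in kilograms: C (g/kWh) * E (kWh) / 1000."""
    return carbon_intensity_g_per_kwh * energy_kwh / 1000.0


# Hypothetical example: 2 kWh consumed on a grid emitting 500 g CO2eq per kWh.
print(emissions_kg(500.0, 2.0))  # 1.0
```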
Carbon Intensity of the consumed electricity is calculated as a weighted average of the emissions from the different
energy sources that are used to generate electricity, including fossil fuels and renewables. In this toolkit, the fossil fuels coal, petroleum, and natural gas are associated with specific carbon intensities: a known amount of carbon dioxide is emitted for each kilowatt-hour of electricity generated. Renewable or low-carbon fuels include solar power, hydroelectricity, biomass, geothermal, and more. The nearby energy grid contains a mixture of fossil fuels and low-carbon energy sources, called the Energy Mix. Based on the mix of energy sources in the local grid, this package calculates the Carbon Intensity of the electricity consumed.
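The weighted average described here can be sketched as follows. This is an illustration under stated assumptions, not CodeCarbon's own code: the function name is hypothetical, and the per-source intensity values are placeholders chosen for easy arithmetic rather than the figures the package uses.

```python
from typing import Mapping


def grid_carbon_intensity(
    energy_mix_kwh: Mapping[str, float],
    intensity_g_per_kwh: Mapping[str, float],
) -> float:
    """Weighted average of per-source intensities, weighted by energy generated."""
    total_kwh = sum(energy_mix_kwh.values())
    if total_kwh <= 0:
        raise ValueError("the energy mix must contain a positive amount of energy")
    weighted = sum(
        kwh * intensity_g_per_kwh[source] for source, kwh in energy_mix_kwh.items()
    )
    return weighted / total_kwh


# Placeholder values for illustration only (g CO2 per kWh).
intensities = {"coal": 1000.0, "solar": 50.0}
# Hypothetical grid: 25% coal, 75% solar.
mix = {"coal": 25.0, "solar": 75.0}
print(grid_carbon_intensity(mix, intensities))  # 287.5
```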
When available, CodeCarbon uses the global carbon intensity of electricity per cloud provider (here) or per country (here).
If we don't have the global carbon intensity of electricity for a country, but we have its electricity mix, we compute the carbon intensity of electricity using this table:
Power supply to the underlying hardware is tracked at frequent time intervals. The interval is configurable through the
measure_power_secs parameter (default: 15 seconds), which can be passed when instantiating the emissions tracker.
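To make the role of the sampling interval concrete, here is a minimal sketch of how periodic power readings can be turned into an energy figure. It assumes each reading holds constant over one interval; the function name and the sample readings are hypothetical, and this is not presented as CodeCarbon's internal implementation.

```python
from typing import Iterable


def energy_kwh(power_samples_watts: Iterable[float], interval_secs: float = 15.0) -> float:
    """Approximate energy by treating each power reading as constant over one interval."""
    joules = sum(power * interval_secs for power in power_samples_watts)
    return joules / 3_600_000.0  # 1 kWh = 3.6e6 joules


# Hypothetical readings: four samples of 240 W taken every 15 seconds.
print(energy_kwh([240.0, 240.0, 240.0, 240.0]))
```

A shorter interval gives a finer-grained estimate at the cost of more frequent measurements.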
Currently, the package supports the following hardware infrastructure.
For memory, CodeCarbon assumes a ratio of 3 Watts per 8 GB (source).
This estimate is crude; if you have an idea for improving it, please do not hesitate to contribute.
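The ratio translates into a simple linear estimate. The sketch below assumes the figure applies to memory size in GB; the function name is hypothetical and the example size is arbitrary.

```python
def ram_power_watts(ram_gb: float, watts_per_8gb: float = 3.0) -> float:
    """Estimate memory power draw from the 3 W per 8 GB ratio."""
    return ram_gb * watts_per_8gb / 8.0


print(ram_power_watts(16))  # 6.0
```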
Apple Silicon Chips contain both the CPU and the GPU.
CodeCarbon tracks Apple Silicon Chip energy consumption using powermetrics, which should be available natively on any Mac.
However, this tool requires sudo rights, and to our current knowledge there is no other way to track the energy consumption of the Apple Silicon Chip without administrative rights.
@@ -229,10 +229,10 @@
In recent years, Artificial Intelligence, and more specifically Machine Learning, has become remarkably efficient at performing human-level tasks: recognizing objects and faces in images, driving cars, and playing sophisticated games like chess and Go.
In order to achieve these incredible levels of performance, current approaches leverage vast amounts of data to learn underlying patterns and features. Thus, state-of-the-art Machine Learning models leverage significant amounts of computing power, training on advanced processors for weeks or months, consequently consuming enormous amounts of energy. Depending on the energy grid used during this process, this can entail the emission of large amounts of greenhouse gases such as CO₂.
With AI models becoming more ubiquitous and deployed across different sectors and industries, AI's environmental impact is also growing. For this reason, it is important to estimate and curtail both the energy used and the emissions produced by training and deploying AI models. This package enables developers to track carbon dioxide (CO₂) emissions across machine learning experiments or other programs.
The package has an in-built logger that logs data into a CSV file named emissions.csv in the output_dir, provided as an
input parameter (defaults to the current directory), for each experiment tracked across projects.
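Because the output is a plain CSV file, it can be post-processed with the standard library. The sketch below sums emissions per project; it assumes the file has `project_name` and `emissions` columns, so check the header of your own emissions.csv before relying on those names. The helper name is hypothetical.

```python
import csv
from collections import defaultdict
from typing import Dict


def emissions_by_project(csv_path: str) -> Dict[str, float]:
    """Sum the 'emissions' column per 'project_name' (assumed column names)."""
    totals: Dict[str, float] = defaultdict(float)
    with open(csv_path, newline="") as handle:
        for row in csv.DictReader(handle):
            totals[row["project_name"]] += float(row["emissions"])
    return dict(totals)
```

Calling `emissions_by_project("emissions.csv")` from the output directory would return a dictionary mapping each project name to its summed emissions.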
Prometheus is a systems and service monitoring system. It collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts when specified conditions are observed.
CodeCarbon exposes all its metrics with the prefix codecarbon_.
The current version uses Pushgateway mode. If your Pushgateway server requires authentication, set the environment variables PROMETHEUS_USERNAME and PROMETHEUS_PASSWORD so that CodeCarbon can push the metrics.
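The credentials can be exported in the shell or set from Python before the tracker is created. The sketch below shows the Python route. The helper names and the placeholder URL are hypothetical, and the `save_to_prometheus` and `prometheus_url` argument names are an assumption to verify against the CodeCarbon version you have installed.

```python
import os


def set_pushgateway_credentials(username: str, password: str) -> None:
    """Expose Pushgateway credentials through the documented environment variables."""
    os.environ["PROMETHEUS_USERNAME"] = username
    os.environ["PROMETHEUS_PASSWORD"] = password


def make_prometheus_tracker(pushgateway_url: str):
    """Create a tracker that pushes metrics; requires codecarbon to be installed."""
    # Imported lazily so this module can be loaded without codecarbon.
    from codecarbon import EmissionsTracker

    # Assumed argument names; confirm them for your CodeCarbon version.
    return EmissionsTracker(save_to_prometheus=True, prometheus_url=pushgateway_url)


# Hypothetical usage:
# set_pushgateway_credentials("user", "secret")
# tracker = make_prometheus_tracker("localhost:9091")
```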
The LoggerOutput class (and its GoogleCloudLoggerOutput subclass) lets you send emissions tracking data to a logger.
This logger is separate from the one the CodeCarbon package uses for its "private" logs.
It lets you leverage powerful logging systems, centralize emissions in a central or cloud-based system, and build reports, triggers, etc. based on these data.
This logging output can be used in parallel with other output options provided by CodeCarbon.
To send emissions tracking data to the logger, first create a logger and then create an EmissionsTracker. OfflineEmissionsTracker
is also supported, but a lack of network connectivity may prevent streaming tracking data to a central or cloud-based collector.
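The two steps can be sketched as below. The logger name, the format string, and the helper function names are hypothetical. The `save_to_logger` and `logging_logger` arguments and the `codecarbon.output` import path are assumptions about the installed version and should be checked against it.

```python
import logging


def build_emissions_logger(name: str = "codecarbon_emissions") -> logging.Logger:
    """Step 1: create a dedicated logger, separate from CodeCarbon's own logs."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid stacking duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
        )
        logger.addHandler(handler)
    return logger


def make_logging_tracker(logger: logging.Logger):
    """Step 2: create a tracker that writes emissions data to the logger."""
    # Imported lazily so this module can be loaded without codecarbon.
    from codecarbon import EmissionsTracker
    from codecarbon.output import LoggerOutput

    # Assumed argument names; confirm them for your CodeCarbon version.
    return EmissionsTracker(
        save_to_logger=True,
        logging_logger=LoggerOutput(logger, logging.INFO),
    )
```

Swapping the `StreamHandler` for a file or cloud handler is what lets the same data flow to a central collector.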
The CO2 tracking tool can be used along with any computing framework. It supports both online (with internet access) and
offline (without internet access) modes. The tracker can be used in the following ways:
When the environment has internet access, the EmissionsTracker object or the track_emissions decorator can be used; the decorator's
offline parameter is set to False by default.
If the training code base has no single entry and stop point, users can instantiate an EmissionsTracker object and
pass it as a parameter to function calls to start and stop the emissions tracking of the compute section.
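A minimal sketch of this pattern follows. The helper `run_tracked` is hypothetical; it only relies on the tracker exposing `start()` and `stop()`, and wraps the compute section in `try`/`finally` so that tracking stops even if the computation raises.

```python
from typing import Any, Callable, Tuple


def run_tracked(tracker: Any, compute: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, Any]:
    """Start the tracker, run the compute section, and always stop the tracker."""
    tracker.start()
    try:
        result = compute(*args, **kwargs)
    finally:
        emissions = tracker.stop()  # value returned by the tracker when it stops
    return result, emissions


# Hypothetical usage, assuming codecarbon is installed and train() is your function:
# from codecarbon import EmissionsTracker
# result, emissions = run_tracked(EmissionsTracker(), train)
```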
If the training code base is wrapped in a function, users can apply the @track_emissions decorator to that function to enable tracking
emissions of the training code.
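The decorator form might look like the sketch below. The `train_model` function is a placeholder for real training code, and the `ImportError` fallback is a hypothetical convenience that lets the script run, untracked, on machines where codecarbon is not installed.

```python
try:
    from codecarbon import track_emissions
except ImportError:
    # Hypothetical no-op stand-in used only when codecarbon is missing.
    def track_emissions(fn=None, **_kwargs):
        if fn is None:
            return lambda wrapped: wrapped
        return fn


@track_emissions
def train_model(epochs: int = 3) -> int:
    """Placeholder training loop; replace the body with real training code."""
    steps = 0
    for _ in range(epochs):
        steps += 1
    return steps
```

With codecarbon installed, each call to `train_model` is tracked from entry to return.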
An offline version is available to support restricted environments without internet access. The internal computations remain unchanged; however,
a country_iso_code parameter, which corresponds to the 3-letter alphabetic ISO code of the country where the compute infrastructure is hosted, is required to fetch Carbon Intensity details of the regional electricity used. A complete list of country ISO codes can be found on Wikipedia.
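A sketch of offline usage is shown below. The two helper functions are hypothetical: the first only checks that the code looks like a 3-letter ISO code (it does not verify that the country exists), and the second passes it to `OfflineEmissionsTracker` through the `country_iso_code` parameter.

```python
def normalize_iso3(country_iso_code: str) -> str:
    """Return the code upper-cased, or raise if it is not three ASCII letters."""
    code = country_iso_code.strip().upper()
    if len(code) != 3 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"expected a 3-letter ISO code, got {country_iso_code!r}")
    return code


def make_offline_tracker(country_iso_code: str):
    """Create an offline tracker; requires codecarbon to be installed."""
    # Imported lazily so this module can be loaded without codecarbon.
    from codecarbon import OfflineEmissionsTracker

    return OfflineEmissionsTracker(country_iso_code=normalize_iso3(country_iso_code))


# Hypothetical usage for infrastructure hosted in Canada:
# tracker = make_offline_tracker("CAN")
```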
If you need a proxy to access the internet, for example to call a web API such as the CodeCarbon API, set the HTTPS_PROXY environment variable, or HTTP_PROXY if you are calling an http:// endpoint.
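The variable is usually exported in the shell, but it can also be set from Python before any network call is made. In the sketch below, the helper name and the proxy URL are placeholders.

```python
import os


def configure_proxy(proxy_url: str, https_endpoint: bool = True) -> str:
    """Set HTTPS_PROXY (or HTTP_PROXY for http:// endpoints) and return the variable name."""
    variable = "HTTPS_PROXY" if https_endpoint else "HTTP_PROXY"
    os.environ[variable] = proxy_url
    return variable


# Placeholder proxy address for illustration only.
configure_proxy("http://proxy.example.com:3128")
```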
The package also comes with a DashApp containing illustrations to understand the emissions logged from various experiments across projects.
The App currently consumes logged information from a CSV file, generated from an in-built logger in the package.
The App can be run by executing the CLI command below, which needs the following arguments:
Users can get an understanding of net power consumption and emissions generated across projects and can dive into a particular project.
The App also provides exemplary equivalents from daily life, for example:
The App also benchmarks equivalent emissions across different regions of the cloud provider being used and recommends the most eco-friendly
region to host infrastructure for the concerned cloud provider.
The top of the page shows the global energy consumed and emissions produced at an organisation level, along with each project's share of the total.
The App also provides comparison points with daily-life activities to give a better sense of the amount of emissions generated.
Each project can be divided into several experiments, and in each experiment several runs can happen.
The total emissions of experiments are shown on the bar chart on the right-hand side, and the runs on the bubble chart on the left-hand side.
If your project has several experiments, you can switch the bubble chart from one experiment's runs to another's by clicking the bar chart.