VU-Irakliy/PET-energy-budget-estimation-framework

Estimation of the Energy Budget of Privacy-Enhancing Technologies

Read this before doing anything.
For any questions, contact me via the socials on my GitHub profile (first state where you are from and why you are interested).

Contents

  1. About the project
  2. Requirements
  3. Research files
  4. Estimation Framework
  5. (Bonus) Measurement Tool Combination Framework

1. About the project

This is an MSc thesis project that explores estimating the energy consumption of Privacy-Enhancing Technologies (PETs), as well as the privacy risks and the utility of a dataset after PET treatment. The estimates are based on the dataset's properties, which serve as input to a Gradient Boosting model.

The datasets and their sources are listed in the csv file all_datasets_and_their_sources.csv.

2. Requirements

Make sure your Python version is within the following range:
Python 3.10.12 - 3.10.14 (pip 22.0 - 24.0)
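
If you are unsure which interpreter is active, a small check like the one below can help. It is only a sketch and not part of the repository; the function name is_supported is made up for illustration, and the version bounds are copied from the range stated above.

```python
import sys

# Supported range as stated in this README: Python 3.10.12 to 3.10.14.
MIN_VERSION = (3, 10, 12)
MAX_VERSION = (3, 10, 14)


def is_supported(version=None):
    """Return True if a (major, minor, micro) tuple lies inside the supported range."""
    if version is None:
        # Default to the interpreter that is running this script.
        version = tuple(sys.version_info[:3])
    return MIN_VERSION <= tuple(version) <= MAX_VERSION


current = ".".join(str(part) for part in sys.version_info[:3])
status = "supported" if is_supported() else "outside the supported range"
print(f"Python {current} is {status}")
```

Tuples compare element by element, so no string parsing of the version is needed.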

Run the following line from the terminal:

pip install -r requirements.txt

3. Research files

All research material can be found in the research_files folder. It includes:

  1. Cleaned datasets folder
  2. Figures folder
  3. 8 notebooks for feature selection
  4. 2 datasets of dataset properties and measurements
  5. Measurement files (the main file is data_collection_main.py)

4. Estimation Framework

Precautions

Before launching the framework:

  1. Clean your file.
  2. Preprocess the target attribute for classification purposes.
  3. Do not modify the other attributes in a way that would change their values.
  4. After the synthetic data has been generated, the framework applies MinMaxScaler and One-Hot encoding for the ML tasks.
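
To illustrate what step 4 does to your data, here is a pure-Python sketch of min-max scaling and one-hot encoding. It is not the framework's code (the framework applies these steps for you, so you should not apply them yourself beforehand), and the helper names and sample values are made up for illustration.

```python
def min_max_scale(values):
    """Rescale numbers linearly to [0, 1]; in this sketch a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot_encode(values):
    """Return (categories, rows) with one 0/1 indicator per sorted category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Hypothetical columns: one continuous, one categorical.
ages = [20, 30, 40]
colors = ["red", "blue", "red"]

print(min_max_scale(ages))     # [0.0, 0.5, 1.0]
print(one_hot_encode(colors))  # (['blue', 'red'], [[0, 1], [1, 0], [0, 1]])
```

This is also why step 3 matters: scaling and encoding are derived from the attribute values as they are in your file.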

Running on Terminal:
Linux (or WSL):
python3 main_project.py

Windows:
python main_project.py

macOS:
python3 main_project.py

Running on Jupyter Notebook:
jupyter notebook

Then, in the notebook, call the function as you normally would:

launch_estimation(filename = None, continuous_to_categorical = None, target = None, epsilon = None)

What input to provide?

  1. The name of the csv file located in the put_your_dataset_here folder
  2. The attributes that are categorical but, because of their numerical format, could be mistaken for continuous
  3. The target attribute
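
If you are not sure which attributes belong in item 2, a rough heuristic is to look for columns that contain only integers and have few distinct values. The sketch below is not part of the framework; the function name, the max_unique threshold, and the sample columns are all assumptions for illustration, and you should still review the result by hand.

```python
import csv
import io


def numeric_categorical_candidates(csv_file, max_unique=10):
    """List columns whose values are all integers with at most `max_unique` distinct values."""
    reader = csv.DictReader(csv_file)
    distinct = {name: set() for name in reader.fieldnames or []}
    all_integer = {name: True for name in distinct}
    for row in reader:
        for name in distinct:
            value = (row[name] or "").strip()
            distinct[name].add(value)
            try:
                int(value)  # fails for floats, text, and empty cells
            except ValueError:
                all_integer[name] = False
    return [
        name for name in distinct
        if all_integer[name] and 0 < len(distinct[name]) <= max_unique
    ]


# Hypothetical data: education_level is a category stored as a number.
sample = io.StringIO(
    "age,education_level,income\n"
    "23,1,30500.5\n"
    "35,2,41000.0\n"
    "41,1,52000.75\n"
    "52,3,38000.0\n"
    "29,2,45500.25\n"
)
print(numeric_categorical_candidates(sample, max_unique=3))  # ['education_level']
```

For a real file, pass an open file object instead of the in-memory sample, for example open("put_your_dataset_here/your_file.csv", newline="").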

For terminal:
Follow the instructions in the terminal!

For Jupyter Notebook:

  1. filename must be just the file name. No additional path is necessary.
  2. continuous_to_categorical must be a list of strings.
  3. epsilon must be either 0 (no Differential Privacy) or 1 (Differential Privacy with an epsilon value of 0.1).
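
The three rules above can be checked before calling launch_estimation. The helper below is a hypothetical sketch and not part of the repository; it also assumes that target is the name of a column given as a string, which this README does not state explicitly. The file and column names in the example are made up.

```python
def validate_estimation_inputs(filename, continuous_to_categorical, target, epsilon):
    """Raise ValueError if the arguments break the rules listed above; else return them as kwargs."""
    if not isinstance(filename, str) or not filename:
        raise ValueError("filename must be a non-empty string")
    if "/" in filename or "\\" in filename:
        raise ValueError("filename must be a bare file name without a path")
    if not isinstance(continuous_to_categorical, list) or not all(
        isinstance(name, str) for name in continuous_to_categorical
    ):
        raise ValueError("continuous_to_categorical must be a list of strings")
    # Assumption: the target attribute is passed as a column name.
    if not isinstance(target, str) or not target:
        raise ValueError("target must be a non-empty string")
    if epsilon not in (0, 1):
        raise ValueError("epsilon must be 0 (no DP) or 1 (DP with epsilon value 0.1)")
    return {
        "filename": filename,
        "continuous_to_categorical": continuous_to_categorical,
        "target": target,
        "epsilon": epsilon,
    }


kwargs = validate_estimation_inputs("my_dataset.csv", ["education_level"], "income", 1)
print(kwargs)
# In the notebook you could then call: launch_estimation(**kwargs)
```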

5. (Bonus) Measurement Tool Combination Framework

Before launching the framework:

  1. Clean your file.
  2. Preprocess the target attribute for classification purposes.
  3. Do not modify the other attributes in a way that would change their values.
  4. After the synthetic data has been generated, the framework applies MinMaxScaler and One-Hot encoding for the ML tasks.

Running on Terminal:
Linux (or WSL):
sudo python3 main_single_measurement.py*

Windows:
runas /user:Administrator "python main_single_measurement.py"*

macOS:
sudo python3 main_single_measurement.py*

Running on Jupyter Notebook:
sudo jupyter notebook --allow-root*
Then, in the notebook, call the function as you normally would:

launch_measurement(input_filename=None, target_attribute_ML=None, num_to_categ = None, possible_known_attributes = None, secret_mode = None,  save_my_report_to_csv = None)

# Note: possible_known_attributes and secret_mode are disabled. 

*Administrator rights are required to perform the hardware measurements of energy consumption. Without them, the measurements will not work.
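
To confirm that the current process actually has elevated rights before starting a long measurement, a check along these lines can be used. This is a sketch and not part of the repository; the function name has_admin_rights is made up for illustration.

```python
import ctypes
import os


def has_admin_rights():
    """Return True when running as root (Linux/macOS) or as an elevated user (Windows)."""
    if hasattr(os, "geteuid"):
        # POSIX systems: the effective user id of root is 0.
        return os.geteuid() == 0
    try:
        # Windows: ask the shell whether the current user is an administrator.
        return bool(ctypes.windll.shell32.IsUserAnAdmin())
    except (AttributeError, OSError):
        return False


if has_admin_rights():
    print("Elevated rights detected.")
else:
    print("No elevated rights: hardware energy measurements will likely fail.")
```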

What input to provide?

  1. The name of the csv file located in the put_your_dataset_here folder
  2. The attributes that are categorical but, because of their numerical format, could be mistaken for continuous
  3. The target attribute

Note: The Linkability and Inference risk measurements are disabled.
