- Install the required libraries using `pip`: `pip install pandas openpyxl`
This script processes a series of text files containing timestamped data and classifies each event based on a metadata file that defines event types with start and end times. The classified event data is saved to an Excel file for further analysis.
- Place the following files in a directory called `./Task1/`:
  - `metadata.xlsx`: Contains the metadata that defines the start and end times for events.
  - Multiple `.txt` files: Contain timestamped data that will be classified based on the metadata.
- The script loads the metadata from the `metadata.xlsx` file. The metadata includes:
  - `StartTime`: The start time of the event.
  - `EndTime`: The end time of the event.
  - `EventType`: The event classification.
- It processes each `.txt` file in the `./Task1/` folder and reads the data, which should include a `Timestamp` column and other columns representing data for that event.
- The script compares the timestamp of each row with the start and end times from the metadata to classify the event type.
- The classified data is then saved to an Excel file, `classified_events.xlsx`.
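
The classification step described above can be sketched without any dependencies. This is a minimal illustration, not the code in `Task1.py`: the function name `classify`, the sample event names, and the choice to treat both interval bounds as inclusive are assumptions made for the example.

```python
from datetime import datetime

def classify(timestamp, events):
    """Return the EventType of the first interval containing timestamp, else None.

    Assumption: StartTime and EndTime are both inclusive.
    """
    for start, end, event_type in events:
        if start <= timestamp <= end:
            return event_type
    return None

# Hypothetical metadata rows in (StartTime, EndTime, EventType) order.
events = [
    (datetime(2023, 11, 6, 12, 0, 0), datetime(2023, 11, 6, 12, 5, 0), "walking"),
    (datetime(2023, 11, 6, 12, 10, 0), datetime(2023, 11, 6, 12, 15, 0), "running"),
]

# Hypothetical Timestamp values read from a .txt file.
rows = [
    datetime(2023, 11, 6, 12, 2, 30),  # inside the first interval
    datetime(2023, 11, 6, 12, 7, 0),   # between intervals, so unclassified
    datetime(2023, 11, 6, 12, 10, 0),  # exactly on a start boundary
]

labels = [classify(ts, events) for ts in rows]
print(labels)
```

Rows that fall outside every interval are labelled `None` here; the actual script may handle them differently.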
- Place your metadata in the `metadata.xlsx` file in the `./Task1/` folder. The metadata should be structured with columns:
  - `StartTime`: The event start time.
  - `EndTime`: The event end time.
  - `EventType`: The classification for that event.
- Place your `.txt` files in the `./Task1/` folder. The data in these files should have the following structure:
  - `Timestamp`: The timestamp in the format `YYYY-DD-MM-DD-hh:mm:ss:fff`.
  - Other columns such as `x`, `y`, and `z`, which will be processed for event classification.
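
Parsing a timestamp of this kind might look like the sketch below. The format string given above repeats the day field, so the exact field order is unclear. The example assumes year, month, day, then `hh:mm:ss`, then milliseconds after a final colon. `ASSUMED_FORMAT` and `parse_timestamp` are hypothetical names; adjust the format if your files differ.

```python
from datetime import datetime

# Assumption: year-month-day order, with milliseconds after the last colon.
ASSUMED_FORMAT = "%Y-%m-%d-%H:%M:%S:%f"

def parse_timestamp(text):
    """Parse one Timestamp value under the assumed format."""
    return datetime.strptime(text.strip(), ASSUMED_FORMAT)

# %f accepts 1 to 6 digits, so "250" is read as 250 milliseconds.
ts = parse_timestamp("2023-11-06-12:45:10:250")
print(ts.isoformat())
```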
- Ensure the necessary packages (`pandas`, `openpyxl`) are installed.
- Run the script to classify the data and save it in an Excel file:
python Task1.py
- Load Data
  - The dataset is expected to be in the `./Task2/` folder with the filename `data.xlsx`. The dataset should contain columns named:
    - `Index`
    - `Timestamp`
    - `Incremental_Index`
    - `Acceleration_X`
    - `Acceleration_Y`
    - `Acceleration_Z`
This script processes a dataset to check for missing data points based on timestamps and incremental index values. The key objectives are to:
- Identify timestamps with fewer data points than the specified fault tolerance.
- Detect gaps in the incremental index (0-255) sequence.
- Generate a report detailing missing data points for each identified timestamp.
- Fault Tolerance: Defined by fault_tolerance (default: 20). Any timestamp with fewer data points than this threshold is flagged for missing data.
- Incremental Index Range: Set from 0 to 255. The script checks for gaps in this range and flags missing index values.
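
The two checks can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the code in `Task2.py`: the function names are hypothetical, and the sketch assumes the incremental index wraps from 255 back to 0 and never repeats a value.

```python
from collections import Counter

def sparse_timestamps(timestamps, fault_tolerance=20):
    """Map each timestamp with fewer than fault_tolerance rows to its row count."""
    counts = Counter(timestamps)
    return {ts: n for ts, n in counts.items() if n < fault_tolerance}

def index_gaps(indices, modulus=256):
    """Return (position, missing_count) for each break in the index sequence.

    Assumption: the index counts 0..255 and then wraps to 0, so the step from
    255 to 0 is not a gap. A repeated value would be misreported as 255
    missing points, so deduplicate first if repeats are possible.
    """
    gaps = []
    for pos in range(1, len(indices)):
        missing = (indices[pos] - indices[pos - 1] - 1) % modulus
        if missing:
            gaps.append((pos, missing))
    return gaps

# Hypothetical data: one full second (20 rows) and one short second (18 rows).
timestamps = ["2023-11-06 12:45:09"] * 20 + ["2023-11-06 12:45:10"] * 18
sparse = sparse_timestamps(timestamps)

# Hypothetical index column: wraps at 255, then skips 2, 3, 4 and 5.
gaps = index_gaps([253, 254, 255, 0, 1, 6, 7])

print(sparse)
print(gaps)
```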
Execute the script by running:
python Task2.py
The script will print reports detailing:
- Timestamps with missing data (those whose number of data points falls below the fault tolerance threshold).
- Gaps in the incremental index sequence, indicating data loss.
Missing data detected from index 105, Timestamp at 2023-11-06 12:45:10 only has 18 data points.
Data loss found at index 210, total data loss: 4