
Audit images #68

Draft
wants to merge 28 commits into base: main

Changes from all commits (28 commits)
e86f4a7
Calculation of Activity Index | Printing of global statistics
cnstll Aug 3, 2022
a02f676
Integration of Activity index into particles.csv file
cnstll Aug 3, 2022
ce41cf2
Calculation of Activity Index | Printing of global statistics
cnstll Aug 3, 2022
aa68e5d
Integration of Activity index into particles.csv file
cnstll Aug 3, 2022
c153b14
Merge branch '45-feature-wfntp-implementation-of-the-activity-index' …
cnstll Aug 3, 2022
4030f4f
Creating report_overview file
cnstll Aug 4, 2022
d0821d1
WF_NTP activity index: documentation of the current implementation
rcatini Sep 1, 2022
3742219
added auditor script to help check quality of provided images
COTHSC Aug 22, 2022
2e25d73
57 docssetup setup wf ntp under windows (#58)
p-lg-ux Sep 1, 2022
4c9d91e
High level description of Celest features
rcatini Aug 4, 2022
d72f109
add conda environment requirements in yml file for windows setup
p-lg-ux Aug 30, 2022
6e965d0
update (doc setup): update of the step of setup for linux
madvid Sep 1, 2022
aef21bd
update (set up): admonition style of quote (test if it is working)
madvid Sep 4, 2022
9c34174
update (set up): github beta style of quote (test if it is working)
madvid Sep 4, 2022
7a01588
update (set up): new line in Note quote
madvid Sep 4, 2022
43e00a7
doc + update (Readme + Setup): translation of setup.md
madvid Sep 4, 2022
7dc0bdf
61 update documentation minor fix on readme (#63)
madvid Sep 4, 2022
ac8a686
fix (bug with converter): maybe this is the bug fix, need to get more…
madvid Aug 6, 2022
c926fae
Calculation of Activity Index | Printing of global statistics
cnstll Aug 3, 2022
095b651
added csv of available activity index values for celest and wfntp
COTHSC Aug 11, 2022
cd3845e
implemented more accurate activity index
COTHSC Aug 24, 2022
3df382a
implemented more accurate activity index
COTHSC Aug 24, 2022
d2e762e
removed hardcoded image width
COTHSC Aug 31, 2022
34c8894
Merge branch 'main' of github.com:42-AI/Elegant-Elegans into audit-im…
cnstll Sep 9, 2022
b7998b3
Normed auditor files
cnstll Sep 12, 2022
cb9143f
audit result for aws buckets images
cnstll Sep 12, 2022
396397c
Adding audit data | Short refacto of parser | Adding loop script to a…
cnstll Sep 19, 2022
4c97179
Script to download and make videos from provided tif images
cnstll Sep 19, 2022
921 changes: 525 additions & 396 deletions WF_NTP/WF_NTP/WF_NTP_script.py

Large diffs are not rendered by default.

19 changes: 19 additions & 0 deletions audit.csv
@@ -0,0 +1,19 @@
video_name,expected_frames,number_of_actual_frames,expected_interval,average_interval,stdev_interval,actual_length_seconds,avg_fps
220427_BF_RC7_30ms_11.3x-crawl-ACR125_50ms_1,300.0,300.0,50.0,264.76,165.13202096451826,79.428,3.777005589968273
220427_BF_RC7_30ms_11.3x-crawl-ACR125_50ms_2_1,300.0,300.0,50.0,283.46666666666664,199.99634333335013,85.04,3.527751646284101
220427_BF_RC7_30ms_11.3x-crawl-ACR125_burst_1,100.0,100.0,1.0,36.308,11.82934222067049,3.6308,27.54213947339429
220427_BF_RC7_30ms_11.3x-crawl-ACR125_burst_3_1,100.0,100.0,1.0,37.5768,13.263488798619749,3.75768,26.612164952843244
220427_BF_RC7_30ms_11.3x-swim-acr125_burst_1,100.0,100.0,1.0,37.6839,17.963933726759127,3.76839,26.53653151611166
220427_BF_RC7_30ms_11.3x-swim-acr125_burst_2,100.0,107.0,1.0,34.70451612903226,9.860474293660149,3.22752,28.814693634741225
220427_BF_RC7_30ms_11.3x-swim-acr125_burst_3,100.0,100.0,1.0,35.348600000000005,10.821602508338788,3.53486,28.28966352274206
220427_BF_RC7_30ms_11.3x-swim-acr125_burst_4_2,100.0,100.0,1.0,34.4214,13.222972665063253,3.44214,29.051694585345164
raw-220503_TPS_ACR085-crawling_zoom113_10ms_100img,100.0,100.0,10.0,33.0665,7.184647404941543,3.3066500000000003,30.24208791374956
raw-220503_TPS_ACR085-trashing_zoom113_10ms_100img,100.0,100.0,10.0,33.1473,7.744858816718934,3.31473,30.1683696711346
raw-220503_TPS_ACR125-crawling_zoom113_10ms_100img,100.0,100.0,10.0,32.8245,8.158093010358304,3.28245,30.465048972566223
raw-220503_TPS_ACR125-thrashing_zoom113_10ms_100img,100.0,100.0,10.0,33.0078,6.778871450426971,3.30078,30.295869461157665
raw-220503_TPS_N2-zoom113_20ms_2,500.0,747.0,20.0,250.1225296442688,159.04675365295503,63.281,3.998040486085871
raw-220503_TPS_N2-zoom197_20ms_1,500.0,781.0,20.0,261.75342465753425,141.24246647006768,57.324,3.820389365710697
raw-220503_TPS_N2-zoom197_20ms_2,,,,,,,
raw-dt100ms_11.3x_RC8_BF_expo33ms,200.0,200.0,100.0,214.67,25.872146020353487,42.934,4.658312759118648
raw-dt10ms_11.3x_RC8_BF_expo33ms,200.0,200.0,10.0,33.1244,12.810651686121032,6.62488,30.18922606900049
raw-dt50ms_11.3x_RC8_BF_expo33ms,100.0,100.0,50.0,188.82,31.321848314206857,18.882,5.2960491473360864
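The derived columns in audit.csv follow directly from the per-frame `ElapsedTime-ms` values recorded in metadata.txt: the intervals are successive differences of the timestamps, `actual_length_seconds` is the elapsed time from the first to the last frame, and `avg_fps` is the frame count divided by that length. A minimal sketch of those relations, using hypothetical timestamps rather than values from a real video:

```python
from statistics import mean, stdev

# Hypothetical per-frame timestamps in ms (the real values come from
# each frame's "ElapsedTime-ms" field in metadata.txt).
elapsed_ms = [0.0, 50.0, 101.0, 149.0, 200.0]

# Intervals between consecutive frames, as audit_images builds them
intervals = [b - a for a, b in zip(elapsed_ms, elapsed_ms[1:])]

actual_length_seconds = (elapsed_ms[-1] - elapsed_ms[0]) / 1000
avg_fps = len(elapsed_ms) / actual_length_seconds

print(mean(intervals), stdev(intervals))  # average_interval, stdev_interval
print(actual_length_seconds, avg_fps)    # actual_length_seconds, avg_fps
```

For the `dt50ms` row above this checks out: 100 frames over 18.882 s gives an avg_fps of about 5.296, as reported.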
34 changes: 34 additions & 0 deletions auditor/__main__.py
@@ -0,0 +1,34 @@
from auditor.auditor import path_checker
from auditor.parser import audit_images, load_metadata, parser

# ########################################################################## #
#                                 FUNCTIONS                                  #
# ########################################################################## #


def main():
    # Parse the command-line argument(s)
    args = parser()
    dir_path = args.path

    # Check the existence and accessibility of the path
    path_checker(dir_path)

    # Load the JSON metadata file
    metadata = load_metadata(dir_path)

    # Audit the tiff images against the metadata and
    # report statistics about the frames
    stat_frames = audit_images(metadata, dir_path)
    print(stat_frames)


# ########################################################################## #
#                                    MAIN                                    #
# ########################################################################## #

if __name__ == "__main__":
    main()
46 changes: 46 additions & 0 deletions auditor/auditor.py
@@ -0,0 +1,46 @@
import os


# Checker related to the parsed argument.
def path_checker(path: str):
    """Check the existence and accessibility of a directory.

    Arguments:
        path (str): path to the input directory (containing the images).

    Raises:
        NotADirectoryError: the path does not exist or is not a directory.
        PermissionError: the user lacks read/write access to the directory.
    """
    if not os.path.isdir(path):
        raise NotADirectoryError(path + " is not a directory.")
    if not os.access(path, os.R_OK | os.W_OK):
        raise PermissionError("Permission denied to " + path)


def path_inside_checker(dir_path: str):
    """Check that the directory contains only .tif or .json files plus a
    metadata.txt file, and that there are at least 100 .tif files.

    Arguments:
        dir_path (str): path to the directory.

    Raises:
        FileNotFoundError: there is no metadata.txt file.
        Exception: there are fewer than 100 .tif files.
        Exception: a file other than .tif, .json or metadata.txt was found.
    """
    number_tif_files = 0
    metadata_file = False
    for file in os.listdir(dir_path):
        if file.endswith(".tif"):
            number_tif_files += 1
        elif file == "metadata.txt":
            metadata_file = True
        elif file.endswith(".json"):
            continue
        else:
            raise Exception("File other than .tif, .json or metadata.txt found")
    if metadata_file is False:
        raise FileNotFoundError("No metadata file found")
    if number_tif_files < 100:
        raise Exception(f"Expected at least 100 .tif files, found {number_tif_files}")
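The checks in `path_checker` can be exercised against a throwaway directory. The sketch below reproduces the function body so it runs standalone rather than importing from the `auditor` package:

```python
import os
import tempfile


def path_checker(path: str):
    # Same checks as auditor.auditor.path_checker
    if not os.path.isdir(path):
        raise NotADirectoryError(path + " is not a directory.")
    if not os.access(path, os.R_OK | os.W_OK):
        raise PermissionError("Permission denied to " + path)


with tempfile.TemporaryDirectory() as d:
    path_checker(d)  # an existing, accessible directory passes silently
    try:
        path_checker(os.path.join(d, "missing"))
    except NotADirectoryError as err:
        print("rejected:", err)
```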
175 changes: 175 additions & 0 deletions auditor/parser.py
@@ -0,0 +1,175 @@
import argparse
import json
from os import F_OK, R_OK, access
from os.path import exists as file_exists

import cv2 as cv
import pandas as pd


# ########################################################################### #
#                     Parsing of the inputs of the auditor                    #
# ########################################################################### #
# Parser related to the arguments of the auditor program
def parser() -> argparse.Namespace:
    """Parse the arguments to get the path of the input directory.

    Return:
        An argparse.Namespace containing the path to the input directory,
        e.g. Namespace(path='').
    """
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--path", type=str, required=True, help="path where the source images will be looked for."
    )
    return parser.parse_args()


# ########################################################################### #
#                         Loading of the metadata file                        #
# ########################################################################### #


def load_metadata(directoryPath: str) -> dict:
    """Load the metadata file into a Python dictionary.

    Args:
        directoryPath (str): path to the directory containing metadata.txt.

    Raises:
        FileNotFoundError: file [directoryPath]/metadata.txt does not exist.
        Exception: [directoryPath]/metadata.txt is not readable by the user.
        JSONDecodeError: issue when loading the metadata from the file.

    Returns:
        dict: the loaded metadata.
    """
    metadata_file = directoryPath + "/metadata.txt"
    if not access(metadata_file, F_OK):
        raise FileNotFoundError(f"File {metadata_file} does not exist.")
    if not access(metadata_file, R_OK):
        raise Exception(f"File {metadata_file} is not readable by the user.")
    with open(file=metadata_file, mode="r") as file:
        metadata = json.load(file)
    return metadata


def load_dataframe(file_path):
    """Load the audit dataframe from a .csv file, or create an empty one
    with the audit columns if the file does not exist.

    Args:
        file_path (str): path to the audit .csv file.

    Returns:
        pd.DataFrame: the loaded (or newly created) audit dataframe.
    """
    if file_exists(file_path):
        df = pd.read_csv(file_path, index_col=False)
    else:
        df = pd.DataFrame(
            columns=[
                "video_name",
                "expected_frames",
                "number_of_actual_frames",
                "expected_interval",
                "average_interval",
                "stdev_interval",
                "actual_length_seconds",
                "avg_fps",
            ]
        )
    return df


def audit_images(metadata: dict, directoryPath: str) -> dict:
    """Audit the video frames and save the results to a .csv file.

    Args:
        metadata: the metadata returned by load_metadata.
        directoryPath: the directory where the images are located.

    Returns:
        dict: a dictionary with the audited metadata.
    """
    video_name = directoryPath.rsplit("/", 1)[-1]
    total_time_ms = 0
    expect_frame_no = 0
    expected_frames = metadata["Summary"]["Frames"]
    theoretical_interval = metadata["Summary"]["Interval_ms"]
    filenames_list = []
    intervals_list = []
    missing_frames = 0
    tmp_total_time_ms = 0
    time_to_first_image = 0
    audit_out_file = "./audit.csv"
    # Loop over each frame object in the metadata file
    for obj in metadata:
        if obj.startswith("Metadata-Default"):
            filename = obj.rsplit("/", 1)[-1]
            if filename != "Summary":
                cv_img = cv.imread(directoryPath + "/" + filename)
                if cv_img is None:
                    missing_frames += 1
                    expect_frame_no = expect_frame_no + 1
                    continue
                filenames_list.append(filename)

                # Compare the actual shape of the image with the expected shape
                actual_height, actual_width = cv_img.shape[0], cv_img.shape[1]
                expected_width = metadata[obj]["Width"]
                expected_height = metadata[obj]["Height"]
                if (actual_height != expected_height) or (actual_width != expected_width):
                    raise Exception(f"Mismatched image size: frame: {directoryPath}/{filename}")

                currentFrame = metadata[obj]["Frame"]
                if currentFrame == 0:
                    time_to_first_image = metadata[obj]["ElapsedTime-ms"]

                # Check for missing frames
                if currentFrame != expect_frame_no:
                    missing_frames = missing_frames + 1

                # Build the list of intervals between two consecutive frames
                total_time_ms = metadata[obj]["ElapsedTime-ms"] - time_to_first_image
                intervals_list.append(total_time_ms - tmp_total_time_ms)
                tmp_total_time_ms = total_time_ms
                expect_frame_no = expect_frame_no + 1

    if expect_frame_no != expected_frames:
        missing_frames = expect_frame_no - expected_frames
    df = pd.DataFrame(intervals_list, columns=["intervals"])
    df2 = load_dataframe(audit_out_file)
    data = [
        video_name,
        expected_frames,
        expected_frames - missing_frames,
        theoretical_interval,
        df["intervals"].mean(),
        df["intervals"].std(),
        total_time_ms / 1000,
        expect_frame_no / (total_time_ms / 1000),
    ]
    # If the video name is not found, create a new entry; else update its data
    if video_name not in df2.values:
        df2.loc[len(df2.index)] = data
    else:
        df2.loc[df2["video_name"] == video_name] = data
        df2.reset_index(drop=True, inplace=True)

    print(df2)
    df2.to_csv(audit_out_file, index=False)
    return {
        "number_of_expected_frames": expected_frames,
        "number_of_actual_frames": expected_frames - missing_frames,
        "expected_interval": theoretical_interval,
        "average_interval": df["intervals"].mean(),
        "stdev_interval": df["intervals"].std(),
        "actual_length_seconds": total_time_ms / 1000,
        "avg_fps": expect_frame_no / (total_time_ms / 1000),
    }
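`audit_images` writes audit.csv with an upsert: it appends a new row when the video name is unseen, and overwrites the existing row otherwise. A reduced sketch of that pattern, with the column set shortened for illustration:

```python
import pandas as pd

def upsert(df: pd.DataFrame, row: list) -> pd.DataFrame:
    # Append if the video name is new, otherwise overwrite its row,
    # mirroring the branch at the end of audit_images.
    if row[0] not in df["video_name"].values:
        df.loc[len(df.index)] = row
    else:
        df.loc[df["video_name"] == row[0]] = row
        df = df.reset_index(drop=True)
    return df

df = pd.DataFrame(columns=["video_name", "avg_fps"])
df = upsert(df, ["sample01", 25.0])
df = upsert(df, ["sample01", 30.0])  # same video: row is replaced, not appended
print(df)
```

Re-running the auditor on the same video therefore updates its line in audit.csv instead of duplicating it.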
@@ -0,0 +1,9 @@
video_src,CeleST_activity_index_10_normalized,Celest_activity_index_median,Celest_activity_index_median_normalized,WF_NTP_activity_index_normalized
sample05,-1.30947885393232,140.453948,-0.735046359658527,-1.28607487396111
sample05,0.366508953118392,170.811533,0.207815446789699,-0.248306412784322
sample05,1.06576611343826,206.966828,1.33074557302894,0.99869915010498
sample05,-0.12279621262433,138.249455,-0.803514660160113,0.535682136640454
sample01,-0.636284514236627,55.244184,-0.703842871953718,-0.428269899834041
sample01,-0.230149081290255,78.927102,-0.0532987065501446,1.11493457446069
sample01,1.71021625284489,132.961965,1.43098058931329,-0.566481289642593
sample01,-0.060401910118327,56.336468,-0.673839010809424,-1.11966300038972
19 changes: 19 additions & 0 deletions experiments/activity_index/bucket_path.txt
@@ -0,0 +1,19 @@
#s3://lab-nematode/raw/220503_TPS_ACR085/crawling_zoom113_10ms_100img/Default/
#s3://lab-nematode/raw/220503_TPS_ACR085/trashing_zoom113_10ms_100img/Default/
#s3://lab-nematode/raw/220503_TPS_ACR125/crawling_zoom113_10ms_100img/Default/
#s3://lab-nematode/raw/220503_TPS_ACR125/thrashing_zoom113_10ms_100img/Default/
#s3://lab-nematode/raw/220503_TPS_N2/zoom113_20ms_2/Default/
#s3://lab-nematode/raw/220503_TPS_N2/zoom197_20ms_1/Default/
#s3://lab-nematode/raw/220503_TPS_N2/zoom197_20ms_2/Default/
#s3://lab-nematode/220427_BF_RC7_30ms_11.3x/crawl/ACR125_50ms_1/Default/
#s3://lab-nematode/220427_BF_RC7_30ms_11.3x/crawl/ACR125_50ms_2_1/Default/
#s3://lab-nematode/220427_BF_RC7_30ms_11.3x/crawl/ACR125_burst_1/Default/
#s3://lab-nematode/220427_BF_RC7_30ms_11.3x/crawl/ACR125_burst_3_1/Default/
#s3://lab-nematode/220427_BF_RC7_30ms_11.3x/swim/acr125_burst_1/Default/
#s3://lab-nematode/220427_BF_RC7_30ms_11.3x/swim/acr125_burst_2/Default/
#s3://lab-nematode/220427_BF_RC7_30ms_11.3x/swim/acr125_burst_3/Default/
#s3://lab-nematode/220427_BF_RC7_30ms_11.3x/swim/acr125_burst_4_2/Default/
#s3://lab-nematode/raw/dt100ms_11.3x_RC8_BF_expo33ms/Default/
#s3://lab-nematode/raw/dt10ms_11.3x_RC8_BF_expo33ms/Default/
s3://lab-nematode/raw/dt50ms_11.3x_RC8_BF_expo33ms/Default/
s3://lab-nematode/raw/stream_11.3x_RC8_BF_expo_33ms/Default/