feat: integrate analyze readii outputs functions #79

Closed
wants to merge 64 commits into from
95586e5
feat: add function to calculate feature correlation matrix
strixy16 Dec 3, 2024
5193fb4
feat: add function to generate a heatmap plot figure from a correlati…
strixy16 Dec 3, 2024
a0771c6
feat: add init file to analyze directory
strixy16 Dec 3, 2024
cf26afc
feat: add error handling in getFeatureCorrelations
strixy16 Dec 3, 2024
e643349
feat: add general loading file, add loading config and data file func…
strixy16 Dec 3, 2024
f5882da
feat: add file for loading functions related to feature files
strixy16 Dec 3, 2024
5495550
build: add numpy and seaborn for correlation code
strixy16 Dec 3, 2024
decf8e5
refactor: remove so far unused imports
strixy16 Dec 3, 2024
fcc1b9e
feat: started test function for getFeatureCorrelations
strixy16 Dec 3, 2024
a708182
feat: make files for better function organization
strixy16 Dec 3, 2024
d706863
Merge remote-tracking branch 'origin/main' into katys/integrate-analy…
strixy16 Dec 6, 2024
d63a1c5
fix: remove duplicate tool.pixi.dependencies from merge
strixy16 Dec 6, 2024
484c12e
build: add seaborn for correlation plot functions, need to specify nu…
strixy16 Dec 6, 2024
c6b945f
feat: add init files for new directories
strixy16 Dec 6, 2024
fc83d69
feat: add function to calculate feature correlations and a function t…
strixy16 Dec 6, 2024
46f0773
feat: add function to drop a set of features at the beginning of a pa…
strixy16 Dec 6, 2024
fe56257
fix: set continuous setting in StructureSetToSegmentation to False
strixy16 Dec 6, 2024
e618269
build: moved seaborn and numpy to project dependencies
strixy16 Dec 6, 2024
a6ab888
test: make test feature matrix to test correlation functions with, up…
strixy16 Dec 6, 2024
0f1d837
feat: set StructureSetToSegmentation continuous argument to False
strixy16 Dec 6, 2024
5b0dccc
build: lock file from installing on katys mac
strixy16 Dec 6, 2024
0d9c943
Merge branch 'katys/fix_continuous_rtstruct_index' into katys/integra…
strixy16 Dec 6, 2024
5b4e5cb
feat: add functions for selecting subsets of dataframes
strixy16 Dec 9, 2024
b36f3d2
refactor: renamed process to select for specificity
strixy16 Dec 9, 2024
fa4da89
style: rename labelling for consistent filename convention
strixy16 Dec 9, 2024
0256466
feat: add function to extract patient ID label from a dataframe
strixy16 Dec 9, 2024
f0b87c2
feat: add functions to replace column values in a dataset for imputat…
strixy16 Dec 9, 2024
d44e1ce
feat: add function to save out seaborn plot figure to a png
strixy16 Dec 9, 2024
bfdc357
feat: add function to convert numerical days column to years
strixy16 Dec 9, 2024
1e89c17
feat: add function to set up a time outcome column for survival predi…
strixy16 Dec 9, 2024
948b426
feat: add function for survival status mapping from string to numeric…
strixy16 Dec 9, 2024
86f13ec
feat: add function to set patient ID column as index in a dataframe
strixy16 Dec 9, 2024
7842ebe
feat: add function to intersect two dataframes by their patient ID va…
strixy16 Dec 9, 2024
81b884a
feat: add function that takes outcome labels from clinical data and a…
strixy16 Dec 9, 2024
de2dd2c
feat: add function to get a list of image types from a directory of f…
strixy16 Dec 9, 2024
1d49ec1
feat: add function to plot and return a correlation heatmap
strixy16 Dec 9, 2024
8e0868f
feat: add function to plot a histogram of correlation values
strixy16 Dec 9, 2024
45b8fb0
feat: add functions to extract subsets of a full correlation matrix
strixy16 Dec 9, 2024
6b84ef8
style: rename plot to plot_correlations for specificity
strixy16 Dec 9, 2024
61cdedd
feat: add functions for self and cross correlation plotting
strixy16 Dec 9, 2024
e021051
refactor: remove unused imports
strixy16 Dec 9, 2024
730361b
refactor: remove unused scipy import
strixy16 Dec 9, 2024
1f4edf2
build: latest pixi lock file for analysis code addition
strixy16 Dec 9, 2024
de1c752
feat: change continuous to True in loadRTSTRUCTSITK so tests pass for…
strixy16 Dec 10, 2024
2647168
fix: need default vertical and horizontal suffixes when same feature …
strixy16 Dec 10, 2024
253aba2
fix: default feature names will have underscore at the front and unde…
strixy16 Dec 10, 2024
31bf5bf
feat: testing getFeatureCorrelations function
strixy16 Dec 10, 2024
231c390
fix: handle mutable input argument event_column_mapping
strixy16 Dec 10, 2024
40c1cba
fix: add fstring so variable is used properly in error message
strixy16 Dec 10, 2024
550c32a
fix: remove mutable version of outcome_labels input for addOutcomeLabels
strixy16 Dec 10, 2024
0d36600
fix: update error handling of old values to be replaced not existing …
strixy16 Dec 10, 2024
187b1cb
feat: change input image_types list for loadFeatureFilesFromImageType…
strixy16 Dec 10, 2024
b1daaf0
fix: change labels to drop default to None and assign in the function…
strixy16 Dec 10, 2024
6075966
refactor: use context manager for file operations and improve error h…
strixy16 Dec 10, 2024
501e20d
feat: improve error handling and input validation in loadFileToDataframe
strixy16 Dec 10, 2024
5ea0b99
refactor: change assert statements in getFeatureCorrelations to if st…
strixy16 Dec 10, 2024
da16d68
feat: handle NaN values in existing event values list in survival sta…
strixy16 Dec 10, 2024
0c8ccbf
docs: describe handling of NaNs in survival outcome column when mappi…
strixy16 Dec 10, 2024
2ab08e6
refactor: check dtype of event outcome column instead of first elemen…
strixy16 Dec 10, 2024
90839a2
refactor: simplify event column mapping dictionary check with sets
strixy16 Dec 10, 2024
80b81a7
refactor: change out string to numeric replacement with the replaceCo…
strixy16 Dec 10, 2024
b0a892d
feat: check that extracted feature directory exists
strixy16 Dec 10, 2024
edaf74c
refactor: improve error handling for dropping labels in loadFeatureFi…
strixy16 Dec 10, 2024
dc2e86a
feat: validate that any feature sets were loaded before return
strixy16 Dec 10, 2024
feat: add function to generate a heatmap plot figure from a correlation matrix
strixy16 committed Dec 3, 2024
commit 5193fb44c50b462a41d69f38273154c9a6f21543
82 changes: 82 additions & 0 deletions src/readii/analyze/correlation.py
@@ -42,3 +42,85 @@ def getFeatureCorrelations(vertical_features:pd.DataFrame,
    correlation_matrix = features_to_correlate.corr(method=method)

    return correlation_matrix


def plotCorrelationHeatmap(correlation_matrix_df:pd.DataFrame,
                           diagonal:Optional[bool] = False,
                           triangle:Optional[str] = "lower",
                           cmap:Optional[str] = "nipy_spectral",
                           xlabel:Optional[str] = "",
                           ylabel:Optional[str] = "",
                           title:Optional[str] = "",
                           subtitle:Optional[str] = "",
                           show_tick_labels:Optional[bool] = False
                           ):
    """Function to plot a correlation heatmap.

    Parameters
    ----------
    correlation_matrix_df : pd.DataFrame
        Dataframe containing the correlation matrix to plot.
    diagonal : bool, optional
        Whether to only plot half of the matrix. The default is False.
    triangle : str, optional
        Which triangle half of the matrix to plot. Only used if diagonal is True. The default is "lower".
    cmap : str, optional
        Name of the matplotlib colormap to use for the heatmap. The default is "nipy_spectral".
    xlabel : str, optional
        Label for the x-axis. The default is "".
    ylabel : str, optional
        Label for the y-axis. The default is "".
    title : str, optional
        Title for the plot. The default is "".
    subtitle : str, optional
        Subtitle for the plot. The default is "".
    show_tick_labels : bool, optional
        Whether to show the tick labels on the x and y axes. These would be the feature names. The default is False.

    Returns
    -------
    corr_fig : matplotlib.figure.Figure
        Figure object containing a Seaborn heatmap.
    """

    if diagonal:
        # Set up mask for hiding half the matrix in the plot.
        # Build it from the matrix shape (not its values) so that cells
        # holding a correlation of exactly 0 are still hidden.
        all_cells = np.ones(correlation_matrix_df.shape, dtype=bool)
        if triangle == "lower":
            # Mask out the upper right triangle half of the matrix, including the diagonal
            mask = np.triu(all_cells)
        elif triangle == "upper":
            # Mask out the lower left triangle half of the matrix, including the diagonal
            mask = np.tril(all_cells)
        else:
            raise ValueError("If diagonal is True, triangle must be either 'lower' or 'upper'.")
    else:
        # The entire correlation matrix will be visible in the plot
        mask = None

    # Set a default title if one is not provided
    if not title:
        title = "Correlation Heatmap"

    # Set up figure and axes for the plot
    corr_fig, corr_ax = plt.subplots()

    # Plot the correlation matrix on the axes created above
    corr_ax = sns.heatmap(correlation_matrix_df,
                          mask=mask,
                          cmap=cmap,
                          vmin=-1.0,
                          vmax=1.0,
                          ax=corr_ax)

    if not show_tick_labels:
        # Remove the individual feature names from the axes
        corr_ax.set_xticklabels(labels=[])
        corr_ax.set_yticklabels(labels=[])

    # Set axis labels
    corr_ax.set_xlabel(xlabel)
    corr_ax.set_ylabel(ylabel)

    # Set title and subtitle
    # Suptitle is the super title, which will be above the title
    corr_ax.set_title(subtitle, fontsize=12)
    corr_fig.suptitle(title, fontsize=14)

    return corr_fig
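
A note on the triangle mask: seaborn's `heatmap` hides every cell where `mask` is truthy, so a mask taken from the matrix values and a mask taken from the matrix shape can disagree wherever a correlation is exactly 0. The sketch below compares the two on a small made-up matrix. `triangle_mask` is a hypothetical helper used only for illustration and is not part of this PR.

```python
import numpy as np


def triangle_mask(n_features: int, triangle: str = "lower") -> np.ndarray:
    """Hypothetical helper: boolean mask where True marks a cell to hide.

    For triangle="lower", the upper triangle and the diagonal are hidden,
    so only the strictly lower triangle remains visible (and vice versa).
    """
    all_cells = np.ones((n_features, n_features), dtype=bool)
    if triangle == "lower":
        return np.triu(all_cells)
    if triangle == "upper":
        return np.tril(all_cells)
    raise ValueError("triangle must be either 'lower' or 'upper'.")


# Made-up 3x3 correlation matrix with one correlation of exactly 0
corr = np.array([[1.0, 0.0, 0.5],
                 [0.0, 1.0, -0.3],
                 [0.5, -0.3, 1.0]])

# Mask derived from the values: the zero at [0, 1] is falsy, so it would stay visible
value_mask = np.triu(corr).astype(bool)

# Mask derived from the shape: every upper-triangle cell is hidden
shape_mask = triangle_mask(corr.shape[0], "lower")

# Cells where the two masks disagree
print(np.argwhere(value_mask != shape_mask))
```

Either array can be passed as `mask=` to `sns.heatmap`; only the shape-based one is independent of the data being plotted.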