Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add analysis functions #92

Merged
merged 41 commits into from
Dec 17, 2024
Merged

feat: add analysis functions #92

merged 41 commits into from
Dec 17, 2024

Conversation

strixy16
Copy link
Collaborator

@strixy16 strixy16 commented Dec 16, 2024

Includes correlation calculations and plotting those correlations as a heatmap and histogram

Summary by CodeRabbit

Release Notes

  • New Features

    • Updated version number to 1.26.0 with new dependencies: numpy, seaborn, and pandas.
    • Introduced a new module for analyzing READII outputs with several correlation functions.
    • Added visualization functions for correlation data: heatmap and histogram.
    • New validation function for DataFrame dimensions added.
    • Expanded platform support to include osx-64 and win-64.
  • Bug Fixes

    • Enhanced error handling in correlation calculations and plot saving processes.
    • Simplified exception handling in feature loading functions.
  • Documentation

    • Improved docstrings for new functions and modules for better usability.
  • Chores

    • Expanded linting configuration for broader coverage of Python files.

Copy link
Contributor

coderabbitai bot commented Dec 16, 2024

Walkthrough

This pull request introduces enhancements to the READII project, focusing on data analysis capabilities. Key changes include the addition of a new analysis module for feature correlation, updated project dependencies, improved error handling in plot writing, and expanded platform support. The version has been incremented from 1.25.0 to 1.26.0, reflecting these improvements.

Changes

File Change Summary
pyproject.toml - Version updated to 1.26.0
- Added dependencies: numpy (1.26.4.*), seaborn (>=0.13.2,<0.14), pandas (>=2.2.3,<3)
- Added platforms: osx-64, win-64
src/readii/analyze/__init__.py - Created new analysis module with functions for correlation analysis
src/readii/analyze/correlation.py - Enhanced correlation calculation functions with validation and error handling
src/readii/analyze/plot_correlation.py - Introduced functions for visualizing correlation data
tests/analyze/test_correlation.py - Added unit tests for correlation functions
src/readii/io/writers/plot_writer.py - Added custom exceptions for plot writing
- Enhanced PlotWriter with validation and error handling
src/readii/data/select.py - Added validateDataframeSubsetSelection function for DataFrame validation
src/readii/image_processing.py - Removed loadRTSTRUCTSITK import
config/ruff.toml - Updated linting configurations
- Expanded file inclusion patterns
src/readii/io/loaders/features.py - Simplified exception handling in loadFeatureFilesFromImageTypes function
notebooks/nifti_writer_example.ipynb - Updated execution counts and import statements; changed Python version metadata

Sequence Diagram

sequenceDiagram
    participant User
    participant CorrelationModule
    participant DataFrame
    participant Plotter

    User->>CorrelationModule: Request feature correlations
    CorrelationModule->>DataFrame: Validate and compute correlations
    DataFrame-->>CorrelationModule: Return correlation matrix
    CorrelationModule->>Plotter: Generate correlation visualization
    Plotter-->>User: Return correlation plot
Loading

Poem

🐰 Correlation's Rabbit Hop! 🔍

In data's wild and winding maze,
Our rabbit leaps through correlation's ways,
Seaborn and pandas, numpy's might,
Plotting patterns, pure delight!
A scientific bunny's gleeful dance! 📊🐇

Tip

CodeRabbit's docstrings feature is now available as part of our Early Access Program! Simply use the command @coderabbitai generate docstrings to have CodeRabbit automatically generate docstrings for your pull request. We would love to hear your feedback on Discord.


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

🧹 Nitpick comments (10)
src/readii/analyze/plot_correlation.py (6)

29-29: Correct typographical error in the docstring.

There's a typo in the parameter description for triangle. "matrixto" should be corrected to "matrix to".

Apply this diff to fix the typo:

-        Which triangle half of the matrixto plot. The default is "lower".
+        Which triangle half of the matrix to plot. The default is "lower".

48-48: Fix typo in the debug logging message.

In the debug message, "traingle" should be corrected to "triangle".

Apply this diff to correct the spelling:

-        logger.debug(f"Creating {triangle} traingle mask for diagonal correlation plot.")
+        logger.debug(f"Creating {triangle} triangle mask for diagonal correlation plot.")

62-62: Correct typographical error in the comment.

There's a typo in the comment: "visisble" should be "visible".

Apply this diff to fix the typo:

-        # The entire correlation matrix will be visisble in the plot
+        # The entire correlation matrix will be visible in the plot

73-77: Ensure consistent spacing in function parameters.

In the sns.heatmap function call, spacing around = should be consistent for readability.

Apply this diff to standardize the spacing:

    corr_ax = sns.heatmap(correlation_matrix_df,
-                         mask = mask,
+                         mask=mask,
                          cmap=cmap,
                          vmin=-1.0,
                          vmax=1.0)

141-142: Replace print statement with logger method.

Using print is not recommended for logging. Please use the logger for consistent logging practices.

Apply this diff to use the logger:

-        print("Correlation matrix is symmetric.")
+        logger.info("Correlation matrix is symmetric.")

27-29: Clarify the diagonal parameter in the docstring.

The description for diagonal might be misleading. Since setting diagonal=True results in plotting only half of the matrix, consider renaming it to mask_half or updating the docstring for clarity.

If renaming is not preferred, consider updating the docstring as follows:

-        diagonal : bool, optional
-            Whether to only plot half of the matrix. The default is False.
+        diagonal : bool, optional
+            Whether to apply a mask and plot only one triangle half of the matrix. The default is False.
src/readii/analyze/correlation.py (1)

50-50: Fix typographical error in comment.

Correct the typo "beginnging" to "beginning".

Apply this diff:

-        # Add _ to beginnging of feature names if they don't start with _ so they can be used as suffixes
+        # Add _ to beginning of feature names if they don't start with _ so they can be used as suffixes
tests/analyze/test_correlation.py (2)

68-72: Enhance exception testing by specifying expected error messages.

When using pytest.raises, include the match parameter to ensure the correct exception is raised.

Apply this diff to specify the expected error message:

     with pytest.raises(ValueError):
+        # Expected message: "Correlation method must be one of 'pearson', 'spearman', or 'kendall'."
     getFeatureCorrelations(vertical_features=random_features,
                            horizontal_features=random_features,
                            method=correlation_method)

Or better yet:

     with pytest.raises(ValueError, match="Correlation method must be one of 'pearson', 'spearman', or 'kendall'\\."):
     getFeatureCorrelations(vertical_features=random_features,
                            horizontal_features=random_features,
                            method=correlation_method)

84-93: Enhance exception testing by specifying expected error messages.

Include the match parameter in pytest.raises for testing TypeError.

Apply this diff:

     with pytest.raises(TypeError, match="horizontal_features must be a pandas DataFrame"):
         getFeatureCorrelations(vertical_features=random_features,
                                horizontal_features=wrong_features,
                                method="pearson")
     with pytest.raises(TypeError, match="vertical_features must be a pandas DataFrame"):
         getFeatureCorrelations(vertical_features=wrong_features,
                                horizontal_features=random_features,
                                method="pearson")
pyproject.toml (1)

16-16: Consider relaxing the numpy version constraint

The current pinned version numpy==1.26.4.* is very restrictive and might cause dependency conflicts with other packages that depend on numpy. Consider using a more flexible version range like >=1.26.4,<2 to allow for compatible updates while maintaining stability.

-  "numpy==1.26.4.*",
+  "numpy>=1.26.4,<2",
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c073262 and 0f9f331.

⛔ Files ignored due to path filters (1)
  • pixi.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • pyproject.toml (1 hunks)
  • src/readii/analyze/__init__.py (1 hunks)
  • src/readii/analyze/correlation.py (1 hunks)
  • src/readii/analyze/plot_correlation.py (1 hunks)
  • tests/analyze/test_correlation.py (1 hunks)
🔇 Additional comments (1)
pyproject.toml (1)

16-18: Verify version compatibility

Let's ensure the new dependencies are compatible with:

  1. The project's Python version range (3.10-3.12)
  2. Each other (numpy, pandas, seaborn have interdependencies)
  3. Existing dependencies (especially matplotlib)
✅ Verification successful

Let me gather information about matplotlib compatibility since it's mentioned in the review.


Dependencies version compatibility verified

All dependencies are compatible:

  • Python version compatibility: All packages support Python 3.10-3.12 (numpy ≥3.10, pandas ≥3.9, seaborn ≥3.8)
  • Inter-package compatibility:
    • seaborn 0.13.2 requires numpy ≥1.20 and pandas ≥1.2
    • pandas 2.2.3 requires numpy ≥1.26.0 for Python 3.12
    • matplotlib ≥3.9.2 requires numpy ≥1.23
  • The specified versions in pyproject.toml (numpy 1.26.4.*, pandas ≥2.2.3, seaborn ≥0.13.2) satisfy all these constraints
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check compatibility of new dependencies

# Check Python version compatibility for new packages
for pkg in numpy pandas seaborn; do
  echo "Checking $pkg compatibility..."
  curl -s "https://pypi.org/pypi/$pkg/json" | jq -r '.info | "Package: \(.name)\nPython requires: \(.requires_python)\nLatest version: \(.version)"'
done

# Check if seaborn declares compatible numpy/pandas versions
curl -s "https://pypi.org/pypi/seaborn/0.13.2/json" | jq -r '.info.requires_dist[]' | grep -E 'numpy|pandas'

# Check if pandas declares compatible numpy versions
curl -s "https://pypi.org/pypi/pandas/2.2.3/json" | jq -r '.info.requires_dist[]' | grep 'numpy'

Length of output: 1371


Script:

#!/bin/bash
# Check matplotlib dependencies and version constraints
echo "Checking matplotlib in pyproject.toml..."
rg "matplotlib" pyproject.toml

echo -e "\nChecking matplotlib compatibility..."
curl -s "https://pypi.org/pypi/matplotlib/json" | jq -r '.info.requires_dist[]' | grep -E 'numpy|pandas'

# Get the specific matplotlib version if defined
echo -e "\nGetting matplotlib version constraint..."
rg -A 1 -B 1 "matplotlib" pyproject.toml

Length of output: 566

src/readii/analyze/plot_correlation.py Show resolved Hide resolved
src/readii/analyze/correlation.py Outdated Show resolved Hide resolved
src/readii/analyze/correlation.py Show resolved Hide resolved
src/readii/analyze/correlation.py Outdated Show resolved Hide resolved
src/readii/analyze/correlation.py Outdated Show resolved Hide resolved
src/readii/analyze/correlation.py Show resolved Hide resolved
src/readii/analyze/correlation.py Show resolved Hide resolved
src/readii/analyze/correlation.py Outdated Show resolved Hide resolved
src/readii/analyze/correlation.py Outdated Show resolved Hide resolved
src/readii/analyze/__init__.py Outdated Show resolved Hide resolved
Copy link

codecov bot commented Dec 16, 2024

Codecov Report

Attention: Patch coverage is 22.39583% with 149 lines in your changes missing coverage. Please review.

Project coverage is 41.23%. Comparing base (81fbd78) to head (5be1831).
Report is 8 commits behind head on main.

Files with missing lines Patch % Lines
src/readii/analyze/plot_correlation.py 16.12% 52 Missing ⚠️
src/readii/io/writers/plot_writer.py 0.00% 37 Missing ⚠️
src/readii/analyze/correlation.py 47.54% 32 Missing ⚠️
src/readii/data/select.py 3.70% 26 Missing ⚠️
src/readii/io/loaders/features.py 0.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #92      +/-   ##
==========================================
- Coverage   49.04%   41.23%   -7.81%     
==========================================
  Files          25       32       +7     
  Lines         991     1363     +372     
==========================================
+ Hits          486      562      +76     
- Misses        505      801     +296     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (3)
src/readii/io/writers/plot_writer.py (3)

9-13: Remove Redundant pass Statements in Exception Classes

When an exception class contains only a docstring, the pass statement is unnecessary because the docstring suffices to define the class body.

Apply this diff to remove the unnecessary pass statements:

 class PlotWriterError(Exception):
     """Base exception for PlotWriter errors."""
 
-    pass


 class PlotWriterIOError(PlotWriterError):
     """Raised when I/O operations fail."""
 
-    pass

Also applies to: 15-19


48-54: Catch Specific Exceptions Instead of General Exception

Catching all exceptions with except Exception as e is too broad and can mask unexpected errors. It's better to catch specific exceptions that may occur during the save operation, such as OSError.

Apply this diff to catch a more specific exception:

-        except Exception as e:
+        except OSError as e:

This change ensures that only I/O-related errors are caught, allowing other exceptions to propagate appropriately.


24-29: Evaluate the Necessity of metadata in the overwrite Field

The metadata parameter in the overwrite field is currently not utilized within the class. If it's not serving a purpose, consider removing it to simplify the code.

Apply this diff to remove the unused metadata:

         overwrite: bool = field(
             default=False,
-            metadata={
-                "help": "If True, allows overwriting existing files. If False, raises PlotWriterIOError."
-            },
         )

If the metadata is intended for future use or integration with external tools (e.g., for command-line interfaces or documentation), you may choose to retain it.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0f9f331 and 689eebf.

📒 Files selected for processing (1)
  • src/readii/io/writers/plot_writer.py (1 hunks)

src/readii/io/writers/plot_writer.py Outdated Show resolved Hide resolved
src/readii/io/writers/plot_writer.py Outdated Show resolved Hide resolved
src/readii/io/writers/plot_writer.py Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
notebooks/nifti_writer_example.ipynb (2)

43-43: Consider enhancing the CSVWriter implementation.

While the current implementation is functional, consider these improvements:

  1. Add more specific type hints for the data parameter (e.g., List[Dict[str, Any]])
  2. Add docstring explaining the expected data format
  3. Add error handling for potential pandas DataFrame conversion issues

Example enhancement:

 class CSVWriter(BaseWriter):
+    """Writer class for saving data to CSV files.
+    
+    Args:
+        data (List[Dict[str, Any]]): List of dictionaries to be converted to CSV
+        **kwargs: Additional arguments for path resolution
+    
+    Returns:
+        Path: Path to the saved CSV file
+    
+    Raises:
+        ValueError: If data cannot be converted to DataFrame
+    """
     def save(self, data: list, **kwargs) -> Path:
+        if not data:
+            raise ValueError("Empty data provided")
         output_path = self.resolve_path(**kwargs)
         with output_path.open('w') as f:
             pd.DataFrame(data).to_csv(f, index=False)
         return output_path

67-67: Consider improving example code structure and configurability.

The example demonstrates good practices with context managers and error handling. Consider these enhancements:

  1. Extract magic numbers into named constants
  2. Make modalities configurable
  3. Add more comprehensive error handling

Example enhancement:

+ # Configuration
+ IMAGE_SIZES = {
+     "small": (10, 10, 10),
+     "medium": (20, 20, 20),
+     "large": (30, 30, 30)
+ }
+ MODALITIES = ["CT", "RTSTRUCT"]
+ 
 data_sets = []
 random.seed(42)
 
- for MODALITY in ["CT", "RTSTRUCT"]:
+ for MODALITY in MODALITIES:
     data_sets.extend([
         {
             "image": sitk.Image(
-                10, 10, 10,
+                *IMAGE_SIZES["small"],
                 sitk.sitkInt16
             ),
             # ... rest of the code
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 689eebf and 23c87e4.

📒 Files selected for processing (1)
  • notebooks/nifti_writer_example.ipynb (4 hunks)
🔇 Additional comments (3)
notebooks/nifti_writer_example.ipynb (3)

19-19: LGTM! Import statements are appropriate.

The imports are well-organized and necessary for the demonstrated functionality.


Line range hint 1-212: Notebook effectively demonstrates NIFTI writing functionality.

The notebook serves as a good example of using the NIFTI writer class and aligns well with the PR objectives. It demonstrates proper file handling, error management, and includes good practices like using context managers and setting random seeds for reproducibility.


212-212: Verify Python version compatibility with dependencies.

The Python version has been updated to 3.12.8. Ensure all dependencies support this version.

✅ Verification successful

Python version update appears safe to proceed

The update from Python 3.12.7 to 3.12.8 is a patch-level change that only includes bug fixes. The notebook uses standard libraries and common packages (SimpleITK, pandas) that are well-maintained and compatible with Python 3.12. No explicit version constraints were found in the project configuration files that would conflict with Python 3.12.8.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if pyproject.toml specifies Python version constraints
if [ -f "pyproject.toml" ]; then
    echo "Checking Python version constraints in pyproject.toml:"
    grep -A 5 "python_requires" pyproject.toml
fi

# Check if requirements.txt exists and contains version constraints
if [ -f "requirements.txt" ]; then
    echo "Checking requirements.txt for potential version conflicts:"
    grep -i "python_version" requirements.txt
fi

Length of output: 219


Script:

#!/bin/bash
# Let's check for any setup.py or setup.cfg files that might contain Python version requirements
if [ -f "setup.py" ]; then
    echo "Checking setup.py for Python version requirements:"
    cat setup.py
fi

if [ -f "setup.cfg" ]; then
    echo "Checking setup.cfg for Python version requirements:"
    cat setup.cfg
fi

# Let's also check for any environment.yml or conda files
if [ -f "environment.yml" ]; then
    echo "Checking environment.yml for Python version:"
    cat environment.yml
fi

# Let's check the notebook metadata for any kernel specifications
echo "Checking notebook kernel specifications:"
fd -e ipynb . -x jq -r '.metadata.kernelspec // empty'

Length of output: 759


Script:

#!/bin/bash
# Let's check for any pixi.toml or other dependency files that might specify Python version
if [ -f "pixi.toml" ]; then
    echo "Checking pixi.toml for Python version requirements:"
    cat pixi.toml
fi

# Let's also check for any Python imports in the notebook to see what dependencies are used
echo "Checking imports in the notebook:"
jq -r '.cells[] | select(.cell_type == "code") | .source[]' notebooks/nifti_writer_example.ipynb | grep -E "^import|^from.*import"

Length of output: 485

@strixy16 strixy16 requested a review from jjjermiah December 16, 2024 20:06
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (2)
src/readii/analyze/plot_correlation.py (2)

31-31: Fix typographical error in docstring

There's a missing space in the docstring. Correct "matrixto" to "matrix to".

Apply this diff to fix the typo:

-            Which triangle half of the matrixto plot. The default is "lower".
+            Which triangle half of the matrix to plot. The default is "lower".

64-64: Fix typographical error in comment

Correct "visisble" to "visible" in the comment.

Apply this diff:

-        # The entire correlation matrix will be visisble in the plot
+        # The entire correlation matrix will be visible in the plot
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 23c87e4 and b07e5a6.

⛔ Files ignored due to path filters (1)
  • pixi.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • .github/workflows/ci-cd.yml (1 hunks)
  • CHANGELOG.md (1 hunks)
  • pyproject.toml (3 hunks)
  • src/readii/__init__.py (1 hunks)
  • src/readii/analyze/__init__.py (1 hunks)
  • src/readii/analyze/correlation.py (1 hunks)
  • src/readii/analyze/plot_correlation.py (1 hunks)
  • src/readii/io/writers/plot_writer.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • src/readii/init.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/readii/analyze/init.py
  • pyproject.toml
🧰 Additional context used
🪛 LanguageTool
CHANGELOG.md

[uncategorized] ~12-~12: Use a comma before ‘so’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...r mac platform to the pixi configuration so developers with those machines can be...

(COMMA_COMPOUND_SENTENCE_2)

🪛 Markdownlint (0.37.0)
CHANGELOG.md

20-20: Column: 20
Hard tabs

(MD010, no-hard-tabs)


20-20: Column: 80
Hard tabs

(MD010, no-hard-tabs)


21-21: Column: 24
Hard tabs

(MD010, no-hard-tabs)

🔇 Additional comments (14)
.github/workflows/ci-cd.yml (1)

19-19: LGTM! Good addition of Windows testing environment.

Adding windows-latest to the OS matrix improves cross-platform compatibility testing coverage.

CHANGELOG.md (1)

4-26: LGTM! Well-structured changelog entry.

The changelog entry follows best practices:

  • Uses conventional commit format
  • Includes PR link and commit hash
  • Provides clear description of changes
  • Contains auto-generated summary
🧰 Tools
🪛 LanguageTool

[uncategorized] ~12-~12: Use a comma before ‘so’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...r mac platform to the pixi configuration so developers with those machines can be...

(COMMA_COMPOUND_SENTENCE_2)

🪛 Markdownlint (0.37.0)

20-20: Column: 20
Hard tabs

(MD010, no-hard-tabs)


20-20: Column: 80
Hard tabs

(MD010, no-hard-tabs)


21-21: Column: 24
Hard tabs

(MD010, no-hard-tabs)

src/readii/analyze/plot_correlation.py (1)

61-61: Include error message when raising ValueError

Currently, ValueError is raised without an error message. Including the message improves debugging and clarity.

Apply this diff to include the error message:

-                raise ValueError()
+                raise ValueError(msg)
src/readii/analyze/correlation.py (10)

33-33: Include error message when raising TypeError

Currently, TypeError is raised without an error message. Including the message improves debugging and clarity.

Apply this diff:

            logger.exception(msg)
-            raise TypeError()
+            raise TypeError(msg)

37-37: Include error message when raising TypeError

Consistent exception messages enhance debugging.

Apply this diff:

            logger.exception(msg)
-            raise TypeError()
+            raise TypeError(msg)

42-42: Include error message when raising ValueError

Adding the error message improves clarity on the exception raised.

Apply this diff:

            logger.exception(msg)
-            raise ValueError()
+            raise ValueError(msg)

47-47: Include error message when raising ValueError

Including the message aids in debugging.

Apply this diff:

            logger.exception(msg)
-            raise ValueError()
+            raise ValueError(msg)

93-93: Include error message when raising ValueError

Consistent error messages provide better debugging information.

Apply this diff:

            logger.exception(msg)
-            raise ValueError()
+            raise ValueError(msg)

98-98: Include error message when raising ValueError

Same as above for consistency.

Apply this diff:

            logger.exception(msg)
-            raise ValueError()
+            raise ValueError(msg)

125-125: Include error message when raising ValueError

Adding the message enhances clarity.

Apply this diff:

            logger.exception(msg)
-            raise ValueError()
+            raise ValueError(msg)

130-130: Include error message when raising ValueError

Include the error message for better debugging.

Apply this diff:

            logger.exception(msg)
-            raise ValueError()
+            raise ValueError(msg)

160-160: Include error message when raising ValueError

Consistent exception handling improves maintainability.

Apply this diff:

            logger.exception(msg)
-            raise ValueError()
+            raise ValueError(msg)

165-165: Include error message when raising ValueError

Adding the message aids in debugging.

Apply this diff:

            logger.exception(msg)
-            raise ValueError()
+            raise ValueError(msg)
src/readii/io/writers/plot_writer.py (1)

1-72: LGTM!

The PlotWriter class is well-implemented with appropriate exception handling and logging.

src/readii/analyze/plot_correlation.py Outdated Show resolved Hide resolved
src/readii/analyze/correlation.py Outdated Show resolved Hide resolved
src/readii/analyze/correlation.py Outdated Show resolved Hide resolved
src/readii/analyze/correlation.py Outdated Show resolved Hide resolved
src/readii/analyze/correlation.py Outdated Show resolved Hide resolved
src/readii/analyze/correlation.py Outdated Show resolved Hide resolved
src/readii/analyze/correlation.py Outdated Show resolved Hide resolved
src/readii/io/writers/plot_writer.py Show resolved Hide resolved
src/readii/io/writers/plot_writer.py Outdated Show resolved Hide resolved
src/readii/analyze/plot_correlation.py Outdated Show resolved Hide resolved
src/readii/analyze/correlation.py Show resolved Hide resolved
src/readii/analyze/plot_correlation.py Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
src/readii/analyze/correlation.py (1)

31-48: ⚠️ Potential issue

Include error messages in exceptions

Error messages are logged but not included in the exceptions, which makes debugging harder.

Apply this diff:

    if not isinstance(vertical_features, pd.DataFrame):
        msg = "vertical_features must be a pandas DataFrame"
        logger.exception(msg)
-       raise TypeError()
+       raise TypeError(msg)
    if not isinstance(horizontal_features, pd.DataFrame):
        msg = "horizontal_features must be a pandas DataFrame"
        logger.exception(msg)
-       raise TypeError()
+       raise TypeError(msg)

    if method not in ["pearson", "spearman", "kendall"]:
        msg = "Correlation method must be one of 'pearson', 'spearman', or 'kendall'."
        logger.exception(msg)
-       raise ValueError()
+       raise ValueError(msg)

    if not vertical_features.index.equals(horizontal_features.index):
        msg = "Vertical and horizontal features must have the same index to calculate correlation. Set the index to the intersection of patient IDs."
        logger.exception(msg)
-       raise ValueError()
+       raise ValueError(msg)
🧹 Nitpick comments (6)
src/readii/analyze/correlation.py (4)

5-9: Add spaces after colons in type annotations

According to PEP 8, there should be a space after the colon in type annotations.

Apply this diff:

-def getFeatureCorrelations(vertical_features:pd.DataFrame,
-                           horizontal_features:pd.DataFrame,
-                           method:str = "pearson",
-                           vertical_feature_name:str = '_vertical',
-                           horizontal_feature_name:str = '_horizontal') -> pd.DataFrame:
+def getFeatureCorrelations(vertical_features: pd.DataFrame,
+                           horizontal_features: pd.DataFrame,
+                           method: str = "pearson",
+                           vertical_feature_name: str = '_vertical',
+                           horizontal_feature_name: str = '_horizontal') -> pd.DataFrame:

94-94: Remove unnecessary f-string prefix

The string doesn't contain any placeholders, so the f-string prefix is unnecessary.

Apply this diff:

-       msg = f"Number of vertical features provided is greater than the number of rows or columns in the correlation matrix."
+       msg = "Number of vertical features provided is greater than the number of rows or columns in the correlation matrix."
🧰 Tools
🪛 Ruff (0.8.2)

94-94: f-string without any placeholders

Remove extraneous f prefix

(F541)


122-122: Fix typo and remove unnecessary f-string prefix

There's a typo in the error message ("horizontalfeatures" should be "horizontal features"), and the f-string prefix is unnecessary.

Apply this diff:

-       msg = f"Number of horizontalfeatures provided is greater than the number of rows or columns in the correlation matrix."
+       msg = "Number of horizontal features provided is greater than the number of rows or columns in the correlation matrix."
🧰 Tools
🪛 Ruff (0.8.2)

122-122: f-string without any placeholders

Remove extraneous f prefix

(F541)


154-154: Remove unnecessary f-string prefix

The string doesn't contain any placeholders, so the f-string prefix is unnecessary.

Apply this diff:

-       msg = f"Number of vertical features provided is greater than the number of rows or columns in the correlation matrix."
+       msg = "Number of vertical features provided is greater than the number of rows or columns in the correlation matrix."
🧰 Tools
🪛 Ruff (0.8.2)

154-154: f-string without any placeholders

Remove extraneous f prefix

(F541)

src/readii/io/writers/plot_writer.py (2)

54-80: Enhance docstring with parameter descriptions.

The save method implementation is robust, but the docstring could be more descriptive.

-    """Save the plot to a .png file."""
+    """Save the plot to a file.
+    
+    Args:
+        plot: The matplotlib Figure to save
+        **kwargs: Format specifiers for the output filename
+    
+    Returns:
+        Path: The path where the plot was saved
+    
+    Raises:
+        PlotWriterIOError: If the file exists and overwrite=False,
+            or if there's an error saving the plot
+    
+    Note:
+        Uses bbox_inches='tight' to remove extra whitespace around the figure
+    """

83-93: Consider adding more comprehensive usage examples.

While the current example is helpful, consider adding examples that demonstrate:

  • Different file formats
  • Error handling scenarios
  • Various plot types
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d917b7a and 9b1424a.

📒 Files selected for processing (4)
  • src/readii/analyze/correlation.py (1 hunks)
  • src/readii/analyze/plot_correlation.py (1 hunks)
  • src/readii/data/select.py (1 hunks)
  • src/readii/io/writers/plot_writer.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/readii/analyze/plot_correlation.py
🧰 Additional context used
🪛 Ruff (0.8.2)
src/readii/analyze/correlation.py

94-94: f-string without any placeholders

Remove extraneous f prefix

(F541)


122-122: f-string without any placeholders

Remove extraneous f prefix

(F541)


154-154: f-string without any placeholders

Remove extraneous f prefix

(F541)

🔇 Additional comments (5)
src/readii/analyze/correlation.py (1)

50-71: LGTM! Well-implemented correlation calculation

The code properly handles feature name prefixes, uses inner join for data alignment, and has appropriate error handling for the correlation calculation.

src/readii/io/writers/plot_writer.py (4)

1-20: LGTM! Well-structured exception hierarchy.

The imports are clean and the exception hierarchy is well designed with a base class and specific error types.


22-26: LGTM! Clear and purposeful validation exception.

The validation exception is well documented and fits appropriately in the exception hierarchy.


31-36: Ensure consistency between help text and raised exception.

The metadata help text states that a FileExistsError will be raised, but the code raises PlotWriterIOError instead.

Choose one of these options:

# Option 1: Update help text
             metadata={
-                "help": "If True, allows overwriting existing files. If False, raises FileExistsError."
+                "help": "If True, allows overwriting existing files. If False, raises PlotWriterIOError."
             },

# Option 2: Use built-in exception
-                raise PlotWriterIOError(msg)
+                raise FileExistsError(msg)

46-53: LGTM! Robust validation with clear error messaging.

The post-init validation effectively ensures valid file extensions and provides a helpful error message.

src/readii/data/select.py Show resolved Hide resolved
src/readii/io/writers/plot_writer.py Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (3)
src/readii/analyze/plot_correlation.py (1)

62-62: ⚠️ Potential issue

Include error message in ValueError.

The error message should be included when raising the ValueError for better debugging.

Apply this fix:

-            raise ValueError()
+            raise ValueError(msg)
src/readii/analyze/correlation.py (2)

33-40: ⚠️ Potential issue

Include error messages in TypeError raises.

Error messages should be included when raising TypeError for better debugging.

Apply these fixes:

-        raise TypeError()
+        raise TypeError(msg)
-        raise TypeError()
+        raise TypeError(msg)

42-45: ⚠️ Potential issue

Include error message in ValueError raise.

Error message should be included when raising ValueError for better debugging.

Apply this fix:

-        raise ValueError()
+        raise ValueError(msg)
🧹 Nitpick comments (5)
src/readii/io/writers/plot_writer.py (2)

91-92: Consider adding DPI parameter for better control over image quality

When saving plots, especially for publication or high-resolution displays, controlling the DPI (dots per inch) can be important.

-            plot.savefig(out_path, bbox_inches='tight')
+            plot.savefig(out_path, bbox_inches='tight', dpi=300)  # or make DPI configurable

94-96: Enhance error logging for better debugging

Consider logging the full exception traceback for better debugging capabilities.

         except Exception as e:
+            logger.exception(f"Failed to save plot to {out_path}")
             msg = f"Error saving plot to file {out_path}: {e}"
             raise PlotWriterIOError(msg) from e
src/readii/analyze/plot_correlation.py (2)

73-80: Consider memory optimization for large correlation matrices.

For large correlation matrices, consider adding a parameter to downsample or chunk the data before visualization to prevent memory issues.


109-109: Fix typo in docstring.

There's a spacing issue in the docstring.

-    """Plot a histogram to show thedistribution of correlation values for a correlation matrix.
+    """Plot a histogram to show the distribution of correlation values for a correlation matrix.
src/readii/analyze/correlation.py (1)

124-124: Fix typo in error message.

There's a spacing issue in the error message.

-        msg = "Number of horizontalfeatures provided is greater than the number of rows or columns in the correlation matrix."
+        msg = "Number of horizontal features provided is greater than the number of rows or columns in the correlation matrix."
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9b1424a and b62b735.

⛔ Files ignored due to path filters (1)
  • pixi.lock is excluded by !**/*.lock
📒 Files selected for processing (7)
  • config/ruff.toml (1 hunks)
  • src/readii/analyze/__init__.py (1 hunks)
  • src/readii/analyze/correlation.py (1 hunks)
  • src/readii/analyze/plot_correlation.py (1 hunks)
  • src/readii/image_processing.py (0 hunks)
  • src/readii/io/loaders/features.py (2 hunks)
  • src/readii/io/writers/plot_writer.py (1 hunks)
💤 Files with no reviewable changes (1)
  • src/readii/image_processing.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/readii/analyze/init.py
🔇 Additional comments (6)
src/readii/io/loaders/features.py (1)

56-56: LGTM: Exception handling improvement

Good cleanup of unused exception variables in the catch blocks. This follows Python best practices and improves code clarity.

Also applies to: 73-73

src/readii/io/writers/plot_writer.py (2)

34-35: ⚠️ Potential issue

Update help text to match actual exception

The help text mentions FileExistsError but the code raises PlotWriterIOError.

-            "help": "If True, allows overwriting existing files. If False, raises FileExistsError."
+            "help": "If True, allows overwriting existing files. If False, raises PlotWriterIOError."

Likely invalid or redundant comment.


38-44: ⚠️ Potential issue

Fix type annotation for VALID_EXTENSIONS

The type annotation indicates a list but the value is a tuple. This mismatch should be corrected.

-    VALID_EXTENSIONS: ClassVar[list[str]] = (
+    VALID_EXTENSIONS: ClassVar[tuple[str, ...]] = (

Likely invalid or redundant comment.

config/ruff.toml (1)

9-11: LGTM: Appropriate path additions

The new include paths correctly cover the new analysis functionality being added to the project.

src/readii/analyze/plot_correlation.py (1)

1-11: LGTM! Clean and appropriate imports.

The imports are well-organized and necessary for the functionality.

src/readii/analyze/correlation.py (1)

93-98: Refactor duplicate validation logic.

The validation logic is repeated across functions. Consider creating a helper function.

Create a helper function to handle the validation:

def validate_feature_count(correlation_matrix: pd.DataFrame, feature_count: int, feature_type: str):
    try:
        validateDataframeSubsetSelection(correlation_matrix, feature_count, feature_count)
    except ValueError as e:
        msg = f"Number of {feature_type} features provided is greater than the number of rows or columns in the correlation matrix."
        logger.exception(msg)
        raise e

Also applies to: 121-126, 152-157

src/readii/analyze/plot_correlation.py Outdated Show resolved Hide resolved
src/readii/analyze/plot_correlation.py Outdated Show resolved Hide resolved
src/readii/analyze/correlation.py Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/readii/analyze/plot_correlation.py (2)

13-49: Consider using a more specific return type annotation.

While the implementation is good, the return type annotation could be more specific by using matplotlib.figure.Figure instead of just Figure.

Apply this diff:

-                           ) -> Figure:
+                           ) -> matplotlib.figure.Figure:

114-114: Fix typo in docstring.

There's a typo in the docstring: "thedistribution" should be "the distribution".

Apply this diff:

-    """Plot a histogram to show thedistribution of correlation values for a correlation matrix.
+    """Plot a histogram to show the distribution of correlation values for a correlation matrix.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b62b735 and 4cf18a4.

📒 Files selected for processing (2)
  • src/readii/analyze/plot_correlation.py (1 hunks)
  • src/readii/io/writers/plot_writer.py (1 hunks)
🔇 Additional comments (6)
src/readii/io/writers/plot_writer.py (4)

1-26: LGTM! Well-structured exception hierarchy.

The exception classes are properly organized with clear inheritance and docstrings.


46-53: LGTM! Good validation implementation.

The post-init validation is well implemented with clear error messages.


102-112: LGTM! Good example implementation.

The main block provides a clear example of how to use the PlotWriter class.


38-44: ⚠️ Potential issue

Fix type annotation for VALID_EXTENSIONS.

The type annotation indicates list[str] but the actual value is a tuple. This inconsistency should be fixed.

Apply this diff to fix the type annotation:

-    VALID_EXTENSIONS: ClassVar[list[str]] = (
+    VALID_EXTENSIONS: ClassVar[tuple[str, ...]] = (

Likely invalid or redundant comment.

src/readii/analyze/plot_correlation.py (2)

164-166: ⚠️ Potential issue

Include error message in ValueError.

The error message is logged but not included in the raised ValueError.

Apply this diff:

         msg = f"Number of bins must be positive, got {num_bins}"
         logger.exception(msg)
-        raise ValueError()
+        raise ValueError(msg)

Likely invalid or redundant comment.


60-62: ⚠️ Potential issue

Include error message in ValueError.

The error message is logged but not included in the raised ValueError.

Apply this diff:

             msg = f"If diagonal is True, triangle must be either 'lower' or 'upper'. Got {triangle}."
             logger.exception(msg)
-            raise ValueError()
+            raise ValueError(msg)

Likely invalid or redundant comment.

Comment on lines 54 to 99
def save(self, plot:matplotlib.figure.Figure, **kwargs: str) -> Path:
"""Save the plot to a .png file with extra whitespace removed.

Parameters
----------
plot : matplotlib.figure.Figure
The plot to save.
kwargs : dict
Additional keyword arguments to pass to the save method.

Returns
-------
Path
The path to the saved file.

Raises
------
PlotWriterIOError
If there is an error saving the plot.
PlotWriterValidationError
If the filename format is invalid.
"""
logger.debug("Saving.", kwargs=kwargs)

# Generate the output path
out_path = self.resolve_path(**kwargs)

# Check if the output path already exists
if out_path.exists():
if not self.overwrite:
msg = f"File {out_path} already exists. \nSet {self.__class__.__name__}.overwrite to True to overwrite."
raise PlotWriterIOError(msg)
else:
logger.warning(f"File {out_path} already exists. Overwriting.")

logger.debug("Saving plot to file", out_path=out_path)
try:
# Save the plot to the output path, remove extra whitespace around the figure
plot.savefig(out_path, bbox_inches='tight')

except Exception as e:
msg = f"Error saving plot to file {out_path}: {e}"
raise PlotWriterIOError(msg) from e
else:
logger.info("Plot saved successfully.", out_path=out_path)
return out_path
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add type validation for the plot parameter.

While the method implementation is good, it should validate that the plot parameter is actually a Figure object before proceeding.

Add this validation at the start of the method:

     def save(self, plot:matplotlib.figure.Figure, **kwargs: str) -> Path:
+        if not isinstance(plot, matplotlib.figure.Figure):
+            msg = f"Expected matplotlib.figure.Figure, got {type(plot)}"
+            raise PlotWriterValidationError(msg)
+
         logger.debug("Saving.", kwargs=kwargs)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def save(self, plot:matplotlib.figure.Figure, **kwargs: str) -> Path:
"""Save the plot to a .png file with extra whitespace removed.
Parameters
----------
plot : matplotlib.figure.Figure
The plot to save.
kwargs : dict
Additional keyword arguments to pass to the save method.
Returns
-------
Path
The path to the saved file.
Raises
------
PlotWriterIOError
If there is an error saving the plot.
PlotWriterValidationError
If the filename format is invalid.
"""
logger.debug("Saving.", kwargs=kwargs)
# Generate the output path
out_path = self.resolve_path(**kwargs)
# Check if the output path already exists
if out_path.exists():
if not self.overwrite:
msg = f"File {out_path} already exists. \nSet {self.__class__.__name__}.overwrite to True to overwrite."
raise PlotWriterIOError(msg)
else:
logger.warning(f"File {out_path} already exists. Overwriting.")
logger.debug("Saving plot to file", out_path=out_path)
try:
# Save the plot to the output path, remove extra whitespace around the figure
plot.savefig(out_path, bbox_inches='tight')
except Exception as e:
msg = f"Error saving plot to file {out_path}: {e}"
raise PlotWriterIOError(msg) from e
else:
logger.info("Plot saved successfully.", out_path=out_path)
return out_path
def save(self, plot:matplotlib.figure.Figure, **kwargs: str) -> Path:
if not isinstance(plot, matplotlib.figure.Figure):
msg = f"Expected matplotlib.figure.Figure, got {type(plot)}"
raise PlotWriterValidationError(msg)
logger.debug("Saving.", kwargs=kwargs)
# Generate the output path
out_path = self.resolve_path(**kwargs)
# Check if the output path already exists
if out_path.exists():
if not self.overwrite:
msg = f"File {out_path} already exists. \nSet {self.__class__.__name__}.overwrite to True to overwrite."
raise PlotWriterIOError(msg)
else:
logger.warning(f"File {out_path} already exists. Overwriting.")
logger.debug("Saving plot to file", out_path=out_path)
try:
# Save the plot to the output path, remove extra whitespace around the figure
plot.savefig(out_path, bbox_inches='tight')
except Exception as e:
msg = f"Error saving plot to file {out_path}: {e}"
raise PlotWriterIOError(msg) from e
else:
logger.info("Plot saved successfully.", out_path=out_path)
return out_path

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/readii/io/writers/plot_writer.py (2)

11-26: Remove unnecessary pass statements from exception classes

Since the exception classes have docstrings, the pass statements are redundant and can be removed for cleaner code.

 class PlotWriterError(Exception):
     """Base exception for PlotWriter errors."""
-    pass

 class PlotWriterIOError(PlotWriterError):
     """Raised when I/O operations fail."""
-    pass

 class PlotWriterValidationError(PlotWriterError):
     """Raised when validation of writer configuration fails."""
-    pass

106-111: Consider using a more conventional example path

The example uses "TRASH" in the path, which isn't a conventional directory name. Consider using a more standard path like "examples" or "output".

     plot_writer = PlotWriter(
-        root_directory=Path("TRASH", "plot_writer_examples"),
+        root_directory=Path("examples", "plots"),
         filename_format="{DatasetName}_{PlotType}_{FeatureType}_{ImageType}_plot.png",
         overwrite=True,
         create_dirs=True
     )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4cf18a4 and 173b212.

📒 Files selected for processing (2)
  • config/ruff.toml (1 hunks)
  • src/readii/io/writers/plot_writer.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • config/ruff.toml
🔇 Additional comments (3)
src/readii/io/writers/plot_writer.py (3)

39-45: Use tuple instead of list for VALID_EXTENSIONS

The comment indicates that extensions should be immutable, but the implementation uses a list. Additionally, the syntax used is for a tuple but the type annotation indicates a list.

-    VALID_EXTENSIONS: ClassVar[list[str]] = (
+    VALID_EXTENSIONS: ClassVar[tuple[str, ...]] = (
         ".png",
         ".pdf",
         ".ps",
         ".eps",
         ".svg"
     )

47-54: LGTM! Well-implemented validation

The post-initialization validation is thorough and provides clear error messages. Good practice calling the parent's __post_init__ method first.


55-100: 🛠️ Refactor suggestion

Add type validation for the plot parameter

While the implementation is solid, it should validate that the plot parameter is actually a Figure object before proceeding.

Add this validation at the start of the method:

     def save(self, plot:Figure, **kwargs: str) -> Path:
+        if not isinstance(plot, Figure):
+            msg = f"Expected matplotlib.figure.Figure, got {type(plot)}"
+            raise PlotWriterValidationError(msg)
+
         logger.debug("Saving.", kwargs=kwargs)

Likely invalid or redundant comment.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
src/readii/analyze/correlation.py (2)

33-40: ⚠️ Potential issue

Include error messages in TypeError exceptions

Error messages are logged but not included in the raised exceptions, which loses context for error handling.

    if not isinstance(vertical_features, pd.DataFrame):
        msg = "vertical_features must be a pandas DataFrame"
        logger.exception(msg)
-        raise TypeError()
+        raise TypeError(msg)
    if not isinstance(horizontal_features, pd.DataFrame):
        msg = "horizontal_features must be a pandas DataFrame"
        logger.exception(msg)
-        raise TypeError()
+        raise TypeError(msg)

48-51: ⚠️ Potential issue

Include error message in ValueError for invalid correlation method

The error message should be included in the ValueError for better error handling.

    if method not in ["pearson", "spearman", "kendall"]:
        msg = "Correlation method must be one of 'pearson', 'spearman', or 'kendall'."
        logger.exception(msg)
-        raise ValueError()
+        raise ValueError(msg)
🧹 Nitpick comments (4)
src/readii/analyze/correlation.py (4)

12-12: Use imperative form in docstring

Per PEP-257, one-line docstrings should be in imperative form.

-    """Calculate correlation between two sets of features.
+    """Calculate correlations between two sets of features and return correlation matrix.

85-85: Enhance docstring clarity

The docstring could better explain the matrix quadrant selection and its significance.

-    """Get the vertical (y-axis) self correlations from a correlation matrix. Gets the top left quadrant of the correlation matrix.
+    """Extract vertical feature self-correlations from the top-left quadrant of the correlation matrix.
+    
+    This quadrant represents how vertical features correlate with each other.

130-130: Fix typo in error message

There's a missing space in "horizontalfeatures".

-        msg = "Number of horizontalfeatures provided is greater than the number of rows or columns in the correlation matrix."
+        msg = "Number of horizontal features provided is greater than the number of rows or columns in the correlation matrix."

99-104: Consider refactoring common error handling pattern

The error handling pattern is repeated across functions. Consider extracting it into a helper function.

def handle_validation_error(e: ValueError, context: str) -> None:
    msg = f"Number of {context} features provided is greater than the number of rows or columns in the correlation matrix."
    logger.exception(msg)
    raise e

# Usage in functions:
try:
    validateDataframeSubsetSelection(correlation_matrix, num_vertical_features, num_vertical_features)
except ValueError as e:
    handle_validation_error(e, "vertical")

Also applies to: 127-132, 158-163

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 173b212 and 5be1831.

📒 Files selected for processing (1)
  • src/readii/analyze/correlation.py (1 hunks)
🔇 Additional comments (2)
src/readii/analyze/correlation.py (2)

1-6: LGTM! Clean import structure

The imports are well-organized and appropriately separated, importing only the necessary dependencies.


142-165: LGTM! Well-implemented cross-correlation extraction

The function follows consistent patterns with proper validation and error handling. The implementation correctly extracts the top-right quadrant for cross-correlations.

@strixy16 strixy16 requested a review from jjjermiah December 17, 2024 16:53
@jjjermiah jjjermiah merged commit a57182c into main Dec 17, 2024
18 of 20 checks passed
@jjjermiah jjjermiah deleted the katys/add-analysis branch December 17, 2024 17:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

2 participants