feat: add io/readers to ruff config #94

strixy16 · 2024-12-17T17:37:04Z

Updated include list to have the io/readers functions and updated the files so all ruff tests pass.

Summary by CodeRabbit

New Features
- Enhanced flexibility in loading feature files by accepting both Path objects and strings.
- Improved error handling and logging for file loading and directory access.
Bug Fixes
- Enhanced error handling and logging for file loading and directory access, providing clearer context for issues.
Documentation
- Updated docstrings for functions to improve clarity and detail, including return type annotations.

…mage types

coderabbitai · 2024-12-17T17:37:13Z

Walkthrough

The pull request introduces modifications to several files in the readii project, focusing on improving error handling, logging, and code structure. Changes span configuration files and loader modules, with updates to function signatures, error handling mechanisms, and logging approaches. The modifications aim to enhance code robustness, provide more informative error messages, and modernize the codebase by utilizing Path objects and structured error logging.

Changes

File	Change Summary
`config/ruff.toml`	- Updated file inclusion pattern to `"src/readii/io//.py"` - Removed commented-out `select` list - Updated `ignore` list with specific linting rules
`src/readii/io/loaders/features.py`	- Enhanced `loadFeatureFilesFromImageTypes` function - Updated parameter type to accept `Path` or `str` - Improved directory and file handling using `Path` methods - Modernized error handling with logging and pattern matching
`src/readii/io/loaders/general.py`	- Integrated logging for error handling - Added error logging in `loadImageDatasetConfig` and `loadFileToDataFrame` - Maintained existing exception types
`src/readii/io/loaders/images.py`	- Added return type annotation to `getImageTypesFromDirectory` - Enhanced error handling with logging - Improved docstring clarity

Poem

🐰 In the realm of code, a rabbit's delight,
Paths now clearer, errors shine bright
Logging whispers where bugs might hide
With Path and pattern, we now stride
Refactored magic, clean and light! 🔧

Tip

CodeRabbit's docstrings feature is now available as part of our Early Access Program! Simply use the command @coderabbitai generate docstrings to have CodeRabbit automatically generate docstrings for your pull request. We would love to hear your feedback on Discord.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (8)

src/readii/io/loaders/images.py (2)
32-34: Consider using f-strings consistently

While the error logging is good, the message construction could be more consistent with Python's modern practices.
-        msg = f"Directory {raw_data_dir} does not exist."
+        msg = f"Directory '{raw_data_dir}' does not exist."
38-40: Enhance error messages with more context

The error handling pattern is good, but consider adding more context to help users troubleshoot:

For NotADirectoryError: Include the actual file type

For FileNotFoundError: Include the full search pattern
-        msg = f"Path {raw_data_dir} is not a directory."
+        msg = f"Path '{raw_data_dir}' is not a directory (type: {raw_data_dir.stat().st_mode})"

-        msg = f"No files with prefix {feature_file_prefix} and suffix {feature_file_suffix} found in directory {raw_data_dir}."
+        msg = f"No files matching pattern '{feature_file_prefix}*{feature_file_suffix}' found in directory '{raw_data_dir}'"
Also applies to: 44-46
src/readii/io/loaders/general.py (1)
113-115: Consider adding data shape information

When logging empty DataFrame errors, it would be helpful to include the shape information.
-        msg = "Loaded Dataframe is empty"
+        msg = f"Loaded DataFrame is empty (shape: {df.shape})"
src/readii/io/loaders/features.py (5)
10-10: Justify or remove the noqa comment.

The # noqa comment is suppressing linter warnings, but it's not clear which specific warning is being suppressed. Consider either:

Specifying the exact warning code being suppressed (e.g., # noqa: E501 for line length)

Removing it if no longer needed

Line range hint 14-32: Update docstring to match the new type annotation.

The parameter type for extracted_feature_dir should be updated to reflect that it accepts both string and Path objects:
    Parameters
    ----------
-   extracted_feature_dir : str
+   extracted_feature_dir : Union[str, Path]
        Path to the directory containing the extracted feature csv files
45-48: Avoid duplicating the error message.

The error message is defined but not used in the exception. Consider:
    if not extracted_feature_dir.exists() or not extracted_feature_dir.is_dir():
        msg = f"Extracted feature directory {extracted_feature_dir} does not exist."
-       logger.error(f"Extracted feature directory {extracted_feature_dir} does not exist.")
-       raise FileNotFoundError()
+       logger.error(msg)
+       raise FileNotFoundError(msg)
76-78: Consider more specific exception handling.

The current implementation catches all exceptions with a broad Exception clause. Consider catching specific exceptions that might occur during file operations:
-       except Exception as e:
+       except (OSError, IOError) as e:
            logger.warning(f"Error loading {image_type} feature csv files from {extracted_feature_dir}: {e}")
            raise e
104-105: Include error message in ValueError.

Consider passing the error message to the ValueError for better debugging:
        logger.error(f"No valid feature sets were loaded from {extracted_feature_dir}")
-       raise ValueError()
+       raise ValueError(f"No valid feature sets were loaded from {extracted_feature_dir}")

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9b734be and 5dbfc73.

📒 Files selected for processing (4)

config/ruff.toml (1 hunks)
src/readii/io/loaders/features.py (4 hunks)
src/readii/io/loaders/general.py (4 hunks)
src/readii/io/loaders/images.py (2 hunks)

🧰 Additional context used

🪛 Ruff (0.8.2)

src/readii/io/loaders/features.py

93-93: SyntaxError: Expected an indented block after except clause

🔇 Additional comments (7)

src/readii/io/loaders/images.py (2)

4-5: LGTM: Logger import added correctly

The logger import from readii.utils is appropriately placed at the module level.

9-9: LGTM: Return type annotation added

The return type annotation -> list improves type safety and documentation.

src/readii/io/loaders/general.py (2)

6-7: LGTM: Logger import added correctly

The logger import from readii.utils is appropriately placed at the module level.

45-46: LGTM: Error handling in loadImageDatasetConfig

The error handling pattern is consistent and appropriate:

FileNotFoundError for missing config
ConfigError for YAML parsing issues
ConfigError for empty config

Also applies to: 52-54, 57-59

config/ruff.toml (2)

Line range hint 1-116: LGTM: Comprehensive ruff configuration

The configuration is well-structured with:

Clear categorization of rules (type annotations, imports, code quality, etc.)
Appropriate ignores for specific cases
Consistent formatting settings

9-9: Verify the impact of broader include pattern

The pattern src/readii/io/**/**.py will include all Python files under io directory. Let's verify this doesn't accidentally include any test files or other unintended files.

✅ Verification successful

The broader include pattern correctly targets only source files

The pattern src/readii/io/**/**.py is safe to use as it only includes legitimate source files:

All files are properly organized under feature-specific subdirectories (loaders, utils, writers)
No test files were found in the io directory structure
The included files align with expected source code organization

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# List all Python files that will be included by the new pattern
fd '.py$' src/readii/io/

# Check if there are any test files that might be accidentally included
fd -I 'test.*\.py$' src/readii/io/

Length of output: 440

src/readii/io/loaders/features.py (1)

33-36: LGTM! Clean Path object handling.

The conversion from string to Path is implemented correctly with proper type checking.

coderabbitai · 2024-12-17T17:38:48Z

src/readii/io/loaders/general.py

+        msg = "Error pargins {file_path}"
+        logger.exception(msg)
+        raise DataFrameLoadError() from e


⚠️ Potential issue

Fix typo in error message

There's a typo in the error message for file parsing errors.

- msg = "Error pargins {file_path}" + msg = f"Error parsing '{file_path}'"

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

msg = "Error pargins {file_path}"

logger.exception(msg)

raise DataFrameLoadError() from e

msg = f"Error parsing '{file_path}'"

logger.exception(msg)

raise DataFrameLoadError() from e

coderabbitai · 2024-12-17T17:38:48Z

src/readii/io/loaders/features.py

+        except KeyError:
        except KeyError:


⚠️ Potential issue

Fix the duplicated except clause.

There's a syntax error with the duplicated except KeyError: line:

- except KeyError: except KeyError: logger.warning(f"{feature_file_path} does not have the labels {labels_to_drop} to drop.")

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

except KeyError:

except KeyError:

except KeyError:

logger.warning(f"{feature_file_path} does not have the labels {labels_to_drop} to drop.")

🧰 Tools

🪛 Ruff (0.8.2)

93-93: SyntaxError: Expected an indented block after except clause

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (4)

src/readii/io/loaders/features.py (4)
10-10: Remove unnecessary noqa comment.

The # noqa comment appears unnecessary as there are no apparent linting issues with this line. Consider removing it unless there's a specific linting rule being suppressed.

Line range hint 14-32: Update docstring to reflect new parameter type.

The docstring for extracted_feature_dir parameter should be updated to reflect that it now accepts both string and Path objects.
-    extracted_feature_dir : str
+    extracted_feature_dir : Union[str, Path]
45-48: Improve error handling consistency.

The error handling can be improved:

The msg variable is defined but not used in the exception

The error logging message is duplicated
     if not extracted_feature_dir.exists() or not extracted_feature_dir.is_dir():
         msg = f"Extracted feature directory {extracted_feature_dir} does not exist."
-        logger.error(f"Extracted feature directory {extracted_feature_dir} does not exist.")
+        logger.error(msg)
-        raise FileNotFoundError()
+        raise FileNotFoundError(msg)
104-105: Include error message in ValueError.

The ValueError should include the error message for better debugging.
         logger.error(f"No valid feature sets were loaded from {extracted_feature_dir}")
-        raise ValueError()
+        raise ValueError(f"No valid feature sets were loaded from {extracted_feature_dir}")

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5dbfc73 and 99e8924.

⛔ Files ignored due to path filters (1)

pixi.lock is excluded by !**/*.lock

📒 Files selected for processing (1)

src/readii/io/loaders/features.py (4 hunks)

🔇 Additional comments (1)

src/readii/io/loaders/features.py (1)

10-13: Verify all callers handle the new parameter type.

The function signature has been updated to accept both Path and str types. Let's verify that all callers of this function are compatible with this change.

✅ Verification successful

Function appears to be newly added or currently unused

The function loadFeatureFilesFromImageTypes is only defined in src/readii/io/loaders/features.py and exported in src/readii/io/loaders/__init__.py, but there are no actual calls to this function found in the codebase. This suggests that either:

The function is newly added and will be used in future implementations
The function is used in external code that imports this package
The function is used in test files that might be in a separate directory

Since the change to accept both Path and str types makes the function more flexible for future use, and there are no current callers that need updating, this change is safe.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for all calls to loadFeatureFilesFromImageTypes
ast-grep --pattern 'loadFeatureFilesFromImageTypes($$$)'

Length of output: 58

Script:

#!/bin/bash
# Search for function calls using ripgrep for better coverage
rg "loadFeatureFilesFromImageTypes" -l

# Also search for the function definition to understand its location
rg "def loadFeatureFilesFromImageTypes" -l

Length of output: 185

Script:

#!/bin/bash
# Let's see the actual usage context in both files
rg "loadFeatureFilesFromImageTypes" -B 2 -A 2 src/readii/io/loaders/features.py src/readii/io/loaders/__init__.py

Length of output: 1177

Script:

#!/bin/bash
# Let's search for any imports and actual usage of this function in the entire codebase
rg -g '!{features,__init__}.py' "loadFeatureFilesFromImageTypes" -B 3 -A 3

Length of output: 74

src/readii/io/loaders/features.py

codecov · 2024-12-17T17:58:26Z

Codecov Report

Attention: Patch coverage is 9.37500% with 58 lines in your changes missing coverage. Please review.

Project coverage is 40.31%. Comparing base (81fbd78) to head (99e8924).
Report is 11 commits behind head on main.

Files with missing lines	Patch %	Lines
src/readii/io/loaders/features.py	13.79%	25 Missing ⚠️
src/readii/io/loaders/general.py	4.00%	24 Missing ⚠️
src/readii/io/loaders/images.py	10.00%	9 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #94      +/-   ##
==========================================
- Coverage   49.04%   40.31%   -8.73%     
==========================================
  Files          25       32       +7     
  Lines         991     1399     +408     
==========================================
+ Hits          486      564      +78     
- Misses        505      835     +330

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

jjjermiah · 2024-12-18T16:16:47Z

src/readii/io/loaders/features.py

+                    continue
+                case _:
+                    # Multiple files found
+                    msg = f"Multiple {image_type} feature csv files found in {extracted_feature_dir}. First one will be used."


Suggested change

msg = f"Multiple {image_type} feature csv files found in {extracted_feature_dir}. First one will be used."

msg = f"{len(matching_files) csv files found for {image_type} in {extracted_feature_dir}. First one ({matching_files[0]) will be used."

jjjermiah · 2024-12-18T16:19:24Z

src/readii/io/loaders/features.py

+        except Exception as e:
+            logger.warning(f"Error loading {image_type} feature csv files from {extracted_feature_dir}: {e}")
+            raise e


what kind of exceptions do you forsee in this try-except block? Looks like the most possible source of errors are well handled i.e no chance for index error

strixy16 added 9 commits December 17, 2024 12:14

feat: replace os with Path, update handling for matching files with i…

2d8f1ef

…mage types

refactor: ruff fixes, adding logger

614481e

style: skip qa on function name

cfced34

feat: add readii logger

c1133e9

style: make docstring imperative

6b3aaa3

feat: add return type annotation

132b9f1

feat: add readii logger

45606c0

feat: add io/loaders to ruff config

7dda5ed

Merge remote-tracking branch 'origin/main' into katys/update-ruff-config

5dbfc73

strixy16 requested a review from jjjermiah December 17, 2024 17:37

fix: remove duplicate except line

87e562e

coderabbitai bot reviewed Dec 17, 2024

View reviewed changes

build: updated lock file

99e8924

coderabbitai bot reviewed Dec 17, 2024

View reviewed changes

src/readii/io/loaders/features.py Show resolved Hide resolved

jjjermiah approved these changes Dec 18, 2024

View reviewed changes

strixy16 merged commit 7efc97b into main Dec 18, 2024
18 of 20 checks passed

strixy16 deleted the katys/update-ruff-config branch December 18, 2024 16:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add io/readers to ruff config #94

feat: add io/readers to ruff config #94

strixy16 commented Dec 17, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Dec 17, 2024 •

edited

Loading

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

coderabbitai bot left a comment

coderabbitai bot Dec 17, 2024

coderabbitai bot Dec 17, 2024

coderabbitai bot left a comment

codecov bot commented Dec 17, 2024 •

edited

Loading

jjjermiah Dec 18, 2024

jjjermiah Dec 18, 2024

	msg = f"Multiple {image_type} feature csv files found in {extracted_feature_dir}. First one will be used."
	msg = f"{len(matching_files) csv files found for {image_type} in {extracted_feature_dir}. First one ({matching_files[0]) will be used."

feat: add io/readers to ruff config #94

feat: add io/readers to ruff config #94

Conversation

strixy16 commented Dec 17, 2024 • edited by coderabbitai bot Loading

Summary by CodeRabbit

coderabbitai bot commented Dec 17, 2024 • edited Loading

Walkthrough

Changes

Poem

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Dec 17, 2024

Choose a reason for hiding this comment

coderabbitai bot Dec 17, 2024

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

codecov bot commented Dec 17, 2024 • edited Loading

Codecov Report

jjjermiah Dec 18, 2024

Choose a reason for hiding this comment

jjjermiah Dec 18, 2024

Choose a reason for hiding this comment

strixy16 commented Dec 17, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Dec 17, 2024 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

codecov bot commented Dec 17, 2024 •

edited

Loading