Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tests: allow for parallel test execution #180

Closed
wants to merge 14 commits into from

Conversation

jjjermiah
Copy link
Contributor

@jjjermiah jjjermiah commented Jan 17, 2025

Explicitly define exported members in the module to improve clarity. Update test configurations for better coverage reporting and parallel execution. Refactor tests to utilize temporary paths for output handling.

Summary by CodeRabbit

  • Configuration

    • Updated pytest configuration options for more flexible test execution.
    • Expanded coverage reporting exclusions to include test files.
    • Added a new task for cleaning up test-related directories.
  • Module Improvements

    • Refined import and export mechanism in the ops module.
    • Explicitly defined public interface for better module clarity.
  • Testing Enhancements

    • Improved test file management using pytest's temporary directory fixture.
    • Introduced new fixture for modality paths to enhance test organization.
    • Updated test cases to use built-in tmp_path for better test isolation.
  • Code Style

    • Consistent quote usage in test files.
    • Minor formatting improvements across test suites.
    • Simplified assertion formatting in test cases.
  • Chores

    • Added .lock files to .gitignore to prevent tracking.

Copy link
Contributor

coderabbitai bot commented Jan 17, 2025

Warning

Rate limit exceeded

@jjjermiah has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 10 minutes and 42 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 0d5199f and 3d41103.

📒 Files selected for processing (2)
  • tests/test_components.py (3 hunks)
  • tests/test_modalities.py (1 hunks)

Walkthrough

This pull request introduces several configuration and code refinements across multiple files. The changes focus on improving test configuration in pytest.ini, enhancing module imports in src/imgtools/ops/__init__.py, and updating test files to use pytest's built-in temporary directory management. The modifications aim to increase code clarity, control over module exports, and improve test isolation by leveraging pytest's native features.

Changes

File Change Summary
config/pytest.ini - Removed --cov=imgtools option
- Added --cov-append
- Added --numprocesses=auto
- Added --maxprocesses=8
- Added --dist=loadgroup
src/imgtools/ops/__init__.py - Replaced wildcard import with explicit imports
- Added __all__ list defining public interface
tests/dicom/sort/test_sort_utils.py - Updated string literals from single to double quotes
- Modified test_invalid_dicom_file to use tmp_path
tests/test_ops.py - Commented out custom output_path fixture
- Updated test_output to use tmp_path
- Replaced single quotes with double quotes
config/coverage.toml - Updated omit list to include "tests/**/*.py"
pixi.toml - Added new task [feature.test.tasks.clean_tests] to clean test cache and data
tests/conftest.py - Updated dataset_path fixture scope from 'session' to 'package'
- Removed modalities_path fixture
- Added prepare_dataset fixture
tests/test_components.py - Reformatted assertions in test_pipeline method for clarity
tests/test_modalities.py - Added new fixture modalities_path
- Removed parameterization for modalities
.gitignore - Added entry for *.lock files

Possibly related PRs

Suggested reviewers

  • skim2257

Poem

🐰 A Pytest Rabbit's Delight

Config tweaked with careful might,
Imports now clear, no wildcard's flight,
Temporary paths, clean and bright,
Tests dancing with pytest's light!

Hop, hop, hooray! 🧪✨


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/dicom/sort/test_sort_utils.py (1)

54-54: Use context manager for temporary file creation.

Using a context manager ensures proper resource cleanup.

-    temp_file = Path(tempfile.NamedTemporaryFile(suffix=".dcm", delete=False).name)
+    with tempfile.NamedTemporaryFile(suffix=".dcm", delete=False) as tf:
+        temp_file = Path(tf.name)
🧰 Tools
🪛 Ruff (0.8.2)

54-54: Use a context manager for opening files

(SIM115)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 442d69b and 86abb74.

📒 Files selected for processing (4)
  • config/pytest.ini (2 hunks)
  • src/imgtools/ops/__init__.py (1 hunks)
  • tests/dicom/sort/test_sort_utils.py (2 hunks)
  • tests/test_ops.py (6 hunks)
🧰 Additional context used
🪛 GitHub Actions: CI-CD
config/pytest.ini

[error] Test execution was canceled before completion. The pipeline operation was terminated unexpectedly.

🪛 Ruff (0.8.2)
tests/test_ops.py

3-3: shutil imported but unused

Remove unused import: shutil

(F401)

tests/dicom/sort/test_sort_utils.py

54-54: Use a context manager for opening files

(SIM115)

🔇 Additional comments (5)
src/imgtools/ops/__init__.py (1)

1-41: Well-structured module interface definition!

The explicit imports and __all__ list definition improve code clarity and provide better control over the module's public interface. This change follows Python best practices.

tests/test_ops.py (1)

26-32: Great use of pytest's tmp_path fixture!

The replacement of the custom output_path fixture with pytest's built-in tmp_path fixture is an excellent change that:

  • Improves test isolation
  • Enables parallel test execution
  • Eliminates the need for manual cleanup

Also applies to: 48-53

config/pytest.ini (2)

30-38: Well-configured parallel test execution!

The addition of:

  • --cov-append ensures proper coverage data collection during parallel execution
  • --numprocesses=auto with --maxprocesses=8 provides a good balance between performance and resource usage

The pipeline failure might be related to resource constraints. Consider:

✅ Verification successful

Parallel test configuration is appropriate for CI environment

The current configuration with --numprocesses=auto and --maxprocesses=8 is well-balanced:

  • Auto-detection ensures optimal resource utilization on CI runners
  • Max 8 processes prevents excessive parallelization
  • Coverage append is correctly configured
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check available resources in the CI environment
nproc
free -h

Length of output: 208


Script:

#!/bin/bash
# Search for CI configuration files
fd -e yml -e yaml -e json . .github/workflows

# Look for any resource-related documentation
rg -i "resource|memory|cpu|processor" README.md docs/ 2>/dev/null

# Check for any test-related environment variables in CI configs
rg -i "test.*env|env.*test|resource|memory|cpu|processor" .github/workflows/

Length of output: 413

🧰 Tools
🪛 GitHub Actions: CI-CD

[error] Test execution was canceled before completion. The pipeline operation was terminated unexpectedly.


17-17: Verify coverage target configuration.

The removal of the explicit coverage target (--cov=imgtools--cov) requires the target to be specified elsewhere. Please ensure it's properly configured in config/coverage.toml.

✅ Verification successful

Coverage configuration is properly set up

The removal of --cov=imgtools is safe as the coverage configuration is correctly defined in coverage.toml. The package will be automatically discovered since it follows the src layout pattern (src/imgtools).

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if coverage target is specified in coverage.toml
cat config/coverage.toml | grep -A 5 '\[paths\]'

Length of output: 50


Script:

#!/bin/bash
# Find coverage configuration files
fd -e toml -e cfg -e ini -e coveragerc --max-depth 2

# Check pyproject.toml for coverage settings
echo "=== pyproject.toml coverage settings ==="
cat pyproject.toml 2>/dev/null | grep -A 10 '\[tool\.coverage' || true

# Check setup.cfg for coverage settings
echo -e "\n=== setup.cfg coverage settings ==="
cat setup.cfg 2>/dev/null | grep -A 10 '\[coverage:' || true

# Check .coveragerc
echo -e "\n=== .coveragerc settings ==="
cat .coveragerc 2>/dev/null || true

Length of output: 547


Script:

#!/bin/bash
# Display full content of coverage.toml
echo "=== Full content of coverage.toml ==="
cat config/coverage.toml || true

# Search for coverage target patterns in all config files
echo -e "\n=== Coverage patterns in config files ==="
rg -i "source|coverage.*path|imgtools" config/coverage.toml config/pytest.ini pyproject.toml 2>/dev/null || true

Length of output: 858

🧰 Tools
🪛 GitHub Actions: CI-CD

[error] Test execution was canceled before completion. The pipeline operation was terminated unexpectedly.

tests/dicom/sort/test_sort_utils.py (1)

144-150: Good use of tmp_path fixture!

The addition of the tmp_path fixture improves test isolation and enables parallel execution.

@@ -1,5 +1,6 @@
import copy
import pathlib
import shutil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Remove unused import.

The shutil import is no longer needed after removing the manual cleanup code.

-import shutil
🧰 Tools
🪛 Ruff (0.8.2)

3-3: shutil imported but unused

Remove unused import: shutil

(F401)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (5)
pixi.toml (1)

67-69: Consider improvements to the clean_tests task implementation.

While the task serves a good purpose for cleaning up test artifacts, there are several improvements to consider:

  1. The task should be a dependency for the test task to ensure a clean state before testing:
 [feature.test.tasks.test]
 cmd = "pytest -c config/pytest.ini --rootdir ."
 inputs = ["src", "tests", "config/pytest.ini", "config/coverage.toml"]
+depends-on = ["clean_tests"]
 description = "Run pytest (make sure to run in root directory)"
  1. Consider using pytest's built-in temporary directory management (tmp_path fixture) instead of manually cleaning up ./tests/temp, as this would better support parallel test execution.
tests/conftest.py (1)

42-45: Consider consolidating repetitive path construction logic.

The path construction for imgtools files follows a pattern that could be made more maintainable.

+    def _make_imgtools_path(base_path, suffix=""):
+        return pathlib.Path(base_path, f"imgtools_{dataset_name}{suffix}").as_posix()
+
-    crawl_path = pathlib.Path(imgtools_path, f"imgtools_{dataset_name}.csv").as_posix()
-    edge_path = pathlib.Path(
-        imgtools_path, f"imgtools_{dataset_name}_edges.csv"
-    ).as_posix()
+    crawl_path = _make_imgtools_path(imgtools_path, ".csv")
+    edge_path = _make_imgtools_path(imgtools_path, "_edges.csv")
tests/test_modalities.py (3)

16-36: Consider making the modality paths more configurable.

While the fixture effectively organizes modality paths, the hard-coded paths could be made more maintainable and configurable.

Consider using a configuration file or environment variables for these paths:

@pytest.fixture(scope="module")
def modalities_path(curr_path, dataset_path):
    config = {
        "CT": "3.000000-Merged-06362",
        "RTSTRUCT": "Pinnacle POI-41418",
        "RTDOSE": "11376",
        "PT": "532790.000000-LOR-RAMLA-44600"
    }
    
    qc_path = pathlib.Path(curr_path, "data", "Head-Neck-PET-CT", "HN-CHUS-052")
    assert qc_path.exists(), "Dataset not found"
    
    return {
        modality: str(pathlib.Path(qc_path, parent_dir, file_name))
        for modality, file_name in config.items()
        for parent_dir in [next(p for p in qc_path.glob("*") if file_name in str(p))]
    }

39-39: Consider adding descriptive test IDs to the parameterization.

Adding test IDs would make test output more readable and help identify which modality is being tested.

-@pytest.mark.parametrize("modalities", ["CT", "RTSTRUCT", "RTDOSE", "PT"])
+@pytest.mark.parametrize("modalities", ["CT", "RTSTRUCT", "RTDOSE", "PT"],
+                         ids=lambda x: f"testing_{x}_modality")

68-70: Consider parameterizing ROI names for better test flexibility.

The hard-coded ROI names make the test less flexible and harder to maintain.

+@pytest.fixture(params=[
+    {"names": ["GTV.?", "LARYNX"], "expected_regions": 2},
+    {"names": ["GTV.?"], "expected_regions": 1},
+])
+def roi_config(request):
+    return request.param

-        make_binary_mask = StructureSetToSegmentation(
-            roi_names=["GTV.?", "LARYNX"], continuous=False
-        )
+        make_binary_mask = StructureSetToSegmentation(
+            roi_names=roi_config["names"], continuous=False
+        )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 86abb74 and 89bab4e.

⛔ Files ignored due to path filters (1)
  • pixi.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • config/coverage.toml (1 hunks)
  • pixi.toml (1 hunks)
  • tests/conftest.py (1 hunks)
  • tests/test_components.py (2 hunks)
  • tests/test_modalities.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • tests/test_components.py
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Unit-Tests (macos-14, py312)
  • GitHub Check: Unit-Tests (macos-14, py311)
  • GitHub Check: Unit-Tests (macos-14, py310)
  • GitHub Check: Unit-Tests (macos-latest, py311)
  • GitHub Check: Unit-Tests (macos-latest, py310)
  • GitHub Check: Unit-Tests (ubuntu-latest, py312)
  • GitHub Check: Unit-Tests (ubuntu-latest, py311)
  • GitHub Check: Unit-Tests (ubuntu-latest, py310)
🔇 Additional comments (5)
config/coverage.toml (2)

6-6: LGTM! Good practice to exclude test files from coverage reports.

The addition of "tests/**/*.py" to the omit list is correct. This ensures coverage reports focus on the actual codebase rather than test files, which is the standard practice.


9-9: LGTM!

pixi.toml (1)

68-68: Verify the impact of cleanup during parallel test execution.

The rm -rf command on shared directories (.pytest_cache, ./data, ./tests/temp) could potentially interfere with parallel test execution if tests are running simultaneously in different processes.

Let's check if any tests are explicitly using these paths:

✅ Verification successful

The cleanup task is safe for parallel test execution

The codebase follows best practices for test directory management:

  • Uses pytest's process-safe temporary directory fixtures (tmp_path)
  • Leverages Python's TemporaryDirectory for automatic cleanup
  • Custom fixtures properly scope test data directories

The directories in the cleanup task (.pytest_cache, ./data, ./tests/temp) are not directly referenced in tests and can be safely removed between test runs.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for explicit usage of these paths in test files
echo "Checking for explicit path usage in tests..."
rg -g '!pixi.toml' "pytest_cache|/data/|/tests/temp" ./tests

Length of output: 161


Script:

#!/bin/bash
# Search for pytest fixtures and temporary path handling
echo "Checking for pytest fixtures and tmp paths..."
rg -g "tests/**" "(fixture|tmp_path|tmpdir|tempfile|mkdir|os\.path)" -A 2

# Search for test setup/teardown patterns
echo -e "\nChecking for test setup/teardown..."
ast-grep --pattern 'def setup$$$' 
ast-grep --pattern 'def teardown$$$'

Length of output: 9642

tests/conftest.py (1)

16-16: LGTM! Good improvement for test parallelization.

Changing the fixture scope from 'session' to 'package' is a positive change that better supports parallel test execution by creating separate fixture instances for each test package.

tests/test_modalities.py (1)

38-38: LGTM! Good use of xdist grouping for test coordination.

The @pytest.mark.xdist_group("serial") marker correctly ensures that these tests run serially when needed, which is important for resource management.


output_path = pathlib.Path(curr_path, 'tests', 'temp').as_posix()
output_path = pathlib.Path(curr_path, "tests", "temp").as_posix()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Use pytest's built-in temporary directory fixtures for better test isolation.

Instead of using a fixed 'temp' directory, consider using pytest's tmp_path or tmp_path_factory fixtures. This would provide better isolation for parallel test execution and automatic cleanup.

Example refactor:

-    output_path = pathlib.Path(curr_path, "tests", "temp").as_posix()
+    @pytest.fixture(scope="package")
+    def output_path(tmp_path_factory):
+        return tmp_path_factory.mktemp("output").as_posix()

Committable suggestion skipped: line range outside the PR's diff.

zipfile.extractall(quebec_path)
os.remove(quebec_zip_path)
# os.remove(quebec_zip_path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Remove the commented cleanup code for better code maintainability

The commented cleanup (os.remove(quebec_zip_path)) is not critical since:

  • The dataset fixture has package scope, meaning data persistence is intentional
  • The zip file is downloaded only once per test session
  • The disk space impact is minimal

Simply remove the commented line to maintain code cleanliness.

🔗 Analysis chain

Consider removing or justifying the commented cleanup code.

The commented-out cleanup code (os.remove(quebec_zip_path)) might lead to unnecessary disk space usage, especially in CI environments. If there's a specific reason for keeping the zip file, please document it; otherwise, consider removing the commented line.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if this cleanup is handled elsewhere in the codebase
rg "quebec.*\.zip.*remove" --type py

Length of output: 38


Script:

#!/bin/bash
# Get broader context around the Quebec zip handling
echo "=== Quebec zip file references ==="
rg "quebec.*\.zip" --type py -B 2 -A 2

echo -e "\n=== General zip cleanup patterns in conftest ==="
rg "\.zip.*remove|remove.*\.zip" tests/conftest.py

echo -e "\n=== Full context of the fixture ==="
cat tests/conftest.py

Length of output: 3062

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 89bab4e and 02df8fe.

📒 Files selected for processing (4)
  • .gitignore (1 hunks)
  • config/pytest.ini (2 hunks)
  • tests/conftest.py (1 hunks)
  • tests/test_modalities.py (1 hunks)
🧰 Additional context used
🪛 GitHub Actions: CI-CD
tests/conftest.py

[error] 7-7: Missing required dependency: Module 'filelock' is not installed but is imported in the file

🔇 Additional comments (7)
config/pytest.ini (2)

17-17: Coverage configuration looks good for parallel execution.

The removal of hardcoded coverage target and addition of --cov-append ensures proper coverage reporting when tests run in parallel.

Also applies to: 30-32


33-40: Well-configured parallel execution settings.

The parallel execution configuration is well thought out:

  • --numprocesses=auto with --maxprocesses=8 provides good scalability
  • --dist=loadgroup ensures proper test isolation
tests/conftest.py (2)

17-52: Well-implemented dataset preparation with proper locking.

The implementation of prepare_dataset with FileLock is excellent for parallel test execution:

  • Prevents concurrent downloads when using xdist
  • Properly cleans up lock file
  • Uses appropriate logging

55-70: Consider using pytest's tmp_path fixture.

The previous review comment about using pytest's built-in temporary directory fixtures still applies. This would improve test isolation in parallel execution.

tests/test_modalities.py (2)

15-39: Well-structured modalities fixture with proper assertions.

The modalities_path fixture is well implemented:

  • Appropriate scope and location in test file
  • Good path existence checks
  • Clear path organization

44-44: Improved test parameterization.

The change from 'modalities' to 'imaging_modality' makes the test intention clearer.

Also applies to: 46-46

.gitignore (1)

232-232: Appropriate addition of lock files to .gitignore.

Adding *.lock pattern is necessary to prevent committing temporary lock files used in parallel test execution.

tests/conftest.py Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
tests/conftest.py (2)

17-52: LGTM! Well-implemented parallel execution handling.

The fixture implementation is robust:

  • Uses FileLock to prevent concurrent downloads
  • Includes proper cleanup
  • Has appropriate logging

Consider moving the dataset URL to a configuration file.

The hard-coded URL at line 37 should be moved to a configuration file for better maintainability.

+# config/test_config.py
+TEST_DATASET_URL = "https://github.com/bhklab/tcia_samples/blob/main/Head-Neck-PET-CT.zip?raw=true"

-            quebec_data_url = "https://github.com/bhklab/tcia_samples/blob/main/Head-Neck-PET-CT.zip?raw=true"
+            quebec_data_url = TEST_DATASET_URL

35-44: Enhance error handling for dataset download.

The dataset download process could benefit from improved error handling:

  1. Add timeout for download
  2. Handle network errors explicitly
  3. Verify file integrity after download
+from urllib.error import URLError
+import hashlib

+EXPECTED_CHECKSUM = "..." # Add the expected SHA256 checksum

-            request.urlretrieve(quebec_data_url, quebec_zip_path)
+            try:
+                # Add 5-minute timeout
+                request.urlretrieve(quebec_data_url, quebec_zip_path, timeout=300)
+                
+                # Verify checksum
+                with open(quebec_zip_path, "rb") as f:
+                    if hashlib.sha256(f.read()).hexdigest() != EXPECTED_CHECKSUM:
+                        raise ValueError("Downloaded file has incorrect checksum")
+                        
+                with ZipFile(quebec_zip_path, "r") as zipfile:
+                    zipfile.extractall(quebec_path)
+            except (URLError, TimeoutError) as e:
+                logger.error(f"Failed to download dataset: {e}")
+                raise
+            except ValueError as e:
+                logger.error(f"File verification failed: {e}")
+                raise
+            finally:
+                if os.path.exists(quebec_zip_path):
+                    os.remove(quebec_zip_path)
-            with ZipFile(quebec_zip_path, "r") as zipfile:
-                zipfile.extractall(quebec_path)
-            os.remove(quebec_zip_path)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 02df8fe and 0d5199f.

⛔ Files ignored due to path filters (1)
  • pixi.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • pixi.toml (2 hunks)
  • tests/conftest.py (1 hunks)
  • tests/test_components.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • pixi.toml
  • tests/test_components.py
⏰ Context from checks skipped due to timeout of 90000ms (7)
  • GitHub Check: Unit-Tests (macos-14, py312)
  • GitHub Check: Unit-Tests (macos-14, py311)
  • GitHub Check: Unit-Tests (macos-14, py310)
  • GitHub Check: Unit-Tests (macos-latest, py311)
  • GitHub Check: Unit-Tests (ubuntu-latest, py312)
  • GitHub Check: Unit-Tests (ubuntu-latest, py311)
  • GitHub Check: Unit-Tests (ubuntu-latest, py310)
🔇 Additional comments (3)
tests/conftest.py (3)

7-7: Add filelock to project dependencies.

The filelock module is required for parallel test execution but is not installed.


55-70: Use pytest's built-in temporary directory fixtures for better test isolation.

Instead of using a fixed 'temp' directory, consider using pytest's tmp_path or tmp_path_factory fixtures for better isolation during parallel test execution.


12-14: LGTM! Verify the package scope impact.

The changes look good:

  • Package scope is appropriate for parallel execution
  • Path resolution is more direct using cwd()
✅ Verification successful

✅ Package scope change is safe

The curr_path fixture is only used within the root test package and not in any sub-packages, making the scope change from 'session' to 'package' safe and appropriate.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if there are any other fixtures depending on curr_path that might be affected by the scope change
rg "@pytest\.fixture.*\bcurr_path\b" tests/

Length of output: 45


Script:

#!/bin/bash
# Find all usages of curr_path in test files
rg "\bcurr_path\b" tests/ -l

# List test directory structure to understand package organization
fd . tests/ --type d

Length of output: 182

@jjjermiah jjjermiah marked this pull request as draft January 17, 2025 21:21
@jjjermiah
Copy link
Contributor Author

addressed in #202

@jjjermiah jjjermiah closed this Jan 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant