Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add utils for supporting rerun panel plugin #4876

Merged
merged 4 commits into from
Oct 4, 2024
Merged

Conversation

sashankaryal
Copy link
Contributor

@sashankaryal sashankaryal commented Oct 1, 2024

This PR introduces data model to support rerun panel plugin.

Summary by CodeRabbit

  • New Features

    • Introduced new functions for enhanced schema handling: getFieldsWithEmbeddedDocType and doesSchemaContainEmbeddedDocType.
    • Added support for a new media type, RrdFile, allowing for better metadata management and retrieval of file paths.
    • New utilities for working with the Rerun data visualization tool, including the RrdFile class.
  • Bug Fixes

    • Updated schema properties for improved clarity and accuracy.
  • Tests

    • Expanded test coverage for new schema functionalities and embedded document types.

@sashankaryal sashankaryal requested a review from a team October 1, 2024 21:38
Copy link
Contributor

coderabbitai bot commented Oct 1, 2024

Walkthrough

The changes in this pull request involve modifications to several files, primarily focusing on enhancing the functionality of schema handling and metadata management. Key updates include the addition of new functions for retrieving and checking embedded document types in schema.ts, updates to schema definitions in schema.test.ts, and the introduction of a new media type (RrdFile) in metadata.py. The import statements and function signatures have been adjusted accordingly to accommodate these changes, while the overall logic and structure of existing functionalities remain intact.

Changes

File Change Summary
app/packages/state/src/utils.ts - Updated import statement to include ModalSample.
- Modified getStandardizedUrls function to accept an additional parameter type: ModalSample["urls"].
app/packages/utilities/src/schema.test.ts - Updated name and path properties for embedded and embeddedWithDbFields schema objects.
- Added new test cases for getFieldsWithEmbeddedDocType and doesSchemaContainEmbeddedDocType.
app/packages/utilities/src/schema.ts - Added getFieldsWithEmbeddedDocType function to retrieve fields matching an embeddedDocType.
- Added doesSchemaContainEmbeddedDocType function to check for the existence of an embeddedDocType.
fiftyone/server/metadata.py - Added import for RrdFile.
- Updated _ADDITIONAL_MEDIA_FIELDS to include RrdFile associated with "filepath".
- Modified _create_media_urls and _get_additional_media_fields to handle the new media type.
fiftyone/utils/rerun.py - Introduced RrdFile class with filepath and version fields for handling rrd file metadata.

Suggested reviewers

  • ritch

🐇 In the meadow, changes bloom,
New functions added, dispelling gloom.
RrdFile hops into the fray,
With paths and schemas, come what may!
A dance of code, so bright and spry,
Let’s celebrate, oh my, oh my! 🎉


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@sashankaryal sashankaryal changed the title wip: rerun panel Add utils for supporting rerun panel plugin Oct 4, 2024
@sashankaryal sashankaryal self-assigned this Oct 4, 2024
@sashankaryal sashankaryal marked this pull request as ready for review October 4, 2024 20:25
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (7)
fiftyone/utils/rerun.py (2)

14-23: LGTM: Class definition and docstring are well-structured.

The RrdFile class is properly defined with appropriate inheritance. The docstring is informative and explains the purpose and arguments of the class well.

Consider adding a brief explanation of what an rrd file is in the class docstring, as it might not be immediately clear to all developers. For example:

"""Class for storing a rerun data (rrd) file and its associated metadata.

An rrd file is a binary format used by Rerun to store and visualize data.

Args:
    ...
"""

27-28: LGTM: Class attributes are well-defined, but consider improving version management.

The filepath and version attributes are appropriately defined as StringFields. The default version for backwards compatibility is a good approach.

Consider using a constant or configuration value for the default version instead of hardcoding it. This would make it easier to update the version across the codebase in the future. For example:

from fiftyone.constants import RERUN_DEFAULT_VERSION

class RrdFile(DynamicEmbeddedDocument, fol._HasMedia):
    # ...
    version = fof.StringField(default=RERUN_DEFAULT_VERSION)

This approach would centralize version management and make it easier to update when new Rerun versions are released.

app/packages/utilities/src/schema.ts (2)

55-74: LGTM! Consider a minor performance optimization.

The implementation of getFieldsWithEmbeddedDocType is correct and follows TypeScript best practices. It effectively traverses the schema recursively to find fields with the specified embeddedDocType.

For a slight performance improvement when dealing with large schemas, consider using Array.prototype.push.apply() to flatten the result array. Here's a suggested modification:

export function getFieldsWithEmbeddedDocType(
  schema: Schema,
  embeddedDocType: string
): Field[] {
  const result: Field[] = [];

  function recurse(schema: Schema): Field[] {
    const localResult: Field[] = [];
    for (const field of Object.values(schema ?? {})) {
      if (field.embeddedDocType === embeddedDocType) {
        localResult.push(field);
      }
      if (field.fields) {
        Array.prototype.push.apply(localResult, recurse(field.fields));
      }
    }
    return localResult;
  }

  Array.prototype.push.apply(result, recurse(schema));
  return result;
}

This change reduces the number of array allocations and copies, which can be beneficial for performance in scenarios with deeply nested schemas.


76-93: LGTM! Consider a minor simplification.

The implementation of doesSchemaContainEmbeddedDocType is correct, efficient, and follows TypeScript best practices. It effectively traverses the schema recursively to check for the presence of the specified embeddedDocType.

For improved readability, you can simplify the recurse function by removing the unnecessary return false statement. Here's a suggested modification:

export function doesSchemaContainEmbeddedDocType(
  schema: Schema,
  embeddedDocType: string
): boolean {
  function recurse(schema: Schema): boolean {
    return Object.values(schema ?? {}).some((field) => 
      field.embeddedDocType === embeddedDocType || (field.fields && recurse(field.fields))
    );
  }

  return recurse(schema);
}

This change maintains the same functionality while making the code more concise and easier to read.

app/packages/state/src/utils.ts (1)

108-108: LGTM: Function signature updated correctly

The getStandardizedUrls function signature has been updated to include ModalSample["urls"] as a possible type for the urls parameter. This change increases the function's flexibility and is consistent with the import statement modification.

Consider using a type alias for better readability:

type UrlsType = readonly { readonly field: string; readonly url: string }[] | { [field: string]: string } | ModalSample["urls"];

export const getStandardizedUrls = (urls: UrlsType) => {
  // ... function body ...
};

This change would make the function signature more concise and easier to maintain.

fiftyone/server/metadata.py (2)

37-37: LGTM: Addition of RrdFile to _ADDITIONAL_MEDIA_FIELDS is appropriate.

The inclusion of RrdFile: "filepath" in the _ADDITIONAL_MEDIA_FIELDS dictionary is consistent with the existing structure and supports the new rerun panel plugin functionality.

Consider adding a brief comment above the dictionary to explain its purpose and the significance of the "filepath" value for each entry. This would enhance code readability and maintainability.


RrdFile support is not fully integrated

The RrdFile type is defined and imported in the codebase but is not utilized in any media type handling functions or conditionals. This suggests that additional integration steps are necessary to ensure full support for RrdFile.

  • fiftyone/utils/rerun.py: RrdFile class is defined but not integrated into processing pipelines.
  • fiftyone/server/metadata.py: RrdFile is imported and referenced in _ADDITIONAL_MEDIA_FIELDS, but no further usage is found in media handling functions.
🔗 Analysis chain

Line range hint 27-37: Verify full integration of RrdFile support.

The changes to support RrdFile look good, but it's important to ensure that this new type is fully integrated throughout the codebase.

Please run the following script to check for any other necessary updates:

This will help identify any other areas of the code that might need updating to fully support the new RrdFile type.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for potential places where RrdFile support might be needed.

# Search for functions or methods that handle different media types
echo "Searching for media type handling:"
rg --type python -n '(def|media_type|MediaType).*(\bimage\b|\bvideo\b|\bpoint_cloud\b|\bthree_d\b)'

# Search for uses of _ADDITIONAL_MEDIA_FIELDS
echo "Searching for uses of _ADDITIONAL_MEDIA_FIELDS:"
rg --type python -n '_ADDITIONAL_MEDIA_FIELDS'

# Search for other occurrences of RrdFile
echo "Searching for other occurrences of RrdFile:"
rg --type python -n 'RrdFile'

Length of output: 552


Script:

#!/bin/bash
# Description: Check for potential places where RrdFile support might be needed.

# Search for functions or methods that handle different media types
echo "Searching for media type handling:"
rg -g "*.py" -n '(def|media_type|MediaType).*(\bimage\b|\bvideo\b|\bpoint_cloud\b|\bthree_d\b)'

# Search for uses of _ADDITIONAL_MEDIA_FIELDS
echo "Searching for uses of _ADDITIONAL_MEDIA_FIELDS:"
rg -g "*.py" -n '_ADDITIONAL_MEDIA_FIELDS'

# Search for other occurrences of RrdFile
echo "Searching for other occurrences of RrdFile:"
rg -g "*.py" -n 'RrdFile'

Length of output: 9452


Script:

#!/bin/bash
# Description: Ensure RrdFile is fully integrated across media handling functions.

# Search for functions or methods that accept RrdFile as a parameter
echo "Searching for functions accepting RrdFile as a parameter:"
rg -g '*.py' -n 'def .*\(.*RrdFile.*\)'

# Search for conditionals handling RrdFile in media processing
echo "Searching for conditionals handling RrdFile:"
rg -g '*.py' -n 'if .*RrdFile.*:'

# Search for documentation or comments mentioning RrdFile integration
echo "Searching for documentation or comments about RrdFile:"
rg -g '*.py' -n '#.*RrdFile'

Length of output: 451


Script:

#!/bin/bash
# Description: Further verify RrdFile integration across the codebase.

# Search for instantiations of RrdFile
echo "Searching for instantiations of RrdFile:"
rg -g '*.py' -n 'RrdFile\('

# Search for assignments or usages within data structures
echo "Searching for assignments or usages of RrdFile within data structures:"
rg -g '*.py' -n '[:=] .*RrdFile'

# Search for RrdFile in class definitions or decorators
echo "Searching for RrdFile in class definitions or decorators:"
rg -g '*.py' -n '(class|@).*RrdFile.*'

# Search for any other import patterns related to RrdFile
echo "Searching for import patterns related to RrdFile:"
rg -g '*.py' -n 'import .*RrdFile|from .* import .*RrdFile'

Length of output: 879

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 24a133c and 3b64f8a.

📒 Files selected for processing (5)
  • app/packages/state/src/utils.ts (2 hunks)
  • app/packages/utilities/src/schema.test.ts (7 hunks)
  • app/packages/utilities/src/schema.ts (1 hunks)
  • fiftyone/server/metadata.py (2 hunks)
  • fiftyone/utils/rerun.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
app/packages/state/src/utils.ts (1)

Pattern **/*.{ts,tsx}: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.

app/packages/utilities/src/schema.test.ts (1)

Pattern **/*.{ts,tsx}: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.

app/packages/utilities/src/schema.ts (1)

Pattern **/*.{ts,tsx}: Review the Typescript and React code for conformity with best practices in React, Recoil, Graphql, and Typescript. Highlight any deviations.

🔇 Additional comments (13)
fiftyone/utils/rerun.py (2)

1-11: LGTM: File header and imports are well-structured.

The file header includes a clear docstring explaining the purpose of the module, along with appropriate copyright information. The import statements are relevant to the functionality being implemented.


25-25: LGTM: _MEDIA_FIELD attribute is correctly defined.

The _MEDIA_FIELD class attribute is properly set to "filepath", which aligns with the fol._HasMedia interface implementation.

app/packages/state/src/utils.ts (1)

21-21: LGTM: Import statement updated correctly

The import statement has been updated to include ModalSample from the "./recoil" module. This change is consistent with the modification in the getStandardizedUrls function.

fiftyone/server/metadata.py (2)

27-27: LGTM: Import statement for RrdFile is appropriate.

The import of RrdFile from fiftyone.utils.rerun aligns with the PR objective of supporting the rerun panel plugin. The specific import is well-placed and consistent with the existing code style.


Line range hint 1-524: Overall assessment: Changes look good, with minor suggestions for improvement.

The changes to support the rerun panel plugin through the addition of RrdFile are minimal, focused, and well-integrated with the existing code structure. The modifications align with the PR objectives and maintain good code quality.

Summary of review comments:

  1. The import statement for RrdFile is appropriate and well-placed.
  2. The addition to _ADDITIONAL_MEDIA_FIELDS is consistent and supports the new functionality.
  3. A suggestion was made to add a brief comment explaining the purpose of _ADDITIONAL_MEDIA_FIELDS.
  4. A verification step was requested to ensure full integration of RrdFile support throughout the codebase.

These changes appear to be a solid foundation for supporting the rerun panel plugin. Once the verification step is completed and any necessary adjustments are made, this PR should be ready for merging.

app/packages/utilities/src/schema.test.ts (8)

22-24: Updates to 'embedded' schema object are correct

The name and path properties for the embedded schema object are correctly updated to "embedded", ensuring consistency with the object key and improving clarity.


33-33: Updated path reflects correct hierarchy for nested field

The path property for the field within embedded is correctly updated to "embedded.field", accurately representing its nested position in the schema hierarchy.


45-47: Updates to 'embeddedWithDbFields' schema object are correct

The name and path properties for the embeddedWithDbFields schema object are correctly set to "embeddedWithDbFields", maintaining consistency with the object key and enhancing readability.


57-57: Updated path reflects correct hierarchy for 'sample_id' field

The path property for sample_id within embeddedWithDbFields is correctly updated to "embeddedWithDbFields.sample_id", accurately reflecting its nested structure in the schema.


Line range hint 65-80: Tests for 'getCls' function are comprehensive and correct

The added test cases for the getCls function effectively cover:

  • Retrieving class names for top-level fields.
  • Retrieving class names for embedded fields.
  • Handling of missing field paths by returning undefined.

This ensures that the getCls function behaves as expected across different scenarios.


Line range hint 82-110: Tests for 'getFieldInfo' function are thorough and accurate

The test suite for the getFieldInfo function includes:

  • Retrieving field information for top-level fields.
  • Retrieving field information for embedded fields.
  • Verifying the pathWithDbField property.
  • Handling of missing field paths by returning undefined.

These tests enhance confidence in the correctness of getFieldInfo across various use cases.


113-145: Tests for 'getFieldsWithEmbeddedDocType' function are well-designed

The new test cases for getFieldsWithEmbeddedDocType appropriately verify:

  • Retrieval of fields with a specific embeddedDocType at the top level.
  • Retrieval of fields with a specific embeddedDocType within nested fields.
  • Correct handling when the specified embeddedDocType does not exist.
  • Correct handling when the schema is empty.

This comprehensive testing ensures reliable functionality of the getFieldsWithEmbeddedDocType method.


146-176: Tests for 'doesSchemaContainEmbeddedDocType' function are comprehensive

The added tests for doesSchemaContainEmbeddedDocType function effectively check:

  • Detection of embeddedDocType at the top level.
  • Detection of embeddedDocType within nested fields.
  • Correct return value when the embeddedDocType does not exist.
  • Correct handling of an empty schema.

These tests confirm that the function accurately determines the presence of an embeddedDocType in various scenarios.

app/packages/utilities/src/schema.ts Show resolved Hide resolved
Copy link
Contributor

@brimoor brimoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python parts LGTM

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (2)
fiftyone/utils/rerun.py (2)

14-20: Enhance docstring format for better readability.

While the docstring provides clear information about the class and its arguments, consider reformatting it to follow the Google Python Style Guide for improved readability and consistency with other parts of the project.

Here's a suggested improvement:

class RrdFile(DynamicEmbeddedDocument, fol._HasMedia):
    """Class for storing a rerun data (rrd) file and its associated metadata.

    Args:
        filepath (str, optional): The path to the rrd file. Defaults to None.
        version (str, optional): The version of the rrd file. Defaults to None.
    """

14-25: Consider implementing required methods and providing usage context.

The RrdFile class provides a good foundation for storing rrd file information. However, there are a couple of points to consider:

  1. Since the class implements the fol._HasMedia interface, ensure that all required methods from this interface are implemented. For example, you might need to override methods like get_media_path() or set_media().

  2. To align with the PR objectives of supporting the rerun panel plugin, it would be helpful to provide more context on how this class will be used. Consider adding comments or documentation explaining its role in the larger system.

Example of implementing a required method:

def get_media_path(self):
    return self.filepath

Adding a class-level comment for context:

class RrdFile(DynamicEmbeddedDocument, fol._HasMedia):
    """Class for storing a rerun data (rrd) file and its associated metadata.

    This class is used as part of the rerun panel plugin to manage and track
    rrd files within the FiftyOne ecosystem.

    ...
    """
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 3b64f8a and aab8da7.

📒 Files selected for processing (1)
  • fiftyone/utils/rerun.py (1 hunks)
🔇 Additional comments (1)
fiftyone/utils/rerun.py (1)

1-11: LGTM: File header and imports are well-structured.

The file header includes a clear docstring explaining the purpose of the module, appropriate copyright information, and relevant imports for the functionality being implemented.

fiftyone/utils/rerun.py Show resolved Hide resolved
Copy link
Contributor

@benjaminpkane benjaminpkane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me #recursion 🚀

@benjaminpkane benjaminpkane merged commit 97ccc02 into develop Oct 4, 2024
13 checks passed
@benjaminpkane benjaminpkane deleted the mvp/rerun-panel branch October 4, 2024 21:44
benjaminpkane pushed a commit that referenced this pull request Oct 4, 2024
* add dynamic embedded document field type for rrd files

* add schema utils to work with embedded doc type fields

* add versioned renderer with url+metadata support

* remove explicit version
sashankaryal added a commit that referenced this pull request Oct 4, 2024
* add dynamic embedded document field type for rrd files

* add schema utils to work with embedded doc type fields

* add versioned renderer with url+metadata support

* remove explicit version
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants