feat: add string_json value type for #4897

Closed

Conversation

zerafachris

What this PR does / why we need it:

On-demand feature views are powerful, and we can extend them further by allowing a string that contains JSON as a value type.

Consider a feature with Field(name='value_array', dtype=String_JSON).
An example value for this feature would be {'0': 127.5433, '1': 183.2108, '2': 157.525, '3': 170.2931, '4': 186.0841, '5': 141.6574, '6': 146.1895, '7': 148.6707, '8': 146.0211, '9': 196.4125, '10': 132.7219, '11': 154.1899, '12': 132.3854}

The on-demand feature view can now be defined as:

import ast

import pandas as pd
from feast import Field
from feast.on_demand_feature_view import on_demand_feature_view
from feast.types import Float64

# driver_hourly_stats_view and input_request are assumed to be defined elsewhere in the repo
@on_demand_feature_view(
    sources=[driver_hourly_stats_view, input_request],
    schema=[
        Field(name="value_array_modified", dtype=Float64)
    ],
)
def array_modified(inputs: pd.DataFrame) -> pd.DataFrame:
    df = pd.DataFrame()
    df['value_array_modified'] = 0.0
    for index, row in inputs.iterrows():
        # Parse the string once; ast.literal_eval accepts only literals, unlike eval
        values = ast.literal_eval(row['value_array'])
        # Missing or null entries count as 0.0
        x1, x2, x3 = (values.get(key) or 0.0 for key in ('1', '2', '3'))
        df.at[index, 'value_array_modified'] = x1 + x2 + x3

    return df

When this feature is requested for the example value above, the response is 511.0289 (183.2108 + 157.525 + 170.2931).
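
The arithmetic behind that response can be checked outside Feast. The snippet below is a standalone sketch, not code from this PR: it parses the example value with ast.literal_eval (used here because the example is written with single quotes, which json.loads would reject) and sums the entries keyed '1', '2' and '3'. The helper name sum_selected is hypothetical.

```python
import ast

# Example feature value from the PR description (a dict literal stored as a string).
VALUE_ARRAY = (
    "{'0': 127.5433, '1': 183.2108, '2': 157.525, '3': 170.2931, "
    "'4': 186.0841, '5': 141.6574, '6': 146.1895, '7': 148.6707, "
    "'8': 146.0211, '9': 196.4125, '10': 132.7219, '11': 154.1899, "
    "'12': 132.3854}"
)


def sum_selected(raw: str, keys=("1", "2", "3")) -> float:
    """Hypothetical helper: parse the string and sum the chosen keys.

    Missing or null entries are treated as 0.0, as in the PR example.
    """
    values = ast.literal_eval(raw)
    return sum(values.get(key) or 0.0 for key in keys)


total = sum_selected(VALUE_ARRAY)
print(round(total, 4))  # 511.0289
```

Rounding to four decimals only hides floating-point noise from the addition; the inputs themselves carry at most four decimal places.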

Which issue(s) this PR fixes:

Misc

Signed-off-by: zerafachris PERSONAL <[email protected]>
@zerafachris zerafachris requested a review from a team as a code owner January 5, 2025 10:40
Member

@franciscojavierarceo franciscojavierarceo left a comment

Nice! Would you mind adding a test to highlight the behavior? There's an on demand transformation test file.

Signed-off-by: zerafachris PERSONAL <[email protected]>
Signed-off-by: zerafachris PERSONAL <[email protected]>
@zerafachris
Author

Not sure how to fix the error "Error: The template is not valid. .github/workflows/pr_integration_tests.yml (Line: 91, Col: 16): hashFiles('**/py3.11-ci-requirements.txt') failed. Fail to hash files under directory '/home/runner/work/feast/feast'".

Any advice?

@franciscojavierarceo
Member

Actually that's not from you. I saw this on your PR and created a dummy PR to confirm this is happening to others.

I'm looking into it but won't block this PR.

@redhatHameed
Contributor

@zerafachris Can you fix the linter and unit test issues? Please refer to the comments within the test itself.

@zerafachris
Author

I am cancelling this PR as the value type is not needed. A pre-processing check such as if ':' not in row["feature1"] is enough for the feature projection of random_input to succeed.
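
A minimal sketch of that kind of guard, assuming the projection step passes placeholder strings that are not dictionaries into the transformation. The function name safe_sum, the 0.0 fallback, the summed keys, and the sample rows are illustrative assumptions; only the ':' check and the feature1 column name come from the comment above.

```python
import ast


def safe_sum(row) -> float:
    """Hypothetical per-row transformation with the ':' guard from the comment."""
    raw = row["feature1"]
    # A string without ':' cannot hold key/value pairs, so skip parsing it.
    if ":" not in raw:
        return 0.0
    values = ast.literal_eval(raw)
    return sum(values.get(key) or 0.0 for key in ("1", "2", "3"))


# Plain dicts stand in for DataFrame rows so the sketch needs no dependencies.
rows = [
    {"feature1": "hello"},                 # placeholder-style input
    {"feature1": "{'1': 1.5, '2': 2.0}"},  # real dictionary string
]
results = [safe_sum(row) for row in rows]
print(results)  # [0.0, 3.5]
```

The check is deliberately cheap: it does not validate the string, it only avoids parsing inputs that cannot be a mapping. A placeholder that happens to contain ':' would still reach the parser.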

@zerafachris zerafachris closed this Jan 6, 2025
@zerafachris zerafachris deleted the feature/valuetype_string_json branch January 6, 2025 19:54
3 participants