RequestSource can not be used with write_to_online_store #4493

Closed
HaoXuAI opened this issue Sep 6, 2024 · 3 comments

Comments

@HaoXuAI
Collaborator

HaoXuAI commented Sep 6, 2024

Expected Behavior

Use RequestSource with write_to_online_store

Current Behavior

Steps to reproduce

A feature view is defined with a RequestSource as its batch source, for example:

{
  "spec": {
    "name": "embedding_catalog",
    "features": [
      {
        "name": "embeddings",
        "valueType": "FLOAT_LIST"
      },
      {
        "name": "event_timestamp",
        "valueType": "UNIX_TIMESTAMP"
      }
    ],
    "ttl": "0s",
    "batchSource": {
      "type": "REQUEST_SOURCE",
      "dataSourceClassType": "feast.data_source.RequestSource",
      "requestDataOptions": {
        "schema": [
          {
            "name": "embeddings",
            "valueType": "FLOAT_LIST"
          },
          {
            "name": "event_timestamp",
            "valueType": "UNIX_TIMESTAMP"
          }
        ]
      },
      "name": "embedding_catalog"
    },
    "online": true
  },
  "meta": {
    "createdTimestamp": "2024-09-06T07:58:13.929715Z",
    "lastUpdatedTimestamp": "2024-09-06T07:58:13.929715Z"
  }
}

The _convert_arrow_to_proto function then fails when extracting the timestamp field:

    event_timestamps = [
        _coerce_datetime(val)
        for val in pd.to_datetime(
            table.column(feature_view.batch_source.timestamp_field).to_numpy(
                zero_copy_only=False
            )
        )
    ]

This fails because feature_view.batch_source is a RequestSource, which doesn't have a timestamp_field.
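
To illustrate the failure mode in isolation, here is a minimal sketch using stand-in classes rather than Feast's real ones. `StubFileSource`, `StubRequestSource`, and `resolve_timestamp_field` are hypothetical names invented for this example. The only point is that the timestamp_field lookup can be guarded so the error names the actual problem instead of surfacing deep inside the Arrow conversion.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class StubFileSource:
    """Stand-in for a batch source that names its timestamp column."""
    name: str
    timestamp_field: str


@dataclass
class StubRequestSource:
    """Stand-in for a request source: it has a schema but no timestamp column."""
    name: str
    schema: Dict[str, str]


def resolve_timestamp_field(source) -> str:
    """Return the source's timestamp column name, or raise a descriptive error.

    Handles both a missing attribute and an empty value, since either one
    would make the later column lookup fail.
    """
    field = getattr(source, "timestamp_field", None)
    if not field:
        raise ValueError(
            f"Source '{source.name}' ({type(source).__name__}) does not define "
            "a timestamp_field, so event timestamps cannot be extracted."
        )
    return field


file_source = StubFileSource(name="catalog_file", timestamp_field="event_timestamp")
request_source = StubRequestSource(
    name="embedding_catalog",
    schema={"embeddings": "FLOAT_LIST", "event_timestamp": "UNIX_TIMESTAMP"},
)

print(resolve_timestamp_field(file_source))
try:
    resolve_timestamp_field(request_source)
except ValueError as err:
    print(err)
```

Note that the request source's schema does contain an event_timestamp column, but nothing tells the conversion code which column to treat as the event time.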

Specifications

  • Version:
  • Platform:
  • Subsystem:

Possible Solution

Add an attribute to RequestSource that lets the user specify the timestamp_field.
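
A rough sketch of what this could look like, again with hypothetical stand-ins (`RequestSourceSketch`, `extract_event_timestamps`) and not Feast's actual API. It assumes the timestamp column holds Unix seconds, in line with the UNIX_TIMESTAMP type in the schema above, and it represents the ingested table as a plain dict of columns to stay self-contained.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class RequestSourceSketch:
    """Hypothetical request source with an optional, user-supplied timestamp_field."""
    name: str
    schema: Dict[str, str]
    timestamp_field: Optional[str] = None


def extract_event_timestamps(
    source: RequestSourceSketch, columns: Dict[str, list]
) -> List[datetime]:
    """Convert the configured timestamp column into timezone-aware datetimes."""
    if not source.timestamp_field:
        raise ValueError(f"Source '{source.name}' has no timestamp_field configured.")
    if source.timestamp_field not in columns:
        raise KeyError(f"Column '{source.timestamp_field}' is missing from the data.")
    # Assumption: values are Unix seconds; interpret them as UTC.
    return [
        datetime.fromtimestamp(value, tz=timezone.utc)
        for value in columns[source.timestamp_field]
    ]


source = RequestSourceSketch(
    name="embedding_catalog",
    schema={"embeddings": "FLOAT_LIST", "event_timestamp": "UNIX_TIMESTAMP"},
    timestamp_field="event_timestamp",
)
rows = {"embeddings": [[0.1, 0.2, 0.3]], "event_timestamp": [0]}
print(extract_event_timestamps(source, rows))
```

Keeping timestamp_field optional would leave existing request sources unaffected; only the write path would need the field to be set.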

@franciscojavierarceo
Member

franciscojavierarceo commented Sep 6, 2024

I'm doing some related work in my PR to create persistent on demand feature views #4418

@tokoko
Collaborator

tokoko commented Sep 6, 2024

I'm a little lost here. What's the use of timestamp_field in RequestSource? RequestSources are only used in online ODFV execution, where there is no point-in-time join.

@HaoXuAI
Collaborator Author

HaoXuAI commented Sep 6, 2024

> I'm a little lost here. What's the use of timestamp_field in RequestSource? RequestSources are only used in online ODFV execution, where there is no point-in-time join.

I'm not using it to get historical features. I just use it as a data source so that I can ingest a DataFrame into the feature view directly.
