Consider inline embedded media data in the JSON data file #3
It might not be frequently used, but when sharing files for analysis, an
optional base64 embedding of media files would be very valuable for sharing
and transferring data.
One use case I can think of is a recent waterpoint dataset where we had to
download the CSV data file, download the images separately, then preprocess
to identify and split the media files based on the values of a binary
column in the text file. Eventually, I'd want all those steps to exist in
an API. It would be a lot easier to send a single file with embedded media
files to an API than a text file, a compressed image file, and preprocessing
instructions. Additionally, for sharing with other machine learning
researchers (in combination with a notebook file), a single file would be
quite convenient.
|
No denying there is something appealing about having a single file that is guaranteed to be complete. Supporting this as an option seems useful for, say, one-time batch exports where the result could be zipped at the end (removing most of the downsides of base64 encoding). For actually sending data over APIs, I would think this is an antipattern, because these files will quickly grow into payloads that are impractical to transfer in a single request. |
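As a rough illustration of the size trade-off mentioned above, the sketch below base64-encodes a blob and then compresses the encoded text. It assumes random bytes are a fair stand-in for an already-compressed media file (JPEG, Ogg, and so on), and it uses `zlib`, which implements the DEFLATE algorithm typically used inside zip files. The sizes are illustrative only, not measurements from any real export.

```python
import base64
import random
import zlib

# Assumption: random bytes stand in for an already-compressed media file.
rng = random.Random(0)
raw = bytes(rng.getrandbits(8) for _ in range(30_000))

# base64 maps every 3 input bytes to 4 ASCII characters (about 33% larger).
encoded = base64.b64encode(raw)

# DEFLATE (as used in zip files) recovers most of that overhead, because
# each base64 character carries only 6 bits of information.
compressed = zlib.compress(encoded, 9)

print("raw bytes:        ", len(raw))
print("base64 bytes:     ", len(encoded))
print("base64 + deflate: ", len(compressed))
```

The compressed size should land close to the raw size, which is why zipping a batch export hides most of the base64 cost. It does nothing for an uncompressed payload sent directly over an API.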
Definitely, when something is running in production you want to push models to
the data, but for what I'm doing now (exploring different modeling
approaches and sharing notebooks that hook up to APIs) this would be super
useful for me. I'm using requests with limits on the number of rows, so size is
bounded.
|
I think this is a nice-to-have, but it should definitely be optional and consumer-configurable via a parameter. Also, since we already have two separate "files" (i.e. schema and response data), I'd assume we might want to package both in some way, e.g. a .zip file. @pld what about embedding the images in the zip file? Would that still work for your use case? |
Yea, a zip file is how we're doing it now and it works, just not as
convenient
On Thu, Jul 13, 2017 at 10:26 AM, Gustavo Giráldez wrote:
I think this is a nice to have, but should definitely be optional, and
consumer configurable via a parameter.
Also, since we already have two separate "files" (ie. schema and response
data) I'd assume we might want to package both in some way, eg. a .zip
file. @pld <https://github.com/pld> what about embedding the images in
the zip file? Would that still work for your use case?
|
The first version of the spec stores media data (image, audio, video) simply as a URL to an external resource.
Is it valuable to allow embedding the media for any of these question types within the JSON data file? One possibility is to allow a base64-encoded string in place of the URL, with response metadata indicating that the value is inline, for example:
"inline":1, "format":"audio/ogg", "extension":".ogg"
This reduces external dependencies, at the expense of larger data files and the need to support two code paths.
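A minimal sketch of what the two code paths might look like for a consumer, in Python. The `inline`, `format`, and `extension` keys come from the proposal above. The `value` key, the entry layout, and the function names `embed_media` and `read_media` are hypothetical and used only for illustration; the spec does not define them here.

```python
import base64
import json
from urllib.request import urlopen


def embed_media(data: bytes, fmt: str, extension: str) -> dict:
    """Build an inline media entry (hypothetical layout).

    The "value" key is an assumption; the proposal only says the base64
    string takes the place of the URL.
    """
    return {
        "value": base64.b64encode(data).decode("ascii"),
        "inline": 1,
        "format": fmt,
        "extension": extension,
    }


def read_media(entry: dict) -> bytes:
    """Return the media bytes, whichever way the entry stores them."""
    if entry.get("inline") == 1:
        # Code path 1: the value is base64-encoded media data.
        return base64.b64decode(entry["value"])
    # Code path 2: the value is a URL to an external resource.
    with urlopen(entry["value"]) as response:
        return response.read()


# Usage: the entry survives a JSON round trip and decodes to the same bytes.
entry = embed_media(b"OggS\x00\x02", "audio/ogg", ".ogg")
restored = json.loads(json.dumps(entry))
assert read_media(restored) == b"OggS\x00\x02"
```

Keeping the branch inside a single reader function such as `read_media` is one way to contain the cost of supporting both representations.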