better set_body #42

b5 · 2019-05-16T19:02:54Z

set_body wasn't doing the right thing when determining the destination dataset structure for writing a body. We now inherit structure from existing dataset data, and only fall back to json if no structure exists.

The general problem here is a separation of concerns between data in the starlark runtime and dataset serialization formats. Those should always be separate, with set_body giving qri the data it wants serialized, or skipping serialization with ds.set_body(data, parse_as='format').
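To make the fallback rule concrete, here is a minimal sketch in Python. It is only a model of the behavior described above, not qri's Go implementation: the function name `destination_format` and the dict-shaped structure with a `"format"` key are assumptions made for illustration.

```python
from typing import Optional

# Fallback used when a dataset has no existing structure (per the description above).
DEFAULT_FORMAT = "json"


def destination_format(existing_structure: Optional[dict]) -> str:
    """Pick the format a body should be serialized to.

    Hypothetical helper: inherit the format from the dataset's existing
    structure when there is one, otherwise fall back to json.
    """
    if existing_structure and existing_structure.get("format"):
        return existing_structure["format"]
    return DEFAULT_FORMAT
```

The point of the sketch is that the value handed to set_body never influences the destination format; only the dataset's existing structure (or the default) does.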

set_body wasn't doing the right thing when determining the destination dataset structure for writing a body. We now inherit structure from existing dataset data, and only fall back to json if no structure exists. The general problem here is a separation of concerns between data in the starlark runtime and dataset serialization formats. Those should always be separate. closes qri-io/qri#756

The mental model behind 'raw' and 'data_format' is off. 'data_format' implies that we're changing the ultimate data format, which is not the case. Now set_body has only one optional argument: 'parse_as', which defaults to the empty string. By default qri assumes the data value provided to set_body is an iterable starlark data structure (tuple, set, list, dict). When parse_as is set, set_body assumes the provided body value is a string of serialized structured data in the given format. Valid parse_as values are "json", "csv", "cbor", "xlsx". We'll need more tests to confirm each format works properly, but all this does under the hood is skip parsing entirely and set the body bytes directly, which is analogous to "qri save --body file.[parse_as] me/dataset".

BREAKING CHANGE: ds.set_body raw and data_format params have been replaced by parse_as
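A rough Python model of the two set_body modes may help when reviewing. This is a sketch under assumptions, not the code in ds/dataset.go: the return shape, the error types, and the UTF-8 encoding step are all illustrative, and the "body.<format>" filename follows the naming suggested in review.

```python
# Formats listed in the commit message as valid parse_as values.
VALID_PARSE_AS = ("json", "csv", "cbor", "xlsx")


def set_body(data, parse_as=""):
    """Hypothetical model of ds.set_body(data, parse_as='...')."""
    if parse_as == "":
        # Default mode: data is a structured value that qri serializes itself.
        if not isinstance(data, (tuple, set, list, dict)):
            raise TypeError("expected body data to be a tuple, set, list, or dict")
        return {"kind": "structured", "value": data}

    if parse_as not in VALID_PARSE_AS:
        raise ValueError("unsupported parse_as value: %r" % parse_as)
    if not isinstance(data, str):
        raise TypeError("expected body data to be a string when parse_as is set")

    # parse_as mode: no parsing happens, the string becomes the body file as-is.
    # Encoding to UTF-8 is a simplification made for this sketch.
    return {
        "kind": "raw",
        "filename": "body.%s" % parse_as,
        "bytes": data.encode("utf-8"),
    }
```

For example, `set_body("a,b\n1,2\n", parse_as="csv")` yields a raw body named body.csv, while `set_body([1, 2])` leaves serialization to qri.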

dustmop

Looks great! Have just a few tiny suggestions

dustmop · 2019-05-16T19:33:02Z

ds/testdata/test.star

+
+# csv_ds is a global variable provided by dataset_test.go
+# "cycling" csv data through starlark shouldn't have significant effects on the 


how about "round-tripping" instead of "cycling"

dustmop · 2019-05-16T19:34:07Z

ds/dataset.go

 		}

-		return starlark.None, fmt.Errorf("expected raw data for body to be a string")
+		d.write.SetBodyFile(qfs.NewMemfileBytes(fmt.Sprintf("data.%s", df), []byte(string(str))))


Use "body.%s" instead of "data.%s".

totally. makes me want this more: qri-io/dataset#188

dustmop · 2019-05-16T19:34:40Z

ds/dataset_test.go

+			},
+		},
+	}
+	ds.SetBodyFile(qfs.NewMemfileBytes("data.csv", []byte(text)))


Use "body.csv" instead of "data.csv"

b5 added 3 commits May 16, 2019 11:22

docs(ds): add initial outline docs for ds

008ac8c

b5 added the fix label May 16, 2019

b5 self-assigned this May 16, 2019

ghost added the in progress label May 16, 2019

b5 requested review from dustmop and ramfox May 16, 2019 19:03

b5 mentioned this pull request May 16, 2019

v0.8.0 tracking issue qri-io/qri#764

Closed

2 tasks

dustmop approved these changes May 16, 2019


chore(ds): update 'data' filenames to 'body'

349cafd

b5 merged commit b30516e into master May 16, 2019

b5 deleted the fix_set_body branch May 16, 2019 20:37

ghost removed the in progress label May 16, 2019
