Redesign default .json format #782
Comments
Here's the default JSON at the moment: https://latest.datasette.io/fixtures/compound_three_primary_keys.json?_size=2 {
"database": "fixtures",
"table": "compound_three_primary_keys",
"is_view": false,
"human_description_en": "",
"rows": [
[
"a",
"a",
"a",
"a-a-a"
],
[
"a",
"a",
"b",
"a-a-b"
]
],
"truncated": false,
"filtered_table_rows_count": 1001,
"expanded_columns": [],
"expandable_columns": [],
"columns": [
"pk1",
"pk2",
"pk3",
"content"
],
"primary_keys": [
"pk1",
"pk2",
"pk3"
],
"units": {},
"query": {
"sql": "select pk1, pk2, pk3, content from compound_three_primary_keys order by pk1, pk2, pk3 limit 3",
"params": {}
},
"facet_results": {},
"suggested_facets": [
{
"name": "pk1",
"toggle_url": "http://latest.datasette.io/fixtures/compound_three_primary_keys.json?_size=2&_facet=pk1"
},
{
"name": "pk2",
"toggle_url": "http://latest.datasette.io/fixtures/compound_three_primary_keys.json?_size=2&_facet=pk2"
},
{
"name": "pk3",
"toggle_url": "http://latest.datasette.io/fixtures/compound_three_primary_keys.json?_size=2&_facet=pk3"
}
],
"next": "a,a,b",
"next_url": "http://latest.datasette.io/fixtures/compound_three_primary_keys.json?_size=2&_next=a%2Ca%2Cb",
"query_ms": 17.56119728088379,
"source": "tests/fixtures.py",
"source_url": "https://github.com/simonw/datasette/blob/master/tests/fixtures.py",
"license": "Apache License 2.0",
"license_url": "https://github.com/simonw/datasette/blob/master/LICENSE"
}
There's a lot of stuff in there. This increases the risk that future minor changes might break existing API consumers.
It returns rows as a list of lists of values, and expects you to correlate these with the list of columns. I originally designed it like this because I thought this was a more efficient representation than repeating the column names in a dictionary for every row. With hindsight this was a bad optimization - I always use |
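For anyone reading along, here is a minimal sketch of the correlation step the current default forces on API consumers. The helper name `rows_as_dicts` is made up for illustration, and the sample payload is a trimmed copy of the response above.

```python
def rows_as_dicts(response):
    """Pair each row (a list of values) with the shared list of column names."""
    columns = response["columns"]
    return [dict(zip(columns, row)) for row in response["rows"]]


# Trimmed copy of the default response shown above
response = {
    "columns": ["pk1", "pk2", "pk3", "content"],
    "rows": [
        ["a", "a", "a", "a-a-a"],
        ["a", "a", "b", "a-a-b"],
    ],
}

records = rows_as_dicts(response)
```

Every consumer ends up writing some version of this, which is the cost of the list-of-lists representation.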
https://latest.datasette.io/fixtures/compound_three_primary_keys.json?_size=2&_shape=array returns this: [
{
"pk1": "a",
"pk2": "a",
"pk3": "a",
"content": "a-a-a"
},
{
"pk1": "a",
"pk2": "a",
"pk3": "b",
"content": "a-a-b"
}
] There's one big problem with this format: it doesn't provide any space for pagination information. |
Another idea: the default output could be the list of dicts: [
{
"pk1": "a",
"pk2": "a",
"pk3": "a",
"content": "a-a-a"
},
...
] BUT... I could include pagination information in the HTTP headers - as seen in the WordPress REST API or the GitHub API:
Alternative shapes would provide the pagination information (and other extensions) in the JSON, e.g.:
{
"rows": [
{
"pk1": "a",
"pk2": "a",
"pk3": "a",
"content": "a-a-a"
}
],
"pagination": {
"next": "234",
"count": 442
}
} |
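To make the header-based option concrete, here is a rough sketch of what a client would have to do to find the next page from a `Link` header. The function name and regular expression are my own illustration and only handle the simple `<url>; rel="next"` form, not the full header grammar.

```python
import re

# Matches entries of the form: <url>; rel="name"
_LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel="?([^";,]+)"?')


def next_url_from_link_header(link_header):
    """Return the URL tagged rel="next" in a Link header, or None."""
    for url, rel in _LINK_RE.findall(link_header or ""):
        if rel.strip() == "next":
            return url
    return None


header = (
    '<https://example.com/db/table.json?_next=0>; rel="prev", '
    '<https://example.com/db/table.json?_next=2>; rel="next"'
)
next_url = next_url_from_link_header(header)
```

Compare that with reading a `"next"` key out of the JSON body, where no extra parsing is needed.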
I'm going to hack together a preview of this in a branch and deploy it somewhere so people can see what I've got planned. Much easier to evaluate a working prototype than static examples. |
(I think I may have been over-thinking the details of this for a couple of years now.) |
I'd like to revisit the idea of using |
Would it be so bad if the default format had a A default format that's an object rather than array also gives something for the |
Plan: ship a new release of Datasette (probably 0.49) with the new JSON API design, but provide a plugin called something like Anyone who has built applications against 0.48 can install that plugin. |
Since the total count can be expensive to calculate, I'm inclined to make that an opt-in extra - maybe Based on that, the default JSON shape could look something like this: {
"rows": [{"id": 1}, {"id": 2}],
"next": "2",
"next_url": "/db/table?_next=2"
} And with {
"rows": [{"id": 1}, {"id": 2}],
"next": "2",
"next_url": "/db/table?_next=2",
"count": 31
} |
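A sketch of how a client might walk every page of that proposed shape by following `next_url`. The `fetch_json` callable is injected so the example runs without a network, and the idea that the final page has a missing or empty `next_url` is an assumption on my part, not something settled above.

```python
def iter_rows(fetch_json, first_url):
    """Yield every row across all pages by following "next_url"."""
    url = first_url
    while url:
        page = fetch_json(url)
        yield from page["rows"]
        # Assumption: the last page omits next_url or sets it to a falsy value
        url = page.get("next_url")


# Fake pages standing in for HTTP responses (hypothetical data)
PAGES = {
    "/db/table.json": {
        "rows": [{"id": 1}, {"id": 2}],
        "next": "2",
        "next_url": "/db/table.json?_next=2",
    },
    "/db/table.json?_next=2": {
        "rows": [{"id": 3}],
        "next": None,
        "next_url": None,
    },
}

all_rows = list(iter_rows(PAGES.__getitem__, "/db/table.json"))
```

In real use `fetch_json` would wrap whatever HTTP client the application already has.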
Should that default also include |
Maybe Or... |
The core issue that I keep reconsidering is whether the default Arguments in favour of a list:
Arguments against:
But maybe that last point is a positive? It ensures the default If Maybe {
"rows": [{"id": 1}, {"id": 2}],
"total": 104
} The thing I care about most though is
{
"rows": [{"id": 1}, {"id": 2}],
"total": 104,
"next": "2",
"next_url": "/db/table.json?_extra=total&_extra=next&_next=2"
} This is feeling a bit verbose for a common combination though. |
I'm going to prototype what it would look like if the default shape was a list of objects and |
Building this plugin reminded me of an oddity of the That's not ideal. I'd like custom renderers to be able to access this data to get at things like suggested facets, on an opt-in basis. So maybe that kind of stuff is re-implemented as "extras" which are awaitable callables - then renderer plugins can call the extras that they need to as part of their execution. To illustrate the problem (in this case the need to access
from datasette import hookimpl
from datasette.utils.asgi import Response


@hookimpl
def register_output_renderer(datasette):
    # Register a renderer for the .json-preview extension
    return {
        "extension": "json-preview",
        "render": json_preview,
    }


def json_preview(data, columns, rows):
    # Pagination details are only reachable through the data argument
    next_url = data.get("next_url")
    headers = {}
    if next_url:
        # Expose the next page as an HTTP Link header
        headers["link"] = '<{}>; rel="next"'.format(next_url)
    # Pair each row with the column names to return a list of objects
    return Response.json([dict(zip(columns, row)) for row in rows], headers=headers)
|
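Here is a rough, standalone sketch of the "extras as awaitable callables" idea described in this comment. None of these names exist in Datasette: `EXTRAS`, `render_with_extras` and the `request_state` dictionary are all hypothetical, and the point is only that a renderer awaits the extras it asks for and nothing else.

```python
import asyncio


async def extra_next_url(request_state):
    # Hypothetical extra: cheap, just reads a precomputed value
    return request_state.get("next_url")


async def extra_suggested_facets(request_state):
    # Hypothetical extra: stands in for work that may need extra queries
    await asyncio.sleep(0)
    return [{"name": column} for column in request_state["columns"]]


# Hypothetical registry of opt-in extras, keyed by name
EXTRAS = {
    "next_url": extra_next_url,
    "suggested_facets": extra_suggested_facets,
}


async def render_with_extras(request_state, wanted):
    """Build a list-of-objects response, awaiting only the requested extras."""
    columns = request_state["columns"]
    result = {"rows": [dict(zip(columns, row)) for row in request_state["rows"]]}
    for name in wanted:
        result[name] = await EXTRAS[name](request_state)
    return result


state = {
    "columns": ["id"],
    "rows": [[1]],
    "next_url": "/db/table.json?_next=1",
}
output = asyncio.run(render_with_extras(state, ["next_url"]))
```

Under this design a renderer that never asks for suggested facets never pays for calculating them.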
Here's the |
I vote against headers. It has a lot of strikes against it: poor discoverability, new developers often don’t know how to use them, makes CORS harder, makes it hard to use eg with JQ, needs ad hoc specification for each bit of metadata, etc. The only advantage of headers is that you don’t need to do .rows, but that’s actually good as a data validation step anyway—if .rows is missing assume there’s an error and do your error handling path instead of parsing the rest. |
Great point about CORS, I hadn't considered that. I think I'm going to keep the {
"total": 36,
"rows": [{"id": 1, "name": "Cleo"}],
"next_url": "https://latest-with-plugins.datasette.io/fixtures/facetable.json?_next=5"
} So three keys: |
I'll update |
I'm going to take another look at this: |
It turned out the most significant part of this change - switching from an array of arrays to an array of objects for the |
https://latest.datasette.io/fixtures/compound_three_primary_keys.json?_size=2 now shows the new default: {
"database": "fixtures",
"table": "compound_three_primary_keys",
"is_view": false,
"human_description_en": "",
"rows": [
{
"pk1": "a",
"pk2": "a",
"pk3": "a",
"content": "a-a-a"
},
{
"pk1": "a",
"pk2": "a",
"pk3": "b",
"content": "a-a-b"
}
], The old format can be had like this: https://latest.datasette.io/fixtures/compound_three_primary_keys.json?_size=2&_shape=arrays |
New thinking on the trimmed-down default. Previously I was going to use {
"ok": true,
"rows": [
{
"pk1": "a",
"pk2": "a",
"pk3": "a",
"content": "a-a-a"
},
{
"pk1": "a",
"pk2": "a",
"pk3": "b",
"content": "a-a-b"
}
],
"next": "a,a,b"
} If there isn't a next page it will return This is even more succinct. I'm OK with people having to request The |
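A small sketch of how a client could consume this trimmed-down shape: check `"ok"`, then turn the `"next"` token into the URL for the following page using the `_next` parameter seen earlier in this thread. The helper name is made up, and treating any falsy `"next"` value as "no more pages" is an assumption.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def next_page_url(current_url, payload):
    """Return the URL of the following page, or None on the last page."""
    if not payload.get("ok"):
        # Take the error handling path instead of parsing the rest
        raise ValueError("response did not report ok")
    token = payload.get("next")
    if not token:
        # Assumption: a falsy "next" means there is no further page
        return None
    parts = urlsplit(current_url)
    # Drop any previous _next value, keep the other parameters
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "_next"]
    query.append(("_next", token))
    return urlunsplit(parts._replace(query=urlencode(query)))


url = "https://latest.datasette.io/fixtures/compound_three_primary_keys.json?_size=2"
payload = {"ok": True, "rows": [], "next": "a,a,b"}
following = next_page_url(url, payload)
```

This keeps the default response small while still letting clients paginate without asking for any extras.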
Here's the so-far updated documentation for this change: https://github.com/simonw/datasette/blob/a2dca62360ad4a961d4c46f68eae41b7d5c7b2c9/docs/json_api.rst#different-shapes |
I'm going to rename |
https://latest.datasette.io/fixtures/compound_three_primary_keys.json?_size=2 now returns {
"database": "fixtures",
"table": "compound_three_primary_keys",
"is_view": false,
"human_description_en": "",
"rows": [
{
"pk1": "a",
"pk2": "a",
"pk3": "a",
"content": "a-a-a"
},
{
"pk1": "a",
"pk2": "a",
"pk3": "b",
"content": "a-a-b"
}
],
"truncated": false,
"count": 1001, |
Most of this is shipped in https://docs.datasette.io/en/1.0a3/changelog.html#a3-2023-08-09 |
The default JSON just isn't right. I find myself using ?_shape=array for almost everything I build against the API.