Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement SQL formatter into the docs pipeline #13031

Open
qiancai opened this issue Mar 27, 2023 · 6 comments
Open

Implement SQL formatter into the docs pipeline #13031

qiancai opened this issue Mar 27, 2023 · 6 comments
Labels
2024-tidb-docs-dash This issue or PR is included in the 2024 TiDB Docs Dash event. type/enhancement The issue or PR belongs to an enhancement.

Comments

@qiancai
Copy link
Collaborator

qiancai commented Mar 27, 2023

Change Request

  1. Describe what you find is inappropriate or missing in the existing docs.

    It would be better to have SQL code be more readable by default and be stacked as opposed to all one line.

    For example, the line:

    SELECT l_orderkey, MAX(L_COMMENT), MAX(L_SHIPMODE), MAX(L_SHIPINSTRUCT), MAX(L_SHIPDATE), MAX(L_EXTENDEDPRICE) FROM lineitem GROUP BY l_orderkey HAVING SUM(l_quantity) > 314;
    

    Would be more readable as:

    SELECT 
          l_orderkey, 
          MAX(L_COMMENT), 
          MAX(L_SHIPMODE), 
          MAX(L_SHIPINSTRUCT), 
          MAX(L_SHIPDATE), 
          MAX(L_EXTENDEDPRICE) 
    FROM lineitem 
    GROUP BY l_orderkey 
    HAVING SUM(l_quantity) > 314;
    
  2. Describe your suggestion or addition.

    Try and put something like https://sqlfum.pt/ into the docs pipeline to automatically:

    • Fix the case for SQL keywords (e.g. "insert" → "INSERT")
    • Convert to multi-line
  3. Provide some reference materials (such as documents and websites) if you could.

    https://github.com/mjibson/sqlfmt (which is https://wasm.sqlfum.pt/ )
    https://sqlfmt.com/
    https://www.salvis.com/blog/2020/08/28/formatting-sql-code-blocks-in-markdown-files/

@qiancai qiancai added the type/enhancement The issue or PR belongs to an enhancement. label Mar 27, 2023
@qiancai qiancai added the 2024-tidb-docs-dash This issue or PR is included in the 2024 TiDB Docs Dash event. label Nov 29, 2023
@curiousRay
Copy link

@qiancai
I found that some SQL fragments in markdown files are not well-organized, therefore prettifying is not possible for some formatter tools.

Here are some suggestions, taking this document for example:

  • If you intend to emphasis the structure of each individual SQL sentence, you may:
    Sepereate the execution results (the character table and "x rows in set", "Query OK, x rows affected") from the entire block
    Set the results as normal code block (not marked as ```sql).

  • If you intend to show the whole workflow of some SQL sentences, you may:
    Keep the entire block as ```sql, but some markers shall be made for non-SQL content so that parser tools can identify then ignore them.

P.S. I think this formatter tool looks sweet, it comes with a yarn package and is easy to integrate.

@curiousRay
Copy link

Plus, the user might not want to copy mysql> together with SQL sentences. Is there any possibility that mysql> or MySQL [test]> is rendered purely by typescript, rather than from markdown sourcefile?

@qiancai
Copy link
Collaborator Author

qiancai commented Jan 10, 2024

Hi @curiousRay Many thanks for the investigation. Yes, as we have a large number of instances that include both SQL statements and their results in one code block, it is hard for the formatter tool to distinguish them.

As for "mysql>", we can directly remove it from the Markdown source file because the website renders the doc according to ```sql instead of "mysql>".

@qiancai
Copy link
Collaborator Author

qiancai commented Jan 10, 2024

/cc @dveeden

@dveeden
Copy link
Contributor

dveeden commented Jan 19, 2024

@curiousRay @qiancai I've created a proof of concept SQL formatter for markdown based on TiDB: https://github.com/dveeden/tidb-markdown-sql-format

If we can remove the prompt and add it back when rendering that would be good.

I think we should aim to have something like this:

```sql
SELECT 1
```
```
+----------+
| output|
+----------+
| output |
+----------+
```

This has:

  1. No prompt in the SQL code block
  2. The output and statement in separate code blocks
  3. The output code block (resultset or something like Query OK, 0 rows affected (0.11 sec)) not being set to sql

However we currently have many places with:

  1. A prompt (mysql> ) included in the SQL
  2. The statement and resultset or output combined.
  3. SQL code blocks having only output or a resultset (Statement and Result in different code blocks, both with the type set to sql)

If we split the SQL statement and output code block we could even have a formatter that formats the tabular output in a nicely looking table if we want. However I'm not sure if we should do that at some point and I don't think we should do that now.

@dveeden
Copy link
Contributor

dveeden commented Apr 12, 2024

P.S. I think this formatter tool looks sweet, it comes with a yarn package and is easy to integrate.

@curiousRay it now also has TiDB support:
https://github.com/sql-formatter-org/sql-formatter/tree/master/src/languages/tidb

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2024-tidb-docs-dash This issue or PR is included in the 2024 TiDB Docs Dash event. type/enhancement The issue or PR belongs to an enhancement.
Development

No branches or pull requests

3 participants