Use ![alt](path) syntax not only to include images, but also text files #307

Open
AvtechScientific opened this issue Jun 29, 2024 · 22 comments

Comments

@AvtechScientific

Use ![alt](path) syntax not only to include images, but also text files.

See: jgm/djot.js#85

"This commit enables the creation of structured documents of arbitrary complexity, contrary to the current state of affairs where only one-pagers are allowed. So now if one needs to write a book or a lengthy article all the content must be in one file. Working with big files is not convenient and may slow down/crash the editor besides being hard to read. File inclusion, implemented in jgm/djot.js#94, allows the author to subdivide the book/article into separate manageable chapters. This provides djot with the power similar to that of LaTeX and makes it positively distinct from all the Markdown-like tools."

And a ready to merge PR: jgm/djot.js#94

@Omikhleia

Sorry, but there's a discrepancy between this description and what the code does. I see some code related to footnotes, etc. So what does it mean? What are the implications for other Djot implementations?

@Omikhleia

Omikhleia commented Jun 29, 2024

Also, sorry, but this is bragging, and weird:

This provides djot with the power similar to that of LaTeX and makes it positively distinct from all the Markdown-like tools.

In my own workflows, I use a higher level master document file (above Djot etc.), and I have made printed books from front cover to back cover with them, assembling bits of Djot content (and also Markdown or other, when needed). I don't see anything being solved "positively" here.

@AvtechScientific
Author

AvtechScientific commented Jun 30, 2024 via email

@AvtechScientific
Author

AvtechScientific commented Jun 30, 2024

In my own workflows, I use a higher level master document file (above Djot etc.), and I have made printed books from front cover to back cover with them, assembling bits of Djot content (and also Markdown or other, when needed). I don't see anything being solved "positively" here.

Using your own workflows can you handle this simple case of two included files:

File_1:
Is this **bold

File_2:
or not?**... hmm

Master_File:
![File_1](./File_1) ![File_2](./File_2)

If yes - how do you do it?

@Omikhleia

I certainly would forbid breaking semantic structures between different files. What is your example above even aiming to solve? Do you really expect someone to start a bold structure in one file... and end it in another, seriously?

@Omikhleia

Speaking of semantics, what becomes the "alt" content in your ![alt](somefile.djot) ?

@AvtechScientific
Author

Speaking of semantics, what becomes the "alt" content in your ![alt](somefile.djot) ?

Just to stay consistent with the image case - alt will be displayed if somefile.djot is not found.

@AvtechScientific
Author

I certainly would forbid breaking semantic structures between different files. What is your example above even aiming to solve? Do you really expect someone to start a bold structure in one file... and end it in another, seriously?

  1. Judging by the lack of a solution on your side I conclude that you have no solution.
  2. The above-mentioned example was intended for demonstration purposes only. If you can't think of a practical example for this, I'll help you. Inclusion, among other things, enables templates. Imagine a document that consists of two constant sections, a header and a footer, and a variable middle section (e.g. a letter to company employees where the variable middle section consists of employee names being programmatically injected). The header might open an inline element (bold, italic or whatever) that the footer will close. It's just one example. I think real life can bring more.
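
For readers following the File_1/File_2 example, here is a minimal Python sketch of purely textual inclusion performed before any Djot parsing, with the alt text used as the fallback when the file is missing. It is an illustration only and not a description of what jgm/djot.js#94 does; the regular expression, the `expand_includes` name, and the `files` dictionary standing in for the file system are all assumptions.

```python
import re

# Hypothetical pattern for ![alt](path). Real Djot parsing is more involved
# (nested brackets, escapes, attributes), so this is for illustration only.
INCLUDE = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")


def expand_includes(text, files):
    """Replace each ![alt](path) with the text of `path`, or with alt if missing.

    `files` is a dict mapping paths to contents; it stands in for the file system.
    """
    def substitute(match):
        alt, path = match.group(1), match.group(2)
        content = files.get(path)
        if content is None:
            return alt  # fallback: show the alt text when the file is not found
        return content.rstrip("\n")

    return INCLUDE.sub(substitute, text)


files = {"./File_1": "Is this **bold\n", "./File_2": "or not?**... hmm\n"}
master = "![File_1](./File_1) ![File_2](./File_2)"
expanded = expand_includes(master, files)
```

Because the substitution happens on raw text, an inline element opened in one file can be closed in another; parsing each file separately and splicing the resulting trees would behave differently.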

@Omikhleia

Omikhleia commented Jun 30, 2024

Just to stay consistent with the image case - alt will be displayed if somefile.djot is not found.

How consistent? In pandoc-flavored markdown, what you call the "alt" text may be used as the figure legend if the image stands alone in a paragraph of its own (see the "implicit figure" option).
The Djot syntax is not yet very clear about this use case, and it has no general provision for captioned images and figures either -- for reference, see notably discussions #28 and #87, among others. But whether it eventually goes the same way as pandoc (using the bracketed text for an implicit figure) or via a generalization of the (currently table-only) ^ legend markup, the issue would remain the same.

Judging by the lack of a solution on your side I conclude that you have no solution.

Sympathetic. But I recognize I wouldn't have a solution for a non-issue edge case with no clear semantics defined ;)

You mention templating - I'm not even that sure it should be part of the document syntax... And real templating goes far beyond mere content injection.
(For the mere record, however, I already need and use some sort of custom templating logic too, see #238, in specialized Djot-based template files. I do think the real life can bring more, indeed.)

@criloz

criloz commented Sep 27, 2024

I think the association of this syntax ![alt](path) with images is so widespread that it is not worth the effort to make it different.

One option would be to use another symbol, such as #:

  1. #(table.csv), by default use the file extension to detect the file type
  2. #[csv](table.csv), the file type can be overridden like this, ignoring the file extension
  3. #[=](table.csv) or #[=csv](table.csv), indicate in the AST that the file should be included as raw content; the file type could be used to add some kind of syntax highlighting
  4. #([email protected]) include only a slice of the file
  5. #(../user/profile.djot#who-i-am) include a section of the file that can be addressed by references

The only issue would be that the # symbol is used for tagging in certain systems, but tags generally don't use symbols like [, ], (, ), or must start with a valid unicode_id_start character.

@Omikhleia

Your (4) goes far beyond the Djot input format markup and is probably best left to the renderers' interpretation... If, instead of a CSV, you had (to stay with something similar) a spreadsheet (ODT/OOXML), the "Sheet" name might be needed, etc. All of this is highly dependent on the source format, so a specification would have to be very explicit. (What's a slice of a CSV actually? Lines, columns, both? Etc.)

@criloz

criloz commented Sep 27, 2024

@Omikhleia you are right on this; it was mostly related to another comment on this topic, #199 (comment). But this can also be solved using URIs, e.g. #(table.csv?start_row=12&end_row=34), which gives the interpreter more freedom in how to handle it.
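
As a rough illustration of the URI idea, the sketch below splits such a target into a path and optional row bounds using Python's standard `urllib.parse`. The function name `parse_slice_target` is hypothetical, and whether the rows are 0- or 1-based, or inclusive, is deliberately left to the interpreter, as the comment suggests.

```python
from urllib.parse import parse_qs, urlsplit


def parse_slice_target(target):
    """Split a target such as 'table.csv?start_row=12&end_row=34'.

    Returns (path, start_row, end_row); a bound is None when it is absent.
    The parameter names start_row/end_row are taken from the example above.
    """
    parts = urlsplit(target)
    query = parse_qs(parts.query)

    def as_int(name):
        values = query.get(name)
        return int(values[0]) if values else None

    return parts.path, as_int("start_row"), as_int("end_row")


path, start_row, end_row = parse_slice_target("table.csv?start_row=12&end_row=34")
```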

@vassudanagunta
Contributor

vassudanagunta commented Sep 27, 2024

I think the association of this syntax ![alt](path) with images is so widespread that it is not worth the effort to make it different.

It would not be making it different, but would be generalizing it, with consistent transclusion semantics for any referenced media type:

  • [label](moon.jpg): link to image
  • ![alt](moon.jpg): embed image
  • [label](phases.csv): link to CSV
  • ![alt](phases.csv): embed CSV as table

@criloz

criloz commented Sep 27, 2024

@vassudanagunta I think the problem with that is: what would be the meaning of [alt]? In images and links it is used as a label, but in a CSV or Djot file, etc.? Also, how do you assign a file type? Just using the file extension is not enough, because some file extensions clash, and in other cases the file extension is not present at all, as in a URL like https://example.com/user/81a6bf136427d9e/raw/acde69adea3db6

Another option would be to use attributes to sort this out:

  • ![](https://example.com/user/81a6bf136427d9e/raw/acde69adea3db6){file_type=csv}

@Omikhleia

Omikhleia commented Sep 27, 2024

* `![](https://example.com/user/81a6bf136427d9e/raw/acde69adea3db6){file_type=csv}`

Another way perhaps, which remains compatible with the default/current syntax: !format[alt](url) where "format" is optional and "guessed" by the renderer (btw. the file extension is not the only way, one could do a file introspection, etc.). This would allow for !csv[alt](myfile.txt)

@vassudanagunta
Contributor

[alt] is alternate text, not a label, meant to be used instead of the referenced resource in a number of circumstances. This is current semantics.

The issue of unknown file types is a general problem not exclusive to markup languages (e.g. opening a file via your OS GUI, or HTTP GET). It should be solved the same way rather than by introducing something new: use a file extension as best practice. Else check the MIME type. Else check the magic bytes. Else report an error.
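
The fallback chain above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `detect_media_type` name, its parameters, and the two small lookup tables are hypothetical, and a real renderer would use much fuller tables.

```python
import posixpath

# Hypothetical, deliberately tiny lookup tables for illustration.
EXTENSIONS = {".csv": "text/csv", ".jpg": "image/jpeg", ".png": "image/png"}
MAGIC = {b"\x89PNG\r\n\x1a\n": "image/png", b"\xff\xd8\xff": "image/jpeg"}


def detect_media_type(path, declared_mime=None, head=b""):
    """Resolve a media type: extension, then declared MIME type, then magic bytes.

    `declared_mime` stands for something like an HTTP Content-Type header;
    `head` is the first few bytes of the resource. Raises ValueError otherwise.
    """
    extension = posixpath.splitext(path)[1].lower()
    if extension in EXTENSIONS:
        return EXTENSIONS[extension]
    if declared_mime:
        # Drop parameters such as "; charset=utf-8".
        return declared_mime.split(";")[0].strip().lower()
    for signature, media_type in MAGIC.items():
        if head.startswith(signature):
            return media_type
    raise ValueError(f"cannot determine the media type of {path!r}")


url = "https://example.com/user/81a6bf136427d9e/raw/acde69adea3db6"
from_extension = detect_media_type("phases.csv")
from_header = detect_media_type(url, declared_mime="text/csv; charset=utf-8")
from_magic = detect_media_type(url, head=b"\x89PNG\r\n\x1a\n\x00\x00")
```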

@criloz

criloz commented Sep 27, 2024

* `![](https://example.com/user/81a6bf136427d9e/raw/acde69adea3db6){file_type=csv}`

Another way perhaps, which remains compatible with the default/current syntax: !format[alt](url) where "format" is optional and "guessed" by the renderer (btw. the file extension is not the only way, one could do a file introspection, etc.). This would allow for !csv[alt](myfile.txt)

This !format[alt](uri) could actually work and be implemented easily at the lexer level, where the format could be a valid Unicode identifier: the lexer would match !{unicode_id}[ as the token that starts a file transclusion, and otherwise just emit a text token.
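
A minimal Python sketch of that lexer rule follows, assuming that `str.isidentifier()` is an acceptable stand-in for "valid Unicode identifier" (it also accepts a leading underscore, which may or may not be wanted). The function name `match_transclusion_start` is hypothetical and not part of any Djot implementation.

```python
def match_transclusion_start(text, pos=0):
    """Try to match '![' or '!format[' at text[pos:].

    Returns (format_or_None, next_pos) on success, where next_pos is the index
    just after '['. Returns None when the lexer should emit plain text instead.
    """
    if pos >= len(text) or text[pos] != "!":
        return None
    bracket = text.find("[", pos + 1)
    if bracket == -1:
        return None
    fmt = text[pos + 1:bracket]
    if fmt == "":
        return None, bracket + 1  # plain ![ with no explicit format
    if fmt.isidentifier():
        return fmt, bracket + 1  # !format[ with an identifier-like format
    return None  # e.g. "!not a format[": treat the "!" as text


with_format = match_transclusion_start("!csv[alt](myfile.txt)")
without_format = match_transclusion_start("![alt](moon.jpg)")
not_a_token = match_transclusion_start("!not a format[alt](x)")
```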

But there are two types of file transclusion:

  1. include the file and render it (interpreted); for example, a CSV file would end up rendered as a table
  2. include the file content without interpretation; this would be equivalent to writing a code block with the content of the file.

So based on that, and to maintain consistency, Djot could handle both cases with a syntax similar to what is currently used to differentiate between raw content and a code block:

  1. !=[alt](uri), interpreted without a format
  2. ![alt](uri), include the file content without a format
  3. !=format[alt](uri), interpreted with a format
  4. !format[alt](uri), include the file content with a format, allowing syntax highlighting

The problem is that this would change the current semantics of ![alt](uri), but an exception could be made for URIs that have an image file type, to maintain some kind of backward compatibility.

@Omikhleia

Omikhleia commented Sep 27, 2024

But there are two types of file transclusion (...)

I am not so sure. Keeping on with your CSV example, it may be rendered as a table, or as a graph (I'm using this for pie charts for instance, in my own renderer, but other visualizations could be considered). This is very open and cannot be handled with a straight single = in the input syntax, so the best course of action might be to use a class (or key-value attr), e.g. ![Soccer games](soccer.csv){.piechart} and the renderer does its best to honor the class / key-value attr it knows, and to have a decent fallback otherwise...

Note that this is not "transclusion", per se, by the way.

EDIT: class vs. key-value vs. "semantic" tag, if the latter (#240) eventually makes it.
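
To make the class-based approach concrete, here is a small Python sketch in which a renderer honors the classes it knows and falls back to a table otherwise. The `RENDERERS` registry, the function names, and the placeholder output strings are assumptions for illustration; they do not describe any existing renderer.

```python
import csv
import io


def render_table(alt, rows):
    # Placeholder output; a real renderer would emit an actual table.
    return f"<table: {alt}, {len(rows)} rows>"


def render_piechart(alt, rows):
    # Placeholder output; a real renderer would draw the chart.
    return f"<pie chart: {alt}, {len(rows)} rows>"


# Hypothetical registry: class name -> renderer for embedded CSV data.
RENDERERS = {"piechart": render_piechart}


def render_csv_embed(alt, csv_text, classes=()):
    """Render embedded CSV using the first known class, else fall back to a table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    for cls in classes:
        if cls in RENDERERS:
            return RENDERERS[cls](alt, rows)
    return render_table(alt, rows)


data = "team,wins\nA,3\nB,5\n"
as_chart = render_csv_embed("Soccer games", data, ["piechart"])
as_fallback = render_csv_embed("Soccer games", data, ["unknown-class"])
```

Here `![Soccer games](soccer.csv){.piechart}` would correspond to the first call, and an unrecognized class degrades to the default table rendering.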

@criloz

criloz commented Sep 27, 2024

@Omikhleia you are right, the djot class system can be used for this, there are many ways to interpret a file

@mklcp

mklcp commented Oct 21, 2024

I second using the class system. Even the distinction between ![alt](uri) and !=[alt](uri) can be handled by the class system. And it is not always meaningful, e.g. !=[alt](uri). What does that mean? Literally include the image as a stream of bytes? Written in hex? in rgb? Continuous or interspersed with spaces between columns (like hexdump)?

I'd rather have ![alt](uri) generalized to all kinds of files, and use classes to customize if needed or if ambiguous.

@criloz

criloz commented Oct 21, 2024

@mklcp I agree about the class system; even better, because attributes accept key-value pairs, it can potentially mimic function invocation, which allows great versatility and extensibility.

But I think it would be better if there were an unambiguous way to declare the file type, so the render engine would have three variables with which to dispatch the proper method:

  1. the intention to include the file: ![alt](path)
  2. the file type, for cases where it is impossible to determine the file type from the URI/path; it could be as suggested by @Omikhleia: !format?[alt](path)
  3. the classes and attributes, which can determine a wide range of options, such as drawing a histogram from this CSV file.

@AvtechScientific
Author

Any decision/implementation on this topic?
