Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unclear error message when using code chunk options with yaml and an extra space is present #4306

Closed
4 tasks done
coatless opened this issue Feb 10, 2023 · 10 comments
Closed
4 tasks done
Assignees
Labels
documentation Doc improvements & quarto-web enhancement New feature or request upstream Bug is in upstream library
Milestone

Comments

@coatless
Copy link
Contributor

Bug description

This issue came up earlier today when a student was having problems identifying why their quarto document wasn't able to be rendered.

In particular, the student accidentally spaced the code chunk option one extra, that is:

```{sql}
#| label: test multiqueries -5
#|  connection: db
-- ^ note the extra space

select 1
```

Thus, when the Quarto document was rendered, we were seeing:

processing file: my-homework.qmd
Error in yaml::yaml.load(meta, handlers = list(expr = parse_only)) : 
  Scanner error: mapping values are not allowed in this context at line 2, column 12
Calls: .main ... FUN -> parse_block -> partition_chunk -> <Anonymous>
Execution halted

This is a misleading error as it implies that the error falls inside of the document customization header. Instead, the error was related to a code chunk option.

Would it be possible to provide more localized support for bad code chunk specification?

my-homework.qmd :

---
title: "Example"
format: pdf
engine: knitr
editor: visual
---

```{r}
# install.packages(c("DBI", "RSQLite"))
db = DBI::dbConnect(RSQLite::SQLite(), ":memory:")
```


```{sql}
#| label: test multiqueries -5
#|  connection: db

select 1
```

Version information:

  • RStudio: 2022.12.0.353
  • Quarto: 1.2.313
  • Operating System: macOS

Checklist

  • Please include a minimal, fully reproducible example in a single .qmd file? Please provide the whole file rather than the snippet you believe is causing the issue.
  • Please format your issue so it is easier for us to read the bug report.
  • Please document the RStudio IDE version you're running (if applicable), by providing the value displayed in the "About RStudio" main menu dialog?
  • Please document the operating system you're running. If on Linux, please provide the specific distribution.
@coatless coatless added the bug Something isn't working label Feb 10, 2023
@cscheid
Copy link
Collaborator

cscheid commented Feb 10, 2023

Thus, when the Quarto document was rendered, we were seeing:

processing file: my-homework.qmd
Error in yaml::yaml.load(meta, handlers = list(expr = parse_only)) : 
  Scanner error: mapping values are not allowed in this context at line 2, column 12
Calls: .main ... FUN -> parse_block -> partition_chunk -> <Anonymous>
Execution halted

This is a misleading error as it implies that the error falls inside of the document customization header. Instead, the error was related to a code chunk option.

This is indeed unfortunate, thanks for reporting. We already do some patching of the error messages to point to the correct location, but it seems that we're not catching this one.

@cscheid cscheid self-assigned this Feb 10, 2023
@cscheid
Copy link
Collaborator

cscheid commented Feb 10, 2023

Oh, I understand now. Ok. This is an annoying one. This is actually coming from knitr, because it's an sql code cell which quarto itself doesn't process. However, knitr itself seems to have an issue here because the correct comment syntax for sql is --, not #.

Consider this instead:

---
title: issue-4306
---

```{r}
#| label: fig-1
#|  fig-cap: This is a bad line
plot(1,2)
```

That document generates this error message:

ERROR: YAMLException: bad indentation of a mapping entry (4306.qmd, 7:12)
6: #| label: fig-1
7: #|  fig-cap: This is a bad line
             ~~
8: plot(1,2)

Which still has imprecise squiggles, but is at least pointing to the correct place in the document.

Then, consider this:

---
title: issue-4306
---

```{sql}
--| label: fig-2
--|  fig-cap: This is a bad line
select * from this_does_not_exist
```

That yields this error:

ERROR: YAMLException: bad indentation of a mapping entry (4306.qmd, 7:13)
6: --| label: fig-2
7: --|  fig-cap: This is a bad line
              ~~
8: select * from this_does_not_exist

So, the problem only happens when you use the wrong comment syntax (which means quarto doesn't detect it as a yaml block to validate), and send it to knitr (which seems to ignore the wrong comment syntax and then fail to parse the yaml correctly, but give error messages with bad error numbers).

The problem is that knitr doesn't track line numbering precisely enough, and the error that comes back from knitr is too ambiguous for us to correct it after the fact.

Ultimately, I believe this needs to be fixed on knitr (@cderv).

@cderv
Copy link
Collaborator

cderv commented Feb 10, 2023

I would say that the problem is that --| comment should be used in Quarto really to benefit the YAML intelligence.

knitr will use the same logic to recognized comment for a language and so using --| works for SQL chunk.

However, yes knitr will also support the #| for all languages as a fallback if the expected comment is not found. I think this was added for convenience for R user and also because knitr does not only support YAML but also multiline R options. *

This means both

```{sql}
--| label: test multiqueries -5
--|  connection: db

select 1
```

and

```{sql}
#| label: test multiqueries -5
#|  connection: db

select 1
```

are valid for knitr and in R Markdown. However, regarding Quarto,

  • the former will trigger YAML validation in Quarto before anything happens in knitr
  • the latter won't trigger YAML validation and the chunk with its comment is handled by knitr which will try to parse it using R function from yaml 📦. As this is not valid YAML, the package function will trigger an error that knitr report. It is not knitr directly generating the issue.

Before talking about solution, @cscheid do you think there is other side effect that supporting #| in all chunk for knitr could have ? Or only Quarto YAML validation not working ?

One thing we could do is being more strict for Quarto document and not checking the fallback. Easy for us to do.
This means when using Quarto, use the comment from the language of the cell
Maybe that was a wrong idea to do that in the first place and we should remove the fallback completely, but there could be a good reason. @yihui do you remember why the addition ? Do you agree with my first suggestion above ?

The problem is that knitr doesn't track line numbering precisely enough, and the error that comes back from knitr is too ambiguous for us to correct it after the fact.

I don't think we should try to report an error similar to the Quarto YAML validation but in fact be sure it is triggered as I suggest above.

Ultimately, I believe this needs to be fixed on knitr

I agree - I am not sure it is a good idea for Quarto to adapt to knitr behavior in R document as this would make a difference with other computation engine.

@cscheid
Copy link
Collaborator

cscheid commented Feb 10, 2023

Before talking about solution, @cscheid do you think there is other side effect that supporting #| in all chunk for knitr could have ? Or only Quarto YAML validation not working ?

I discussed this with @jjallaire a long time ago and the conclusion is that we want strongly for the code syntax in code blocks to be valid code in those languages. If you support (eg) #| in C code, then those contents are no longer valid C.

@cderv
Copy link
Collaborator

cderv commented Feb 10, 2023

Yes I agree with that. We don't do that by default in knitr. So I think we could deactivate support for fallback at least when knitr is used for Quarto document.

However, it will not help find the error because the options won't be parsed at all. This is what I get by doing the above (using this branch yihui/knitr#2225) rendering with Quarto

---
title: "Example"
format: html
engine: knitr
---

```{r}
# install.packages(c("DBI", "RSQLite"))
db = DBI::dbConnect(RSQLite::SQLite(), ":memory:")
```

```{sql}
#| label: test multiqueries -5
#|  connection: db

select 1
```
processing file: test2.qmd
Error:
! Non R chunk should prefix pipe `|` with the comment character from the language for YAML options. You are using 'sql':
  you probably should start your comment with "#| "

I decided to suggest something with an error as otherwise the code chunk will be executed without chunk options being parsed (see PR comment). I believe that it why we support the fallback in the first place, but this won't "teach" users to use the correct comment line.

Do you think that would be better ? @coatless do you find that more useful ?

@yihui
Copy link
Contributor

yihui commented Feb 10, 2023

This is a misleading error as it implies that the error falls inside of the document customization header. Instead, the error was related to a code chunk option.

@cderv I think the error message definitely needs to be improved to clarify the source of the real error (telling users that "line 2, column 12" doesn't mean the line/column in the whole source document but in the YAML part of a code chunk).

I discussed this with @jjallaire a long time ago and the conclusion is that we want strongly for the code syntax in code blocks to be valid code in those languages. If you support (eg) #| in C code, then those contents are no longer valid C.

I also had a discussion with JJ a long time ago but I don't think we convinced each other. I prefer using #| for all languages, and he prefers valid code chunks. One reason for my preference was exactly that users' muscle memory might make them type #| even when they are in a code chunk with a different language that doesn't support # as the comment character (e.g., why did @coatless's student type #| in the sql chunk?). I tried to be forgiving in knitr for this case, i.e., if the language-specific comment-pipe is not found, knitr will fall back to looking for #|.

Supporting #| universally definitely brings extra burden to the editor. The editor needs to know that these (syntactically invalid) lines are special and should not be treated as normal code but metadata. I think that is the main downside. Other than that, using --| or #| for a sql chunk doesn't make a real difference: it's simple for knitr::knit() or knitr::purl() to process comments of either syntax, but it's a different case for Jupyter (I don't think Jupyter understands the comment-pipe thing, right?), which is a secondary downside.

BTW, in our original discussion, I also preferred the traditional CSV-like syntax for chunk options over the YAML syntax because sometimes YAML is hard to please. It's notoriously picky about spaces (fortunately, our YAML hero @cscheid came to rescue with a whole set of tools). That's why knitr also supports both syntaxes. That said, I have no problem with us promoting the YAML syntax as well as the language-specific comment-hash. I agree that we'd better educate users with only one way to do things.

Anyway, no matter we support #| for sql chunks or not, the original problem in this issue can be addressed independently.

@cderv
Copy link
Collaborator

cderv commented Feb 10, 2023

I think the error message definitely needs to be improved to clarify the source of the real error (telling users that "line 2, column 12" doesn't mean the line/column in the whole source document but in the YAML part of a code chunk).

Yes you're right - I agree with that. I opened an issue to track this in knitr as an independent issue

@coatless
Copy link
Contributor Author

coatless commented Feb 11, 2023

First, thank you everyone for looking in-depth at the issue.

I've aimed to reply to the comments with quoted replies:

processing file: test2.qmd
Error:
! Non R chunk should prefix pipe `|` with the comment character from the language for YAML options. You are using 'sql':
  you probably should start your comment with "#| "

I decided to suggest something with an error as otherwise the code chunk will be executed without chunk options being parsed (see PR comment). I believe that it why we support the fallback in the first place, but this won't "teach" users to use the correct comment line.

Do you think that would be better ? @coatless do you find that more useful ?

@cderv yes, this would be far more useful. It provides course staff the ability to more easily understand where to look. It also helps students out as they have a better error message to google.

Only one other question:

Would it be possible to identify the code chunk by a label or the code language type at this point? Or is that not possible either? (Thinking about a polygot document)

One reason for my preference was exactly that users' muscle memory might make them type #| even when they are in a code chunk with a different language that doesn't support # as the comment character (e.g., why did @coatless's student type #| in the sql chunk?). I tried to be forgiving in knitr for this case, i.e., if the language-specific comment-pipe is not found, knitr will fall back to looking for #|.

@yihui I think the use of #| option: value is what is predominately featured throughout the official documentation. There wasn't a section showcasing the use of a language-specific comment, e.g. -- | option: value. The main languages being shown are: R, Python, and Bash. Nothing is being shown on the public quarto site with sql / go / c/ Rcpp or one of knitr's other engines.

Even the first description of the different engines omits a description on language-specific comment usage:

https://quarto.org/docs/visual-editor/technical.html#embedded-code

@yihui
Copy link
Contributor

yihui commented Feb 11, 2023

Would it be possible to identify the code chunk by a label or the code language type at this point? Or is that not possible either?

Here is the error message that I proposed: yihui/knitr#2226 (comment) It can contain the line numbers of the source document indicating where the error comes from.

@cderv cderv assigned cderv and unassigned cscheid Feb 13, 2023
@cderv cderv added upstream Bug is in upstream library documentation Doc improvements & quarto-web labels Feb 13, 2023
@cscheid cscheid added enhancement New feature or request and removed bug Something isn't working labels Mar 10, 2023
@cscheid cscheid added this to the v1.4 milestone Mar 10, 2023
@cderv
Copy link
Collaborator

cderv commented Jan 5, 2024

This is the error we get now when using knitr v1.45

==> quarto preview index.qmd --to html --no-watch-inputs --no-browse

++ Activating rlang global_entrace



processing file: index.qmd
Failed to parse YAML inside code chunk at lines 13-17 (index.qmd):

label: test multiqueries -5
 connection: db
            ^~~~~~

@cderv cderv closed this as completed Jan 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Doc improvements & quarto-web enhancement New feature or request upstream Bug is in upstream library
Projects
None yet
Development

No branches or pull requests

4 participants