646: Detect instance() expressions in notes and make them into outputs #648

Merged: 3 commits into XLSForm:master on Aug 29, 2023

@lindsay-stevens lindsay-stevens commented Aug 3, 2023

Closes #646

Why is this the best possible solution? Were any other approaches considered?

In a note label, users can show values or metadata for other form items using pyxform reference syntax, e.g. ${q1}. If that reference was inside an instance() expression (e.g. to get the value of a secondary instance label), the conversion would convert only the pyxform token to an output, but it should convert the whole expression.

The new implementation works for pyxform references, and for (nested) instance expressions with or without a pyxform reference (e.g. in an XPath predicate). It uses a similar approach to the detection of dynamic labels, where an expression lexer looks for certain grammar tokens or sequences. The lexer approach is useful here because a regex that accurately parses instance expressions is either quite complicated or not possible, particularly considering the wide variety of XPath expressions that some users likely employ now via the workaround, or will plan to use after this fix.
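
To illustrate the lexer idea, here is a minimal sketch. It is not the code in this PR: `TOKEN_RE`, `find_instance_spans` and `wrap_instance_outputs` are hypothetical names, the token grammar is far smaller than real XPath, and the output is built as a plain string rather than as XML nodes. It assumes instance names are single-quoted and that predicates contain no `]` inside string literals.

```python
import re
from typing import List, Tuple

# Hypothetical, heavily simplified token grammar (not pyxform's actual lexer).
TOKEN_RE = re.compile(
    r"""
    (?P<INSTANCE>instance\(\s*'[^']*'\s*\))   # instance('name')
  | (?P<PYXFORM_REF>\$\{[^}]+\})              # ${question}
  | (?P<OPEN_BRACKET>\[)                      # predicate start
  | (?P<CLOSE_BRACKET>\])                     # predicate end
  | (?P<PATH>/[A-Za-z_][\w.-]*)               # /step
  | (?P<WHITESPACE>\s+)
  | (?P<OTHER>.)                              # anything else, one char at a time
    """,
    re.VERBOSE,
)


def find_instance_spans(text: str) -> List[Tuple[int, int]]:
    """Return (start, end) offsets of top-level instance() expressions."""
    spans = []
    start = None  # offset where the current expression began, if any
    end = 0
    depth = 0  # predicate bracket nesting depth
    for match in TOKEN_RE.finditer(text):
        kind = match.lastgroup
        if start is not None:
            # Inside a predicate everything belongs to the expression;
            # outside, only path steps and new predicates extend it.
            if depth > 0 or kind in ("PATH", "OPEN_BRACKET"):
                if kind == "OPEN_BRACKET":
                    depth += 1
                elif kind == "CLOSE_BRACKET":
                    depth -= 1
                end = match.end()
                continue
            spans.append((start, end))
            start = None
        if kind == "INSTANCE":
            start, end, depth = match.start(), match.end(), 0
    # Only keep a trailing expression if its brackets are balanced.
    if start is not None and depth == 0:
        spans.append((start, end))
    return spans


def wrap_instance_outputs(text: str) -> str:
    """Wrap each detected expression in an <output/> element (no XML escaping)."""
    parts = []
    pos = 0
    for start, end in find_instance_spans(text):
        parts.append(text[pos:start])
        parts.append(f'<output value="{text[start:end]}"/>')
        pos = end
    parts.append(text[pos:])
    return "".join(parts)
```

Tracking bracket depth is what lets a nested instance expression or a `${q1}` reference inside a predicate stay part of the enclosing expression, which is the part that is awkward to express as a single regex.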

Maybe at some stage this element of pyxform should be upgraded to use something like pyparsing. There is a fair amount of parsing in pyxform which could be consolidated. It may also allow for warning users of potentially invalid expressions. Example XPath parsing code using pyparsing can be found in the arelle and xpyth_parser projects, both of which happen to target XPath for XBRL purposes (no significance to XLSForm, just a coincidence).

What are the regression risks?

This slots into survey.py as an extra text processing layer within the existing pyxform reference replacement code, so the risk should be minimal. If there is a bug in this new code, though, it may prevent form conversion.

Does this change require updates to documentation? If so, please file an issue here and include the link below.

It would probably be welcome news on the forum. It seems like something that users would otherwise assume already works.

Before submitting this PR, please make sure you have:

  • included test cases for core behavior and edge cases in tests
  • run nosetests and verified all tests pass
  • run black pyxform tests to format code
  • verified that any code or assets from external sources are properly credited in comments

- join error message written as 2 strings on one line into 1 string
- move regexes to module level and compile once for performance
- avoid variables shadowing names in outer scope (nested funcs)
- expand unnecessary double assignment
- add type hints and docstrings
- in a note label, users can show values or metadata for other form
  items using pyxform reference syntax e.g. ${q1}. If that reference was
  inside an instance call (e.g. to get the value of a secondary instance
  label), then the conversion would convert the pyxform token, but it
  should convert the whole expression.
- new implementation works for pyxform references, and nested
  instance expressions using pyxform references. It uses a similar
  approach to the detection of dynamic labels, where an expression lexer
  looks for certain grammar tokens or sequences. Maybe at some stage
  this should be upgraded to use something like `pyparsing`.
- if an instance expression appears in a label, in order for it to be
  evaluated and replaced it needs to be converted to an output node.
  The trigger for that was only a pyxform reference, but it's possible
  that users may want an expression that doesn't use a reference.
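
The widened trigger described in the last commit message could be sketched roughly as below. This is an assumption-laden illustration, not the merged code: `PYXFORM_REF_RE`, `INSTANCE_CALL_RE` and `label_needs_output` are hypothetical names, and the regexes are only a cheap pre-check for "is there anything to convert", not a parser for the expression itself. The patterns are compiled once at module level, in the spirit of the first commit.

```python
import re

# Hypothetical patterns, compiled once at module level.
PYXFORM_REF_RE = re.compile(r"\$\{[^}]+\}")
INSTANCE_CALL_RE = re.compile(r"\binstance\(\s*'[^']*'\s*\)")


def label_needs_output(label: str) -> bool:
    """True if the label contains a pyxform reference or an instance() call."""
    return bool(PYXFORM_REF_RE.search(label) or INSTANCE_CALL_RE.search(label))
```

With a check like this, a label such as `instance('fruits')/root/item[1]/label` would be routed to output conversion even though it contains no `${...}` reference.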
@lindsay-stevens

An example from the docs (External XML data) that currently doesn't seem to be parsed as expected for note output is putting an instance expression inside another function call. Maybe out of scope? It should be possible though.

  • input: count(instance('houses')/house[rooms = current()/../rooms ])
  • output: <label> count(<output value="instance('houses')/house[rooms = current()/../rooms"/> ]) </label>


@lognaturel lognaturel left a comment


Nice! 🤩

@lognaturel lognaturel merged commit 80ebeb1 into XLSForm:master Aug 29, 2023
10 checks passed
@lognaturel

> putting an instance expression inside another function call

Good catch. I think considering it out of scope for now is reasonable. We can see whether we use it for our own forms and/or get requests for it. For computed values like that my sense is that it generally makes sense to actually have the value in the form so it's clear what the user saw.

> Maybe at some stage this element of pyxform should be upgraded to use something like pyparsing

Maybe tree-sitter would be a good option here? @eyelidlessness is likely to introduce it in Enketo for similar usage and it would be amazing to be able to share a grammar. I think it could be used in Central and Collect as well.

Successfully merging this pull request may close these issues.

Detect instance() expressions in notes and make them into outputs
2 participants