Word is better than Quarto (!): embracing the user need #272

Open
matt-dray opened this issue Feb 6, 2025 · 3 comments
Labels
C&C ☕ Session idea for Coffee & Coding

Comments

@matt-dray
Contributor

matt-dray commented Feb 6, 2025

@jacgrout and I have quite a story to tell about populating NHP outputs ('final') reports.

We can discuss:

  • why we chose to populate docx files instead of qmd
  • how we manage to do this reproducibly in the very important, high-stakes nhp_final_reports repo
  • how we use {officer} and 'custom document properties' to update a Word template file
  • how we've read data from SharePoint, Azure and Posit Connect pins to achieve this
  • how we make use of Azure metadata and a Posit Connect app to keep information in sync with schemes
  • how we've adapted the repo in an agile manner as needs have changed 🤸‍♀️
  • how this is a better solution for our users than 'simply' telling them to 'use a Quarto document'
  • how tracking things in issues and PRs has saved the day more than once (e.g. a recent PR reversion and how recording of a 'final' scenario process helped untangle some confusion)
  • other various pains along the way
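
As a rough illustration of the {officer} and 'custom document properties' point above, here is a minimal sketch. It assumes a blank document in place of the real Word template, and the property names and values (`item_01`, `item_02`) are placeholders for illustration only; the code in nhp_final_reports may well be organised differently.

```r
library(officer)

# Assumption: a blank document stands in for the real Word template.
doc <- read_docx()

# Add a paragraph with an in-text field that points at a custom property.
# In the real template these fields would already exist in the document.
doc <- body_add_fpar(
  doc,
  fpar("Total activity: ", run_word_field("DOCPROPERTY item_01"))
)

# Store values as custom document properties. Named arguments that are not
# built-in properties (title, creator, etc.) become custom properties.
# The values here are invented for illustration.
doc <- set_doc_properties(doc, item_01 = "1,234", item_02 = "5.6%")

# Write the populated document to a temporary location.
out_path <- file.path(tempdir(), "report.docx")
print(doc, target = out_path)
```

The fields only show the new values once they are refreshed in Word (for example by selecting all and pressing F9), so a manual step remains after the file is written.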
@matt-dray matt-dray added the C&C ☕ Session idea for Coffee & Coding label Feb 6, 2025
@matt-dray matt-dray self-assigned this Feb 6, 2025
@ChrisBeeley
Member

Yes please!

@jacgrout
Member

jacgrout commented Feb 7, 2025

It's just mail merge for the 2020s ;-)

@francisbarton francisbarton moved this from Potential to Scheduled in Coffee and Coding ☕🧑‍💻 Feb 11, 2025
@matt-dray
Contributor Author

matt-dray commented Feb 14, 2025

Rough structure:

  1. The need: for each NHP scheme, generate plots and calculate values and insert them into a report. The outputs app presents results and the doc needs some of that, but also we need to supply some bespoke content that comes from the underlying data. Important because this presents the formal outcome of modelling. Users: SU's NHP model relationship managers, who help schemes finesse the content. Technically the user is the scheme, but we're serving a user one step back from that (MRMs).
  2. First idea: this should be Quarto. Full control, can parameterise for each scheme, can make it look pretty.
  3. Reality: the Word doc is long and complicated and its content is continually changing. It has to go back and forth to the clients and comments have to be made on the document. We fill a template that then needs to be expanded and will end up looking different between schemes.
  4. Solution: write code to generate content and then insert it into the template with {officer}.
  5. R code: given a scheme code, fetch its results (relies on run_stage metadata) and preferred sites (Azure json); generate plots and values; read docx from SharePoint; insert plots at a 'cursor' location found by identifying target text ('[Insert Image 1]'); insert values into the document as custom properties that populate in-text fields ({ DOCPROPERTY item_01 }) on refresh; generate a uniquely-named and dated folder with all the plots, values, docx and log. Code at https://github.com/The-Strategy-Unit/nhp_final_reports, run an example. Mention: {officer}, {Microsoft365R}, {AzureStor}, {logr}, {cli}. Code had to be extracted from the app and amended because of differences in model versions not being accounted for.
  6. Governance: add an issue template when there's a new request in the DS inbox; work through it. Working together. Apprehension about doing PRs. Power of doing PRs and spotting errors/QA. Learning from each other through the process. Walked through a reversion. Doing the GitHub thing 'properly' was helpful to make a solid product/service/process (seemed 'pedantic' at first). The benefit of documenting on GitHub, e.g. noting that a certain scenario had a different name because it was re-run on a newer version of the model.
  7. Strengths: reproducibility; way quicker than doing it manually; only one user-facing function; minimal invasiveness into a template we don't 'own'; flexibility to read arbitrary results. Lots of ad hoc sub-products have been produced from the code; it's helped to fast-track some other analysis. Sites and metadata run-stage tagging were a result of this process. Can generate outputs without the report (can thumb through results without opening Word) and can run arbitrary scenarios. Flexibility/agility means that upcoming 'addendum reports' or other ad hoc work can fit into the process as it exists without too much disruption and with a lot of the legwork already done.
  8. Fragility: the fields and plot-insert locations can't be messed with; literally changing the name of the template can break it; plots have to be written to file then inserted into the template; manual step of refreshing the docx; developed iteratively with shifting needs, so code could be refactored; copies outputs code. The meaning of 'final' can change, so be careful.
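
The plot-insertion and dated-folder parts of step 5 could be sketched roughly as below. This is an illustration under stated assumptions, not the code from nhp_final_reports: the template is built locally rather than read from SharePoint, the plot is a dummy, the scheme code `XYZ` and the folder naming pattern are hypothetical, and replacing the placeholder paragraph (`pos = "on"`) is one possible choice rather than a confirmed detail of the real process.

```r
library(officer)

# Assumption: a stand-in template with the target text from step 5.
template_path <- file.path(tempdir(), "template.docx")
template <- read_docx()
template <- body_add_par(template, "Figure 1 appears below.")
template <- body_add_par(template, "[Insert Image 1]")
print(template, target = template_path)

# Plots are written to file first, because images are inserted from disk.
plot_path <- file.path(tempdir(), "plot_01.png")
png(plot_path, width = 600, height = 400)
plot(1:10, main = "Placeholder plot")
dev.off()

doc <- read_docx(template_path)

# Move the cursor to the paragraph containing the target text.
# The keyword is a regular expression, so the square brackets are escaped.
doc <- cursor_reach(doc, keyword = "\\[Insert Image 1\\]")

# Insert the image, replacing the placeholder paragraph (an assumption).
doc <- body_add_img(doc, src = plot_path, width = 6, height = 4, pos = "on")

# Hypothetical naming pattern for a uniquely named, dated output folder.
stamp <- format(Sys.time(), "%Y-%m-%d-%H%M%S")
out_dir <- file.path(tempdir(), paste0(stamp, "_scheme-XYZ"))
dir.create(out_dir, recursive = TRUE)

report_path <- file.path(out_dir, "report.docx")
print(doc, target = report_path)
```

The sketch also shows the fragility noted in step 8: if the target text in the template is edited or removed, `cursor_reach()` has nothing to find and the insertion fails.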
