Word is better than Quarto (!): embracing the user need #272

matt-dray · 2025-02-06T22:02:41Z

Me and @jacgrout have quite a story to tell about populating NHP outputs ('final') reports.

We can discuss:

why we chose to populate docx files instead of qmd
how we manage to do this reproducibly in the very important, high-stakes nhp_final_reports repo
how we use {officer} and 'custom document properties' to update a Word template file
how we've read data from SharePoint, Azure and Posit Connect pins to achieve this
how we make use of Azure metadata and a Posit Connect app to keep information in sync with schemes
how we've adapted the repo in an agile manner as needs have changed 🤸‍♀️
how this is a better solution for our users than 'simply' telling them to 'use a Quarto document'
how tracking things in issues and PRs has saved the day more than once (e.g. a recent PR reversion and how recording of a 'final' scenario process helped untangle some confusion)
other various pains along the way

ChrisBeeley · 2025-02-06T22:03:28Z

Yes please!

jacgrout · 2025-02-07T08:58:10Z

It's just mail merge for the 2020s ;-)

matt-dray · 2025-02-14T16:40:55Z

Rough structure:

The need: for each NHP scheme, generate plots and calculate values and insert them into a report. The outputs app presents results and the doc needs some of that, but also we need to supply some bespoke content that comes from the underlying data. Important because this presents the formal outcome of modelling. Users: SU's NHP model relationship managers, who help schemes finesse the content. Technically the user is the scheme, but we're serving a user one step back from that (MRMs).
First idea: this should be Quarto. Full control, can parameterise for each scheme, can make it look pretty.
Reality: the Word doc is long and complicated and its content is continually changing. It has to go back and forth to the clients and comments have to be made on the document. We fill a template that then needs to be expanded and will end up looking different between schemes.
Solution: write code to generate content and then insert it into the template with {officer}.
R code: given a scheme code, fetch the scheme's results (relies on run_stage metadata) and preferred sites (Azure json); generate plots and values; read docx from SharePoint; insert plots at a 'cursor' location found by identifying target text ('[Insert Image 1]'); insert values into the document as custom properties that populate in-text fields ({ DOCPROPERTY item_01 }) on refresh; generate a uniquely-named and dated folder with all the plots, values, docx and log. Code at https://github.com/The-Strategy-Unit/nhp_final_reports, run an example. Mention: {officer}, {Microsoft365R}, {AzureStor}, {logr}, {cli}. Code had to be extracted from the app and amended because differences between model versions were not being accounted for.
Governance: add an issue template when there's a new request in the DS inbox; work through it. Working together. Apprehension about doing PRs. Power of doing PRs and spotting errors/QA. Learning from each other through the process. Walked through a reversion. Doing the GitHub thing 'properly' was helpful to make a solid product/service/process (seemed 'pedantic' at first). The benefit of documenting on GitHub, e.g. noting that a certain scenario had a different name because it was re-run on a newer version of the model.
Strengths: reproducibility; way quicker than doing it manually; only one user-facing function; minimal invasiveness into a template we don't 'own'; flexibility to read arbitrary results. Lots of ad hoc sub-products have been produced from the code; it's helped to fast-track some other analysis. Sites and metadata run-stage tagging were a result of this process. Can generate outputs without the report (can thumb through results without opening Word) and can run arbitrary scenarios. Flexibility/agility means that upcoming 'addendum reports' or other ad hoc work can fit into the process as it exists without too much disruption and with a lot of the legwork already done.
Fragility: the fields and plot-insert locations can't be messed with; literally changing the name of the template can break it; plots have to be written to file then inserted into the template; manual step of refreshing the docx; developed iteratively with shifting needs, so code could be refactored; duplicates code from the outputs app. The meaning of 'final' can change, so be careful.
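
The 'R code' step above could be sketched roughly as follows. This is a minimal illustration, not the code in nhp_final_reports: it assumes a recent version of {officer} (one where set_doc_properties() accepts custom properties as named arguments), builds a throwaway stand-in template locally rather than reading the real one from SharePoint, and uses a hypothetical scheme code ("XYZ"), a hypothetical value and a placeholder plot.

```r
library(officer)

# Stand-in for the real template, which would be read from SharePoint.
# The real template is assumed to already contain the target text and
# the { DOCPROPERTY item_01 } fields.
template_path <- tempfile(fileext = ".docx")
template <- read_docx()
template <- body_add_par(template, "Results for the scheme")
template <- body_add_par(template, "[Insert Image 1]")
print(template, target = template_path)

# Plots have to be written to file before they can be inserted.
plot_path <- tempfile(fileext = ".png")
png(plot_path, width = 600, height = 400)
plot(1:10, main = "Placeholder plot")
dev.off()

doc <- read_docx(template_path)

# Move the 'cursor' to the paragraph containing the target text.
# cursor_reach() matches a regular expression, so brackets are escaped.
doc <- cursor_reach(doc, keyword = "\\[Insert Image 1\\]")

# pos = "on" replaces the target paragraph with the image.
doc <- body_add_img(doc, src = plot_path, width = 6, height = 4, pos = "on")

# Store a value as a custom document property. In Word, a
# { DOCPROPERTY item_01 } field shows this value once fields are refreshed.
doc <- set_doc_properties(doc, item_01 = "12,345")  # hypothetical value

# Write everything to a uniquely-named, dated folder.
scheme_code <- "XYZ"  # hypothetical scheme code
out_dir <- file.path(
  tempdir(),
  paste0(format(Sys.time(), "%Y-%m-%d-%H%M%S"), "_", scheme_code)
)
dir.create(out_dir, showWarnings = FALSE)
report_path <- file.path(out_dir, "report.docx")
print(doc, target = report_path)
```

Opening report.docx and refreshing its fields is still a manual step, which matches the fragility noted above.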

matt-dray added the C&C ☕ Session idea for Coffee & Coding label Feb 6, 2025

matt-dray self-assigned this Feb 6, 2025

github-project-automation bot added this to Coffee and Coding ☕🧑‍💻 Feb 6, 2025

github-project-automation bot moved this to Potential in Coffee and Coding ☕🧑‍💻 Feb 6, 2025

francisbarton moved this from Potential to Scheduled in Coffee and Coding ☕🧑‍💻 Feb 11, 2025
