From 4043f14f694be131074fb5aea36d0e8921f9d254 Mon Sep 17 00:00:00 2001 From: Jerry Wu Date: Sat, 14 Dec 2024 07:45:34 +0800 Subject: [PATCH] Add blog post demonstrating the usage of `GT.fmt_image()` and `vals.fmt_image()` (#558) * Add blog post demonstrating the usage of `GT.fmt_image()` and `vals.fmt_image()` * Update docstring for `path` parameter in `GT.fmt_image()` * Update the blog based on team feedback --- docs/blog/rendering-images/index.qmd | 365 +++++++++++++++++++++++++++ great_tables/_formats.py | 3 +- 2 files changed, 367 insertions(+), 1 deletion(-) create mode 100644 docs/blog/rendering-images/index.qmd diff --git a/docs/blog/rendering-images/index.qmd b/docs/blog/rendering-images/index.qmd new file mode 100644 index 000000000..4392e3038 --- /dev/null +++ b/docs/blog/rendering-images/index.qmd @@ -0,0 +1,365 @@ +--- +title: "Rendering images anywhere in Great Tables" +html-table-processing: none +author: Jerry Wu +date: 2024-12-13 +freeze: true +jupyter: python3 +format: + html: + code-summary: "Show the Code" +--- + +Rendering images in Great Tables is straightforward with `GT.fmt_image` and `vals.fmt_image()`. +In this post, we'll explore three key topics: + +* Four examples demonstrating how to render images within the body using `GT.fmt_image()`. +* How to render images anywhere using `vals.fmt_image()` and `html()`. +* How to manually render images anywhere using `html()`. + +## Rendering Images in the Body +[GT.fmt_image()](https://posit-dev.github.io/great-tables/reference/GT.fmt_image.html#great_tables.GT.fmt_image) +is the go-to tool for rendering images within the body of a table. Below, we'll present four examples +corresponding to the cases outlined in the documentation: + +* **Case 1**: Local file paths. +* **Case 2**: Full HTTP/HTTPS URLs. +* **Case 3**: Image names with the `path=` argument. +* **Case 4**: Image names using both the `path=` and `file_pattern=` arguments. + +::: {.callout-tip collapse="false"} + +## Finding the Right Case for Your Needs + +* **Case 1** and **Case 2** work best for data sourced directly from a database. +* **Case 3** is ideal for users dealing with image names relative to a base directory or URL (e.g., `/path/to/images`). +* **Case 4** is tailored for users working with patterned image names (e.g., `metro_{}.svg`). +::: + +### Preparations +For this demonstration, we'll use the first five rows of the built-in [metro](https://posit-dev.github.io/great-tables/reference/data.metro.html#great_tables.data.metro) dataset, specifically the `name` and `lines` columns. + +To ensure a smooth walkthrough, we’ll manipulate the `data` (a Python dictionary) directly. However, +in real-world applications, such operations are more likely performed at the DataFrame level to leverage +the benefits of vectorized operations. +```{python} +# | code-fold: true +import pandas as pd +from great_tables import GT, vals, html +from importlib_resources import files + +pd.set_option('display.max_colwidth', 150) + +data = { + "name": [ + "Argentine", + "Bastille", + "Bérault", + "Champs-Élysées—Clemenceau", + "Charles de Gaulle—Étoile", + ], + "lines": ["1", "1, 5, 8", "1", "1, 13", "1, 2, 6"], +} + +print("""\ +data = { + "name": [ + "Argentine", + "Bastille", + "Bérault", + "Champs-Élysées—Clemenceau", + "Charles de Gaulle—Étoile", + ], + "lines": ["1", "1, 5, 8", "1", "1, 13", "1, 2, 6"], +}\ +""") +``` + +Attentive readers may have noticed that the values for the key `lines` are lists of strings, each +containing one or more numbers separated by commas. `GT.fmt_image()` is specifically designed to +handle such cases, allowing users to render multiple images in a single row. + +### Case 1: Local File Paths +**Case 1** demonstrates how to simulate a column containing strings representing local file paths. We'll +use images stored in the `data/metro_images` directory of Great Tables: +```{python} +img_local_paths = files("great_tables") / "data/metro_images" # <1> +``` +1. These image files follow a patterned naming convention, such as `metro_1.svg`, `metro_2.svg`, and so on. + +Below is a `Pandas` DataFrame called `metro_mini1`, where the `case1` column contains local file +paths that we want to render as images. +```{python} +# | code-fold: true +metro_mini1 = pd.DataFrame( + { + **data, + "case1": [ + ", ".join( + str((img_local_paths / f"metro_{item}").with_suffix(".svg")) + for item in row.split(", ") + ) + for row in data["lines"] + ], + } +) +metro_mini1 +``` + +::: {.callout-tip collapse="false"} +## Use the `pathlib` Module to Construct Paths + +Local file paths can vary depending on the operating system, which makes it easy to accidentally +construct invalid paths. A good practice to mitigate this is to use Python's built-in +[pathlib](https://docs.python.org/3/library/pathlib.html) module to construct paths first and then +convert them to strings. In this example, `img_local_paths` is actually an instance of `pathlib.Path`. +```{python} +# | eval: false +from pathlib import Path + +isinstance(img_local_paths, Path) # True +``` + +::: + +The `case1` column is quite lengthy due to the inclusion of `img_local_paths`. In **Case 3**, we'll +share a useful trick to avoid repeating the directory name each time—stay tuned! + +For now, let's use `GT.fmt_image()` to render images by passing `"case1"` as the first argument: +```{python} +GT(metro_mini1).fmt_image("case1").cols_align(align="right", columns="case1") +``` + +### Case 2: Full HTTP/HTTPS URLs +**Case 2** demonstrates how to simulate a column containing strings representing HTTP/HTTPS URLs. We'll +use the same images as in **Case 1**, but this time, retrieve them from the Great Tables GitHub repository: +```{python} +img_url_paths = "https://raw.githubusercontent.com/posit-dev/great-tables/refs/heads/main/great_tables/data/metro_images" +``` + +Below is a `Pandas` DataFrame called `metro_mini2`, where the `case2` column contains +full HTTP/HTTPS URLs that we aim to render as images. +```{python} +# | code-fold: true +metro_mini2 = pd.DataFrame( + { + **data, + "case2": [ + ", ".join(f"{img_url_paths}/metro_{item}.svg" for item in row.split(", ")) + for row in data["lines"] + ], + } +) +metro_mini2 +``` + +The lengthy `case2` column issue can also be addressed using the trick shared in **Case 3**. + +Similarly, we can use `GT.fmt_image()` to render images by passing `"case2"` as the first argument: +```{python} +GT(metro_mini2).fmt_image("case2").cols_align(align="right", columns="case2") +``` + + +### Case 3: Image Names with the `path=` Argument +**Case 3** demonstrates how to use the `path=` argument to specify images relative to a base directory +or URL. This approach eliminates much of the repetition in file names, offering a solution to the +issues in **Case 1** and **Case 2**. + +Below is a `Pandas` DataFrame called `metro_mini3`, where the `case3` column contains file names that +we aim to render as images. +```{python} +# | code-fold: true +metro_mini3 = pd.DataFrame( + { + **data, + "case3": [ + ", ".join(f"metro_{item}.svg" for item in row.split(", ")) for row in data["lines"] + ], + } +) +metro_mini3 +``` + +Now we can use `GT.fmt_image()` to render the images by passing `"case3"` as the first argument and +specifying either `img_local_paths` or `img_url_paths` as the `path=` argument: +```{python} +# equivalent to `Case 1` +( + GT(metro_mini3) + .fmt_image("case3", path=img_local_paths) + .cols_align(align="right", columns="case3") +) + +# equivalent to `Case 2` +( + GT(metro_mini3) + .fmt_image("case3", path=img_url_paths) + .cols_align(align="right", columns="case3") +) +``` + +After exploring **Case 1** and **Case 2**, you’ll likely appreciate the functionality of the `path=` +argument. However, manually constructing file names can still be a bit tedious. If your file names +follow a consistent pattern, the `file_pattern=` argument can simplify the process. Let’s see how +this works in **Case 4** below. + +### Case 4: Image Names Using Both the `path=` and `file_pattern=` Arguments +**Case 4** demonstrates how to use `path=` and `file_pattern=` to specify images with names following +a common pattern. For example, you could use `file_pattern="metro_{}.svg"` to reference images like +`metro_1.svg`, `metro_2.svg`, and so on. + +Below is a `Pandas` DataFrame called `metro_mini4`, where the `case4` column contains a copy of +`data["lines"]`, which we aim to render as images. +```{python} +# | code-fold: true +metro_mini4 = pd.DataFrame({**data, "case4": data["lines"]}) +metro_mini4 +``` + +First, define a string pattern to illustrate the file naming convention, using `{}` to indicate the +variable portion: +```{python} +file_pattern = "metro_{}.svg" +``` + +Next, pass `"case4"` as the first argument, along with `img_local_paths` or `img_url_paths` as the +`path=` argument, and `file_pattern` as the `file_pattern=` argument. This allows `GT.fmt_image()` +to render the images: +```{python} +# equivalent to `Case 1` +( + GT(metro_mini4) + .fmt_image("case4", path=img_local_paths, file_pattern=file_pattern) + .cols_align(align="right", columns="case4") +) + +# equivalent to `Case 2` +( + GT(metro_mini4) + .fmt_image("case4", path=img_url_paths, file_pattern=file_pattern) + .cols_align(align="right", columns="case4") +) +``` + + +::: {.callout-warning collapse="true"} +## Using `file_pattern=` Independently + +The `file_pattern=` argument is typically used in conjunction with the `path=` argument, but this +is not a strict rule. If your local file paths or HTTP/HTTPS URLs follow a pattern, you can use +`file_pattern=` alone without `path=`. This allows you to include the shared portion of the file +paths or URLs directly in `file_pattern`, as shown below: +```{python} +file_pattern = str(img_local_paths / "metro_{}.svg") +( + GT(metro_mini4) + .fmt_image("case4", file_pattern=file_pattern) + .cols_align(align="right", columns="case4") +) +``` + +::: + +**Case 4** is undoubtedly one of the most powerful features of Great Tables. While mastering it may +take some practice, we hope this example helps you render images effortlessly and effectively. + +## Rendering Images Anywhere +While `GT.fmt_image()` is primarily designed for rendering images in the table body, what if you +need to display images in other locations, such as the header? In such cases, you can turn to the versatile +[vals.fmt_image()](https://posit-dev.github.io/great-tables/reference/vals.fmt_image.html#great_tables.vals.fmt_image). + +`vals.fmt_image()` is a hidden gem in Great Tables. Its usage is similar to `GT.fmt_image()`, but +instead of working directly with DataFrame columns, it lets you pass a string or a list of strings +as the first argument, returning a list of strings, each representing an image. You can then wrap +these strings with [html()](https://posit-dev.github.io/great-tables/reference/html.html#great_tables.html), +allowing Great Tables to render the images anywhere in the table. + +### Preparations +We will create a `Pandas` DataFrame named `metro_mini` using the `data` dictionary. This will be used +for demonstration in the following examples: +```{python} +# | code-fold: true +metro_mini = pd.DataFrame(data) +metro_mini +``` + +### Single Image +This example shows how to render a valid URL as an image in the title of the table header: +```{python} +logo_url = "https://posit-dev.github.io/great-tables/assets/GT_logo.svg" + +_gt_logo, *_ = vals.fmt_image(logo_url, height=100) # <1> +gt_logo = html(_gt_logo) + +( + GT(metro_mini) + .fmt_image("lines", path=img_url_paths, file_pattern="metro_{}.svg") + .tab_header(title=gt_logo) + .cols_align(align="right", columns="lines") + .opt_stylize(style=4, color="gray") +) +``` +1. `vals.fmt_image()` returns a list of strings. Here, we use tuple unpacking to extract the first +item from the list. + +### Multiple Images +This example demonstrates how to render two valid URLs as images in the title and subtitle of the +table header: +```{python} +logo_urls = [ + "https://posit-dev.github.io/great-tables/assets/GT_logo.svg", + "https://raw.githubusercontent.com/rstudio/gt/master/images/dataset_metro.svg", +] + +_gt_logo, _metro_logo = vals.fmt_image(logo_urls, height=100) # <1> +gt_logo, metro_logo = html(_gt_logo), html(_metro_logo) + +( + GT(metro_mini) + .fmt_image("lines", path=img_url_paths, file_pattern="metro_{}.svg") + .tab_header(title=gt_logo, subtitle=metro_logo) + .cols_align(align="right", columns="lines") + .opt_stylize(style=4, color="gray") +) +``` +1. Note that if you need to render images with different `height` or `width`, you might need to make +two separate calls to `vals.fmt_image()`. + +## Manually Rendering Images Anywhere +Remember, you can always use `html()` to manually construct your desired output. For example, the +previous table can be created without relying on `vals.fmt_image()` like this: +```{python} +# | eval: false +( + GT(metro_mini) + .fmt_image("lines", path=img_url_paths, file_pattern="metro_{}.svg") + .tab_header( + title=html(f''), + subtitle=html(f''), + ) + .cols_align(align="right", columns="lines") + .opt_stylize(style=4, color="gray") +) +``` + +Alternatively, you can manually encode the image using Python's built-in +[base64](https://docs.python.org/3/library/base64.html) module, specify the appropriate MIME type +and HTML attributes, and then wrap it in `html()` to display the table. + +## Final Words +In this post, we focused on the most common use cases for rendering images in Great Tables, deliberately +avoiding excessive DataFrame operations. Including such details could have overwhelmed the post with +examples of string manipulations and the complexities of working with various DataFrame libraries. + +We hope you found this guide helpful and enjoyed the structured approach. Until next time, happy +table creation with Great Tables! + +::: {.callout-note} +## Appendix: Related PRs + +If you're interested in the recent enhancements we've made to image rendering, be sure to check out +[#444](https://github.com/posit-dev/great-tables/pull/444), +[#451](https://github.com/posit-dev/great-tables/pull/451) and +[#520](https://github.com/posit-dev/great-tables/pull/520) for all the details. +::: diff --git a/great_tables/_formats.py b/great_tables/_formats.py index 5c4bf0832..91157e985 100644 --- a/great_tables/_formats.py +++ b/great_tables/_formats.py @@ -3688,7 +3688,8 @@ def fmt_image( In the output of images within a body cell, `sep=` provides the separator between each image. path - An optional path to local image files (this is combined with all filenames). + An optional path to local image files or an HTTP/HTTPS URL. + This is combined with the filenames to form the complete image paths. file_pattern The pattern to use for mapping input values in the body cells to the names of the graphics files. The string supplied should use `"{}"` in the pattern to map filename fragments to