Skip to content

Commit

Permalink
Blog post for httr2 1.0.0 (#666)
Browse files Browse the repository at this point in the history
  • Loading branch information
hadley authored Nov 14, 2023
1 parent b812554 commit b025086
Show file tree
Hide file tree
Showing 5 changed files with 782 additions and 0 deletions.
Binary file added content/blog/httr2-1-0-0/httr2.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
288 changes: 288 additions & 0 deletions content/blog/httr2-1-0-0/index.Rmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,288 @@
---
output: hugodown::hugo_document

slug: httr2-1-0-0
title: httr2 1.0.0
date: 2023-11-14
author: Hadley Wickham
description: >
httr2 is the successor to httr, providing a pipeable interface to
generate HTTP requests and handle the responses. It's focussed
on the needs of an R user wrapping a modern web API, but is flexible
enough to handle just about any HTTP related task.
photo:
url: https://unsplash.com/photos/xKShyIiTNJk
author: Mike Bowman

# one of: "deep-dive", "learn", "package", "programming", "roundup", or "other"
categories: [package]
tags: [httr2, httr]
---

```{=html}
<!--
TODO:
* [x] Look over / edit the post's title in the yaml
* [x] Edit (or delete) the description; note this appears in the Twitter card
* [x] Pick category and tags (see existing with `hugodown::tidy_show_meta()`)
* [x] Find photo & update yaml metadata
* [x] Create `thumbnail-sq.jpg`; height and width should be equal
* [x] Create `thumbnail-wd.jpg`; width should be >5x height
* [x] `hugodown::use_tidy_thumbnails()`
* [x] Add intro sentence, e.g. the standard tagline for the package
* [x] `usethis::use_tidy_thanks()`
-->
```
We're delighted to announce the release of [httr2](https://httr2.r-lib.org)[^1] 1.0.0.
httr2 is the second generation of httr: it helps you generate HTTP requests and process the responses, designed with an eye towards modern web APIs and potentially putting your code in a package.

[^1]: Pronounced "hitter 2".

You can install it from CRAN with:

```{r}
#| eval: false
install.packages("httr2")
```

httr2 has been under development for the last two years, but this is the first time we've blogged about it because we've been waiting until the user interface felt stable.
It now does, and we're ready to encourage you to use httr2 whenever you need to talk to a web server.
Most importantly httr2 is now a "real" package because it has a wonderful new logo, thanks to a collaborative effort involving Julie Jung, Greg Swineheart, and DALL•E 3.

```{r}
#| echo: false
#| out-width: 200px
#| fig-alt: >
#| The new httr2 logo is a dark blue hexagon with httr2 written in
#| bright white at the top of logo. Underneath the text is a vibrant
#| magenta baseball player hitting a ball emblazoned with the letters
#| "www".
knitr::include_graphics("httr2.png")
```

httr2 is the successor to httr.
The biggest difference is that it has an explicit request object which you can build up over multiple function calls.
This makes the interface fit more naturally with the pipe, and generally makes life easier because you can iteratively build up a complex request.
httr2 also builds on the 10 years of package development experience we've accrued since creating httr, so it should all around be more enjoyable to use.
If you're a current httr user, there's no need to switch, as we'll continue to maintain the package for many years to come, but if you start on a new project, I'd recommend that you give httr2 a shot.

If you've been following httr2 development for a while, you might want to jump to the [release notes](https://github.com/r-lib/httr2/releases/tag/v1.0.0) to see what's new (a lot!).
The most important change in this release is that [Maximilian Girlich](https://github.com/mgirlich) is now a httr2 author, in recognition of his many contributions to the package.
This release also features improved tools for performing multiple requests (more on that below) and a bunch of bug fixes and minor improvements for OAuth.

For the rest of this blog post, I'll assume that you're familiar with the basics of HTTP.
If you're not, you might want to start with `vignette("httr2")` which introduces you to HTTP using httr2.

```{r setup}
library(httr2)
```

## Making a request

httr2 is designed around the two big pieces of HTTP: requests and responses.
First you'll create a request, with a URL:

```{r}
req <- request(example_url())
req
```

Instead of using an external website, here we're using a test server that's built in to httr2.
This ensures that this blog post, and many httr2 examples, work independently from the rest of the internet.

You can see the HTTP request that httr2 will send, without actually sending it[^2], by doing a dry run:

[^2]: Well, technically, it does send the request, just to another test server that returns the request that it received.

```{r}
req |> req_dry_run()
```

As you can see, this request object will perform a simple `GET` request with automatic user agent and accept headers.

To make more complex requests, you modify the request object with functions that start with `req_`.
For example, you could make it a `HEAD` request, with some query parameters, and a custom user agent:

```{r}
req |>
req_url_query(param = "value") |>
req_user_agent("My user agent") |>
req_method("HEAD") |>
req_dry_run()
```

Or you could send some JSON in the body of the request:

```{r}
req |>
req_body_json(list(x = 1, y = "a")) |>
req_dry_run()
```

httr2 provides a [wide range of `req_` function](https://httr2.r-lib.org/dev/reference/index.html#requests) to customise the request in common ways; if there's something you need that httr2 doesn't support, please [file an issue](https://github.com/r-lib/httr2/issues/new)!

## Performing the request and handling the response

Once you have a request that you are happy with, you can send it to the server with `req_perform()`:

```{r}
req_json <- req |> req_url_path("/json")
resp <- req_json |> req_perform()
```

Performing a request will return a response object (or throw an error, which we'll talk about next).
You can see the basic details of the request by printing it or you can see the raw response with `resp_raw()`[^3]:

[^3]: This is only an approximation.
For example, it only shows the final response if there were redirects, and it automatically uncompresses the body if it was compressed.
Nevertheless, it's still pretty useful.

```{r}
resp
resp |> resp_raw()
```

But generally, you'll want to use the `resp_` functions to extract parts of the response for further processing.
For example, you could parse the JSON body into an R data structure:

```{r}
resp |>
resp_body_json() |>
str()
```

Or get the value of a header:

```{r}
resp |> resp_header("Content-Length")
```

## Error handling

You can use `resp_status()` to see the returned status:

```{r}
resp |> resp_status()
```

But this will almost always be 200, because httr2 automatically follows redirects (statuses in the 300s) and turns HTTP failures (statuses in the 400s and 500s) into R errors.
The following example shows what error handling looks like using an example endpoint that returns a response with the status defined in the URL:

```{r, error = TRUE}
req |>
req_url_path("/status/404") |>
req_perform()
req |>
req_url_path("/status/500") |>
req_perform()
```

Turning HTTP failures into R errors can make debugging hard, so httr2 provides the `last_request()` and `last_response()` helpers which you can use to figure out what went wrong:

```{r}
last_request()
last_response()
```

httr2 provides two other tools to customise error handling:

- `req_error()` gives you full control over what responses should be turned into R errors, and allows you to add additional information to the error message.
- `req_retry()` helps deal with transient errors, where you need to wait a bit and try again. For example, many APIs are rate limited and will return a 429 status if you have made too many requests.

You can learn more about both of these functions in "[Wrapping APIs](https://httr2.r-lib.org/articles/wrapping-apis.html)" as they are particularly important when creating an R package (or script) that wraps a web API.

## Control the request process

There are a number of other `req_` functions that don't directly affect the HTTP request but instead control the overall process of submitting a request and handling the response.
These include:

- `req_cache()`, which sets up a cache so if repeated requests return the same results, and you can avoid a trip to the server.

- `req_throttle()`, which automatically adds a small delay before each request so you can avoid hammering a server with many requests.

- `req_progress()`, which adds a progress bar for long downloads or uploads.

- `req_cookie_preserve()`, which lets you preserve cookies across requests.

Additionally, httr2 provides rich support for authenticating with OAuth, implementing many more OAuth flows than httr.
You've probably used OAuth a bunch without knowing what it's called: you use it when you login to a non-Google website using your Google account, when you give your phone access to your twitter account, or when you login to a streaming app on your smart TV.
OAuth is a big, complex, topic, and is documented in "[OAuth](https://httr2.r-lib.org/articles/oauth.html)".

## Multiple requests

httr2 includes three functions to perform multiple requests:

- `req_perform_sequential()` takes a list of requests and performs them one at a time.

- `req_perform_parallel()` takes a list of requests and performs them in parallel (up to 6 at a time by default).
It's similar to `req_perform_sequential()`, but is obviously faster, at the expense of potentially hammering a server.
It also has some limitations: most importantly it can't refresh an expired OAuth token and it doesn't respect `req_retry()` or `req_throttle()`.

- `req_perform_iterative()` takes a single request and a callback function to generate the next request from previous response.
It'll keep going until the callback function returns `NULL` or `max_reqs` requests have been performed.
This is very useful for paginated APIs that only tell you the URL for the *next* page.

For example, imagine we wanted to download each person from the [Star Wars API](https://swapi.dev). The URLs have a very consistent structure so we can generate a bunch of them, then create the corresponding requests:

```{r}
urls <- paste0("https://swapi.dev/api/people/", 1:10)
reqs <- lapply(urls, request)
```

Now I can perform those requests, collecting a list of responses:

```{r}
resps <- req_perform_sequential(reqs)
```

These responses contain their data in a JSON body:

```{r}
resps |>
_[[1]] |>
resp_body_json() |>
str()
```

There's lots of ways to deal with this sort of data (e.g. for loops or functional programming) but to make life easier, httr2 comes with its own helper, `resps_data()`.
This function takes a callback that retrieves the data for each response, then concatenates all the data into a single object.
In this case, we need to wrap `resp_body_json()` in a list, so we get one list for each person, rather than one list in total:

```{r}
resps |>
resps_data(\(resp) list(resp_body_json(resp))) |>
_[1:3] |>
str(list.len = 10)
```

Another option would be to convert each response into a data frame or tibble.
That's a little tricky here because of the nested lists that will need to become list-columns[^4], so we'll avoid that challenge here by focussing on the first nine columns:

[^4]: To turn these into list-columns, you need to wrap each list in another list, something like `is_list <- map_lgl(json, is.list); json[is_list] <- map(json[is_list], list)`.
This ensures that each element has length 1, the invariant for a row in a tibble.

```{r}
sw_data <- function(resp) {
tibble::as_tibble(resp_body_json(resp)[1:9])
}
resps |> resps_data(sw_data)
```

When you're performing large numbers of requests, it's almost inevitable that something will go wrong.
By default, all three functions will bubble up errors, causing you to lose all of the work that's been done so far.
You can, however, use the `on_error` argument to change what happens, either ignoring errors, or returning when you hit the first error.
This will changes the return value: instead of a list of responses, the list might now also contain error objects.
httr2 provides other helpers to work with this object:

- `resps_successes()` filters the list to find the successful responses. You'll can then pair this with `resps_data()` to get the data from the successful request.
- `resps_failures()` filters the list to find the failed responses. You'll can then pair this with `resps_requests()` to find the requests that generated them and figure out what went wrong,.

## Acknowledgements

A big thanks to all 87 folks who have helped make httr2 possible!

[\@allenbaron](https://github.com/allenbaron), [\@asadow](https://github.com/asadow), [\@atheriel](https://github.com/atheriel), [\@boshek](https://github.com/boshek), [\@casa-henrym](https://github.com/casa-henrym), [\@cderv](https://github.com/cderv), [\@colmanhumphrey](https://github.com/colmanhumphrey), [\@cstjohn810](https://github.com/cstjohn810), [\@cwang23](https://github.com/cwang23), [\@DavidRLovell](https://github.com/DavidRLovell), [\@DMerch](https://github.com/DMerch), [\@dpprdan](https://github.com/dpprdan), [\@ECOSchulz](https://github.com/ECOSchulz), [\@edavidaja](https://github.com/edavidaja), [\@elipousson](https://github.com/elipousson), [\@emmansh](https://github.com/emmansh), [\@Enchufa2](https://github.com/Enchufa2), [\@ErdaradunGaztea](https://github.com/ErdaradunGaztea), [\@fangzhou-xie](https://github.com/fangzhou-xie), [\@fh-mthomson](https://github.com/fh-mthomson), [\@fkohrt](https://github.com/fkohrt), [\@flahn](https://github.com/flahn), [\@gregleleu](https://github.com/gregleleu), [\@guga31bb](https://github.com/guga31bb), [\@gvelasq](https://github.com/gvelasq), [\@hadley](https://github.com/hadley), [\@hongooi73](https://github.com/hongooi73), [\@howardbaek](https://github.com/howardbaek), [\@jameslairdsmith](https://github.com/jameslairdsmith), [\@JBGruber](https://github.com/JBGruber), [\@jchrom](https://github.com/jchrom), [\@jemus42](https://github.com/jemus42), [\@jennybc](https://github.com/jennybc), [\@jimrothstein](https://github.com/jimrothstein), [\@jjesusfilho](https://github.com/jjesusfilho), [\@jjfantini](https://github.com/jjfantini), [\@jl5000](https://github.com/jl5000), [\@jonthegeek](https://github.com/jonthegeek), [\@JosiahParry](https://github.com/JosiahParry), [\@judith-bourque](https://github.com/judith-bourque), [\@juliasilge](https://github.com/juliasilge), [\@kasperwelbers](https://github.com/kasperwelbers), [\@kelvindso](https://github.com/kelvindso), [\@kieran-mace](https://github.com/kieran-mace), [\@KoderKow](https://github.com/KoderKow), [\@lassehjorthmadsen](https://github.com/lassehjorthmadsen), [\@llrs](https://github.com/llrs), [\@lyndon-bird](https://github.com/lyndon-bird), [\@m-mohr](https://github.com/m-mohr), [\@maelle](https://github.com/maelle), [\@maxheld83](https://github.com/maxheld83), [\@mgirlich](https://github.com/mgirlich), [\@MichaelChirico](https://github.com/MichaelChirico), [\@michaelgfalk](https://github.com/michaelgfalk), [\@misea](https://github.com/misea), [\@MislavSag](https://github.com/MislavSag), [\@mkoohafkan](https://github.com/mkoohafkan), [\@mmuurr](https://github.com/mmuurr), [\@multimeric](https://github.com/multimeric), [\@nbenn](https://github.com/nbenn), [\@nclsbarreto](https://github.com/nclsbarreto), [\@nealrichardson](https://github.com/nealrichardson), [\@Nelson-Gon](https://github.com/Nelson-Gon), [\@olivroy](https://github.com/olivroy), [\@owenjonesuob](https://github.com/owenjonesuob), [\@paul-carteron](https://github.com/paul-carteron), [\@pbulsink](https://github.com/pbulsink), [\@ramiromagno](https://github.com/ramiromagno), [\@rplati](https://github.com/rplati), [\@rressler](https://github.com/rressler), [\@samterfa](https://github.com/samterfa), [\@schnee](https://github.com/schnee), [\@sckott](https://github.com/sckott), [\@sebastian-c](https://github.com/sebastian-c), [\@selesnow](https://github.com/selesnow), [\@Shaunson26](https://github.com/Shaunson26), [\@SokolovAnatoliy](https://github.com/SokolovAnatoliy), [\@spotrh](https://github.com/spotrh), [\@stefanedwards](https://github.com/stefanedwards), [\@taerwin](https://github.com/taerwin), [\@vanhry](https://github.com/vanhry), [\@wing328](https://github.com/wing328), [\@xinzhuohkust](https://github.com/xinzhuohkust), [\@yogat3ch](https://github.com/yogat3ch), [\@yogesh-bansal](https://github.com/yogesh-bansal), [\@yutannihilation](https://github.com/yutannihilation), and [\@zacdav-db](https://github.com/zacdav-db).
Loading

0 comments on commit b025086

Please sign in to comment.