From f0d566220e0b913f40bbdcf027b562d640aecc49 Mon Sep 17 00:00:00 2001 From: vincentarelbundock Date: Tue, 23 Jan 2024 02:39:29 +0000 Subject: [PATCH] =?UTF-8?q?Deploying=20to=20gh-pages=20from=20@=20vincenta?= =?UTF-8?q?relbundock/modelsummary@abc04691a0dd68e8c2318cf98017704ed00a104?= =?UTF-8?q?6=20=F0=9F=9A=80?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- search.json | 14 +- vignettes/appearance.html | 646 ++--- vignettes/datasummary.html | 5004 +++++++++++++++++------------------ vignettes/modelsummary.html | 4958 +++++++++++++++++----------------- 4 files changed, 5311 insertions(+), 5311 deletions(-) diff --git a/search.json b/search.json index 9e29cbd0e..7d98e4e3a 100644 --- a/search.json +++ b/search.json @@ -340,7 +340,7 @@ "href": "vignettes/datasummary.html", "title": "Data Summaries", "section": "", - "text": "datasummary is a function from the modelsummary package. It allows us to create data summaries, frequency tables, crosstabs, correlation tables, balance tables (aka “Table 1”), and more. It has many benefits:\n\nEasy to use.\nExtremely flexible.\nMany output formats: HTML, LaTeX, Microsoft Word and Powerpoint, Text/Markdown, PDF, RTF, or Image files.\nEmbed tables in Rmarkdown or knitr dynamic documents.\nCustomize the appearance of tables with the gt, kableExtra or flextable packages. The possibilities are endless!\n\nThis tutorial will show how to draw tables like these (and more!):\n\n \n\n\n\ndatasummary is built around the fantastic tables package for R. It is a thin “wrapper” which adds convenience functions and arguments; a user-interface consistent with modelsummary; cleaner html output; and the ability to export tables to more formats, including gt tables, flextable objects, and Microsoft Word documents.\ndatasummary is a general-purpose table-making tool. It allows us to build (nearly) any summary table we want by using simple 2-sided formulae. 
For example, in the expression x + y ~ mean + sd, the left-hand side of the formula identifies the variables or statistics to display as rows, and the right-hand side defines the columns. Below, we will see how variables and statistics can be “nested” with the * operator to produce tables like the ones above.\nIn addition to datasummary, the modelsummary package includes a “family” of companion functions named datasummary_*. These functions facilitate the production of standard, commonly used tables. This family currently includes:\n\ndatasummary(): Flexible function to create custom tables using 2-sided formulae.\ndatasummary_balance(): Group characteristics (e.g., control vs. treatment)\ndatasummary_correlation(): Table of correlations.\ndatasummary_skim(): Quick summary of a dataset.\ndatasummary_df(): Create a table from any dataframe.\ndatasummary_crosstab(): Cross tabulations of categorical variables.\n\nIn the next three sections, we illustrate how to use datasummary_balance, datasummary_correlation, datasummary_skim, and datasummary_crosstab. Then, we dive into datasummary itself to highlight its ease and flexibility.\n\n\n\nThe first datasummary companion function is called datasummary_skim. It was heavily inspired by one of my favorite data exploration tools for R: the skimr package. The goal of this function is to give us a quick look at the data.\nTo illustrate, we download data from the cool new palmerpenguins package by Alison Presmanes Hill and Allison Horst. 
These data were collected at the Palmer Station in Antarctica by Gorman, Williams & Fraser (2014), and they include 3 categorical variables and 4 numeric variables.\n\nlibrary(modelsummary)\nlibrary(tidyverse)\n\nurl <- 'https://vincentarelbundock.github.io/Rdatasets/csv/palmerpenguins/penguins.csv'\npenguins <- read.csv(url, na.strings = \"\")\n\nTo summarize the numeric variables in the dataset, we type:\n\ndatasummary_skim(penguins)\n\n\nTo summarize the categorical variables in the dataset, we type:\n\ndatasummary_skim(penguins, type = \"categorical\")\n\n\n\n\n\n \n \n \n \n N\n %\n \n \n \n species\nAdelie\n152\n44.2\n \nChinstrap\n68\n19.8\n \nGentoo\n124\n36.0\n island\nBiscoe\n168\n48.8\n \nDream\n124\n36.0\n \nTorgersen\n52\n15.1\n sex\nfemale\n165\n48.0\n \nmale\n168\n48.8\n \nNA\n11\n3.2\n \n \n \n\n\n\n\nLater in this tutorial, it will become clear that datasummary_skim is just a convenience “template” built around datasummary, since we can achieve identical results with the latter. For example, to produce a text-only version of the tables above, we can type:\n\ndatasummary(All(penguins) ~ Mean + SD + Histogram,\n data = penguins,\n output = 'markdown')\n\n| | Mean| SD| Histogram|\n|:-----------------|-------:|------:|----------:|\n|bill_length_mm | 43.92| 5.46| ▁▅▆▆▆▇▇▂▁|\n|bill_depth_mm | 17.15| 1.97| ▃▄▄▄▇▆▇▅▂▁|\n|flipper_length_mm | 200.92| 14.06| ▂▅▇▄▁▄▄▂▁|\n|body_mass_g | 4201.75| 801.95| ▁▄▇▅▄▄▃▃▂▁|\nPrinting histograms will not work on all computers. If you have issues with this feature, try changing your computer’s locale, or try using a different display font.\nThe datasummary_skim function does not currently allow users to summarize continuous and categorical variables together in a single table, but the datasummary_balance function described in the next section can do so.\n\n\n\nThe expressions “balance table” or “Table 1” refer to a type of table which is often printed in the opening pages of a scientific peer-reviewed article. 
Typically, this table includes basic descriptive statistics about different subsets of the study population. For instance, analysts may want to compare the socio-demographic characteristics of members of the “control” and “treatment” groups in a randomized controlled trial, or the flipper lengths of male and female penguins. In addition, balance tables often include difference in means tests.\nTo illustrate how to build a balance table using the datasummary_balance function, we download data about a job training experiment studied in Lalonde (1986). Then, we clean up the data by renaming and recoding a few variables.\n\n## Download and read data\ntraining <- 'https://vincentarelbundock.github.io/Rdatasets/csv/Ecdat/Treatment.csv'\ntraining <- read.csv(training, na.strings = \"\")\n\n## Rename and recode variables\ntraining <- training %>%\n mutate(`Earnings Before` = re75 / 1000,\n `Earnings After` = re78 / 1000,\n Treatment = ifelse(treat == TRUE, 'Treatment', 'Control'),\n Married = ifelse(married == TRUE, 'Yes', 'No')) %>%\n select(`Earnings Before`,\n `Earnings After`,\n Treatment,\n Ethnicity = ethn,\n Age = age,\n Education = educ,\n Married)\n\nNow, we execute the datasummary_balance function. If the estimatr package is installed, datasummary_balance will calculate the difference in means and test statistics.\n\ncaption <- 'Descriptive statistics about participants in a job training experiment. The earnings are displayed in 1000s of USD.
This table was created using the \"datasummary\" function from the \"modelsummary\" package for R.'\nreference <- 'Source: Lalonde (1986) American Economic Review.'\n\nlibrary(modelsummary)\ndatasummary_balance(~Treatment,\n data = training,\n title = caption,\n notes = reference)\n\nNote that if the dataset includes columns called “blocks”, “clusters”, or “weights”, this information will automatically be taken into consideration by estimatr when calculating the difference in means and the associated statistics.\nUsers can also use the ~ 1 formula to indicate that they want to summarize all the data instead of splitting the analysis across subgroups:\n\ndatasummary_balance(~ 1, data = training)\n\n\n\n\n\n \n \n \n \n Mean\n Std. Dev.\n \n \n \n Earnings Before\n\n17.9\n13.9\n Earnings After\n\n20.5\n15.6\n Age\n\n34.2\n10.5\n Education\n\n12.0\n3.1\n \n \nN\nPct.\n Treatment\nControl\n2490\n93.1\n \nTreatment\n185\n6.9\n Ethnicity\nblack\n780\n29.2\n \nhispanic\n92\n3.4\n \nother\n1803\n67.4\n Married\nNo\n483\n18.1\n \nYes\n2192\n81.9\n \n \n \n\n\n\n\n\n\n\nThe datasummary_correlation accepts a dataframe or tibble, it identifies all the numeric variables, and calculates the correlation between each of those variables:\n\ndatasummary_correlation(mtcars)\n\n\n\n\n\n \n \n \n mpg\n cyl\n disp\n hp\n drat\n wt\n qsec\n vs\n am\n gear\n carb\n \n \n \n mpg\n1\n.\n.\n.\n.\n.\n.\n.\n.\n.\n.\n cyl\n-.85\n1\n.\n.\n.\n.\n.\n.\n.\n.\n.\n disp\n-.85\n.90\n1\n.\n.\n.\n.\n.\n.\n.\n.\n hp\n-.78\n.83\n.79\n1\n.\n.\n.\n.\n.\n.\n.\n drat\n.68\n-.70\n-.71\n-.45\n1\n.\n.\n.\n.\n.\n.\n wt\n-.87\n.78\n.89\n.66\n-.71\n1\n.\n.\n.\n.\n.\n qsec\n.42\n-.59\n-.43\n-.71\n.09\n-.17\n1\n.\n.\n.\n.\n vs\n.66\n-.81\n-.71\n-.72\n.44\n-.55\n.74\n1\n.\n.\n.\n am\n.60\n-.52\n-.59\n-.24\n.71\n-.69\n-.23\n.17\n1\n.\n.\n gear\n.48\n-.49\n-.56\n-.13\n.70\n-.58\n-.21\n.21\n.79\n1\n.\n carb\n-.55\n.53\n.39\n.75\n-.09\n.43\n-.66\n-.57\n.06\n.27\n1\n \n \n \n\n\n\n\nThe values displayed in this table are 
equivalent to those obtained by calling: cor(x, use='pairwise.complete.obs').\nThe datasummary_correlation function has a method argument. The default value is \"pearson\", but it also accepts other values like \"spearman\". In addition, method can accept any function which takes a data frame and returns a matrix. For example, we can create a custom function to display information from the correlation package. This allows us to include significance stars even if the stars argument is not supported by default in datasummary_correlation():\n\nlibrary(correlation)\nlibrary(modelsummary)\n\nfun <- function(x) {\n out <- correlation(x) |>\n summary() |>\n format(2) |> \n as.matrix()\n row.names(out) <- out[, 1]\n out <- out[, 2:ncol(out)]\n return(out)\n}\n\ndatasummary_correlation(mtcars, method = fun)\n\n\n\n\n\n \n \n \n carb\n gear\n am\n vs\n qsec\n wt\n drat\n hp\n disp\n cyl\n \n \n \n mpg\n-.55*\n.48\n.60**\n.66**\n.42\n-.87***\n.68***\n-.78***\n-.85***\n-.85***\n cyl\n.53*\n-.49\n-.52*\n-.81***\n-.59*\n.78***\n-.70***\n.83***\n.90***\n\n disp\n.39\n-.56*\n-.59*\n-.71***\n-.43\n.89***\n-.71***\n.79***\n\n\n hp\n.75***\n-.13\n-.24\n-.72***\n-.71***\n.66**\n-.45\n\n\n\n drat\n-.09\n.70***\n.71***\n.44\n.09\n-.71***\n\n\n\n\n wt\n.43\n-.58*\n-.69***\n-.55*\n-.17\n\n\n\n\n\n qsec\n-.66**\n-.21\n-.23\n.74***\n\n\n\n\n\n\n vs\n-.57*\n.21\n.17\n\n\n\n\n\n\n\n am\n.06\n.79***\n\n\n\n\n\n\n\n\n gear\n.27\n\n\n\n\n\n\n\n\n\n \n \n \n\n\n\n\n\n\n\nA cross tabulation is often useful to explore the association between two categorical variables.\n\nlibrary(modelsummary)\nurl <- 'https://vincentarelbundock.github.io/Rdatasets/csv/palmerpenguins/penguins.csv'\npenguins <- read.csv(url, na.strings = \"\")\n\ndatasummary_crosstab(species ~ sex, data = penguins)\n\n\n\n\n\n \n \n species\n \n female\n male\n All\n \n \n \n Adelie\nN\n73\n73\n152\n \n% row\n48.0\n48.0\n100.0\n Chinstrap\nN\n34\n34\n68\n \n% row\n50.0\n50.0\n100.0\n Gentoo\nN\n58\n61\n124\n \n% 
row\n46.8\n49.2\n100.0\n All\nN\n165\n168\n344\n \n% row\n48.0\n48.8\n100.0\n \n \n \n\n\n\n\nYou can create multi-level crosstabs by specifying interactions using the * operator:\n\ndatasummary_crosstab(species ~ sex * island, data = penguins)\n\n\n\n\n\n \n \n species\n \n \n female \n \n \n male \n \n All\n \n \n Biscoe\n Dream\n Torgersen\n Biscoe \n Dream \n Torgersen \n \n \n \n Adelie\nN\n22\n27\n24\n22\n28\n23\n152\n \n% row\n14.5\n17.8\n15.8\n14.5\n18.4\n15.1\n100.0\n Chinstrap\nN\n0\n34\n0\n0\n34\n0\n68\n \n% row\n0.0\n50.0\n0.0\n0.0\n50.0\n0.0\n100.0\n Gentoo\nN\n58\n0\n0\n61\n0\n0\n124\n \n% row\n46.8\n0.0\n0.0\n49.2\n0.0\n0.0\n100.0\n All\nN\n80\n61\n24\n83\n62\n23\n344\n \n% row\n23.3\n17.7\n7.0\n24.1\n18.0\n6.7\n100.0\n \n \n \n\n\n\n\nBy default, the cell counts and row percentages are shown for each cell, and both row and column totals are calculated. To show cell percentages or column percentages, or to drop row and column totals, adjust the statistic argument. This argument accepts a formula that follows the datasummary “language”. To understand exactly how it works, you may find it useful to skip to the datasummary tutorial in the next section. Example:\n\ndatasummary_crosstab(species ~ sex,\n statistic = 1 ~ Percent(\"col\"),\n data = penguins)\n\n\n\n\n\n \n \n species\n \n female\n male\n \n \n \n Adelie\n% col\n44.2\n43.5\n Chinstrap\n% col\n20.6\n20.2\n Gentoo\n% col\n35.2\n36.3\n All\n% col\n100.0\n100.0\n \n \n \n\n\n\n\nSee ?datasummary_crosstab for more details.\n\n\n\ndatasummary tables are specified using a 2-sided formula, divided by a tilde ~. The left-hand side describes the rows; the right-hand side describes the columns. 
To illustrate how this works, we will again be using the palmerpenguins dataset.\nTo display the flipper_length_mm variable as a row and the mean as a column, we type:\n\ndatasummary(flipper_length_mm ~ Mean,\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n \n \n \n flipper_length_mm\n200.92\n \n \n \n\n\n\n\nTo flip rows and columns, we flip the left and right-hand sides of the formula:\n\ndatasummary(Mean ~ flipper_length_mm,\n data = penguins)\n\n\n\n\n\n \n \n \n flipper_length_mm\n \n \n \n Mean\n200.92\n \n \n \n\n\n\n\n\n\nThe Mean function is a shortcut supplied by modelsummary, and it is equivalent to mean(x,na.rm=TRUE). Since the flipper_length_mm variable includes missing observations, using the base mean function (with default na.rm=FALSE) would produce a missing/empty cell:\n\ndatasummary(flipper_length_mm ~ mean,\n data = penguins)\n\n\n\n\n\n \n \n \n mean\n \n \n \n flipper_length_mm\n\n \n \n \n\n\n\n\nmodelsummary supplies these functions: Mean, SD, Min, Max, Median, P0, P25, P50, P75, P100, Histogram, and a few more (see the package documentation).\nUsers are also free to create and use their own custom summaries. Any R function which takes a vector and produces a single value is acceptable.
For example, the Range function returns a numerical value, and the MinMax function returns a string:\n\nRange <- function(x) max(x, na.rm = TRUE) - min(x, na.rm = TRUE)\n\ndatasummary(flipper_length_mm ~ Range,\n data = penguins)\n\n\n\n\n\n \n \n \n Range\n \n \n \n flipper_length_mm\n59.00\n \n \n \n\n\n\nMinMax <- function(x) paste0('[', min(x, na.rm = TRUE), ', ', max(x, na.rm = TRUE), ']')\n\ndatasummary(flipper_length_mm ~ MinMax,\n data = penguins)\n\n\n\n\n\n \n \n \n MinMax\n \n \n \n flipper_length_mm\n[172, 231]\n \n \n \n\n\n\n\n\n\n\nTo include more rows and columns, we use the + sign:\n\ndatasummary(flipper_length_mm + body_mass_g ~ Mean + SD,\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n flipper_length_mm\n200.92\n14.06\n body_mass_g\n4201.75\n801.95\n \n \n \n\n\n\n\nSometimes, it can be cumbersome to list all variables separated by + signs. The All() function is a useful shortcut:\n\ndatasummary(All(penguins) ~ Mean + SD,\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n rownames\n172.50\n99.45\n bill_length_mm\n43.92\n5.46\n bill_depth_mm\n17.15\n1.97\n flipper_length_mm\n200.92\n14.06\n body_mass_g\n4201.75\n801.95\n year\n2008.03\n0.82\n \n \n \n\n\n\n\nBy default, All selects all numeric variables. This behavior can be changed by modifying the function’s arguments. See ?All for details.\n\n\n\ndatasummary can nest variables and statistics inside categorical variables using the * symbol. When applying the * operator to factor, character, or logical variables, columns or rows will automatically be nested. For instance, if we want to display separate means for each value of the variable sex, we use mean * sex:\n\ndatasummary(flipper_length_mm + body_mass_g ~ mean * sex,\n data = penguins)\n\n\n\n\n\n \n \n \n female\n male\n \n \n \n flipper_length_mm\n197.36\n204.51\n body_mass_g\n3862.27\n4545.68\n \n \n \n\n\n\n\nWe can use parentheses to nest several terms inside one another, using a call of this form: x * (y + z).
Here is an example with nested columns:\n\ndatasummary(body_mass_g ~ sex * (mean + sd),\n data = penguins)\n\n\n\n\n\n \n \n \n \n female \n \n \n male \n \n \n \n mean\n sd\n mean \n sd \n \n \n \n body_mass_g\n3862.27\n666.17\n4545.68\n787.63\n \n \n \n\n\n\n\nHere is an example with nested rows:\n\ndatasummary(sex * (body_mass_g + flipper_length_mm) ~ mean + sd,\n data = penguins)\n\n\n\n\n\n \n \n sex\n \n mean\n sd\n \n \n \n female\nbody_mass_g\n3862.27\n666.17\n \nflipper_length_mm\n197.36\n12.50\n male\nbody_mass_g\n4545.68\n787.63\n \nflipper_length_mm\n204.51\n14.55\n \n \n \n\n\n\n\nThe order in which terms enter the formula determines the order in which labels are displayed. For example, this shows island above sex:\n\ndatasummary(flipper_length_mm + body_mass_g ~ mean * island * sex,\n data = penguins)\n\n\n\n\n\n \n \n \n \n Biscoe \n \n \n Dream \n \n \n Torgersen \n \n \n \n female\n male\n female \n male \n female \n male \n \n \n \n flipper_length_mm\n205.69\n213.29\n190.02\n196.31\n188.29\n194.91\n body_mass_g\n4319.38\n5104.52\n3446.31\n3987.10\n3395.83\n4034.78\n \n \n \n\n\n\n\nThis shows sex above island values:\n\ndatasummary(flipper_length_mm + body_mass_g ~ mean * sex * island,\n data = penguins)\n\n\n\n\n\n \n \n \n \n female \n \n \n male \n \n \n \n Biscoe\n Dream\n Torgersen\n Biscoe \n Dream \n Torgersen \n \n \n \n flipper_length_mm\n205.69\n190.02\n188.29\n213.29\n196.31\n194.91\n body_mass_g\n4319.38\n3446.31\n3395.83\n5104.52\n3987.10\n4034.78\n \n \n \n\n\n\n\nBy default, datasummary omits column headers with a single value/label across all columns, except for the header that sits just above the data. If the header we want to see is not displayed, we may want to reorder the terms of the formula. 
To show all headers, set sparse_header=FALSE:\n\ndatasummary(flipper_length_mm + body_mass_g ~ mean * sex * island,\n data = penguins,\n sparse_header = FALSE)\n\n\n\n\n\n \n \n \n \n female \n \n \n male \n \n \n \n Biscoe\n Dream\n Torgersen\n Biscoe \n Dream \n Torgersen \n \n \n \n flipper_length_mm\n205.69\n190.02\n188.29\n213.29\n196.31\n194.91\n body_mass_g\n4319.38\n3446.31\n3395.83\n5104.52\n3987.10\n4034.78\n \n \n \n\n\n\n\nWhen using sparse_header=FALSE, it is often useful to insert Heading() * in the table formula, in order to rename or omit some of the labels manually. Type ?tables::Heading for details and examples.\n\n\n\nPersonally, I prefer to rename variables and values before drawing my tables, using backticks when variable names include whitespace. For example,\n\ntmp <- penguins %>%\n select(`Flipper length (mm)` = flipper_length_mm,\n `Body mass (g)` = body_mass_g)\n\ndatasummary(`Flipper length (mm)` + `Body mass (g)` ~ Mean + SD,\n data = tmp)\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n Flipper length (mm)\n200.92\n14.06\n Body mass (g)\n4201.75\n801.95\n \n \n \n\n\n\n\nHowever, thanks to the tables package, datasummary offers two additional mechanisms to rename. First, we can wrap a term in parentheses and use the equal = sign: (NewName=OldName):\n\ndatasummary((`Flipper length (mm)` = flipper_length_mm) + (`Body mass (g)` = body_mass_g) ~\n island * ((Avg. = Mean) + (Std.Dev. = SD)),\n data = penguins)\n\n\n\n\n\n \n \n \n \n Biscoe \n \n \n Dream \n \n \n Torgersen \n \n \n \n Avg.\n Std.Dev.\n Avg. \n Std.Dev. \n Avg. \n Std.Dev. 
\n \n \n \n Flipper length (mm)\n209.71\n14.14\n193.07\n7.51\n191.20\n6.23\n Body mass (g)\n4716.02\n782.86\n3712.90\n416.64\n3706.37\n445.11\n \n \n \n\n\n\n\nSecond, we can use the Heading() function:\n\ndatasummary(Heading(\"Flipper length (mm)\") * flipper_length_mm + Heading(\"Body mass (g)\") * body_mass_g ~ island * (Mean + SD),\n data = penguins)\n\n\n\n\n\n \n \n \n \n Biscoe \n \n \n Dream \n \n \n Torgersen \n \n \n \n Mean\n SD\n Mean \n SD \n Mean \n SD \n \n \n \n Flipper length (mm)\n209.71\n14.14\n193.07\n7.51\n191.20\n6.23\n Body mass (g)\n4716.02\n782.86\n3712.90\n416.64\n3706.37\n445.11\n \n \n \n\n\n\n\nThe Heading function also has a nearData argument which can be useful in cases where some rows are nested but others are not. Compare the last row of these two tables:\n\ndatasummary(sex * (flipper_length_mm + bill_length_mm) + Heading(\"Body mass (g)\") * body_mass_g ~ Mean + SD,\n data = penguins)\n\n\n\n\n\n \n \n sex\n \n Mean\n SD\n \n \n \n female\nflipper_length_mm\n197.36\n12.50\n \nbill_length_mm\n42.10\n4.90\n male\nflipper_length_mm\n204.51\n14.55\n \nbill_length_mm\n45.85\n5.37\n \nBody mass (g)\n4201.75\n801.95\n \n \n \n\n\n\ndatasummary(sex * (flipper_length_mm + bill_length_mm) + Heading(\"Body mass (g)\", nearData=FALSE) * body_mass_g ~ Mean + SD,\n data = penguins)\n\n\n\n\n\n \n \n sex\n \n Mean\n SD\n \n \n \n female\nflipper_length_mm\n197.36\n12.50\n \nbill_length_mm\n42.10\n4.90\n male\nflipper_length_mm\n204.51\n14.55\n \nbill_length_mm\n45.85\n5.37\n Body mass (g)\n\n4201.75\n801.95\n \n \n \n\n\n\n\n\n\n\nThe tables package allows datasummary to use neat tricks to produce frequency tables:\n\nAdd a N to the right-hand side of the equation.\nAdd Percent() to the right-hand side to calculate the percentage of observations in each cell.\nAdd 1 to the left-hand side to include a row with the total number of observations:\n\n\ndatasummary(species * sex + 1 ~ N + Percent(),\n data = penguins)\n\n\n\n\n\n \n \n species\n sex\n 
N\n Percent\n \n \n \n Adelie\nfemale\n73\n21.22\n \nmale\n73\n21.22\n Chinstrap\nfemale\n34\n9.88\n \nmale\n34\n9.88\n Gentoo\nfemale\n58\n16.86\n \nmale\n61\n17.73\n \nAll\n344\n100.00\n \n \n \n\n\n\n\nNote that the Percent() function accepts a denom argument to determine if percentages should be calculated row or column-wise, or if they should take into account all cells.\n\n\n\nThe Percent() pseudo-function also accepts a fn argument, which must be a function which accepts two vectors: x is the values in the current cell, and y is all the values in the whole dataset. The default fn is:\n\ndatasummary(species * sex + 1 ~ N + Percent(fn = function(x, y) 100 * length(x) / length(y)),\n output = \"markdown\",\n data = penguins)\n\n\n\n\nspecies\nsex\nN\nPercent\n\n\n\n\nAdelie\nfemale\n73\n21.22\n\n\n\nmale\n73\n21.22\n\n\nChinstrap\nfemale\n34\n9.88\n\n\n\nmale\n34\n9.88\n\n\nGentoo\nfemale\n58\n16.86\n\n\n\nmale\n61\n17.73\n\n\n\nAll\n344\n100.00\n\n\n\n\n\nThe code above takes the number of elements in the cell length(x) and divides it by the number of total elements length(y).\nNow, let’s say we want to display percentages weighted by one of the variables of the dataset. This can often be useful with survey weights, for example. 
Here, we use an arbitrary column of weights called flipper_length_mm:\n\nwtpct <- function(x, y) sum(x, na.rm = TRUE) / sum(y, na.rm = TRUE) * 100\ndatasummary(species * sex + 1 ~ N + flipper_length_mm * Percent(fn = wtpct),\n output = \"markdown\",\n data = penguins)\n\n\n\n\nspecies\nsex\nN\nPercent\n\n\n\n\nAdelie\nfemale\n73\n19.95\n\n\n\nmale\n73\n20.44\n\n\nChinstrap\nfemale\n34\n9.49\n\n\n\nmale\n34\n9.89\n\n\nGentoo\nfemale\n58\n17.95\n\n\n\nmale\n61\n19.67\n\n\n\nAll\n344\n100.00\n\n\n\n\n\nIn each cell we now have the sum of weights in that cell, divided by the total sum of weights in the column.\n\n\n\nHere is another simple illustration of the Percent function mechanism in action, where we combine counts and percentages in a single, compact label:\n\ndat <- mtcars\ndat$cyl <- as.factor(dat$cyl)\n\nfn <- function(x, y) {\n out <- sprintf(\n \"%s (%.1f%%)\",\n length(x),\n length(x) / length(y) * 100)\n}\ndatasummary(\n cyl ~ Percent(fn = fn),\n data = dat)\n\n\n\n\n\n \n \n cyl\n Percent\n \n \n \n 4\n11 (34.4%)\n 6\n7 (21.9%)\n 8\n14 (43.8%)\n \n \n \n\n\n\n\n\n\n\nThe * nesting operator that we used above works automatically for factor, character, and logical variables. Sometimes, it is convenient to use it with other types of variables, such as binary numeric variables. In that case, we can wrap the variable name inside a call to Factor(). This allows us to treat a variable as a factor, without having to modify it in the original data. For example, in the mtcars data, there is a binary numeric variable called am. We nest statistics within categories of am by typing:\n\ndatasummary(mpg + hp ~ Factor(am) * (mean + sd),\n data = mtcars)\n\n\n\n\n\n \n \n \n \n 0 \n \n \n 1 \n \n \n \n mean\n sd\n mean \n sd \n \n \n \n mpg\n17.15\n3.83\n24.39\n6.17\n hp\n160.26\n53.91\n126.85\n84.06\n \n \n \n\n\n\n\n\n\n\nWe can pass any argument to the summary function by including a call to Arguments().
For instance, there are missing values in the flipper_length_mm variable of the penguins dataset. Therefore, the standard mean function will produce no result, because its default argument is na.rm=FALSE. We can change that by calling:\n\ndatasummary(flipper_length_mm ~ mean + mean*Arguments(na.rm=TRUE),\n data = penguins)\n\n\n\n\n\n \n \n \n mean\n mean \n \n \n \n flipper_length_mm\n\n200.92\n \n \n \n\n\n\n\nNotice that there is an empty cell (NA) under the normal mean function, but that the mean call with Arguments(na.rm=TRUE) produced a numeric result.\nWe can pass the same arguments to multiple functions using the parentheses:\n\ndatasummary(flipper_length_mm ~ (mean + sd) * Arguments(na.rm=TRUE),\n data = penguins)\n\n\n\n\n\n \n \n \n mean\n sd\n \n \n \n flipper_length_mm\n200.92\n14.06\n \n \n \n\n\n\n\nNote that the shortcut functions that modelsummary supplies use na.rm=TRUE by default, so we can use them directly without arguments, even when there are missing values:\n\ndatasummary(flipper_length_mm ~ Mean + Var + P75 + Range,\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n Var\n P75\n Range\n \n \n \n flipper_length_mm\n200.92\n197.73\n213.00\n59.00\n \n \n \n\n\n\n\n\n\n\nYou can use the Arguments mechanism to do various things, such as calculating weighted means:\n\nnewdata <- data.frame(\n x = rnorm(20),\n w = rnorm(20),\n y = rnorm(20))\n\ndatasummary(x + y ~ weighted.mean * Arguments(w = w),\n data = newdata, output = \"markdown\")\n\n\n\n\n\nweighted.mean\n\n\n\n\nx\n1.05\n\n\ny\n-3.82\n\n\n\n\n\nWhich produces the same results as:\n\nweighted.mean(newdata$x, newdata$w)\n\n[1] 1.051561\n\nweighted.mean(newdata$y, newdata$w)\n\n[1] -3.816827\n\n\nBut different results from:\n\nmean(newdata$x)\n\n[1] 0.1759387\n\nmean(newdata$y)\n\n[1] 0.1215528\n\n\n\n\n\nSometimes, if we nest too much and the dataset is not large/diverse enough, we end up with empty cells. 
In that case, we add *DropEmpty() to the formula:\n\ndatasummary(island * species * body_mass_g ~ sex * (Mean + SD),\n data = penguins)\n\n\n\n\n\n \n \n island\n species\n \n \n female \n \n \n male \n \n \n \n Mean\n SD\n Mean \n SD \n \n \n \n Biscoe\nAdelie\nbody_mass_g\n3369.32\n343.47\n4050.00\n355.57\n \nChinstrap\nbody_mass_g\n\n\n\n\n \nGentoo\nbody_mass_g\n4679.74\n281.58\n5484.84\n313.16\n Dream\nAdelie\nbody_mass_g\n3344.44\n212.06\n4045.54\n330.55\n \nChinstrap\nbody_mass_g\n3527.21\n285.33\n3938.97\n362.14\n \nGentoo\nbody_mass_g\n\n\n\n\n Torgersen\nAdelie\nbody_mass_g\n3395.83\n259.14\n4034.78\n372.47\n \nChinstrap\nbody_mass_g\n\n\n\n\n \nGentoo\nbody_mass_g\n\n\n\n\n \n \n \n\n\n\ndatasummary(island * species * body_mass_g ~ sex * (Mean + SD) * DropEmpty(),\n data = penguins)\n\n\n\n\n\n \n \n island\n species\n \n \n female \n \n \n male \n \n \n \n Mean\n SD\n Mean \n SD \n \n \n \n Biscoe\nAdelie\nbody_mass_g\n3369.32\n343.47\n4050.00\n355.57\n \nGentoo\nbody_mass_g\n4679.74\n281.58\n5484.84\n313.16\n Dream\nAdelie\nbody_mass_g\n3344.44\n212.06\n4045.54\n330.55\n \nChinstrap\nbody_mass_g\n3527.21\n285.33\n3938.97\n362.14\n Torgersen\nAdelie\nbody_mass_g\n3395.83\n259.14\n4034.78\n372.47\n \n \n \n\n\n\n\n\n\n\nCool stuff is possible with logical subsets:\n\ndatasummary((bill_length_mm > 44.5) + (bill_length_mm <= 44.5) ~ Mean * flipper_length_mm,\n data = penguins)\n\n\n\n\n\n \n \n \n \n \n \n \n bill_length_mm > 44.5\n209.68\n bill_length_mm <= 44.5\n192.45\n \n \n \n\n\n\n\nSee the tables package documentation for more details and examples.\n\n\n\n\nAll functions in the datasummary_* family accept the same output argument. 
Tables can be saved to several file formats:\n\nf <- flipper_length_mm ~ island * (mean + sd)\ndatasummary(f, data = penguins, output = 'table.html')\ndatasummary(f, data = penguins, output = 'table.tex')\ndatasummary(f, data = penguins, output = 'table.docx')\ndatasummary(f, data = penguins, output = 'table.pptx')\ndatasummary(f, data = penguins, output = 'table.md')\ndatasummary(f, data = penguins, output = 'table.rtf')\ndatasummary(f, data = penguins, output = 'table.jpg')\ndatasummary(f, data = penguins, output = 'table.png')\n\nThey can be returned in human-readable data.frames, markdown, html, or LaTeX code to the console:\n\ndatasummary(f, data = penguins, output = 'data.frame')\ndatasummary(f, data = penguins, output = 'markdown')\ndatasummary(f, data = penguins, output = 'html')\ndatasummary(f, data = penguins, output = 'latex')\n\ndatasummary can return objects compatible with the gt, kableExtra, flextable, huxtable, and DT table creation and customization packages:\n\ndatasummary(f, data = penguins, output = 'gt')\ndatasummary(f, data = penguins, output = 'kableExtra')\ndatasummary(f, data = penguins, output = 'flextable')\ndatasummary(f, data = penguins, output = 'huxtable')\ndatasummary(f, data = penguins, output = 'DT')\n\nPlease note that hierarchical or “nested” column labels are only available for these output formats: kableExtra, gt, html, rtf, and LaTeX. When saving tables to other formats, nested labels will be combined to a “flat” header.\n\n\n\nThe fmt argument allows us to set the printing format of numeric values. It accepts a single number representing the number of digits after the period, or a string to be passed to the sprintf function. 
For instance, setting fmt=\"%.4f\" will keep 4 digits after the dot (see ?sprintf for more options):\n\ndatasummary(flipper_length_mm ~ Mean + SD,\n fmt = 4,\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n flipper_length_mm\n200.9152\n14.0617\n \n \n \n\n\n\n\nWe can set the formatting on a term-by-term basis by using the same Arguments function that we used to handle missing values in the previous section. The shortcut functions that ship with modelsummary (e.g., Mean, SD, Median, P25) all include a fmt argument:\n\ndatasummary(flipper_length_mm ~ Mean * Arguments(fmt = \"%.4f\") + SD * Arguments(fmt = \"%.1f\"),\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n flipper_length_mm\n200.9152\n14.1\n \n \n \n\n\n\n\nIf we do not want datasummary to format numbers, and we want to keep the numerical values instead of formatted strings, set fmt=NULL. This can be useful when post-processing tables with packages like gt, which allow us to transform cells based on their numerical content (this gt table will be omitted from PDF output):\n\nlibrary(gt)\n\ndatasummary(All(mtcars) ~ Mean + SD,\n data = mtcars,\n fmt = NULL,\n output = 'gt') %>%\n tab_style(style = cell_fill(color = \"pink\"),\n locations = cells_body(rows = Mean > 10, columns = 2))\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n mpg\n20.090625\n6.0269481\n cyl\n6.187500\n1.7859216\n disp\n230.721875\n123.9386938\n hp\n146.687500\n68.5628685\n drat\n3.596563\n0.5346787\n wt\n3.217250\n0.9784574\n qsec\n17.848750\n1.7869432\n vs\n0.437500\n0.5040161\n am\n0.406250\n0.4989909\n gear\n3.687500\n0.7378041\n carb\n2.812500\n1.6152000\n \n \n \n\n\n\n\nPlease note that the N() function is supplied by the upstream tables package, and does not have a fmt argument. 
Fortunately, it is easy to override the built-in function to use custom formatting:\n\ntmp <- data.frame(X = sample(letters[1:3], 1e6, replace = TRUE))\nN <- \\(x) format(length(x), big.mark = \",\")\ndatasummary(X ~ N, data = tmp)\n\n\n\n\n\n \n \n X\n N\n \n \n \n a\n333,404\n b\n332,896\n c\n333,700\n \n \n \n\n\n\n\n\n\n\ndatasummary includes the same title and notes arguments as in modelsummary:\n\ndatasummary(All(penguins) ~ Mean + SD,\n data = penguins,\n title = 'Statistics about the famous Palmer Penguins.',\n notes = c('A note at the bottom of the table.'))\n\n\n\n\n\n Statistics about the famous Palmer Penguins.\n \n \n \n Mean\n SD\n \n \n \n rownames\n172.50\n99.45\n bill_length_mm\n43.92\n5.46\n bill_depth_mm\n17.15\n1.97\n flipper_length_mm\n200.92\n14.06\n body_mass_g\n4201.75\n801.95\n year\n2008.03\n0.82\n \n \n \n A note at the bottom of the table.\n \n \n \n\n\n\n\n\n\n\nWe can align columns using the align argument. align should be a string of length equal to the number of columns, and which includes only the letters “l”, “c”, or “r”:\n\ndatasummary(flipper_length_mm + bill_length_mm ~ Mean + SD + Range,\n data = penguins,\n align = 'lrcl')\n\n\n\n\n\n \n \n \n Mean\n SD\n Range\n \n \n \n flipper_length_mm\n200.92\n14.06\n59.00\n bill_length_mm\n43.92\n5.46\n27.50\n \n \n \n\n\n\n\n\n\n\n\nnew_rows <- data.frame('Does',\n 2,\n 'plus',\n 2,\n 'equals',\n 5,\n '?')\ndatasummary(flipper_length_mm + body_mass_g ~ species * (Mean + SD),\n data = penguins,\n add_rows = new_rows)\n\n\n\n\n\n \n \n \n \n Adelie \n \n \n Chinstrap \n \n \n Gentoo \n \n \n \n Mean\n SD\n Mean \n SD \n Mean \n SD \n \n \n \n flipper_length_mm\n189.95\n6.54\n195.82\n7.13\n217.19\n6.48\n body_mass_g\n3700.66\n458.57\n3733.09\n384.34\n5076.02\n504.12\n Does\n2.00\nplus\n2.00\nequals\n5.00\n?\n \n \n \n\n\n\n\n\n\n\n\nnew_cols <- data.frame('New Stat' = runif(2))\ndatasummary(flipper_length_mm + body_mass_g ~ species * (Mean + SD),\n data = penguins,\n add_columns = 
new_cols)\n\n\n\n\n\n \n \n \n \n Adelie \n \n \n Chinstrap \n \n \n Gentoo \n \n New.Stat\n \n \n Mean\n SD\n Mean \n SD \n Mean \n SD \n \n \n \n flipper_length_mm\n189.95\n6.54\n195.82\n7.13\n217.19\n6.48\n0.83\n body_mass_g\n3700.66\n458.57\n3733.09\n384.34\n5076.02\n504.12\n0.39\n \n \n \n\n\n\n\n\n\n\nThe datasummary family of functions allow users to display in-line spark-style histograms to describe the distribution of the variables. For example, the datasummary_skim produces such a histogram:\n\ntmp <- mtcars[, c(\"mpg\", \"hp\")]\ndatasummary_skim(tmp)\n\n\n\n\n\n \n \n \n Unique (#)\n Missing (%)\n Mean\n SD\n Min\n Median\n Max\n \n \n \n \n mpg\n25\n0\n20.1\n6.0\n10.4\n19.2\n33.9\n \n hp\n22\n0\n146.7\n68.6\n52.0\n123.0\n335.0\n \n \n \n \n\n\n\n\nEach of the histograms in the table above is actually an SVG image, produced by the kableExtra package. For this reason, the histogram will not appear when users use a different output backend, such as gt, flextable, or huxtable.\nThe datasummary function is incredibly flexible, but it does not include a histogram option by default. Here is a simple example of how one can customize the output of datasummary. We proceed in 4 steps:\n\nNormalize the variables and store them in a list\nCreate the table with datasummary, making sure to include 2 “empty” columns. 
In the example, we use a simple function called emptycol to fill those columns with empty strings.\nAdd the histograms or boxplots using functions from the kableExtra package.\n\n\ntmp <- mtcars[, c(\"mpg\", \"hp\")]\n\n## create a list with individual variables\n## remove missing and rescale\ntmp_list <- lapply(tmp, na.omit)\ntmp_list <- lapply(tmp_list, scale)\n\n## create a table with `datasummary`\n## add a histogram with column_spec and spec_hist\n## add a boxplot with colun_spec and spec_box\nemptycol = function(x) \" \"\ndatasummary(mpg + hp ~ Mean + SD + Heading(\"Boxplot\") * emptycol + Heading(\"Histogram\") * emptycol,\n output = \"kableExtra\",\n data = tmp) %>%\n kableExtra::column_spec(column = 4, image = spec_boxplot(tmp_list)) %>%\n kableExtra::column_spec(column = 5, image = spec_hist(tmp_list))\n\nIf you want a simpler solution, you can try the Histogram function which works in datasummary automatically and comes bundled with modelsummary. The downside of this function is that it uses Unicode characters to create the histogram. This kind of histogram may not display well with certain typefaces or on some operating systems (Windows!).\n\ndatasummary(mpg + hp ~ Mean + SD + Histogram, data = tmp)\n\n\n\n\n\n \n \n \n Mean\n SD\n Histogram\n \n \n \n mpg\n20.09\n6.03\n▂▅▇▇▆▃▁▁▂▂\n hp\n146.69\n68.56\n▅▅▇▂▆▂▃▁▁\n \n \n \n\n\n\n\n\n\n\nAt least 3 distinct issues can arise related to missing values.\n\n\nAn empty cell can appear in the table when a statistical function returns NA instead of a numeric value. 
In those cases, you should:\n\nMake sure that your statistical function (e.g., mean or sd) uses na.rm=TRUE by default\nUse the Arguments strategy to set na.rm=TRUE (see the Arguments section of this vignette).\nUse a convenience function supplied by modelsummary, where na.rm is TRUE by default: Mean, SD, P25, etc.\n\n\n\n\nAn empty cell can appear in the table when a crosstab is deeply nested, and there are no observations for a given combination of covariates. In those cases, you can use the * DropEmpty pseudo-function. See the “Empty cells” section of this vignette for examples.\n\n\n\nBy default, the factor function in R does not assign a distinct factor level to missing values: the factor function’s exclude argument is set to NA by default. To ensure that NAs appear in your table, make sure you set exclude=NULL when you create the factor.\nInternally, the datasummary_balance and datasummary_crosstab functions convert logical and character variables to factor with the exclude=NULL argument. This means that NAs will appear in the table as distinct rows/columns. If you do not want NAs to appear in your table, convert them to factors yourself ahead of time. 
For example:\n\nmycars <- mtcars[, c(\"cyl\", \"mpg\", \"hp\", \"vs\")]\nmycars$cyl[c(1, 3, 6, 8)] <- NA\nmycars$cyl_nona <- factor(mycars$cyl)\nmycars$cyl_na <- factor(mycars$cyl, exclude = NULL)\n\ndatasummary_crosstab(cyl_nona ~ vs, data = mycars)\n\n\n\n\n\n \n \n cyl_nona\n \n 0\n 1\n All\n \n \n \n 4\nN\n1\n8\n9\n \n% row\n11.1\n88.9\n100.0\n 6\nN\n2\n3\n5\n \n% row\n40.0\n60.0\n100.0\n 8\nN\n14\n0\n14\n \n% row\n100.0\n0.0\n100.0\n All\nN\n18\n14\n32\n \n% row\n56.2\n43.8\n100.0\n \n \n \n\n\n\ndatasummary_crosstab(cyl_na ~ vs, data = mycars)\n\n\n\n\n\n \n \n cyl_na\n \n 0\n 1\n All\n \n \n \n 4\nN\n1\n8\n9\n \n% row\n11.1\n88.9\n100.0\n 6\nN\n2\n3\n5\n \n% row\n40.0\n60.0\n100.0\n 8\nN\n14\n0\n14\n \n% row\n100.0\n0.0\n100.0\n NA\nN\n1\n3\n4\n \n% row\n25.0\n75.0\n100.0\n All\nN\n18\n14\n32\n \n% row\n56.2\n43.8\n100.0\n \n \n \n\n\n\n\n\n\n\n\nIn the Appearance Vignette we saw how to customize the tables produced by the modelsummary function. The same customization possibilities are also available for all functions in the datasummary_* family of functions. Indeed, we can customize the tables produced by datasummary using the functions provided by gt, kableExtra, flextable, huxtable, and DT. 
For instance, to customize tables using kableExtra, we can do:\n\ndatasummary(All(penguins) ~ sex * (Mean + SD),\n data = penguins,\n output = 'kableExtra') %>%\n kableExtra::row_spec(3, background = 'cyan', color = 'red')\n\nTo customize a table using the gt package, we can:\n\nlibrary(gt)\n\nadelie <- function(x) web_image('https://user-images.githubusercontent.com/987057/85402702-20b1d280-b52a-11ea-9950-f3a03133fd45.png', height = 100)\ngentoo <- function(x) web_image('https://user-images.githubusercontent.com/987057/85402718-27404a00-b52a-11ea-9ad3-dd7562f6438d.png', height = 100)\nchinstrap <- function(x) web_image('https://user-images.githubusercontent.com/987057/85402708-23acc300-b52a-11ea-9a77-de360a0d1f7d.png', height = 100)\n\ncap <- 'Flipper lengths (mm) of the famous penguins of Palmer Station, Antarctica.'\nf <- (`Species` = species) ~ (` ` = flipper_length_mm) * (`Distribution` = Histogram) + flipper_length_mm * sex * ((`Avg.` = Mean)*Arguments(fmt='%.0f') + (`Std. Dev.` = SD)*Arguments(fmt='%.1f'))\ndatasummary(f,\n data = penguins,\n output = 'gt',\n title = cap,\n notes = 'Artwork by @allison_horst',\n sparse_header = TRUE) %>%\n text_transform(locations = cells_body(columns = 1, rows = 1), fn = adelie) %>%\n text_transform(locations = cells_body(columns = 1, rows = 2), fn = chinstrap) %>%\n text_transform(locations = cells_body(columns = 1, rows = 3), fn = gentoo) %>%\n tab_style(style = list(cell_text(color = \"#FF6700\", size = 'x-large')), locations = cells_body(rows = 1)) %>%\n tab_style(style = list(cell_text(color = \"#CD51D1\", size = 'x-large')), locations = cells_body(rows = 2)) %>%\n tab_style(style = list(cell_text(color = \"#007377\", size = 'x-large')), locations = cells_body(rows = 3)) %>%\n tab_options(table_body.hlines.width = 0, table.border.top.width = 0, table.border.bottom.width = 0) %>%\n cols_align('center', columns = 3:6)\n\n\n\n\n\n Flipper lengths (mm) of the famous penguins of Palmer Station, Antarctica.\n \n \n Species\n 
Distribution\n \n Female \n \n \n Male \n \n \n \n Avg.\n Std. Dev.\n Avg. \n Std. Dev. \n \n \n \n \n▁▃▆▇▃▅▂\n188\n5.6\n192\n6.6\n \n▁▄▃▇▆▄▃▁▂\n192\n5.8\n200\n6.0\n \n▂▅▅▇▃▅▂▁▃\n213\n3.9\n222\n5.7\n \n \n \n Artwork by @allison_horst" + "text": "datasummary is a function from the modelsummary package. It allows us to create data summaries, frequency tables, crosstabs, correlation tables, balance tables (aka “Table 1”), and more. It has many benefits:\n\nEasy to use.\nExtremely flexible.\nMany output formats: HTML, LaTeX, Microsoft Word and Powerpoint, Text/Markdown, PDF, RTF, or Image files.\nEmbed tables in Rmarkdown or knitr dynamic documents.\nCustomize the appearance of tables with the gt, kableExtra or flextable packages. The possibilities are endless!\n\nThis tutorial will show how to draw tables like these (and more!):\n\n \n\n\n\ndatasummary is built around the fantastic tables package for R. It is a thin “wrapper” which adds convenience functions and arguments; a user-interface consistent with modelsummary; cleaner html output; and the ability to export tables to more formats, including gt tables, flextable objects, and Microsoft Word documents.\ndatasummary is a general-purpose table-making tool. It allows us to build (nearly) any summary table we want by using simple 2-sided formulae. For example, in the expression x + y ~ mean + sd, the left-hand side of the formula identifies the variables or statistics to display as rows, and the right-hand side defines the columns. Below, we will see how variables and statistics can be “nested” with the * operator to produce tables like the ones above.\nIn addition to datasummary, the modelsummary package includes a “family” of companion functions named datasummary_*. These functions facilitate the production of standard, commonly used tables. 
This family currently includes:\n\ndatasummary(): Flexible function to create custom tables using 2-sided formulae.\ndatasummary_balance(): Group characteristics (e.g., control vs. treatment)\ndatasummary_correlation(): Table of correlations.\ndatasummary_skim(): Quick summary of a dataset.\ndatasummary_df(): Create a table from any dataframe.\ndatasummary_crosstab(): Cross tabulations of categorical variables.\n\nIn the next three sections, we illustrate how to use datasummary_balance, datasummary_correlation, datasummary_skim, and datasummary_crosstab. Then, we dive into datasummary itself to highlight its ease and flexibility.\n\n\n\nThe first datasummary companion function is called datasummary_skim. It was heavily inspired by one of my favorite data exploration tools for R: the skimr package. The goal of this function is to give us a quick look at the data.\nTo illustrate, we download data from the cool new palmerpenguins package by Alison Presmanes Hill and Allison Horst. These data were collected at the Palmer Station in Antarctica by Gorman, Williams & Fraser (2014), and they include 3 categorical variables and 4 numeric variables.\n\nlibrary(modelsummary)\nlibrary(tidyverse)\n\nurl <- 'https://vincentarelbundock.github.io/Rdatasets/csv/palmerpenguins/penguins.csv'\npenguins <- read.csv(url, na.strings = \"\")\n\nTo summarize the numeric variables in the dataset, we type:\n\ndatasummary_skim(penguins)\n\n\nTo summarize the categorical variables in the dataset, we type:\n\ndatasummary_skim(penguins, type = \"categorical\")\n\n\n\n\n\n \n \n \n \n N\n %\n \n \n \n species\nAdelie\n152\n44.2\n \nChinstrap\n68\n19.8\n \nGentoo\n124\n36.0\n island\nBiscoe\n168\n48.8\n \nDream\n124\n36.0\n \nTorgersen\n52\n15.1\n sex\nfemale\n165\n48.0\n \nmale\n168\n48.8\n \nNA\n11\n3.2\n \n \n \n\n\n\n\nLater in this tutorial, it will become clear that datasummary_skim is just a convenience “template” built around datasummary, since we can achieve identical results with the 
latter. For example, to produce a text-only version of the tables above, we can type:\n\ndatasummary(All(penguins) ~ Mean + SD + Histogram,\n data = penguins,\n output = 'markdown')\n\n| | Mean| SD| Histogram|\n|:-----------------|-------:|------:|----------:|\n|bill_length_mm | 43.92| 5.46| ▁▅▆▆▆▇▇▂▁|\n|bill_depth_mm | 17.15| 1.97| ▃▄▄▄▇▆▇▅▂▁|\n|flipper_length_mm | 200.92| 14.06| ▂▅▇▄▁▄▄▂▁|\n|body_mass_g | 4201.75| 801.95| ▁▄▇▅▄▄▃▃▂▁|\nPrinting histograms will not work on all computers. If you have issues with this feature, try changing your computer’s locale, or try using a different display font.\nThe datasummary_skim function does not currently allow users to summarize continuous and categorical variables together in a single table, but the datasummary_balance function described in the next section can do so.\n\n\n\nThe expressions “balance table” or “Table 1” refer to a type of table which is often printed in the opening pages of a scientific peer-reviewed article. Typically, this table includes basic descriptive statistics about different subsets of the study population. For instance, analysts may want to compare the socio-demographic characteristics of members of the “control” and “treatment” groups in a randomized controlled trial, or the flipper lengths of male and female penguins. In addition, balance tables often include difference in means tests.\nTo illustrate how to build a balance table using the datasummary_balance function, we download data about a job training experiment studied in Lalonde (1986). 
Then, we clean up the data by renaming and recoding a few variables.\n\n## Download and read data\ntraining <- 'https://vincentarelbundock.github.io/Rdatasets/csv/Ecdat/Treatment.csv'\ntraining <- read.csv(training, na.strings = \"\")\n\n## Rename and recode variables\ntraining <- training %>%\n mutate(`Earnings Before` = re75 / 1000,\n `Earnings After` = re78 / 1000,\n Treatment = ifelse(treat == TRUE, 'Treatment', 'Control'),\n Married = ifelse(married == TRUE, 'Yes', 'No')) %>%\n select(`Earnings Before`,\n `Earnings After`,\n Treatment,\n Ethnicity = ethn,\n Age = age,\n Education = educ,\n Married)\n\nNow, we execute the datasummary_balance function. If the estimatr package is installed, datasummary_balance will calculate the difference in means and test statistics.\n\ncaption <- 'Descriptive statistics about participants in a job training experiment. The earnings are displayed in 1000s of USD. This table was created using the \"datasummary\" function from the \"modelsummary\" package for R.'\nreference <- 'Source: Lalonde (1986) American Economic Review.'\n\nlibrary(modelsummary)\ndatasummary_balance(~Treatment,\n data = training,\n title = caption,\n notes = reference)\n\nNote that if the dataset includes columns called “blocks”, “clusters”, or “weights”, this information will automatically be taken into consideration by estimatr when calculating the difference in means and the associated statistics.\nUsers can also use the ~ 1 formula to indicate that they want to summarize all the data instead of splitting the analysis across subgroups:\n\ndatasummary_balance(~ 1, data = training)\n\n\n\n\n\n \n \n \n \n Mean\n Std. 
Dev.\n \n \n \n Earnings Before\n\n17.9\n13.9\n Earnings After\n\n20.5\n15.6\n Age\n\n34.2\n10.5\n Education\n\n12.0\n3.1\n \n \nN\nPct.\n Treatment\nControl\n2490\n93.1\n \nTreatment\n185\n6.9\n Ethnicity\nblack\n780\n29.2\n \nhispanic\n92\n3.4\n \nother\n1803\n67.4\n Married\nNo\n483\n18.1\n \nYes\n2192\n81.9\n \n \n \n\n\n\n\n\n\n\nThe datasummary_correlation function accepts a dataframe or tibble, identifies all the numeric variables, and calculates the correlation between each pair of those variables:\n\ndatasummary_correlation(mtcars)\n\n\n\n\n\n \n \n \n mpg\n cyl\n disp\n hp\n drat\n wt\n qsec\n vs\n am\n gear\n carb\n \n \n \n mpg\n1\n.\n.\n.\n.\n.\n.\n.\n.\n.\n.\n cyl\n-.85\n1\n.\n.\n.\n.\n.\n.\n.\n.\n.\n disp\n-.85\n.90\n1\n.\n.\n.\n.\n.\n.\n.\n.\n hp\n-.78\n.83\n.79\n1\n.\n.\n.\n.\n.\n.\n.\n drat\n.68\n-.70\n-.71\n-.45\n1\n.\n.\n.\n.\n.\n.\n wt\n-.87\n.78\n.89\n.66\n-.71\n1\n.\n.\n.\n.\n.\n qsec\n.42\n-.59\n-.43\n-.71\n.09\n-.17\n1\n.\n.\n.\n.\n vs\n.66\n-.81\n-.71\n-.72\n.44\n-.55\n.74\n1\n.\n.\n.\n am\n.60\n-.52\n-.59\n-.24\n.71\n-.69\n-.23\n.17\n1\n.\n.\n gear\n.48\n-.49\n-.56\n-.13\n.70\n-.58\n-.21\n.21\n.79\n1\n.\n carb\n-.55\n.53\n.39\n.75\n-.09\n.43\n-.66\n-.57\n.06\n.27\n1\n \n \n \n\n\n\n\nThe values displayed in this table are equivalent to those obtained by calling: cor(x, use='pairwise.complete.obs').\nThe datasummary_correlation function has a method argument. The default value is \"pearson\", but it also accepts other values like \"spearman\". In addition, method can accept any function which takes a data frame and returns a matrix. For example, we can create a custom function to display information from the correlation package. 
This allows us to include significance stars even if the stars argument is not supported by default in datasummary_correlation():\n\nlibrary(correlation)\nlibrary(modelsummary)\n\nfun <- function(x) {\n out <- correlation(x) |>\n summary() |>\n format(2) |> \n as.matrix()\n row.names(out) <- out[, 1]\n out <- out[, 2:ncol(out)]\n return(out)\n}\n\ndatasummary_correlation(mtcars, method = fun)\n\n\n\n\n\n \n \n \n carb\n gear\n am\n vs\n qsec\n wt\n drat\n hp\n disp\n cyl\n \n \n \n mpg\n-.55*\n.48\n.60**\n.66**\n.42\n-.87***\n.68***\n-.78***\n-.85***\n-.85***\n cyl\n.53*\n-.49\n-.52*\n-.81***\n-.59*\n.78***\n-.70***\n.83***\n.90***\n\n disp\n.39\n-.56*\n-.59*\n-.71***\n-.43\n.89***\n-.71***\n.79***\n\n\n hp\n.75***\n-.13\n-.24\n-.72***\n-.71***\n.66**\n-.45\n\n\n\n drat\n-.09\n.70***\n.71***\n.44\n.09\n-.71***\n\n\n\n\n wt\n.43\n-.58*\n-.69***\n-.55*\n-.17\n\n\n\n\n\n qsec\n-.66**\n-.21\n-.23\n.74***\n\n\n\n\n\n\n vs\n-.57*\n.21\n.17\n\n\n\n\n\n\n\n am\n.06\n.79***\n\n\n\n\n\n\n\n\n gear\n.27\n\n\n\n\n\n\n\n\n\n \n \n \n\n\n\n\n\n\n\nA cross tabulation is often useful to explore the association between two categorical variables.\n\nlibrary(modelsummary)\nurl <- 'https://vincentarelbundock.github.io/Rdatasets/csv/palmerpenguins/penguins.csv'\npenguins <- read.csv(url, na.strings = \"\")\n\ndatasummary_crosstab(species ~ sex, data = penguins)\n\n\n\n\n\n \n \n species\n \n female\n male\n All\n \n \n \n Adelie\nN\n73\n73\n152\n \n% row\n48.0\n48.0\n100.0\n Chinstrap\nN\n34\n34\n68\n \n% row\n50.0\n50.0\n100.0\n Gentoo\nN\n58\n61\n124\n \n% row\n46.8\n49.2\n100.0\n All\nN\n165\n168\n344\n \n% row\n48.0\n48.8\n100.0\n \n \n \n\n\n\n\nYou can create multi-level crosstabs by specifying interactions using the * operator:\n\ndatasummary_crosstab(species ~ sex * island, data = penguins)\n\n\n\n\n\n \n \n species\n \n \n female \n \n \n male \n \n All\n \n \n Biscoe\n Dream\n Torgersen\n Biscoe \n Dream \n Torgersen \n \n \n \n Adelie\nN\n22\n27\n24\n22\n28\n23\n152\n 
\n% row\n14.5\n17.8\n15.8\n14.5\n18.4\n15.1\n100.0\n Chinstrap\nN\n0\n34\n0\n0\n34\n0\n68\n \n% row\n0.0\n50.0\n0.0\n0.0\n50.0\n0.0\n100.0\n Gentoo\nN\n58\n0\n0\n61\n0\n0\n124\n \n% row\n46.8\n0.0\n0.0\n49.2\n0.0\n0.0\n100.0\n All\nN\n80\n61\n24\n83\n62\n23\n344\n \n% row\n23.3\n17.7\n7.0\n24.1\n18.0\n6.7\n100.0\n \n \n \n\n\n\n\nBy default, the cell counts and row percentages are shown for each cell, and both row and column totals are calculated. To show cell percentages or column percentages, or to drop row and column totals, adjust the statistic argument. This argument accepts a formula that follows the datasummary “language”. To understand exactly how it works, you may find it useful to skip to the datasummary tutorial in the next section. Example:\n\ndatasummary_crosstab(species ~ sex,\n statistic = 1 ~ Percent(\"col\"),\n data = penguins)\n\n\n\n\n\n \n \n species\n \n female\n male\n \n \n \n Adelie\n% col\n44.2\n43.5\n Chinstrap\n% col\n20.6\n20.2\n Gentoo\n% col\n35.2\n36.3\n All\n% col\n100.0\n100.0\n \n \n \n\n\n\n\nSee ?datasummary_crosstab for more details.\n\n\n\ndatasummary tables are specified using a 2-sided formula, divided by a tilde ~. The left-hand side describes the rows; the right-hand side describes the columns. To illustrate how this works, we will again be using the palmerpenguins dataset:\nTo display the flipper_length_mm variable as a row and the mean as a column, we type:\n\ndatasummary(flipper_length_mm ~ Mean,\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n \n \n \n flipper_length_mm\n200.92\n \n \n \n\n\n\n\nTo flip rows and columns, we flip the left and right-hand sides of the formula:\n\ndatasummary(Mean ~ flipper_length_mm,\n data = penguins)\n\n\n\n\n\n \n \n \n flipper_length_mm\n \n \n \n Mean\n200.92\n \n \n \n\n\n\n\n\n\nThe Mean function is a shortcut supplied by modelsummary, and it is equivalent to mean(x,na.rm=TRUE). 
Since the flipper_length_mm variable includes missing observations, using the base mean function (with default na.rm=FALSE) would produce a missing/empty cell:\n\ndatasummary(flipper_length_mm ~ mean,\n data = penguins)\n\n\n\n\n\n \n \n \n mean\n \n \n \n flipper_length_mm\n\n \n \n \n\n\n\n\nmodelsummary supplies these functions: Mean, SD, Min, Max, Median, P0, P25, P50, P75, P100, Histogram, and a few more (see the package documentation).\nUsers are also free to create and use their own custom summaries. Any R function which takes a vector and produces a single value is acceptable. For example, the Range function returns a numerical value, and the MinMax function returns a string:\n\nRange <- function(x) max(x, na.rm = TRUE) - min(x, na.rm = TRUE)\n\ndatasummary(flipper_length_mm ~ Range,\n data = penguins)\n\n\n\n\n\n \n \n \n Range\n \n \n \n flipper_length_mm\n59.00\n \n \n \n\n\n\nMinMax <- function(x) paste0('[', min(x, na.rm = TRUE), ', ', max(x, na.rm = TRUE), ']')\n\ndatasummary(flipper_length_mm ~ MinMax,\n data = penguins)\n\n\n\n\n\n \n \n \n MinMax\n \n \n \n flipper_length_mm\n[172, 231]\n \n \n \n\n\n\n\n\n\n\nTo include more rows and columns, we use the + sign:\n\ndatasummary(flipper_length_mm + body_mass_g ~ Mean + SD,\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n flipper_length_mm\n200.92\n14.06\n body_mass_g\n4201.75\n801.95\n \n \n \n\n\n\n\nSometimes, it can be cumbersome to list all variables separated by + signs. The All() function is a useful shortcut:\n\ndatasummary(All(penguins) ~ Mean + SD,\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n rownames\n172.50\n99.45\n bill_length_mm\n43.92\n5.46\n bill_depth_mm\n17.15\n1.97\n flipper_length_mm\n200.92\n14.06\n body_mass_g\n4201.75\n801.95\n year\n2008.03\n0.82\n \n \n \n\n\n\n\nBy default, All selects all numeric variables. This behavior can be changed by modifying the function’s arguments. 
See ?All for details.\n\n\n\ndatasummary can nest variables and statistics inside categorical variables using the * symbol. When applying the * operator to factor, character, or logical variables, columns or rows will automatically be nested. For instance, if we want to display separate means for each value of the variable sex, we use mean * sex:\n\ndatasummary(flipper_length_mm + body_mass_g ~ mean * sex,\n data = penguins)\n\n\n\n\n\n \n \n \n female\n male\n \n \n \n flipper_length_mm\n197.36\n204.51\n body_mass_g\n3862.27\n4545.68\n \n \n \n\n\n\n\nWe can use parentheses to nest several terms inside one another, using a call of this form: x * (y + z). Here is an example with nested columns:\n\ndatasummary(body_mass_g ~ sex * (mean + sd),\n data = penguins)\n\n\n\n\n\n \n \n \n \n female \n \n \n male \n \n \n \n mean\n sd\n mean \n sd \n \n \n \n body_mass_g\n3862.27\n666.17\n4545.68\n787.63\n \n \n \n\n\n\n\nHere is an example with nested rows:\n\ndatasummary(sex * (body_mass_g + flipper_length_mm) ~ mean + sd,\n data = penguins)\n\n\n\n\n\n \n \n sex\n \n mean\n sd\n \n \n \n female\nbody_mass_g\n3862.27\n666.17\n \nflipper_length_mm\n197.36\n12.50\n male\nbody_mass_g\n4545.68\n787.63\n \nflipper_length_mm\n204.51\n14.55\n \n \n \n\n\n\n\nThe order in which terms enter the formula determines the order in which labels are displayed. 
For example, this shows island above sex:\n\ndatasummary(flipper_length_mm + body_mass_g ~ mean * island * sex,\n data = penguins)\n\n\n\n\n\n \n \n \n \n Biscoe \n \n \n Dream \n \n \n Torgersen \n \n \n \n female\n male\n female \n male \n female \n male \n \n \n \n flipper_length_mm\n205.69\n213.29\n190.02\n196.31\n188.29\n194.91\n body_mass_g\n4319.38\n5104.52\n3446.31\n3987.10\n3395.83\n4034.78\n \n \n \n\n\n\n\nThis shows sex above island values:\n\ndatasummary(flipper_length_mm + body_mass_g ~ mean * sex * island,\n data = penguins)\n\n\n\n\n\n \n \n \n \n female \n \n \n male \n \n \n \n Biscoe\n Dream\n Torgersen\n Biscoe \n Dream \n Torgersen \n \n \n \n flipper_length_mm\n205.69\n190.02\n188.29\n213.29\n196.31\n194.91\n body_mass_g\n4319.38\n3446.31\n3395.83\n5104.52\n3987.10\n4034.78\n \n \n \n\n\n\n\nBy default, datasummary omits column headers with a single value/label across all columns, except for the header that sits just above the data. If the header we want to see is not displayed, we may want to reorder the terms of the formula. To show all headers, set sparse_header=FALSE:\n\ndatasummary(flipper_length_mm + body_mass_g ~ mean * sex * island,\n data = penguins,\n sparse_header = FALSE)\n\n\n\n\n\n \n \n \n \n female \n \n \n male \n \n \n \n Biscoe\n Dream\n Torgersen\n Biscoe \n Dream \n Torgersen \n \n \n \n flipper_length_mm\n205.69\n190.02\n188.29\n213.29\n196.31\n194.91\n body_mass_g\n4319.38\n3446.31\n3395.83\n5104.52\n3987.10\n4034.78\n \n \n \n\n\n\n\nWhen using sparse_header=FALSE, it is often useful to insert Heading() * in the table formula, in order to rename or omit some of the labels manually. Type ?tables::Heading for details and examples.\n\n\n\nPersonally, I prefer to rename variables and values before drawing my tables, using backticks when variable names include whitespace. 
For example,\n\ntmp <- penguins %>%\n select(`Flipper length (mm)` = flipper_length_mm,\n `Body mass (g)` = body_mass_g)\n\ndatasummary(`Flipper length (mm)` + `Body mass (g)` ~ Mean + SD,\n data = tmp)\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n Flipper length (mm)\n200.92\n14.06\n Body mass (g)\n4201.75\n801.95\n \n \n \n\n\n\n\nHowever, thanks to the tables package, datasummary offers two additional mechanisms to rename. First, we can wrap a term in parentheses and use the equal = sign: (NewName=OldName):\n\ndatasummary((`Flipper length (mm)` = flipper_length_mm) + (`Body mass (g)` = body_mass_g) ~\n island * ((Avg. = Mean) + (Std.Dev. = SD)),\n data = penguins)\n\n\n\n\n\n \n \n \n \n Biscoe \n \n \n Dream \n \n \n Torgersen \n \n \n \n Avg.\n Std.Dev.\n Avg. \n Std.Dev. \n Avg. \n Std.Dev. \n \n \n \n Flipper length (mm)\n209.71\n14.14\n193.07\n7.51\n191.20\n6.23\n Body mass (g)\n4716.02\n782.86\n3712.90\n416.64\n3706.37\n445.11\n \n \n \n\n\n\n\nSecond, we can use the Heading() function:\n\ndatasummary(Heading(\"Flipper length (mm)\") * flipper_length_mm + Heading(\"Body mass (g)\") * body_mass_g ~ island * (Mean + SD),\n data = penguins)\n\n\n\n\n\n \n \n \n \n Biscoe \n \n \n Dream \n \n \n Torgersen \n \n \n \n Mean\n SD\n Mean \n SD \n Mean \n SD \n \n \n \n Flipper length (mm)\n209.71\n14.14\n193.07\n7.51\n191.20\n6.23\n Body mass (g)\n4716.02\n782.86\n3712.90\n416.64\n3706.37\n445.11\n \n \n \n\n\n\n\nThe Heading function also has a nearData argument which can be useful in cases where some rows are nested but others are not. 
Compare the last row of these two tables:\n\ndatasummary(sex * (flipper_length_mm + bill_length_mm) + Heading(\"Body mass (g)\") * body_mass_g ~ Mean + SD,\n data = penguins)\n\n\n\n\n\n \n \n sex\n \n Mean\n SD\n \n \n \n female\nflipper_length_mm\n197.36\n12.50\n \nbill_length_mm\n42.10\n4.90\n male\nflipper_length_mm\n204.51\n14.55\n \nbill_length_mm\n45.85\n5.37\n \nBody mass (g)\n4201.75\n801.95\n \n \n \n\n\n\ndatasummary(sex * (flipper_length_mm + bill_length_mm) + Heading(\"Body mass (g)\", nearData=FALSE) * body_mass_g ~ Mean + SD,\n data = penguins)\n\n\n\n\n\n \n \n sex\n \n Mean\n SD\n \n \n \n female\nflipper_length_mm\n197.36\n12.50\n \nbill_length_mm\n42.10\n4.90\n male\nflipper_length_mm\n204.51\n14.55\n \nbill_length_mm\n45.85\n5.37\n Body mass (g)\n\n4201.75\n801.95\n \n \n \n\n\n\n\n\n\n\nThe tables package allows datasummary to use neat tricks to produce frequency tables:\n\nAdd a N to the right-hand side of the equation.\nAdd Percent() to the right-hand side to calculate the percentage of observations in each cell.\nAdd 1 to the left-hand side to include a row with the total number of observations:\n\n\ndatasummary(species * sex + 1 ~ N + Percent(),\n data = penguins)\n\n\n\n\n\n \n \n species\n sex\n N\n Percent\n \n \n \n Adelie\nfemale\n73\n21.22\n \nmale\n73\n21.22\n Chinstrap\nfemale\n34\n9.88\n \nmale\n34\n9.88\n Gentoo\nfemale\n58\n16.86\n \nmale\n61\n17.73\n \nAll\n344\n100.00\n \n \n \n\n\n\n\nNote that the Percent() function accepts a denom argument to determine if percentages should be calculated row or column-wise, or if they should take into account all cells.\n\n\n\nThe Percent() pseudo-function also accepts a fn argument, which must be a function which accepts two vectors: x is the values in the current cell, and y is all the values in the whole dataset. 
The default fn is:\n\ndatasummary(species * sex + 1 ~ N + Percent(fn = function(x, y) 100 * length(x) / length(y)),\n output = \"markdown\",\n data = penguins)\n\n\n\n\nspecies\nsex\nN\nPercent\n\n\n\n\nAdelie\nfemale\n73\n21.22\n\n\n\nmale\n73\n21.22\n\n\nChinstrap\nfemale\n34\n9.88\n\n\n\nmale\n34\n9.88\n\n\nGentoo\nfemale\n58\n16.86\n\n\n\nmale\n61\n17.73\n\n\n\nAll\n344\n100.00\n\n\n\n\n\nThe code above takes the number of elements in the cell, length(x), and divides it by the total number of elements, length(y).\nNow, let’s say we want to display percentages weighted by one of the variables of the dataset. This can often be useful with survey weights, for example. Here, we use an arbitrary column of weights called flipper_length_mm:\n\nwtpct <- function(x, y) sum(x, na.rm = TRUE) / sum(y, na.rm = TRUE) * 100\ndatasummary(species * sex + 1 ~ N + flipper_length_mm * Percent(fn = wtpct),\n output = \"markdown\",\n data = penguins)\n\n\n\n\nspecies\nsex\nN\nPercent\n\n\n\n\nAdelie\nfemale\n73\n19.95\n\n\n\nmale\n73\n20.44\n\n\nChinstrap\nfemale\n34\n9.49\n\n\n\nmale\n34\n9.89\n\n\nGentoo\nfemale\n58\n17.95\n\n\n\nmale\n61\n19.67\n\n\n\nAll\n344\n100.00\n\n\n\n\n\nIn each cell we now have the sum of weights in that cell, divided by the total sum of weights in the column.\n\n\n\nHere is another simple illustration of the Percent function mechanism in action, where we combine counts and percentages in a single label:\n\ndat <- mtcars\ndat$cyl <- as.factor(dat$cyl)\n\nfn <- function(x, y) {\n out <- sprintf(\n \"%s (%.1f%%)\",\n length(x),\n length(x) / length(y) * 100)\n}\ndatasummary(\n cyl ~ Percent(fn = fn),\n data = dat)\n\n\n\n\n\n \n \n cyl\n Percent\n \n \n \n 4\n11 (34.4%)\n 6\n7 (21.9%)\n 8\n14 (43.8%)\n \n \n \n\n\n\n\n\n\n\nThe * nesting operator that we used above works automatically for factor, character, and logical variables. Sometimes, it is convenient to use it with other types of variables, such as binary numeric variables. 
In that case, we can wrap the variable name inside a call to Factor(). This allows us to treat a variable as a factor, without having to modify it in the original data. For example, in the mtcars data, there is a binary numeric variable called am. We nest statistics within categories of am by typing:\n\ndatasummary(mpg + hp ~ Factor(am) * (mean + sd),\n data = mtcars)\n\n\n\n\n\n \n \n \n \n 0 \n \n \n 1 \n \n \n \n mean\n sd\n mean \n sd \n \n \n \n mpg\n17.15\n3.83\n24.39\n6.17\n hp\n160.26\n53.91\n126.85\n84.06\n \n \n \n\n\n\n\n\n\n\nWe can pass any argument to the summary function by including a call to Arguments(). For instance, there are missing values in the flipper_length_mm variable of the penguins dataset. Therefore, the standard mean function will produce no result, because its default argument is na.rm=FALSE. We can change that by calling:\n\ndatasummary(flipper_length_mm ~ mean + mean*Arguments(na.rm=TRUE),\n data = penguins)\n\n\n\n\n\n \n \n \n mean\n mean \n \n \n \n flipper_length_mm\n\n200.92\n \n \n \n\n\n\n\nNotice that there is an empty cell (NA) under the normal mean function, but that the mean call with Arguments(na.rm=TRUE) produced a numeric result.\nWe can pass the same arguments to multiple functions using the parentheses:\n\ndatasummary(flipper_length_mm ~ (mean + sd) * Arguments(na.rm=TRUE),\n data = penguins)\n\n\n\n\n\n \n \n \n mean\n sd\n \n \n \n flipper_length_mm\n200.92\n14.06\n \n \n \n\n\n\n\nNote that the shortcut functions that modelsummary supplies use na.rm=TRUE by default, so we can use them directly without arguments, even when there are missing values:\n\ndatasummary(flipper_length_mm ~ Mean + Var + P75 + Range,\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n Var\n P75\n Range\n \n \n \n flipper_length_mm\n200.92\n197.73\n213.00\n59.00\n \n \n \n\n\n\n\n\n\n\nYou can use the Arguments mechanism to do various things, such as calculating weighted means:\n\nnewdata <- data.frame(\n x = rnorm(20),\n w = rnorm(20),\n y = 
rnorm(20))\n\ndatasummary(x + y ~ weighted.mean * Arguments(w = w),\n data = newdata, output = \"markdown\")\n\n\n\n\n\nweighted.mean\n\n\n\n\nx\n-4.18\n\n\ny\n3.38\n\n\n\n\n\nWhich produces the same results as:\n\nweighted.mean(newdata$x, newdata$w)\n\n[1] -4.178942\n\nweighted.mean(newdata$y, newdata$w)\n\n[1] 3.378796\n\n\nBut different results from:\n\nmean(newdata$x)\n\n[1] -0.3440597\n\nmean(newdata$y)\n\n[1] 0.2460577\n\n\n\n\n\nSometimes, if we nest too much and the dataset is not large/diverse enough, we end up with empty cells. In that case, we add *DropEmpty() to the formula:\n\ndatasummary(island * species * body_mass_g ~ sex * (Mean + SD),\n data = penguins)\n\n\n\n\n\n \n \n island\n species\n \n \n female \n \n \n male \n \n \n \n Mean\n SD\n Mean \n SD \n \n \n \n Biscoe\nAdelie\nbody_mass_g\n3369.32\n343.47\n4050.00\n355.57\n \nChinstrap\nbody_mass_g\n\n\n\n\n \nGentoo\nbody_mass_g\n4679.74\n281.58\n5484.84\n313.16\n Dream\nAdelie\nbody_mass_g\n3344.44\n212.06\n4045.54\n330.55\n \nChinstrap\nbody_mass_g\n3527.21\n285.33\n3938.97\n362.14\n \nGentoo\nbody_mass_g\n\n\n\n\n Torgersen\nAdelie\nbody_mass_g\n3395.83\n259.14\n4034.78\n372.47\n \nChinstrap\nbody_mass_g\n\n\n\n\n \nGentoo\nbody_mass_g\n\n\n\n\n \n \n \n\n\n\ndatasummary(island * species * body_mass_g ~ sex * (Mean + SD) * DropEmpty(),\n data = penguins)\n\n\n\n\n\n \n \n island\n species\n \n \n female \n \n \n male \n \n \n \n Mean\n SD\n Mean \n SD \n \n \n \n Biscoe\nAdelie\nbody_mass_g\n3369.32\n343.47\n4050.00\n355.57\n \nGentoo\nbody_mass_g\n4679.74\n281.58\n5484.84\n313.16\n Dream\nAdelie\nbody_mass_g\n3344.44\n212.06\n4045.54\n330.55\n \nChinstrap\nbody_mass_g\n3527.21\n285.33\n3938.97\n362.14\n Torgersen\nAdelie\nbody_mass_g\n3395.83\n259.14\n4034.78\n372.47\n \n \n \n\n\n\n\n\n\n\nCool stuff is possible with logical subsets:\n\ndatasummary((bill_length_mm > 44.5) + (bill_length_mm <= 44.5) ~ Mean * flipper_length_mm,\n data = penguins)\n\n\n\n\n\n \n \n \n \n \n \n \n 
bill_length_mm > 44.5\n209.68\n bill_length_mm <= 44.5\n192.45\n \n \n \n\n\n\n\nSee the tables package documentation for more details and examples.\n\n\n\n\nAll functions in the datasummary_* family accept the same output argument. Tables can be saved to several file formats:\n\nf <- flipper_length_mm ~ island * (mean + sd)\ndatasummary(f, data = penguins, output = 'table.html')\ndatasummary(f, data = penguins, output = 'table.tex')\ndatasummary(f, data = penguins, output = 'table.docx')\ndatasummary(f, data = penguins, output = 'table.pptx')\ndatasummary(f, data = penguins, output = 'table.md')\ndatasummary(f, data = penguins, output = 'table.rtf')\ndatasummary(f, data = penguins, output = 'table.jpg')\ndatasummary(f, data = penguins, output = 'table.png')\n\nThey can be returned in human-readable data.frames, markdown, html, or LaTeX code to the console:\n\ndatasummary(f, data = penguins, output = 'data.frame')\ndatasummary(f, data = penguins, output = 'markdown')\ndatasummary(f, data = penguins, output = 'html')\ndatasummary(f, data = penguins, output = 'latex')\n\ndatasummary can return objects compatible with the gt, kableExtra, flextable, huxtable, and DT table creation and customization packages:\n\ndatasummary(f, data = penguins, output = 'gt')\ndatasummary(f, data = penguins, output = 'kableExtra')\ndatasummary(f, data = penguins, output = 'flextable')\ndatasummary(f, data = penguins, output = 'huxtable')\ndatasummary(f, data = penguins, output = 'DT')\n\nPlease note that hierarchical or “nested” column labels are only available for these output formats: kableExtra, gt, html, rtf, and LaTeX. When saving tables to other formats, nested labels will be combined to a “flat” header.\n\n\n\nThe fmt argument allows us to set the printing format of numeric values. It accepts a single number representing the number of digits after the period, or a string to be passed to the sprintf function. 
For instance, setting fmt=\"%.4f\" will keep 4 digits after the dot (see ?sprintf for more options):\n\ndatasummary(flipper_length_mm ~ Mean + SD,\n fmt = 4,\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n flipper_length_mm\n200.9152\n14.0617\n \n \n \n\n\n\n\nWe can set the formatting on a term-by-term basis by using the same Arguments function that we used to handle missing values in the previous section. The shortcut functions that ship with modelsummary (e.g., Mean, SD, Median, P25) all include a fmt argument:\n\ndatasummary(flipper_length_mm ~ Mean * Arguments(fmt = \"%.4f\") + SD * Arguments(fmt = \"%.1f\"),\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n flipper_length_mm\n200.9152\n14.1\n \n \n \n\n\n\n\nIf we do not want datasummary to format numbers, and we want to keep the numerical values instead of formatted strings, set fmt=NULL. This can be useful when post-processing tables with packages like gt, which allow us to transform cells based on their numerical content (this gt table will be omitted from PDF output):\n\nlibrary(gt)\n\ndatasummary(All(mtcars) ~ Mean + SD,\n data = mtcars,\n fmt = NULL,\n output = 'gt') %>%\n tab_style(style = cell_fill(color = \"pink\"),\n locations = cells_body(rows = Mean > 10, columns = 2))\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n mpg\n20.090625\n6.0269481\n cyl\n6.187500\n1.7859216\n disp\n230.721875\n123.9386938\n hp\n146.687500\n68.5628685\n drat\n3.596563\n0.5346787\n wt\n3.217250\n0.9784574\n qsec\n17.848750\n1.7869432\n vs\n0.437500\n0.5040161\n am\n0.406250\n0.4989909\n gear\n3.687500\n0.7378041\n carb\n2.812500\n1.6152000\n \n \n \n\n\n\n\nPlease note that the N() function is supplied by the upstream tables package, and does not have a fmt argument. 
Fortunately, it is easy to override the built-in function to use custom formatting:\n\ntmp <- data.frame(X = sample(letters[1:3], 1e6, replace = TRUE))\nN <- \\(x) format(length(x), big.mark = \",\")\ndatasummary(X ~ N, data = tmp)\n\n\n\n\n\n \n \n X\n N\n \n \n \n a\n333,736\n b\n333,200\n c\n333,064\n \n \n \n\n\n\n\n\n\n\ndatasummary includes the same title and notes arguments as in modelsummary:\n\ndatasummary(All(penguins) ~ Mean + SD,\n data = penguins,\n title = 'Statistics about the famous Palmer Penguins.',\n notes = c('A note at the bottom of the table.'))\n\n\n\n\n\n Statistics about the famous Palmer Penguins.\n \n \n \n Mean\n SD\n \n \n \n rownames\n172.50\n99.45\n bill_length_mm\n43.92\n5.46\n bill_depth_mm\n17.15\n1.97\n flipper_length_mm\n200.92\n14.06\n body_mass_g\n4201.75\n801.95\n year\n2008.03\n0.82\n \n \n \n A note at the bottom of the table.\n \n \n \n\n\n\n\n\n\n\nWe can align columns using the align argument. align should be a string of length equal to the number of columns, and which includes only the letters “l”, “c”, or “r”:\n\ndatasummary(flipper_length_mm + bill_length_mm ~ Mean + SD + Range,\n data = penguins,\n align = 'lrcl')\n\n\n\n\n\n \n \n \n Mean\n SD\n Range\n \n \n \n flipper_length_mm\n200.92\n14.06\n59.00\n bill_length_mm\n43.92\n5.46\n27.50\n \n \n \n\n\n\n\n\n\n\n\nnew_rows <- data.frame('Does',\n 2,\n 'plus',\n 2,\n 'equals',\n 5,\n '?')\ndatasummary(flipper_length_mm + body_mass_g ~ species * (Mean + SD),\n data = penguins,\n add_rows = new_rows)\n\n\n\n\n\n \n \n \n \n Adelie \n \n \n Chinstrap \n \n \n Gentoo \n \n \n \n Mean\n SD\n Mean \n SD \n Mean \n SD \n \n \n \n flipper_length_mm\n189.95\n6.54\n195.82\n7.13\n217.19\n6.48\n body_mass_g\n3700.66\n458.57\n3733.09\n384.34\n5076.02\n504.12\n Does\n2.00\nplus\n2.00\nequals\n5.00\n?\n \n \n \n\n\n\n\n\n\n\n\nnew_cols <- data.frame('New Stat' = runif(2))\ndatasummary(flipper_length_mm + body_mass_g ~ species * (Mean + SD),\n data = penguins,\n add_columns = 
new_cols)\n\n\n\n\n\n \n \n \n \n Adelie \n \n \n Chinstrap \n \n \n Gentoo \n \n New.Stat\n \n \n Mean\n SD\n Mean \n SD \n Mean \n SD \n \n \n \n flipper_length_mm\n189.95\n6.54\n195.82\n7.13\n217.19\n6.48\n0.38\n body_mass_g\n3700.66\n458.57\n3733.09\n384.34\n5076.02\n504.12\n0.88\n \n \n \n\n\n\n\n\n\n\nThe datasummary family of functions allow users to display in-line spark-style histograms to describe the distribution of the variables. For example, the datasummary_skim produces such a histogram:\n\ntmp <- mtcars[, c(\"mpg\", \"hp\")]\ndatasummary_skim(tmp)\n\n\n\n\n\n \n \n \n Unique (#)\n Missing (%)\n Mean\n SD\n Min\n Median\n Max\n \n \n \n \n mpg\n25\n0\n20.1\n6.0\n10.4\n19.2\n33.9\n \n hp\n22\n0\n146.7\n68.6\n52.0\n123.0\n335.0\n \n \n \n \n\n\n\n\nEach of the histograms in the table above is actually an SVG image, produced by the kableExtra package. For this reason, the histogram will not appear when users use a different output backend, such as gt, flextable, or huxtable.\nThe datasummary function is incredibly flexible, but it does not include a histogram option by default. Here is a simple example of how one can customize the output of datasummary. We proceed in 4 steps:\n\nNormalize the variables and store them in a list\nCreate the table with datasummary, making sure to include 2 “empty” columns. 
In the example, we use a simple function called emptycol to fill those columns with empty strings.\nAdd the histograms or boxplots using functions from the kableExtra package.\n\n\ntmp <- mtcars[, c(\"mpg\", \"hp\")]\n\n## create a list with individual variables\n## remove missing and rescale\ntmp_list <- lapply(tmp, na.omit)\ntmp_list <- lapply(tmp_list, scale)\n\n## create a table with `datasummary`\n## add a histogram with column_spec and spec_hist\n## add a boxplot with column_spec and spec_boxplot\nemptycol = function(x) \" \"\ndatasummary(mpg + hp ~ Mean + SD + Heading(\"Boxplot\") * emptycol + Heading(\"Histogram\") * emptycol,\n output = \"kableExtra\",\n data = tmp) %>%\n kableExtra::column_spec(column = 4, image = kableExtra::spec_boxplot(tmp_list)) %>%\n kableExtra::column_spec(column = 5, image = kableExtra::spec_hist(tmp_list))\n\nIf you want a simpler solution, you can try the Histogram function which works in datasummary automatically and comes bundled with modelsummary. The downside of this function is that it uses Unicode characters to create the histogram. This kind of histogram may not display well with certain typefaces or on some operating systems (Windows!).\n\ndatasummary(mpg + hp ~ Mean + SD + Histogram, data = tmp)\n\n\n\n\n\n \n \n \n Mean\n SD\n Histogram\n \n \n \n mpg\n20.09\n6.03\n▂▅▇▇▆▃▁▁▂▂\n hp\n146.69\n68.56\n▅▅▇▂▆▂▃▁▁\n \n \n \n\n\n\n\n\n\n\nAt least 3 distinct issues can arise related to missing values.\n\n\nAn empty cell can appear in the table when a statistical function returns NA instead of a numeric value. 
In those cases, you should:\n\nMake sure that your statistical function (e.g., mean or sd) uses na.rm=TRUE by default\nUse the Arguments strategy to set na.rm=TRUE (see the Arguments section of this vignette).\nUse a convenience function supplied by modelsummary, where na.rm is TRUE by default: Mean, SD, P25, etc.\n\n\n\n\nAn empty cell can appear in the table when a crosstab is deeply nested, and there are no observations for a given combination of covariates. In those cases, you can use the * DropEmpty pseudo-function. See the “Empty cells” section of this vignette for examples.\n\n\n\nBy default, the factor function in R does not assign a distinct factor level to missing values: the factor function’s exclude argument is set to NA by default. To ensure that NAs appear in your table, make sure you set exclude=NULL when you create the factor.\nInternally, the datasummary_balance and datasummary_crosstab functions convert logical and character variables to factor with the exclude=NULL argument. This means that NAs will appear in the table as distinct rows/columns. If you do not want NAs to appear in your table, convert them to factors yourself ahead of time. 
For example:\n\nmycars <- mtcars[, c(\"cyl\", \"mpg\", \"hp\", \"vs\")]\nmycars$cyl[c(1, 3, 6, 8)] <- NA\nmycars$cyl_nona <- factor(mycars$cyl)\nmycars$cyl_na <- factor(mycars$cyl, exclude = NULL)\n\ndatasummary_crosstab(cyl_nona ~ vs, data = mycars)\n\n\n\n\n\n \n \n cyl_nona\n \n 0\n 1\n All\n \n \n \n 4\nN\n1\n8\n9\n \n% row\n11.1\n88.9\n100.0\n 6\nN\n2\n3\n5\n \n% row\n40.0\n60.0\n100.0\n 8\nN\n14\n0\n14\n \n% row\n100.0\n0.0\n100.0\n All\nN\n18\n14\n32\n \n% row\n56.2\n43.8\n100.0\n \n \n \n\n\n\ndatasummary_crosstab(cyl_na ~ vs, data = mycars)\n\n\n\n\n\n \n \n cyl_na\n \n 0\n 1\n All\n \n \n \n 4\nN\n1\n8\n9\n \n% row\n11.1\n88.9\n100.0\n 6\nN\n2\n3\n5\n \n% row\n40.0\n60.0\n100.0\n 8\nN\n14\n0\n14\n \n% row\n100.0\n0.0\n100.0\n NA\nN\n1\n3\n4\n \n% row\n25.0\n75.0\n100.0\n All\nN\n18\n14\n32\n \n% row\n56.2\n43.8\n100.0\n \n \n \n\n\n\n\n\n\n\n\nIn the Appearance Vignette we saw how to customize the tables produced by the modelsummary function. The same customization possibilities are also available for all functions in the datasummary_* family of functions. Indeed, we can customize the tables produced by datasummary using the functions provided by gt, kableExtra, flextable, huxtable, and DT. 
For instance, to customize tables using kableExtra, we can do:\n\ndatasummary(All(penguins) ~ sex * (Mean + SD),\n data = penguins,\n output = 'kableExtra') %>%\n kableExtra::row_spec(3, background = 'cyan', color = 'red')\n\nTo customize a table using the gt package, we can:\n\nlibrary(gt)\n\nadelie <- function(x) web_image('https://user-images.githubusercontent.com/987057/85402702-20b1d280-b52a-11ea-9950-f3a03133fd45.png', height = 100)\ngentoo <- function(x) web_image('https://user-images.githubusercontent.com/987057/85402718-27404a00-b52a-11ea-9ad3-dd7562f6438d.png', height = 100)\nchinstrap <- function(x) web_image('https://user-images.githubusercontent.com/987057/85402708-23acc300-b52a-11ea-9a77-de360a0d1f7d.png', height = 100)\n\ncap <- 'Flipper lengths (mm) of the famous penguins of Palmer Station, Antarctica.'\nf <- (`Species` = species) ~ (` ` = flipper_length_mm) * (`Distribution` = Histogram) + flipper_length_mm * sex * ((`Avg.` = Mean)*Arguments(fmt='%.0f') + (`Std. Dev.` = SD)*Arguments(fmt='%.1f'))\ndatasummary(f,\n data = penguins,\n output = 'gt',\n title = cap,\n notes = 'Artwork by @allison_horst',\n sparse_header = TRUE) %>%\n text_transform(locations = cells_body(columns = 1, rows = 1), fn = adelie) %>%\n text_transform(locations = cells_body(columns = 1, rows = 2), fn = chinstrap) %>%\n text_transform(locations = cells_body(columns = 1, rows = 3), fn = gentoo) %>%\n tab_style(style = list(cell_text(color = \"#FF6700\", size = 'x-large')), locations = cells_body(rows = 1)) %>%\n tab_style(style = list(cell_text(color = \"#CD51D1\", size = 'x-large')), locations = cells_body(rows = 2)) %>%\n tab_style(style = list(cell_text(color = \"#007377\", size = 'x-large')), locations = cells_body(rows = 3)) %>%\n tab_options(table_body.hlines.width = 0, table.border.top.width = 0, table.border.bottom.width = 0) %>%\n cols_align('center', columns = 3:6)\n\n\n\n\n\n Flipper lengths (mm) of the famous penguins of Palmer Station, Antarctica.\n \n \n Species\n 
Distribution\n \n Female \n \n \n Male \n \n \n \n Avg.\n Std. Dev.\n Avg. \n Std. Dev. \n \n \n \n \n▁▃▆▇▃▅▂\n188\n5.6\n192\n6.6\n \n▁▄▃▇▆▄▃▁▂\n192\n5.8\n200\n6.0\n \n▂▅▅▇▃▅▂▁▃\n213\n3.9\n222\n5.7\n \n \n \n Artwork by @allison_horst" }, { "objectID": "vignettes/datasummary.html#background", @@ -382,7 +382,7 @@ "href": "vignettes/datasummary.html#datasummary", "title": "Data Summaries", "section": "", - "text": "datasummary tables are specified using a 2-sided formula, divided by a tilde ~. The left-hand side describes the rows; the right-hand side describes the columns. To illustrate how this works, we will again be using the palmerpenguins dataset:\nTo display the flipper_length_mm variable as a row and the mean as a column, we type:\n\ndatasummary(flipper_length_mm ~ Mean,\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n \n \n \n flipper_length_mm\n200.92\n \n \n \n\n\n\n\nTo flip rows and columns, we flip the left and right-hand sides of the formula:\n\ndatasummary(Mean ~ flipper_length_mm,\n data = penguins)\n\n\n\n\n\n \n \n \n flipper_length_mm\n \n \n \n Mean\n200.92\n \n \n \n\n\n\n\n\n\nThe Mean function is a shortcut supplied by modelsummary, and it is equivalent to mean(x,na.rm=TRUE). Since the flipper_length_mm variable includes missing observation, using the mean formula (with default na.rm=FALSE) would produce a missing/empty cell:\n\ndatasummary(flipper_length_mm ~ mean,\n data = penguins)\n\n\n\n\n\n \n \n \n mean\n \n \n \n flipper_length_mm\n\n \n \n \n\n\n\n\nmodelsummary supplies these functions: Mean, SD, Min, Max, Median, P0, P25, P50, P75, P100, Histogram, and a few more (see the package documentation).\nUsers are also free to create and use their own custom summaries. Any R function which takes a vector and produces a single value is acceptable. 
For example, the Range functions return a numerical value, and the MinMax returns a string:\n\nRange <- function(x) max(x, na.rm = TRUE) - min(x, na.rm = TRUE)\n\ndatasummary(flipper_length_mm ~ Range,\n data = penguins)\n\n\n\n\n\n \n \n \n Range\n \n \n \n flipper_length_mm\n59.00\n \n \n \n\n\n\nMinMax <- function(x) paste0('[', min(x, na.rm = TRUE), ', ', max(x, na.rm = TRUE), ']')\n\ndatasummary(flipper_length_mm ~ MinMax,\n data = penguins)\n\n\n\n\n\n \n \n \n MinMax\n \n \n \n flipper_length_mm\n[172, 231]\n \n \n \n\n\n\n\n\n\n\nTo include more rows and columns, we use the + sign:\n\ndatasummary(flipper_length_mm + body_mass_g ~ Mean + SD,\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n flipper_length_mm\n200.92\n14.06\n body_mass_g\n4201.75\n801.95\n \n \n \n\n\n\n\nSometimes, it can be cumbersome to list all variables separated by + signs. The All() function is a useful shortcut:\n\ndatasummary(All(penguins) ~ Mean + SD,\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n rownames\n172.50\n99.45\n bill_length_mm\n43.92\n5.46\n bill_depth_mm\n17.15\n1.97\n flipper_length_mm\n200.92\n14.06\n body_mass_g\n4201.75\n801.95\n year\n2008.03\n0.82\n \n \n \n\n\n\n\nBy default, All selects all numeric variables. This behavior can be changed by modifying the function’s arguments. See ?All for details.\n\n\n\ndatasummary can nest variables and statistics inside categorical variables using the * symbol. When applying the the * operator to factor, character, or logical variables, columns or rows will automatically be nested. For instance, if we want to display separate means for each value of the variable sex, we use mean * sex:\n\ndatasummary(flipper_length_mm + body_mass_g ~ mean * sex,\n data = penguins)\n\n\n\n\n\n \n \n \n female\n male\n \n \n \n flipper_length_mm\n197.36\n204.51\n body_mass_g\n3862.27\n4545.68\n \n \n \n\n\n\n\nWe can use parentheses to nest several terms inside one another, using a call of this form: x * (y + z). 
Here is an example with nested columns:\n\ndatasummary(body_mass_g ~ sex * (mean + sd),\n data = penguins)\n\n\n\n\n\n \n \n \n \n female \n \n \n male \n \n \n \n mean\n sd\n mean \n sd \n \n \n \n body_mass_g\n3862.27\n666.17\n4545.68\n787.63\n \n \n \n\n\n\n\nHere is an example with nested rows:\n\ndatasummary(sex * (body_mass_g + flipper_length_mm) ~ mean + sd,\n data = penguins)\n\n\n\n\n\n \n \n sex\n \n mean\n sd\n \n \n \n female\nbody_mass_g\n3862.27\n666.17\n \nflipper_length_mm\n197.36\n12.50\n male\nbody_mass_g\n4545.68\n787.63\n \nflipper_length_mm\n204.51\n14.55\n \n \n \n\n\n\n\nThe order in which terms enter the formula determines the order in which labels are displayed. For example, this shows island above sex:\n\ndatasummary(flipper_length_mm + body_mass_g ~ mean * island * sex,\n data = penguins)\n\n\n\n\n\n \n \n \n \n Biscoe \n \n \n Dream \n \n \n Torgersen \n \n \n \n female\n male\n female \n male \n female \n male \n \n \n \n flipper_length_mm\n205.69\n213.29\n190.02\n196.31\n188.29\n194.91\n body_mass_g\n4319.38\n5104.52\n3446.31\n3987.10\n3395.83\n4034.78\n \n \n \n\n\n\n\nThis shows sex above island values:\n\ndatasummary(flipper_length_mm + body_mass_g ~ mean * sex * island,\n data = penguins)\n\n\n\n\n\n \n \n \n \n female \n \n \n male \n \n \n \n Biscoe\n Dream\n Torgersen\n Biscoe \n Dream \n Torgersen \n \n \n \n flipper_length_mm\n205.69\n190.02\n188.29\n213.29\n196.31\n194.91\n body_mass_g\n4319.38\n3446.31\n3395.83\n5104.52\n3987.10\n4034.78\n \n \n \n\n\n\n\nBy default, datasummary omits column headers with a single value/label across all columns, except for the header that sits just above the data. If the header we want to see is not displayed, we may want to reorder the terms of the formula. 
To show all headers, set sparse_header=FALSE:\n\ndatasummary(flipper_length_mm + body_mass_g ~ mean * sex * island,\n data = penguins,\n sparse_header = FALSE)\n\n\n\n\n\n \n \n \n \n female \n \n \n male \n \n \n \n Biscoe\n Dream\n Torgersen\n Biscoe \n Dream \n Torgersen \n \n \n \n flipper_length_mm\n205.69\n190.02\n188.29\n213.29\n196.31\n194.91\n body_mass_g\n4319.38\n3446.31\n3395.83\n5104.52\n3987.10\n4034.78\n \n \n \n\n\n\n\nWhen using sparse_header=FALSE, it is often useful to insert Heading() * in the table formula, in order to rename or omit some of the labels manually. Type ?tables::Heading for details and examples.\n\n\n\nPersonally, I prefer to rename variables and values before drawing my tables, using backticks when variable names include whitespace. For example,\n\ntmp <- penguins %>%\n select(`Flipper length (mm)` = flipper_length_mm,\n `Body mass (g)` = body_mass_g)\n\ndatasummary(`Flipper length (mm)` + `Body mass (g)` ~ Mean + SD,\n data = tmp)\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n Flipper length (mm)\n200.92\n14.06\n Body mass (g)\n4201.75\n801.95\n \n \n \n\n\n\n\nHowever, thanks to the tables package, datasummary offers two additional mechanisms to rename. First, we can wrap a term in parentheses and use the equal = sign: (NewName=OldName):\n\ndatasummary((`Flipper length (mm)` = flipper_length_mm) + (`Body mass (g)` = body_mass_g) ~\n island * ((Avg. = Mean) + (Std.Dev. = SD)),\n data = penguins)\n\n\n\n\n\n \n \n \n \n Biscoe \n \n \n Dream \n \n \n Torgersen \n \n \n \n Avg.\n Std.Dev.\n Avg. \n Std.Dev. \n Avg. \n Std.Dev. 
\n \n \n \n Flipper length (mm)\n209.71\n14.14\n193.07\n7.51\n191.20\n6.23\n Body mass (g)\n4716.02\n782.86\n3712.90\n416.64\n3706.37\n445.11\n \n \n \n\n\n\n\nSecond, we can use the Heading() function:\n\ndatasummary(Heading(\"Flipper length (mm)\") * flipper_length_mm + Heading(\"Body mass (g)\") * body_mass_g ~ island * (Mean + SD),\n data = penguins)\n\n\n\n\n\n \n \n \n \n Biscoe \n \n \n Dream \n \n \n Torgersen \n \n \n \n Mean\n SD\n Mean \n SD \n Mean \n SD \n \n \n \n Flipper length (mm)\n209.71\n14.14\n193.07\n7.51\n191.20\n6.23\n Body mass (g)\n4716.02\n782.86\n3712.90\n416.64\n3706.37\n445.11\n \n \n \n\n\n\n\nThe Heading function also has a nearData argument which can be useful in cases where some rows are nested but others are not. Compare the last row of these two tables:\n\ndatasummary(sex * (flipper_length_mm + bill_length_mm) + Heading(\"Body mass (g)\") * body_mass_g ~ Mean + SD,\n data = penguins)\n\n\n\n\n\n \n \n sex\n \n Mean\n SD\n \n \n \n female\nflipper_length_mm\n197.36\n12.50\n \nbill_length_mm\n42.10\n4.90\n male\nflipper_length_mm\n204.51\n14.55\n \nbill_length_mm\n45.85\n5.37\n \nBody mass (g)\n4201.75\n801.95\n \n \n \n\n\n\ndatasummary(sex * (flipper_length_mm + bill_length_mm) + Heading(\"Body mass (g)\", nearData=FALSE) * body_mass_g ~ Mean + SD,\n data = penguins)\n\n\n\n\n\n \n \n sex\n \n Mean\n SD\n \n \n \n female\nflipper_length_mm\n197.36\n12.50\n \nbill_length_mm\n42.10\n4.90\n male\nflipper_length_mm\n204.51\n14.55\n \nbill_length_mm\n45.85\n5.37\n Body mass (g)\n\n4201.75\n801.95\n \n \n \n\n\n\n\n\n\n\nThe tables package allows datasummary to use neat tricks to produce frequency tables:\n\nAdd a N to the right-hand side of the equation.\nAdd Percent() to the right-hand side to calculate the percentage of observations in each cell.\nAdd 1 to the left-hand side to include a row with the total number of observations:\n\n\ndatasummary(species * sex + 1 ~ N + Percent(),\n data = penguins)\n\n\n\n\n\n \n \n species\n sex\n 
N\n Percent\n \n \n \n Adelie\nfemale\n73\n21.22\n \nmale\n73\n21.22\n Chinstrap\nfemale\n34\n9.88\n \nmale\n34\n9.88\n Gentoo\nfemale\n58\n16.86\n \nmale\n61\n17.73\n \nAll\n344\n100.00\n \n \n \n\n\n\n\nNote that the Percent() function accepts a denom argument to determine if percentages should be calculated row or column-wise, or if they should take into account all cells.\n\n\n\nThe Percent() pseudo-function also accepts a fn argument, which must be a function which accepts two vectors: x is the values in the current cell, and y is all the values in the whole dataset. The default fn is:\n\ndatasummary(species * sex + 1 ~ N + Percent(fn = function(x, y) 100 * length(x) / length(y)),\n output = \"markdown\",\n data = penguins)\n\n\n\n\nspecies\nsex\nN\nPercent\n\n\n\n\nAdelie\nfemale\n73\n21.22\n\n\n\nmale\n73\n21.22\n\n\nChinstrap\nfemale\n34\n9.88\n\n\n\nmale\n34\n9.88\n\n\nGentoo\nfemale\n58\n16.86\n\n\n\nmale\n61\n17.73\n\n\n\nAll\n344\n100.00\n\n\n\n\n\nThe code above takes the number of elements in the cell length(x) and divides it by the number of total elements length(y).\nNow, let’s say we want to display percentages weighted by one of the variables of the dataset. This can often be useful with survey weights, for example. 
Here, we use an arbitrary column of weights called flipper_length_mm:\n\nwtpct <- function(x, y) sum(x, na.rm = TRUE) / sum(y, na.rm = TRUE) * 100\ndatasummary(species * sex + 1 ~ N + flipper_length_mm * Percent(fn = wtpct),\n output = \"markdown\",\n data = penguins)\n\n\n\n\nspecies\nsex\nN\nPercent\n\n\n\n\nAdelie\nfemale\n73\n19.95\n\n\n\nmale\n73\n20.44\n\n\nChinstrap\nfemale\n34\n9.49\n\n\n\nmale\n34\n9.89\n\n\nGentoo\nfemale\n58\n17.95\n\n\n\nmale\n61\n19.67\n\n\n\nAll\n344\n100.00\n\n\n\n\n\nIn each cell we now have the sum of weights in that cell, divided by the total sum of weights in the column.\n\n\n\nHere is another simple illustration of Percent function mechanism in action, where we combine counts and percentages in a simple nice label:\n\ndat <- mtcars\ndat$cyl <- as.factor(dat$cyl)\n\nfn <- function(x, y) {\n out <- sprintf(\n \"%s (%.1f%%)\",\n length(x),\n length(x) / length(y) * 100)\n}\ndatasummary(\n cyl ~ Percent(fn = fn),\n data = dat)\n\n\n\n\n\n \n \n cyl\n Percent\n \n \n \n 4\n11 (34.4%)\n 6\n7 (21.9%)\n 8\n14 (43.8%)\n \n \n \n\n\n\n\n\n\n\nThe * nesting operator that we used above works automatically for factor, character, and logical variables. Sometimes, it is convenient to use it with other types of variables, such as binary numeric variables. In that case, we can wrap the variable name inside a call to Factor(). This allows us to treat a variable as factor, without having to modify it in the original data. For example, in the mtcars data, there is a binary numeric variable call am. We nest statistics within categories of am by typing:\n\ndatasummary(mpg + hp ~ Factor(am) * (mean + sd),\n data = mtcars)\n\n\n\n\n\n \n \n \n \n 0 \n \n \n 1 \n \n \n \n mean\n sd\n mean \n sd \n \n \n \n mpg\n17.15\n3.83\n24.39\n6.17\n hp\n160.26\n53.91\n126.85\n84.06\n \n \n \n\n\n\n\n\n\n\nWe can pass any argument to the summary function by including a call to Arguments(). 
For instance, there are missing values in the flipper_length_mm variable of the penguins dataset. Therefore, the standard mean function will produce no result, because its default argument is na.rm=FALSE. We can change that by calling:\n\ndatasummary(flipper_length_mm ~ mean + mean*Arguments(na.rm=TRUE),\n data = penguins)\n\n\n\n\n\n \n \n \n mean\n mean \n \n \n \n flipper_length_mm\n\n200.92\n \n \n \n\n\n\n\nNotice that there is an empty cell (NA) under the normal mean function, but that the mean call with Arguments(na.rm=TRUE) produced a numeric result.\nWe can pass the same arguments to multiple functions using the parentheses:\n\ndatasummary(flipper_length_mm ~ (mean + sd) * Arguments(na.rm=TRUE),\n data = penguins)\n\n\n\n\n\n \n \n \n mean\n sd\n \n \n \n flipper_length_mm\n200.92\n14.06\n \n \n \n\n\n\n\nNote that the shortcut functions that modelsummary supplies use na.rm=TRUE by default, so we can use them directly without arguments, even when there are missing values:\n\ndatasummary(flipper_length_mm ~ Mean + Var + P75 + Range,\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n Var\n P75\n Range\n \n \n \n flipper_length_mm\n200.92\n197.73\n213.00\n59.00\n \n \n \n\n\n\n\n\n\n\nYou can use the Arguments mechanism to do various things, such as calculating weighted means:\n\nnewdata <- data.frame(\n x = rnorm(20),\n w = rnorm(20),\n y = rnorm(20))\n\ndatasummary(x + y ~ weighted.mean * Arguments(w = w),\n data = newdata, output = \"markdown\")\n\n\n\n\n\nweighted.mean\n\n\n\n\nx\n1.05\n\n\ny\n-3.82\n\n\n\n\n\nWhich produces the same results as:\n\nweighted.mean(newdata$x, newdata$w)\n\n[1] 1.051561\n\nweighted.mean(newdata$y, newdata$w)\n\n[1] -3.816827\n\n\nBut different results from:\n\nmean(newdata$x)\n\n[1] 0.1759387\n\nmean(newdata$y)\n\n[1] 0.1215528\n\n\n\n\n\nSometimes, if we nest too much and the dataset is not large/diverse enough, we end up with empty cells. 
In that case, we add *DropEmpty() to the formula:\n\ndatasummary(island * species * body_mass_g ~ sex * (Mean + SD),\n data = penguins)\n\n\n\n\n\n \n \n island\n species\n \n \n female \n \n \n male \n \n \n \n Mean\n SD\n Mean \n SD \n \n \n \n Biscoe\nAdelie\nbody_mass_g\n3369.32\n343.47\n4050.00\n355.57\n \nChinstrap\nbody_mass_g\n\n\n\n\n \nGentoo\nbody_mass_g\n4679.74\n281.58\n5484.84\n313.16\n Dream\nAdelie\nbody_mass_g\n3344.44\n212.06\n4045.54\n330.55\n \nChinstrap\nbody_mass_g\n3527.21\n285.33\n3938.97\n362.14\n \nGentoo\nbody_mass_g\n\n\n\n\n Torgersen\nAdelie\nbody_mass_g\n3395.83\n259.14\n4034.78\n372.47\n \nChinstrap\nbody_mass_g\n\n\n\n\n \nGentoo\nbody_mass_g\n\n\n\n\n \n \n \n\n\n\ndatasummary(island * species * body_mass_g ~ sex * (Mean + SD) * DropEmpty(),\n data = penguins)\n\n\n\n\n\n \n \n island\n species\n \n \n female \n \n \n male \n \n \n \n Mean\n SD\n Mean \n SD \n \n \n \n Biscoe\nAdelie\nbody_mass_g\n3369.32\n343.47\n4050.00\n355.57\n \nGentoo\nbody_mass_g\n4679.74\n281.58\n5484.84\n313.16\n Dream\nAdelie\nbody_mass_g\n3344.44\n212.06\n4045.54\n330.55\n \nChinstrap\nbody_mass_g\n3527.21\n285.33\n3938.97\n362.14\n Torgersen\nAdelie\nbody_mass_g\n3395.83\n259.14\n4034.78\n372.47\n \n \n \n\n\n\n\n\n\n\nCool stuff is possible with logical subsets:\n\ndatasummary((bill_length_mm > 44.5) + (bill_length_mm <= 44.5) ~ Mean * flipper_length_mm,\n data = penguins)\n\n\n\n\n\n \n \n \n \n \n \n \n bill_length_mm > 44.5\n209.68\n bill_length_mm <= 44.5\n192.45\n \n \n \n\n\n\n\nSee the tables package documentation for more details and examples." + "text": "datasummary tables are specified using a 2-sided formula, divided by a tilde ~. The left-hand side describes the rows; the right-hand side describes the columns. 
To illustrate how this works, we will again use the palmerpenguins dataset:\nTo display the flipper_length_mm variable as a row and the mean as a column, we type:\n\ndatasummary(flipper_length_mm ~ Mean,\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n \n \n \n flipper_length_mm\n200.92\n \n \n \n\n\n\n\nTo flip rows and columns, we flip the left and right-hand sides of the formula:\n\ndatasummary(Mean ~ flipper_length_mm,\n data = penguins)\n\n\n\n\n\n \n \n \n flipper_length_mm\n \n \n \n Mean\n200.92\n \n \n \n\n\n\n\n\n\nThe Mean function is a shortcut supplied by modelsummary, and it is equivalent to mean(x, na.rm = TRUE). Since the flipper_length_mm variable includes missing observations, using the base mean function (with its default na.rm=FALSE) would produce a missing/empty cell:\n\ndatasummary(flipper_length_mm ~ mean,\n data = penguins)\n\n\n\n\n\n \n \n \n mean\n \n \n \n flipper_length_mm\n\n \n \n \n\n\n\n\nmodelsummary supplies these functions: Mean, SD, Min, Max, Median, P0, P25, P50, P75, P100, Histogram, and a few more (see the package documentation).\nUsers are also free to create and use their own custom summaries. Any R function which takes a vector and produces a single value is acceptable. 
For example, the Range function returns a numerical value, and the MinMax function returns a string:\n\nRange <- function(x) max(x, na.rm = TRUE) - min(x, na.rm = TRUE)\n\ndatasummary(flipper_length_mm ~ Range,\n data = penguins)\n\n\n\n\n\n \n \n \n Range\n \n \n \n flipper_length_mm\n59.00\n \n \n \n\n\n\nMinMax <- function(x) paste0('[', min(x, na.rm = TRUE), ', ', max(x, na.rm = TRUE), ']')\n\ndatasummary(flipper_length_mm ~ MinMax,\n data = penguins)\n\n\n\n\n\n \n \n \n MinMax\n \n \n \n flipper_length_mm\n[172, 231]\n \n \n \n\n\n\n\n\n\n\nTo include more rows and columns, we use the + sign:\n\ndatasummary(flipper_length_mm + body_mass_g ~ Mean + SD,\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n flipper_length_mm\n200.92\n14.06\n body_mass_g\n4201.75\n801.95\n \n \n \n\n\n\n\nSometimes, it can be cumbersome to list all variables separated by + signs. The All() function is a useful shortcut:\n\ndatasummary(All(penguins) ~ Mean + SD,\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n rownames\n172.50\n99.45\n bill_length_mm\n43.92\n5.46\n bill_depth_mm\n17.15\n1.97\n flipper_length_mm\n200.92\n14.06\n body_mass_g\n4201.75\n801.95\n year\n2008.03\n0.82\n \n \n \n\n\n\n\nBy default, All selects all numeric variables. This behavior can be changed by modifying the function’s arguments. See ?All for details.\n\n\n\ndatasummary can nest variables and statistics inside categorical variables using the * symbol. When applying the * operator to factor, character, or logical variables, columns or rows will automatically be nested. For instance, if we want to display separate means for each value of the variable sex, we use mean * sex:\n\ndatasummary(flipper_length_mm + body_mass_g ~ mean * sex,\n data = penguins)\n\n\n\n\n\n \n \n \n female\n male\n \n \n \n flipper_length_mm\n197.36\n204.51\n body_mass_g\n3862.27\n4545.68\n \n \n \n\n\n\n\nWe can use parentheses to nest several terms inside one another, using a call of this form: x * (y + z). 
Here is an example with nested columns:\n\ndatasummary(body_mass_g ~ sex * (mean + sd),\n data = penguins)\n\n\n\n\n\n \n \n \n \n female \n \n \n male \n \n \n \n mean\n sd\n mean \n sd \n \n \n \n body_mass_g\n3862.27\n666.17\n4545.68\n787.63\n \n \n \n\n\n\n\nHere is an example with nested rows:\n\ndatasummary(sex * (body_mass_g + flipper_length_mm) ~ mean + sd,\n data = penguins)\n\n\n\n\n\n \n \n sex\n \n mean\n sd\n \n \n \n female\nbody_mass_g\n3862.27\n666.17\n \nflipper_length_mm\n197.36\n12.50\n male\nbody_mass_g\n4545.68\n787.63\n \nflipper_length_mm\n204.51\n14.55\n \n \n \n\n\n\n\nThe order in which terms enter the formula determines the order in which labels are displayed. For example, this shows island above sex:\n\ndatasummary(flipper_length_mm + body_mass_g ~ mean * island * sex,\n data = penguins)\n\n\n\n\n\n \n \n \n \n Biscoe \n \n \n Dream \n \n \n Torgersen \n \n \n \n female\n male\n female \n male \n female \n male \n \n \n \n flipper_length_mm\n205.69\n213.29\n190.02\n196.31\n188.29\n194.91\n body_mass_g\n4319.38\n5104.52\n3446.31\n3987.10\n3395.83\n4034.78\n \n \n \n\n\n\n\nThis shows sex above island values:\n\ndatasummary(flipper_length_mm + body_mass_g ~ mean * sex * island,\n data = penguins)\n\n\n\n\n\n \n \n \n \n female \n \n \n male \n \n \n \n Biscoe\n Dream\n Torgersen\n Biscoe \n Dream \n Torgersen \n \n \n \n flipper_length_mm\n205.69\n190.02\n188.29\n213.29\n196.31\n194.91\n body_mass_g\n4319.38\n3446.31\n3395.83\n5104.52\n3987.10\n4034.78\n \n \n \n\n\n\n\nBy default, datasummary omits column headers with a single value/label across all columns, except for the header that sits just above the data. If the header we want to see is not displayed, we may want to reorder the terms of the formula. 
To show all headers, set sparse_header=FALSE:\n\ndatasummary(flipper_length_mm + body_mass_g ~ mean * sex * island,\n data = penguins,\n sparse_header = FALSE)\n\n\n\n\n\n \n \n \n \n female \n \n \n male \n \n \n \n Biscoe\n Dream\n Torgersen\n Biscoe \n Dream \n Torgersen \n \n \n \n flipper_length_mm\n205.69\n190.02\n188.29\n213.29\n196.31\n194.91\n body_mass_g\n4319.38\n3446.31\n3395.83\n5104.52\n3987.10\n4034.78\n \n \n \n\n\n\n\nWhen using sparse_header=FALSE, it is often useful to insert Heading() * in the table formula, in order to rename or omit some of the labels manually. Type ?tables::Heading for details and examples.\n\n\n\nPersonally, I prefer to rename variables and values before drawing my tables, using backticks when variable names include whitespace. For example,\n\ntmp <- penguins %>%\n select(`Flipper length (mm)` = flipper_length_mm,\n `Body mass (g)` = body_mass_g)\n\ndatasummary(`Flipper length (mm)` + `Body mass (g)` ~ Mean + SD,\n data = tmp)\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n Flipper length (mm)\n200.92\n14.06\n Body mass (g)\n4201.75\n801.95\n \n \n \n\n\n\n\nHowever, thanks to the tables package, datasummary offers two additional mechanisms to rename. First, we can wrap a term in parentheses and use the equal = sign: (NewName=OldName):\n\ndatasummary((`Flipper length (mm)` = flipper_length_mm) + (`Body mass (g)` = body_mass_g) ~\n island * ((Avg. = Mean) + (Std.Dev. = SD)),\n data = penguins)\n\n\n\n\n\n \n \n \n \n Biscoe \n \n \n Dream \n \n \n Torgersen \n \n \n \n Avg.\n Std.Dev.\n Avg. \n Std.Dev. \n Avg. \n Std.Dev. 
\n \n \n \n Flipper length (mm)\n209.71\n14.14\n193.07\n7.51\n191.20\n6.23\n Body mass (g)\n4716.02\n782.86\n3712.90\n416.64\n3706.37\n445.11\n \n \n \n\n\n\n\nSecond, we can use the Heading() function:\n\ndatasummary(Heading(\"Flipper length (mm)\") * flipper_length_mm + Heading(\"Body mass (g)\") * body_mass_g ~ island * (Mean + SD),\n data = penguins)\n\n\n\n\n\n \n \n \n \n Biscoe \n \n \n Dream \n \n \n Torgersen \n \n \n \n Mean\n SD\n Mean \n SD \n Mean \n SD \n \n \n \n Flipper length (mm)\n209.71\n14.14\n193.07\n7.51\n191.20\n6.23\n Body mass (g)\n4716.02\n782.86\n3712.90\n416.64\n3706.37\n445.11\n \n \n \n\n\n\n\nThe Heading function also has a nearData argument which can be useful in cases where some rows are nested but others are not. Compare the last row of these two tables:\n\ndatasummary(sex * (flipper_length_mm + bill_length_mm) + Heading(\"Body mass (g)\") * body_mass_g ~ Mean + SD,\n data = penguins)\n\n\n\n\n\n \n \n sex\n \n Mean\n SD\n \n \n \n female\nflipper_length_mm\n197.36\n12.50\n \nbill_length_mm\n42.10\n4.90\n male\nflipper_length_mm\n204.51\n14.55\n \nbill_length_mm\n45.85\n5.37\n \nBody mass (g)\n4201.75\n801.95\n \n \n \n\n\n\ndatasummary(sex * (flipper_length_mm + bill_length_mm) + Heading(\"Body mass (g)\", nearData=FALSE) * body_mass_g ~ Mean + SD,\n data = penguins)\n\n\n\n\n\n \n \n sex\n \n Mean\n SD\n \n \n \n female\nflipper_length_mm\n197.36\n12.50\n \nbill_length_mm\n42.10\n4.90\n male\nflipper_length_mm\n204.51\n14.55\n \nbill_length_mm\n45.85\n5.37\n Body mass (g)\n\n4201.75\n801.95\n \n \n \n\n\n\n\n\n\n\nThe tables package allows datasummary to use neat tricks to produce frequency tables:\n\nAdd a N to the right-hand side of the equation.\nAdd Percent() to the right-hand side to calculate the percentage of observations in each cell.\nAdd 1 to the left-hand side to include a row with the total number of observations:\n\n\ndatasummary(species * sex + 1 ~ N + Percent(),\n data = penguins)\n\n\n\n\n\n \n \n species\n sex\n 
N\n Percent\n \n \n \n Adelie\nfemale\n73\n21.22\n \nmale\n73\n21.22\n Chinstrap\nfemale\n34\n9.88\n \nmale\n34\n9.88\n Gentoo\nfemale\n58\n16.86\n \nmale\n61\n17.73\n \nAll\n344\n100.00\n \n \n \n\n\n\n\nNote that the Percent() function accepts a denom argument to determine if percentages should be calculated row or column-wise, or if they should take into account all cells.\n\n\n\nThe Percent() pseudo-function also accepts a fn argument, which must be a function which accepts two vectors: x is the values in the current cell, and y is all the values in the whole dataset. The default fn is:\n\ndatasummary(species * sex + 1 ~ N + Percent(fn = function(x, y) 100 * length(x) / length(y)),\n output = \"markdown\",\n data = penguins)\n\n\n\n\nspecies\nsex\nN\nPercent\n\n\n\n\nAdelie\nfemale\n73\n21.22\n\n\n\nmale\n73\n21.22\n\n\nChinstrap\nfemale\n34\n9.88\n\n\n\nmale\n34\n9.88\n\n\nGentoo\nfemale\n58\n16.86\n\n\n\nmale\n61\n17.73\n\n\n\nAll\n344\n100.00\n\n\n\n\n\nThe code above takes the number of elements in the cell length(x) and divides it by the number of total elements length(y).\nNow, let’s say we want to display percentages weighted by one of the variables of the dataset. This can often be useful with survey weights, for example. 
Here, we use an arbitrary column of weights called flipper_length_mm:\n\nwtpct <- function(x, y) sum(x, na.rm = TRUE) / sum(y, na.rm = TRUE) * 100\ndatasummary(species * sex + 1 ~ N + flipper_length_mm * Percent(fn = wtpct),\n output = \"markdown\",\n data = penguins)\n\n\n\n\nspecies\nsex\nN\nPercent\n\n\n\n\nAdelie\nfemale\n73\n19.95\n\n\n\nmale\n73\n20.44\n\n\nChinstrap\nfemale\n34\n9.49\n\n\n\nmale\n34\n9.89\n\n\nGentoo\nfemale\n58\n17.95\n\n\n\nmale\n61\n19.67\n\n\n\nAll\n344\n100.00\n\n\n\n\n\nIn each cell we now have the sum of weights in that cell, divided by the total sum of weights in the column.\n\n\n\nHere is another simple illustration of the Percent function mechanism in action, where we combine counts and percentages in a single label:\n\ndat <- mtcars\ndat$cyl <- as.factor(dat$cyl)\n\nfn <- function(x, y) {\n out <- sprintf(\n \"%s (%.1f%%)\",\n length(x),\n length(x) / length(y) * 100)\n}\ndatasummary(\n cyl ~ Percent(fn = fn),\n data = dat)\n\n\n\n\n\n \n \n cyl\n Percent\n \n \n \n 4\n11 (34.4%)\n 6\n7 (21.9%)\n 8\n14 (43.8%)\n \n \n \n\n\n\n\n\n\n\nThe * nesting operator that we used above works automatically for factor, character, and logical variables. Sometimes, it is convenient to use it with other types of variables, such as binary numeric variables. In that case, we can wrap the variable name inside a call to Factor(). This allows us to treat a variable as a factor, without having to modify it in the original data. For example, in the mtcars data, there is a binary numeric variable called am. We nest statistics within categories of am by typing:\n\ndatasummary(mpg + hp ~ Factor(am) * (mean + sd),\n data = mtcars)\n\n\n\n\n\n \n \n \n \n 0 \n \n \n 1 \n \n \n \n mean\n sd\n mean \n sd \n \n \n \n mpg\n17.15\n3.83\n24.39\n6.17\n hp\n160.26\n53.91\n126.85\n84.06\n \n \n \n\n\n\n\n\n\n\nWe can pass any argument to the summary function by including a call to Arguments(). 
For instance, there are missing values in the flipper_length_mm variable of the penguins dataset. Therefore, the standard mean function will produce no result, because its default argument is na.rm=FALSE. We can change that by calling:\n\ndatasummary(flipper_length_mm ~ mean + mean*Arguments(na.rm=TRUE),\n data = penguins)\n\n\n\n\n\n \n \n \n mean\n mean \n \n \n \n flipper_length_mm\n\n200.92\n \n \n \n\n\n\n\nNotice that there is an empty cell (NA) under the normal mean function, but that the mean call with Arguments(na.rm=TRUE) produced a numeric result.\nWe can pass the same arguments to multiple functions using the parentheses:\n\ndatasummary(flipper_length_mm ~ (mean + sd) * Arguments(na.rm=TRUE),\n data = penguins)\n\n\n\n\n\n \n \n \n mean\n sd\n \n \n \n flipper_length_mm\n200.92\n14.06\n \n \n \n\n\n\n\nNote that the shortcut functions that modelsummary supplies use na.rm=TRUE by default, so we can use them directly without arguments, even when there are missing values:\n\ndatasummary(flipper_length_mm ~ Mean + Var + P75 + Range,\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n Var\n P75\n Range\n \n \n \n flipper_length_mm\n200.92\n197.73\n213.00\n59.00\n \n \n \n\n\n\n\n\n\n\nYou can use the Arguments mechanism to do various things, such as calculating weighted means:\n\nnewdata <- data.frame(\n x = rnorm(20),\n w = rnorm(20),\n y = rnorm(20))\n\ndatasummary(x + y ~ weighted.mean * Arguments(w = w),\n data = newdata, output = \"markdown\")\n\n\n\n\n\nweighted.mean\n\n\n\n\nx\n-4.18\n\n\ny\n3.38\n\n\n\n\n\nWhich produces the same results as:\n\nweighted.mean(newdata$x, newdata$w)\n\n[1] -4.178942\n\nweighted.mean(newdata$y, newdata$w)\n\n[1] 3.378796\n\n\nBut different results from:\n\nmean(newdata$x)\n\n[1] -0.3440597\n\nmean(newdata$y)\n\n[1] 0.2460577\n\n\n\n\n\nSometimes, if we nest too much and the dataset is not large/diverse enough, we end up with empty cells. 
In that case, we add *DropEmpty() to the formula:\n\ndatasummary(island * species * body_mass_g ~ sex * (Mean + SD),\n data = penguins)\n\n\n\n\n\n \n \n island\n species\n \n \n female \n \n \n male \n \n \n \n Mean\n SD\n Mean \n SD \n \n \n \n Biscoe\nAdelie\nbody_mass_g\n3369.32\n343.47\n4050.00\n355.57\n \nChinstrap\nbody_mass_g\n\n\n\n\n \nGentoo\nbody_mass_g\n4679.74\n281.58\n5484.84\n313.16\n Dream\nAdelie\nbody_mass_g\n3344.44\n212.06\n4045.54\n330.55\n \nChinstrap\nbody_mass_g\n3527.21\n285.33\n3938.97\n362.14\n \nGentoo\nbody_mass_g\n\n\n\n\n Torgersen\nAdelie\nbody_mass_g\n3395.83\n259.14\n4034.78\n372.47\n \nChinstrap\nbody_mass_g\n\n\n\n\n \nGentoo\nbody_mass_g\n\n\n\n\n \n \n \n\n\n\ndatasummary(island * species * body_mass_g ~ sex * (Mean + SD) * DropEmpty(),\n data = penguins)\n\n\n\n\n\n \n \n island\n species\n \n \n female \n \n \n male \n \n \n \n Mean\n SD\n Mean \n SD \n \n \n \n Biscoe\nAdelie\nbody_mass_g\n3369.32\n343.47\n4050.00\n355.57\n \nGentoo\nbody_mass_g\n4679.74\n281.58\n5484.84\n313.16\n Dream\nAdelie\nbody_mass_g\n3344.44\n212.06\n4045.54\n330.55\n \nChinstrap\nbody_mass_g\n3527.21\n285.33\n3938.97\n362.14\n Torgersen\nAdelie\nbody_mass_g\n3395.83\n259.14\n4034.78\n372.47\n \n \n \n\n\n\n\n\n\n\nCool stuff is possible with logical subsets:\n\ndatasummary((bill_length_mm > 44.5) + (bill_length_mm <= 44.5) ~ Mean * flipper_length_mm,\n data = penguins)\n\n\n\n\n\n \n \n \n \n \n \n \n bill_length_mm > 44.5\n209.68\n bill_length_mm <= 44.5\n192.45\n \n \n \n\n\n\n\nSee the tables package documentation for more details and examples." }, { "objectID": "vignettes/datasummary.html#output", @@ -396,7 +396,7 @@ "href": "vignettes/datasummary.html#fmt", "title": "Data Summaries", "section": "", - "text": "The fmt argument allows us to set the printing format of numeric values. It accepts a single number representing the number of digits after the period, or a string to be passed to the sprintf function. 
For instance, setting fmt=\"%.4f\" will keep 4 digits after the dot (see ?sprintf for more options):\n\ndatasummary(flipper_length_mm ~ Mean + SD,\n fmt = 4,\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n flipper_length_mm\n200.9152\n14.0617\n \n \n \n\n\n\n\nWe can set the formatting on a term-by-term basis by using the same Arguments function that we used to handle missing values in the previous section. The shortcut functions that ship with modelsummary (e.g., Mean, SD, Median, P25) all include a fmt argument:\n\ndatasummary(flipper_length_mm ~ Mean * Arguments(fmt = \"%.4f\") + SD * Arguments(fmt = \"%.1f\"),\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n flipper_length_mm\n200.9152\n14.1\n \n \n \n\n\n\n\nIf we do not want datasummary to format numbers, and we want to keep the numerical values instead of formatted strings, set fmt=NULL. This can be useful when post-processing tables with packages like gt, which allow us to transform cells based on their numerical content (this gt table will be omitted from PDF output):\n\nlibrary(gt)\n\ndatasummary(All(mtcars) ~ Mean + SD,\n data = mtcars,\n fmt = NULL,\n output = 'gt') %>%\n tab_style(style = cell_fill(color = \"pink\"),\n locations = cells_body(rows = Mean > 10, columns = 2))\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n mpg\n20.090625\n6.0269481\n cyl\n6.187500\n1.7859216\n disp\n230.721875\n123.9386938\n hp\n146.687500\n68.5628685\n drat\n3.596563\n0.5346787\n wt\n3.217250\n0.9784574\n qsec\n17.848750\n1.7869432\n vs\n0.437500\n0.5040161\n am\n0.406250\n0.4989909\n gear\n3.687500\n0.7378041\n carb\n2.812500\n1.6152000\n \n \n \n\n\n\n\nPlease note that the N() function is supplied by the upstream tables package, and does not have a fmt argument. 
Fortunately, it is easy to override the built-in function to use custom formatting:\n\ntmp <- data.frame(X = sample(letters[1:3], 1e6, replace = TRUE))\nN <- \\(x) format(length(x), big.mark = \",\")\ndatasummary(X ~ N, data = tmp)\n\n\n\n\n\n \n \n X\n N\n \n \n \n a\n333,404\n b\n332,896\n c\n333,700" + "text": "The fmt argument allows us to set the printing format of numeric values. It accepts a single number representing the number of digits after the period, or a string to be passed to the sprintf function. For instance, setting fmt=\"%.4f\" will keep 4 digits after the dot (see ?sprintf for more options):\n\ndatasummary(flipper_length_mm ~ Mean + SD,\n fmt = 4,\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n flipper_length_mm\n200.9152\n14.0617\n \n \n \n\n\n\n\nWe can set the formatting on a term-by-term basis by using the same Arguments function that we used to handle missing values in the previous section. The shortcut functions that ship with modelsummary (e.g., Mean, SD, Median, P25) all include a fmt argument:\n\ndatasummary(flipper_length_mm ~ Mean * Arguments(fmt = \"%.4f\") + SD * Arguments(fmt = \"%.1f\"),\n data = penguins)\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n flipper_length_mm\n200.9152\n14.1\n \n \n \n\n\n\n\nIf we do not want datasummary to format numbers, and we want to keep the numerical values instead of formatted strings, set fmt=NULL. 
This can be useful when post-processing tables with packages like gt, which allow us to transform cells based on their numerical content (this gt table will be omitted from PDF output):\n\nlibrary(gt)\n\ndatasummary(All(mtcars) ~ Mean + SD,\n data = mtcars,\n fmt = NULL,\n output = 'gt') %>%\n tab_style(style = cell_fill(color = \"pink\"),\n locations = cells_body(rows = Mean > 10, columns = 2))\n\n\n\n\n\n \n \n \n Mean\n SD\n \n \n \n mpg\n20.090625\n6.0269481\n cyl\n6.187500\n1.7859216\n disp\n230.721875\n123.9386938\n hp\n146.687500\n68.5628685\n drat\n3.596563\n0.5346787\n wt\n3.217250\n0.9784574\n qsec\n17.848750\n1.7869432\n vs\n0.437500\n0.5040161\n am\n0.406250\n0.4989909\n gear\n3.687500\n0.7378041\n carb\n2.812500\n1.6152000\n \n \n \n\n\n\n\nPlease note that the N() function is supplied by the upstream tables package, and does not have a fmt argument. Fortunately, it is easy to override the built-in function to use custom formatting:\n\ntmp <- data.frame(X = sample(letters[1:3], 1e6, replace = TRUE))\nN <- \\(x) format(length(x), big.mark = \",\")\ndatasummary(X ~ N, data = tmp)\n\n\n\n\n\n \n \n X\n N\n \n \n \n a\n333,736\n b\n333,200\n c\n333,064" }, { "objectID": "vignettes/datasummary.html#title-notes", @@ -424,7 +424,7 @@ "href": "vignettes/datasummary.html#add_columns", "title": "Data Summaries", "section": "", - "text": "new_cols <- data.frame('New Stat' = runif(2))\ndatasummary(flipper_length_mm + body_mass_g ~ species * (Mean + SD),\n data = penguins,\n add_columns = new_cols)\n\n\n\n\n\n \n \n \n \n Adelie \n \n \n Chinstrap \n \n \n Gentoo \n \n New.Stat\n \n \n Mean\n SD\n Mean \n SD \n Mean \n SD \n \n \n \n flipper_length_mm\n189.95\n6.54\n195.82\n7.13\n217.19\n6.48\n0.83\n body_mass_g\n3700.66\n458.57\n3733.09\n384.34\n5076.02\n504.12\n0.39" + "text": "new_cols <- data.frame('New Stat' = runif(2))\ndatasummary(flipper_length_mm + body_mass_g ~ species * (Mean + SD),\n data = penguins,\n add_columns = new_cols)\n\n\n\n\n\n \n \n \n \n 
Adelie \n \n \n Chinstrap \n \n \n Gentoo \n \n New.Stat\n \n \n Mean\n SD\n Mean \n SD \n Mean \n SD \n \n \n \n flipper_length_mm\n189.95\n6.54\n195.82\n7.13\n217.19\n6.48\n0.38\n body_mass_g\n3700.66\n458.57\n3733.09\n384.34\n5076.02\n504.12\n0.88" }, { "objectID": "vignettes/datasummary.html#histograms", @@ -998,7 +998,7 @@ "href": "vignettes/modelsummary.html", "title": "Model Summaries", "section": "", - "text": "modelsummary includes a powerful set of utilities to customize the information displayed in your model summary tables. You can easily rename, reorder, subset or omit parameter estimates; choose the set of goodness-of-fit statistics to display; display various “robust” standard errors or confidence intervals; add titles, footnotes, or source notes; insert stars or custom characters to indicate levels of statistical significance; or add rows with supplemental information about your models.\n\nlibrary(modelsummary)\nlibrary(kableExtra)\nlibrary(gt)\n\nurl <- 'https://vincentarelbundock.github.io/Rdatasets/csv/HistData/Guerry.csv'\ndat <- read.csv(url)\n\nmodels <- list(\n \"OLS 1\" = lm(Donations ~ Literacy + Clergy, data = dat),\n \"Poisson 1\" = glm(Donations ~ Literacy + Commerce, family = poisson, data = dat),\n \"OLS 2\" = lm(Crime_pers ~ Literacy + Clergy, data = dat),\n \"Poisson 2\" = glm(Crime_pers ~ Literacy + Commerce, family = poisson, data = dat),\n \"OLS 3\" = lm(Crime_prop ~ Literacy + Clergy, data = dat)\n)\n\nmodelsummary(models)\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n (Intercept)\n7948.667\n8.241\n16259.384\n9.876\n11243.544\n \n(2078.276)\n(0.006)\n(2611.140)\n(0.003)\n(1011.240)\n Literacy\n-39.121\n0.003\n3.680\n0.000\n-68.507\n \n(37.052)\n(0.000)\n(46.552)\n(0.000)\n(18.029)\n Clergy\n15.257\n\n77.148\n\n-16.376\n \n(25.735)\n\n(32.334)\n\n(12.522)\n Commerce\n\n0.011\n\n0.001\n\n \n\n(0.000)\n\n(0.000)\n\n Num.Obs.\n86\n86\n86\n86\n86\n R2\n0.020\n\n0.065\n\n0.152\n R2 
Adj.\n-0.003\n\n0.043\n\n0.132\n AIC\n1740.8\n274160.8\n1780.0\n257564.4\n1616.9\n BIC\n1750.6\n274168.2\n1789.9\n257571.7\n1626.7\n Log.Lik.\n-866.392\n-137077.401\n-886.021\n-128779.186\n-804.441\n F\n0.866\n18294.559\n2.903\n279.956\n7.441\n RMSE\n5740.99\n5491.61\n7212.97\n7451.70\n2793.43\n \n \n \n\n\n\n\n\n\nThe output argument determines the type of object returned by modelsummary and/or the file where this table should be written.\nIf you want to save a table directly to file, you can type:\n\nmodelsummary(models, output = \"table.docx\")\nmodelsummary(models, output = \"table.html\")\nmodelsummary(models, output = \"table.tex\")\nmodelsummary(models, output = \"table.md\")\nmodelsummary(models, output = \"table.txt\")\nmodelsummary(models, output = \"table.png\")\n\nIf you want a raw HTML, LaTeX, or Markdown table, you can type:\n\nmodelsummary(models, output = \"html\")\nmodelsummary(models, output = \"latex\")\nmodelsummary(models, output = \"markdown\")\n\nIf you to customize the appearance of your table using external tools like gt, kableExtra, flextable, or huxtable, you can type:\n\nmodelsummary(models, output = \"gt\")\nmodelsummary(models, output = \"kableExtra\")\nmodelsummary(models, output = \"flextable\")\nmodelsummary(models, output = \"huxtable\")\n\nWarning: When a file name is supplied to the output argument, the table is written immediately to file. If you want to customize your table by post-processing it with an external package, you need to choose a different output format and saving mechanism. Unfortunately, the approach differs from package to package:\n\ngt: set output=\"gt\", post-process your table, and use the gt::gtsave function.\nkableExtra: set output to your destination format (e.g., “latex”, “html”, “markdown”), post-process your table, and use kableExtra::save_kable function.\n\n\n\n\nThe fmt argument defines how numeric values are rounded and presented in the table. 
This argument accepts three types of input:\n\nInteger: Number of decimal digits\nUser-supplied function: Accepts a numeric vector and returns a character vector of the same length.\nmodelsummary function: fmt_decimal(), fmt_significant(), fmt_sprintf(), fmt_term(), fmt_statistic, fmt_identity()\n\nExamples:\n\nmod <- lm(mpg ~ hp + drat + qsec, data = mtcars)\n\n## decimal digits\nmodelsummary(mod, fmt = 3)\n\n## user-supplied function\nmodelsummary(mod, fmt = function(x) round(x, 2))\n\n## p values with different number of digits\nmodelsummary(mod, fmt = fmt_decimal(1, 3), statistic = c(\"std.error\", \"p.value\"))\n\n## significant digits\nmodelsummary(mod, fmt = fmt_significant(3))\n\n## sprintf(): decimal digits\nmodelsummary(mod, fmt = fmt_sprintf(\"%.5f\"))\n\n## sprintf(): scientific notation \nmodelsummary(mod, fmt = fmt_sprintf(\"%.5e\"))\n\n## statistic-specific formatting\nmodelsummary(mod, fmt = fmt_statistic(estimate = 4, conf.int = 1), statistic = \"conf.int\")\n\n## term-specific formatting\nmodelsummary(mod, fmt = fmt_term(hp = 4, drat = 1, default = fmt_significant(2)))\n\nmodelsummary(mod, fmt = NULL)\n\nCustom formatting function with big mark commas:\n\nmodf <- lm(I(mpg * 100) ~ hp, mtcars)\nf <- function(x) formatC(x, digits = 2, big.mark = \",\", format = \"f\")\nmodelsummary(modf, fmt = f, gof_map = NA)\n\n\n\n\n\n \n \n \n (1)\n \n \n \n (Intercept)\n3,009.89\n \n(163.39)\n hp\n-6.82\n \n(1.01)\n \n \n \n\n\n\n\nIn many languages the comma is used as a decimal mark instead of the period. modelsummary respects the global R OutDec option, so you can simply execute this command and your tables will be adjusted automatically:\n\noptions(OutDec=\",\")\n\n\n\n\nBy default, modelsummary prints each coefficient estimate on its own row. You can customize this by changing the estimate argument. 
For example, this would produce a table of p values instead of coefficient estimates:\n\nmodelsummary(models, estimate = \"p.value\")\n\nYou can also use glue string, using curly braces to specify the statistics you want. For example, this displays the estimate next to a confidence interval:\n\nmodelsummary(\n models,\n fmt = 1,\n estimate = \"{estimate} [{conf.low}, {conf.high}]\",\n statistic = NULL,\n coef_omit = \"Intercept\")\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n Literacy\n-39.1 [-112.8, 34.6]\n0.0 [0.0, 0.0]\n3.7 [-88.9, 96.3]\n0.0 [0.0, 0.0]\n-68.5 [-104.4, -32.6]\n Clergy\n15.3 [-35.9, 66.4]\n\n77.1 [12.8, 141.5]\n\n-16.4 [-41.3, 8.5]\n Commerce\n\n0.0 [0.0, 0.0]\n\n0.0 [0.0, 0.0]\n\n Num.Obs.\n86\n86\n86\n86\n86\n R2\n0.020\n\n0.065\n\n0.152\n R2 Adj.\n-0.003\n\n0.043\n\n0.132\n AIC\n1740.8\n274160.8\n1780.0\n257564.4\n1616.9\n BIC\n1750.6\n274168.2\n1789.9\n257571.7\n1626.7\n Log.Lik.\n-866.392\n-137077.401\n-886.021\n-128779.186\n-804.441\n F\n0.866\n18294.559\n2.903\n279.956\n7.441\n RMSE\n5740.99\n5491.61\n7212.97\n7451.70\n2793.43\n \n \n \n\n\n\n\nGlue strings can also apply R functions to estimates. 
However, since modelsummary rounds numbers and transforms them to character by default, we must set fmt = NULL:\n\nm <- glm(am ~ mpg, data = mtcars, family = binomial)\nmodelsummary(\n m,\n fmt = NULL,\n estimate = \"{round(exp(estimate), 5)}\",\n statistic = \"{round(exp(estimate) * std.error, 3)}\")\n\n\n\n\n\n \n \n \n (1)\n \n \n \n (Intercept)\n0.00136\n \n0.003\n mpg\n1.35938\n \n0.156\n Num.Obs.\n32\n AIC\n33.7\n BIC\n36.6\n Log.Lik.\n-14.838\n F\n7.148\n RMSE\n0.39\n \n \n \n\n\n\n\nYou can also use different estimates for different models by using a vector of strings:\n\nmodelsummary(\n models,\n fmt = 1,\n estimate = c(\"estimate\",\n \"{estimate}{stars}\",\n \"{estimate} ({std.error})\",\n \"{estimate} ({std.error}){stars}\",\n \"{estimate} [{conf.low}, {conf.high}]\"),\n statistic = NULL,\n coef_omit = \"Intercept\")\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n Literacy\n-39.1\n0.0***\n3.7 (46.6)\n0.0 (0.0)***\n-68.5 [-104.4, -32.6]\n Clergy\n15.3\n\n77.1 (32.3)\n\n-16.4 [-41.3, 8.5]\n Commerce\n\n0.0***\n\n0.0 (0.0)***\n\n Num.Obs.\n86\n86\n86\n86\n86\n R2\n0.020\n\n0.065\n\n0.152\n R2 Adj.\n-0.003\n\n0.043\n\n0.132\n AIC\n1740.8\n274160.8\n1780.0\n257564.4\n1616.9\n BIC\n1750.6\n274168.2\n1789.9\n257571.7\n1626.7\n Log.Lik.\n-866.392\n-137077.401\n-886.021\n-128779.186\n-804.441\n F\n0.866\n18294.559\n2.903\n279.956\n7.441\n RMSE\n5740.99\n5491.61\n7212.97\n7451.70\n2793.43\n \n \n \n\n\n\n\n\n\n\nBy default, modelsummary prints the coefficient’s standard error in parentheses below the corresponding estimate. The value of this uncertainty statistic is determined by the statistic argument. The statistic argument accepts any of the column names produced by get_estimates(model). 
For example:\n\nmodelsummary(models, statistic = 'std.error')\nmodelsummary(models, statistic = 'p.value')\nmodelsummary(models, statistic = 'statistic')\n\nYou can also display confidence intervals in brackets by setting statistic=\"conf.int\":\n\nmodelsummary(models,\n fmt = 1,\n statistic = 'conf.int', \n conf_level = .99)\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n (Intercept)\n7948.7\n8.2\n16259.4\n9.9\n11243.5\n \n[2469.6, 13427.8]\n[8.2, 8.3]\n[9375.5, 23143.3]\n[9.9, 9.9]\n[8577.5, 13909.5]\n Literacy\n-39.1\n0.0\n3.7\n0.0\n-68.5\n \n[-136.8, 58.6]\n[0.0, 0.0]\n[-119.0, 126.4]\n[0.0, 0.0]\n[-116.0, -21.0]\n Clergy\n15.3\n\n77.1\n\n-16.4\n \n[-52.6, 83.1]\n\n[-8.1, 162.4]\n\n[-49.4, 16.6]\n Commerce\n\n0.0\n\n0.0\n\n \n\n[0.0, 0.0]\n\n[0.0, 0.0]\n\n Num.Obs.\n86\n86\n86\n86\n86\n R2\n0.020\n\n0.065\n\n0.152\n R2 Adj.\n-0.003\n\n0.043\n\n0.132\n AIC\n1740.8\n274160.8\n1780.0\n257564.4\n1616.9\n BIC\n1750.6\n274168.2\n1789.9\n257571.7\n1626.7\n Log.Lik.\n-866.392\n-137077.401\n-886.021\n-128779.186\n-804.441\n F\n0.866\n18294.559\n2.903\n279.956\n7.441\n RMSE\n5740.99\n5491.61\n7212.97\n7451.70\n2793.43\n \n \n \n\n\n\n\nAlternatively, you can supply a glue string to get more complicated results:\n\nmodelsummary(models,\n statistic = \"{std.error} ({p.value})\")\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n (Intercept)\n7948.667\n8.241\n16259.384\n9.876\n11243.544\n \n2078.276 (<0.001)\n0.006 (<0.001)\n2611.140 (<0.001)\n0.003 (<0.001)\n1011.240 (<0.001)\n Literacy\n-39.121\n0.003\n3.680\n0.000\n-68.507\n \n37.052 (0.294)\n0.000 (<0.001)\n46.552 (0.937)\n0.000 (<0.001)\n18.029 (<0.001)\n Clergy\n15.257\n\n77.148\n\n-16.376\n \n25.735 (0.555)\n\n32.334 (0.019)\n\n12.522 (0.195)\n Commerce\n\n0.011\n\n0.001\n\n \n\n0.000 (<0.001)\n\n0.000 (<0.001)\n\n Num.Obs.\n86\n86\n86\n86\n86\n R2\n0.020\n\n0.065\n\n0.152\n R2 Adj.\n-0.003\n\n0.043\n\n0.132\n AIC\n1740.8\n274160.8\n1780.0\n257564.4\n1616.9\n 
BIC\n1750.6\n274168.2\n1789.9\n257571.7\n1626.7\n Log.Lik.\n-866.392\n-137077.401\n-886.021\n-128779.186\n-804.441\n F\n0.866\n18294.559\n2.903\n279.956\n7.441\n RMSE\n5740.99\n5491.61\n7212.97\n7451.70\n2793.43\n \n \n \n\n\n\n\nYou can also display several different uncertainty estimates below the coefficient estimates by using a vector. For example,\n\nmodelsummary(models, gof_omit = \".*\",\n statistic = c(\"conf.int\",\n \"s.e. = {std.error}\", \n \"t = {statistic}\",\n \"p = {p.value}\"))\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n (Intercept)\n7948.667\n8.241\n16259.384\n9.876\n11243.544\n \n[3815.060, 12082.275]\n[8.230, 8.252]\n[11065.933, 21452.836]\n[9.869, 9.883]\n[9232.228, 13254.860]\n \ns.e. = 2078.276\ns.e. = 0.006\ns.e. = 2611.140\ns.e. = 0.003\ns.e. = 1011.240\n \nt = 3.825\nt = 1408.907\nt = 6.227\nt = 2864.987\nt = 11.119\n \np = <0.001\np = <0.001\np = <0.001\np = <0.001\np = <0.001\n Literacy\n-39.121\n0.003\n3.680\n0.000\n-68.507\n \n[-112.816, 34.574]\n[0.003, 0.003]\n[-88.910, 96.270]\n[0.000, 0.000]\n[-104.365, -32.648]\n \ns.e. = 37.052\ns.e. = 0.000\ns.e. = 46.552\ns.e. = 0.000\ns.e. = 18.029\n \nt = -1.056\nt = 33.996\nt = 0.079\nt = -4.989\nt = -3.800\n \np = 0.294\np = <0.001\np = 0.937\np = <0.001\np = <0.001\n Clergy\n15.257\n\n77.148\n\n-16.376\n \n[-35.930, 66.443]\n\n[12.837, 141.459]\n\n[-41.282, 8.530]\n \ns.e. = 25.735\n\ns.e. = 32.334\n\ns.e. = 12.522\n \nt = 0.593\n\nt = 2.386\n\nt = -1.308\n \np = 0.555\n\np = 0.019\n\np = 0.195\n Commerce\n\n0.011\n\n0.001\n\n \n\n[0.011, 0.011]\n\n[0.001, 0.001]\n\n \n\ns.e. = 0.000\n\ns.e. = 0.000\n\n \n\nt = 174.542\n\nt = 15.927\n\n \n\np = <0.001\n\np = <0.001\n\n \n \n \n\n\n\n\nSetting statistic=NULL omits all statistics. 
This can often be useful if, for example, you want to display confidence intervals next to coefficients:\n\nmodelsummary(models, gof_omit = \".*\",\n estimate = \"{estimate} [{conf.low}, {conf.high}]\",\n statistic = NULL)\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n (Intercept)\n7948.667 [3815.060, 12082.275]\n8.241 [8.230, 8.252]\n16259.384 [11065.933, 21452.836]\n9.876 [9.869, 9.883]\n11243.544 [9232.228, 13254.860]\n Literacy\n-39.121 [-112.816, 34.574]\n0.003 [0.003, 0.003]\n3.680 [-88.910, 96.270]\n0.000 [0.000, 0.000]\n-68.507 [-104.365, -32.648]\n Clergy\n15.257 [-35.930, 66.443]\n\n77.148 [12.837, 141.459]\n\n-16.376 [-41.282, 8.530]\n Commerce\n\n0.011 [0.011, 0.011]\n\n0.001 [0.001, 0.001]\n\n \n \n \n\n\n\n\n\n\n\nYou can use clustered or robust uncertainty estimates by modifying the vcov parameter. This function accepts 5 different types of input. You can use a string or a vector of strings:\nmodelsummary(models, vcov = \"robust\")\nmodelsummary(models, vcov = c(\"classical\", \"robust\", \"bootstrap\", \"stata\", \"HC4\"))\nThese variance-covariance matrices are calculated using the sandwich package. You can pass arguments to the sandwich functions directly from the modelsummary function. 
For instance, to change the number of bootstrap replicates and to specify a clustering variable we could call:\nmodelsummary(mod, vcov = \"bootstrap\", R = 1000, cluster = \"country\")\nYou can use a one-sided formula or list of one-sided formulas to use clustered standard errors:\nmodelsummary(models, vcov = ~Region)\nYou can specify a function that produces variance-covariance matrices:\nlibrary(sandwich)\nmodelsummary(models, vcov = vcovHC)\nYou can supply a list of functions of the same length as your model list:\nmodelsummary(models, \n vcov = list(vcov, vcovHC, vcovHAC, vcovHC, vcov))\nYou can supply a list of named variance-covariance matrices:\nvcov_matrices <- lapply(models, vcovHC)\nmodelsummary(models, vcov = vcov_matrices)\nYou can supply a list of named vectors:\nvc <- list(\n `OLS 1` = c(`(Intercept)` = 2, Literacy = 3, Clergy = 4), \n `Poisson 1` = c(`(Intercept)` = 3, Literacy = -5, Commerce = 3),\n `OLS 2` = c(`(Intercept)` = 7, Literacy = -6, Clergy = 9), \n `Poisson 2` = c(`(Intercept)` = 4, Literacy = -7, Commerce = -9),\n `OLS 3` = c(`(Intercept)` = 1, Literacy = -5, Clergy = -2))\nmodelsummary(models, vcov = vc)\n\n\n\nSome people like to add “stars” to their model summary tables to mark statistical significance. The stars argument can take three types of input:\n\nNULL omits any stars or special marks (default)\nTRUE uses these default values: + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001\nNamed numeric vector for custom stars.\n\nmodelsummary(models)\nmodelsummary(models, stars = TRUE) \nmodelsummary(models, stars = c('+' = .1, '&' = .01)) \nWhenever stars is not NULL, modelsummary adds a note at the bottom of the table automatically. If you would like to add stars but not include a note at the bottom of the table, you can define the display of your estimate manually using a glue string, as described in the estimate argument section of the documentation. 
Whenever the {stars} string appears in the estimate or statistic arguments, modelsummary will assume that you want fine-grained control over your table, and will not include a note about stars.\n\nmodelsummary(models,\n estimate = \"{estimate}{stars}\",\n gof_omit = \".*\")\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n (Intercept)\n7948.667***\n8.241***\n16259.384***\n9.876***\n11243.544***\n \n(2078.276)\n(0.006)\n(2611.140)\n(0.003)\n(1011.240)\n Literacy\n-39.121\n0.003***\n3.680\n0.000***\n-68.507***\n \n(37.052)\n(0.000)\n(46.552)\n(0.000)\n(18.029)\n Clergy\n15.257\n\n77.148*\n\n-16.376\n \n(25.735)\n\n(32.334)\n\n(12.522)\n Commerce\n\n0.011***\n\n0.001***\n\n \n\n(0.000)\n\n(0.000)\n\n \n \n \n\n\n\n\nIf you want to create your own stars description, you can add custom notes with the notes argument.\n\n\n\nAn alternative mechanism to subset coefficients is to use the coef_omit argument, which accepts a vector of integers or a regular expression. For example, we can omit the first and second coefficients as follows:\n\nmodelsummary(models, coef_omit = 1:2, gof_map = NA)\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n Clergy\n15.257\n\n77.148\n\n-16.376\n \n(25.735)\n\n(32.334)\n\n(12.522)\n Commerce\n\n0.011\n\n0.001\n\n \n\n(0.000)\n\n(0.000)\n\n \n \n \n\n\n\n\nNegative indices determine which coefficients to keep:\n\nmodelsummary(models, coef_omit = c(-1, -2), gof_map = NA)\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n (Intercept)\n7948.667\n8.241\n16259.384\n9.876\n11243.544\n \n(2078.276)\n(0.006)\n(2611.140)\n(0.003)\n(1011.240)\n Literacy\n-39.121\n0.003\n3.680\n0.000\n-68.507\n \n(37.052)\n(0.000)\n(46.552)\n(0.000)\n(18.029)\n \n \n \n\n\n\n\nWhen coef_omit is a string, it is fed to grepl(x,perl=TRUE) to detect the variable names which should be excluded from the table.\n\nmodelsummary(models, coef_omit = \"Intercept|.*merce\", gof_map = NA)\n\n\n\n\n\n \n 
\n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n Literacy\n-39.121\n0.003\n3.680\n0.000\n-68.507\n \n(37.052)\n(0.000)\n(46.552)\n(0.000)\n(18.029)\n Clergy\n15.257\n\n77.148\n\n-16.376\n \n(25.735)\n\n(32.334)\n\n(12.522)\n \n \n \n\n\n\n\nSince coef_omit accepts regexes, you can do interesting things with it, such as specifying the list of variables that modelsummary should keep instead of omit. To do this, we use a negative lookahead. To keep only the coefficients starting with “Lit”, we call:\n\nmodelsummary(models, coef_omit = \"^(?!Lit)\", gof_map = NA)\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n Literacy\n-39.121\n0.003\n3.680\n0.000\n-68.507\n \n(37.052)\n(0.000)\n(46.552)\n(0.000)\n(18.029)\n \n \n \n\n\n\n\nTo keep all coefficients matching the “y” substring:\n\nmodelsummary(models, coef_omit = \"^(?!.*y)\", gof_map = NA)\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n Literacy\n-39.121\n0.003\n3.680\n0.000\n-68.507\n \n(37.052)\n(0.000)\n(46.552)\n(0.000)\n(18.029)\n Clergy\n15.257\n\n77.148\n\n-16.376\n \n(25.735)\n\n(32.334)\n\n(12.522)\n \n \n \n\n\n\n\nTo keep all coefficients matching one of two substrings:\n\nmodelsummary(models, coef_omit = \"^(?!.*tercept|.*y)\", gof_map = NA)\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n (Intercept)\n7948.667\n8.241\n16259.384\n9.876\n11243.544\n \n(2078.276)\n(0.006)\n(2611.140)\n(0.003)\n(1011.240)\n Literacy\n-39.121\n0.003\n3.680\n0.000\n-68.507\n \n(37.052)\n(0.000)\n(46.552)\n(0.000)\n(18.029)\n Clergy\n15.257\n\n77.148\n\n-16.376\n \n(25.735)\n\n(32.334)\n\n(12.522)\n \n \n \n\n\n\n\n\n\n\nmodelsummary offers powerful and innovative mechanisms to rename, reorder, and subset coefficients and goodness-of-fit statistics.\nYou can rename coefficients using the coef_rename argument. 
For example, if you have two models with different explanatory variables, but you want both variables to have the same name and appear on the same row, you can do:\n\nx <- list(lm(hp ~ drat, mtcars),\n lm(hp ~ vs, mtcars))\n\nmodelsummary(x, coef_rename = c(\"drat\" = \"Explanator\", \"vs\" = \"Explanator\"))\n\n\n\n\n\n \n \n \n (1)\n (2)\n \n \n \n (Intercept)\n353.653\n189.722\n \n(76.049)\n(11.347)\n Explanator\n-57.545\n-98.365\n \n(20.922)\n(17.155)\n Num.Obs.\n32\n32\n R2\n0.201\n0.523\n R2 Adj.\n0.175\n0.507\n AIC\n359.2\n342.7\n BIC\n363.6\n347.1\n Log.Lik.\n-176.588\n-168.347\n F\n7.565\n32.876\n RMSE\n60.31\n46.61\n \n \n \n\n\n\n\nIf you provide a named character vector to coef_rename, only exact matches of the complete original term name will be replaced.\nFor complex modifications, you can feed a function which returns a named vector to the coef_rename argument. For example, modelsummary ships with a function called coef_rename, which executes some common renaming tasks automatically. This example also uses the dvnames function to extract the name of the dependent variable in each model:\n\nx <- list(\n lm(mpg ~ factor(cyl) + drat + disp, data = mtcars),\n lm(hp ~ factor(cyl) + drat + disp, data = mtcars)\n)\n\nmodelsummary(dvnames(x), coef_rename = coef_rename)\n\n\n\n\n\n \n \n \n mpg\n hp\n \n \n \n (Intercept)\n26.158\n-86.788\n \n(6.537)\n(79.395)\n 6\n-4.547\n46.485\n \n(1.731)\n(21.027)\n 8\n-4.580\n121.892\n \n(2.952)\n(35.853)\n Drat\n0.783\n37.815\n \n(1.478)\n(17.952)\n Disp\n-0.026\n0.147\n \n(0.011)\n(0.137)\n Num.Obs.\n32\n32\n R2\n0.786\n0.756\n R2 Adj.\n0.754\n0.720\n AIC\n167.4\n327.2\n BIC\n176.2\n336.0\n Log.Lik.\n-77.719\n-157.623\n F\n24.774\n20.903\n RMSE\n2.74\n33.34\n \n \n \n\n\n\n\nOf course, you can also define your own custom functions. 
For instance, to rename a model with interacted variables (e.g., “drat:mpg”), you could define a custom rename_explanator function:\n\ny <- list(\n lm(hp ~ drat / mpg, mtcars),\n lm(hp ~ vs / mpg, mtcars)\n)\n\nrename_explanator <- function(old_names) {\n new_names <- gsub(\"drat|vs\", \"Explanator\", old_names)\n setNames(new_names, old_names)\n}\n\nmodelsummary(y, coef_rename = rename_explanator)\n\n\n\n\n\n \n \n \n (1)\n (2)\n \n \n \n (Intercept)\n91.206\n189.722\n \n(72.344)\n(11.205)\n Explanator\n68.331\n-18.316\n \n(27.390)\n(62.531)\n Explanator:mpg\n-2.558\n-3.260\n \n(0.467)\n(2.451)\n Num.Obs.\n32\n32\n R2\n0.608\n0.550\n R2 Adj.\n0.581\n0.519\n AIC\n338.4\n342.8\n BIC\n344.3\n348.7\n Log.Lik.\n-165.218\n-167.399\n F\n22.454\n17.743\n RMSE\n42.27\n45.25\n \n \n \n\n\n\n\nBeware of inadvertently replacing parts of other variable names! Making your regex pattern as specific as possible (e.g., by adding word boundaries) is likely a good idea. The custom rename function is also a good place to re-introduce the replacement of “:” with “×” if you are dealing with interaction terms – modelsummary makes this replacement for you only when the coef_rename argument is not specified.\nAnother possibility is to assign variable labels to attributes in the data used to fit the model. 
Then, we can automatically rename them:\n\ndatlab <- mtcars\ndatlab$cyl <- factor(datlab$cyl)\nattr(datlab$cyl, \"label\") <- \"Cylinders\"\nattr(datlab$am, \"label\") <- \"Transmission\"\nmodlab <- lm(mpg ~ cyl + am, data = datlab)\nmodelsummary(modlab, coef_rename = TRUE)\n\n\n\n\n\n \n \n \n (1)\n \n \n \n (Intercept)\n24.802\n \n(1.323)\n Cylinders [6]\n-6.156\n \n(1.536)\n Cylinders [8]\n-10.068\n \n(1.452)\n Transmission\n2.560\n \n(1.298)\n Num.Obs.\n32\n R2\n0.765\n R2 Adj.\n0.740\n AIC\n168.4\n BIC\n175.7\n Log.Lik.\n-79.199\n F\n30.402\n RMSE\n2.87\n \n \n \n\n\n\n\n\n\n\nThe coef_map argument is a named vector which allows users to rename, reorder, and subset coefficient estimates. Values of this vector correspond to the “clean” variable name. Names of this vector correspond to the “raw” variable name. The table will be sorted in the order in which terms are presented in coef_map. Coefficients which are not included in coef_map will be excluded from the table.\n\ncm <- c('Literacy' = 'Literacy (%)',\n 'Commerce' = 'Patents per capita',\n '(Intercept)' = 'Constant')\nmodelsummary(models, coef_map = cm)\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n Literacy (%)\n-39.121\n0.003\n3.680\n0.000\n-68.507\n \n(37.052)\n(0.000)\n(46.552)\n(0.000)\n(18.029)\n Patents per capita\n\n0.011\n\n0.001\n\n \n\n(0.000)\n\n(0.000)\n\n Constant\n7948.667\n8.241\n16259.384\n9.876\n11243.544\n \n(2078.276)\n(0.006)\n(2611.140)\n(0.003)\n(1011.240)\n Num.Obs.\n86\n86\n86\n86\n86\n R2\n0.020\n\n0.065\n\n0.152\n R2 Adj.\n-0.003\n\n0.043\n\n0.132\n AIC\n1740.8\n274160.8\n1780.0\n257564.4\n1616.9\n BIC\n1750.6\n274168.2\n1789.9\n257571.7\n1626.7\n Log.Lik.\n-866.392\n-137077.401\n-886.021\n-128779.186\n-804.441\n F\n0.866\n18294.559\n2.903\n279.956\n7.441\n RMSE\n5740.99\n5491.61\n7212.97\n7451.70\n2793.43\n \n \n \n\n\n\n\n\n\n\ngof_omit is a regular expression which will be fed to grepl(x,perl=TRUE) to detect the names of the statistics which 
should be excluded from the table.\nmodelsummary(models, gof_omit = 'DF|Deviance|R2|AIC|BIC')\n\n\n\nThe gof_map argument can be used to rename, re-order, subset, and format the statistics displayed in the bottom section of the table (“goodness-of-fit”).\nThe first type of values allowed is a character vector with elements equal to column names in the data.frame produced by get_gof(model):\n\nmodelsummary(models, gof_map = c(\"nobs\", \"r.squared\"))\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n (Intercept)\n7948.667\n8.241\n16259.384\n9.876\n11243.544\n \n(2078.276)\n(0.006)\n(2611.140)\n(0.003)\n(1011.240)\n Literacy\n-39.121\n0.003\n3.680\n0.000\n-68.507\n \n(37.052)\n(0.000)\n(46.552)\n(0.000)\n(18.029)\n Clergy\n15.257\n\n77.148\n\n-16.376\n \n(25.735)\n\n(32.334)\n\n(12.522)\n Commerce\n\n0.011\n\n0.001\n\n \n\n(0.000)\n\n(0.000)\n\n Num.Obs.\n86\n86\n86\n86\n86\n R2\n0.020\n\n0.065\n\n0.152\n \n \n \n\n\n\n\nA more powerful mechanism is to supply a data.frame (or tibble) through the gof_map argument. This data.frame must include 3 columns:\n\nraw: a string with the name of a column produced by get_gof(model).\nclean: a string with the “clean” name of the statistic you want to appear in your final table.\nfmt: a string which will be used to round/format the string in question (e.g., \"%.3f\"). This follows the same standards as the fmt argument in ?modelsummary.\n\nYou can see an example of a valid data frame by typing modelsummary::gof_map. This is the default data.frame that modelsummary uses to subset and reorder goodness-of-fit statistics. As you can see, omit == TRUE for quite a number of statistics. 
You can include these statistics by setting omit to FALSE:\ngm <- modelsummary::gof_map\ngm$omit <- FALSE\nmodelsummary(models, gof_map = gm)\nThe goodness-of-fit statistics will be printed in the table in the same order as in the gof_map data.frame.\ngof_map also accepts a list of lists with raw, clean, and fmt elements, where fmt can be a formatting function:\nf <- function(x) format(round(x, 3), big.mark=\",\")\ngm <- list(\n list(\"raw\" = \"nobs\", \"clean\" = \"N\", \"fmt\" = f),\n list(\"raw\" = \"AIC\", \"clean\" = \"aic\", \"fmt\" = f))\nmodelsummary(models, gof_map = gm)\nNotice the subtle difference between coef_map and gof_map. On the one hand, coef_map works as a “white list”: any coefficient not explicitly entered will be omitted from the table. On the other hand, gof_map works as a “black list”: statistics need to be explicitly marked for omission.\nAnother convenient way to build a gof_map argument is to use the tribble function from the tibble package. In this example, we insert special HTML code to display a superscript, so we use the escape=FALSE argument:\n\ngm <- tibble::tribble(\n ~raw, ~clean, ~fmt,\n \"nobs\", \"N\", 0,\n \"r.squared\", \"R<sup>2</sup>\", 2)\n\nmodelsummary(\n models,\n statistic = NULL,\n gof_map = gm,\n escape = FALSE)\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n (Intercept)\n7948.667\n8.241\n16259.384\n9.876\n11243.544\n Literacy\n-39.121\n0.003\n3.680\n0.000\n-68.507\n Clergy\n15.257\n\n77.148\n\n-16.376\n Commerce\n\n0.011\n\n0.001\n\n N\n86\n86\n86\n86\n86\n R<sup>2</sup>\n0.02\n\n0.07\n\n0.15\n \n \n \n\n\n\n\n\n\n\nThis section requires version 1.3.1 of modelsummary. 
If this version is not available on CRAN yet, you can install the development version by following the instructions on the website.\nThe shape argument accepts:\n\nA formula which determines the structure of the table, and can display “grouped” coefficients together (e.g., multivariate outcome or mixed-effects models).\nThe strings “rbind” or “rcollapse” to stack multiple tables on top of each other and present models in distinct “panels”.\n\n\n\nThe left side of the formula represents the rows and the right side represents the columns. The default formula is term + statistic ~ model:\n\nm <- list(\n lm(mpg ~ hp, data = mtcars),\n lm(mpg ~ hp + drat, data = mtcars))\n\nmodelsummary(m, shape = term + statistic ~ model, gof_map = NA)\n\n\n\n\n\n \n \n \n (1)\n (2)\n \n \n \n (Intercept)\n30.099\n10.790\n \n(1.634)\n(5.078)\n hp\n-0.068\n-0.052\n \n(0.010)\n(0.009)\n drat\n\n4.698\n \n\n(1.192)\n \n \n \n\n\n\n\nWe can display statistics horizontally with:\n\nmodelsummary(m,\n shape = term ~ model + statistic,\n statistic = \"conf.int\",\n gof_map = NA)\n\n\n\n\n\n \n \n \n \n (1) \n \n \n (2) \n \n \n \n Est.\n 2.5 %\n 97.5 %\n Est. \n 2.5 % \n 97.5 % \n \n \n \n (Intercept)\n30.099\n26.762\n33.436\n10.790\n0.405\n21.175\n hp\n-0.068\n-0.089\n-0.048\n-0.052\n-0.071\n-0.033\n drat\n\n\n\n4.698\n2.261\n7.135\n \n \n \n\n\n\n\nThe order of terms in the formula determines the order of headers in the table.\n\nmodelsummary(m,\n shape = term ~ statistic + model,\n statistic = \"conf.int\",\n gof_map = NA)\n\n\n\n\n\n \n \n \n \n Est. 
\n \n \n 2.5 % \n \n \n 97.5 % \n \n \n \n (1)\n (2)\n (1) \n (2) \n (1) \n (2) \n \n \n \n (Intercept)\n30.099\n10.790\n26.762\n0.405\n33.436\n21.175\n hp\n-0.068\n-0.052\n-0.089\n-0.071\n-0.048\n-0.033\n drat\n\n4.698\n\n2.261\n\n7.135\n \n \n \n\n\n\n\nshape does partial matching and will try to fill-in incomplete formulas:\n\nmodelsummary(m, shape = ~ statistic)\n\nSome models like multinomial logit or GAMLSS produce “grouped” parameter estimates. To display these groups, we can include a group identifier in the shape formula. This group identifier must be one of the column names produced by get_estimates(model). For example, in models produced by nnet::multinom, the group identifier is called “response”:\n\nlibrary(nnet)\n\ndat_multinom <- mtcars\ndat_multinom$cyl <- sprintf(\"Cyl: %s\", dat_multinom$cyl)\n\nmod <- list(\n nnet::multinom(cyl ~ mpg, data = dat_multinom, trace = FALSE),\n nnet::multinom(cyl ~ mpg + drat, data = dat_multinom, trace = FALSE))\n\nget_estimates(mod[[1]])\n\n term estimate std.error conf.level conf.low conf.high statistic\n1 (Intercept) 47.252432 34.975171 0.95 -21.2976435 115.8025065 1.351028\n2 mpg -2.205418 1.637963 0.95 -5.4157653 1.0049299 -1.346440\n3 (Intercept) 72.440246 37.175162 0.95 -0.4217332 145.3022247 1.948619\n4 mpg -3.579991 1.774693 0.95 -7.0583242 -0.1016573 -2.017246\n df.error p.value response s.value group\n1 Inf 0.17668650 Cyl: 6 2.5 \n2 Inf 0.17816078 Cyl: 6 2.5 \n3 Inf 0.05134088 Cyl: 8 4.3 \n4 Inf 0.04366989 Cyl: 8 4.5 \n\n\nTo summarize the results, we can type:\n\nmodelsummary(mod, shape = term + response ~ statistic)\n\n\n\n\n\n \n \n \n response\n \n (1) \n \n \n (2) \n \n \n \n Est.\n S.E.\n Est. \n S.E. 
\n \n \n \n (Intercept)\nCyl: 6\n47.252\n34.975\n89.573\n86.884\n \nCyl: 8\n72.440\n37.175\n117.971\n87.998\n mpg\nCyl: 6\n-2.205\n1.638\n-3.627\n3.869\n \nCyl: 8\n-3.580\n1.775\n-4.838\n3.915\n drat\nCyl: 6\n\n\n-3.210\n3.810\n \nCyl: 8\n\n\n-5.028\n4.199\n Num.Obs.\n\n32\n\n32\n\n R2\n\n0.763\n\n0.815\n\n R2 Adj.\n\n0.733\n\n0.786\n\n AIC\n\n24.1\n\n24.5\n\n BIC\n\n30.0\n\n33.3\n\n RMSE\n\n0.24\n\n0.20\n\n \n \n \n\n\n\n\nThe terms of the shape formula above can of course be rearranged to reshape the table. For example:\n\nmodelsummary(mod, shape = model + term ~ response)\n\n\n\n\n\n \n \n \n \n Cyl: 6\n Cyl: 8\n \n \n \n (1)\n(Intercept)\n47.252\n72.440\n \n\n(34.975)\n(37.175)\n \nmpg\n-2.205\n-3.580\n \n\n(1.638)\n(1.775)\n (2)\n(Intercept)\n89.573\n117.971\n \n\n(86.884)\n(87.998)\n \nmpg\n-3.627\n-4.838\n \n\n(3.869)\n(3.915)\n \ndrat\n-3.210\n-5.028\n \n\n(3.810)\n(4.199)\n \n \n \n\n\n\n\nWe can combine the term and group identifier columns by inserting an interaction colon : instead of the + in the formula:\n\nlibrary(marginaleffects)\nmod <- glm(am ~ mpg + factor(cyl), family = binomial, data = mtcars)\nmfx <- slopes(mod)\n\nmodelsummary(mfx, shape = term + contrast ~ model)\n\n\n\n\n\n \n \n \n \n (1)\n \n \n \n cyl\nmean(6) - mean(4)\n0.097\n \n\n(0.166)\n \nmean(8) - mean(4)\n0.093\n \n\n(0.234)\n mpg\nmean(dY/dX)\n0.056\n \n\n(0.027)\n Num.Obs.\n\n32\n AIC\n\n37.4\n BIC\n\n43.3\n Log.Lik.\n\n-14.702\n F\n\n2.236\n RMSE\n\n0.39\n \n \n \n\n\n\n\n\nmodelsummary(mfx, shape = term : contrast ~ model)\n\n\n\n\n\n \n \n \n (1)\n \n \n \n cyl mean(6) - mean(4)\n0.097\n \n(0.166)\n cyl mean(8) - mean(4)\n0.093\n \n(0.234)\n mpg mean(dY/dX)\n0.056\n \n(0.027)\n Num.Obs.\n32\n AIC\n37.4\n BIC\n43.3\n Log.Lik.\n-14.702\n F\n2.236\n RMSE\n0.39\n \n \n \n\n\n\n\n\n\n\nNote: The code in this section requires version 1.3.0 or the development version of modelsummary. 
See the website for installation instructions.\nThis section shows how to “stack/bind” multiple regression tables on top of one another, to display the results of several models side-by-side and top-to-bottom. For example, imagine that we want to present 4 different models, half of which are estimated using a different outcome variable. When using modelsummary, we store models in a list. When using modelsummary with shape=\"rbind\" or shape=\"rcollapse\", we store models in a list of lists:\n\ngm <- c(\"r.squared\", \"nobs\", \"rmse\")\n\npanels <- list(\n list(\n lm(mpg ~ 1, data = mtcars),\n lm(mpg ~ qsec, data = mtcars)\n ),\n list(\n lm(hp ~ 1, data = mtcars),\n lm(hp ~ qsec, data = mtcars)\n )\n)\n\nmodelsummary(\n panels,\n shape = \"rbind\",\n gof_map = gm)\n\n\n\n\n\n \n \n \n (1)\n (2)\n \n \n \n \n Panel A\n \n (Intercept)\n20.091\n-5.114\n \n(1.065)\n(10.030)\n qsec\n\n1.412\n \n\n(0.559)\n R2\n0.000\n0.175\n Num.Obs.\n32\n32\n RMSE\n5.93\n5.39\n \n Panel B\n \n (Intercept)\n146.688\n631.704\n \n(12.120)\n(88.700)\n qsec\n\n-27.174\n \n\n(4.946)\n R2\n0.000\n0.502\n Num.Obs.\n32\n32\n RMSE\n67.48\n47.64\n \n \n \n\n\n\n\nAs with modelsummary(), we can name models and panels by naming elements of our nested list:\n\npanels <- list(\n \"Outcome: mpg\" = list(\n \"(I)\" = lm(mpg ~ 1, data = mtcars),\n \"(II)\" = lm(mpg ~ qsec, data = mtcars)\n ),\n \"Outcome: hp\" = list(\n \"(I)\" = lm(hp ~ 1, data = mtcars),\n \"(II)\" = lm(hp ~ qsec, data = mtcars)\n )\n)\n\nmodelsummary(\n panels,\n shape = \"rbind\",\n gof_map = gm)\n\n\n\n\n\n \n \n \n (I)\n (II)\n \n \n \n \n Outcome: mpg\n \n (Intercept)\n20.091\n-5.114\n \n(1.065)\n(10.030)\n qsec\n\n1.412\n \n\n(0.559)\n R2\n0.000\n0.175\n Num.Obs.\n32\n32\n RMSE\n5.93\n5.39\n \n Outcome: hp\n \n (Intercept)\n146.688\n631.704\n \n(12.120)\n(88.700)\n qsec\n\n-27.174\n \n\n(4.946)\n R2\n0.000\n0.502\n Num.Obs.\n32\n32\n RMSE\n67.48\n47.64\n \n \n \n\n\n\n\n\n\nThe fixest package offers powerful tools to estimate 
multiple models using a concise syntax. fixest functions are also convenient because they return named lists of models which are easy to subset and manipulate using standard R functions like grepl.\nFor example, to introduce regressors in stepwise fashion, and to estimate models on different subsets of the data, we can do:\n\n##| message = FALSE\n\n## estimate 4 models\nlibrary(fixest)\n\n\nAttaching package: 'fixest'\n\n\nThe following object is masked _by_ '.GlobalEnv':\n\n f\n\nmod <- feols(\n c(hp, mpg) ~ csw(qsec, drat) | gear,\n data = mtcars)\n\n## select models with different outcome variables\npanels <- list(\n \"Miles per gallon\" = mod[grepl(\"mpg\", names(mod))],\n \"Horsepower\" = mod[grepl(\"hp\", names(mod))]\n)\n\nmodelsummary(\n panels,\n shape = \"rcollapse\",\n gof_omit = \"IC|R2\")\n\n\n\n\n\n \n \n \n (1)\n (2)\n \n \n \n \n Miles per gallon\n \n qsec\n1.436\n1.519\n \n(0.594)\n(0.529)\n drat\n\n5.765\n \n\n(2.381)\n RMSE\n4.03\n3.67\n \n Horsepower\n \n qsec\n-22.175\n-22.676\n \n(12.762)\n(13.004)\n drat\n\n-35.106\n \n\n(28.509)\n RMSE\n40.45\n39.14\n \n \n \n Num.Obs.\n32\n32\n Std.Errors\nby: gear\nby: gear\n FE: gear\nX\nX\n \n \n \n\n\n\n\nWe can use all the typical extension systems to add information, such as the mean of the dependent variable:\n\nglance_custom.fixest <- function(x, ...) 
{\n dv <- insight::get_response(x)\n dv <- sprintf(\"%.2f\", mean(dv, na.rm = TRUE))\n data.table::data.table(`Mean of DV` = dv)\n}\n\nmodelsummary(\n panels,\n shape = \"rcollapse\",\n gof_omit = \"IC|R2\")\n\n\n\n\n\n \n \n \n (1)\n (2)\n \n \n \n \n Miles per gallon\n \n qsec\n1.436\n1.519\n \n(0.594)\n(0.529)\n drat\n\n5.765\n \n\n(2.381)\n RMSE\n4.03\n3.67\n Mean of DV\n20.09\n20.09\n \n Horsepower\n \n qsec\n-22.175\n-22.676\n \n(12.762)\n(13.004)\n drat\n\n-35.106\n \n\n(28.509)\n RMSE\n40.45\n39.14\n Mean of DV\n146.69\n146.69\n \n \n \n Num.Obs.\n32\n32\n Std.Errors\nby: gear\nby: gear\n FE: gear\nX\nX\n \n \n \n\n\n\nrm(\"glance_custom.fixest\")\n\n\n\n\n\n\nBy default, modelsummary will align the first column (with coefficient names) to the left, and will center the results columns. To change this default, you can use the align argument, which accepts a string of the same length as the number of columns:\n\nmodelsummary(models, align=\"lrrrrr\")\n\nUsers who produce PDF documents using Rmarkdown or LaTeX can also align values on the decimal dot by using the character “d” in the align argument:\n\nmodelsummary(models, align=\"lddddd\")\n\nFor the table produced by this code to compile, users must include the following code in their LaTeX preamble:\n\n\\usepackage{booktabs}\n\\usepackage{siunitx}\n\\newcolumntype{d}{S[input-symbols = ()]}\n\n\n\n\nAdd notes to the bottom of your table:\nmodelsummary(models, \n notes = list('Text of the first note.', \n 'Text of the second note.'))\n\n\n\nYou can add a title to your table as follows:\nmodelsummary(models, title = 'This is a title for my table.')\n\n\n\nUse the add_rows argument to add rows manually to a table. 
For example, let’s say you estimate two models with a factor variable and you want to insert (a) an empty row to identify the reference category, and (b) customized information at the bottom of the table:\n\nmodels <- list()\nmodels[['OLS']] <- lm(mpg ~ factor(cyl), mtcars)\nmodels[['Logit']] <- glm(am ~ factor(cyl), mtcars, family = binomial)\n\nWe create a data.frame with the same number of columns as the summary table. Then, we define a “position” attribute to specify where the new rows should be inserted in the table. Finally, we pass this data.frame to the add_rows argument:\n\nlibrary(tibble)\nrows <- tribble(~term, ~OLS, ~Logit,\n 'factor(cyl)4', '-', '-',\n 'Info', '???', 'XYZ')\nattr(rows, 'position') <- c(3, 9)\n\nmodelsummary(models, add_rows = rows)\n\n\n\n\n\n \n \n \n OLS\n Logit\n \n \n \n (Intercept)\n26.664\n0.981\n \n(0.972)\n(0.677)\n factor(cyl)4\n-\n-\n factor(cyl)6\n-6.921\n-1.269\n \n(1.558)\n(1.021)\n factor(cyl)8\n-11.564\n-2.773\n \n(1.299)\n(1.021)\n Num.Obs.\n32\n32\n Info\n???\nXYZ\n R2\n0.732\n\n R2 Adj.\n0.714\n\n AIC\n170.6\n39.9\n BIC\n176.4\n44.3\n Log.Lik.\n-81.282\n-16.967\n F\n39.698\n3.691\n RMSE\n3.07\n0.42\n \n \n \n\n\n\n\n\n\n\nWe can exponentiate model estimates using the exponentiate argument:\n\nmod_logit <- glm(am ~ mpg, data = mtcars, family = binomial)\nmodelsummary(mod_logit, exponentiate = TRUE)\n\n\n\n\n\n \n \n \n (1)\n \n \n \n (Intercept)\n0.001\n \n(0.003)\n mpg\n1.359\n \n(0.156)\n Num.Obs.\n32\n AIC\n33.7\n BIC\n36.6\n Log.Lik.\n-14.838\n F\n7.148\n RMSE\n0.39\n \n \n \n\n\n\n\nWe can also present exponentiated and standard models side by side by using a logical vector:\n\nmod_logit <- list(mod_logit, mod_logit)\nmodelsummary(mod_logit, exponentiate = c(TRUE, FALSE))\n\n\n\n\n\n \n \n \n (1)\n (2)\n \n \n \n (Intercept)\n0.001\n-6.604\n \n(0.003)\n(2.351)\n mpg\n1.359\n0.307\n \n(0.156)\n(0.115)\n Num.Obs.\n32\n32\n AIC\n33.7\n33.7\n BIC\n36.6\n36.6\n Log.Lik.\n-14.838\n-14.838\n F\n7.148\n7.148\n 
RMSE\n0.39\n0.39\n \n \n \n\n\n\n\n\n\n\nAll arguments passed by the user to a modelsummary function are pushed forward to two other functions:\n\nThe function which extracts model estimates.\n\nBy default, additional arguments are pushed forward to parameters::parameters and performance::performance. Users can also use a different “backend” to extract information from model objects: the broom package. By setting the modelsummary_get global option, we tell modelsummary to use the easystats/parameters packages instead of broom. With these packages, other arguments are available, such as the metrics argument. Please refer to these packages’ documentation for details.\n\nThe table-making functions.\n\nBy default, additional arguments are pushed forward to kableExtra::kbl, but users can use a different table-making function by setting the output argument to a different value such as \"gt\", \"flextable\", or \"huxtable\".\nSee the Appearance vignette for examples.\n\n\nAll arguments supported by these functions are thus automatically available directly in modelsummary, modelplot, and the datasummary family of functions.\n\n\n\nTo customize the appearance of tables, modelsummary supports five popular and extremely powerful table-making packages:\n\ngt: https://gt.rstudio.com\nkableExtra: http://haozhu233.github.io/kableExtra\nhuxtable: https://hughjonesd.github.io/huxtable/\nflextable: https://davidgohel.github.io/flextable/\nDT: https://rstudio.github.io/DT\n\nThe “customizing the look of your tables” vignette shows examples for these packages.\n\n\n\nmodelsummary automatically supports all the models supported by the tidy function of the broom package or the parameters function of the parameters package. The list of supported models is rapidly expanding. 
At the moment, it covers the following model classes:\n\nsupported_models()\n\n [1] \"aareg\" \"acf\" \n [3] \"afex_aov\" \"AKP\" \n [5] \"anova\" \"Anova.mlm\" \n [7] \"anova.rms\" \"aov\" \n [9] \"aovlist\" \"Arima\" \n [11] \"averaging\" \"bamlss\" \n [13] \"bayesQR\" \"bcplm\" \n [15] \"befa\" \"betamfx\" \n [17] \"betaor\" \"betareg\" \n [19] \"BFBayesFactor\" \"bfsl\" \n [21] \"BGGM\" \"bifeAPEs\" \n [23] \"biglm\" \"binDesign\" \n [25] \"binWidth\" \"blavaan\" \n [27] \"blrm\" \"boot\" \n [29] \"bootstrap_model\" \"bracl\" \n [31] \"brmsfit\" \"brmultinom\" \n [33] \"btergm\" \"cch\" \n [35] \"censReg\" \"cgam\" \n [37] \"character\" \"cld\" \n [39] \"clm\" \"clm2\" \n [41] \"clmm\" \"clmm2\" \n [43] \"coeftest\" \"comparisons\" \n [45] \"confint.glht\" \"confusionMatrix\" \n [47] \"coxph\" \"cpglmm\" \n [49] \"crr\" \"cv.glmnet\" \n [51] \"data.frame\" \"dbscan\" \n [53] \"default\" \"deltaMethod\" \n [55] \"density\" \"dep.effect\" \n [57] \"DirichletRegModel\" \"dist\" \n [59] \"draws\" \"drc\" \n [61] \"durbinWatsonTest\" \"emm_list\" \n [63] \"emmeans\" \"emmeans_summary\" \n [65] \"emmGrid\" \"epi.2by2\" \n [67] \"ergm\" \"fa\" \n [69] \"fa.ci\" \"factanal\" \n [71] \"FAMD\" \"feglm\" \n [73] \"felm\" \"fitdistr\" \n [75] \"fixest\" \"fixest_multi\" \n [77] \"flac\" \"flic\" \n [79] \"ftable\" \"gam\" \n [81] \"Gam\" \"gamlss\" \n [83] \"gamm\" \"garch\" \n [85] \"geeglm\" \"ggeffects\" \n [87] \"glht\" \"glimML\" \n [89] \"glm\" \"glmm\" \n [91] \"glmmTMB\" \"glmnet\" \n [93] \"glmrob\" \"glmRob\" \n [95] \"glmx\" \"gmm\" \n [97] \"hclust\" \"hdbscan\" \n [99] \"hglm\" \"hkmeans\" \n[101] \"HLfit\" \"htest\" \n[103] \"hurdle\" \"hypotheses\" \n[105] \"irlba\" \"ivFixed\" \n[107] \"ivprobit\" \"ivreg\" \n[109] \"kappa\" \"kde\" \n[111] \"Kendall\" \"kmeans\" \n[113] \"lavaan\" \"leveneTest\" \n[115] \"Line\" \"Lines\" \n[117] \"list\" \"lm\" \n[119] \"lm_robust\" \"lm.beta\" \n[121] \"lme\" \"lmodel2\" \n[123] \"lmrob\" \"lmRob\" \n[125] \"logical\" 
\"logistf\" \n[127] \"logitmfx\" \"logitor\" \n[129] \"lqm\" \"lqmm\" \n[131] \"lsmobj\" \"manova\" \n[133] \"maov\" \"map\" \n[135] \"marginaleffects\" \"marginalmeans\" \n[137] \"margins\" \"maxim\" \n[139] \"maxLik\" \"mblogit\" \n[141] \"Mclust\" \"mcmc\" \n[143] \"mcmc.list\" \"MCMCglmm\" \n[145] \"mcp1\" \"mcp2\" \n[147] \"med1way\" \"mediate\" \n[149] \"merMod\" \"merModList\" \n[151] \"meta_bma\" \"meta_fixed\" \n[153] \"meta_random\" \"metaplus\" \n[155] \"mfx\" \"mhurdle\" \n[157] \"mipo\" \"mira\" \n[159] \"mixed\" \"MixMod\" \n[161] \"mixor\" \"mjoint\" \n[163] \"mle\" \"mle2\" \n[165] \"mlm\" \"mlogit\" \n[167] \"mmrm\" \"mmrm_fit\" \n[169] \"mmrm_tmb\" \"model_fit\" \n[171] \"model_parameters\" \"muhaz\" \n[173] \"multinom\" \"mvord\" \n[175] \"negbin\" \"negbinirr\" \n[177] \"negbinmfx\" \"nestedLogit\" \n[179] \"nlrq\" \"nls\" \n[181] \"NULL\" \"numeric\" \n[183] \"omega\" \"onesampb\" \n[185] \"optim\" \"orcutt\" \n[187] \"osrt\" \"pairwise.htest\" \n[189] \"pam\" \"parameters_efa\" \n[191] \"parameters_pca\" \"PCA\" \n[193] \"pgmm\" \"plm\" \n[195] \"PMCMR\" \"poissonirr\" \n[197] \"poissonmfx\" \"poLCA\" \n[199] \"polr\" \"Polygon\" \n[201] \"Polygons\" \"power.htest\" \n[203] \"prcomp\" \"predictions\" \n[205] \"principal\" \"probitmfx\" \n[207] \"pvclust\" \"pyears\" \n[209] \"rcorr\" \"ref.grid\" \n[211] \"regsubsets\" \"ridgelm\" \n[213] \"rlm\" \"rlmerMod\" \n[215] \"rma\" \"robtab\" \n[217] \"roc\" \"rq\" \n[219] \"rqs\" \"rqss\" \n[221] \"sarlm\" \"Sarlm\" \n[223] \"scam\" \"selection\" \n[225] \"sem\" \"SemiParBIV\" \n[227] \"slopes\" \"SpatialLinesDataFrame\" \n[229] \"SpatialPolygons\" \"SpatialPolygonsDataFrame\"\n[231] \"spec\" \"speedglm\" \n[233] \"speedlm\" \"stanfit\" \n[235] \"stanmvreg\" \"stanreg\" \n[237] \"summary_emm\" \"summary.glht\" \n[239] \"summary.lm\" \"summary.plm\" \n[241] \"summaryDefault\" \"survdiff\" \n[243] \"survexp\" \"survfit\" \n[245] \"survreg\" \"svd\" \n[247] \"svyglm\" \"svyolr\" \n[249] \"svytable\" 
\"systemfit\" \n[251] \"t1way\" \"table\" \n[253] \"tobit\" \"trendPMCMR\" \n[255] \"trimcibt\" \"ts\" \n[257] \"TukeyHSD\" \"varest\" \n[259] \"vgam\" \"wbgee\" \n[261] \"wbm\" \"wmcpAKP\" \n[263] \"xyz\" \"yuen\" \n[265] \"zcpglm\" \"zerocount\" \n[267] \"zeroinfl\" \"zoo\" \n\n\nTo see if a given model is supported, you can fit it, and then call this function:\n\nget_estimates(model)\n\nIf this function does not return a valid output, you can easily (really!!) add your own support. See the next section for a tutorial. If you do this, you may consider opening an issue on the Github website of the broom package: https://github.com/tidymodels/broom/issues\n\n\n\n\n\nYou can use modelsummary to insert tables into dynamic documents with knitr or Rmarkdown. This minimal .Rmd file can produce tables in PDF, HTML, or RTF documents:\n\nminimal.Rmd\n\nThis .Rmd file shows illustrates how to use table numbering and cross-references to produce PDF documents using bookdown:\n\ncross_references.Rmd\n\nThis .Rmd file shows how to customize tables in PDF and HTML files using gt and kableExtra functions:\n\nappearance.Rmd\n\n\n\n\nQuarto is an open source publishing system built on top of Pandoc. It was designed as a “successor” to Rmarkdown, and includes useful features for technical writing, such as built-in support for cross-references. modelsummary works automatically with Quarto. This is a minimal document with cross-references which should render automatically to PDF, HTML, and more:\n\n---\nformat: pdf\ntitle: Example\n---\n\n@tbl-mtcars shows that cars with high horse power get low miles per gallon.\n\n```{r}\n##| label: tbl-mtcars\n##| tbl-cap: \"Horse Powers vs. Miles per Gallon\"\nlibrary(modelsummary)\nmod <- lm(mpg ~ hp, mtcars)\nmodelsummary(mod)\n:::\n\n### Emacs Org-Mode\n\nYou can use `modelsummary` to insert tables into Emacs Org-Mode documents, which can be exported to a variety of formats, including HTML and PDF (via LaTeX). 
As with anything Emacs-related, there are many ways to achieve the outcomes you want. Here is one example of an Org-Mode document which can automatically export tables to HTML and PDF without manual tweaks:\n\n#+PROPERTY: header-args:R :var orgbackend=(prin1-to-string org-export-current-backend)\n#+MACRO: Rtable (eval (concat \"#+header: :results output \" (prin1-to-string org-export-current-backend)))\n\n{{{Rtable}}}\n#+BEGIN_SRC R :exports both\nlibrary(modelsummary)\noptions(modelsummary_factory_default = orgbackend)\n\nmod = lm(hp ~ mpg, data = mtcars)\nmodelsummary(mod)\n#+END_SRC\n\nThe first line tells Org-mode to assign a variable called `orgbackend`. This variable will be accessible by the `R` session, and will be equal to \"html\" or \"latex\", depending on the export format.\n\nThe second line creates an Org macro which we will use to automatically add useful information to the header of source blocks. For instance, when we export to HTML, the macro will expand to `:results output html`. This tells Org-Mode to insert the last printed output from the `R` session, and to treat it as raw HTML. \n\nThe `{{{Rtable}}}` call expands the macro to add information to the header of the block that follows.\n\n`#+BEGIN_SRC R :exports both` says that we want to print both the original code and the output (`:exports results` would omit the code, for example).\n\nFinally, `options(modelsummary_factory_default = orgbackend)` uses the variable we defined to set the default output format. That way, we don't have to use the `output` argument every time.\n\nOne potential issue to keep in mind is that the code above extracts the printout from the `R` console. However, when we customize tables with `kableExtra` or `gt` functions, those functions do not always return printed raw HTML or LaTeX code. Sometimes, it can be necessary to add a call to `cat` at the end of a table customization pipeline. 
For example:\n\n{{{Rtable}}}\n#+BEGIN_SRC R :exports both\nlibrary(modelsummary)\nlibrary(kableExtra)\n\nmod = lm(hp ~ mpg, data = mtcars)\nmodelsummary(mod, output = orgbackend) %>%\n row_spec(1, background = \"pink\") %>%\n cat()\n#+END_SRC\n\n## Global options\n\nUsers can change the default behavior of `modelsummary` by setting global options.\n\nOmit the note at the bottom of the table with significance threshold:\n\n::: {.cell}\n\n```{.r .cell-code}\noptions(\"modelsummary_stars_note\" = FALSE)\n```\n:::\n\nChange the default output format:\n\n::: {.cell}\n\n```{.r .cell-code}\noptions(modelsummary_factory_default = \"latex\")\noptions(modelsummary_factory_default = \"gt\")\n```\n:::\n\nChange the backend packages that `modelsummary` uses to create tables in different output formats:\n\n::: {.cell}\n\n```{.r .cell-code}\noptions(modelsummary_factory_html = 'kableExtra')\noptions(modelsummary_factory_latex = 'flextable')\noptions(modelsummary_factory_word = 'huxtable')\noptions(modelsummary_factory_png = 'gt')\n```\n:::\n\nChange the packages that `modelsummary` uses to extract information from models:\n\n::: {.cell}\n\n```{.r .cell-code}\n## tidymodels: broom \noptions(modelsummary_get = \"broom\")\n\n## easystats: performance + parameters\noptions(modelsummary_get = \"easystats\")\n```\n:::\n\n[The `appearance` vignette](https://modelsummary.com/articles/appearance.html#themes) shows how to set \"themes\" for your tables using the `modelsummary_theme_gt`, `modelsummary_theme_kableExtra`, `modelsummary_theme_flextable` and `modelsummary_theme_huxtable` global options. For example:\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(gt)\n\n## The ... ellipsis is required!\ncustom_theme <- function(x, ...) 
{\n x %>% gt::opt_row_striping(row_striping = TRUE)\n}\noptions(\"modelsummary_theme_gt\" = custom_theme)\n\nmod <- lm(mpg ~ hp + drat, mtcars)\nmodelsummary(mod, output = \"gt\")\n```\n:::\n\n## Case studies\n\n\n### Standardization\n\nIn some cases, it is useful to standardize coefficients before reporting them. `modelsummary` extracts coefficients from model objects using the `parameters` package, and that package offers several options for standardization: https://easystats.github.io/parameters/reference/model_parameters.default.html\n\nWe can pass the `standardize` argument directly to `modelsummary` or `modelplot`, and that argument will be forwarded to `parameters`. For example to refit the model on standardized data and plot the results, we can do:\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(modelsummary)\nmod <- lm(mpg ~ hp + am, data = mtcars)\n\nmodelplot(mod, standardize = \"refit\")\n```\n\n::: {.cell-output .cell-output-stderr}\n```\nWarning in ggplot2::geom_pointrange(ggplot2::aes(y = term, x = estimate, :\nIgnoring unknown parameters: `standardize`\n```\n:::\n\n::: {.cell-output-display}\n![](modelsummary_files/figure-html/unnamed-chunk-60-1.png){width=672}\n:::\n:::\n\nCompare to the unstandardized plot:\n\n::: {.cell}\n\n```{.r .cell-code}\nmodelplot(mod)\n```\n\n::: {.cell-output-display}\n![](modelsummary_files/figure-html/unnamed-chunk-61-1.png){width=672}\n:::\n:::\n\n### Subgroup estimation with `nest_by`\n\nSometimes, it is useful to estimate multiple regression models on subsets of the data. To do this efficiently, we can use the `nest_by` function from the `dplyr` package. 
Then, estimate the models with `lm`, extract them and name them with `pull`, and finally summarize them with `modelsummary`:\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(tidyverse)\n\nmtcars %>%\n nest_by(cyl) %>%\n mutate(models = list(lm(mpg ~ hp, data))) %>%\n pull(models, name = cyl) %>%\n modelsummary\n```\n\n::: {.cell-output-display}\n\n```{=html}\n<div id=\"wkbhfqnlym\" style=\"padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;\">\n<style>#wkbhfqnlym table {\n font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';\n -webkit-font-smoothing: antialiased;\n -moz-osx-font-smoothing: grayscale;\n}\n\n#wkbhfqnlym thead, #wkbhfqnlym tbody, #wkbhfqnlym tfoot, #wkbhfqnlym tr, #wkbhfqnlym td, #wkbhfqnlym th {\n border-style: none;\n}\n\n#wkbhfqnlym p {\n margin: 0;\n padding: 0;\n}\n\n#wkbhfqnlym .gt_table {\n display: table;\n border-collapse: collapse;\n line-height: normal;\n margin-left: auto;\n margin-right: auto;\n color: #333333;\n font-size: 16px;\n font-weight: normal;\n font-style: normal;\n background-color: #FFFFFF;\n width: auto;\n border-top-style: solid;\n border-top-width: 2px;\n border-top-color: #A8A8A8;\n border-right-style: none;\n border-right-width: 2px;\n border-right-color: #D3D3D3;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #A8A8A8;\n border-left-style: none;\n border-left-width: 2px;\n border-left-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_caption {\n padding-top: 4px;\n padding-bottom: 4px;\n}\n\n#wkbhfqnlym .gt_title {\n color: #333333;\n font-size: 125%;\n font-weight: initial;\n padding-top: 4px;\n padding-bottom: 4px;\n padding-left: 5px;\n padding-right: 5px;\n border-bottom-color: #FFFFFF;\n border-bottom-width: 0;\n}\n\n#wkbhfqnlym .gt_subtitle {\n color: #333333;\n font-size: 85%;\n font-weight: initial;\n padding-top: 3px;\n 
padding-bottom: 5px;\n padding-left: 5px;\n padding-right: 5px;\n border-top-color: #FFFFFF;\n border-top-width: 0;\n}\n\n#wkbhfqnlym .gt_heading {\n background-color: #FFFFFF;\n text-align: center;\n border-bottom-color: #FFFFFF;\n border-left-style: none;\n border-left-width: 1px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 1px;\n border-right-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_bottom_border {\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_col_headings {\n border-top-style: solid;\n border-top-width: 2px;\n border-top-color: #D3D3D3;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n border-left-style: none;\n border-left-width: 1px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 1px;\n border-right-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_col_heading {\n color: #333333;\n background-color: #FFFFFF;\n font-size: 100%;\n font-weight: normal;\n text-transform: inherit;\n border-left-style: none;\n border-left-width: 1px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 1px;\n border-right-color: #D3D3D3;\n vertical-align: bottom;\n padding-top: 5px;\n padding-bottom: 6px;\n padding-left: 5px;\n padding-right: 5px;\n overflow-x: hidden;\n}\n\n#wkbhfqnlym .gt_column_spanner_outer {\n color: #333333;\n background-color: #FFFFFF;\n font-size: 100%;\n font-weight: normal;\n text-transform: inherit;\n padding-top: 0;\n padding-bottom: 0;\n padding-left: 4px;\n padding-right: 4px;\n}\n\n#wkbhfqnlym .gt_column_spanner_outer:first-child {\n padding-left: 0;\n}\n\n#wkbhfqnlym .gt_column_spanner_outer:last-child {\n padding-right: 0;\n}\n\n#wkbhfqnlym .gt_column_spanner {\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n vertical-align: bottom;\n padding-top: 5px;\n padding-bottom: 5px;\n overflow-x: hidden;\n display: 
inline-block;\n width: 100%;\n}\n\n#wkbhfqnlym .gt_spanner_row {\n border-bottom-style: hidden;\n}\n\n#wkbhfqnlym .gt_group_heading {\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n color: #333333;\n background-color: #FFFFFF;\n font-size: 100%;\n font-weight: initial;\n text-transform: inherit;\n border-top-style: solid;\n border-top-width: 2px;\n border-top-color: #D3D3D3;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n border-left-style: none;\n border-left-width: 1px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 1px;\n border-right-color: #D3D3D3;\n vertical-align: middle;\n text-align: left;\n}\n\n#wkbhfqnlym .gt_empty_group_heading {\n padding: 0.5px;\n color: #333333;\n background-color: #FFFFFF;\n font-size: 100%;\n font-weight: initial;\n border-top-style: solid;\n border-top-width: 2px;\n border-top-color: #D3D3D3;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n vertical-align: middle;\n}\n\n#wkbhfqnlym .gt_from_md > :first-child {\n margin-top: 0;\n}\n\n#wkbhfqnlym .gt_from_md > :last-child {\n margin-bottom: 0;\n}\n\n#wkbhfqnlym .gt_row {\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n margin: 10px;\n border-top-style: solid;\n border-top-width: 1px;\n border-top-color: #D3D3D3;\n border-left-style: none;\n border-left-width: 1px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 1px;\n border-right-color: #D3D3D3;\n vertical-align: middle;\n overflow-x: hidden;\n}\n\n#wkbhfqnlym .gt_stub {\n color: #333333;\n background-color: #FFFFFF;\n font-size: 100%;\n font-weight: initial;\n text-transform: inherit;\n border-right-style: solid;\n border-right-width: 2px;\n border-right-color: #D3D3D3;\n padding-left: 5px;\n padding-right: 5px;\n}\n\n#wkbhfqnlym .gt_stub_row_group {\n color: #333333;\n background-color: #FFFFFF;\n 
font-size: 100%;\n font-weight: initial;\n text-transform: inherit;\n border-right-style: solid;\n border-right-width: 2px;\n border-right-color: #D3D3D3;\n padding-left: 5px;\n padding-right: 5px;\n vertical-align: top;\n}\n\n#wkbhfqnlym .gt_row_group_first td {\n border-top-width: 2px;\n}\n\n#wkbhfqnlym .gt_row_group_first th {\n border-top-width: 2px;\n}\n\n#wkbhfqnlym .gt_summary_row {\n color: #333333;\n background-color: #FFFFFF;\n text-transform: inherit;\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n}\n\n#wkbhfqnlym .gt_first_summary_row {\n border-top-style: solid;\n border-top-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_first_summary_row.thick {\n border-top-width: 2px;\n}\n\n#wkbhfqnlym .gt_last_summary_row {\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_grand_summary_row {\n color: #333333;\n background-color: #FFFFFF;\n text-transform: inherit;\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n}\n\n#wkbhfqnlym .gt_first_grand_summary_row {\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n border-top-style: double;\n border-top-width: 6px;\n border-top-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_last_grand_summary_row_top {\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n border-bottom-style: double;\n border-bottom-width: 6px;\n border-bottom-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_striped {\n background-color: rgba(128, 128, 128, 0.05);\n}\n\n#wkbhfqnlym .gt_table_body {\n border-top-style: solid;\n border-top-width: 2px;\n border-top-color: #D3D3D3;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_footnotes {\n color: #333333;\n background-color: #FFFFFF;\n border-bottom-style: none;\n 
border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n border-left-style: none;\n border-left-width: 2px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 2px;\n border-right-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_footnote {\n margin: 0px;\n font-size: 90%;\n padding-top: 4px;\n padding-bottom: 4px;\n padding-left: 5px;\n padding-right: 5px;\n}\n\n#wkbhfqnlym .gt_sourcenotes {\n color: #333333;\n background-color: #FFFFFF;\n border-bottom-style: none;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n border-left-style: none;\n border-left-width: 2px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 2px;\n border-right-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_sourcenote {\n font-size: 90%;\n padding-top: 4px;\n padding-bottom: 4px;\n padding-left: 5px;\n padding-right: 5px;\n}\n\n#wkbhfqnlym .gt_left {\n text-align: left;\n}\n\n#wkbhfqnlym .gt_center {\n text-align: center;\n}\n\n#wkbhfqnlym .gt_right {\n text-align: right;\n font-variant-numeric: tabular-nums;\n}\n\n#wkbhfqnlym .gt_font_normal {\n font-weight: normal;\n}\n\n#wkbhfqnlym .gt_font_bold {\n font-weight: bold;\n}\n\n#wkbhfqnlym .gt_font_italic {\n font-style: italic;\n}\n\n#wkbhfqnlym .gt_super {\n font-size: 65%;\n}\n\n#wkbhfqnlym .gt_footnote_marks {\n font-size: 75%;\n vertical-align: 0.4em;\n position: initial;\n}\n\n#wkbhfqnlym .gt_asterisk {\n font-size: 100%;\n vertical-align: 0;\n}\n\n#wkbhfqnlym .gt_indent_1 {\n text-indent: 5px;\n}\n\n#wkbhfqnlym .gt_indent_2 {\n text-indent: 10px;\n}\n\n#wkbhfqnlym .gt_indent_3 {\n text-indent: 15px;\n}\n\n#wkbhfqnlym .gt_indent_4 {\n text-indent: 20px;\n}\n\n#wkbhfqnlym .gt_indent_5 {\n text-indent: 25px;\n}\n</style>\n<table class=\"gt_table\" data-quarto-disable-processing=\"false\" data-quarto-bootstrap=\"false\">\n <thead>\n <tr class=\"gt_col_headings\">\n <th class=\"gt_col_heading gt_columns_bottom_border gt_left\" rowspan=\"1\" colspan=\"1\" scope=\"col\" id=\" \"> </th>\n <th 
class=\"gt_col_heading gt_columns_bottom_border gt_center\" rowspan=\"1\" colspan=\"1\" scope=\"col\" id=\"4\">4</th>\n <th class=\"gt_col_heading gt_columns_bottom_border gt_center\" rowspan=\"1\" colspan=\"1\" scope=\"col\" id=\"6\">6</th>\n <th class=\"gt_col_heading gt_columns_bottom_border gt_center\" rowspan=\"1\" colspan=\"1\" scope=\"col\" id=\"8\">8</th>\n </tr>\n </thead>\n <tbody class=\"gt_table_body\">\n <tr><td headers=\" \" class=\"gt_row gt_left\">(Intercept)</td>\n<td headers=\"4\" class=\"gt_row gt_center\">35.983</td>\n<td headers=\"6\" class=\"gt_row gt_center\">20.674</td>\n<td headers=\"8\" class=\"gt_row gt_center\">18.080</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\"></td>\n<td headers=\"4\" class=\"gt_row gt_center\">(5.201)</td>\n<td headers=\"6\" class=\"gt_row gt_center\">(3.304)</td>\n<td headers=\"8\" class=\"gt_row gt_center\">(2.988)</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">hp</td>\n<td headers=\"4\" class=\"gt_row gt_center\">-0.113</td>\n<td headers=\"6\" class=\"gt_row gt_center\">-0.008</td>\n<td headers=\"8\" class=\"gt_row gt_center\">-0.014</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\" style=\"border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #000000;\"></td>\n<td headers=\"4\" class=\"gt_row gt_center\" style=\"border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #000000;\">(0.061)</td>\n<td headers=\"6\" class=\"gt_row gt_center\" style=\"border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #000000;\">(0.027)</td>\n<td headers=\"8\" class=\"gt_row gt_center\" style=\"border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #000000;\">(0.014)</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">Num.Obs.</td>\n<td headers=\"4\" class=\"gt_row gt_center\">11</td>\n<td headers=\"6\" class=\"gt_row gt_center\">7</td>\n<td headers=\"8\" class=\"gt_row gt_center\">14</td></tr>\n <tr><td headers=\" 
\" class=\"gt_row gt_left\">R2</td>\n<td headers=\"4\" class=\"gt_row gt_center\">0.274</td>\n<td headers=\"6\" class=\"gt_row gt_center\">0.016</td>\n<td headers=\"8\" class=\"gt_row gt_center\">0.080</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">R2 Adj.</td>\n<td headers=\"4\" class=\"gt_row gt_center\">0.193</td>\n<td headers=\"6\" class=\"gt_row gt_center\">-0.181</td>\n<td headers=\"8\" class=\"gt_row gt_center\">0.004</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">AIC</td>\n<td headers=\"4\" class=\"gt_row gt_center\">65.8</td>\n<td headers=\"6\" class=\"gt_row gt_center\">29.9</td>\n<td headers=\"8\" class=\"gt_row gt_center\">69.8</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">BIC</td>\n<td headers=\"4\" class=\"gt_row gt_center\">67.0</td>\n<td headers=\"6\" class=\"gt_row gt_center\">29.7</td>\n<td headers=\"8\" class=\"gt_row gt_center\">71.8</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">Log.Lik.</td>\n<td headers=\"4\" class=\"gt_row gt_center\">-29.891</td>\n<td headers=\"6\" class=\"gt_row gt_center\">-11.954</td>\n<td headers=\"8\" class=\"gt_row gt_center\">-31.920</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">RMSE</td>\n<td headers=\"4\" class=\"gt_row gt_center\">3.66</td>\n<td headers=\"6\" class=\"gt_row gt_center\">1.33</td>\n<td headers=\"8\" class=\"gt_row gt_center\">2.37</td></tr>\n </tbody>\n \n \n</table>\n</div>\n```\n\n:::\n:::\n\n### Statistics in separate columns instead of one over the other\n\nIn somes cases, you may want to display statistics in separate columns instead of one over the other. It is easy to achieve this outcome by using the `estimate` argument. This argument accepts a vector of values, one for each of the models we are trying to summarize. If we want to include estimates and standard errors in separate columns, all we need to do is repeat a model, but request different statistics. 
For example,\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(modelsummary)\nlibrary(kableExtra)\n\nmod1 <- lm(mpg ~ hp, mtcars)\nmod2 <- lm(mpg ~ hp + drat, mtcars)\n\nmodels <- list(\n \"Coef.\" = mod1,\n \"Std.Error\" = mod1,\n \"Coef.\" = mod2,\n \"Std.Error\" = mod2)\n\nmodelsummary(models,\n estimate = c(\"estimate\", \"std.error\", \"estimate\", \"std.error\"),\n statistic = NULL,\n gof_omit = \".*\",\n output = \"kableExtra\") %>%\n add_header_above(c(\" \" = 1, \"Model A\" = 2, \"Model B\" = 2))\n```\n\n::: {.cell-output-display}\n\n`````{=html}\n<table class=\"table\" style=\"width: auto !important; margin-left: auto; margin-right: auto;\">\n <thead>\n<tr>\n<th style=\"empty-cells: hide;border-bottom:hidden;\" colspan=\"1\"></th>\n<th style=\"border-bottom:hidden;padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; \" colspan=\"2\"><div style=\"border-bottom: 1px solid #ddd; padding-bottom: 5px; \">Model A</div></th>\n<th style=\"border-bottom:hidden;padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; \" colspan=\"2\"><div style=\"border-bottom: 1px solid #ddd; padding-bottom: 5px; \">Model B</div></th>\n</tr>\n <tr>\n <th style=\"text-align:left;\"> </th>\n <th style=\"text-align:center;\"> Coef. </th>\n <th style=\"text-align:center;\"> Std.Error </th>\n <th style=\"text-align:center;\"> Coef. 
</th>\n <th style=\"text-align:center;\"> Std.Error </th>\n </tr>\n </thead>\n<tbody>\n <tr>\n <td style=\"text-align:left;\"> (Intercept) </td>\n <td style=\"text-align:center;\"> 30.099 </td>\n <td style=\"text-align:center;\"> 1.634 </td>\n <td style=\"text-align:center;\"> 10.790 </td>\n <td style=\"text-align:center;\"> 5.078 </td>\n </tr>\n <tr>\n <td style=\"text-align:left;\"> hp </td>\n <td style=\"text-align:center;\"> −0.068 </td>\n <td style=\"text-align:center;\"> 0.010 </td>\n <td style=\"text-align:center;\"> −0.052 </td>\n <td style=\"text-align:center;\"> 0.009 </td>\n </tr>\n <tr>\n <td style=\"text-align:left;\"> drat </td>\n <td style=\"text-align:center;\"> </td>\n <td style=\"text-align:center;\"> </td>\n <td style=\"text-align:center;\"> 4.698 </td>\n <td style=\"text-align:center;\"> 1.192 </td>\n </tr>\n</tbody>\n</table>\n\n\n:::\nThis can be automated using a simple function:\n\nside_by_side <- function(models, estimates, ...) {\n # number of distinct models, recorded before the list is expanded\n n_models <- length(models)\n models <- rep(models, each = length(estimates))\n estimates <- rep(estimates, times = n_models)\n names(models) <- names(estimates)\n modelsummary(models = models, estimate = estimates,\n statistic = NULL, gof_omit = \".*\", ...)\n}\n\nmodels = list(\n lm(mpg ~ hp, mtcars),\n lm(mpg ~ hp + drat, mtcars))\n\nestimates <- c(\"Coef.\" = \"estimate\", \"Std.Error\" = \"std.error\")\n\nside_by_side(models, estimates = estimates)\n\n\n\n\n\n \n \n \n Coef.\n Std.Error\n Coef. \n Std.Error \n \n \n \n (Intercept)\n30.099\n1.634\n10.790\n5.078\n hp\n-0.068\n0.010\n-0.052\n0.009\n drat\n\n\n4.698\n1.192\n \n \n \n\n\n\n\n\n\n\nUsers often want to use estimates or standard errors that have been obtained using a custom strategy. 
To achieve this in an automated and replicable way, it can be useful to use the tidy_custom strategy described above in the “Customizing Existing Models” section.\nFor example, we can use the modelr package to draw 500 resamples of a dataset, and compute bootstrap standard errors by taking the standard deviation of estimates computed in all of those resampled datasets. To do this, we define a tidy_custom.lm function that will automatically bootstrap any lm model supplied to modelsummary, and replace the values in the table automatically.\nNote that tidy_custom.lm returns a data.frame with 3 columns: term, estimate, and std.error:\n\nlibrary(\"modelsummary\")\nlibrary(\"broom\")\nlibrary(\"tidyverse\")\nlibrary(\"modelr\")\n\ntidy_custom.lm <- function(x, ...) {\n # extract data from the model\n model.frame(x) %>%\n # draw 500 bootstrap resamples\n modelr::bootstrap(n = 500) %>%\n # estimate the model 500 times\n mutate(results = map(strap, ~ update(x, data = .))) %>%\n # extract results using `broom::tidy`\n mutate(results = map(results, tidy)) %>%\n # unnest and summarize\n unnest(results) %>%\n group_by(term) %>%\n summarize(std.error = sd(estimate),\n estimate = mean(estimate))\n}\n\nmod = list(\n lm(hp ~ mpg, mtcars),\n lm(hp ~ mpg + drat, mtcars))\n\nmodelsummary(mod)\n\n\n\n\n\n \n \n \n (1)\n (2)\n \n \n \n (Intercept)\n328.764\n284.150\n \n(31.484)\n(42.113)\n mpg\n-9.081\n-9.994\n \n(1.420)\n(2.411)\n drat\n\n17.396\n \n\n(20.667)\n Num.Obs.\n32\n32\n R2\n0.602\n0.614\n R2 Adj.\n0.589\n0.588\n AIC\n336.9\n337.9\n BIC\n341.3\n343.7\n Log.Lik.\n-165.428\n-164.940\n F\n45.460\n23.100\n RMSE\n42.55\n41.91\n \n \n \n\n\n\n\n\n\n\nOne common use-case for glance_custom is to include additional goodness-of-fit statistics. 
For example, in an instrumental variable estimation computed by the fixest package, we may want to include an IV-Wald statistic for the first-stage regression of each endogenous regressor:\n\nlibrary(fixest)\nlibrary(tidyverse)\n\n## create a toy dataset\nbase <- iris\nnames(base) <- c(\"y\", \"x1\", \"x_endo_1\", \"x_inst_1\", \"fe\")\nbase$x_inst_2 <- 0.2 * base$y + 0.2 * base$x_endo_1 + rnorm(150, sd = 0.5)\nbase$x_endo_2 <- 0.2 * base$y - 0.2 * base$x_inst_1 + rnorm(150, sd = 0.5)\n\n## estimate an instrumental variable model\nmod <- feols(y ~ x1 | fe | x_endo_1 + x_endo_2 ~ x_inst_1 + x_inst_2, base)\n\n## custom extractor function returns a one-row data.frame (or tibble)\nglance_custom.fixest <- function(x) {\n tibble(\n \"Wald (x_endo_1)\" = fitstat(x, \"ivwald\")[[1]]$stat,\n \"Wald (x_endo_2)\" = fitstat(x, \"ivwald\")[[2]]$stat\n )\n}\n\n## draw table\nmodelsummary(mod)\n\n\n\n\n\n \n \n \n (1)\n \n \n \n fit_x_endo_1\n0.772\n \n(1.947)\n fit_x_endo_2\n-5.721\n \n(37.320)\n x1\n0.646\n \n(0.459)\n Num.Obs.\n150\n R2\n-11.399\n R2 Adj.\n-11.830\n R2 Within\n-31.519\n R2 Within Adj.\n-32.197\n AIC\n757.7\n BIC\n775.8\n RMSE\n2.91\n Std.Errors\nby: fe\n FE: fe\nX\n Wald (x_endo_1)\n15.65305145663\n Wald (x_endo_2)\n0.0222671859109197\n \n \n \n\n\n\n\n\nrm(\"glance_custom.fixest\")\n\n\n\n\nmodelsummary can pool and display analyses on several datasets imputed using the mice or Amelia packages. 
This code illustrates how:\n\nlibrary(mice)\nlibrary(Amelia)\nlibrary(modelsummary)\n\n## Download data from `Rdatasets`\nurl <- 'https://vincentarelbundock.github.io/Rdatasets/csv/HistData/Guerry.csv'\ndat <- read.csv(url)[, c('Clergy', 'Commerce', 'Literacy')]\n\n## Insert missing values\ndat$Clergy[sample(1:nrow(dat), 10)] <- NA\ndat$Commerce[sample(1:nrow(dat), 10)] <- NA\ndat$Literacy[sample(1:nrow(dat), 10)] <- NA\n\n## Impute with `mice` and `Amelia`\ndat_mice <- mice(dat, m = 5, printFlag = FALSE)\ndat_amelia <- amelia(dat, m = 5, p2s = 0)$imputations\n\n## Estimate models\nmod <- list()\nmod[['Listwise deletion']] <- lm(Clergy ~ Literacy + Commerce, dat)\nmod[['Mice']] <- with(dat_mice, lm(Clergy ~ Literacy + Commerce)) \nmod[['Amelia']] <- lapply(dat_amelia, function(x) lm(Clergy ~ Literacy + Commerce, x))\n\n## Pool results\nmod[['Mice']] <- mice::pool(mod[['Mice']])\nmod[['Amelia']] <- mice::pool(mod[['Amelia']])\n\n## Summarize\nmodelsummary(mod)\n\n\n\n\n\n \n \n \n Listwise deletion\n Mice\n Amelia\n \n \n \n (Intercept)\n63.166\n56.037\n68.298\n \n(15.624)\n(13.336)\n(12.735)\n Literacy\n-0.303\n-0.215\n-0.406\n \n(0.250)\n(0.207)\n(0.206)\n Commerce\n-0.136\n-0.082\n-0.184\n \n(0.164)\n(0.158)\n(0.140)\n Num.Obs.\n59\n86\n86\n Num.Imp.\n\n5\n5\n R2\n0.026\n0.018\n0.054\n R2 Adj.\n-0.009\n\n0.030\n AIC\n549.2\n\n\n BIC\n557.5\n\n\n Log.Lik.\n-270.576\n\n\n RMSE\n23.74\n\n\n \n \n \n\n\n\n\n\n\n\n\nThe table-making backends supported by modelsummary have overlapping capabilities (e.g., several of them can produce HTML tables). 
These are the default packages used for different outputs:\nkableExtra:\n\nHTML\nLaTeX / PDF\n\nflextable:\n\nWord\nPowerpoint\n\ngt:\n\njpg\npng\n\nYou can modify these defaults by setting global options such as:\noptions(modelsummary_factory_html = \"kableExtra\")\noptions(modelsummary_factory_latex = \"gt\")\noptions(modelsummary_factory_word = \"huxtable\")\noptions(modelsummary_factory_png = \"gt\")\n\n\n\n\n\n\nStandardized coefficients\nRow group labels\nCustomizing Word tables\nHow to add p values to datasummary_correlation\n\n\n\n\nFirst, please read the documentation in ?modelsummary and on the modelsummary website. The website includes dozens of worked examples and a lot of detailed explanation.\nSecond, try to use the [modelsummary] tag on StackOverflow.\nThird, if you think you found a bug or have a feature request, please file it on the Github issue tracker:\n\n\n\nSee the detailed documentation in the “Adding and Customizing Models” section of the modelsummary website.\n\n\n\nA modelsummary table is divided in two parts: “Estimates” (top of the table) and “Goodness-of-fit” (bottom of the table). To populate those two parts, modelsummary tries using the broom, parameters and performance packages in sequence.\nEstimates:\n\nTry the broom::tidy function to see if that package supports this model type, or if the user defined a custom tidy function in their global environment. If this fails…\nTry the parameters::model_parameters function to see if the parameters package supports this model type.\n\nGoodness-of-fit:\n\nTry the performance::model_performance function to see if the performance package supports this model type.\nTry the broom::glance function to see if that package supports this model type, or if the user defined a custom glance function in their global environment. 
If this fails…\n\nYou can change the order in which those steps are executed by setting a global option:\n\n## tidymodels: broom \noptions(modelsummary_get = \"broom\")\n\n## easystats: performance + parameters\noptions(modelsummary_get = \"easystats\")\n\nIf all of this fails, modelsummary will return an error message.\nIf you have problems with a model object, you can often diagnose the problem by running the following commands from a clean R session:\n## see if parameters and performance support your model type\nlibrary(parameters)\nlibrary(performance)\nmodel_parameters(model)\nmodel_performance(model)\n\n## see if broom supports your model type\nlibrary(broom)\ntidy(model)\nglance(model)\n\n## see if broom.mixed supports your model type\nlibrary(broom.mixed)\ntidy(model)\nglance(model)\nIf none of these options work, you can create your own tidy and glance methods, as described in the Adding new models section.\nIf one of the extractor functions does not work well or takes too long to process, you can define a new “custom” model class and choose your own extractors, as described in the Adding new models section.\n\n\n\nThe modelsummary function, by itself, is not slow: it should only take a couple seconds to produce a table in any output format. However, sometimes it can be computationally expensive (and long) to extract estimates and to compute goodness-of-fit statistics for your model.\nThe main options to speed up modelsummary are:\n\nSet gof_map=NA to avoid computing expensive goodness-of-fit statistics.\nUse the easystats extractor functions and the metrics argument to avoid computing expensive statistics (see below for an example).\nUse parallel computation if you are summarizing multiple models. 
See the “Parallel computation” section in the ?modelsummary documentation.\n\nTo diagnose the slowdown and find the bottleneck, you can try to benchmark the various extractor functions:\n\nlibrary(tictoc)\n\ndata(trade)\nmod <- lm(mpg ~ hp + drat, mtcars)\n\ntic(\"tidy\")\nx <- broom::tidy(mod)\ntoc()\n\ntidy: 0.003 sec elapsed\n\ntic(\"glance\")\nx <- broom::glance(mod)\ntoc()\n\nglance: 0.003 sec elapsed\n\ntic(\"parameters\")\nx <- parameters::parameters(mod)\ntoc()\n\nparameters: 0.02 sec elapsed\n\ntic(\"performance\")\nx <- performance::performance(mod)\ntoc()\n\nperformance: 0.011 sec elapsed\n\n\nIn my experience, the main bottleneck tends to be computing goodness-of-fit statistics. The performance extractor allows users to specify a metrics argument to select a subset of GOF statistics to include. Using this can speed things up considerably.\nWe call modelsummary with the metrics argument:\n\nmodelsummary(mod, metrics = \"rmse\")\n\n\n\n\n\n \n \n \n (1)\n \n \n \n (Intercept)\n10.790\n \n(5.078)\n hp\n-0.052\n \n(0.009)\n drat\n4.698\n \n(1.192)\n Num.Obs.\n32\n R2\n0.741\n R2 Adj.\n0.723\n AIC\n169.5\n BIC\n175.4\n Log.Lik.\n-80.752\n F\n41.522\n \n \n \n\n\n\n\n\n\n\nSometimes, users want to include raw LaTeX commands in their tables, such as coefficient names including math mode: Apple $\\times$ Orange. The result of these attempts is often a weird string such as: \\$\\textbackslash{}times\\$ instead of proper LaTeX-rendered characters.\nThe source of the problem is that kableExtra, the default table-making package in modelsummary, automatically escapes special characters to make sure that your tables compile properly in LaTeX. To avoid this, we need to pass escape=FALSE to modelsummary:\n\nmodelsummary(mod, escape = FALSE)\n\n\n\n\nMany bayesian models are supported out-of-the-box, including those produced by the rstanarm and brms packages. The statistics available for bayesian models are slightly different than those available for most frequentist models. 
Users can call get_estimates to see what is available:\n\nlibrary(rstanarm)\n\nThis is rstanarm version 2.32.1\n\n\n- See https://mc-stan.org/rstanarm/articles/priors for changes to default priors!\n\n\n- Default priors may change, so it's safest to specify priors, even if equivalent to the defaults.\n\n\n- For execution on a local, multicore CPU with excess RAM we recommend calling\n\n\n options(mc.cores = parallel::detectCores())\n\n\n\nAttaching package: 'rstanarm'\n\n\nThe following object is masked from 'package:fixest':\n\n se\n\nmod <- stan_glm(am ~ hp + drat, data = mtcars)\n\n\nget_estimates(mod)\n\n term estimate mad conf.level conf.low conf.high\n1 (Intercept) -2.230796826 0.58680702 0.95 -3.40385688 -1.055676364\n2 hp 0.000704079 0.00103711 0.95 -0.00158824 0.002788848\n3 drat 0.705052544 0.13876995 0.95 0.43515855 0.984934926\n prior.distribution prior.location prior.scale group std.error statistic\n1 normal 0.40625 1.24747729 NA NA\n2 normal 0.00000 0.01819465 NA NA\n3 normal 0.00000 2.33313429 NA NA\n p.value\n1 NA\n2 NA\n3 NA\n\n\nThis shows that the std.error column is empty (NA), but that there is a mad statistic (median absolute deviation). So we can do:\n\nmodelsummary(mod, statistic = \"mad\")\n\nWarning: \n`modelsummary` uses the `performance` package to extract goodness-of-fit\nstatistics from models of this class. You can specify the statistics you wish\nto compute by supplying a `metrics` argument to `modelsummary`, which will then\npush it forward to `performance`. Acceptable values are: \"all\", \"common\",\n\"none\", or a character vector of metrics names. For example: `modelsummary(mod,\nmetrics = c(\"RMSE\", \"R2\")` Note that some metrics are computationally\nexpensive. 
See `?performance::performance` for details.\n This warning appears once per session.\n\n\n\n\n\n\n \n \n \n (1)\n \n \n \n (Intercept)\n-2.231\n \n(0.587)\n hp\n0.001\n \n(0.001)\n drat\n0.705\n \n(0.139)\n Num.Obs.\n32\n R2\n0.498\n R2 Adj.\n0.421\n Log.Lik.\n-12.037\n ELPD\n-15.3\n ELPD s.e.\n3.2\n LOOIC\n30.5\n LOOIC s.e.\n6.4\n WAIC\n30.1\n RMSE\n0.34\n \n \n \n\n\n\n\nAs noted in the modelsummary() documentation, model results are extracted using the parameters package. Users can pass additional arguments to modelsummary(), which will then push forward those arguments to the parameters::parameters function to change the results. For example, the parameters documentation for bayesian models shows that there is a centrality argument, which allows users to report the mean and standard deviation of the posterior distribution, instead of the median and MAD:\n\nget_estimates(mod, centrality = \"mean\")\n\n term estimate std.dev conf.level conf.low conf.high\n1 (Intercept) -2.2308585627 0.592540388 0.95 -3.40385688 -1.055676364\n2 hp 0.0006978276 0.001105456 0.95 -0.00158824 0.002788848\n3 drat 0.7044091550 0.139055255 0.95 0.43515855 0.984934926\n prior.distribution prior.location prior.scale group std.error statistic\n1 normal 0.40625 1.24747729 NA NA\n2 normal 0.00000 0.01819465 NA NA\n3 normal 0.00000 2.33313429 NA NA\n p.value\n1 NA\n2 NA\n3 NA\n\nmodelsummary(mod, statistic = \"std.dev\", centrality = \"mean\")\n\n\n\n\n\n \n \n \n (1)\n \n \n \n (Intercept)\n-2.231\n \n(0.593)\n hp\n0.001\n \n(0.001)\n drat\n0.704\n \n(0.139)\n Num.Obs.\n32\n R2\n0.498\n R2 Adj.\n0.421\n Log.Lik.\n-12.037\n ELPD\n-15.3\n ELPD s.e.\n3.2\n LOOIC\n30.5\n LOOIC s.e.\n6.4\n WAIC\n30.1\n RMSE\n0.34\n \n \n \n\n\n\n\nWe can also get additional test statistics using the test argument:\n\nget_estimates(mod, test = c(\"pd\", \"rope\"))\n\n term estimate mad conf.level conf.low conf.high\n1 (Intercept) -2.230796826 0.58680702 0.95 -3.40385688 -1.055676364\n2 hp 0.000704079 0.00103711 
0.95 -0.00158824 0.002788848\n3 drat 0.705052544 0.13876995 0.95 0.43515855 0.984934926\n pd rope.percentage prior.distribution prior.location prior.scale group\n1 1.00000 0 normal 0.40625 1.24747729 \n2 0.75175 1 normal 0.00000 0.01819465 \n3 1.00000 0 normal 0.00000 2.33313429 \n std.error statistic p.value\n1 NA NA NA\n2 NA NA NA\n3 NA NA NA" + "text": "modelsummary includes a powerful set of utilities to customize the information displayed in your model summary tables. You can easily rename, reorder, subset or omit parameter estimates; choose the set of goodness-of-fit statistics to display; display various “robust” standard errors or confidence intervals; add titles, footnotes, or source notes; insert stars or custom characters to indicate levels of statistical significance; or add rows with supplemental information about your models.\n\nlibrary(modelsummary)\nlibrary(kableExtra)\nlibrary(gt)\n\nurl <- 'https://vincentarelbundock.github.io/Rdatasets/csv/HistData/Guerry.csv'\ndat <- read.csv(url)\n\nmodels <- list(\n \"OLS 1\" = lm(Donations ~ Literacy + Clergy, data = dat),\n \"Poisson 1\" = glm(Donations ~ Literacy + Commerce, family = poisson, data = dat),\n \"OLS 2\" = lm(Crime_pers ~ Literacy + Clergy, data = dat),\n \"Poisson 2\" = glm(Crime_pers ~ Literacy + Commerce, family = poisson, data = dat),\n \"OLS 3\" = lm(Crime_prop ~ Literacy + Clergy, data = dat)\n)\n\nmodelsummary(models)\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n (Intercept)\n7948.667\n8.241\n16259.384\n9.876\n11243.544\n \n(2078.276)\n(0.006)\n(2611.140)\n(0.003)\n(1011.240)\n Literacy\n-39.121\n0.003\n3.680\n0.000\n-68.507\n \n(37.052)\n(0.000)\n(46.552)\n(0.000)\n(18.029)\n Clergy\n15.257\n\n77.148\n\n-16.376\n \n(25.735)\n\n(32.334)\n\n(12.522)\n Commerce\n\n0.011\n\n0.001\n\n \n\n(0.000)\n\n(0.000)\n\n Num.Obs.\n86\n86\n86\n86\n86\n R2\n0.020\n\n0.065\n\n0.152\n R2 Adj.\n-0.003\n\n0.043\n\n0.132\n AIC\n1740.8\n274160.8\n1780.0\n257564.4\n1616.9\n 
BIC\n1750.6\n274168.2\n1789.9\n257571.7\n1626.7\n Log.Lik.\n-866.392\n-137077.401\n-886.021\n-128779.186\n-804.441\n F\n0.866\n18294.559\n2.903\n279.956\n7.441\n RMSE\n5740.99\n5491.61\n7212.97\n7451.70\n2793.43\n \n \n \n\n\n\n\n\n\nThe output argument determines the type of object returned by modelsummary and/or the file where this table should be written.\nIf you want to save a table directly to file, you can type:\n\nmodelsummary(models, output = \"table.docx\")\nmodelsummary(models, output = \"table.html\")\nmodelsummary(models, output = \"table.tex\")\nmodelsummary(models, output = \"table.md\")\nmodelsummary(models, output = \"table.txt\")\nmodelsummary(models, output = \"table.png\")\n\nIf you want a raw HTML, LaTeX, or Markdown table, you can type:\n\nmodelsummary(models, output = \"html\")\nmodelsummary(models, output = \"latex\")\nmodelsummary(models, output = \"markdown\")\n\nIf you want to customize the appearance of your table using external tools like gt, kableExtra, flextable, or huxtable, you can type:\n\nmodelsummary(models, output = \"gt\")\nmodelsummary(models, output = \"kableExtra\")\nmodelsummary(models, output = \"flextable\")\nmodelsummary(models, output = \"huxtable\")\n\nWarning: When a file name is supplied to the output argument, the table is written immediately to file. If you want to customize your table by post-processing it with an external package, you need to choose a different output format and saving mechanism. Unfortunately, the approach differs from package to package:\n\ngt: set output=\"gt\", post-process your table, and use the gt::gtsave function.\nkableExtra: set output to your destination format (e.g., “latex”, “html”, “markdown”), post-process your table, and use the kableExtra::save_kable function.\n\n\n\nThe fmt argument defines how numeric values are rounded and presented in the table. 
This argument accepts three types of input:\n\nInteger: Number of decimal digits\nUser-supplied function: Accepts a numeric vector and returns a character vector of the same length.\nmodelsummary function: fmt_decimal(), fmt_significant(), fmt_sprintf(), fmt_term(), fmt_statistic, fmt_identity()\n\nExamples:\n\nmod <- lm(mpg ~ hp + drat + qsec, data = mtcars)\n\n## decimal digits\nmodelsummary(mod, fmt = 3)\n\n## user-supplied function\nmodelsummary(mod, fmt = function(x) round(x, 2))\n\n## p values with different number of digits\nmodelsummary(mod, fmt = fmt_decimal(1, 3), statistic = c(\"std.error\", \"p.value\"))\n\n## significant digits\nmodelsummary(mod, fmt = fmt_significant(3))\n\n## sprintf(): decimal digits\nmodelsummary(mod, fmt = fmt_sprintf(\"%.5f\"))\n\n## sprintf(): scientific notation \nmodelsummary(mod, fmt = fmt_sprintf(\"%.5e\"))\n\n## statistic-specific formatting\nmodelsummary(mod, fmt = fmt_statistic(estimate = 4, conf.int = 1), statistic = \"conf.int\")\n\n## term-specific formatting\nmodelsummary(mod, fmt = fmt_term(hp = 4, drat = 1, default = fmt_significant(2)))\n\nmodelsummary(mod, fmt = NULL)\n\nCustom formatting function with big mark commas:\n\nmodf <- lm(I(mpg * 100) ~ hp, mtcars)\nf <- function(x) formatC(x, digits = 2, big.mark = \",\", format = \"f\")\nmodelsummary(modf, fmt = f, gof_map = NA)\n\n\n\n\n\n \n \n \n (1)\n \n \n \n (Intercept)\n3,009.89\n \n(163.39)\n hp\n-6.82\n \n(1.01)\n \n \n \n\n\n\n\nIn many languages the comma is used as a decimal mark instead of the period. modelsummary respects the global R OutDec option, so you can simply execute this command and your tables will be adjusted automatically:\n\noptions(OutDec=\",\")\n\n\n\n\nBy default, modelsummary prints each coefficient estimate on its own row. You can customize this by changing the estimate argument. 
For example, this would produce a table of p values instead of coefficient estimates:\n\nmodelsummary(models, estimate = \"p.value\")\n\nYou can also use glue strings, with curly braces specifying the statistics you want. For example, this displays the estimate next to a confidence interval:\n\nmodelsummary(\n models,\n fmt = 1,\n estimate = \"{estimate} [{conf.low}, {conf.high}]\",\n statistic = NULL,\n coef_omit = \"Intercept\")\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n Literacy\n-39.1 [-112.8, 34.6]\n0.0 [0.0, 0.0]\n3.7 [-88.9, 96.3]\n0.0 [0.0, 0.0]\n-68.5 [-104.4, -32.6]\n Clergy\n15.3 [-35.9, 66.4]\n\n77.1 [12.8, 141.5]\n\n-16.4 [-41.3, 8.5]\n Commerce\n\n0.0 [0.0, 0.0]\n\n0.0 [0.0, 0.0]\n\n Num.Obs.\n86\n86\n86\n86\n86\n R2\n0.020\n\n0.065\n\n0.152\n R2 Adj.\n-0.003\n\n0.043\n\n0.132\n AIC\n1740.8\n274160.8\n1780.0\n257564.4\n1616.9\n BIC\n1750.6\n274168.2\n1789.9\n257571.7\n1626.7\n Log.Lik.\n-866.392\n-137077.401\n-886.021\n-128779.186\n-804.441\n F\n0.866\n18294.559\n2.903\n279.956\n7.441\n RMSE\n5740.99\n5491.61\n7212.97\n7451.70\n2793.43\n \n \n \n\n\n\n\nGlue strings can also apply R functions to estimates. 
However, since modelsummary rounds numbers and transforms them to character by default, we must set fmt = NULL:\n\nm <- glm(am ~ mpg, data = mtcars, family = binomial)\nmodelsummary(\n m,\n fmt = NULL,\n estimate = \"{round(exp(estimate), 5)}\",\n statistic = \"{round(exp(estimate) * std.error, 3)}\")\n\n\n\n\n\n \n \n \n (1)\n \n \n \n (Intercept)\n0.00136\n \n0.003\n mpg\n1.35938\n \n0.156\n Num.Obs.\n32\n AIC\n33.7\n BIC\n36.6\n Log.Lik.\n-14.838\n F\n7.148\n RMSE\n0.39\n \n \n \n\n\n\n\nYou can also use different estimates for different models by using a vector of strings:\n\nmodelsummary(\n models,\n fmt = 1,\n estimate = c(\"estimate\",\n \"{estimate}{stars}\",\n \"{estimate} ({std.error})\",\n \"{estimate} ({std.error}){stars}\",\n \"{estimate} [{conf.low}, {conf.high}]\"),\n statistic = NULL,\n coef_omit = \"Intercept\")\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n Literacy\n-39.1\n0.0***\n3.7 (46.6)\n0.0 (0.0)***\n-68.5 [-104.4, -32.6]\n Clergy\n15.3\n\n77.1 (32.3)\n\n-16.4 [-41.3, 8.5]\n Commerce\n\n0.0***\n\n0.0 (0.0)***\n\n Num.Obs.\n86\n86\n86\n86\n86\n R2\n0.020\n\n0.065\n\n0.152\n R2 Adj.\n-0.003\n\n0.043\n\n0.132\n AIC\n1740.8\n274160.8\n1780.0\n257564.4\n1616.9\n BIC\n1750.6\n274168.2\n1789.9\n257571.7\n1626.7\n Log.Lik.\n-866.392\n-137077.401\n-886.021\n-128779.186\n-804.441\n F\n0.866\n18294.559\n2.903\n279.956\n7.441\n RMSE\n5740.99\n5491.61\n7212.97\n7451.70\n2793.43\n \n \n \n\n\n\n\n\n\n\nBy default, modelsummary prints the coefficient’s standard error in parentheses below the corresponding estimate. The value of this uncertainty statistic is determined by the statistic argument. The statistic argument accepts any of the column names produced by get_estimates(model). 
For example:\n\nmodelsummary(models, statistic = 'std.error')\nmodelsummary(models, statistic = 'p.value')\nmodelsummary(models, statistic = 'statistic')\n\nYou can also display confidence intervals in brackets by setting statistic=\"conf.int\":\n\nmodelsummary(models,\n fmt = 1,\n statistic = 'conf.int', \n conf_level = .99)\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n (Intercept)\n7948.7\n8.2\n16259.4\n9.9\n11243.5\n \n[2469.6, 13427.8]\n[8.2, 8.3]\n[9375.5, 23143.3]\n[9.9, 9.9]\n[8577.5, 13909.5]\n Literacy\n-39.1\n0.0\n3.7\n0.0\n-68.5\n \n[-136.8, 58.6]\n[0.0, 0.0]\n[-119.0, 126.4]\n[0.0, 0.0]\n[-116.0, -21.0]\n Clergy\n15.3\n\n77.1\n\n-16.4\n \n[-52.6, 83.1]\n\n[-8.1, 162.4]\n\n[-49.4, 16.6]\n Commerce\n\n0.0\n\n0.0\n\n \n\n[0.0, 0.0]\n\n[0.0, 0.0]\n\n Num.Obs.\n86\n86\n86\n86\n86\n R2\n0.020\n\n0.065\n\n0.152\n R2 Adj.\n-0.003\n\n0.043\n\n0.132\n AIC\n1740.8\n274160.8\n1780.0\n257564.4\n1616.9\n BIC\n1750.6\n274168.2\n1789.9\n257571.7\n1626.7\n Log.Lik.\n-866.392\n-137077.401\n-886.021\n-128779.186\n-804.441\n F\n0.866\n18294.559\n2.903\n279.956\n7.441\n RMSE\n5740.99\n5491.61\n7212.97\n7451.70\n2793.43\n \n \n \n\n\n\n\nAlternatively, you can supply a glue string to get more complicated results:\n\nmodelsummary(models,\n statistic = \"{std.error} ({p.value})\")\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n (Intercept)\n7948.667\n8.241\n16259.384\n9.876\n11243.544\n \n2078.276 (<0.001)\n0.006 (<0.001)\n2611.140 (<0.001)\n0.003 (<0.001)\n1011.240 (<0.001)\n Literacy\n-39.121\n0.003\n3.680\n0.000\n-68.507\n \n37.052 (0.294)\n0.000 (<0.001)\n46.552 (0.937)\n0.000 (<0.001)\n18.029 (<0.001)\n Clergy\n15.257\n\n77.148\n\n-16.376\n \n25.735 (0.555)\n\n32.334 (0.019)\n\n12.522 (0.195)\n Commerce\n\n0.011\n\n0.001\n\n \n\n0.000 (<0.001)\n\n0.000 (<0.001)\n\n Num.Obs.\n86\n86\n86\n86\n86\n R2\n0.020\n\n0.065\n\n0.152\n R2 Adj.\n-0.003\n\n0.043\n\n0.132\n AIC\n1740.8\n274160.8\n1780.0\n257564.4\n1616.9\n 
BIC\n1750.6\n274168.2\n1789.9\n257571.7\n1626.7\n Log.Lik.\n-866.392\n-137077.401\n-886.021\n-128779.186\n-804.441\n F\n0.866\n18294.559\n2.903\n279.956\n7.441\n RMSE\n5740.99\n5491.61\n7212.97\n7451.70\n2793.43\n \n \n \n\n\n\n\nYou can also display several different uncertainty estimates below the coefficient estimates by using a vector. For example,\n\nmodelsummary(models, gof_omit = \".*\",\n statistic = c(\"conf.int\",\n \"s.e. = {std.error}\", \n \"t = {statistic}\",\n \"p = {p.value}\"))\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n (Intercept)\n7948.667\n8.241\n16259.384\n9.876\n11243.544\n \n[3815.060, 12082.275]\n[8.230, 8.252]\n[11065.933, 21452.836]\n[9.869, 9.883]\n[9232.228, 13254.860]\n \ns.e. = 2078.276\ns.e. = 0.006\ns.e. = 2611.140\ns.e. = 0.003\ns.e. = 1011.240\n \nt = 3.825\nt = 1408.907\nt = 6.227\nt = 2864.987\nt = 11.119\n \np = <0.001\np = <0.001\np = <0.001\np = <0.001\np = <0.001\n Literacy\n-39.121\n0.003\n3.680\n0.000\n-68.507\n \n[-112.816, 34.574]\n[0.003, 0.003]\n[-88.910, 96.270]\n[0.000, 0.000]\n[-104.365, -32.648]\n \ns.e. = 37.052\ns.e. = 0.000\ns.e. = 46.552\ns.e. = 0.000\ns.e. = 18.029\n \nt = -1.056\nt = 33.996\nt = 0.079\nt = -4.989\nt = -3.800\n \np = 0.294\np = <0.001\np = 0.937\np = <0.001\np = <0.001\n Clergy\n15.257\n\n77.148\n\n-16.376\n \n[-35.930, 66.443]\n\n[12.837, 141.459]\n\n[-41.282, 8.530]\n \ns.e. = 25.735\n\ns.e. = 32.334\n\ns.e. = 12.522\n \nt = 0.593\n\nt = 2.386\n\nt = -1.308\n \np = 0.555\n\np = 0.019\n\np = 0.195\n Commerce\n\n0.011\n\n0.001\n\n \n\n[0.011, 0.011]\n\n[0.001, 0.001]\n\n \n\ns.e. = 0.000\n\ns.e. = 0.000\n\n \n\nt = 174.542\n\nt = 15.927\n\n \n\np = <0.001\n\np = <0.001\n\n \n \n \n\n\n\n\nSetting statistic=NULL omits all statistics. 
This can often be useful if, for example, you want to display confidence intervals next to coefficients:\n\nmodelsummary(models, gof_omit = \".*\",\n estimate = \"{estimate} [{conf.low}, {conf.high}]\",\n statistic = NULL)\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n (Intercept)\n7948.667 [3815.060, 12082.275]\n8.241 [8.230, 8.252]\n16259.384 [11065.933, 21452.836]\n9.876 [9.869, 9.883]\n11243.544 [9232.228, 13254.860]\n Literacy\n-39.121 [-112.816, 34.574]\n0.003 [0.003, 0.003]\n3.680 [-88.910, 96.270]\n0.000 [0.000, 0.000]\n-68.507 [-104.365, -32.648]\n Clergy\n15.257 [-35.930, 66.443]\n\n77.148 [12.837, 141.459]\n\n-16.376 [-41.282, 8.530]\n Commerce\n\n0.011 [0.011, 0.011]\n\n0.001 [0.001, 0.001]\n\n \n \n \n\n\n\n\n\n\n\nYou can use clustered or robust uncertainty estimates by modifying the vcov argument. This argument accepts 5 different types of input. You can use a string or a vector of strings:\nmodelsummary(models, vcov = \"robust\")\nmodelsummary(models, vcov = c(\"classical\", \"robust\", \"bootstrap\", \"stata\", \"HC4\"))\nThese variance-covariance matrices are calculated using the sandwich package. You can pass arguments to the sandwich functions directly from the modelsummary function. 
For instance, to change the number of bootstrap replicates and to specify a clustering variable we could call:\nmodelsummary(mod, vcov = \"bootstrap\", R = 1000, cluster = \"country\")\nYou can use a one-sided formula or list of one-sided formulas to use clustered standard errors:\nmodelsummary(models, vcov = ~Region)\nYou can specify a function that produces variance-covariance matrices:\nlibrary(sandwich)\nmodelsummary(models, vcov = vcovHC)\nYou can supply a list of functions of the same length as your model list:\nmodelsummary(models, \n vcov = list(vcov, vcovHC, vcovHAC, vcovHC, vcov))\nYou can supply a list of named variance-covariance matrices:\nvcov_matrices <- lapply(models, vcovHC)\nmodelsummary(models, vcov = vcov_matrices)\nYou can supply a list of named vectors:\nvc <- list(\n `OLS 1` = c(`(Intercept)` = 2, Literacy = 3, Clergy = 4), \n `Poisson 1` = c(`(Intercept)` = 3, Literacy = -5, Commerce = 3),\n `OLS 2` = c(`(Intercept)` = 7, Literacy = -6, Clergy = 9), \n `Poisson 2` = c(`(Intercept)` = 4, Literacy = -7, Commerce = -9),\n `OLS 3` = c(`(Intercept)` = 1, Literacy = -5, Clergy = -2))\nmodelsummary(models, vcov = vc)\n\n\n\nSome people like to add “stars” to their model summary tables to mark statistical significance. The stars argument can take three types of input:\n\nNULL omits any stars or special marks (default)\nTRUE uses these default values: + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001\nNamed numeric vector for custom stars.\n\nmodelsummary(models)\nmodelsummary(models, stars = TRUE) \nmodelsummary(models, stars = c('+' = .1, '&' = .01)) \nWhenever stars is not NULL, modelsummary adds a note at the bottom of the table automatically. If you would like to add stars but not include a note at the bottom of the table, you can define the display of your estimate manually using a glue string, as described in the estimate argument section of the documentation. 
Whenever the {stars} string appears in the estimate or statistic arguments, modelsummary will assume that you want fine-grained control over your table, and will not include a note about stars.\n\nmodelsummary(models,\n estimate = \"{estimate}{stars}\",\n gof_omit = \".*\")\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n (Intercept)\n7948.667***\n8.241***\n16259.384***\n9.876***\n11243.544***\n \n(2078.276)\n(0.006)\n(2611.140)\n(0.003)\n(1011.240)\n Literacy\n-39.121\n0.003***\n3.680\n0.000***\n-68.507***\n \n(37.052)\n(0.000)\n(46.552)\n(0.000)\n(18.029)\n Clergy\n15.257\n\n77.148*\n\n-16.376\n \n(25.735)\n\n(32.334)\n\n(12.522)\n Commerce\n\n0.011***\n\n0.001***\n\n \n\n(0.000)\n\n(0.000)\n\n \n \n \n\n\n\n\nIf you want to create your own stars description, you can add custom notes with the notes argument.\n\n\n\nAn alternative mechanism to subset coefficients is to use the coef_omit argument, which accepts a vector of integers or a regular expression. For example, we can omit the first and second coefficients as follows:\n\nmodelsummary(models, coef_omit = 1:2, gof_map = NA)\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n Clergy\n15.257\n\n77.148\n\n-16.376\n \n(25.735)\n\n(32.334)\n\n(12.522)\n Commerce\n\n0.011\n\n0.001\n\n \n\n(0.000)\n\n(0.000)\n\n \n \n \n\n\n\n\nNegative indices determine which coefficients to keep:\n\nmodelsummary(models, coef_omit = c(-1, -2), gof_map = NA)\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n (Intercept)\n7948.667\n8.241\n16259.384\n9.876\n11243.544\n \n(2078.276)\n(0.006)\n(2611.140)\n(0.003)\n(1011.240)\n Literacy\n-39.121\n0.003\n3.680\n0.000\n-68.507\n \n(37.052)\n(0.000)\n(46.552)\n(0.000)\n(18.029)\n \n \n \n\n\n\n\nWhen coef_omit is a string, it is fed to grepl(x,perl=TRUE) to detect the variable names which should be excluded from the table.\n\nmodelsummary(models, coef_omit = \"Intercept|.*merce\", gof_map = NA)\n\n\n\n\n\n \n 
\n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n Literacy\n-39.121\n0.003\n3.680\n0.000\n-68.507\n \n(37.052)\n(0.000)\n(46.552)\n(0.000)\n(18.029)\n Clergy\n15.257\n\n77.148\n\n-16.376\n \n(25.735)\n\n(32.334)\n\n(12.522)\n \n \n \n\n\n\n\nSince coef_omit accepts regexes, you can do interesting things with it, such as specifying the list of variables that modelsummary should keep instead of omit. To do this, we use a negative lookahead. To keep only the coefficients starting with “Lit”, we call:\n\nmodelsummary(models, coef_omit = \"^(?!Lit)\", gof_map = NA)\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n Literacy\n-39.121\n0.003\n3.680\n0.000\n-68.507\n \n(37.052)\n(0.000)\n(46.552)\n(0.000)\n(18.029)\n \n \n \n\n\n\n\nTo keep all coefficients matching the “y” substring:\n\nmodelsummary(models, coef_omit = \"^(?!.*y)\", gof_map = NA)\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n Literacy\n-39.121\n0.003\n3.680\n0.000\n-68.507\n \n(37.052)\n(0.000)\n(46.552)\n(0.000)\n(18.029)\n Clergy\n15.257\n\n77.148\n\n-16.376\n \n(25.735)\n\n(32.334)\n\n(12.522)\n \n \n \n\n\n\n\nTo keep all coefficients matching one of two substrings:\n\nmodelsummary(models, coef_omit = \"^(?!.*tercept|.*y)\", gof_map = NA)\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n (Intercept)\n7948.667\n8.241\n16259.384\n9.876\n11243.544\n \n(2078.276)\n(0.006)\n(2611.140)\n(0.003)\n(1011.240)\n Literacy\n-39.121\n0.003\n3.680\n0.000\n-68.507\n \n(37.052)\n(0.000)\n(46.552)\n(0.000)\n(18.029)\n Clergy\n15.257\n\n77.148\n\n-16.376\n \n(25.735)\n\n(32.334)\n\n(12.522)\n \n \n \n\n\n\n\n\n\n\nmodelsummary offers powerful and innovative mechanisms to rename, reorder, and subset coefficients and goodness-of-fit statistics.\nYou can rename coefficients using the coef_rename argument. 
For example, if you have two models with different explanatory variables, but you want both variables to have the same name and appear on the same row, you can do:\n\nx <- list(lm(hp ~ drat, mtcars),\n lm(hp ~ vs, mtcars))\n\nmodelsummary(x, coef_rename = c(\"drat\" = \"Explanator\", \"vs\" = \"Explanator\"))\n\n\n\n\n\n \n \n \n (1)\n (2)\n \n \n \n (Intercept)\n353.653\n189.722\n \n(76.049)\n(11.347)\n Explanator\n-57.545\n-98.365\n \n(20.922)\n(17.155)\n Num.Obs.\n32\n32\n R2\n0.201\n0.523\n R2 Adj.\n0.175\n0.507\n AIC\n359.2\n342.7\n BIC\n363.6\n347.1\n Log.Lik.\n-176.588\n-168.347\n F\n7.565\n32.876\n RMSE\n60.31\n46.61\n \n \n \n\n\n\n\nIf you provide a named character vector to coef_rename, only exact matches of the complete original term name will be replaced.\nFor complex modifications, you can feed a function which returns a named vector to the coef_rename argument. For example, modelsummary ships with a function called coef_rename, which executes some common renaming tasks automatically. This example also uses the dvnames function to extract the name of the dependent variable in each model:\n\nx <- list(\n lm(mpg ~ factor(cyl) + drat + disp, data = mtcars),\n lm(hp ~ factor(cyl) + drat + disp, data = mtcars)\n)\n\nmodelsummary(dvnames(x), coef_rename = coef_rename)\n\n\n\n\n\n \n \n \n mpg\n hp\n \n \n \n (Intercept)\n26.158\n-86.788\n \n(6.537)\n(79.395)\n 6\n-4.547\n46.485\n \n(1.731)\n(21.027)\n 8\n-4.580\n121.892\n \n(2.952)\n(35.853)\n Drat\n0.783\n37.815\n \n(1.478)\n(17.952)\n Disp\n-0.026\n0.147\n \n(0.011)\n(0.137)\n Num.Obs.\n32\n32\n R2\n0.786\n0.756\n R2 Adj.\n0.754\n0.720\n AIC\n167.4\n327.2\n BIC\n176.2\n336.0\n Log.Lik.\n-77.719\n-157.623\n F\n24.774\n20.903\n RMSE\n2.74\n33.34\n \n \n \n\n\n\n\nOf course, you can also define your own custom functions. 
For instance, to rename a model with interacted variables (e.g., “drat:mpg”), you could define a custom rename_explanator function:\n\ny <- list(\n lm(hp ~ drat / mpg, mtcars),\n lm(hp ~ vs / mpg, mtcars)\n)\n\nrename_explanator <- function(old_names) {\n new_names <- gsub(\"drat|vs\", \"Explanator\", old_names)\n setNames(new_names, old_names)\n}\n\nmodelsummary(y, coef_rename = rename_explanator)\n\n\n\n\n\n \n \n \n (1)\n (2)\n \n \n \n (Intercept)\n91.206\n189.722\n \n(72.344)\n(11.205)\n Explanator\n68.331\n-18.316\n \n(27.390)\n(62.531)\n Explanator:mpg\n-2.558\n-3.260\n \n(0.467)\n(2.451)\n Num.Obs.\n32\n32\n R2\n0.608\n0.550\n R2 Adj.\n0.581\n0.519\n AIC\n338.4\n342.8\n BIC\n344.3\n348.7\n Log.Lik.\n-165.218\n-167.399\n F\n22.454\n17.743\n RMSE\n42.27\n45.25\n \n \n \n\n\n\n\nBeware of inadvertently replacing parts of other variable names! Making your regex pattern as specific as possible (e.g., by adding word boundaries) is likely a good idea. The custom rename function is also a good place to re-introduce the replacement of “:” with “×” if you are dealing with interaction terms – modelsummary makes this replacement for you only when the coef_rename argument is not specified.\nAnother possibility is to assign variable labels to attributes in the data used to fit the model. 
Then, we can automatically rename them:\n\ndatlab <- mtcars\ndatlab$cyl <- factor(datlab$cyl)\nattr(datlab$cyl, \"label\") <- \"Cylinders\"\nattr(datlab$am, \"label\") <- \"Transmission\"\nmodlab <- lm(mpg ~ cyl + am, data = datlab)\nmodelsummary(modlab, coef_rename = TRUE)\n\n\n\n\n\n \n \n \n (1)\n \n \n \n (Intercept)\n24.802\n \n(1.323)\n Cylinders [6]\n-6.156\n \n(1.536)\n Cylinders [8]\n-10.068\n \n(1.452)\n Transmission\n2.560\n \n(1.298)\n Num.Obs.\n32\n R2\n0.765\n R2 Adj.\n0.740\n AIC\n168.4\n BIC\n175.7\n Log.Lik.\n-79.199\n F\n30.402\n RMSE\n2.87\n \n \n \n\n\n\n\n\n\n\nThe coef_map argument is a named vector which allows users to rename, reorder, and subset coefficient estimates. Values of this vector correspond to the “clean” variable name. Names of this vector correspond to the “raw” variable name. The table will be sorted in the order in which terms are presented in coef_map. Coefficients which are not included in coef_map will be excluded from the table.\n\ncm <- c('Literacy' = 'Literacy (%)',\n 'Commerce' = 'Patents per capita',\n '(Intercept)' = 'Constant')\nmodelsummary(models, coef_map = cm)\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n Literacy (%)\n-39.121\n0.003\n3.680\n0.000\n-68.507\n \n(37.052)\n(0.000)\n(46.552)\n(0.000)\n(18.029)\n Patents per capita\n\n0.011\n\n0.001\n\n \n\n(0.000)\n\n(0.000)\n\n Constant\n7948.667\n8.241\n16259.384\n9.876\n11243.544\n \n(2078.276)\n(0.006)\n(2611.140)\n(0.003)\n(1011.240)\n Num.Obs.\n86\n86\n86\n86\n86\n R2\n0.020\n\n0.065\n\n0.152\n R2 Adj.\n-0.003\n\n0.043\n\n0.132\n AIC\n1740.8\n274160.8\n1780.0\n257564.4\n1616.9\n BIC\n1750.6\n274168.2\n1789.9\n257571.7\n1626.7\n Log.Lik.\n-866.392\n-137077.401\n-886.021\n-128779.186\n-804.441\n F\n0.866\n18294.559\n2.903\n279.956\n7.441\n RMSE\n5740.99\n5491.61\n7212.97\n7451.70\n2793.43\n \n \n \n\n\n\n\n\n\n\ngof_omit is a regular expression which will be fed to grepl(x,perl=TRUE) to detect the names of the statistics which 
should be excluded from the table.\nmodelsummary(models, gof_omit = 'DF|Deviance|R2|AIC|BIC')\n\n\n\nThe gof_map argument can be used to rename, re-order, subset, and format the statistics displayed in the bottom section of the table (“goodness-of-fit”).\nThe first type of values allowed is a character vector with elements equal to column names in the data.frame produced by get_gof(model):\n\nmodelsummary(models, gof_map = c(\"nobs\", \"r.squared\"))\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n (Intercept)\n7948.667\n8.241\n16259.384\n9.876\n11243.544\n \n(2078.276)\n(0.006)\n(2611.140)\n(0.003)\n(1011.240)\n Literacy\n-39.121\n0.003\n3.680\n0.000\n-68.507\n \n(37.052)\n(0.000)\n(46.552)\n(0.000)\n(18.029)\n Clergy\n15.257\n\n77.148\n\n-16.376\n \n(25.735)\n\n(32.334)\n\n(12.522)\n Commerce\n\n0.011\n\n0.001\n\n \n\n(0.000)\n\n(0.000)\n\n Num.Obs.\n86\n86\n86\n86\n86\n R2\n0.020\n\n0.065\n\n0.152\n \n \n \n\n\n\n\nA more powerful mechanism is to supply a data.frame (or tibble) through the gof_map argument. This data.frame must include 3 columns:\n\nraw: a string with the name of a column produced by get_gof(model).\nclean: a string with the “clean” name of the statistic you want to appear in your final table.\nfmt: a string which will be used to round/format the string in question (e.g., \"%.3f\"). This follows the same standards as the fmt argument in ?modelsummary.\n\nYou can see an example of a valid data frame by typing modelsummary::gof_map. This is the default data.frame that modelsummary uses to subset and reorder goodness-of-fit statistics. As you can see, omit == TRUE for quite a number of statistics. 
You can include them by setting omit to FALSE:\ngm <- modelsummary::gof_map\ngm$omit <- FALSE\nmodelsummary(models, gof_map = gm)\nThe goodness-of-fit statistics will be printed in the table in the same order as in the gof_map data.frame.\nYou can also supply gof_map as a list of lists with the same raw, clean, and fmt elements:\nf <- function(x) format(round(x, 3), big.mark=\",\")\ngm <- list(\n list(\"raw\" = \"nobs\", \"clean\" = \"N\", \"fmt\" = f),\n list(\"raw\" = \"AIC\", \"clean\" = \"aic\", \"fmt\" = f))\nmodelsummary(models, gof_map = gm)\nNotice the subtle difference between coef_map and gof_map. On the one hand, coef_map works as a “white list”: any coefficient not explicitly entered will be omitted from the table. On the other, gof_map works as a “black list”: statistics need to be explicitly marked for omission.\nAnother convenient way to build a gof_map argument is to use the tribble function from the tibble package. In this example, we insert special HTML code to display a superscript, so we use the escape=FALSE argument:\n\ngm <- tibble::tribble(\n ~raw, ~clean, ~fmt,\n \"nobs\", \"N\", 0,\n \"r.squared\", \"R<sup>2</sup>\", 2)\n\nmodelsummary(\n models,\n statistic = NULL,\n gof_map = gm,\n escape = FALSE)\n\n\n\n\n\n \n \n \n OLS 1\n Poisson 1\n OLS 2\n Poisson 2\n OLS 3\n \n \n \n (Intercept)\n7948.667\n8.241\n16259.384\n9.876\n11243.544\n Literacy\n-39.121\n0.003\n3.680\n0.000\n-68.507\n Clergy\n15.257\n\n77.148\n\n-16.376\n Commerce\n\n0.011\n\n0.001\n\n N\n86\n86\n86\n86\n86\n R<sup>2</sup>\n0.02\n\n0.07\n\n0.15\n \n \n \n\n\n\n\n\n\n\nThis section requires version 1.3.1 of modelsummary. 
If this version is not available on CRAN yet, you can install the development version by following the instructions on the website.\nThe shape argument accepts:\n\nA formula which determines the structure of the table, and can display “grouped” coefficients together (e.g., multivariate outcome or mixed-effects models).\nThe strings “rbind” or “rcollapse” to stack multiple tables on top of each other and present models in distinct “panels”.\n\n\n\nThe left side of the formula represents the rows and the right side represents the columns. The default formula is term + statistic ~ model:\n\nm <- list(\n lm(mpg ~ hp, data = mtcars),\n lm(mpg ~ hp + drat, data = mtcars))\n\nmodelsummary(m, shape = term + statistic ~ model, gof_map = NA)\n\n\n\n\n\n \n \n \n (1)\n (2)\n \n \n \n (Intercept)\n30.099\n10.790\n \n(1.634)\n(5.078)\n hp\n-0.068\n-0.052\n \n(0.010)\n(0.009)\n drat\n\n4.698\n \n\n(1.192)\n \n \n \n\n\n\n\nWe can display statistics horizontally with:\n\nmodelsummary(m,\n shape = term ~ model + statistic,\n statistic = \"conf.int\",\n gof_map = NA)\n\n\n\n\n\n \n \n \n \n (1) \n \n \n (2) \n \n \n \n Est.\n 2.5 %\n 97.5 %\n Est. \n 2.5 % \n 97.5 % \n \n \n \n (Intercept)\n30.099\n26.762\n33.436\n10.790\n0.405\n21.175\n hp\n-0.068\n-0.089\n-0.048\n-0.052\n-0.071\n-0.033\n drat\n\n\n\n4.698\n2.261\n7.135\n \n \n \n\n\n\n\nThe order of terms in the formula determines the order of headers in the table.\n\nmodelsummary(m,\n shape = term ~ statistic + model,\n statistic = \"conf.int\",\n gof_map = NA)\n\n\n\n\n\n \n \n \n \n Est. 
\n \n \n 2.5 % \n \n \n 97.5 % \n \n \n \n (1)\n (2)\n (1) \n (2) \n (1) \n (2) \n \n \n \n (Intercept)\n30.099\n10.790\n26.762\n0.405\n33.436\n21.175\n hp\n-0.068\n-0.052\n-0.089\n-0.071\n-0.048\n-0.033\n drat\n\n4.698\n\n2.261\n\n7.135\n \n \n \n\n\n\n\nshape does partial matching and will try to fill-in incomplete formulas:\n\nmodelsummary(m, shape = ~ statistic)\n\nSome models like multinomial logit or GAMLSS produce “grouped” parameter estimates. To display these groups, we can include a group identifier in the shape formula. This group identifier must be one of the column names produced by get_estimates(model). For example, in models produced by nnet::multinom, the group identifier is called “response”:\n\nlibrary(nnet)\n\ndat_multinom <- mtcars\ndat_multinom$cyl <- sprintf(\"Cyl: %s\", dat_multinom$cyl)\n\nmod <- list(\n nnet::multinom(cyl ~ mpg, data = dat_multinom, trace = FALSE),\n nnet::multinom(cyl ~ mpg + drat, data = dat_multinom, trace = FALSE))\n\nget_estimates(mod[[1]])\n\n term estimate std.error conf.level conf.low conf.high statistic\n1 (Intercept) 47.252432 34.975171 0.95 -21.2976435 115.8025065 1.351028\n2 mpg -2.205418 1.637963 0.95 -5.4157653 1.0049299 -1.346440\n3 (Intercept) 72.440246 37.175162 0.95 -0.4217332 145.3022247 1.948619\n4 mpg -3.579991 1.774693 0.95 -7.0583242 -0.1016573 -2.017246\n df.error p.value response s.value group\n1 Inf 0.17668650 Cyl: 6 2.5 \n2 Inf 0.17816078 Cyl: 6 2.5 \n3 Inf 0.05134088 Cyl: 8 4.3 \n4 Inf 0.04366989 Cyl: 8 4.5 \n\n\nTo summarize the results, we can type:\n\nmodelsummary(mod, shape = term + response ~ statistic)\n\n\n\n\n\n \n \n \n response\n \n (1) \n \n \n (2) \n \n \n \n Est.\n S.E.\n Est. \n S.E. 
\n \n \n \n (Intercept)\nCyl: 6\n47.252\n34.975\n89.573\n86.884\n \nCyl: 8\n72.440\n37.175\n117.971\n87.998\n mpg\nCyl: 6\n-2.205\n1.638\n-3.627\n3.869\n \nCyl: 8\n-3.580\n1.775\n-4.838\n3.915\n drat\nCyl: 6\n\n\n-3.210\n3.810\n \nCyl: 8\n\n\n-5.028\n4.199\n Num.Obs.\n\n32\n\n32\n\n R2\n\n0.763\n\n0.815\n\n R2 Adj.\n\n0.733\n\n0.786\n\n AIC\n\n24.1\n\n24.5\n\n BIC\n\n30.0\n\n33.3\n\n RMSE\n\n0.24\n\n0.20\n\n \n \n \n\n\n\n\nThe terms of the shape formula above can of course be rearranged to reshape the table. For example:\n\nmodelsummary(mod, shape = model + term ~ response)\n\n\n\n\n\n \n \n \n \n Cyl: 6\n Cyl: 8\n \n \n \n (1)\n(Intercept)\n47.252\n72.440\n \n\n(34.975)\n(37.175)\n \nmpg\n-2.205\n-3.580\n \n\n(1.638)\n(1.775)\n (2)\n(Intercept)\n89.573\n117.971\n \n\n(86.884)\n(87.998)\n \nmpg\n-3.627\n-4.838\n \n\n(3.869)\n(3.915)\n \ndrat\n-3.210\n-5.028\n \n\n(3.810)\n(4.199)\n \n \n \n\n\n\n\nWe can combine the term and group identifier columns by inserting an interaction colon : instead of the + in the formula:\n\nlibrary(marginaleffects)\nmod <- glm(am ~ mpg + factor(cyl), family = binomial, data = mtcars)\nmfx <- slopes(mod)\n\nmodelsummary(mfx, shape = term + contrast ~ model)\n\n\n\n\n\n \n \n \n \n (1)\n \n \n \n cyl\nmean(6) - mean(4)\n0.097\n \n\n(0.166)\n \nmean(8) - mean(4)\n0.093\n \n\n(0.234)\n mpg\nmean(dY/dX)\n0.056\n \n\n(0.027)\n Num.Obs.\n\n32\n AIC\n\n37.4\n BIC\n\n43.3\n Log.Lik.\n\n-14.702\n F\n\n2.236\n RMSE\n\n0.39\n \n \n \n\n\n\n\n\nmodelsummary(mfx, shape = term : contrast ~ model)\n\n\n\n\n\n \n \n \n (1)\n \n \n \n cyl mean(6) - mean(4)\n0.097\n \n(0.166)\n cyl mean(8) - mean(4)\n0.093\n \n(0.234)\n mpg mean(dY/dX)\n0.056\n \n(0.027)\n Num.Obs.\n32\n AIC\n37.4\n BIC\n43.3\n Log.Lik.\n-14.702\n F\n2.236\n RMSE\n0.39\n \n \n \n\n\n\n\n\n\n\nNote: The code in this section requires version 1.3.0 or the development version of modelsummary. 
See the website for installation instructions.\nThis section shows how to “stack/bind” multiple regression tables on top of one another, to display the results of several models side-by-side and top-to-bottom. For example, imagine that we want to present 4 different models, half of which are estimated using a different outcome variable. When using modelsummary, we store models in a list. When using modelsummary with shape=\"rbind\" or shape=\"rcollapse\", we store models in a list of lists:\n\ngm <- c(\"r.squared\", \"nobs\", \"rmse\")\n\npanels <- list(\n list(\n lm(mpg ~ 1, data = mtcars),\n lm(mpg ~ qsec, data = mtcars)\n ),\n list(\n lm(hp ~ 1, data = mtcars),\n lm(hp ~ qsec, data = mtcars)\n )\n)\n\nmodelsummary(\n panels,\n shape = \"rbind\",\n gof_map = gm)\n\n\n\n\n\n \n \n \n (1)\n (2)\n \n \n \n \n Panel A\n \n (Intercept)\n20.091\n-5.114\n \n(1.065)\n(10.030)\n qsec\n\n1.412\n \n\n(0.559)\n R2\n0.000\n0.175\n Num.Obs.\n32\n32\n RMSE\n5.93\n5.39\n \n Panel B\n \n (Intercept)\n146.688\n631.704\n \n(12.120)\n(88.700)\n qsec\n\n-27.174\n \n\n(4.946)\n R2\n0.000\n0.502\n Num.Obs.\n32\n32\n RMSE\n67.48\n47.64\n \n \n \n\n\n\n\nAs with modelsummary(), we can name models and panels by naming elements of our nested list:\n\npanels <- list(\n \"Outcome: mpg\" = list(\n \"(I)\" = lm(mpg ~ 1, data = mtcars),\n \"(II)\" = lm(mpg ~ qsec, data = mtcars)\n ),\n \"Outcome: hp\" = list(\n \"(I)\" = lm(hp ~ 1, data = mtcars),\n \"(II)\" = lm(hp ~ qsec, data = mtcars)\n )\n)\n\nmodelsummary(\n panels,\n shape = \"rbind\",\n gof_map = gm)\n\n\n\n\n\n \n \n \n (I)\n (II)\n \n \n \n \n Outcome: mpg\n \n (Intercept)\n20.091\n-5.114\n \n(1.065)\n(10.030)\n qsec\n\n1.412\n \n\n(0.559)\n R2\n0.000\n0.175\n Num.Obs.\n32\n32\n RMSE\n5.93\n5.39\n \n Outcome: hp\n \n (Intercept)\n146.688\n631.704\n \n(12.120)\n(88.700)\n qsec\n\n-27.174\n \n\n(4.946)\n R2\n0.000\n0.502\n Num.Obs.\n32\n32\n RMSE\n67.48\n47.64\n \n \n \n\n\n\n\n\n\nThe fixest package offers powerful tools to estimate 
multiple models using a concise syntax. fixest functions are also convenient because they return named lists of models which are easy to subset and manipulate using standard R functions like grepl.\nFor example, to introduce regressors in stepwise fashion, and to estimate models on different subsets of the data, we can do:\n\n##| message = FALSE\n\n## estimate 4 models\nlibrary(fixest)\n\n\nAttaching package: 'fixest'\n\n\nThe following object is masked _by_ '.GlobalEnv':\n\n f\n\nmod <- feols(\n c(hp, mpg) ~ csw(qsec, drat) | gear,\n data = mtcars)\n\n## select models with different outcome variables\npanels <- list(\n \"Miles per gallon\" = mod[grepl(\"mpg\", names(mod))],\n \"Horsepower\" = mod[grepl(\"hp\", names(mod))]\n)\n\nmodelsummary(\n panels,\n shape = \"rcollapse\",\n gof_omit = \"IC|R2\")\n\n\n\n\n\n \n \n \n (1)\n (2)\n \n \n \n \n Miles per gallon\n \n qsec\n1.436\n1.519\n \n(0.594)\n(0.529)\n drat\n\n5.765\n \n\n(2.381)\n RMSE\n4.03\n3.67\n \n Horsepower\n \n qsec\n-22.175\n-22.676\n \n(12.762)\n(13.004)\n drat\n\n-35.106\n \n\n(28.509)\n RMSE\n40.45\n39.14\n \n \n \n Num.Obs.\n32\n32\n Std.Errors\nby: gear\nby: gear\n FE: gear\nX\nX\n \n \n \n\n\n\n\nWe can use all the typical extension systems to add information, such as the mean of the dependent variable:\n\nglance_custom.fixest <- function(x, ...) 
{\n dv <- insight::get_response(x)\n dv <- sprintf(\"%.2f\", mean(dv, na.rm = TRUE))\n data.table::data.table(`Mean of DV` = dv)\n}\n\nmodelsummary(\n panels,\n shape = \"rcollapse\",\n gof_omit = \"IC|R2\")\n\n\n\n\n\n \n \n \n (1)\n (2)\n \n \n \n \n Miles per gallon\n \n qsec\n1.436\n1.519\n \n(0.594)\n(0.529)\n drat\n\n5.765\n \n\n(2.381)\n RMSE\n4.03\n3.67\n Mean of DV\n20.09\n20.09\n \n Horsepower\n \n qsec\n-22.175\n-22.676\n \n(12.762)\n(13.004)\n drat\n\n-35.106\n \n\n(28.509)\n RMSE\n40.45\n39.14\n Mean of DV\n146.69\n146.69\n \n \n \n Num.Obs.\n32\n32\n Std.Errors\nby: gear\nby: gear\n FE: gear\nX\nX\n \n \n \n\n\n\nrm(\"glance_custom.fixest\")\n\n\n\n\n\n\nBy default, modelsummary will align the first column (with coefficient names) to the left, and will center the results columns. To change this default, you can use the align argument, which accepts a string of the same length as the number of columns:\n\nmodelsummary(models, align=\"lrrrrr\")\n\nUsers who produce PDF documents using Rmarkdown or LaTeX can also align values on the decimal dot by using the character “d” in the align argument:\n\nmodelsummary(models, align=\"lddddd\")\n\nFor the table produced by this code to compile, users must include the following code in their LaTeX preamble:\n\n\\usepackage{booktabs}\n\\usepackage{siunitx}\n\\newcolumntype{d}{S[input-symbols = ()]}\n\n\n\n\nAdd notes to the bottom of your table:\nmodelsummary(models, \n notes = list('Text of the first note.', \n 'Text of the second note.'))\n\n\n\nYou can add a title to your table as follows:\nmodelsummary(models, title = 'This is a title for my table.')\n\n\n\nUse the add_rows argument to add rows manually to a table. 
For example, let’s say you estimate two models with a factor variable and you want to insert (a) an empty row to identify the reference category, and (b) customized information at the bottom of the table:\n\nmodels <- list()\nmodels[['OLS']] <- lm(mpg ~ factor(cyl), mtcars)\nmodels[['Logit']] <- glm(am ~ factor(cyl), mtcars, family = binomial)\n\nWe create a data.frame with the same number of columns as the summary table. Then, we define a “position” attribute to specify where the new rows should be inserted in the table. Finally, we pass this data.frame to the add_rows argument:\n\nlibrary(tibble)\nrows <- tribble(~term, ~OLS, ~Logit,\n 'factor(cyl)4', '-', '-',\n 'Info', '???', 'XYZ')\nattr(rows, 'position') <- c(3, 9)\n\nmodelsummary(models, add_rows = rows)\n\n\n\n\n\n \n \n \n OLS\n Logit\n \n \n \n (Intercept)\n26.664\n0.981\n \n(0.972)\n(0.677)\n factor(cyl)4\n-\n-\n factor(cyl)6\n-6.921\n-1.269\n \n(1.558)\n(1.021)\n factor(cyl)8\n-11.564\n-2.773\n \n(1.299)\n(1.021)\n Num.Obs.\n32\n32\n Info\n???\nXYZ\n R2\n0.732\n\n R2 Adj.\n0.714\n\n AIC\n170.6\n39.9\n BIC\n176.4\n44.3\n Log.Lik.\n-81.282\n-16.967\n F\n39.698\n3.691\n RMSE\n3.07\n0.42\n \n \n \n\n\n\n\n\n\n\nWe can exponentiate model estimates using the exponentiate argument:\n\nmod_logit <- glm(am ~ mpg, data = mtcars, family = binomial)\nmodelsummary(mod_logit, exponentiate = TRUE)\n\n\n\n\n\n \n \n \n (1)\n \n \n \n (Intercept)\n0.001\n \n(0.003)\n mpg\n1.359\n \n(0.156)\n Num.Obs.\n32\n AIC\n33.7\n BIC\n36.6\n Log.Lik.\n-14.838\n F\n7.148\n RMSE\n0.39\n \n \n \n\n\n\n\nWe can also present exponentiated and standard models side by side by using a logical vector:\n\nmod_logit <- list(mod_logit, mod_logit)\nmodelsummary(mod_logit, exponentiate = c(TRUE, FALSE))\n\n\n\n\n\n \n \n \n (1)\n (2)\n \n \n \n (Intercept)\n0.001\n-6.604\n \n(0.003)\n(2.351)\n mpg\n1.359\n0.307\n \n(0.156)\n(0.115)\n Num.Obs.\n32\n32\n AIC\n33.7\n33.7\n BIC\n36.6\n36.6\n Log.Lik.\n-14.838\n-14.838\n F\n7.148\n7.148\n 
RMSE\n0.39\n0.39\n \n \n \n\n\n\n\n\n\n\nAll arguments passed by the user to a modelsummary function are pushed forward to two other functions:\n\nThe function which extracts model estimates.\n\nBy default, additional arguments are pushed forward to parameters::parameters and performance::performance. Users can also use a different “backend” to extract information from model objects: the broom package. The modelsummary_get global option tells modelsummary whether to use the easystats packages (parameters and performance) or broom. With the easystats packages, other arguments are available, such as the metrics argument. Please refer to these packages’ documentation for details.\n\nThe table-making functions.\n\nBy default, additional arguments are pushed forward to kableExtra::kbl, but users can use a different table-making function by setting the output argument to a different value such as \"gt\", \"flextable\", or \"huxtable\".\nSee the Appearance vignette for examples.\n\n\nAll arguments supported by these functions are thus automatically available directly in modelsummary, modelplot, and the datasummary family of functions.\n\n\n\nTo customize the appearance of tables, modelsummary supports five popular and extremely powerful table-making packages:\n\ngt: https://gt.rstudio.com\nkableExtra: http://haozhu233.github.io/kableExtra\nhuxtable: https://hughjonesd.github.io/huxtable/\nflextable: https://davidgohel.github.io/flextable/\nDT: https://rstudio.github.io/DT\n\nThe “customizing the look of your tables” vignette shows examples for these packages.\n\n\n\nmodelsummary automatically supports all the models supported by the tidy function of the broom package or the parameters function of the parameters package. The list of supported models is rapidly expanding. 
At the moment, it covers the following model classes:\n\nsupported_models()\n\n [1] \"aareg\" \"acf\" \n [3] \"afex_aov\" \"AKP\" \n [5] \"anova\" \"Anova.mlm\" \n [7] \"anova.rms\" \"aov\" \n [9] \"aovlist\" \"Arima\" \n [11] \"averaging\" \"bamlss\" \n [13] \"bayesQR\" \"bcplm\" \n [15] \"befa\" \"betamfx\" \n [17] \"betaor\" \"betareg\" \n [19] \"BFBayesFactor\" \"bfsl\" \n [21] \"BGGM\" \"bifeAPEs\" \n [23] \"biglm\" \"binDesign\" \n [25] \"binWidth\" \"blavaan\" \n [27] \"blrm\" \"boot\" \n [29] \"bootstrap_model\" \"bracl\" \n [31] \"brmsfit\" \"brmultinom\" \n [33] \"btergm\" \"cch\" \n [35] \"censReg\" \"cgam\" \n [37] \"character\" \"cld\" \n [39] \"clm\" \"clm2\" \n [41] \"clmm\" \"clmm2\" \n [43] \"coeftest\" \"comparisons\" \n [45] \"confint.glht\" \"confusionMatrix\" \n [47] \"coxph\" \"cpglmm\" \n [49] \"crr\" \"cv.glmnet\" \n [51] \"data.frame\" \"dbscan\" \n [53] \"default\" \"deltaMethod\" \n [55] \"density\" \"dep.effect\" \n [57] \"DirichletRegModel\" \"dist\" \n [59] \"draws\" \"drc\" \n [61] \"durbinWatsonTest\" \"emm_list\" \n [63] \"emmeans\" \"emmeans_summary\" \n [65] \"emmGrid\" \"epi.2by2\" \n [67] \"ergm\" \"fa\" \n [69] \"fa.ci\" \"factanal\" \n [71] \"FAMD\" \"feglm\" \n [73] \"felm\" \"fitdistr\" \n [75] \"fixest\" \"fixest_multi\" \n [77] \"flac\" \"flic\" \n [79] \"ftable\" \"gam\" \n [81] \"Gam\" \"gamlss\" \n [83] \"gamm\" \"garch\" \n [85] \"geeglm\" \"ggeffects\" \n [87] \"glht\" \"glimML\" \n [89] \"glm\" \"glmm\" \n [91] \"glmmTMB\" \"glmnet\" \n [93] \"glmrob\" \"glmRob\" \n [95] \"glmx\" \"gmm\" \n [97] \"hclust\" \"hdbscan\" \n [99] \"hglm\" \"hkmeans\" \n[101] \"HLfit\" \"htest\" \n[103] \"hurdle\" \"hypotheses\" \n[105] \"irlba\" \"ivFixed\" \n[107] \"ivprobit\" \"ivreg\" \n[109] \"kappa\" \"kde\" \n[111] \"Kendall\" \"kmeans\" \n[113] \"lavaan\" \"leveneTest\" \n[115] \"Line\" \"Lines\" \n[117] \"list\" \"lm\" \n[119] \"lm_robust\" \"lm.beta\" \n[121] \"lme\" \"lmodel2\" \n[123] \"lmrob\" \"lmRob\" \n[125] \"logical\" 
\"logistf\" \n[127] \"logitmfx\" \"logitor\" \n[129] \"lqm\" \"lqmm\" \n[131] \"lsmobj\" \"manova\" \n[133] \"maov\" \"map\" \n[135] \"marginaleffects\" \"marginalmeans\" \n[137] \"margins\" \"maxim\" \n[139] \"maxLik\" \"mblogit\" \n[141] \"Mclust\" \"mcmc\" \n[143] \"mcmc.list\" \"MCMCglmm\" \n[145] \"mcp1\" \"mcp2\" \n[147] \"med1way\" \"mediate\" \n[149] \"merMod\" \"merModList\" \n[151] \"meta_bma\" \"meta_fixed\" \n[153] \"meta_random\" \"metaplus\" \n[155] \"mfx\" \"mhurdle\" \n[157] \"mipo\" \"mira\" \n[159] \"mixed\" \"MixMod\" \n[161] \"mixor\" \"mjoint\" \n[163] \"mle\" \"mle2\" \n[165] \"mlm\" \"mlogit\" \n[167] \"mmrm\" \"mmrm_fit\" \n[169] \"mmrm_tmb\" \"model_fit\" \n[171] \"model_parameters\" \"muhaz\" \n[173] \"multinom\" \"mvord\" \n[175] \"negbin\" \"negbinirr\" \n[177] \"negbinmfx\" \"nestedLogit\" \n[179] \"nlrq\" \"nls\" \n[181] \"NULL\" \"numeric\" \n[183] \"omega\" \"onesampb\" \n[185] \"optim\" \"orcutt\" \n[187] \"osrt\" \"pairwise.htest\" \n[189] \"pam\" \"parameters_efa\" \n[191] \"parameters_pca\" \"PCA\" \n[193] \"pgmm\" \"plm\" \n[195] \"PMCMR\" \"poissonirr\" \n[197] \"poissonmfx\" \"poLCA\" \n[199] \"polr\" \"Polygon\" \n[201] \"Polygons\" \"power.htest\" \n[203] \"prcomp\" \"predictions\" \n[205] \"principal\" \"probitmfx\" \n[207] \"pvclust\" \"pyears\" \n[209] \"rcorr\" \"ref.grid\" \n[211] \"regsubsets\" \"ridgelm\" \n[213] \"rlm\" \"rlmerMod\" \n[215] \"rma\" \"robtab\" \n[217] \"roc\" \"rq\" \n[219] \"rqs\" \"rqss\" \n[221] \"sarlm\" \"Sarlm\" \n[223] \"scam\" \"selection\" \n[225] \"sem\" \"SemiParBIV\" \n[227] \"slopes\" \"SpatialLinesDataFrame\" \n[229] \"SpatialPolygons\" \"SpatialPolygonsDataFrame\"\n[231] \"spec\" \"speedglm\" \n[233] \"speedlm\" \"stanfit\" \n[235] \"stanmvreg\" \"stanreg\" \n[237] \"summary_emm\" \"summary.glht\" \n[239] \"summary.lm\" \"summary.plm\" \n[241] \"summaryDefault\" \"survdiff\" \n[243] \"survexp\" \"survfit\" \n[245] \"survreg\" \"svd\" \n[247] \"svyglm\" \"svyolr\" \n[249] \"svytable\" 
\"systemfit\" \n[251] \"t1way\" \"table\" \n[253] \"tobit\" \"trendPMCMR\" \n[255] \"trimcibt\" \"ts\" \n[257] \"TukeyHSD\" \"varest\" \n[259] \"vgam\" \"wbgee\" \n[261] \"wbm\" \"wmcpAKP\" \n[263] \"xyz\" \"yuen\" \n[265] \"zcpglm\" \"zerocount\" \n[267] \"zeroinfl\" \"zoo\" \n\n\nTo see if a given model is supported, you can fit it, and then call this function:\n\nget_estimates(model)\n\nIf this function does not return a valid output, you can easily (really!!) add your own support. See the next section for a tutorial. If you do this, you may consider opening an issue on the Github website of the broom package: https://github.com/tidymodels/broom/issues\n\n\n\n\n\nYou can use modelsummary to insert tables into dynamic documents with knitr or Rmarkdown. This minimal .Rmd file can produce tables in PDF, HTML, or RTF documents:\n\nminimal.Rmd\n\nThis .Rmd file shows illustrates how to use table numbering and cross-references to produce PDF documents using bookdown:\n\ncross_references.Rmd\n\nThis .Rmd file shows how to customize tables in PDF and HTML files using gt and kableExtra functions:\n\nappearance.Rmd\n\n\n\n\nQuarto is an open source publishing system built on top of Pandoc. It was designed as a “successor” to Rmarkdown, and includes useful features for technical writing, such as built-in support for cross-references. modelsummary works automatically with Quarto. This is a minimal document with cross-references which should render automatically to PDF, HTML, and more:\n\n---\nformat: pdf\ntitle: Example\n---\n\n@tbl-mtcars shows that cars with high horse power get low miles per gallon.\n\n```{r}\n##| label: tbl-mtcars\n##| tbl-cap: \"Horse Powers vs. Miles per Gallon\"\nlibrary(modelsummary)\nmod <- lm(mpg ~ hp, mtcars)\nmodelsummary(mod)\n:::\n\n### Emacs Org-Mode\n\nYou can use `modelsummary` to insert tables into Emacs Org-Mode documents, which can be exported to a variety of formats, including HTML and PDF (via LaTeX). 
As with anything Emacs-related, there are many ways to achieve the outcomes you want. Here is one example of an Org-Mode document which can automatically export tables to HTML and PDF without manual tweaks:\n\n#+PROPERTY: header-args:R :var orgbackend=(prin1-to-string org-export-current-backend)\n#+MACRO: Rtable (eval (concat \"#+header: :results output \" (prin1-to-string org-export-current-backend)))\n\n{{{Rtable}}}\n#+BEGIN_SRC R :exports both\nlibrary(modelsummary)\noptions(modelsummary_factory_default = orgbackend)\n\nmod = lm(hp ~ mpg, data = mtcars)\n\nmodelsummary(mod)\n#+END_SRC\n\nThe first line tells Org-mode to assign a variable called `orgbackend`. This variable will be accessible by the `R` session, and will be equal to \"html\" or \"latex\", depending on the export format.\n\nThe second line creates an Org macro which we will use to automatically add useful information to the header of source blocks. For instance, when we export to HTML, the macro will expand to `:results output html`. This tells Org-Mode to insert the last printed output from the `R` session, and to treat it as raw HTML. \n\nThe `{{{Rtable}}}` call expands the macro to add information to the header of the block that follows.\n\n`#+BEGIN_SRC R :exports both` says that we want to print both the original code and the output (`:exports results` would omit the code, for example).\n\nFinally, `options(modelsummary_factory_default=orgbackend)` uses the variable we defined to set the default output format. That way, we don't have to use the `output` argument every time.\n\nOne potential issue to keep in mind is that the code above extracts the printout from the `R` console. However, when we customize tables with `kableExtra` or `gt` functions, those functions do not always return printed raw HTML or LaTeX code. Sometimes, it can be necessary to add a call to `cat` at the end of a table customization pipeline. 
For example:\n\n{{{Rtable}}}\n#+BEGIN_SRC R :exports both\nlibrary(modelsummary)\nlibrary(kableExtra)\n\nmod = lm(hp ~ mpg, data = mtcars)\n\nmodelsummary(mod, output = orgbackend) %>%\n row_spec(1, background = \"pink\") %>%\n cat()\n#+END_SRC\n\n## Global options\n\nUsers can change the default behavior of `modelsummary` by setting global options.\n\nOmit the note at the bottom of the table with significance threshold:\n\n::: {.cell}\n\n```{.r .cell-code}\noptions(\"modelsummary_stars_note\" = FALSE)\n```\n:::\n\nChange the default output format:\n\n::: {.cell}\n\n```{.r .cell-code}\noptions(modelsummary_factory_default = \"latex\")\noptions(modelsummary_factory_default = \"gt\")\n```\n:::\n\nChange the backend packages that `modelsummary` uses to create tables in different output formats:\n\n::: {.cell}\n\n```{.r .cell-code}\noptions(modelsummary_factory_html = 'kableExtra')\noptions(modelsummary_factory_latex = 'flextable')\noptions(modelsummary_factory_word = 'huxtable')\noptions(modelsummary_factory_png = 'gt')\n```\n:::\n\nChange the packages that `modelsummary` uses to extract information from models:\n\n::: {.cell}\n\n```{.r .cell-code}\n## tidymodels: broom \noptions(modelsummary_get = \"broom\")\n\n## easystats: performance + parameters\noptions(modelsummary_get = \"easystats\")\n```\n:::\n\n[The `appearance` vignette](https://modelsummary.com/articles/appearance.html#themes) shows how to set \"themes\" for your tables using the `modelsummary_theme_gt`, `modelsummary_theme_kableExtra`, `modelsummary_theme_flextable` and `modelsummary_theme_huxtable` global options. For example:\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(gt)\n\n## The ... ellipsis is required!\ncustom_theme <- function(x, ...) 
{\n x %>% gt::opt_row_striping(row_striping = TRUE)\n}\noptions(\"modelsummary_theme_gt\" = custom_theme)\n\nmod <- lm(mpg ~ hp + drat, mtcars)\nmodelsummary(mod, output = \"gt\")\n```\n:::\n\n## Case studies\n\n\n### Standardization\n\nIn some cases, it is useful to standardize coefficients before reporting them. `modelsummary` extracts coefficients from model objects using the `parameters` package, and that package offers several options for standardization: https://easystats.github.io/parameters/reference/model_parameters.default.html\n\nWe can pass the `standardize` argument directly to `modelsummary` or `modelplot`, and that argument will be forwarded to `parameters`. For example to refit the model on standardized data and plot the results, we can do:\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(modelsummary)\nmod <- lm(mpg ~ hp + am, data = mtcars)\n\nmodelplot(mod, standardize = \"refit\")\n```\n\n::: {.cell-output .cell-output-stderr}\n```\nWarning in ggplot2::geom_pointrange(ggplot2::aes(y = term, x = estimate, :\nIgnoring unknown parameters: `standardize`\n```\n:::\n\n::: {.cell-output-display}\n![](modelsummary_files/figure-html/unnamed-chunk-60-1.png){width=672}\n:::\n:::\n\nCompare to the unstandardized plot:\n\n::: {.cell}\n\n```{.r .cell-code}\nmodelplot(mod)\n```\n\n::: {.cell-output-display}\n![](modelsummary_files/figure-html/unnamed-chunk-61-1.png){width=672}\n:::\n:::\n\n### Subgroup estimation with `nest_by`\n\nSometimes, it is useful to estimate multiple regression models on subsets of the data. To do this efficiently, we can use the `nest_by` function from the `dplyr` package. 
Then, estimate the models with `lm`, extract them and name them with `pull`, and finally summarize them with `modelsummary`:\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(tidyverse)\n\nmtcars %>%\n nest_by(cyl) %>%\n mutate(models = list(lm(mpg ~ hp, data))) %>%\n pull(models, name = cyl) %>%\n modelsummary\n```\n\n::: {.cell-output-display}\n\n```{=html}\n<div id=\"msyjczffsg\" style=\"padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;\">\n<style>#msyjczffsg table {\n font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';\n -webkit-font-smoothing: antialiased;\n -moz-osx-font-smoothing: grayscale;\n}\n\n#msyjczffsg thead, #msyjczffsg tbody, #msyjczffsg tfoot, #msyjczffsg tr, #msyjczffsg td, #msyjczffsg th {\n border-style: none;\n}\n\n#msyjczffsg p {\n margin: 0;\n padding: 0;\n}\n\n#msyjczffsg .gt_table {\n display: table;\n border-collapse: collapse;\n line-height: normal;\n margin-left: auto;\n margin-right: auto;\n color: #333333;\n font-size: 16px;\n font-weight: normal;\n font-style: normal;\n background-color: #FFFFFF;\n width: auto;\n border-top-style: solid;\n border-top-width: 2px;\n border-top-color: #A8A8A8;\n border-right-style: none;\n border-right-width: 2px;\n border-right-color: #D3D3D3;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #A8A8A8;\n border-left-style: none;\n border-left-width: 2px;\n border-left-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_caption {\n padding-top: 4px;\n padding-bottom: 4px;\n}\n\n#msyjczffsg .gt_title {\n color: #333333;\n font-size: 125%;\n font-weight: initial;\n padding-top: 4px;\n padding-bottom: 4px;\n padding-left: 5px;\n padding-right: 5px;\n border-bottom-color: #FFFFFF;\n border-bottom-width: 0;\n}\n\n#msyjczffsg .gt_subtitle {\n color: #333333;\n font-size: 85%;\n font-weight: initial;\n padding-top: 3px;\n 
padding-bottom: 5px;\n padding-left: 5px;\n padding-right: 5px;\n border-top-color: #FFFFFF;\n border-top-width: 0;\n}\n\n#msyjczffsg .gt_heading {\n background-color: #FFFFFF;\n text-align: center;\n border-bottom-color: #FFFFFF;\n border-left-style: none;\n border-left-width: 1px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 1px;\n border-right-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_bottom_border {\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_col_headings {\n border-top-style: solid;\n border-top-width: 2px;\n border-top-color: #D3D3D3;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n border-left-style: none;\n border-left-width: 1px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 1px;\n border-right-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_col_heading {\n color: #333333;\n background-color: #FFFFFF;\n font-size: 100%;\n font-weight: normal;\n text-transform: inherit;\n border-left-style: none;\n border-left-width: 1px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 1px;\n border-right-color: #D3D3D3;\n vertical-align: bottom;\n padding-top: 5px;\n padding-bottom: 6px;\n padding-left: 5px;\n padding-right: 5px;\n overflow-x: hidden;\n}\n\n#msyjczffsg .gt_column_spanner_outer {\n color: #333333;\n background-color: #FFFFFF;\n font-size: 100%;\n font-weight: normal;\n text-transform: inherit;\n padding-top: 0;\n padding-bottom: 0;\n padding-left: 4px;\n padding-right: 4px;\n}\n\n#msyjczffsg .gt_column_spanner_outer:first-child {\n padding-left: 0;\n}\n\n#msyjczffsg .gt_column_spanner_outer:last-child {\n padding-right: 0;\n}\n\n#msyjczffsg .gt_column_spanner {\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n vertical-align: bottom;\n padding-top: 5px;\n padding-bottom: 5px;\n overflow-x: hidden;\n display: 
inline-block;\n width: 100%;\n}\n\n#msyjczffsg .gt_spanner_row {\n border-bottom-style: hidden;\n}\n\n#msyjczffsg .gt_group_heading {\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n color: #333333;\n background-color: #FFFFFF;\n font-size: 100%;\n font-weight: initial;\n text-transform: inherit;\n border-top-style: solid;\n border-top-width: 2px;\n border-top-color: #D3D3D3;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n border-left-style: none;\n border-left-width: 1px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 1px;\n border-right-color: #D3D3D3;\n vertical-align: middle;\n text-align: left;\n}\n\n#msyjczffsg .gt_empty_group_heading {\n padding: 0.5px;\n color: #333333;\n background-color: #FFFFFF;\n font-size: 100%;\n font-weight: initial;\n border-top-style: solid;\n border-top-width: 2px;\n border-top-color: #D3D3D3;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n vertical-align: middle;\n}\n\n#msyjczffsg .gt_from_md > :first-child {\n margin-top: 0;\n}\n\n#msyjczffsg .gt_from_md > :last-child {\n margin-bottom: 0;\n}\n\n#msyjczffsg .gt_row {\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n margin: 10px;\n border-top-style: solid;\n border-top-width: 1px;\n border-top-color: #D3D3D3;\n border-left-style: none;\n border-left-width: 1px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 1px;\n border-right-color: #D3D3D3;\n vertical-align: middle;\n overflow-x: hidden;\n}\n\n#msyjczffsg .gt_stub {\n color: #333333;\n background-color: #FFFFFF;\n font-size: 100%;\n font-weight: initial;\n text-transform: inherit;\n border-right-style: solid;\n border-right-width: 2px;\n border-right-color: #D3D3D3;\n padding-left: 5px;\n padding-right: 5px;\n}\n\n#msyjczffsg .gt_stub_row_group {\n color: #333333;\n background-color: #FFFFFF;\n 
font-size: 100%;\n font-weight: initial;\n text-transform: inherit;\n border-right-style: solid;\n border-right-width: 2px;\n border-right-color: #D3D3D3;\n padding-left: 5px;\n padding-right: 5px;\n vertical-align: top;\n}\n\n#msyjczffsg .gt_row_group_first td {\n border-top-width: 2px;\n}\n\n#msyjczffsg .gt_row_group_first th {\n border-top-width: 2px;\n}\n\n#msyjczffsg .gt_summary_row {\n color: #333333;\n background-color: #FFFFFF;\n text-transform: inherit;\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n}\n\n#msyjczffsg .gt_first_summary_row {\n border-top-style: solid;\n border-top-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_first_summary_row.thick {\n border-top-width: 2px;\n}\n\n#msyjczffsg .gt_last_summary_row {\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_grand_summary_row {\n color: #333333;\n background-color: #FFFFFF;\n text-transform: inherit;\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n}\n\n#msyjczffsg .gt_first_grand_summary_row {\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n border-top-style: double;\n border-top-width: 6px;\n border-top-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_last_grand_summary_row_top {\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n border-bottom-style: double;\n border-bottom-width: 6px;\n border-bottom-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_striped {\n background-color: rgba(128, 128, 128, 0.05);\n}\n\n#msyjczffsg .gt_table_body {\n border-top-style: solid;\n border-top-width: 2px;\n border-top-color: #D3D3D3;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_footnotes {\n color: #333333;\n background-color: #FFFFFF;\n border-bottom-style: none;\n 
border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n border-left-style: none;\n border-left-width: 2px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 2px;\n border-right-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_footnote {\n margin: 0px;\n font-size: 90%;\n padding-top: 4px;\n padding-bottom: 4px;\n padding-left: 5px;\n padding-right: 5px;\n}\n\n#msyjczffsg .gt_sourcenotes {\n color: #333333;\n background-color: #FFFFFF;\n border-bottom-style: none;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n border-left-style: none;\n border-left-width: 2px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 2px;\n border-right-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_sourcenote {\n font-size: 90%;\n padding-top: 4px;\n padding-bottom: 4px;\n padding-left: 5px;\n padding-right: 5px;\n}\n\n#msyjczffsg .gt_left {\n text-align: left;\n}\n\n#msyjczffsg .gt_center {\n text-align: center;\n}\n\n#msyjczffsg .gt_right {\n text-align: right;\n font-variant-numeric: tabular-nums;\n}\n\n#msyjczffsg .gt_font_normal {\n font-weight: normal;\n}\n\n#msyjczffsg .gt_font_bold {\n font-weight: bold;\n}\n\n#msyjczffsg .gt_font_italic {\n font-style: italic;\n}\n\n#msyjczffsg .gt_super {\n font-size: 65%;\n}\n\n#msyjczffsg .gt_footnote_marks {\n font-size: 75%;\n vertical-align: 0.4em;\n position: initial;\n}\n\n#msyjczffsg .gt_asterisk {\n font-size: 100%;\n vertical-align: 0;\n}\n\n#msyjczffsg .gt_indent_1 {\n text-indent: 5px;\n}\n\n#msyjczffsg .gt_indent_2 {\n text-indent: 10px;\n}\n\n#msyjczffsg .gt_indent_3 {\n text-indent: 15px;\n}\n\n#msyjczffsg .gt_indent_4 {\n text-indent: 20px;\n}\n\n#msyjczffsg .gt_indent_5 {\n text-indent: 25px;\n}\n</style>\n<table class=\"gt_table\" data-quarto-disable-processing=\"false\" data-quarto-bootstrap=\"false\">\n <thead>\n <tr class=\"gt_col_headings\">\n <th class=\"gt_col_heading gt_columns_bottom_border gt_left\" rowspan=\"1\" colspan=\"1\" scope=\"col\" id=\" \"> </th>\n <th 
class=\"gt_col_heading gt_columns_bottom_border gt_center\" rowspan=\"1\" colspan=\"1\" scope=\"col\" id=\"4\">4</th>\n <th class=\"gt_col_heading gt_columns_bottom_border gt_center\" rowspan=\"1\" colspan=\"1\" scope=\"col\" id=\"6\">6</th>\n <th class=\"gt_col_heading gt_columns_bottom_border gt_center\" rowspan=\"1\" colspan=\"1\" scope=\"col\" id=\"8\">8</th>\n </tr>\n </thead>\n <tbody class=\"gt_table_body\">\n <tr><td headers=\" \" class=\"gt_row gt_left\">(Intercept)</td>\n<td headers=\"4\" class=\"gt_row gt_center\">35.983</td>\n<td headers=\"6\" class=\"gt_row gt_center\">20.674</td>\n<td headers=\"8\" class=\"gt_row gt_center\">18.080</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\"></td>\n<td headers=\"4\" class=\"gt_row gt_center\">(5.201)</td>\n<td headers=\"6\" class=\"gt_row gt_center\">(3.304)</td>\n<td headers=\"8\" class=\"gt_row gt_center\">(2.988)</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">hp</td>\n<td headers=\"4\" class=\"gt_row gt_center\">-0.113</td>\n<td headers=\"6\" class=\"gt_row gt_center\">-0.008</td>\n<td headers=\"8\" class=\"gt_row gt_center\">-0.014</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\" style=\"border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #000000;\"></td>\n<td headers=\"4\" class=\"gt_row gt_center\" style=\"border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #000000;\">(0.061)</td>\n<td headers=\"6\" class=\"gt_row gt_center\" style=\"border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #000000;\">(0.027)</td>\n<td headers=\"8\" class=\"gt_row gt_center\" style=\"border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #000000;\">(0.014)</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">Num.Obs.</td>\n<td headers=\"4\" class=\"gt_row gt_center\">11</td>\n<td headers=\"6\" class=\"gt_row gt_center\">7</td>\n<td headers=\"8\" class=\"gt_row gt_center\">14</td></tr>\n <tr><td headers=\" 
\" class=\"gt_row gt_left\">R2</td>\n<td headers=\"4\" class=\"gt_row gt_center\">0.274</td>\n<td headers=\"6\" class=\"gt_row gt_center\">0.016</td>\n<td headers=\"8\" class=\"gt_row gt_center\">0.080</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">R2 Adj.</td>\n<td headers=\"4\" class=\"gt_row gt_center\">0.193</td>\n<td headers=\"6\" class=\"gt_row gt_center\">-0.181</td>\n<td headers=\"8\" class=\"gt_row gt_center\">0.004</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">AIC</td>\n<td headers=\"4\" class=\"gt_row gt_center\">65.8</td>\n<td headers=\"6\" class=\"gt_row gt_center\">29.9</td>\n<td headers=\"8\" class=\"gt_row gt_center\">69.8</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">BIC</td>\n<td headers=\"4\" class=\"gt_row gt_center\">67.0</td>\n<td headers=\"6\" class=\"gt_row gt_center\">29.7</td>\n<td headers=\"8\" class=\"gt_row gt_center\">71.8</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">Log.Lik.</td>\n<td headers=\"4\" class=\"gt_row gt_center\">-29.891</td>\n<td headers=\"6\" class=\"gt_row gt_center\">-11.954</td>\n<td headers=\"8\" class=\"gt_row gt_center\">-31.920</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">RMSE</td>\n<td headers=\"4\" class=\"gt_row gt_center\">3.66</td>\n<td headers=\"6\" class=\"gt_row gt_center\">1.33</td>\n<td headers=\"8\" class=\"gt_row gt_center\">2.37</td></tr>\n </tbody>\n \n \n</table>\n</div>\n```\n\n:::\n:::\n\n### Statistics in separate columns instead of one over the other\n\nIn somes cases, you may want to display statistics in separate columns instead of one over the other. It is easy to achieve this outcome by using the `estimate` argument. This argument accepts a vector of values, one for each of the models we are trying to summarize. If we want to include estimates and standard errors in separate columns, all we need to do is repeat a model, but request different statistics. 
For example,\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(modelsummary)\nlibrary(kableExtra)\n\nmod1 <- lm(mpg ~ hp, mtcars)\nmod2 <- lm(mpg ~ hp + drat, mtcars)\n\nmodels <- list(\n \"Coef.\" = mod1,\n \"Std.Error\" = mod1,\n \"Coef.\" = mod2,\n \"Std.Error\" = mod2)\n\nmodelsummary(models,\n estimate = c(\"estimate\", \"std.error\", \"estimate\", \"std.error\"),\n statistic = NULL,\n gof_omit = \".*\",\n output = \"kableExtra\") %>%\n add_header_above(c(\" \" = 1, \"Model A\" = 2, \"Model B\" = 2))\n```\n\n::: {.cell-output-display}\n\n`````{=html}\n<table class=\"table\" style=\"width: auto !important; margin-left: auto; margin-right: auto;\">\n <thead>\n<tr>\n<th style=\"empty-cells: hide;border-bottom:hidden;\" colspan=\"1\"></th>\n<th style=\"border-bottom:hidden;padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; \" colspan=\"2\"><div style=\"border-bottom: 1px solid #ddd; padding-bottom: 5px; \">Model A</div></th>\n<th style=\"border-bottom:hidden;padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; \" colspan=\"2\"><div style=\"border-bottom: 1px solid #ddd; padding-bottom: 5px; \">Model B</div></th>\n</tr>\n <tr>\n <th style=\"text-align:left;\"> </th>\n <th style=\"text-align:center;\"> Coef. </th>\n <th style=\"text-align:center;\"> Std.Error </th>\n <th style=\"text-align:center;\"> Coef. 
</th>\n <th style=\"text-align:center;\"> Std.Error </th>\n </tr>\n </thead>\n<tbody>\n <tr>\n <td style=\"text-align:left;\"> (Intercept) </td>\n <td style=\"text-align:center;\"> 30.099 </td>\n <td style=\"text-align:center;\"> 1.634 </td>\n <td style=\"text-align:center;\"> 10.790 </td>\n <td style=\"text-align:center;\"> 5.078 </td>\n </tr>\n <tr>\n <td style=\"text-align:left;\"> hp </td>\n <td style=\"text-align:center;\"> −0.068 </td>\n <td style=\"text-align:center;\"> 0.010 </td>\n <td style=\"text-align:center;\"> −0.052 </td>\n <td style=\"text-align:center;\"> 0.009 </td>\n </tr>\n <tr>\n <td style=\"text-align:left;\"> drat </td>\n <td style=\"text-align:center;\"> </td>\n <td style=\"text-align:center;\"> </td>\n <td style=\"text-align:center;\"> 4.698 </td>\n <td style=\"text-align:center;\"> 1.192 </td>\n </tr>\n</tbody>\n</table>\n\n\n:::\nThis can be automated using a simple function:\n\nside_by_side <- function(models, estimates, ...) {\n models <- rep(models, each = length(estimates))\n estimates <- rep(estimates, times = 2)\n names(models) <- names(estimates)\n modelsummary(models = models, estimate = estimates,\n statistic = NULL, gof_omit = \".*\", ...)\n}\n\nmodels = list(\n lm(mpg ~ hp, mtcars),\n lm(mpg ~ hp + drat, mtcars))\n\nestimates <- c(\"Coef.\" = \"estimate\", \"Std.Error\" = \"std.error\")\n\nside_by_side(models, estimates = estimates)\n\n\n\n\n\n \n \n \n Coef.\n Std.Error\n Coef. \n Std.Error \n \n \n \n (Intercept)\n30.099\n1.634\n10.790\n5.078\n hp\n-0.068\n0.010\n-0.052\n0.009\n drat\n\n\n4.698\n1.192\n \n \n \n\n\n\n\n\n\n\nUsers often want to use estimates or standard errors that have been obtained using a custom strategy. 
To achieve this in an automated and replicable way, it can be useful to use the tidy_custom strategy described above in the “Customizing Existing Models” section.\nFor example, we can use the modelr package to draw 500 resamples of a dataset, and compute bootstrap standard errors by taking the standard deviation of the estimates computed across all of those resampled datasets. To do this, we define a tidy_custom.lm function that will automatically bootstrap any lm model supplied to modelsummary, and replace the values in the table.\nNote that tidy_custom.lm returns a data.frame with 3 columns: term, estimate, and std.error:\n\nlibrary(\"modelsummary\")\nlibrary(\"broom\")\nlibrary(\"tidyverse\")\nlibrary(\"modelr\")\n\ntidy_custom.lm <- function(x, ...) {\n # extract data from the model\n model.frame(x) %>%\n # draw 500 bootstrap resamples\n modelr::bootstrap(n = 500) %>%\n # estimate the model 500 times\n mutate(results = map(strap, ~ update(x, data = .))) %>%\n # extract results using `broom::tidy`\n mutate(results = map(results, tidy)) %>%\n # unnest and summarize\n unnest(results) %>%\n group_by(term) %>%\n summarize(std.error = sd(estimate),\n estimate = mean(estimate))\n}\n\nmod = list(\n lm(hp ~ mpg, mtcars) ,\n lm(hp ~ mpg + drat, mtcars))\n\nmodelsummary(mod)\n\n\n\n\n\n \n \n \n (1)\n (2)\n \n \n \n (Intercept)\n326.421\n284.858\n \n(29.737)\n(44.865)\n mpg\n-9.000\n-10.117\n \n(1.364)\n(2.485)\n drat\n\n17.795\n \n\n(21.775)\n Num.Obs.\n32\n32\n R2\n0.602\n0.614\n R2 Adj.\n0.589\n0.588\n AIC\n336.9\n337.9\n BIC\n341.3\n343.7\n Log.Lik.\n-165.428\n-164.940\n F\n45.460\n23.100\n RMSE\n42.55\n41.91\n \n \n \n\n\n\n\n\n\n\nOne common use-case for glance_custom is to include additional goodness-of-fit statistics. 
For example, in an instrumental variable estimation computed by the fixest package, we may want to include an IV-Wald statistic for the first-stage regression of each endogenous regressor:\n\nlibrary(fixest)\nlibrary(tidyverse)\n\n## create a toy dataset\nbase <- iris\nnames(base) <- c(\"y\", \"x1\", \"x_endo_1\", \"x_inst_1\", \"fe\")\nbase$x_inst_2 <- 0.2 * base$y + 0.2 * base$x_endo_1 + rnorm(150, sd = 0.5)\nbase$x_endo_2 <- 0.2 * base$y - 0.2 * base$x_inst_1 + rnorm(150, sd = 0.5)\n\n## estimate an instrumental variable model\nmod <- feols(y ~ x1 | fe | x_endo_1 + x_endo_2 ~ x_inst_1 + x_inst_2, base)\n\n## custom extractor function returns a one-row data.frame (or tibble)\nglance_custom.fixest <- function(x) {\n tibble(\n \"Wald (x_endo_1)\" = fitstat(x, \"ivwald\")[[1]]$stat,\n \"Wald (x_endo_2)\" = fitstat(x, \"ivwald\")[[2]]$stat\n )\n}\n\n## draw table\nmodelsummary(mod)\n\n\n\n\n\n \n \n \n (1)\n \n \n \n fit_x_endo_1\n0.424\n \n(0.103)\n fit_x_endo_2\n0.798\n \n(0.272)\n x1\n0.485\n \n(0.059)\n Num.Obs.\n150\n R2\n0.683\n R2 Adj.\n0.672\n R2 Within\n0.168\n R2 Within Adj.\n0.150\n AIC\n207.9\n BIC\n226.0\n RMSE\n0.46\n Std.Errors\nby: fe\n FE: fe\nX\n Wald (x_endo_1)\n77.5359206163405\n Wald (x_endo_2)\n49.3216288080678\n \n \n \n\n\n\n\n\nrm(\"glance_custom.fixest\")\n\n\n\n\nmodelsummary can pool and display analyses on several datasets imputed using the mice or Amelia packages. 
This code illustrates how:\n\nlibrary(mice)\nlibrary(Amelia)\nlibrary(modelsummary)\n\n## Download data from `Rdatasets`\nurl <- 'https://vincentarelbundock.github.io/Rdatasets/csv/HistData/Guerry.csv'\ndat <- read.csv(url)[, c('Clergy', 'Commerce', 'Literacy')]\n\n## Insert missing values\ndat$Clergy[sample(1:nrow(dat), 10)] <- NA\ndat$Commerce[sample(1:nrow(dat), 10)] <- NA\ndat$Literacy[sample(1:nrow(dat), 10)] <- NA\n\n## Impute with `mice` and `Amelia`\ndat_mice <- mice(dat, m = 5, printFlag = FALSE)\ndat_amelia <- amelia(dat, m = 5, p2s = 0)$imputations\n\n## Estimate models\nmod <- list()\nmod[['Listwise deletion']] <- lm(Clergy ~ Literacy + Commerce, dat)\nmod[['Mice']] <- with(dat_mice, lm(Clergy ~ Literacy + Commerce)) \nmod[['Amelia']] <- lapply(dat_amelia, function(x) lm(Clergy ~ Literacy + Commerce, x))\n\n## Pool results\nmod[['Mice']] <- mice::pool(mod[['Mice']])\nmod[['Amelia']] <- mice::pool(mod[['Amelia']])\n\n## Summarize\nmodelsummary(mod)\n\n\n\n\n\n \n \n \n Listwise deletion\n Mice\n Amelia\n \n \n \n (Intercept)\n63.885\n71.548\n67.067\n \n(16.416)\n(16.109)\n(13.554)\n Literacy\n-0.266\n-0.436\n-0.360\n \n(0.269)\n(0.249)\n(0.215)\n Commerce\n-0.235\n-0.270\n-0.233\n \n(0.170)\n(0.171)\n(0.145)\n Num.Obs.\n60\n86\n86\n Num.Imp.\n\n5\n5\n R2\n0.033\n0.062\n0.049\n R2 Adj.\n-0.001\n\n0.025\n AIC\n564.1\n\n\n BIC\n572.5\n\n\n Log.Lik.\n-278.064\n\n\n RMSE\n24.91\n\n\n \n \n \n\n\n\n\n\n\n\n\nThe table-making backends supported by modelsummary have overlapping capabilities (e.g., several of them can produce HTML tables). 
These are the default packages used for different outputs:\nkableExtra:\n\nHTML\nLaTeX / PDF\n\nflextable:\n\nWord\nPowerpoint\n\ngt:\n\njpg\npng\n\nYou can modify these defaults by setting global options such as:\noptions(modelsummary_factory_html = \"kableExtra\")\noptions(modelsummary_factory_latex = \"gt\")\noptions(modelsummary_factory_word = \"huxtable\")\noptions(modelsummary_factory_png = \"gt\")\n\n\n\n\n\n\nStandardized coefficients\nRow group labels\nCustomizing Word tables\nHow to add p values to datasummary_correlation\n\n\n\n\nFirst, please read the documentation in ?modelsummary and on the modelsummary website. The website includes dozens of worked examples and a lot of detailed explanation.\nSecond, try to use the [modelsummary] tag on StackOverflow.\nThird, if you think you found a bug or have a feature request, please file it on the Github issue tracker:\n\n\n\nSee the detailed documentation in the “Adding and Customizing Models” section of the modelsummary website.\n\n\n\nA modelsummary table is divided in two parts: “Estimates” (top of the table) and “Goodness-of-fit” (bottom of the table). To populate those two parts, modelsummary tries using the broom, parameters and performance packages in sequence.\nEstimates:\n\nTry the broom::tidy function to see if that package supports this model type, or if the user defined a custom tidy function in their global environment. If this fails…\nTry the parameters::model_parameters function to see if the parameters package supports this model type.\n\nGoodness-of-fit:\n\nTry the performance::model_performance function to see if the performance package supports this model type.\nTry the broom::glance function to see if that package supports this model type, or if the user defined a custom glance function in their global environment. 
If this fails…\n\nYou can change the order in which those steps are executed by setting a global option:\n\n## tidymodels: broom \noptions(modelsummary_get = \"broom\")\n\n## easystats: performance + parameters\noptions(modelsummary_get = \"easystats\")\n\nIf all of this fails, modelsummary will return an error message.\nIf you have problems with a model object, you can often diagnose the problem by running the following commands from a clean R session:\n## see if parameters and performance support your model type\nlibrary(parameters)\nlibrary(performance)\nmodel_parameters(model)\nmodel_performance(model)\n\n## see if broom supports your model type\nlibrary(broom)\ntidy(model)\nglance(model)\n\n## see if broom.mixed supports your model type\nlibrary(broom.mixed)\ntidy(model)\nglance(model)\nIf none of these options work, you can create your own tidy and glance methods, as described in the Adding new models section.\nIf one of the extractor functions does not work well or takes too long to process, you can define a new “custom” model class and choose your own extractors, as described in the Adding new models section.\n\n\n\nThe modelsummary function, by itself, is not slow: it should only take a couple seconds to produce a table in any output format. However, sometimes it can be computationally expensive (and long) to extract estimates and to compute goodness-of-fit statistics for your model.\nThe main options to speed up modelsummary are:\n\nSet gof_map=NA to avoid computing expensive goodness-of-fit statistics.\nUse the easystats extractor functions and the metrics argument to avoid computing expensive statistics (see below for an example).\nUse parallel computation if you are summarizing multiple models. 
See the “Parallel computation” section in the ?modelsummary documentation.\n\nTo diagnose the slowdown and find the bottleneck, you can try to benchmark the various extractor functions:\n\nlibrary(tictoc)\n\ndata(trade)\nmod <- lm(mpg ~ hp + drat, mtcars)\n\ntic(\"tidy\")\nx <- broom::tidy(mod)\ntoc()\n\ntidy: 0.002 sec elapsed\n\ntic(\"glance\")\nx <- broom::glance(mod)\ntoc()\n\nglance: 0.004 sec elapsed\n\ntic(\"parameters\")\nx <- parameters::parameters(mod)\ntoc()\n\nparameters: 0.02 sec elapsed\n\ntic(\"performance\")\nx <- performance::performance(mod)\ntoc()\n\nperformance: 0.011 sec elapsed\n\n\nIn my experience, the main bottleneck tends to be computing goodness-of-fit statistics. The performance extractor allows users to specify a metrics argument to select a subset of goodness-of-fit statistics to include. Using this can speed things up considerably.\nWe call modelsummary with the metrics argument:\n\nmodelsummary(mod, metrics = \"rmse\")\n\n\n\n\n\n \n \n \n (1)\n \n \n \n (Intercept)\n10.790\n \n(5.078)\n hp\n-0.052\n \n(0.009)\n drat\n4.698\n \n(1.192)\n Num.Obs.\n32\n R2\n0.741\n R2 Adj.\n0.723\n AIC\n169.5\n BIC\n175.4\n Log.Lik.\n-80.752\n F\n41.522\n \n \n \n\n\n\n\n\n\n\nSometimes, users want to include raw LaTeX commands in their tables, such as coefficient names including math mode: Apple $\\times$ Orange. The result of these attempts is often a weird string such as: \\$\textbackslash{}times\\$ instead of proper LaTeX-rendered characters.\nThe source of the problem is that kableExtra, the default table-making package in modelsummary, automatically escapes special characters to make sure that your tables compile properly in LaTeX. To avoid this, we need to pass escape=FALSE to modelsummary:\n\nmodelsummary(mod, escape = FALSE)\n\n\n\n\nMany Bayesian models are supported out-of-the-box, including those produced by the rstanarm and brms packages. The statistics available for Bayesian models are slightly different from those available for most frequentist models. 
Users can call get_estimates to see what is available:\n\nlibrary(rstanarm)\n\nThis is rstanarm version 2.32.1\n\n\n- See https://mc-stan.org/rstanarm/articles/priors for changes to default priors!\n\n\n- Default priors may change, so it's safest to specify priors, even if equivalent to the defaults.\n\n\n- For execution on a local, multicore CPU with excess RAM we recommend calling\n\n\n options(mc.cores = parallel::detectCores())\n\n\n\nAttaching package: 'rstanarm'\n\n\nThe following object is masked from 'package:fixest':\n\n se\n\nmod <- stan_glm(am ~ hp + drat, data = mtcars)\n\n\nget_estimates(mod)\n\n term estimate mad conf.level conf.low conf.high\n1 (Intercept) -2.2475345916 0.555314871 0.95 -3.408237583 -1.058208927\n2 hp 0.0007033686 0.001063287 0.95 -0.001363318 0.002914953\n3 drat 0.7069798264 0.133722428 0.95 0.429156919 0.985993244\n prior.distribution prior.location prior.scale group std.error statistic\n1 normal 0.40625 1.24747729 NA NA\n2 normal 0.00000 0.01819465 NA NA\n3 normal 0.00000 2.33313429 NA NA\n p.value\n1 NA\n2 NA\n3 NA\n\n\nThis shows that the std.error column is empty (NA), but that there is a mad statistic (median absolute deviation). So we can do:\n\nmodelsummary(mod, statistic = \"mad\")\n\nWarning: \n`modelsummary` uses the `performance` package to extract goodness-of-fit\nstatistics from models of this class. You can specify the statistics you wish\nto compute by supplying a `metrics` argument to `modelsummary`, which will then\npush it forward to `performance`. Acceptable values are: \"all\", \"common\",\n\"none\", or a character vector of metrics names. For example: `modelsummary(mod,\nmetrics = c(\"RMSE\", \"R2\")` Note that some metrics are computationally\nexpensive. 
See `?performance::performance` for details.\n This warning appears once per session.\n\n\n\n\n\n\n \n \n \n (1)\n \n \n \n (Intercept)\n-2.248\n \n(0.555)\n hp\n0.001\n \n(0.001)\n drat\n0.707\n \n(0.134)\n Num.Obs.\n32\n R2\n0.501\n R2 Adj.\n0.433\n Log.Lik.\n-12.064\n ELPD\n-15.0\n ELPD s.e.\n3.1\n LOOIC\n30.0\n LOOIC s.e.\n6.2\n WAIC\n29.8\n RMSE\n0.34\n \n \n \n\n\n\n\nAs noted in the modelsummary() documentation, model results are extracted using the parameters package. Users can pass additional arguments to modelsummary(), which will then push forward those arguments to the parameters::parameters function to change the results. For example, the parameters documentation for bayesian models shows that there is a centrality argument, which allows users to report the mean and standard deviation of the posterior distribution, instead of the median and MAD:\n\nget_estimates(mod, centrality = \"mean\")\n\n term estimate std.dev conf.level conf.low conf.high\n1 (Intercept) -2.2390156302 0.586604330 0.95 -3.408237583 -1.058208927\n2 hp 0.0007145923 0.001079799 0.95 -0.001363318 0.002914953\n3 drat 0.7062254928 0.138544254 0.95 0.429156919 0.985993244\n prior.distribution prior.location prior.scale group std.error statistic\n1 normal 0.40625 1.24747729 NA NA\n2 normal 0.00000 0.01819465 NA NA\n3 normal 0.00000 2.33313429 NA NA\n p.value\n1 NA\n2 NA\n3 NA\n\nmodelsummary(mod, statistic = \"std.dev\", centrality = \"mean\")\n\n\n\n\n\n \n \n \n (1)\n \n \n \n (Intercept)\n-2.239\n \n(0.587)\n hp\n0.001\n \n(0.001)\n drat\n0.706\n \n(0.139)\n Num.Obs.\n32\n R2\n0.501\n R2 Adj.\n0.433\n Log.Lik.\n-12.064\n ELPD\n-15.0\n ELPD s.e.\n3.1\n LOOIC\n30.0\n LOOIC s.e.\n6.2\n WAIC\n29.8\n RMSE\n0.34\n \n \n \n\n\n\n\nWe can also get additional test statistics using the test argument:\n\nget_estimates(mod, test = c(\"pd\", \"rope\"))\n\n term estimate mad conf.level conf.low conf.high\n1 (Intercept) -2.2475345916 0.555314871 0.95 -3.408237583 -1.058208927\n2 hp 0.0007033686 
0.001063287 0.95 -0.001363318 0.002914953\n3 drat 0.7069798264 0.133722428 0.95 0.429156919 0.985993244\n pd rope.percentage prior.distribution prior.location prior.scale group\n1 1.000 0 normal 0.40625 1.24747729 \n2 0.747 1 normal 0.00000 0.01819465 \n3 1.000 0 normal 0.00000 2.33313429 \n std.error statistic p.value\n1 NA NA NA\n2 NA NA NA\n3 NA NA NA" }, { "objectID": "vignettes/modelsummary.html#output-print-and-save", @@ -1145,7 +1145,7 @@ "href": "vignettes/modelsummary.html#rmarkdown-quarto-org-mode", "title": "Model Summaries", "section": "", - "text": "You can use modelsummary to insert tables into dynamic documents with knitr or Rmarkdown. This minimal .Rmd file can produce tables in PDF, HTML, or RTF documents:\n\nminimal.Rmd\n\nThis .Rmd file illustrates how to use table numbering and cross-references to produce PDF documents using bookdown:\n\ncross_references.Rmd\n\nThis .Rmd file shows how to customize tables in PDF and HTML files using gt and kableExtra functions:\n\nappearance.Rmd\n\n\n\n\nQuarto is an open source publishing system built on top of Pandoc. It was designed as a “successor” to Rmarkdown, and includes useful features for technical writing, such as built-in support for cross-references. modelsummary works automatically with Quarto. This is a minimal document with cross-references which should render automatically to PDF, HTML, and more:\n\n---\nformat: pdf\ntitle: Example\n---\n\n@tbl-mtcars shows that cars with high horse power get low miles per gallon.\n\n```{r}\n#| label: tbl-mtcars\n#| tbl-cap: \"Horse Powers vs. Miles per Gallon\"\nlibrary(modelsummary)\nmod <- lm(mpg ~ hp, mtcars)\nmodelsummary(mod)\n:::\n\n### Emacs Org-Mode\n\nYou can use `modelsummary` to insert tables into Emacs Org-Mode documents, which can be exported to a variety of formats, including HTML and PDF (via LaTeX). As with anything Emacs-related, there are many ways to achieve the outcomes you want. 
Here is one example of an Org-Mode document which can automatically export tables to HTML and PDF without manual tweaks:\n\n#+PROPERTY: header-args:R :var orgbackend=(prin1-to-string org-export-current-backend)\n#+MACRO: Rtable (eval (concat \"#+header: :results output \" (prin1-to-string org-export-current-backend)))\n\n{{{Rtable}}}\n#+BEGIN_SRC R :exports both\nlibrary(modelsummary)\noptions(modelsummary_factory_default = orgbackend)\n\nmod = lm(hp ~ mpg, data = mtcars)\n\nmodelsummary(mod)\n#+END_SRC\n\nThe first line tells Org-mode to assign a variable called `orgbackend`. This variable will be accessible from the `R` session, and will be equal to \"html\" or \"latex\", depending on the export format.\n\nThe second line creates an Org macro which we will use to automatically add useful information to the header of source blocks. For instance, when we export to HTML, the macro will expand to `:results output html`. This tells Org-Mode to insert the last printed output from the `R` session, and to treat it as raw HTML. \n\nThe `{{{Rtable}}}` call expands the macro to add information to the header of the block that follows.\n\n`#+BEGIN_SRC R :exports both` says that we want to print both the original code and the output (`:exports results` would omit the code, for example).\n\nFinally, `options(modelsummary_factory_default=orgbackend)` uses the variable we defined to set the default output format. That way, we don't have to use the `output` argument every time.\n\nOne potential issue to keep in mind is that the code above extracts the printout from the `R` console. However, when we customize tables with `kableExtra` or `gt` functions, those functions do not always return printed raw HTML or LaTeX code. Sometimes, it can be necessary to add a call to `cat` at the end of a table customization pipeline. 
For example:\n\n{{{Rtable}}}\n#+BEGIN_SRC R :exports both\nlibrary(modelsummary)\nlibrary(kableExtra)\n\nmod = lm(hp ~ mpg, data = mtcars)\n\nmodelsummary(mod, output = orgbackend) %>%\n row_spec(1, background = \"pink\") %>%\n cat()\n#+END_SRC\n\n## Global options\n\nUsers can change the default behavior of `modelsummary` by setting global options.\n\nOmit the note at the bottom of the table with significance threshold:\n\n::: {.cell}\n\n```{.r .cell-code}\noptions(\"modelsummary_stars_note\" = FALSE)\n```\n:::\n\nChange the default output format:\n\n::: {.cell}\n\n```{.r .cell-code}\noptions(modelsummary_factory_default = \"latex\")\noptions(modelsummary_factory_default = \"gt\")\n```\n:::\n\nChange the backend packages that `modelsummary` uses to create tables in different output formats:\n\n::: {.cell}\n\n```{.r .cell-code}\noptions(modelsummary_factory_html = 'kableExtra')\noptions(modelsummary_factory_latex = 'flextable')\noptions(modelsummary_factory_word = 'huxtable')\noptions(modelsummary_factory_png = 'gt')\n```\n:::\n\nChange the packages that `modelsummary` uses to extract information from models:\n\n::: {.cell}\n\n```{.r .cell-code}\n## tidymodels: broom \noptions(modelsummary_get = \"broom\")\n\n## easystats: performance + parameters\noptions(modelsummary_get = \"easystats\")\n```\n:::\n\n[The `appearance` vignette](https://modelsummary.com/articles/appearance.html#themes) shows how to set \"themes\" for your tables using the `modelsummary_theme_gt`, `modelsummary_theme_kableExtra`, `modelsummary_theme_flextable` and `modelsummary_theme_huxtable` global options. For example:\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(gt)\n\n## The ... ellipsis is required!\ncustom_theme <- function(x, ...) 
{\n x %>% gt::opt_row_striping(row_striping = TRUE)\n}\noptions(\"modelsummary_theme_gt\" = custom_theme)\n\nmod <- lm(mpg ~ hp + drat, mtcars)\nmodelsummary(mod, output = \"gt\")\n```\n:::\n\n## Case studies\n\n\n### Standardization\n\nIn some cases, it is useful to standardize coefficients before reporting them. `modelsummary` extracts coefficients from model objects using the `parameters` package, and that package offers several options for standardization: https://easystats.github.io/parameters/reference/model_parameters.default.html\n\nWe can pass the `standardize` argument directly to `modelsummary` or `modelplot`, and that argument will be forwarded to `parameters`. For example to refit the model on standardized data and plot the results, we can do:\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(modelsummary)\nmod <- lm(mpg ~ hp + am, data = mtcars)\n\nmodelplot(mod, standardize = \"refit\")\n```\n\n::: {.cell-output .cell-output-stderr}\n```\nWarning in ggplot2::geom_pointrange(ggplot2::aes(y = term, x = estimate, :\nIgnoring unknown parameters: `standardize`\n```\n:::\n\n::: {.cell-output-display}\n![](modelsummary_files/figure-html/unnamed-chunk-60-1.png){width=672}\n:::\n:::\n\nCompare to the unstandardized plot:\n\n::: {.cell}\n\n```{.r .cell-code}\nmodelplot(mod)\n```\n\n::: {.cell-output-display}\n![](modelsummary_files/figure-html/unnamed-chunk-61-1.png){width=672}\n:::\n:::\n\n### Subgroup estimation with `nest_by`\n\nSometimes, it is useful to estimate multiple regression models on subsets of the data. To do this efficiently, we can use the `nest_by` function from the `dplyr` package. 
Then, estimate the models with `lm`, extract them and name them with `pull`, and finally summarize them with `modelsummary`:\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(tidyverse)\n\nmtcars %>%\n nest_by(cyl) %>%\n mutate(models = list(lm(mpg ~ hp, data))) %>%\n pull(models, name = cyl) %>%\n modelsummary\n```\n\n::: {.cell-output-display}\n\n```{=html}\n<div id=\"wkbhfqnlym\" style=\"padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;\">\n<style>#wkbhfqnlym table {\n font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';\n -webkit-font-smoothing: antialiased;\n -moz-osx-font-smoothing: grayscale;\n}\n\n#wkbhfqnlym thead, #wkbhfqnlym tbody, #wkbhfqnlym tfoot, #wkbhfqnlym tr, #wkbhfqnlym td, #wkbhfqnlym th {\n border-style: none;\n}\n\n#wkbhfqnlym p {\n margin: 0;\n padding: 0;\n}\n\n#wkbhfqnlym .gt_table {\n display: table;\n border-collapse: collapse;\n line-height: normal;\n margin-left: auto;\n margin-right: auto;\n color: #333333;\n font-size: 16px;\n font-weight: normal;\n font-style: normal;\n background-color: #FFFFFF;\n width: auto;\n border-top-style: solid;\n border-top-width: 2px;\n border-top-color: #A8A8A8;\n border-right-style: none;\n border-right-width: 2px;\n border-right-color: #D3D3D3;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #A8A8A8;\n border-left-style: none;\n border-left-width: 2px;\n border-left-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_caption {\n padding-top: 4px;\n padding-bottom: 4px;\n}\n\n#wkbhfqnlym .gt_title {\n color: #333333;\n font-size: 125%;\n font-weight: initial;\n padding-top: 4px;\n padding-bottom: 4px;\n padding-left: 5px;\n padding-right: 5px;\n border-bottom-color: #FFFFFF;\n border-bottom-width: 0;\n}\n\n#wkbhfqnlym .gt_subtitle {\n color: #333333;\n font-size: 85%;\n font-weight: initial;\n padding-top: 3px;\n 
padding-bottom: 5px;\n padding-left: 5px;\n padding-right: 5px;\n border-top-color: #FFFFFF;\n border-top-width: 0;\n}\n\n#wkbhfqnlym .gt_heading {\n background-color: #FFFFFF;\n text-align: center;\n border-bottom-color: #FFFFFF;\n border-left-style: none;\n border-left-width: 1px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 1px;\n border-right-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_bottom_border {\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_col_headings {\n border-top-style: solid;\n border-top-width: 2px;\n border-top-color: #D3D3D3;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n border-left-style: none;\n border-left-width: 1px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 1px;\n border-right-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_col_heading {\n color: #333333;\n background-color: #FFFFFF;\n font-size: 100%;\n font-weight: normal;\n text-transform: inherit;\n border-left-style: none;\n border-left-width: 1px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 1px;\n border-right-color: #D3D3D3;\n vertical-align: bottom;\n padding-top: 5px;\n padding-bottom: 6px;\n padding-left: 5px;\n padding-right: 5px;\n overflow-x: hidden;\n}\n\n#wkbhfqnlym .gt_column_spanner_outer {\n color: #333333;\n background-color: #FFFFFF;\n font-size: 100%;\n font-weight: normal;\n text-transform: inherit;\n padding-top: 0;\n padding-bottom: 0;\n padding-left: 4px;\n padding-right: 4px;\n}\n\n#wkbhfqnlym .gt_column_spanner_outer:first-child {\n padding-left: 0;\n}\n\n#wkbhfqnlym .gt_column_spanner_outer:last-child {\n padding-right: 0;\n}\n\n#wkbhfqnlym .gt_column_spanner {\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n vertical-align: bottom;\n padding-top: 5px;\n padding-bottom: 5px;\n overflow-x: hidden;\n display: 
inline-block;\n width: 100%;\n}\n\n#wkbhfqnlym .gt_spanner_row {\n border-bottom-style: hidden;\n}\n\n#wkbhfqnlym .gt_group_heading {\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n color: #333333;\n background-color: #FFFFFF;\n font-size: 100%;\n font-weight: initial;\n text-transform: inherit;\n border-top-style: solid;\n border-top-width: 2px;\n border-top-color: #D3D3D3;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n border-left-style: none;\n border-left-width: 1px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 1px;\n border-right-color: #D3D3D3;\n vertical-align: middle;\n text-align: left;\n}\n\n#wkbhfqnlym .gt_empty_group_heading {\n padding: 0.5px;\n color: #333333;\n background-color: #FFFFFF;\n font-size: 100%;\n font-weight: initial;\n border-top-style: solid;\n border-top-width: 2px;\n border-top-color: #D3D3D3;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n vertical-align: middle;\n}\n\n#wkbhfqnlym .gt_from_md > :first-child {\n margin-top: 0;\n}\n\n#wkbhfqnlym .gt_from_md > :last-child {\n margin-bottom: 0;\n}\n\n#wkbhfqnlym .gt_row {\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n margin: 10px;\n border-top-style: solid;\n border-top-width: 1px;\n border-top-color: #D3D3D3;\n border-left-style: none;\n border-left-width: 1px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 1px;\n border-right-color: #D3D3D3;\n vertical-align: middle;\n overflow-x: hidden;\n}\n\n#wkbhfqnlym .gt_stub {\n color: #333333;\n background-color: #FFFFFF;\n font-size: 100%;\n font-weight: initial;\n text-transform: inherit;\n border-right-style: solid;\n border-right-width: 2px;\n border-right-color: #D3D3D3;\n padding-left: 5px;\n padding-right: 5px;\n}\n\n#wkbhfqnlym .gt_stub_row_group {\n color: #333333;\n background-color: #FFFFFF;\n 
font-size: 100%;\n font-weight: initial;\n text-transform: inherit;\n border-right-style: solid;\n border-right-width: 2px;\n border-right-color: #D3D3D3;\n padding-left: 5px;\n padding-right: 5px;\n vertical-align: top;\n}\n\n#wkbhfqnlym .gt_row_group_first td {\n border-top-width: 2px;\n}\n\n#wkbhfqnlym .gt_row_group_first th {\n border-top-width: 2px;\n}\n\n#wkbhfqnlym .gt_summary_row {\n color: #333333;\n background-color: #FFFFFF;\n text-transform: inherit;\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n}\n\n#wkbhfqnlym .gt_first_summary_row {\n border-top-style: solid;\n border-top-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_first_summary_row.thick {\n border-top-width: 2px;\n}\n\n#wkbhfqnlym .gt_last_summary_row {\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_grand_summary_row {\n color: #333333;\n background-color: #FFFFFF;\n text-transform: inherit;\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n}\n\n#wkbhfqnlym .gt_first_grand_summary_row {\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n border-top-style: double;\n border-top-width: 6px;\n border-top-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_last_grand_summary_row_top {\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n border-bottom-style: double;\n border-bottom-width: 6px;\n border-bottom-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_striped {\n background-color: rgba(128, 128, 128, 0.05);\n}\n\n#wkbhfqnlym .gt_table_body {\n border-top-style: solid;\n border-top-width: 2px;\n border-top-color: #D3D3D3;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_footnotes {\n color: #333333;\n background-color: #FFFFFF;\n border-bottom-style: none;\n 
border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n border-left-style: none;\n border-left-width: 2px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 2px;\n border-right-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_footnote {\n margin: 0px;\n font-size: 90%;\n padding-top: 4px;\n padding-bottom: 4px;\n padding-left: 5px;\n padding-right: 5px;\n}\n\n#wkbhfqnlym .gt_sourcenotes {\n color: #333333;\n background-color: #FFFFFF;\n border-bottom-style: none;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n border-left-style: none;\n border-left-width: 2px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 2px;\n border-right-color: #D3D3D3;\n}\n\n#wkbhfqnlym .gt_sourcenote {\n font-size: 90%;\n padding-top: 4px;\n padding-bottom: 4px;\n padding-left: 5px;\n padding-right: 5px;\n}\n\n#wkbhfqnlym .gt_left {\n text-align: left;\n}\n\n#wkbhfqnlym .gt_center {\n text-align: center;\n}\n\n#wkbhfqnlym .gt_right {\n text-align: right;\n font-variant-numeric: tabular-nums;\n}\n\n#wkbhfqnlym .gt_font_normal {\n font-weight: normal;\n}\n\n#wkbhfqnlym .gt_font_bold {\n font-weight: bold;\n}\n\n#wkbhfqnlym .gt_font_italic {\n font-style: italic;\n}\n\n#wkbhfqnlym .gt_super {\n font-size: 65%;\n}\n\n#wkbhfqnlym .gt_footnote_marks {\n font-size: 75%;\n vertical-align: 0.4em;\n position: initial;\n}\n\n#wkbhfqnlym .gt_asterisk {\n font-size: 100%;\n vertical-align: 0;\n}\n\n#wkbhfqnlym .gt_indent_1 {\n text-indent: 5px;\n}\n\n#wkbhfqnlym .gt_indent_2 {\n text-indent: 10px;\n}\n\n#wkbhfqnlym .gt_indent_3 {\n text-indent: 15px;\n}\n\n#wkbhfqnlym .gt_indent_4 {\n text-indent: 20px;\n}\n\n#wkbhfqnlym .gt_indent_5 {\n text-indent: 25px;\n}\n</style>\n<table class=\"gt_table\" data-quarto-disable-processing=\"false\" data-quarto-bootstrap=\"false\">\n <thead>\n <tr class=\"gt_col_headings\">\n <th class=\"gt_col_heading gt_columns_bottom_border gt_left\" rowspan=\"1\" colspan=\"1\" scope=\"col\" id=\" \"> </th>\n <th 
class=\"gt_col_heading gt_columns_bottom_border gt_center\" rowspan=\"1\" colspan=\"1\" scope=\"col\" id=\"4\">4</th>\n <th class=\"gt_col_heading gt_columns_bottom_border gt_center\" rowspan=\"1\" colspan=\"1\" scope=\"col\" id=\"6\">6</th>\n <th class=\"gt_col_heading gt_columns_bottom_border gt_center\" rowspan=\"1\" colspan=\"1\" scope=\"col\" id=\"8\">8</th>\n </tr>\n </thead>\n <tbody class=\"gt_table_body\">\n <tr><td headers=\" \" class=\"gt_row gt_left\">(Intercept)</td>\n<td headers=\"4\" class=\"gt_row gt_center\">35.983</td>\n<td headers=\"6\" class=\"gt_row gt_center\">20.674</td>\n<td headers=\"8\" class=\"gt_row gt_center\">18.080</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\"></td>\n<td headers=\"4\" class=\"gt_row gt_center\">(5.201)</td>\n<td headers=\"6\" class=\"gt_row gt_center\">(3.304)</td>\n<td headers=\"8\" class=\"gt_row gt_center\">(2.988)</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">hp</td>\n<td headers=\"4\" class=\"gt_row gt_center\">-0.113</td>\n<td headers=\"6\" class=\"gt_row gt_center\">-0.008</td>\n<td headers=\"8\" class=\"gt_row gt_center\">-0.014</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\" style=\"border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #000000;\"></td>\n<td headers=\"4\" class=\"gt_row gt_center\" style=\"border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #000000;\">(0.061)</td>\n<td headers=\"6\" class=\"gt_row gt_center\" style=\"border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #000000;\">(0.027)</td>\n<td headers=\"8\" class=\"gt_row gt_center\" style=\"border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #000000;\">(0.014)</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">Num.Obs.</td>\n<td headers=\"4\" class=\"gt_row gt_center\">11</td>\n<td headers=\"6\" class=\"gt_row gt_center\">7</td>\n<td headers=\"8\" class=\"gt_row gt_center\">14</td></tr>\n <tr><td headers=\" 
\" class=\"gt_row gt_left\">R2</td>\n<td headers=\"4\" class=\"gt_row gt_center\">0.274</td>\n<td headers=\"6\" class=\"gt_row gt_center\">0.016</td>\n<td headers=\"8\" class=\"gt_row gt_center\">0.080</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">R2 Adj.</td>\n<td headers=\"4\" class=\"gt_row gt_center\">0.193</td>\n<td headers=\"6\" class=\"gt_row gt_center\">-0.181</td>\n<td headers=\"8\" class=\"gt_row gt_center\">0.004</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">AIC</td>\n<td headers=\"4\" class=\"gt_row gt_center\">65.8</td>\n<td headers=\"6\" class=\"gt_row gt_center\">29.9</td>\n<td headers=\"8\" class=\"gt_row gt_center\">69.8</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">BIC</td>\n<td headers=\"4\" class=\"gt_row gt_center\">67.0</td>\n<td headers=\"6\" class=\"gt_row gt_center\">29.7</td>\n<td headers=\"8\" class=\"gt_row gt_center\">71.8</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">Log.Lik.</td>\n<td headers=\"4\" class=\"gt_row gt_center\">-29.891</td>\n<td headers=\"6\" class=\"gt_row gt_center\">-11.954</td>\n<td headers=\"8\" class=\"gt_row gt_center\">-31.920</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">RMSE</td>\n<td headers=\"4\" class=\"gt_row gt_center\">3.66</td>\n<td headers=\"6\" class=\"gt_row gt_center\">1.33</td>\n<td headers=\"8\" class=\"gt_row gt_center\">2.37</td></tr>\n </tbody>\n \n \n</table>\n</div>\n```\n\n:::\n:::\n\n### Statistics in separate columns instead of one over the other\n\nIn somes cases, you may want to display statistics in separate columns instead of one over the other. It is easy to achieve this outcome by using the `estimate` argument. This argument accepts a vector of values, one for each of the models we are trying to summarize. If we want to include estimates and standard errors in separate columns, all we need to do is repeat a model, but request different statistics. 
For example,\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(modelsummary)\nlibrary(kableExtra)\n\nmod1 <- lm(mpg ~ hp, mtcars)\nmod2 <- lm(mpg ~ hp + drat, mtcars)\n\nmodels <- list(\n \"Coef.\" = mod1,\n \"Std.Error\" = mod1,\n \"Coef.\" = mod2,\n \"Std.Error\" = mod2)\n\nmodelsummary(models,\n estimate = c(\"estimate\", \"std.error\", \"estimate\", \"std.error\"),\n statistic = NULL,\n gof_omit = \".*\",\n output = \"kableExtra\") %>%\n add_header_above(c(\" \" = 1, \"Model A\" = 2, \"Model B\" = 2))\n```\n\n::: {.cell-output-display}\n\n`````{=html}\n<table class=\"table\" style=\"width: auto !important; margin-left: auto; margin-right: auto;\">\n <thead>\n<tr>\n<th style=\"empty-cells: hide;border-bottom:hidden;\" colspan=\"1\"></th>\n<th style=\"border-bottom:hidden;padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; \" colspan=\"2\"><div style=\"border-bottom: 1px solid #ddd; padding-bottom: 5px; \">Model A</div></th>\n<th style=\"border-bottom:hidden;padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; \" colspan=\"2\"><div style=\"border-bottom: 1px solid #ddd; padding-bottom: 5px; \">Model B</div></th>\n</tr>\n <tr>\n <th style=\"text-align:left;\"> </th>\n <th style=\"text-align:center;\"> Coef. </th>\n <th style=\"text-align:center;\"> Std.Error </th>\n <th style=\"text-align:center;\"> Coef. 
</th>\n <th style=\"text-align:center;\"> Std.Error </th>\n </tr>\n </thead>\n<tbody>\n <tr>\n <td style=\"text-align:left;\"> (Intercept) </td>\n <td style=\"text-align:center;\"> 30.099 </td>\n <td style=\"text-align:center;\"> 1.634 </td>\n <td style=\"text-align:center;\"> 10.790 </td>\n <td style=\"text-align:center;\"> 5.078 </td>\n </tr>\n <tr>\n <td style=\"text-align:left;\"> hp </td>\n <td style=\"text-align:center;\"> −0.068 </td>\n <td style=\"text-align:center;\"> 0.010 </td>\n <td style=\"text-align:center;\"> −0.052 </td>\n <td style=\"text-align:center;\"> 0.009 </td>\n </tr>\n <tr>\n <td style=\"text-align:left;\"> drat </td>\n <td style=\"text-align:center;\"> </td>\n <td style=\"text-align:center;\"> </td>\n <td style=\"text-align:center;\"> 4.698 </td>\n <td style=\"text-align:center;\"> 1.192 </td>\n </tr>\n</tbody>\n</table>\n\n\n:::\nThis can be automated using a simple function:\n\nside_by_side <- function(models, estimates, ...) {\n models <- rep(models, each = length(estimates))\n estimates <- rep(estimates, times = 2)\n names(models) <- names(estimates)\n modelsummary(models = models, estimate = estimates,\n statistic = NULL, gof_omit = \".*\", ...)\n}\n\nmodels = list(\n lm(mpg ~ hp, mtcars),\n lm(mpg ~ hp + drat, mtcars))\n\nestimates <- c(\"Coef.\" = \"estimate\", \"Std.Error\" = \"std.error\")\n\nside_by_side(models, estimates = estimates)\n\n\n\n\n\n \n \n \n Coef.\n Std.Error\n Coef. \n Std.Error \n \n \n \n (Intercept)\n30.099\n1.634\n10.790\n5.078\n hp\n-0.068\n0.010\n-0.052\n0.009\n drat\n\n\n4.698\n1.192\n \n \n \n\n\n\n\n\n\n\nUsers often want to use estimates or standard errors that have been obtained using a custom strategy. 
To do this in an automated and replicable way, we can use the tidy_custom strategy described above in the “Customizing Existing Models” section.\nFor example, we can use the modelr package to draw 500 resamples of a dataset, and compute bootstrap standard errors by taking the standard deviation of the estimates computed across those resampled datasets. To do this, we define a tidy_custom.lm function that automatically bootstraps any lm model supplied to modelsummary and replaces the values in the table.\nNote that tidy_custom.lm returns a data.frame with 3 columns: term, estimate, and std.error:\n\nlibrary(\"modelsummary\")\nlibrary(\"broom\")\nlibrary(\"tidyverse\")\nlibrary(\"modelr\")\n\ntidy_custom.lm <- function(x, ...) {\n # extract data from the model\n model.frame(x) %>%\n # draw 500 bootstrap resamples\n modelr::bootstrap(n = 500) %>%\n # estimate the model 500 times\n mutate(results = map(strap, ~ update(x, data = .))) %>%\n # extract results using `broom::tidy`\n mutate(results = map(results, tidy)) %>%\n # unnest and summarize\n unnest(results) %>%\n group_by(term) %>%\n summarize(std.error = sd(estimate),\n estimate = mean(estimate))\n}\n\nmod = list(\n lm(hp ~ mpg, mtcars) ,\n lm(hp ~ mpg + drat, mtcars))\n\nmodelsummary(mod)\n\n\n\n\n\n \n \n \n (1)\n (2)\n \n \n \n (Intercept)\n328.764\n284.150\n \n(31.484)\n(42.113)\n mpg\n-9.081\n-9.994\n \n(1.420)\n(2.411)\n drat\n\n17.396\n \n\n(20.667)\n Num.Obs.\n32\n32\n R2\n0.602\n0.614\n R2 Adj.\n0.589\n0.588\n AIC\n336.9\n337.9\n BIC\n341.3\n343.7\n Log.Lik.\n-165.428\n-164.940\n F\n45.460\n23.100\n RMSE\n42.55\n41.91\n \n \n \n\n\n\n\n\n\n\nOne common use-case for glance_custom is to include additional goodness-of-fit statistics. 
For example, in an instrumental variable estimation computed by the fixest package, we may want to include an IV-Wald statistic for the first-stage regression of each endogenous regressor:\n\nlibrary(fixest)\nlibrary(tidyverse)\n\n## create a toy dataset\nbase <- iris\nnames(base) <- c(\"y\", \"x1\", \"x_endo_1\", \"x_inst_1\", \"fe\")\nbase$x_inst_2 <- 0.2 * base$y + 0.2 * base$x_endo_1 + rnorm(150, sd = 0.5)\nbase$x_endo_2 <- 0.2 * base$y - 0.2 * base$x_inst_1 + rnorm(150, sd = 0.5)\n\n## estimate an instrumental variable model\nmod <- feols(y ~ x1 | fe | x_endo_1 + x_endo_2 ~ x_inst_1 + x_inst_2, base)\n\n## custom extractor function returns a one-row data.frame (or tibble)\nglance_custom.fixest <- function(x) {\n tibble(\n \"Wald (x_endo_1)\" = fitstat(x, \"ivwald\")[[1]]$stat,\n \"Wald (x_endo_2)\" = fitstat(x, \"ivwald\")[[2]]$stat\n )\n}\n\n## draw table\nmodelsummary(mod)\n\n\n\n\n\n \n \n \n (1)\n \n \n \n fit_x_endo_1\n0.772\n \n(1.947)\n fit_x_endo_2\n-5.721\n \n(37.320)\n x1\n0.646\n \n(0.459)\n Num.Obs.\n150\n R2\n-11.399\n R2 Adj.\n-11.830\n R2 Within\n-31.519\n R2 Within Adj.\n-32.197\n AIC\n757.7\n BIC\n775.8\n RMSE\n2.91\n Std.Errors\nby: fe\n FE: fe\nX\n Wald (x_endo_1)\n15.65305145663\n Wald (x_endo_2)\n0.0222671859109197\n \n \n \n\n\n\n\n\nrm(\"glance_custom.fixest\")\n\n\n\n\nmodelsummary can pool and display analyses on several datasets imputed using the mice or Amelia packages. 
This code illustrates how:\n\nlibrary(mice)\nlibrary(Amelia)\nlibrary(modelsummary)\n\n## Download data from `Rdatasets`\nurl <- 'https://vincentarelbundock.github.io/Rdatasets/csv/HistData/Guerry.csv'\ndat <- read.csv(url)[, c('Clergy', 'Commerce', 'Literacy')]\n\n## Insert missing values\ndat$Clergy[sample(1:nrow(dat), 10)] <- NA\ndat$Commerce[sample(1:nrow(dat), 10)] <- NA\ndat$Literacy[sample(1:nrow(dat), 10)] <- NA\n\n## Impute with `mice` and `Amelia`\ndat_mice <- mice(dat, m = 5, printFlag = FALSE)\ndat_amelia <- amelia(dat, m = 5, p2s = 0)$imputations\n\n## Estimate models\nmod <- list()\nmod[['Listwise deletion']] <- lm(Clergy ~ Literacy + Commerce, dat)\nmod[['Mice']] <- with(dat_mice, lm(Clergy ~ Literacy + Commerce)) \nmod[['Amelia']] <- lapply(dat_amelia, function(x) lm(Clergy ~ Literacy + Commerce, x))\n\n## Pool results\nmod[['Mice']] <- mice::pool(mod[['Mice']])\nmod[['Amelia']] <- mice::pool(mod[['Amelia']])\n\n## Summarize\nmodelsummary(mod)\n\n\n\n\n\n \n \n \n Listwise deletion\n Mice\n Amelia\n \n \n \n (Intercept)\n63.166\n56.037\n68.298\n \n(15.624)\n(13.336)\n(12.735)\n Literacy\n-0.303\n-0.215\n-0.406\n \n(0.250)\n(0.207)\n(0.206)\n Commerce\n-0.136\n-0.082\n-0.184\n \n(0.164)\n(0.158)\n(0.140)\n Num.Obs.\n59\n86\n86\n Num.Imp.\n\n5\n5\n R2\n0.026\n0.018\n0.054\n R2 Adj.\n-0.009\n\n0.030\n AIC\n549.2\n\n\n BIC\n557.5\n\n\n Log.Lik.\n-270.576\n\n\n RMSE\n23.74" + "text": "You can use modelsummary to insert tables into dynamic documents with knitr or Rmarkdown. This minimal .Rmd file can produce tables in PDF, HTML, or RTF documents:\n\nminimal.Rmd\n\nThis .Rmd file illustrates how to use table numbering and cross-references to produce PDF documents using bookdown:\n\ncross_references.Rmd\n\nThis .Rmd file shows how to customize tables in PDF and HTML files using gt and kableExtra functions:\n\nappearance.Rmd\n\n\n\n\nQuarto is an open-source publishing system built on top of Pandoc. 
It was designed as a “successor” to Rmarkdown, and includes useful features for technical writing, such as built-in support for cross-references. modelsummary works automatically with Quarto. This is a minimal document with cross-references which should render automatically to PDF, HTML, and more:\n\n---\nformat: pdf\ntitle: Example\n---\n\n@tbl-mtcars shows that cars with high horse power get low miles per gallon.\n\n```{r}\n#| label: tbl-mtcars\n#| tbl-cap: \"Horse Powers vs. Miles per Gallon\"\nlibrary(modelsummary)\nmod <- lm(mpg ~ hp, mtcars)\nmodelsummary(mod)\n```\n:::\n\n### Emacs Org-Mode\n\nYou can use `modelsummary` to insert tables into Emacs Org-Mode documents, which can be exported to a variety of formats, including HTML and PDF (via LaTeX). As with anything Emacs-related, there are many ways to achieve the outcomes you want. Here is one example of an Org-Mode document which can automatically export tables to HTML and PDF without manual tweaks:\n\n#+PROPERTY: header-args:R :var orgbackend=(prin1-to-string org-export-current-backend)\n#+MACRO: Rtable (eval (concat \"#+header: :results output \" (prin1-to-string org-export-current-backend)))\n\n{{{Rtable}}}\n#+BEGIN_SRC R :exports both\nlibrary(modelsummary)\noptions(modelsummary_factory_default = orgbackend)\n\nmod = lm(hp ~ mpg, data = mtcars)\nmodelsummary(mod)\n#+END_SRC\n\nThe first line tells Org-Mode to assign a variable called `orgbackend`. This variable will be accessible by the `R` session, and will be equal to \"html\" or \"latex\", depending on the export format.\n\nThe second line creates an Org macro which we will use to automatically add useful information to the header of source blocks. For instance, when we export to HTML, the macro will expand to `:results output html`. This tells Org-Mode to insert the last printed output from the `R` session, and to treat it as raw HTML. 
\n\nThe `{{{Rtable}}}` call expands the macro to add information to the header of the block that follows.\n\n`#+BEGIN_SRC R :exports both` says that we want to print both the original code and the output (`:exports results` would omit the code, for example).\n\nFinally, `options(modelsummary_factory_default=orgbackend)` uses the variable we defined to set the default output format. That way, we don't have to use the `output` argument every time.\n\nOne potential issue to keep in mind is that the code above extracts the printout from the `R` console. However, when we customize tables with `kableExtra` or `gt` functions, those functions do not always print raw HTML or LaTeX code. Sometimes, it can be necessary to add a call to `cat` at the end of a table customization pipeline. For example:\n\n{{{Rtable}}}\n#+BEGIN_SRC R :exports both\nlibrary(modelsummary)\nlibrary(kableExtra)\n\nmod = lm(hp ~ mpg, data = mtcars)\n\nmodelsummary(mod, output = orgbackend) %>%\n row_spec(1, background = \"pink\") %>%\n cat()\n#+END_SRC\n\n## Global options\n\nUsers can change the default behavior of `modelsummary` by setting global options.\n\nOmit the note at the bottom of the table that reports significance thresholds:\n\n::: {.cell}\n\n```{.r .cell-code}\noptions(\"modelsummary_stars_note\" = FALSE)\n```\n:::\n\nChange the default output format:\n\n::: {.cell}\n\n```{.r .cell-code}\noptions(modelsummary_factory_default = \"latex\")\noptions(modelsummary_factory_default = \"gt\")\n```\n:::\n\nChange the backend packages that `modelsummary` uses to create tables in different output formats:\n\n::: {.cell}\n\n```{.r .cell-code}\noptions(modelsummary_factory_html = 'kableExtra')\noptions(modelsummary_factory_latex = 'flextable')\noptions(modelsummary_factory_word = 'huxtable')\noptions(modelsummary_factory_png = 'gt')\n```\n:::\n\nChange the packages that `modelsummary` uses to extract information from models:\n\n::: {.cell}\n\n```{.r .cell-code}\n## tidymodels: broom \noptions(modelsummary_get = 
\"broom\")\n\n## easystats: performance + parameters\noptions(modelsummary_get = \"easystats\")\n```\n:::\n\n[The `appearance` vignette](https://modelsummary.com/articles/appearance.html#themes) shows how to set \"themes\" for your tables using the `modelsummary_theme_gt`, `modelsummary_theme_kableExtra`, `modelsummary_theme_flextable` and `modelsummary_theme_huxtable` global options. For example:\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(gt)\n\n## The ... ellipsis is required!\ncustom_theme <- function(x, ...) {\n x %>% gt::opt_row_striping(row_striping = TRUE)\n}\noptions(\"modelsummary_theme_gt\" = custom_theme)\n\nmod <- lm(mpg ~ hp + drat, mtcars)\nmodelsummary(mod, output = \"gt\")\n```\n:::\n\n## Case studies\n\n\n### Standardization\n\nIn some cases, it is useful to standardize coefficients before reporting them. `modelsummary` extracts coefficients from model objects using the `parameters` package, and that package offers several options for standardization: https://easystats.github.io/parameters/reference/model_parameters.default.html\n\nWe can pass the `standardize` argument directly to `modelsummary` or `modelplot`, and that argument will be forwarded to `parameters`. 
For example to refit the model on standardized data and plot the results, we can do:\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(modelsummary)\nmod <- lm(mpg ~ hp + am, data = mtcars)\n\nmodelplot(mod, standardize = \"refit\")\n```\n\n::: {.cell-output .cell-output-stderr}\n```\nWarning in ggplot2::geom_pointrange(ggplot2::aes(y = term, x = estimate, :\nIgnoring unknown parameters: `standardize`\n```\n:::\n\n::: {.cell-output-display}\n![](modelsummary_files/figure-html/unnamed-chunk-60-1.png){width=672}\n:::\n:::\n\nCompare to the unstandardized plot:\n\n::: {.cell}\n\n```{.r .cell-code}\nmodelplot(mod)\n```\n\n::: {.cell-output-display}\n![](modelsummary_files/figure-html/unnamed-chunk-61-1.png){width=672}\n:::\n:::\n\n### Subgroup estimation with `nest_by`\n\nSometimes, it is useful to estimate multiple regression models on subsets of the data. To do this efficiently, we can use the `nest_by` function from the `dplyr` package. Then, estimate the models with `lm`, extract them and name them with `pull`, and finally summarize them with `modelsummary`:\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(tidyverse)\n\nmtcars %>%\n nest_by(cyl) %>%\n mutate(models = list(lm(mpg ~ hp, data))) %>%\n pull(models, name = cyl) %>%\n modelsummary\n```\n\n::: {.cell-output-display}\n\n```{=html}\n<div id=\"msyjczffsg\" style=\"padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;\">\n<style>#msyjczffsg table {\n font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji';\n -webkit-font-smoothing: antialiased;\n -moz-osx-font-smoothing: grayscale;\n}\n\n#msyjczffsg thead, #msyjczffsg tbody, #msyjczffsg tfoot, #msyjczffsg tr, #msyjczffsg td, #msyjczffsg th {\n border-style: none;\n}\n\n#msyjczffsg p {\n margin: 0;\n padding: 0;\n}\n\n#msyjczffsg .gt_table {\n display: table;\n border-collapse: collapse;\n 
line-height: normal;\n margin-left: auto;\n margin-right: auto;\n color: #333333;\n font-size: 16px;\n font-weight: normal;\n font-style: normal;\n background-color: #FFFFFF;\n width: auto;\n border-top-style: solid;\n border-top-width: 2px;\n border-top-color: #A8A8A8;\n border-right-style: none;\n border-right-width: 2px;\n border-right-color: #D3D3D3;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #A8A8A8;\n border-left-style: none;\n border-left-width: 2px;\n border-left-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_caption {\n padding-top: 4px;\n padding-bottom: 4px;\n}\n\n#msyjczffsg .gt_title {\n color: #333333;\n font-size: 125%;\n font-weight: initial;\n padding-top: 4px;\n padding-bottom: 4px;\n padding-left: 5px;\n padding-right: 5px;\n border-bottom-color: #FFFFFF;\n border-bottom-width: 0;\n}\n\n#msyjczffsg .gt_subtitle {\n color: #333333;\n font-size: 85%;\n font-weight: initial;\n padding-top: 3px;\n padding-bottom: 5px;\n padding-left: 5px;\n padding-right: 5px;\n border-top-color: #FFFFFF;\n border-top-width: 0;\n}\n\n#msyjczffsg .gt_heading {\n background-color: #FFFFFF;\n text-align: center;\n border-bottom-color: #FFFFFF;\n border-left-style: none;\n border-left-width: 1px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 1px;\n border-right-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_bottom_border {\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_col_headings {\n border-top-style: solid;\n border-top-width: 2px;\n border-top-color: #D3D3D3;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n border-left-style: none;\n border-left-width: 1px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 1px;\n border-right-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_col_heading {\n color: #333333;\n background-color: #FFFFFF;\n font-size: 100%;\n font-weight: normal;\n 
text-transform: inherit;\n border-left-style: none;\n border-left-width: 1px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 1px;\n border-right-color: #D3D3D3;\n vertical-align: bottom;\n padding-top: 5px;\n padding-bottom: 6px;\n padding-left: 5px;\n padding-right: 5px;\n overflow-x: hidden;\n}\n\n#msyjczffsg .gt_column_spanner_outer {\n color: #333333;\n background-color: #FFFFFF;\n font-size: 100%;\n font-weight: normal;\n text-transform: inherit;\n padding-top: 0;\n padding-bottom: 0;\n padding-left: 4px;\n padding-right: 4px;\n}\n\n#msyjczffsg .gt_column_spanner_outer:first-child {\n padding-left: 0;\n}\n\n#msyjczffsg .gt_column_spanner_outer:last-child {\n padding-right: 0;\n}\n\n#msyjczffsg .gt_column_spanner {\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n vertical-align: bottom;\n padding-top: 5px;\n padding-bottom: 5px;\n overflow-x: hidden;\n display: inline-block;\n width: 100%;\n}\n\n#msyjczffsg .gt_spanner_row {\n border-bottom-style: hidden;\n}\n\n#msyjczffsg .gt_group_heading {\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n color: #333333;\n background-color: #FFFFFF;\n font-size: 100%;\n font-weight: initial;\n text-transform: inherit;\n border-top-style: solid;\n border-top-width: 2px;\n border-top-color: #D3D3D3;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n border-left-style: none;\n border-left-width: 1px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 1px;\n border-right-color: #D3D3D3;\n vertical-align: middle;\n text-align: left;\n}\n\n#msyjczffsg .gt_empty_group_heading {\n padding: 0.5px;\n color: #333333;\n background-color: #FFFFFF;\n font-size: 100%;\n font-weight: initial;\n border-top-style: solid;\n border-top-width: 2px;\n border-top-color: #D3D3D3;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: 
#D3D3D3;\n vertical-align: middle;\n}\n\n#msyjczffsg .gt_from_md > :first-child {\n margin-top: 0;\n}\n\n#msyjczffsg .gt_from_md > :last-child {\n margin-bottom: 0;\n}\n\n#msyjczffsg .gt_row {\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n margin: 10px;\n border-top-style: solid;\n border-top-width: 1px;\n border-top-color: #D3D3D3;\n border-left-style: none;\n border-left-width: 1px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 1px;\n border-right-color: #D3D3D3;\n vertical-align: middle;\n overflow-x: hidden;\n}\n\n#msyjczffsg .gt_stub {\n color: #333333;\n background-color: #FFFFFF;\n font-size: 100%;\n font-weight: initial;\n text-transform: inherit;\n border-right-style: solid;\n border-right-width: 2px;\n border-right-color: #D3D3D3;\n padding-left: 5px;\n padding-right: 5px;\n}\n\n#msyjczffsg .gt_stub_row_group {\n color: #333333;\n background-color: #FFFFFF;\n font-size: 100%;\n font-weight: initial;\n text-transform: inherit;\n border-right-style: solid;\n border-right-width: 2px;\n border-right-color: #D3D3D3;\n padding-left: 5px;\n padding-right: 5px;\n vertical-align: top;\n}\n\n#msyjczffsg .gt_row_group_first td {\n border-top-width: 2px;\n}\n\n#msyjczffsg .gt_row_group_first th {\n border-top-width: 2px;\n}\n\n#msyjczffsg .gt_summary_row {\n color: #333333;\n background-color: #FFFFFF;\n text-transform: inherit;\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n}\n\n#msyjczffsg .gt_first_summary_row {\n border-top-style: solid;\n border-top-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_first_summary_row.thick {\n border-top-width: 2px;\n}\n\n#msyjczffsg .gt_last_summary_row {\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_grand_summary_row {\n color: #333333;\n background-color: #FFFFFF;\n 
text-transform: inherit;\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n}\n\n#msyjczffsg .gt_first_grand_summary_row {\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n border-top-style: double;\n border-top-width: 6px;\n border-top-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_last_grand_summary_row_top {\n padding-top: 8px;\n padding-bottom: 8px;\n padding-left: 5px;\n padding-right: 5px;\n border-bottom-style: double;\n border-bottom-width: 6px;\n border-bottom-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_striped {\n background-color: rgba(128, 128, 128, 0.05);\n}\n\n#msyjczffsg .gt_table_body {\n border-top-style: solid;\n border-top-width: 2px;\n border-top-color: #D3D3D3;\n border-bottom-style: solid;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_footnotes {\n color: #333333;\n background-color: #FFFFFF;\n border-bottom-style: none;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n border-left-style: none;\n border-left-width: 2px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 2px;\n border-right-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_footnote {\n margin: 0px;\n font-size: 90%;\n padding-top: 4px;\n padding-bottom: 4px;\n padding-left: 5px;\n padding-right: 5px;\n}\n\n#msyjczffsg .gt_sourcenotes {\n color: #333333;\n background-color: #FFFFFF;\n border-bottom-style: none;\n border-bottom-width: 2px;\n border-bottom-color: #D3D3D3;\n border-left-style: none;\n border-left-width: 2px;\n border-left-color: #D3D3D3;\n border-right-style: none;\n border-right-width: 2px;\n border-right-color: #D3D3D3;\n}\n\n#msyjczffsg .gt_sourcenote {\n font-size: 90%;\n padding-top: 4px;\n padding-bottom: 4px;\n padding-left: 5px;\n padding-right: 5px;\n}\n\n#msyjczffsg .gt_left {\n text-align: left;\n}\n\n#msyjczffsg .gt_center {\n text-align: center;\n}\n\n#msyjczffsg .gt_right {\n text-align: right;\n font-variant-numeric: 
tabular-nums;\n}\n\n#msyjczffsg .gt_font_normal {\n font-weight: normal;\n}\n\n#msyjczffsg .gt_font_bold {\n font-weight: bold;\n}\n\n#msyjczffsg .gt_font_italic {\n font-style: italic;\n}\n\n#msyjczffsg .gt_super {\n font-size: 65%;\n}\n\n#msyjczffsg .gt_footnote_marks {\n font-size: 75%;\n vertical-align: 0.4em;\n position: initial;\n}\n\n#msyjczffsg .gt_asterisk {\n font-size: 100%;\n vertical-align: 0;\n}\n\n#msyjczffsg .gt_indent_1 {\n text-indent: 5px;\n}\n\n#msyjczffsg .gt_indent_2 {\n text-indent: 10px;\n}\n\n#msyjczffsg .gt_indent_3 {\n text-indent: 15px;\n}\n\n#msyjczffsg .gt_indent_4 {\n text-indent: 20px;\n}\n\n#msyjczffsg .gt_indent_5 {\n text-indent: 25px;\n}\n</style>\n<table class=\"gt_table\" data-quarto-disable-processing=\"false\" data-quarto-bootstrap=\"false\">\n <thead>\n <tr class=\"gt_col_headings\">\n <th class=\"gt_col_heading gt_columns_bottom_border gt_left\" rowspan=\"1\" colspan=\"1\" scope=\"col\" id=\" \"> </th>\n <th class=\"gt_col_heading gt_columns_bottom_border gt_center\" rowspan=\"1\" colspan=\"1\" scope=\"col\" id=\"4\">4</th>\n <th class=\"gt_col_heading gt_columns_bottom_border gt_center\" rowspan=\"1\" colspan=\"1\" scope=\"col\" id=\"6\">6</th>\n <th class=\"gt_col_heading gt_columns_bottom_border gt_center\" rowspan=\"1\" colspan=\"1\" scope=\"col\" id=\"8\">8</th>\n </tr>\n </thead>\n <tbody class=\"gt_table_body\">\n <tr><td headers=\" \" class=\"gt_row gt_left\">(Intercept)</td>\n<td headers=\"4\" class=\"gt_row gt_center\">35.983</td>\n<td headers=\"6\" class=\"gt_row gt_center\">20.674</td>\n<td headers=\"8\" class=\"gt_row gt_center\">18.080</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\"></td>\n<td headers=\"4\" class=\"gt_row gt_center\">(5.201)</td>\n<td headers=\"6\" class=\"gt_row gt_center\">(3.304)</td>\n<td headers=\"8\" class=\"gt_row gt_center\">(2.988)</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">hp</td>\n<td headers=\"4\" class=\"gt_row gt_center\">-0.113</td>\n<td headers=\"6\" 
class=\"gt_row gt_center\">-0.008</td>\n<td headers=\"8\" class=\"gt_row gt_center\">-0.014</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\" style=\"border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #000000;\"></td>\n<td headers=\"4\" class=\"gt_row gt_center\" style=\"border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #000000;\">(0.061)</td>\n<td headers=\"6\" class=\"gt_row gt_center\" style=\"border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #000000;\">(0.027)</td>\n<td headers=\"8\" class=\"gt_row gt_center\" style=\"border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: #000000;\">(0.014)</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">Num.Obs.</td>\n<td headers=\"4\" class=\"gt_row gt_center\">11</td>\n<td headers=\"6\" class=\"gt_row gt_center\">7</td>\n<td headers=\"8\" class=\"gt_row gt_center\">14</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">R2</td>\n<td headers=\"4\" class=\"gt_row gt_center\">0.274</td>\n<td headers=\"6\" class=\"gt_row gt_center\">0.016</td>\n<td headers=\"8\" class=\"gt_row gt_center\">0.080</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">R2 Adj.</td>\n<td headers=\"4\" class=\"gt_row gt_center\">0.193</td>\n<td headers=\"6\" class=\"gt_row gt_center\">-0.181</td>\n<td headers=\"8\" class=\"gt_row gt_center\">0.004</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">AIC</td>\n<td headers=\"4\" class=\"gt_row gt_center\">65.8</td>\n<td headers=\"6\" class=\"gt_row gt_center\">29.9</td>\n<td headers=\"8\" class=\"gt_row gt_center\">69.8</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">BIC</td>\n<td headers=\"4\" class=\"gt_row gt_center\">67.0</td>\n<td headers=\"6\" class=\"gt_row gt_center\">29.7</td>\n<td headers=\"8\" class=\"gt_row gt_center\">71.8</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">Log.Lik.</td>\n<td headers=\"4\" class=\"gt_row 
gt_center\">-29.891</td>\n<td headers=\"6\" class=\"gt_row gt_center\">-11.954</td>\n<td headers=\"8\" class=\"gt_row gt_center\">-31.920</td></tr>\n <tr><td headers=\" \" class=\"gt_row gt_left\">RMSE</td>\n<td headers=\"4\" class=\"gt_row gt_center\">3.66</td>\n<td headers=\"6\" class=\"gt_row gt_center\">1.33</td>\n<td headers=\"8\" class=\"gt_row gt_center\">2.37</td></tr>\n </tbody>\n \n \n</table>\n</div>\n```\n\n:::\n:::\n\n### Statistics in separate columns instead of one over the other\n\nIn somes cases, you may want to display statistics in separate columns instead of one over the other. It is easy to achieve this outcome by using the `estimate` argument. This argument accepts a vector of values, one for each of the models we are trying to summarize. If we want to include estimates and standard errors in separate columns, all we need to do is repeat a model, but request different statistics. For example,\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(modelsummary)\nlibrary(kableExtra)\n\nmod1 <- lm(mpg ~ hp, mtcars)\nmod2 <- lm(mpg ~ hp + drat, mtcars)\n\nmodels <- list(\n \"Coef.\" = mod1,\n \"Std.Error\" = mod1,\n \"Coef.\" = mod2,\n \"Std.Error\" = mod2)\n\nmodelsummary(models,\n estimate = c(\"estimate\", \"std.error\", \"estimate\", \"std.error\"),\n statistic = NULL,\n gof_omit = \".*\",\n output = \"kableExtra\") %>%\n add_header_above(c(\" \" = 1, \"Model A\" = 2, \"Model B\" = 2))\n```\n\n::: {.cell-output-display}\n\n`````{=html}\n<table class=\"table\" style=\"width: auto !important; margin-left: auto; margin-right: auto;\">\n <thead>\n<tr>\n<th style=\"empty-cells: hide;border-bottom:hidden;\" colspan=\"1\"></th>\n<th style=\"border-bottom:hidden;padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; \" colspan=\"2\"><div style=\"border-bottom: 1px solid #ddd; padding-bottom: 5px; \">Model A</div></th>\n<th style=\"border-bottom:hidden;padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; \" 
colspan=\"2\"><div style=\"border-bottom: 1px solid #ddd; padding-bottom: 5px; \">Model B</div></th>\n</tr>\n <tr>\n <th style=\"text-align:left;\"> </th>\n <th style=\"text-align:center;\"> Coef. </th>\n <th style=\"text-align:center;\"> Std.Error </th>\n <th style=\"text-align:center;\"> Coef. </th>\n <th style=\"text-align:center;\"> Std.Error </th>\n </tr>\n </thead>\n<tbody>\n <tr>\n <td style=\"text-align:left;\"> (Intercept) </td>\n <td style=\"text-align:center;\"> 30.099 </td>\n <td style=\"text-align:center;\"> 1.634 </td>\n <td style=\"text-align:center;\"> 10.790 </td>\n <td style=\"text-align:center;\"> 5.078 </td>\n </tr>\n <tr>\n <td style=\"text-align:left;\"> hp </td>\n <td style=\"text-align:center;\"> −0.068 </td>\n <td style=\"text-align:center;\"> 0.010 </td>\n <td style=\"text-align:center;\"> −0.052 </td>\n <td style=\"text-align:center;\"> 0.009 </td>\n </tr>\n <tr>\n <td style=\"text-align:left;\"> drat </td>\n <td style=\"text-align:center;\"> </td>\n <td style=\"text-align:center;\"> </td>\n <td style=\"text-align:center;\"> 4.698 </td>\n <td style=\"text-align:center;\"> 1.192 </td>\n </tr>\n</tbody>\n</table>\n\n\n:::\nThis can be automated using a simple function:\n\nside_by_side <- function(models, estimates, ...) {\n models <- rep(models, each = length(estimates))\n estimates <- rep(estimates, times = 2)\n names(models) <- names(estimates)\n modelsummary(models = models, estimate = estimates,\n statistic = NULL, gof_omit = \".*\", ...)\n}\n\nmodels = list(\n lm(mpg ~ hp, mtcars),\n lm(mpg ~ hp + drat, mtcars))\n\nestimates <- c(\"Coef.\" = \"estimate\", \"Std.Error\" = \"std.error\")\n\nside_by_side(models, estimates = estimates)\n\n\n\n\n\n \n \n \n Coef.\n Std.Error\n Coef. \n Std.Error \n \n \n \n (Intercept)\n30.099\n1.634\n10.790\n5.078\n hp\n-0.068\n0.010\n-0.052\n0.009\n drat\n\n\n4.698\n1.192\n \n \n \n\n\n\n\n\n\n\nUsers often want to use estimates or standard errors that have been obtained using a custom strategy. 
To achieve this in an automated and replicable way, it can be useful to use the tidy_custom strategy described above in the “Cutomizing Existing Models” section.\nFor example, we can use the modelr package to draw 500 resamples of a dataset, and compute bootstrap standard errors by taking the standard deviation of estimates computed in all of those resampled datasets. To do this, we defined tidy_custom.lm function that will automatically bootstrap any lm model supplied to modelsummary, and replace the values in the table automatically.\nNote that the tidy_custom_lm returns a data.frame with 3 columns: term, estimate, and std.error:\n\nlibrary(\"modelsummary\")\nlibrary(\"broom\")\nlibrary(\"tidyverse\")\nlibrary(\"modelr\")\n\ntidy_custom.lm <- function(x, ...) {\n # extract data from the model\n model.frame(x) %>%\n # draw 500 bootstrap resamples\n modelr::bootstrap(n = 500) %>%\n # estimate the model 500 times\n mutate(results = map(strap, ~ update(x, data = .))) %>%\n # extract results using `broom::tidy`\n mutate(results = map(results, tidy)) %>%\n # unnest and summarize\n unnest(results) %>%\n group_by(term) %>%\n summarize(std.error = sd(estimate),\n estimate = mean(estimate))\n}\n\nmod = list(\n lm(hp ~ mpg, mtcars) ,\n lm(hp ~ mpg + drat, mtcars))\n\nmodelsummary(mod)\n\n\n\n\n\n \n \n \n (1)\n (2)\n \n \n \n (Intercept)\n326.421\n284.858\n \n(29.737)\n(44.865)\n mpg\n-9.000\n-10.117\n \n(1.364)\n(2.485)\n drat\n\n17.795\n \n\n(21.775)\n Num.Obs.\n32\n32\n R2\n0.602\n0.614\n R2 Adj.\n0.589\n0.588\n AIC\n336.9\n337.9\n BIC\n341.3\n343.7\n Log.Lik.\n-165.428\n-164.940\n F\n45.460\n23.100\n RMSE\n42.55\n41.91\n \n \n \n\n\n\n\n\n\n\nOne common use-case for glance_custom is to include additional goodness-of-fit statistics. 
For example, in an instrumental variable estimation computed by the fixest package, we may want to include an IV-Wald statistic for the first-stage regression of each endogenous regressor:\n\nlibrary(fixest)\nlibrary(tidyverse)\n\n## create a toy dataset\nbase <- iris\nnames(base) <- c(\"y\", \"x1\", \"x_endo_1\", \"x_inst_1\", \"fe\")\nbase$x_inst_2 <- 0.2 * base$y + 0.2 * base$x_endo_1 + rnorm(150, sd = 0.5)\nbase$x_endo_2 <- 0.2 * base$y - 0.2 * base$x_inst_1 + rnorm(150, sd = 0.5)\n\n## estimate an instrumental variable model\nmod <- feols(y ~ x1 | fe | x_endo_1 + x_endo_2 ~ x_inst_1 + x_inst_2, base)\n\n## custom extractor function returns a one-row data.frame (or tibble)\nglance_custom.fixest <- function(x) {\n tibble(\n \"Wald (x_endo_1)\" = fitstat(x, \"ivwald\")[[1]]$stat,\n \"Wald (x_endo_2)\" = fitstat(x, \"ivwald\")[[2]]$stat\n )\n}\n\n## draw table\nmodelsummary(mod)\n\n\n\n\n\n \n \n \n (1)\n \n \n \n fit_x_endo_1\n0.424\n \n(0.103)\n fit_x_endo_2\n0.798\n \n(0.272)\n x1\n0.485\n \n(0.059)\n Num.Obs.\n150\n R2\n0.683\n R2 Adj.\n0.672\n R2 Within\n0.168\n R2 Within Adj.\n0.150\n AIC\n207.9\n BIC\n226.0\n RMSE\n0.46\n Std.Errors\nby: fe\n FE: fe\nX\n Wald (x_endo_1)\n77.5359206163405\n Wald (x_endo_2)\n49.3216288080678\n \n \n \n\n\n\n\n\nrm(\"glance_custom.fixest\")\n\n\n\n\nmodelsummary can pool and display analyses on several datasets imputed using the mice or Amelia packages. 
This code illustrates how:\n\nlibrary(mice)\nlibrary(Amelia)\nlibrary(modelsummary)\n\n## Download data from `Rdatasets`\nurl <- 'https://vincentarelbundock.github.io/Rdatasets/csv/HistData/Guerry.csv'\ndat <- read.csv(url)[, c('Clergy', 'Commerce', 'Literacy')]\n\n## Insert missing values\ndat$Clergy[sample(1:nrow(dat), 10)] <- NA\ndat$Commerce[sample(1:nrow(dat), 10)] <- NA\ndat$Literacy[sample(1:nrow(dat), 10)] <- NA\n\n## Impute with `mice` and `Amelia`\ndat_mice <- mice(dat, m = 5, printFlag = FALSE)\ndat_amelia <- amelia(dat, m = 5, p2s = 0)$imputations\n\n## Estimate models\nmod <- list()\nmod[['Listwise deletion']] <- lm(Clergy ~ Literacy + Commerce, dat)\nmod[['Mice']] <- with(dat_mice, lm(Clergy ~ Literacy + Commerce)) \nmod[['Amelia']] <- lapply(dat_amelia, function(x) lm(Clergy ~ Literacy + Commerce, x))\n\n## Pool results\nmod[['Mice']] <- mice::pool(mod[['Mice']])\nmod[['Amelia']] <- mice::pool(mod[['Amelia']])\n\n## Summarize\nmodelsummary(mod)\n\n\n\n\n\n \n \n \n Listwise deletion\n Mice\n Amelia\n \n \n \n (Intercept)\n63.885\n71.548\n67.067\n \n(16.416)\n(16.109)\n(13.554)\n Literacy\n-0.266\n-0.436\n-0.360\n \n(0.269)\n(0.249)\n(0.215)\n Commerce\n-0.235\n-0.270\n-0.233\n \n(0.170)\n(0.171)\n(0.145)\n Num.Obs.\n60\n86\n86\n Num.Imp.\n\n5\n5\n R2\n0.033\n0.062\n0.049\n R2 Adj.\n-0.001\n\n0.025\n AIC\n564.1\n\n\n BIC\n572.5\n\n\n Log.Lik.\n-278.064\n\n\n RMSE\n24.91" }, { "objectID": "vignettes/modelsummary.html#table-making-packages", @@ -1159,6 +1159,6 @@ "href": "vignettes/modelsummary.html#faq", "title": "Model Summaries", "section": "", - "text": "Standardized coefficients\nRow group labels\nCustomizing Word tables\nHow to add p values to datasummary_correlation\n\n\n\n\nFirst, please read the documentation in ?modelsummary and on the modelsummary website. 
The website includes dozens of worked examples and a lot of detailed explanation.\nSecond, try to use the [modelsummary] tag on StackOverflow.\nThird, if you think you found a bug or have a feature request, please file it on the Github issue tracker:\n\n\n\nSee the detailed documentation in the “Adding and Customizing Models” section of the modelsummary website.\n\n\n\nA modelsummary table is divided in two parts: “Estimates” (top of the table) and “Goodness-of-fit” (bottom of the table). To populate those two parts, modelsummary tries using the broom, parameters and performance packages in sequence.\nEstimates:\n\nTry the broom::tidy function to see if that package supports this model type, or if the user defined a custom tidy function in their global environment. If this fails…\nTry the parameters::model_parameters function to see if the parameters package supports this model type.\n\nGoodness-of-fit:\n\nTry the performance::model_performance function to see if the performance package supports this model type.\nTry the broom::glance function to see if that package supports this model type, or if the user defined a custom glance function in their global environment. 
If this fails…\n\nYou can change the order in which those steps are executed by setting a global option:\n\n## tidymodels: broom \noptions(modelsummary_get = \"broom\")\n\n## easystats: performance + parameters\noptions(modelsummary_get = \"easystats\")\n\nIf all of this fails, modelsummary will return an error message.\nIf you have problems with a model object, you can often diagnose the problem by running the following commands from a clean R session:\n## see if parameters and performance support your model type\nlibrary(parameters)\nlibrary(performance)\nmodel_parameters(model)\nmodel_performance(model)\n\n## see if broom supports your model type\nlibrary(broom)\ntidy(model)\nglance(model)\n\n## see if broom.mixed supports your model type\nlibrary(broom.mixed)\ntidy(model)\nglance(model)\nIf none of these options work, you can create your own tidy and glance methods, as described in the Adding new models section.\nIf one of the extractor functions does not work well or takes too long to process, you can define a new “custom” model class and choose your own extractors, as described in the Adding new models section.\n\n\n\nThe modelsummary function, by itself, is not slow: it should only take a couple seconds to produce a table in any output format. However, sometimes it can be computationally expensive (and long) to extract estimates and to compute goodness-of-fit statistics for your model.\nThe main options to speed up modelsummary are:\n\nSet gof_map=NA to avoid computing expensive goodness-of-fit statistics.\nUse the easystats extractor functions and the metrics argument to avoid computing expensive statistics (see below for an example).\nUse parallel computation if you are summarizing multiple models. 
See the “Parallel computation” section in the ?modelsummary documentation.\n\nTo diagnose the slowdown and find the bottleneck, you can try to benchmark the various extractor functions:\n\nlibrary(tictoc)\n\ndata(trade)\nmod <- lm(mpg ~ hp + drat, mtcars)\n\ntic(\"tidy\")\nx <- broom::tidy(mod)\ntoc()\n\ntidy: 0.003 sec elapsed\n\ntic(\"glance\")\nx <- broom::glance(mod)\ntoc()\n\nglance: 0.003 sec elapsed\n\ntic(\"parameters\")\nx <- parameters::parameters(mod)\ntoc()\n\nparameters: 0.02 sec elapsed\n\ntic(\"performance\")\nx <- performance::performance(mod)\ntoc()\n\nperformance: 0.011 sec elapsed\n\n\nIn my experience, the main bottleneck tends to be computing goodness-of-fit statistics. The performance extractor allows users to specify a metrics argument to select a subset of GOF to include. Using this can speedup things considerably.\nWe call modelsummary with the metrics argument:\n\nmodelsummary(mod, metrics = \"rmse\")\n\n\n\n\n\n \n \n \n (1)\n \n \n \n (Intercept)\n10.790\n \n(5.078)\n hp\n-0.052\n \n(0.009)\n drat\n4.698\n \n(1.192)\n Num.Obs.\n32\n R2\n0.741\n R2 Adj.\n0.723\n AIC\n169.5\n BIC\n175.4\n Log.Lik.\n-80.752\n F\n41.522\n \n \n \n\n\n\n\n\n\n\nSometimes, users want to include raw LaTeX commands in their tables, such as coefficient names including math mode: Apple $\\times$ Orange. The result of these attempts is often a weird string such as: \\$\\textbackslash{}times\\$ instead of proper LaTeX-rendered characters.\nThe source of the problem is that kableExtra, default table-making package in modelsummary, automatically escapes weird characters to make sure that your tables compile properly in LaTeX. To avoid this, we need to pass the escape=FALSE to modelsummary:\n\nmodelsummary(mod, escape = FALSE)\n\n\n\n\nMany bayesian models are supported out-of-the-box, including those produced by the rstanarm and brms packages. The statistics available for bayesian models are slightly different than those available for most frequentist models. 
Users can call get_estimates to see what is available:\n\nlibrary(rstanarm)\n\nThis is rstanarm version 2.32.1\n\n\n- See https://mc-stan.org/rstanarm/articles/priors for changes to default priors!\n\n\n- Default priors may change, so it's safest to specify priors, even if equivalent to the defaults.\n\n\n- For execution on a local, multicore CPU with excess RAM we recommend calling\n\n\n options(mc.cores = parallel::detectCores())\n\n\n\nAttaching package: 'rstanarm'\n\n\nThe following object is masked from 'package:fixest':\n\n se\n\nmod <- stan_glm(am ~ hp + drat, data = mtcars)\n\n\nget_estimates(mod)\n\n term estimate mad conf.level conf.low conf.high\n1 (Intercept) -2.230796826 0.58680702 0.95 -3.40385688 -1.055676364\n2 hp 0.000704079 0.00103711 0.95 -0.00158824 0.002788848\n3 drat 0.705052544 0.13876995 0.95 0.43515855 0.984934926\n prior.distribution prior.location prior.scale group std.error statistic\n1 normal 0.40625 1.24747729 NA NA\n2 normal 0.00000 0.01819465 NA NA\n3 normal 0.00000 2.33313429 NA NA\n p.value\n1 NA\n2 NA\n3 NA\n\n\nThis shows that there is no std.error column, but that there is a mad statistic (mean absolute deviation). So we can do:\n\nmodelsummary(mod, statistic = \"mad\")\n\nWarning: \n`modelsummary` uses the `performance` package to extract goodness-of-fit\nstatistics from models of this class. You can specify the statistics you wish\nto compute by supplying a `metrics` argument to `modelsummary`, which will then\npush it forward to `performance`. Acceptable values are: \"all\", \"common\",\n\"none\", or a character vector of metrics names. For example: `modelsummary(mod,\nmetrics = c(\"RMSE\", \"R2\")` Note that some metrics are computationally\nexpensive. 
See `?performance::performance` for details.\n This warning appears once per session.\n\n\n\n\n\n\n \n \n \n (1)\n \n \n \n (Intercept)\n-2.231\n \n(0.587)\n hp\n0.001\n \n(0.001)\n drat\n0.705\n \n(0.139)\n Num.Obs.\n32\n R2\n0.498\n R2 Adj.\n0.421\n Log.Lik.\n-12.037\n ELPD\n-15.3\n ELPD s.e.\n3.2\n LOOIC\n30.5\n LOOIC s.e.\n6.4\n WAIC\n30.1\n RMSE\n0.34\n \n \n \n\n\n\n\nAs noted in the modelsummary() documentation, model results are extracted using the parameters package. Users can pass additional arguments to modelsummary(), which will then push forward those arguments to the parameters::parameters function to change the results. For example, the parameters documentation for bayesian models shows that there is a centrality argument, which allows users to report the mean and standard deviation of the posterior distribution, instead of the median and MAD:\n\nget_estimates(mod, centrality = \"mean\")\n\n term estimate std.dev conf.level conf.low conf.high\n1 (Intercept) -2.2308585627 0.592540388 0.95 -3.40385688 -1.055676364\n2 hp 0.0006978276 0.001105456 0.95 -0.00158824 0.002788848\n3 drat 0.7044091550 0.139055255 0.95 0.43515855 0.984934926\n prior.distribution prior.location prior.scale group std.error statistic\n1 normal 0.40625 1.24747729 NA NA\n2 normal 0.00000 0.01819465 NA NA\n3 normal 0.00000 2.33313429 NA NA\n p.value\n1 NA\n2 NA\n3 NA\n\nmodelsummary(mod, statistic = \"std.dev\", centrality = \"mean\")\n\n\n\n\n\n \n \n \n (1)\n \n \n \n (Intercept)\n-2.231\n \n(0.593)\n hp\n0.001\n \n(0.001)\n drat\n0.704\n \n(0.139)\n Num.Obs.\n32\n R2\n0.498\n R2 Adj.\n0.421\n Log.Lik.\n-12.037\n ELPD\n-15.3\n ELPD s.e.\n3.2\n LOOIC\n30.5\n LOOIC s.e.\n6.4\n WAIC\n30.1\n RMSE\n0.34\n \n \n \n\n\n\n\nWe can also get additional test statistics using the test argument:\n\nget_estimates(mod, test = c(\"pd\", \"rope\"))\n\n term estimate mad conf.level conf.low conf.high\n1 (Intercept) -2.230796826 0.58680702 0.95 -3.40385688 -1.055676364\n2 hp 0.000704079 0.00103711 
0.95 -0.00158824 0.002788848\n3 drat 0.705052544 0.13876995 0.95 0.43515855 0.984934926\n pd rope.percentage prior.distribution prior.location prior.scale group\n1 1.00000 0 normal 0.40625 1.24747729 \n2 0.75175 1 normal 0.00000 0.01819465 \n3 1.00000 0 normal 0.00000 2.33313429 \n std.error statistic p.value\n1 NA NA NA\n2 NA NA NA\n3 NA NA NA" + "text": "Standardized coefficients\nRow group labels\nCustomizing Word tables\nHow to add p values to datasummary_correlation\n\n\n\n\nFirst, please read the documentation in ?modelsummary and on the modelsummary website. The website includes dozens of worked examples and a lot of detailed explanation.\nSecond, try to use the [modelsummary] tag on StackOverflow.\nThird, if you think you found a bug or have a feature request, please file it on the Github issue tracker:\n\n\n\nSee the detailed documentation in the “Adding and Customizing Models” section of the modelsummary website.\n\n\n\nA modelsummary table is divided in two parts: “Estimates” (top of the table) and “Goodness-of-fit” (bottom of the table). To populate those two parts, modelsummary tries using the broom, parameters and performance packages in sequence.\nEstimates:\n\nTry the broom::tidy function to see if that package supports this model type, or if the user defined a custom tidy function in their global environment. If this fails…\nTry the parameters::model_parameters function to see if the parameters package supports this model type.\n\nGoodness-of-fit:\n\nTry the performance::model_performance function to see if the performance package supports this model type.\nTry the broom::glance function to see if that package supports this model type, or if the user defined a custom glance function in their global environment. 
If this fails…\n\nYou can change the order in which those steps are executed by setting a global option:\n\n## tidymodels: broom \noptions(modelsummary_get = \"broom\")\n\n## easystats: performance + parameters\noptions(modelsummary_get = \"easystats\")\n\nIf all of this fails, modelsummary will return an error message.\nIf you have problems with a model object, you can often diagnose the problem by running the following commands from a clean R session:\n## see if parameters and performance support your model type\nlibrary(parameters)\nlibrary(performance)\nmodel_parameters(model)\nmodel_performance(model)\n\n## see if broom supports your model type\nlibrary(broom)\ntidy(model)\nglance(model)\n\n## see if broom.mixed supports your model type\nlibrary(broom.mixed)\ntidy(model)\nglance(model)\nIf none of these options work, you can create your own tidy and glance methods, as described in the Adding new models section.\nIf one of the extractor functions does not work well or takes too long to process, you can define a new “custom” model class and choose your own extractors, as described in the Adding new models section.\n\n\n\nThe modelsummary function, by itself, is not slow: it should only take a couple seconds to produce a table in any output format. However, sometimes it can be computationally expensive (and long) to extract estimates and to compute goodness-of-fit statistics for your model.\nThe main options to speed up modelsummary are:\n\nSet gof_map=NA to avoid computing expensive goodness-of-fit statistics.\nUse the easystats extractor functions and the metrics argument to avoid computing expensive statistics (see below for an example).\nUse parallel computation if you are summarizing multiple models. 
See the “Parallel computation” section in the ?modelsummary documentation.\n\nTo diagnose the slowdown and find the bottleneck, you can try to benchmark the various extractor functions:\n\nlibrary(tictoc)\n\ndata(trade)\nmod <- lm(mpg ~ hp + drat, mtcars)\n\ntic(\"tidy\")\nx <- broom::tidy(mod)\ntoc()\n\ntidy: 0.002 sec elapsed\n\ntic(\"glance\")\nx <- broom::glance(mod)\ntoc()\n\nglance: 0.004 sec elapsed\n\ntic(\"parameters\")\nx <- parameters::parameters(mod)\ntoc()\n\nparameters: 0.02 sec elapsed\n\ntic(\"performance\")\nx <- performance::performance(mod)\ntoc()\n\nperformance: 0.011 sec elapsed\n\n\nIn my experience, the main bottleneck tends to be computing goodness-of-fit statistics. The performance extractor allows users to specify a metrics argument to select a subset of GOF to include. Using this can speedup things considerably.\nWe call modelsummary with the metrics argument:\n\nmodelsummary(mod, metrics = \"rmse\")\n\n\n\n\n\n \n \n \n (1)\n \n \n \n (Intercept)\n10.790\n \n(5.078)\n hp\n-0.052\n \n(0.009)\n drat\n4.698\n \n(1.192)\n Num.Obs.\n32\n R2\n0.741\n R2 Adj.\n0.723\n AIC\n169.5\n BIC\n175.4\n Log.Lik.\n-80.752\n F\n41.522\n \n \n \n\n\n\n\n\n\n\nSometimes, users want to include raw LaTeX commands in their tables, such as coefficient names including math mode: Apple $\\times$ Orange. The result of these attempts is often a weird string such as: \\$\\textbackslash{}times\\$ instead of proper LaTeX-rendered characters.\nThe source of the problem is that kableExtra, default table-making package in modelsummary, automatically escapes weird characters to make sure that your tables compile properly in LaTeX. To avoid this, we need to pass the escape=FALSE to modelsummary:\n\nmodelsummary(mod, escape = FALSE)\n\n\n\n\nMany bayesian models are supported out-of-the-box, including those produced by the rstanarm and brms packages. The statistics available for bayesian models are slightly different than those available for most frequentist models. 
Users can call get_estimates to see what is available:\n\nlibrary(rstanarm)\n\nThis is rstanarm version 2.32.1\n\n\n- See https://mc-stan.org/rstanarm/articles/priors for changes to default priors!\n\n\n- Default priors may change, so it's safest to specify priors, even if equivalent to the defaults.\n\n\n- For execution on a local, multicore CPU with excess RAM we recommend calling\n\n\n options(mc.cores = parallel::detectCores())\n\n\n\nAttaching package: 'rstanarm'\n\n\nThe following object is masked from 'package:fixest':\n\n se\n\nmod <- stan_glm(am ~ hp + drat, data = mtcars)\n\n\nget_estimates(mod)\n\n term estimate mad conf.level conf.low conf.high\n1 (Intercept) -2.2475345916 0.555314871 0.95 -3.408237583 -1.058208927\n2 hp 0.0007033686 0.001063287 0.95 -0.001363318 0.002914953\n3 drat 0.7069798264 0.133722428 0.95 0.429156919 0.985993244\n prior.distribution prior.location prior.scale group std.error statistic\n1 normal 0.40625 1.24747729 NA NA\n2 normal 0.00000 0.01819465 NA NA\n3 normal 0.00000 2.33313429 NA NA\n p.value\n1 NA\n2 NA\n3 NA\n\n\nThis shows that there is no std.error column, but that there is a mad statistic (mean absolute deviation). So we can do:\n\nmodelsummary(mod, statistic = \"mad\")\n\nWarning: \n`modelsummary` uses the `performance` package to extract goodness-of-fit\nstatistics from models of this class. You can specify the statistics you wish\nto compute by supplying a `metrics` argument to `modelsummary`, which will then\npush it forward to `performance`. Acceptable values are: \"all\", \"common\",\n\"none\", or a character vector of metrics names. For example: `modelsummary(mod,\nmetrics = c(\"RMSE\", \"R2\")` Note that some metrics are computationally\nexpensive. 
See `?performance::performance` for details.\n This warning appears once per session.\n\n\n\n\n\n\n \n \n \n (1)\n \n \n \n (Intercept)\n-2.248\n \n(0.555)\n hp\n0.001\n \n(0.001)\n drat\n0.707\n \n(0.134)\n Num.Obs.\n32\n R2\n0.501\n R2 Adj.\n0.433\n Log.Lik.\n-12.064\n ELPD\n-15.0\n ELPD s.e.\n3.1\n LOOIC\n30.0\n LOOIC s.e.\n6.2\n WAIC\n29.8\n RMSE\n0.34\n \n \n \n\n\n\n\nAs noted in the modelsummary() documentation, model results are extracted using the parameters package. Users can pass additional arguments to modelsummary(), which will then push forward those arguments to the parameters::parameters function to change the results. For example, the parameters documentation for bayesian models shows that there is a centrality argument, which allows users to report the mean and standard deviation of the posterior distribution, instead of the median and MAD:\n\nget_estimates(mod, centrality = \"mean\")\n\n term estimate std.dev conf.level conf.low conf.high\n1 (Intercept) -2.2390156302 0.586604330 0.95 -3.408237583 -1.058208927\n2 hp 0.0007145923 0.001079799 0.95 -0.001363318 0.002914953\n3 drat 0.7062254928 0.138544254 0.95 0.429156919 0.985993244\n prior.distribution prior.location prior.scale group std.error statistic\n1 normal 0.40625 1.24747729 NA NA\n2 normal 0.00000 0.01819465 NA NA\n3 normal 0.00000 2.33313429 NA NA\n p.value\n1 NA\n2 NA\n3 NA\n\nmodelsummary(mod, statistic = \"std.dev\", centrality = \"mean\")\n\n\n\n\n\n \n \n \n (1)\n \n \n \n (Intercept)\n-2.239\n \n(0.587)\n hp\n0.001\n \n(0.001)\n drat\n0.706\n \n(0.139)\n Num.Obs.\n32\n R2\n0.501\n R2 Adj.\n0.433\n Log.Lik.\n-12.064\n ELPD\n-15.0\n ELPD s.e.\n3.1\n LOOIC\n30.0\n LOOIC s.e.\n6.2\n WAIC\n29.8\n RMSE\n0.34\n \n \n \n\n\n\n\nWe can also get additional test statistics using the test argument:\n\nget_estimates(mod, test = c(\"pd\", \"rope\"))\n\n term estimate mad conf.level conf.low conf.high\n1 (Intercept) -2.2475345916 0.555314871 0.95 -3.408237583 -1.058208927\n2 hp 0.0007033686 
0.001063287 0.95 -0.001363318 0.002914953\n3 drat 0.7069798264 0.133722428 0.95 0.429156919 0.985993244\n pd rope.percentage prior.distribution prior.location prior.scale group\n1 1.000 0 normal 0.40625 1.24747729 \n2 0.747 1 normal 0.00000 0.01819465 \n3 1.000 0 normal 0.00000 2.33313429 \n std.error statistic p.value\n1 NA NA NA\n2 NA NA NA\n3 NA NA NA" } ] \ No newline at end of file diff --git a/vignettes/appearance.html b/vignettes/appearance.html index b9bcd1713..50222a8b4 100644 --- a/vignettes/appearance.html +++ b/vignettes/appearance.html @@ -516,23 +516,23 @@

gt

locations = cells_body(rows = 5))
-
- @@ -1063,23 +1063,23 @@

gt

text_transform(locations = cells_body(columns = 2:6, rows = 1), fn = f)
-
- @@ -1652,7 +1652,7 @@

flextable

autofit()
-

OLS 1

Poisson 1

OLS 2

Poisson 2

OLS 3

Constant

8759.068***

8.986***

20357.309***

9.708***

11243.544***

(1559.363)

(0.004)

(2020.980)

(0.003)

(1011.240)

Literacy (%)

-42.886

-0.006***

-15.358

0.000***

-68.507***

(36.362)

(0.000)

(47.127)

(0.000)

(18.029)

Priests/capita

0.002***

0.004***

-16.376

(0.000)

(0.000)

(12.522)

Num.Obs.

86

86

86

86

86

R2

0.016

0.001

0.152

F

1.391

4170.610

0.106

7905.811

7.441

RMSE

5753.14

5727.27

7456.23

7233.22

2793.43

+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001

+

OLS 1

Poisson 1

OLS 2

Poisson 2

OLS 3

Constant

8759.068***

8.986***

20357.309***

9.708***

11243.544***

(1559.363)

(0.004)

(2020.980)

(0.003)

(1011.240)

Literacy (%)

-42.886

-0.006***

-15.358

0.000***

-68.507***

(36.362)

(0.000)

(47.127)

(0.000)

(18.029)

Priests/capita

0.002***

0.004***

-16.376

(0.000)

(0.000)

(12.522)

Num.Obs.

86

86

86

86

86

R2

0.016

0.001

0.152

F

1.391

4170.610

0.106

7905.811

7.441

RMSE

5753.14

5727.27

7456.23

7233.22

2793.43

+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001

@@ -1699,8 +1699,8 @@

DT

fixedColumns = list(leftColumns = 1)))
-
- +
+
@@ -1721,23 +1721,23 @@

Themes

modelsummary(mod, output = "gt")
-
- @@ -2211,23 +2211,23 @@

Themes

datasummary_crosstab(island ~ sex * species, output = "gt", data = penguins)
-
- @@ -2799,8 +2799,8 @@

Themes: Data Frame

modelsummary(mod, output = "dataframe")
-
- +
+

Restore default theme:

@@ -2822,23 +2822,23 @@

Variable labels

modelsummary(mod, coef_rename = TRUE)
-
- @@ -3307,23 +3307,23 @@

Variable labels

datasummary_skim(dat[, c("mpg", "am", "drat")])
-
- diff --git a/vignettes/datasummary.html b/vignettes/datasummary.html index f32eaa7a7..52d64b51c 100644 --- a/vignettes/datasummary.html +++ b/vignettes/datasummary.html @@ -506,23 +506,23 @@

datasummary_skim
datasummary_skim(penguins, type = "categorical")

-
- @@ -1054,23 +1054,23 @@

datasummary_bala
datasummary_balance(~ 1, data = training)
-
- @@ -1567,23 +1567,23 @@

datasummary_
datasummary_correlation(mtcars)
-
- @@ -2183,23 +2183,23 @@

datasummary_ datasummary_correlation(mtcars, method = fun)

-
- @@ -2769,23 +2769,23 @@

datasummary_cro datasummary_crosstab(species ~ sex, data = penguins)

-
- @@ -3272,23 +3272,23 @@

datasummary_cro
datasummary_crosstab(species ~ sex * island, data = penguins)
-
- @@ -3821,23 +3821,23 @@

datasummary_cro data = penguins)

-
- @@ -4305,23 +4305,23 @@

datasummary

data = penguins)
-
- @@ -4768,23 +4768,23 @@

datasummary

data = penguins)
-
- @@ -5233,23 +5233,23 @@

Custom summary fu data = penguins)

-
- @@ -5699,23 +5699,23 @@

Custom summary fu data = penguins)

-
- @@ -6161,23 +6161,23 @@

Custom summary fu data = penguins)

-
- @@ -6627,23 +6627,23 @@

Concatenating with data = penguins)

-
- @@ -7095,23 +7095,23 @@

Concatenating with data = penguins)

-
- @@ -7579,23 +7579,23 @@

Nesting with * data = penguins)

-
- @@ -8047,23 +8047,23 @@

Nesting with * data = penguins)

-
- @@ -8524,23 +8524,23 @@

Nesting with * data = penguins)

-
- @@ -9003,23 +9003,23 @@

Nesting with * data = penguins)

-
- @@ -9494,23 +9494,23 @@

Nesting with * data = penguins)

-
- @@ -9983,23 +9983,23 @@

Nesting with * sparse_header = FALSE)

-
- @@ -10479,23 +10479,23 @@

Renaming with = data = tmp)

-
- @@ -10948,23 +10948,23 @@

Renaming with = data = penguins)

-
- @@ -11439,23 +11439,23 @@

Renaming with = data = penguins)

-
- @@ -11930,23 +11930,23 @@

Renaming with = data = penguins)

-
- @@ -12410,23 +12410,23 @@

Renaming with = data = penguins)

-
- @@ -12901,23 +12901,23 @@

Counts and Percenta data = penguins)

-
- @@ -13536,23 +13536,23 @@

Custom percentages

data = dat)
-
- @@ -14006,23 +14006,23 @@

Factor

data = mtcars)
-
- @@ -14491,23 +14491,23 @@

Arguments data = penguins)

-
- @@ -14957,23 +14957,23 @@

Arguments data = penguins)

-
- @@ -15422,23 +15422,23 @@

Arguments data = penguins)

-
- @@ -15908,11 +15908,11 @@

Arguments x -1.05 +-4.18 y --3.82 +3.38 @@ -15922,22 +15922,22 @@

Arguments
weighted.mean(newdata$x, newdata$w)
-
[1] 1.051561
+
[1] -4.178942
weighted.mean(newdata$y, newdata$w)
-
[1] -3.816827
+
[1] 3.378796

But different results from:

mean(newdata$x)
-
[1] 0.1759387
+
[1] -0.3440597
mean(newdata$y)
-
[1] 0.1215528
+
[1] 0.2460577
@@ -15949,23 +15949,23 @@

Empty cells

data = penguins)
-
- @@ -16483,23 +16483,23 @@

Empty cells

data = penguins)
-
- @@ -16995,23 +16995,23 @@

Logical subsets

data = penguins)
-
- @@ -17497,23 +17497,23 @@

fmt

data = penguins)
-
- @@ -17962,23 +17962,23 @@

fmt

data = penguins)
-
- @@ -18433,23 +18433,23 @@

fmt

locations = cells_body(rows = Mean > 10, columns = 2))
-
- @@ -18929,23 +18929,23 @@

fmt

datasummary(X ~ N, data = tmp)
-
- @@ -19378,11 +19378,11 @@

fmt

a -333,404 +333,736 b -332,896 +333,200 c -333,700 +333,064 @@ -19401,23 +19401,23 @@

title, notes

notes = c('A note at the bottom of the table.'))
-
- @@ -19890,23 +19890,23 @@

align

align = 'lrcl')
-
- @@ -20371,23 +20371,23 @@

add_rows

add_rows = new_rows)
-
- @@ -20873,23 +20873,23 @@

add_columns

add_columns = new_cols)
-
- @@ -21345,7 +21345,7 @@

add_columns

7.13 217.19 6.48 -0.83 +0.38 body_mass_g 3700.66 458.57 @@ -21353,7 +21353,7 @@

add_columns

384.34 5076.02 504.12 -0.39 +0.88 @@ -21370,23 +21370,23 @@

Histograms

datasummary_skim(tmp)
-
- @@ -21880,23 +21880,23 @@

Histograms

datasummary(mpg + hp ~ Mean + SD + Histogram, data = tmp)
-
- @@ -22375,23 +22375,23 @@

Factors

datasummary_crosstab(cyl_nona ~ vs, data = mycars)
-
- @@ -22875,23 +22875,23 @@

Factors

datasummary_crosstab(cyl_na ~ vs, data = mycars)
-
- @@ -23420,23 +23420,23 @@

Appearance

cols_align('center', columns = 3:6)
-
- diff --git a/vignettes/modelsummary.html b/vignettes/modelsummary.html index 2e9c0b14e..24f93a397 100644 --- a/vignettes/modelsummary.html +++ b/vignettes/modelsummary.html @@ -490,23 +490,23 @@

Model Summaries

modelsummary(models)
-
- @@ -1121,23 +1121,23 @@

fmt: rou modelsummary(modf, fmt = f, gof_map = NA)

-
- @@ -1605,23 +1605,23 @@

estimate

coef_omit = "Intercept")
-
- @@ -2140,23 +2140,23 @@

estimate

statistic = "{round(exp(estimate) * std.error, 3)}")
-
- @@ -2629,23 +2629,23 @@

estimate

coef_omit = "Intercept")
-
- @@ -3171,23 +3171,23 @@

statistic conf_level = .99)

-
- @@ -3732,23 +3732,23 @@

statistic statistic = "{std.error} ({p.value})")

-
- @@ -4296,23 +4296,23 @@

statistic "p = {p.value}"))

-
- @@ -4882,23 +4882,23 @@

statistic statistic = NULL)

-
- @@ -5411,23 +5411,23 @@

stars

gof_omit = ".*")
-
- @@ -5927,23 +5927,23 @@

coef_omit

modelsummary(models, coef_omit = 1:2, gof_map = NA)
-
- @@ -6415,23 +6415,23 @@

coef_omit

modelsummary(models, coef_omit = c(-1, -2), gof_map = NA)
-
- @@ -6903,23 +6903,23 @@

coef_omit

modelsummary(models, coef_omit = "Intercept|.*merce", gof_map = NA)
-
- @@ -7391,23 +7391,23 @@

coef_omit

modelsummary(models, coef_omit = "^(?!Lit)", gof_map = NA)
-
- @@ -7867,23 +7867,23 @@

coef_omit

modelsummary(models, coef_omit = "^(?!.*y)", gof_map = NA)
-
- @@ -8355,23 +8355,23 @@

coef_omit

modelsummary(models, coef_omit = "^(?!.*tercept|.*y)", gof_map = NA)
-
- @@ -8862,23 +8862,23 @@

coef_rename

modelsummary(x, coef_rename = c("drat" = "Explanator", "vs" = "Explanator"))
-
- @@ -9365,23 +9365,23 @@

coef_rename

modelsummary(dvnames(x), coef_rename = coef_rename)
-
- @@ -9890,23 +9890,23 @@

coef_rename

modelsummary(y, coef_rename = rename_explanator)
-
- @@ -10399,23 +10399,23 @@

coef_rename

modelsummary(modlab, coef_rename = TRUE)
-
- @@ -10897,23 +10897,23 @@

coef_map< modelsummary(models, coef_map = cm)

-
- @@ -11454,23 +11454,23 @@

gof_map

modelsummary(models, gof_map = c("nobs", "r.squared"))
-
- @@ -12004,23 +12004,23 @@

gof_map

escape = FALSE)
-
- @@ -12519,23 +12519,23 @@

Formula

modelsummary(m, shape = term + statistic ~ model, gof_map = NA)
-
- @@ -13001,23 +13001,23 @@

Formula

gof_map = NA)
-
- @@ -13498,23 +13498,23 @@

Formula

gof_map = NA)
-
- @@ -14024,23 +14024,23 @@

Formula

modelsummary(mod, shape = term + response ~ statistic)
-
- @@ -14568,23 +14568,23 @@

Formula

modelsummary(mod, shape = model + term ~ response)
-
- @@ -15074,23 +15074,23 @@

Formula

modelsummary(mfx, shape = term + contrast ~ model)
-
- @@ -15570,23 +15570,23 @@

Formula

modelsummary(mfx, shape = term : contrast ~ model)
-
- @@ -16074,23 +16074,23 @@

gof_map = gm)

-
- @@ -16597,23 +16597,23 @@

gof_map = gm)

-
- @@ -17135,23 +17135,23 @@

fixest

gof_omit = "IC|R2")
-
- @@ -17653,23 +17653,23 @@

fixest

gof_omit = "IC|R2")
-
- @@ -18215,23 +18215,23 @@

add_rows

modelsummary(models, add_rows = rows)
-
- @@ -18728,23 +18728,23 @@

exponentiate

modelsummary(mod_logit, exponentiate = TRUE)
-
- @@ -19209,23 +19209,23 @@

exponentiate

modelsummary(mod_logit, exponentiate = c(TRUE, FALSE))
-
- @@ -20061,23 +20061,23 @@

Quarto

::: {.cell-output-display} ```{=html} -<div id="wkbhfqnlym" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;"> -<style>#wkbhfqnlym table { +<div id="msyjczffsg" style="padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;"> +<style>#msyjczffsg table { font-family: system-ui, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif, 'Apple Color Emoji', 'Segoe UI Emoji', 'Segoe UI Symbol', 'Noto Color Emoji'; -webkit-font-smoothing: antialiased; -moz-osx-font-smoothing: grayscale; } -#wkbhfqnlym thead, #wkbhfqnlym tbody, #wkbhfqnlym tfoot, #wkbhfqnlym tr, #wkbhfqnlym td, #wkbhfqnlym th { +#msyjczffsg thead, #msyjczffsg tbody, #msyjczffsg tfoot, #msyjczffsg tr, #msyjczffsg td, #msyjczffsg th { border-style: none; } -#wkbhfqnlym p { +#msyjczffsg p { margin: 0; padding: 0; } -#wkbhfqnlym .gt_table { +#msyjczffsg .gt_table { display: table; border-collapse: collapse; line-height: normal; @@ -20103,12 +20103,12 @@

Quarto

border-left-color: #D3D3D3; } -#wkbhfqnlym .gt_caption { +#msyjczffsg .gt_caption { padding-top: 4px; padding-bottom: 4px; } -#wkbhfqnlym .gt_title { +#msyjczffsg .gt_title { color: #333333; font-size: 125%; font-weight: initial; @@ -20120,7 +20120,7 @@

Quarto

border-bottom-width: 0; } -#wkbhfqnlym .gt_subtitle { +#msyjczffsg .gt_subtitle { color: #333333; font-size: 85%; font-weight: initial; @@ -20132,7 +20132,7 @@

Quarto

border-top-width: 0; } -#wkbhfqnlym .gt_heading { +#msyjczffsg .gt_heading { background-color: #FFFFFF; text-align: center; border-bottom-color: #FFFFFF; @@ -20144,13 +20144,13 @@

Quarto

border-right-color: #D3D3D3; } -#wkbhfqnlym .gt_bottom_border { +#msyjczffsg .gt_bottom_border { border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; } -#wkbhfqnlym .gt_col_headings { +#msyjczffsg .gt_col_headings { border-top-style: solid; border-top-width: 2px; border-top-color: #D3D3D3; @@ -20165,7 +20165,7 @@

Quarto

border-right-color: #D3D3D3; } -#wkbhfqnlym .gt_col_heading { +#msyjczffsg .gt_col_heading { color: #333333; background-color: #FFFFFF; font-size: 100%; @@ -20185,7 +20185,7 @@

Quarto

overflow-x: hidden; } -#wkbhfqnlym .gt_column_spanner_outer { +#msyjczffsg .gt_column_spanner_outer { color: #333333; background-color: #FFFFFF; font-size: 100%; @@ -20197,15 +20197,15 @@

Quarto

padding-right: 4px; } -#wkbhfqnlym .gt_column_spanner_outer:first-child { +#msyjczffsg .gt_column_spanner_outer:first-child { padding-left: 0; } -#wkbhfqnlym .gt_column_spanner_outer:last-child { +#msyjczffsg .gt_column_spanner_outer:last-child { padding-right: 0; } -#wkbhfqnlym .gt_column_spanner { +#msyjczffsg .gt_column_spanner { border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; @@ -20217,11 +20217,11 @@

Quarto

width: 100%; } -#wkbhfqnlym .gt_spanner_row { +#msyjczffsg .gt_spanner_row { border-bottom-style: hidden; } -#wkbhfqnlym .gt_group_heading { +#msyjczffsg .gt_group_heading { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; @@ -20247,7 +20247,7 @@

Quarto

text-align: left; } -#wkbhfqnlym .gt_empty_group_heading { +#msyjczffsg .gt_empty_group_heading { padding: 0.5px; color: #333333; background-color: #FFFFFF; @@ -20262,15 +20262,15 @@

Quarto

vertical-align: middle; } -#wkbhfqnlym .gt_from_md > :first-child { +#msyjczffsg .gt_from_md > :first-child { margin-top: 0; } -#wkbhfqnlym .gt_from_md > :last-child { +#msyjczffsg .gt_from_md > :last-child { margin-bottom: 0; } -#wkbhfqnlym .gt_row { +#msyjczffsg .gt_row { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; @@ -20289,7 +20289,7 @@

Quarto

overflow-x: hidden; } -#wkbhfqnlym .gt_stub { +#msyjczffsg .gt_stub { color: #333333; background-color: #FFFFFF; font-size: 100%; @@ -20302,7 +20302,7 @@

Quarto

padding-right: 5px; } -#wkbhfqnlym .gt_stub_row_group { +#msyjczffsg .gt_stub_row_group { color: #333333; background-color: #FFFFFF; font-size: 100%; @@ -20316,15 +20316,15 @@

Quarto

vertical-align: top; } -#wkbhfqnlym .gt_row_group_first td { +#msyjczffsg .gt_row_group_first td { border-top-width: 2px; } -#wkbhfqnlym .gt_row_group_first th { +#msyjczffsg .gt_row_group_first th { border-top-width: 2px; } -#wkbhfqnlym .gt_summary_row { +#msyjczffsg .gt_summary_row { color: #333333; background-color: #FFFFFF; text-transform: inherit; @@ -20334,16 +20334,16 @@

Quarto

padding-right: 5px; } -#wkbhfqnlym .gt_first_summary_row { +#msyjczffsg .gt_first_summary_row { border-top-style: solid; border-top-color: #D3D3D3; } -#wkbhfqnlym .gt_first_summary_row.thick { +#msyjczffsg .gt_first_summary_row.thick { border-top-width: 2px; } -#wkbhfqnlym .gt_last_summary_row { +#msyjczffsg .gt_last_summary_row { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; @@ -20353,7 +20353,7 @@

Quarto

border-bottom-color: #D3D3D3; } -#wkbhfqnlym .gt_grand_summary_row { +#msyjczffsg .gt_grand_summary_row { color: #333333; background-color: #FFFFFF; text-transform: inherit; @@ -20363,7 +20363,7 @@

Quarto

padding-right: 5px; } -#wkbhfqnlym .gt_first_grand_summary_row { +#msyjczffsg .gt_first_grand_summary_row { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; @@ -20373,7 +20373,7 @@

Quarto

border-top-color: #D3D3D3; } -#wkbhfqnlym .gt_last_grand_summary_row_top { +#msyjczffsg .gt_last_grand_summary_row_top { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; @@ -20383,11 +20383,11 @@

Quarto

border-bottom-color: #D3D3D3; } -#wkbhfqnlym .gt_striped { +#msyjczffsg .gt_striped { background-color: rgba(128, 128, 128, 0.05); } -#wkbhfqnlym .gt_table_body { +#msyjczffsg .gt_table_body { border-top-style: solid; border-top-width: 2px; border-top-color: #D3D3D3; @@ -20396,7 +20396,7 @@

Quarto

border-bottom-color: #D3D3D3; } -#wkbhfqnlym .gt_footnotes { +#msyjczffsg .gt_footnotes { color: #333333; background-color: #FFFFFF; border-bottom-style: none; @@ -20410,7 +20410,7 @@

Quarto

border-right-color: #D3D3D3; } -#wkbhfqnlym .gt_footnote { +#msyjczffsg .gt_footnote { margin: 0px; font-size: 90%; padding-top: 4px; @@ -20419,7 +20419,7 @@

Quarto

padding-right: 5px; } -#wkbhfqnlym .gt_sourcenotes { +#msyjczffsg .gt_sourcenotes { color: #333333; background-color: #FFFFFF; border-bottom-style: none; @@ -20433,7 +20433,7 @@

Quarto

border-right-color: #D3D3D3; } -#wkbhfqnlym .gt_sourcenote { +#msyjczffsg .gt_sourcenote { font-size: 90%; padding-top: 4px; padding-bottom: 4px; @@ -20441,63 +20441,63 @@

Quarto

padding-right: 5px; } -#wkbhfqnlym .gt_left { +#msyjczffsg .gt_left { text-align: left; } -#wkbhfqnlym .gt_center { +#msyjczffsg .gt_center { text-align: center; } -#wkbhfqnlym .gt_right { +#msyjczffsg .gt_right { text-align: right; font-variant-numeric: tabular-nums; } -#wkbhfqnlym .gt_font_normal { +#msyjczffsg .gt_font_normal { font-weight: normal; } -#wkbhfqnlym .gt_font_bold { +#msyjczffsg .gt_font_bold { font-weight: bold; } -#wkbhfqnlym .gt_font_italic { +#msyjczffsg .gt_font_italic { font-style: italic; } -#wkbhfqnlym .gt_super { +#msyjczffsg .gt_super { font-size: 65%; } -#wkbhfqnlym .gt_footnote_marks { +#msyjczffsg .gt_footnote_marks { font-size: 75%; vertical-align: 0.4em; position: initial; } -#wkbhfqnlym .gt_asterisk { +#msyjczffsg .gt_asterisk { font-size: 100%; vertical-align: 0; } -#wkbhfqnlym .gt_indent_1 { +#msyjczffsg .gt_indent_1 { text-indent: 5px; } -#wkbhfqnlym .gt_indent_2 { +#msyjczffsg .gt_indent_2 { text-indent: 10px; } -#wkbhfqnlym .gt_indent_3 { +#msyjczffsg .gt_indent_3 { text-indent: 15px; } -#wkbhfqnlym .gt_indent_4 { +#msyjczffsg .gt_indent_4 { text-indent: 20px; } -#wkbhfqnlym .gt_indent_5 { +#msyjczffsg .gt_indent_5 { text-indent: 25px; } </style> @@ -20656,23 +20656,23 @@

Quarto

side_by_side(models, estimates = estimates)
-
- @@ -21164,23 +21164,23 @@

Bootstrap

modelsummary(mod)
-
- @@ -21614,23 +21614,23 @@

Bootstrap

(Intercept) -328.764 -284.150 +326.421 +284.858 -(31.484) -(42.113) +(29.737) +(44.865) mpg --9.081 --9.994 +-9.000 +-10.117 -(1.420) -(2.411) +(1.364) +(2.485) drat -17.396 +17.795 -(20.667) +(21.775) Num.Obs. 32 32 @@ -21691,23 +21691,23 @@

modelsummary(mod)

-
- @@ -22140,41 +22140,41 @@

fit_x_endo_1 -0.772 +0.424 -(1.947) +(0.103) fit_x_endo_2 --5.721 +0.798 -(37.320) +(0.272) x1 -0.646 +0.485 -(0.459) +(0.059) Num.Obs. 150 R2 --11.399 +0.683 R2 Adj. --11.830 +0.672 R2 Within --31.519 +0.168 R2 Within Adj. --32.197 +0.150 AIC -757.7 +207.9 BIC -775.8 +226.0 RMSE -2.91 +0.46 Std.Errors by: fe FE: fe X Wald (x_endo_1) -15.65305145663 +77.5359206163405 Wald (x_endo_2) -0.0222671859109197 +49.3216288080678 @@ -22221,23 +22221,23 @@

Multiple imputationmodelsummary(mod)

-
- @@ -22672,31 +22672,31 @@

Multiple imputation (Intercept) -63.166 -56.037 -68.298 +63.885 +71.548 +67.067 -(15.624) -(13.336) -(12.735) +(16.416) +(16.109) +(13.554) Literacy --0.303 --0.215 --0.406 +-0.266 +-0.436 +-0.360 -(0.250) -(0.207) -(0.206) +(0.269) +(0.249) +(0.215) Commerce --0.136 --0.082 --0.184 +-0.235 +-0.270 +-0.233 -(0.164) -(0.158) -(0.140) +(0.170) +(0.171) +(0.145) Num.Obs. -59 +60 86 86 Num.Imp. @@ -22704,27 +22704,27 @@

Multiple imputation5 5 R2 -0.026 -0.018 -0.054 +0.033 +0.062 +0.049 R2 Adj. --0.009 +-0.001 -0.030 +0.025 AIC -549.2 +564.1 BIC -557.5 +572.5 Log.Lik. --270.576 +-278.064 RMSE -23.74 +24.91 @@ -22842,13 +22842,13 @@

How can I x <- broom::tidy(mod) toc()

-
tidy: 0.003 sec elapsed
+
tidy: 0.002 sec elapsed
tic("glance")
 x <- broom::glance(mod)
 toc()
-
glance: 0.003 sec elapsed
+
glance: 0.004 sec elapsed
tic("parameters")
 x <- parameters::parameters(mod)
@@ -22869,23 +22869,23 @@ 

How can I
modelsummary(mod, metrics = "rmse")
-
- @@ -23393,10 +23393,10 @@

Bayesian models

get_estimates(mod)
-
         term     estimate        mad conf.level    conf.low    conf.high
-1 (Intercept) -2.230796826 0.58680702       0.95 -3.40385688 -1.055676364
-2          hp  0.000704079 0.00103711       0.95 -0.00158824  0.002788848
-3        drat  0.705052544 0.13876995       0.95  0.43515855  0.984934926
+
         term      estimate         mad conf.level     conf.low    conf.high
+1 (Intercept) -2.2475345916 0.555314871       0.95 -3.408237583 -1.058208927
+2          hp  0.0007033686 0.001063287       0.95 -0.001363318  0.002914953
+3        drat  0.7069798264 0.133722428       0.95  0.429156919  0.985993244
   prior.distribution prior.location prior.scale group std.error statistic
 1             normal        0.40625  1.24747729              NA        NA
 2             normal        0.00000  0.01819465              NA        NA
@@ -23423,23 +23423,23 @@ 

Bayesian models

-
- @@ -23872,35 +23872,35 @@

Bayesian models

(Intercept) --2.231 +-2.248 -(0.587) +(0.555) hp 0.001 (0.001) drat -0.705 +0.707 -(0.139) +(0.134) Num.Obs. 32 R2 -0.498 +0.501 R2 Adj. -0.421 +0.433 Log.Lik. --12.037 +-12.064 ELPD --15.3 +-15.0 ELPD s.e. -3.2 +3.1 LOOIC -30.5 +30.0 LOOIC s.e. -6.4 +6.2 WAIC -30.1 +29.8 RMSE 0.34 @@ -23914,10 +23914,10 @@

Bayesian models

get_estimates(mod, centrality = "mean")
-
         term      estimate     std.dev conf.level    conf.low    conf.high
-1 (Intercept) -2.2308585627 0.592540388       0.95 -3.40385688 -1.055676364
-2          hp  0.0006978276 0.001105456       0.95 -0.00158824  0.002788848
-3        drat  0.7044091550 0.139055255       0.95  0.43515855  0.984934926
+
         term      estimate     std.dev conf.level     conf.low    conf.high
+1 (Intercept) -2.2390156302 0.586604330       0.95 -3.408237583 -1.058208927
+2          hp  0.0007145923 0.001079799       0.95 -0.001363318  0.002914953
+3        drat  0.7062254928 0.138544254       0.95  0.429156919  0.985993244
   prior.distribution prior.location prior.scale group std.error statistic
 1             normal        0.40625  1.24747729              NA        NA
 2             normal        0.00000  0.01819465              NA        NA
@@ -23930,23 +23930,23 @@ 

Bayesian models

modelsummary(mod, statistic = "std.dev", centrality = "mean")
-
- @@ -24379,35 +24379,35 @@

Bayesian models

(Intercept) --2.231 +-2.239 -(0.593) +(0.587) hp 0.001 (0.001) drat -0.704 +0.706 (0.139) Num.Obs. 32 R2 -0.498 +0.501 R2 Adj. -0.421 +0.433 Log.Lik. --12.037 +-12.064 ELPD --15.3 +-15.0 ELPD s.e. -3.2 +3.1 LOOIC -30.5 +30.0 LOOIC s.e. -6.4 +6.2 WAIC -30.1 +29.8 RMSE 0.34 @@ -24421,14 +24421,14 @@

Bayesian models

get_estimates(mod, test = c("pd", "rope"))
-
         term     estimate        mad conf.level    conf.low    conf.high
-1 (Intercept) -2.230796826 0.58680702       0.95 -3.40385688 -1.055676364
-2          hp  0.000704079 0.00103711       0.95 -0.00158824  0.002788848
-3        drat  0.705052544 0.13876995       0.95  0.43515855  0.984934926
-       pd rope.percentage prior.distribution prior.location prior.scale group
-1 1.00000               0             normal        0.40625  1.24747729      
-2 0.75175               1             normal        0.00000  0.01819465      
-3 1.00000               0             normal        0.00000  2.33313429      
+
         term      estimate         mad conf.level     conf.low    conf.high
+1 (Intercept) -2.2475345916 0.555314871       0.95 -3.408237583 -1.058208927
+2          hp  0.0007033686 0.001063287       0.95 -0.001363318  0.002914953
+3        drat  0.7069798264 0.133722428       0.95  0.429156919  0.985993244
+     pd rope.percentage prior.distribution prior.location prior.scale group
+1 1.000               0             normal        0.40625  1.24747729      
+2 0.747               1             normal        0.00000  0.01819465      
+3 1.000               0             normal        0.00000  2.33313429      
   std.error statistic p.value
 1        NA        NA      NA
 2        NA        NA      NA