Summarize continuous variables
mean_tbl() calculates summary statistics (i.e., mean, standard deviation, minimum, maximum, and count of non-missing values) for interval- and ratio-level variables that share a common prefix (i.e., variable stem). A variable 'stem' is a shared naming pattern across related variables, often representing repeated measures of the same concept or a series of items measuring a single construct. Missing data are excluded using listwise deletion by default.
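The difference between listwise and pairwise deletion is easiest to see on a small example. The following is a minimal base R sketch, not the package's implementation: the toy data frame, the stem "item_", and the prefix match via grep() are all assumptions made for illustration.

```r
# Toy data: three items sharing the hypothetical stem "item_", plus an unrelated column.
toy <- data.frame(
  id     = 1:5,
  item_1 = c(1, 2, NA, 4, 5),
  item_2 = c(2, NA, 3, 4, 5),
  item_3 = c(1, 2, 3, 4, 5)
)

# Select the columns whose names start with the stem
# (prefix matching is an assumption about how a stem is resolved).
stem_cols <- grep("^item_", names(toy), value = TRUE)
items <- toy[stem_cols]

# Listwise deletion: keep only rows that are complete on every stem variable.
listwise <- items[complete.cases(items), , drop = FALSE]
listwise_nobs <- vapply(listwise, function(x) sum(!is.na(x)), integer(1))
listwise_mean <- vapply(listwise, mean, numeric(1))

# Pairwise deletion: each variable drops only its own missing values.
pairwise_nobs <- vapply(items, function(x) sum(!is.na(x)), integer(1))
pairwise_mean <- vapply(items, mean, numeric(1), na.rm = TRUE)

print(rbind(listwise_nobs, pairwise_nobs))
print(rbind(listwise_mean, pairwise_mean))
```

Under listwise deletion every variable is summarized over the same three complete rows, so the counts match across variables. Under pairwise deletion each variable keeps all of its own observed values, so counts and means can differ from one variable to the next.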
Usage
mean_tbl(
data,
var_stem,
escape_stem = FALSE,
ignore_stem_case = FALSE,
na_removal = "listwise",
only = NULL,
var_labels = NULL,
ignore = NULL
)
Arguments
- data
A data frame.
- var_stem
A character string of a variable stem or the full name of a variable in data.
- escape_stem
A logical value indicating whether to escape var_stem. Default is FALSE.
- ignore_stem_case
A logical value indicating whether the search for columns matching the supplied var_stem is case-insensitive. Default is FALSE.
- na_removal
A character string that specifies the method for handling missing values: "pairwise" or "listwise". Defaults to "listwise".
- only
A character string or vector of character strings specifying which summary statistics to return. Defaults to NULL, which includes mean (mean), standard deviation (sd), minimum (min), maximum (max), and count of non-missing values (nobs).
- var_labels
An optional named character vector or list used to assign custom labels to variable names. Each element should be named and correspond to a variable in the returned table. If any element is unnamed or references a variable not returned in the table, all labels will be ignored and the table will be printed without them.
- ignore
An optional vector of values to exclude from variables matching the specified variable stem. Defaults to NULL, which retains all values.
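To make the roles of only and ignore concrete, here is a small base R sketch for a single variable. It is an illustration under stated assumptions, not the package's code: summarize_one() is a hypothetical helper, the value -99 is an invented missing-data code, and treating ignored values as missing is one plausible reading of "exclude".

```r
# Hypothetical helper, not part of the package: summarizes one numeric vector.
summarize_one <- function(x, only = NULL, ignore = NULL) {
  # Assumption: values listed in `ignore` are treated as missing.
  if (!is.null(ignore)) {
    x[x %in% ignore] <- NA
  }
  x <- x[!is.na(x)]

  # The five statistics named in the argument description.
  stats <- list(
    mean = mean(x),
    sd   = sd(x),
    min  = min(x),
    max  = max(x),
    nobs = length(x)
  )

  # `only = NULL` returns everything; otherwise keep the requested statistics.
  if (!is.null(only)) {
    stats <- stats[only]
  }
  stats
}

scores <- c(3, 5, -99, 4, NA, 4)
res <- summarize_one(scores, only = c("mean", "nobs"), ignore = -99)
print(res)
```

Here the -99 code and the NA are both dropped before summarizing, so the mean and count are computed from the four remaining values, and only the two requested statistics are returned.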
Examples
sdoh_child_ages <- dplyr::select(sdoh, c(ACS_PCT_AGE_0_4, ACS_PCT_AGE_5_9,
                                         ACS_PCT_AGE_10_14, ACS_PCT_AGE_15_17))

mean_tbl(data = sdoh_child_ages, var_stem = "ACS_PCT_AGE")
#> # A tibble: 4 × 6
#>   variable           mean    sd   min   max  nobs
#>   <chr>             <dbl> <dbl> <dbl> <dbl> <int>
#> 1 ACS_PCT_AGE_0_4    5.72 1.29   0.23  18.4  3221
#> 2 ACS_PCT_AGE_5_9    6.01 1.31   0     14.9  3221
#> 3 ACS_PCT_AGE_10_14  6.42 1.25   0     13.6  3221
#> 4 ACS_PCT_AGE_15_17  3.86 0.730  0     11.9  3221

mean_tbl(data = sdoh_child_ages,
         var_stem = "ACS_PCT_AGE",
         na_removal = "pairwise",
         var_labels = c(ACS_PCT_AGE_0_4 = "Percentage of population between ages 0-4",
                        ACS_PCT_AGE_5_9 = "Percentage of population between ages 5-9",
                        ACS_PCT_AGE_10_14 = "Percentage of population between ages 10-14",
                        ACS_PCT_AGE_15_17 = "Percentage of population between ages 15-17"))
#> # A tibble: 4 × 7
#>   variable          variable_label                  mean    sd   min   max  nobs
#>   <chr>             <chr>                          <dbl> <dbl> <dbl> <dbl> <int>
#> 1 ACS_PCT_AGE_0_4   Percentage of population betw…  5.72 1.29   0.23  18.4  3221
#> 2 ACS_PCT_AGE_5_9   Percentage of population betw…  6.01 1.31   0     14.9  3221
#> 3 ACS_PCT_AGE_10_14 Percentage of population betw…  6.42 1.25   0     13.6  3221
#> 4 ACS_PCT_AGE_15_17 Percentage of population betw…  3.86 0.730  0     11.9  3221