mean_tbl() calculates summary statistics (i.e., mean, standard deviation, minimum, maximum, and count of non-missing values) for interval and ratio-level variables that share a common prefix (i.e., variable stem). A variable 'stem' is a shared naming pattern across related variables, often representing repeated measures of the same concept or a series of items measuring a single construct. Missing data are excluded using listwise deletion by default.
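To make the idea of a variable stem concrete, here is a minimal base R sketch that picks out the columns sharing a prefix. The `survey` data frame and its column names are hypothetical, and `startsWith()` is used only for illustration; it is not a description of how mean_tbl() matches columns internally.

```r
# Hypothetical data: three items measuring one construct, plus an unrelated column
survey <- data.frame(
  stress_1 = c(1, 3, 5),
  stress_2 = c(2, 4, 4),
  stress_3 = c(5, 1, 2),
  age      = c(31, 45, 27)
)

# The shared naming pattern (variable stem)
stem <- "stress"

# Keep the names of columns that begin with the stem
stem_cols <- names(survey)[startsWith(names(survey), stem)]
stem_cols
```

Here the three `stress_*` columns share the stem, while `age` does not.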

Usage

mean_tbl(
  data,
  var_stem,
  escape_stem = FALSE,
  ignore_stem_case = FALSE,
  na_removal = "listwise",
  only = NULL,
  var_labels = NULL,
  ignore = NULL
)

Arguments

data

A data frame.

var_stem

A character string of a variable stem or the full name of a variable in data.

escape_stem

A logical value indicating whether to escape var_stem. Default is FALSE.

ignore_stem_case

A logical value indicating whether the search for columns matching the supplied var_stem is case-insensitive. Default is FALSE.

na_removal

A character string that specifies the method for handling missing values: "pairwise" or "listwise". Defaults to "listwise".

only

A character string or vector of character strings specifying which summary statistics to return. Defaults to NULL, which includes mean (mean), standard deviation (sd), minimum (min), maximum (max), and count of non-missing values (nobs).

var_labels

An optional named character vector or list used to assign custom labels to variable names. Each element should be named and correspond to a variable in the returned table. If any element is unnamed or references a variable not returned in the table, all labels will be ignored and the table will be printed without them.

ignore

An optional vector of values to exclude from variables matching the specified variable stem. Defaults to NULL, which retains all values.
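The difference between the two na_removal options can be sketched in base R. This is an illustration of listwise and pairwise deletion in general, using a hypothetical `items` data frame; it is not mean_tbl()'s own implementation.

```r
# Hypothetical data with one missing value in each column
items <- data.frame(
  item_1 = c(1, 2, NA, 4),
  item_2 = c(2, NA, 6, 8)
)

# Listwise deletion: drop every row that has a missing value in any column,
# then summarize the remaining complete rows
listwise_rows <- items[stats::complete.cases(items), ]
listwise_mean <- sapply(listwise_rows, mean)
listwise_nobs <- sapply(listwise_rows, length)

# Pairwise deletion: each column drops only its own missing values
pairwise_mean <- sapply(items, mean, na.rm = TRUE)
pairwise_nobs <- sapply(items, function(x) sum(!is.na(x)))
```

Under listwise deletion every variable is summarized over the same rows, so the counts of non-missing values are identical across variables. Under pairwise deletion the counts can differ from one variable to the next.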

Value

A tibble showing summary statistics for continuous variables sharing a common variable stem.

Author

Ama Nyame-Mensah

Examples


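# Select the four child age-group columns from the sdoh data set
sdoh_child_ages <- dplyr::select(sdoh, c(ACS_PCT_AGE_0_4, ACS_PCT_AGE_5_9,
                                         ACS_PCT_AGE_10_14, ACS_PCT_AGE_15_17))

# Summarize all variables sharing the "ACS_PCT_AGE" stem (listwise deletion by default)
mean_tbl(data = sdoh_child_ages, var_stem = "ACS_PCT_AGE")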
#> # A tibble: 4 × 6
#>   variable           mean    sd   min   max  nobs
#>   <chr>             <dbl> <dbl> <dbl> <dbl> <int>
#> 1 ACS_PCT_AGE_0_4    5.72 1.29   0.23  18.4  3221
#> 2 ACS_PCT_AGE_5_9    6.01 1.31   0     14.9  3221
#> 3 ACS_PCT_AGE_10_14  6.42 1.25   0     13.6  3221
#> 4 ACS_PCT_AGE_15_17  3.86 0.730  0     11.9  3221

# Same summary using pairwise deletion and custom variable labels
mean_tbl(data = sdoh_child_ages,
         var_stem = "ACS_PCT_AGE",
         na_removal = "pairwise",
         var_labels = c(ACS_PCT_AGE_0_4 = "Percentage of population between ages 0-4",
                        ACS_PCT_AGE_5_9 = "Percentage of population between ages 5-9",
                        ACS_PCT_AGE_10_14 = "Percentage of population between ages 10-14",
                        ACS_PCT_AGE_15_17 = "Percentage of population between ages 15-17"))
#> # A tibble: 4 × 7
#>   variable          variable_label                  mean    sd   min   max  nobs
#>   <chr>             <chr>                          <dbl> <dbl> <dbl> <dbl> <int>
#> 1 ACS_PCT_AGE_0_4   Percentage of population betw…  5.72 1.29   0.23  18.4  3221
#> 2 ACS_PCT_AGE_5_9   Percentage of population betw…  6.01 1.31   0     14.9  3221
#> 3 ACS_PCT_AGE_10_14 Percentage of population betw…  6.42 1.25   0     13.6  3221
#> 4 ACS_PCT_AGE_15_17 Percentage of population betw…  3.86 0.730  0     11.9  3221