select_tbl() displays frequency counts and percentages for multiple response variables (e.g., a series of questions where participants answer "Yes" or "No" to each item) and for ordinal variables (e.g., Likert or Likert-type items with responses ranging from "Strongly Disagree" to "Strongly Agree", where respondents select one response per statement, question, or item).

select_tbl(
  data,
  var_stem,
  var_input = "stem",
  regex_stem = FALSE,
  ignore_stem_case = FALSE,
  na_removal = "listwise",
  pivot = "longer",
  only = NULL,
  var_labels = NULL,
  ignore = NULL,
  force_pivot = FALSE
)

Arguments

data

A data frame.

var_stem

A character vector with one or more elements, where each represents either a variable stem or the complete name of a variable present in data. A variable 'stem' refers to a common naming pattern shared among related variables, typically reflecting repeated measures of the same idea or a group of items assessing a single concept.

var_input

A character string specifying whether the values supplied to var_stem should be treated as variable stems (stem) or as complete variable names (name). By default, this is set to stem, so the function searches for variables that begin with each stem provided. Setting this argument to name directs the function to look for variables that exactly match the provided names.
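
The difference between the two matching modes can be sketched in base R. This is only an illustration of prefix matching versus exact matching, not the package's internal code, and the column names are hypothetical.

```r
# Hypothetical column names, for illustration only.
cols <- c("dep_1", "dep_2", "deploy_date", "involved_arts")

# var_input = "stem": keep columns whose names begin with the stem.
stem_matches <- cols[startsWith(cols, "dep")]

# var_input = "name": keep only columns whose names match exactly.
name_matches <- cols[cols %in% "dep_1"]

print(stem_matches)
print(name_matches)
```

Note that a short stem such as "dep" also picks up "deploy_date" in this sketch, so a more specific stem (for example "dep_") may be the safer choice.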

regex_stem

A logical value indicating whether to use Perl-compatible regular expressions when searching for variable stems. Default is FALSE.

ignore_stem_case

A logical value indicating whether the search for columns matching the supplied var_stem is case-insensitive. Default is FALSE.
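
As a rough illustration of what regex_stem and ignore_stem_case control, the base R sketch below matches hypothetical column names with grepl(). How select_tbl() applies the pattern internally (for example, whether it anchors it) is not stated here, so treat the pattern as an assumption.

```r
# Hypothetical column names, for illustration only.
cols <- c("dep_1", "dep_2", "DEP_3", "indep_1")

# Perl-compatible pattern, anchored so that "indep_1" is not matched.
pattern <- "^dep_[0-9]+$"

# Case-sensitive search (comparable to ignore_stem_case = FALSE).
regex_matches <- cols[grepl(pattern, cols, perl = TRUE)]

# Case-insensitive search (comparable to ignore_stem_case = TRUE).
nocase_matches <- cols[grepl(pattern, cols, perl = TRUE, ignore.case = TRUE)]

print(regex_matches)
print(nocase_matches)
```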

na_removal

A character string specifying how missing values are handled. Must be one of listwise or pairwise. Defaults to listwise.

  • listwise: Removes any row that has at least one missing value across all variables returned or analyzed. (Effectively uses complete cases only.)

  • pairwise: Handles missing values per variable or per pair of variables, using all available data, even if other variables in the row have missing values.
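
The two strategies can be compared on a small made-up data frame. This base R sketch only shows how the number of usable rows differs; it does not reproduce the function's own implementation.

```r
# Hypothetical data with missing values, for illustration only.
df <- data.frame(
  q1 = c(1, 0, NA, 1, 0),
  q2 = c(0, NA, 1, 1, 0)
)

# Listwise: drop every row with a missing value in any selected column,
# so all variables are summarised on the same set of rows.
listwise_df <- df[complete.cases(df), ]
n_listwise <- nrow(listwise_df)

# Pairwise: each column keeps all of its own non-missing values,
# so the denominator can differ from one variable to the next.
n_pairwise <- sapply(df, function(x) sum(!is.na(x)))

print(n_listwise)
print(n_pairwise)
```

This is why, in the examples further down, percentages under pairwise removal are computed on different totals for different variables.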

pivot

A character string that determines the format of the table. By default, longer returns the data in the long format. To receive the data in the wide format, specify wider.
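
To show what the two layouts look like, the sketch below reshapes a small hypothetical long table into a wide one with base R's reshape(). The column naming pattern (count_value_<value>) mirrors the example output on this page; the package may build the wide table differently.

```r
# Hypothetical long-format counts, one row per variable/value pair.
long <- data.frame(
  variable = rep(c("q1", "q2"), each = 2),
  values   = rep(c(0, 1), times = 2),
  count    = c(3L, 2L, 4L, 1L)
)

# Wide format: one row per variable, one count column per value.
wide <- reshape(
  long,
  idvar     = "variable",
  timevar   = "values",
  v.names   = "count",
  direction = "wide",
  sep       = "_value_"
)

print(wide)
```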

only

A character string or character vector specifying which summary statistics to return. Default is NULL, which returns both counts and percentages. To return only counts or only percentages, use count or percent, respectively.

var_labels

An optional named character vector or list used to assign custom labels to variable names. Each element must be named and correspond to a variable included in the returned table. If var_input is set to stem, and any element is either unnamed or refers to a variable not present in the table, all labels will be ignored and the table will be printed without them.
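
A minimal sketch of the labelling rule described above, written in base R with a hypothetical summary table. The check and the lookup are assumptions about how such a rule could be implemented, not the package's actual code.

```r
# Labels taken from the example on this page; the table is hypothetical.
labels <- c(
  dep_1 = "how often child feels sad and blue",
  dep_3 = "how often child feels happy"
)
tbl <- data.frame(
  variable = c("dep_1", "dep_1", "dep_3"),
  count    = c(3L, 5L, 4L),
  stringsAsFactors = FALSE
)

# Labels are usable only if every element is named and refers to a
# variable that appears in the table.
labels_ok <- !is.null(names(labels)) &&
  all(names(labels) != "") &&
  all(names(labels) %in% tbl$variable)

# Attach labels by name lookup; otherwise leave the table unlabelled.
if (labels_ok) {
  tbl$variable_label <- unname(labels[tbl$variable])
}

print(tbl)
```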

ignore

An optional named vector or list indicating values to exclude from variables matching specified stems (or names). Defaults to NULL, indicating that all values are retained. To specify exclusions for variables identified by var_stem, use the corresponding stems or variable names as names in the vector or list. To exclude multiple values from these variables, supply them as a named list.
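
The idea of a named list keyed by stem can be sketched in base R as follows. The exclusion codes 98 and 99 are hypothetical, and treating excluded values as missing is an assumption made for this illustration; the function may handle them differently.

```r
# Hypothetical data in which 98 and 99 are "don't know"/"refused" codes.
df <- data.frame(
  dep_1 = c(1, 2, 3, 99),
  dep_2 = c(2, 99, 98, 1)
)

# Named list: the name is the stem, the element holds the values to drop.
ignore <- list(dep = c(98, 99))

stem <- names(ignore)[1]
cols <- names(df)[startsWith(names(df), stem)]

# Assumption for this sketch: excluded values are recoded to NA
# before counting.
cleaned <- df
cleaned[cols] <- lapply(df[cols], function(x) {
  replace(x, x %in% ignore[[stem]], NA)
})

print(cleaned)
```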

force_pivot

A logical value that enables pivoting to the 'wider' format even when variables have inconsistent value sets. By default, this is set to FALSE to prevent reshaping errors when values differ across variables in the returned table. Set to TRUE to override this safeguard and pivot to the 'wider' format regardless of value inconsistencies.
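
The kind of inconsistency this safeguard targets can be checked by hand. The base R sketch below compares the sets of observed values across hypothetical variables; it is an assumption about what "inconsistent value sets" means in practice, not the function's own check.

```r
# Hypothetical observed values per variable.
value_sets <- list(
  q1 = c(0, 1),
  q2 = c(1, 0),
  q3 = c(1, 2, 3)
)

# Sort each set so that order does not matter, then compare.
sorted_sets <- lapply(value_sets, sort)
consistent_all <- length(unique(unname(sorted_sets))) == 1
consistent_q1_q2 <- length(unique(unname(sorted_sets[c("q1", "q2")]))) == 1

print(consistent_all)    # q3 uses different values than q1 and q2
print(consistent_q1_q2)  # q1 and q2 share the same values
```

When the sets differ, a wide table has columns that do not apply to some variables, which is the situation force_pivot = TRUE allows anyway.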

Value

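A tibble displaying the count and percentage for each category of each selected multiple response or ordinal variable. If only is set, the tibble contains only the requested statistic, and its layout (long or wide) follows pivot. Percentages are reported as proportions between 0 and 1.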

Author

Ama Nyame-Mensah

Examples

select_tbl(data = tas,
           var_stem = "involved_",
           na_removal = "pairwise")
#> # A tibble: 12 × 4
#>    variable                  values count percent
#>    <chr>                      <dbl> <int>   <dbl>
#>  1 involved_arts                  0  2127  0.842 
#>  2 involved_arts                  1   399  0.158 
#>  3 involved_sports                0  2114  0.837 
#>  4 involved_sports                1   412  0.163 
#>  5 involved_schoolClubs           0  2127  0.858 
#>  6 involved_schoolClubs           1   352  0.142 
#>  7 involved_election              0  1028  0.452 
#>  8 involved_election              1  1248  0.548 
#>  9 involved_socialActionGrps      0  2419  0.958 
#> 10 involved_socialActionGrps      1   107  0.0424
#> 11 involved_volunteer             0  1732  0.686 
#> 12 involved_volunteer             1   794  0.314 

select_tbl(data = depressive,
           var_stem = "dep",
           na_removal = "listwise",
           pivot = "wider",
           only = "percent")
#> # A tibble: 8 × 4
#>   variable percent_value_1 percent_value_2 percent_value_3
#>   <chr>              <dbl>           <dbl>           <dbl>
#> 1 dep_1             0.0678           0.429          0.503 
#> 2 dep_2             0.0896           0.464          0.446 
#> 3 dep_3             0.723            0.244          0.0330
#> 4 dep_4             0.374            0.520          0.106 
#> 5 dep_5             0.121            0.346          0.533 
#> 6 dep_6             0.241            0.535          0.224 
#> 7 dep_7             0.640            0.305          0.0554
#> 8 dep_8             0.197            0.488          0.315 

var_label_example <-
  c("dep_1" = "how often child feels sad and blue",
    "dep_2" = "how often child feels nervous, tense, or on edge",
    "dep_3" = "how often child feels happy",
    "dep_4" = "how often child feels bored",
    "dep_5" = "how often child feels lonely",
    "dep_6" = "how often child feels tired or worn out",
    "dep_7" = "how often child feels excited about something",
    "dep_8" = "how often child feels too busy to get everything")

select_tbl(data = depressive,
           var_stem = "dep",
           na_removal = "pairwise",
           pivot = "longer",
           var_labels = var_label_example)
#> # A tibble: 24 × 5
#>    variable variable_label                                  values count percent
#>    <chr>    <chr>                                            <int> <int>   <dbl>
#>  1 dep_1    how often child feels sad and blue                   1   120  0.0726
#>  2 dep_1    how often child feels sad and blue                   2   709  0.429 
#>  3 dep_1    how often child feels sad and blue                   3   825  0.499 
#>  4 dep_2    how often child feels nervous, tense, or on ed…      1   151  0.0920
#>  5 dep_2    how often child feels nervous, tense, or on ed…      2   762  0.464 
#>  6 dep_2    how often child feels nervous, tense, or on ed…      3   728  0.444 
#>  7 dep_3    how often child feels happy                          1  1192  0.721 
#>  8 dep_3    how often child feels happy                          2   406  0.246 
#>  9 dep_3    how often child feels happy                          3    55  0.0333
#> 10 dep_4    how often child feels bored                          1   611  0.371 
#> # ℹ 14 more rows

select_tbl(data = depressive,
           var_stem = "dep",
           na_removal = "pairwise",
           pivot = "wider",
           only = "count",
           var_labels = var_label_example)
#> # A tibble: 8 × 5
#>   variable variable_label              count_value_1 count_value_2 count_value_3
#>   <chr>    <chr>                               <int>         <int>         <int>
#> 1 dep_1    how often child feels sad …           120           709           825
#> 2 dep_2    how often child feels nerv…           151           762           728
#> 3 dep_3    how often child feels happy          1192           406            55
#> 4 dep_4    how often child feels bored           611           856           181
#> 5 dep_5    how often child feels lone…           206           574           871
#> 6 dep_6    how often child feels tire…           399           879           371
#> 7 dep_7    how often child feels exci…          1046           507            95
#> 8 dep_8    how often child feels too …           323           801           519