This function prepares a dataset for statistical analysis by categorizing variables into continuous and categorical types. It automatically checks normality, equality of variances, and expected-frequency assumptions.
add_var(data, var = NULL, group = "group", norm = "auto", center = "median")

data: A data frame containing the variables to analyze, with variables in columns and observations in rows.
var: A character vector of variable names to include. If NULL (the default), all columns except the group column are used.
group: A character string naming the grouping variable in data. Defaults to 'group'.
norm: Control parameter for normality tests. Accepts:
  'auto' (default): Decide automatically based on p-values; behaves like 'ask' when n > 1000
  'ask': Show p-values and QQ plots, then prompt for a decision
  TRUE/'true': Always assume the data are normally distributed
  FALSE/'false': Always assume the data are not normally distributed
center: A character string specifying the center to use in Levene's test for equality of variances. Default is 'median', which is more robust than the mean.
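
To make the normality and equal-variance checks concrete, here is a minimal base-R sketch of what such checks typically look like. It is not add_var()'s own implementation: the choice of the Shapiro-Wilk test, the 0.05 threshold, and the variable names (values, groups, alpha) are assumptions made for illustration. The median-centered Levene's test is computed by hand as a one-way ANOVA on absolute deviations from the group medians.

```r
# Sketch only: base-R versions of the checks described above,
# not the code that add_var() runs internally.
values <- iris$Sepal.Length
groups <- iris$Species

# Shapiro-Wilk normality test within each group (one p-value per group).
shapiro_p <- tapply(values, groups, function(x) shapiro.test(x)$p.value)

# Median-centered Levene's test (Brown-Forsythe variant): a one-way ANOVA
# on the absolute deviation of each observation from its group median.
abs_dev <- abs(values - ave(values, groups, FUN = median))
levene_fit <- anova(lm(abs_dev ~ groups))
levene_p <- levene_fit[["Pr(>F)"]][1]

alpha <- 0.05  # assumed significance level
all_normal <- all(shapiro_p > alpha)
equal_variance <- levene_p > alpha
```

Centering on the median rather than the mean makes the deviations less sensitive to skewed or heavy-tailed groups, which is the robustness the center argument refers to. If the car package is installed, car::leveneTest() with its center argument should give an equivalent test.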
A modified data frame with an attribute 'add_var' containing a list of categorized variables and their properties:
var: List of categorized variables:
  valid: All valid variable names after checks
  continuous: Sublist of continuous variables (further divided by normality/equal variance)
  categorical: Sublist of categorical variables (further divided by ordered/expected frequency)
group: Grouping variable name
overall_n: Total number of observations
group_n: Observation counts per group
group_nlevels: Number of groups
group_levels: Group level names
norm: Normality check method used
data <- add_var(iris, var = c("Sepal.Length", "Sepal.Width"), group = "Species")
str(attr(data, "add_var"))  # inspect the categorized variables stored in the attribute
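
The expected-frequency check for categorical variables can be illustrated in the same spirit. The sketch below uses only base R and is not add_var()'s own code: the threshold of 5 is a common rule of thumb assumed here, and the names tab, min_expected, and expected_ok are hypothetical.

```r
# Sketch only: a common expected-frequency check,
# not the code that add_var() runs internally.
tab <- table(mtcars$cyl, mtcars$gear)

# chisq.test() stores the expected counts under independence in $expected;
# suppressWarnings() hides the small-sample approximation warning.
expected <- suppressWarnings(chisq.test(tab))$expected

min_expected <- 5  # assumed rule-of-thumb threshold
expected_ok <- all(expected >= min_expected)

# Use Fisher's exact test when any expected count is too small.
test <- if (expected_ok) chisq.test(tab) else fisher.test(tab)
test$p.value
```

Splitting categorical variables by whether this assumption holds lets a later analysis step choose between a chi-squared test and an exact test for each variable.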