tidysummary • tidysummary

tidysummary: An Elegant Approach to Summarizing Clinical Data.

The goal of tidysummary is to streamlines the analysis of clinical data by automatically selecting appropriate statistical descriptions and inference methods based on variable types.

Prepare your data

A data frame containing the variables to analyze, with variables at columns and observations at rows:

Continuous variables: Numeric.
Categorical variables: Factor (Ordinal Categorical variables: ordered Factor).

add_var()

The add_var() function prepares your dataset for downstream analysis by classifying variables into:

Continuous variables: Further subdivided by normality and equal variance assumptions.
Categorical variables: Further subdivided by ordered status and expected frequency.

Usage

Specify the variables to summarize in var and the grouping variable in group.

data <- iris %>%
  add_var(var = c("Sepal.Length", "Sepal.Width"), group = "Species")

The function can automatically checks normality using statistical tests. You can choose:

norm

'auto': By default, automatically checks normality, but the same as ask when n > 1000.
'ask': Displays automatic result, QQ plots and prompts for manual confirmation.
true: Treats all variables as normal.
false: Treats all variables as non-normal.

data <- iris %>%
  add_var(var = c("Sepal.Length", "Sepal.Width"), group = "Species", norm = "ask")

add_summary()

The add_summary() function summarize your dataset from add_var() result with:

A summary dataframe with rows as the variables and columns as the group.

Usage

Just input the result from add_var()

summary <- data %>%
  add_summary()

If you want to custom the summary style, You can choose:

add_overall

TRUE: By default, include an “Overall” summary column.
FALSE: Show only groups summary column.

continuous_format

Format string to override both norm_continuous_format, and unnorm_continuous_format.

Accepted placeholders are '{mean}', '{SD}', '{median}', '{Q1}', '{Q3}'.

norm_continuous_format

Default is '{mean} ± {SD}'. Accepted placeholders same as continuous_format.

unnorm_continuous_format

Default is '{median} ({Q1}, {Q3})'. Accepted placeholders same as continuous_format.

categorical_format

Format string for categorical variables. Default is '{n} ({pct})'. Accepted placeholders are '{n}' and '{pct}'.

binary_show

'last': By default, show only last level.
'first': Show only first level.
'all': show all levels.

summary <- data %>%
  add_summary(add_overall = T,
              continuous_format = "{mean} ± {SD}",
              categorical_format = "{n} ({pct})",
              binary_show = "last")

add_p()

The add_summary() function summarize your dataset from add_summary() result with:

A summary_with_p dataframe with rows as the variables and columns as the group.

Usage

Just input the result from add_summary()

summary_with_p <- summary %>%
  add_p()

If you want to custom the summary_with_p column, You can choose:

asterisk

TRUE: Show asterisk significance markers.
FALSE: By default, show p-values.

add_method

TRUE: Show method text.
FALSE: By default, not show method text.
'code': Show method as codes according to order of appearance.

add_statistic_name

TRUE: Show statistic name.
FALSE: By default, not show statistic name.

add_statistic_value

TRUE: Show statistic value.
FALSE: By default, not show statistic value.

summary_with_p <- summary %>%
  add_p(asterisk = T, add_method = "code")