create_data_description.Rd
Generates a concise textual description of a dataset, summarizing each column's type,
number of missing values, and a short overview of its content (e.g., range or levels).
Optionally, the description can be appended to a README.md file.
This help file was generated with AI.
create_data_description(dat, readme = FALSE, tab = " ", max_char = 60)
dat: A data.frame, or a character string giving the path to an .rds file. If a character string ending in .rds is provided, the file is loaded with readRDS().
readme: Logical. If TRUE, the output is appended to a file named "README.md". Default is FALSE.
max_char: Integer. Maximum number of characters to show per variable summary. Longer content is truncated. Default is 60.
tab: Character string of whitespace (for example one or more tab characters, \t) to use for formatting. Default is " ", as shown in the usage signature.
Invisibly returns a character vector with one entry per column describing its contents. Side effect: prints the description to the console, and optionally to a README file.
For each column in the dataset, the function provides:
  The column name
  The type (typeof)
  The number of missing values
  A brief summary:
    For factors: list of levels (possibly truncated)
    For numeric variables: value range
    For character variables: coerced to factor and listed as above
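The per-column logic listed above could be sketched roughly as follows. This is an illustration under stated assumptions, not the package's actual implementation: describe_column() is a hypothetical helper, the "..." truncation marker is an assumption, and factors are special-cased because typeof() on a factor returns "integer" while the example output reports "factor".

```r
# Hypothetical helper; NOT the implementation used by create_data_description().
describe_column <- function(x, name, max_char = 60) {
  n_na <- sum(is.na(x))
  # Report "factor" for factors (typeof() alone would give "integer").
  type <- if (is.factor(x)) "factor" else typeof(x)
  # Character columns are coerced to factor and summarised like factors.
  if (is.character(x)) x <- factor(x)
  if (is.factor(x)) {
    overview <- paste(levels(x), collapse = ", ")
  } else if (is.numeric(x) && n_na < length(x)) {
    rng <- range(x, na.rm = TRUE)
    overview <- paste(rng[1], "to", rng[2])
  } else {
    overview <- ""
  }
  # Truncate long overviews to max_char characters (marker is an assumption).
  if (nchar(overview) > max_char) {
    overview <- paste0(substr(overview, 1, max_char), "...")
  }
  sprintf("%s (%s, %d NA): %s", name, type, n_na, overview)
}

df <- data.frame(
  id = 1:10,
  group = factor(c("A", "B")),
  score = c(NA, 2:10)
)
desc <- vapply(names(df), function(nm) describe_column(df[[nm]], nm), character(1))
```

Here desc holds one string per column, named after the columns, which mirrors the shape of the documented return value.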
df <- data.frame(
  id = 1:10,
  group = factor(c("A", "B")),
  score = c(NA, 2:10)
)
create_data_description(df)
#> # Discription of datafile `df`
#>
#> Columns: 3 | Rows: 10
#>
#> id (integer, 0 NA): 1 to 10
#> group (factor, 0 NA): A, B
#> score (integer, 1 NA): 2 to 10
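
The dat argument is also documented as accepting a path to an .rds file. A minimal sketch of how such input handling might look is shown below; resolve_dat() is a hypothetical helper introduced only for illustration, and the real function may check its input differently.

```r
# Hypothetical helper illustrating the documented behaviour of `dat`.
resolve_dat <- function(dat) {
  # A single string ending in ".rds" is treated as a file to load.
  if (is.character(dat) && length(dat) == 1 && grepl("\\.rds$", dat)) {
    dat <- readRDS(dat)
  }
  stopifnot(is.data.frame(dat))
  dat
}

# Round trip through a temporary .rds file.
path <- tempfile(fileext = ".rds")
saveRDS(data.frame(x = 1:3), path)
loaded <- resolve_dat(path)
unlink(path)
```

With the real function, the equivalent call would presumably be create_data_description(path) with path pointing at an existing .rds file.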