Extract data from Stata file for data dictionary
Usage
ds2dd_detailed(
  data,
  add.auto.id = FALSE,
  date.format = "dmy",
  form.name = NULL,
  form.sep = NULL,
  form.prefix = TRUE,
  field.type = NULL,
  field.label = NULL,
  field.label.attr = "label",
  field.validation = NULL,
  metadata = names(REDCapCAST::redcapcast_meta),
  convert.logicals = TRUE
)
Arguments
- data
data frame
- add.auto.id
flag to add id column
- date.format
date format, character string. One of "ymd", "dmy" or "mdy". Default is "dmy".
- form.name
manually specify form name(s). Vector of length 1 or ncol(data). Default is NULL and "data" is used.
- form.sep
If the supplied data set has form names as a suffix or prefix to the column/variable names, the separator can be specified here. If supplied, form.name is ignored. Default is NULL.
- form.prefix
Flag to set if form is prefix (TRUE) or suffix (FALSE) to the column names. Assumes all columns have pre- or suffix if specified.
- field.type
manually specify field type(s). Vector of length 1 or ncol(data). Default is NULL and "text" is used for everything but factors, which will get "radio".
- field.label
manually specify field label(s). Vector of length 1 or ncol(data). Default is NULL and colnames(data) is used or attribute `field.label.attr` for haven_labelled data set (imported .dta file with `haven::read_dta()`).
- field.label.attr
attribute name for named labels for haven_labelled data set (imported .dta file with `haven::read_dta()`). Default is "label".
- field.validation
manually specify field validation(s). Vector of length 1 or ncol(data). Default is NULL and `levels()` are used for factors or attribute `factor.labels.attr` for haven_labelled data set (imported .dta file with `haven::read_dta()`).
- metadata
REDCap metadata headings. Default is names(REDCapCAST::redcapcast_meta).
- convert.logicals
convert logicals to factor. Default is TRUE.
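The `form.sep` and `form.prefix` arguments rely on a naming convention where each column name carries its form name. The base R sketch below only illustrates that convention with made-up column names; it is not the package's implementation, and how `ds2dd_detailed()` handles the split internally may differ.

```r
# Hypothetical column names carrying the form name as a prefix, separated by "__".
col_names <- c("a__record_id", "a__age", "b__visit_date")
form_sep <- "__"

# Split each name at the separator (treated as a fixed string, not a regex).
parts <- strsplit(col_names, form_sep, fixed = TRUE)

# With a prefix convention (as with form.prefix = TRUE), the form is the first
# piece and the field name is whatever follows it.
forms <- vapply(parts, function(p) p[1], character(1))
fields <- vapply(
  parts,
  function(p) paste(p[-1], collapse = form_sep),
  character(1)
)

split_names <- data.frame(column = col_names, form = forms, field = fields)
print(split_names)
```

For a suffix convention (as with `form.prefix = FALSE`), the same idea applies with the last piece taken as the form name instead of the first.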
Details
This function is a natural development of the ds2dd() function. It assumes that the first column is the ID column and performs no checks on this. Always inspect the data dictionary before upload.
Ensure that the data set is formatted with as much information as possible.
`field.type` can be supplied
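Because the first column is treated as the ID column without any check, it may help to arrange and verify the data before calling the function. The following is a minimal base R sketch of that preparation; the data frame and the column name `patient_id` are hypothetical and only serve as illustration.

```r
# Hypothetical data set where the identifier is not the first column.
df <- data.frame(
  age = c(34, 51, 28),
  patient_id = c("p01", "p02", "p03"),
  sex = factor(c("f", "m", "f"))
)

id_col <- "patient_id" # assumed name of the identifier column

# Move the identifier to the front, keeping the remaining columns in order.
df <- df[, c(id_col, setdiff(names(df), id_col)), drop = FALSE]

# Basic sanity checks on the identifier before building a dictionary.
stopifnot(
  names(df)[1] == id_col,
  !anyNA(df[[id_col]]),
  !anyDuplicated(df[[id_col]])
)
```

The reordered `df` can then be passed on as the `data` argument.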
Examples
if (FALSE) { # \dontrun{
data <- REDCapCAST::redcapcast_data
data |> ds2dd_detailed()
iris |> ds2dd_detailed(add.auto.id = TRUE)
iris |>
  ds2dd_detailed(
    add.auto.id = TRUE,
    form.name = sample(c("b", "c"), size = 6, replace = TRUE, prob = rep(.5, 2))
  ) |>
  purrr::pluck("meta")
mtcars |> ds2dd_detailed(add.auto.id = TRUE)
data <- iris |>
  ds2dd_detailed(add.auto.id = TRUE) |>
  purrr::pluck("data")
names(data) <- glue::glue("{sample(x = c('a','b'),size = length(names(data)),
replace=TRUE,prob = rep(x=.5,2))}__{names(data)}")
data |> ds2dd_detailed(form.sep = "__")
} # }