mirror of
https://github.com/agdamsbo/REDCapCAST.git
synced 2024-11-22 13:30:23 +01:00
Adding illustration to README.md
This commit is contained in:
parent
f826571c2c
commit
bace4bb48b
117
README.md
117
README.md
@ -19,19 +19,126 @@ might expect that the non-repeating instruments may constitute one table
|
|||||||
that would be related to Repeating Instruments tables via a one-to-many
|
that would be related to Repeating Instruments tables via a one-to-many
|
||||||
relationship. In reality, the data is outputted as one table with all
|
relationship. In reality, the data is outputted as one table with all
|
||||||
possible fields; this has the effect of nesting the output table in a
|
possible fields; this has the effect of nesting the output table in a
|
||||||
way that is not useful in most analysis software. Therefore, I have made
|
way that is not useful in most analysis software.
|
||||||
a solution to handle the problem in both SAS and R.
|
|
||||||
|
|
||||||
## Supported Platforms
|
The normalized data can be retrieved by downloading repeating instruments individually then doing a little
|
||||||
|
data munging or by writing a few custom parameters in a series of API calls (then doing more data munging),
|
||||||
|
but this is a lot of extra steps that can make reproducible research more difficult.
|
||||||
|
Therefore, I have made a programmatic solution to handle the problem in both SAS and R.
|
||||||
|
|
||||||
|
### Illustration
|
||||||
|
|
||||||
|
For example, consider this mocked-up data involving some information about cars in
|
||||||
|
R's built-in `mtcars` dataset as well as some sales data for some of the cars.
|
||||||
|
|
||||||
|
| car_id|redcap_repeat_instrument |redcap_repeat_instance |make |model |mpg |cyl |motor_trend_cars_complete |price |color |customer |sale_complete |
|
||||||
|
|------:|:------------------------|:----------------------|:--------|:-----------|:----|:---|:-------------------------|:--------|:-----|:--------|:-------------|
|
||||||
|
| 1| | |AMC |Javelin |15.2 |8 |1 | | | | |
|
||||||
|
| 1|sale |1 | | | | | |12000.50 |1 |Bob |0 |
|
||||||
|
| 1|sale |2 | | | | | |13750.77 |3 |Sue |2 |
|
||||||
|
| 1|sale |3 | | | | | |15004.57 |2 |Kim |0 |
|
||||||
|
| 2| | |Cadillac |Fleetwood |10.4 |8 |0 | | | | |
|
||||||
|
| 3| | |Camaro |Z28 |13.3 |8 |0 | | | | |
|
||||||
|
| 3|sale |1 | | | | | |7800.00 |2 |Janice |2 |
|
||||||
|
| 3|sale |2 | | | | | |8000.00 |3 |Tim |0 |
|
||||||
|
| 4| | |Chrysler |Imperial |14.7 |8 |0 | | | | |
|
||||||
|
| 4|sale |1 | | | | | |7500.00 |1 |Jim |2 |
|
||||||
|
| 5| | |Datsun |710 |22.8 |4 |0 | | | | |
|
||||||
|
| 6| | |Dodge |Challenger |15.5 |8 |0 | | | | |
|
||||||
|
| 7| | |Duster |360 |14.3 |8 |0 | | | | |
|
||||||
|
| 7|sale |1 | | | | | |8756.40 |4 |Sarah |1 |
|
||||||
|
| 7|sale |2 | | | | | |6800.88 |2 |Pablo |0 |
|
||||||
|
| 7|sale |3 | | | | | |8888.88 |1 |Erica |0 |
|
||||||
|
| 7|sale |4 | | | | | |970.00 |4 |Juan |0 |
|
||||||
|
| 8| | |Ferrari |Dino |19.7 |6 |0 | | | | |
|
||||||
|
| 9| | |Mazda |RX4 Wag |21 |6 |0 | | | | |
|
||||||
|
| 10| | |Merc |230 |22.8 |4 |0 | | | | |
|
||||||
|
| 10|sale |1 | | | | | |7800.98 |2 |Ted |0 |
|
||||||
|
| 10|sale |2 | | | | | |7954.00 |1 |Quentin |0 |
|
||||||
|
| 10|sale |3 | | | | | |6800.55 |3 |Sharon |2 |
|
||||||
|
|
||||||
|
*Data credit*: Henderson and Velleman (1981), Building multiple regression models interactively. *Biometrics*, **37**, 391--411.
|
||||||
|
**Modified with fake data for the purpose of illustration**
|
||||||
|
|
||||||
|
You can see that the data from the non-repeating forms (primary table) is interlaced with the data in the repeating forms,
|
||||||
|
creating a checkerboard pattern. In order to do analysis, the data must be normalized and then the tables rejoined.
|
||||||
|
The normalized tables would look like this:
|
||||||
|
|
||||||
|
**Primary table**
|
||||||
|
|
||||||
|
| car_id|make |model |mpg |cyl |motor_trend_cars_complete |
|
||||||
|
|------:|:--------|:----------|:----|:---|:-------------------------|
|
||||||
|
| 1|AMC |Javelin |15.2 |8 |1 |
|
||||||
|
| 2|Cadillac |Fleetwood |10.4 |8 |0 |
|
||||||
|
| 3|Camaro |Z28 |13.3 |8 |0 |
|
||||||
|
| 4|Chrysler |Imperial |14.7 |8 |0 |
|
||||||
|
| 5|Datsun |710 |22.8 |4 |0 |
|
||||||
|
| 6|Dodge |Challenger |15.5 |8 |0 |
|
||||||
|
| 7|Duster |360 |14.3 |8 |0 |
|
||||||
|
| 8|Ferrari |Dino |19.7 |6 |0 |
|
||||||
|
| 9|Mazda |RX4 Wag |21 |6 |0 |
|
||||||
|
| 10|Merc |230 |22.8 |4 |0 |
|
||||||
|
|
||||||
|
**Child table**
|
||||||
|
|
||||||
|
|car_id |redcap_repeat_instrument |redcap_repeat_instance |price |color |customer |sale_complete |
|
||||||
|
|:------|:------------------------|:----------------------|:--------|:-----|:--------|:-------------|
|
||||||
|
|1 |sale |1 |12000.50 |1 |Bob |0 |
|
||||||
|
|1 |sale |2 |13750.77 |3 |Sue |2 |
|
||||||
|
|1 |sale |3 |15004.57 |2 |Kim |0 |
|
||||||
|
|3 |sale |1 |7800.00 |2 |Janice |2 |
|
||||||
|
|3 |sale |2 |8000.00 |3 |Tim |0 |
|
||||||
|
|4 |sale |1 |7500.00 |1 |Jim |2 |
|
||||||
|
|7 |sale |1 |8756.40 |4 |Sarah |1 |
|
||||||
|
|7 |sale |2 |6800.88 |2 |Pablo |0 |
|
||||||
|
|7 |sale |3 |8888.88 |1 |Erica |0 |
|
||||||
|
|7 |sale |4 |970.00 |4 |Juan |0 |
|
||||||
|
|10 |sale |1 |7800.98 |2 |Ted |0 |
|
||||||
|
|10 |sale |2 |7954.00 |1 |Quentin |0 |
|
||||||
|
|10 |sale |3 |6800.55 |3 |Sharon |2 |
|
||||||
|
|
||||||
|
After inner joining the primary table to the child table on `car_id` and selecting only the fields you are interested in,
|
||||||
|
your resulting analytic dataset might look something like this:
|
||||||
|
|
||||||
|
| car_id|make |model |price |color |customer |
|
||||||
|
|------:|:--------|:--------|:--------|:-----|:--------|
|
||||||
|
| 1|AMC |Javelin |12000.50 |1 |Bob |
|
||||||
|
| 1|AMC |Javelin |13750.77 |3 |Sue |
|
||||||
|
| 1|AMC |Javelin |15004.57 |2 |Kim |
|
||||||
|
| 3|Camaro |Z28 |7800.00 |2 |Janice |
|
||||||
|
| 3|Camaro |Z28 |8000.00 |3 |Tim |
|
||||||
|
| 4|Chrysler |Imperial |7500.00 |1 |Jim |
|
||||||
|
| 7|Duster |360 |8756.40 |4 |Sarah |
|
||||||
|
| 7|Duster |360 |6800.88 |2 |Pablo |
|
||||||
|
| 7|Duster |360 |8888.88 |1 |Erica |
|
||||||
|
| 7|Duster |360 |970.00 |4 |Juan |
|
||||||
|
| 10|Merc |230 |7800.98 |2 |Ted |
|
||||||
|
| 10|Merc |230 |7954.00 |1 |Quentin |
|
||||||
|
| 10|Merc |230 |6800.55 |3 |Sharon |
|
||||||
|
|
||||||
|
### Supported Platforms
|
||||||
|
|
||||||
|
Currently, the R and SAS code is well-tested with mocked-up data.
|
||||||
|
|
||||||
- R
|
- R
|
||||||
- SAS
|
- SAS
|
||||||
|
|
||||||
### Coming Soon
|
I have made some effort to replicate the
|
||||||
|
messiness of real-world data and have tried to include as many special cases and data types as possible.
|
||||||
|
However, this code may not account for all contingencies or changes in the native REDCap export format.
|
||||||
|
If you find a bug, please feel free to open an issue or pull request.
|
||||||
|
|
||||||
|
#### Coming Soon
|
||||||
|
|
||||||
|
Currently, we have given some consideration to expand the capabilities into the following languages.
|
||||||
|
|
||||||
- Python
|
- Python
|
||||||
- VBA
|
- VBA
|
||||||
|
|
||||||
|
If you have some talents in these or other languages, please feel free to open a pull request! We
|
||||||
|
welcome your contributions!
|
||||||
|
|
||||||
|
|
||||||
## Instructions
|
## Instructions
|
||||||
### R
|
### R
|
||||||
|
|
||||||
@ -48,7 +155,7 @@ devtools::install_github("SpectrumHealthResearch/REDCapRITS/R")
|
|||||||
|
|
||||||
After the package is installed, follow these instructions:
|
After the package is installed, follow these instructions:
|
||||||
|
|
||||||
1. Download the record dataset and metadata. This can
|
1. Download the record dataset and metadata (data dictionary). This can
|
||||||
be accomplished by several methods:
|
be accomplished by several methods:
|
||||||
- Using the API. Check with your REDCap administrator for details.
|
- Using the API. Check with your REDCap administrator for details.
|
||||||
- Exporting the data from the web interface by selecting *CSV / Microsoft Excel (raw data)*.
|
- Exporting the data from the web interface by selecting *CSV / Microsoft Excel (raw data)*.
|
||||||
|
Loading…
Reference in New Issue
Block a user