Friday, April 28, 2017

Model formulas in R: Retrieving names and values of original and transformed variables

This is a reminder of how to retrieve variable names and values from model formulas in R. First, some example formulas of increasing complexity:
# simple formula:
myform_1 <- y ~ x1 + x2 + x3
# sum of two variables as predictor:
myform_2 <- y ~ x1 + identity(x2 + x3)
# function of variable as predictor:
myform_3 <- y ~ x1 + log(x2) + x3
myform_4 <- y ~ x1 + factor(x2) + x3
# two outcome / dependent variables (note: base R evaluates y + z as their sum):
myform_5 <- y + z ~ x1 + x2 + x3
# function of variable as outcome:
myform_6 <- exp(y) + z ~ x1 + x2 + x3
# go crazy:
myform_7 <- identity(exp(y) + z) ~ factor(x1) + identity(x2*x3)

mydata <- data.frame(z = 5:1, y = 1:5, x1 = 5:1, x2 = 1:5, x3 = 5:1)

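Here are some ways to retrieve the names and values of the original and transformed variables:
## Get names of the original response variables ([[2]] is the left-hand side;
## use all.vars(myform_1) to get every variable in the formula):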
all.vars(myform_1[[2]])
all.vars(myform_6[[2]])
all.vars(myform_7[[2]])
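
The `[[2]]` index works because a two-sided formula is stored as a call to `~` with three elements. A minimal sketch of that structure, using a made-up formula `f` that is not one of the examples above:

```r
# Illustrative formula (an assumption for this sketch, not from the post)
f <- y + z ~ x1 + log(x2)

length(f)   # 3: the operator, the left-hand side, the right-hand side
f[[1]]      # the `~` operator
f[[2]]      # left-hand side:  y + z
f[[3]]      # right-hand side: x1 + log(x2)

all_names  <- all.vars(f)        # "y" "z" "x1" "x2"
resp_names <- all.vars(f[[2]])   # "y" "z"
pred_names <- all.vars(f[[3]])   # "x1" "x2"
```

Note that `all.vars()` strips functions such as `log()` and returns only the underlying variable names.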

## Get names and values of all original variables:
get_all_vars(myform_1, data = mydata)
get_all_vars(myform_2, data = mydata)
get_all_vars(myform_3, data = mydata)
get_all_vars(myform_4, data = mydata)
get_all_vars(myform_5, data = mydata)
get_all_vars(myform_6, data = mydata)
get_all_vars(myform_7, data = mydata)
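
To see what "original" means here, it may help to compare `get_all_vars()` with `model.frame()` on a formula that contains a transformation. A small sketch; `f` and `d` are illustrative stand-ins with the same shape as `myform_3` and `mydata`:

```r
# Illustrative data and formula (assumed for this sketch)
d <- data.frame(y = 1:5, x1 = 5:1, x2 = 1:5, x3 = 5:1)
f <- y ~ x1 + log(x2) + x3

orig <- get_all_vars(f, data = d)  # untransformed columns
mf   <- model.frame(f, data = d)   # columns as they enter the model

names(orig)  # "y" "x1" "x2" "x3"
names(mf)    # "y" "x1" "log(x2)" "x3"

# get_all_vars() keeps the raw values; model.frame() applies log()
orig$x2
mf[["log(x2)"]]
```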

## Get names of all transformed predictor variables:
labels(terms(myform_1))
labels(terms(myform_2))
labels(terms(myform_3))
labels(terms(myform_4))
labels(terms(myform_7))
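
The term labels are the deparsed right-hand-side terms, so wrapping an expression in a function keeps it together as one term. A short sketch of the difference, with formulas made up for illustration:

```r
# Wrapped in a function call: the product stays a single term
wrapped <- labels(terms(y ~ factor(x1) + identity(x2 * x3)))
wrapped    # "factor(x1)" "identity(x2 * x3)"

# Unwrapped: `*` is formula syntax and expands to main effects
# plus their interaction
expanded <- labels(terms(y ~ x2 * x3))
expanded   # "x2" "x3" "x2:x3"

# labels() on a terms object returns its "term.labels" attribute
same <- identical(wrapped,
                  attr(terms(y ~ factor(x1) + identity(x2 * x3)), "term.labels"))
```

No data are needed for this step, because `terms()` only parses the formula.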

## Get names of all transformed response variables:
## (probably can go shorter, but it works ...)
## (the model frame is stored in `mf` to avoid masking utils::data())
mf <- model.frame(myform_1, data = mydata)
names(mf)[attr(attr(mf, "terms"), "response")]
mf <- model.frame(myform_5, data = mydata)
names(mf)[attr(attr(mf, "terms"), "response")]
mf <- model.frame(myform_6, data = mydata)
names(mf)[attr(attr(mf, "terms"), "response")]
mf <- model.frame(myform_7, data = mydata)
names(mf)[attr(attr(mf, "terms"), "response")]
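
One possibly shorter route is to read the response straight from the terms object, which avoids building a model frame and so needs no data. The helper `response_label()` below is hypothetical (it is not part of base R), and this is only a sketch of the idea:

```r
# Hypothetical helper: transformed response name, taken from terms()
response_label <- function(formula) {
  tt   <- terms(formula)
  resp <- attr(tt, "response")     # 1 if there is a response, 0 otherwise
  if (resp == 0) return(character(0))
  vars <- attr(tt, "variables")    # a call of the form list(<response>, ...)
  # element 1 is `list`, so the response sits at position resp + 1
  paste(deparse(vars[[resp + 1]], width.cutoff = 500), collapse = " ")
}

r1 <- response_label(y ~ x1 + x2 + x3)                   # "y"
r6 <- response_label(exp(y) + z ~ x1 + x2 + x3)          # "exp(y) + z"
r7 <- response_label(identity(exp(y) + z) ~ factor(x1))  # "identity(exp(y) + z)"
r0 <- response_label(~ x1 + x2)                          # character(0)
```

The results should match the names produced by the `model.frame()` approach above, since both deparse the same expression.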
