Hi,
This is a vignette to illustrate an application of the cross-lag ACE model from the TwinAnalysis package.
Here is a full cross-lag ACE model with four variables. Two traits X and Y are measured twice (X1, Y1, X2, Y2). Only one twin is shown on the path diagram.
The model includes three sets of latent variables that correspond to the structure of genetic, shared environmental and non-shared environmental variability. Each set mirrors the cross-laged structure of the data. Here are the parameters that represent genetic structure of the traits:
The model allows correlations within each assessment (\(r_1\), \(r_2\)) and regression paths from variables in one assessment to the variables in the following one (\(k_{11}\), … , \(k_{22}\)). The total variability of all latent variables is constrained to 1. \(h_{X1}^2\), … , \(h_{Y2}^2\) represent the amount of genetic variability of corresponding traits.
I used the following model to simulate the data:
The lengthy code that generates the data is below. It assumes that you have OpenMx installed. We will need it later anyway. In the resulting data table, X11 is trait X at the first assessment in the first twin, X12 is trait X at the first assessment in the second twin, etc. zyg stands for zygosity (MZ or DZ).
vars <- c('X1', 'Y1', 'X2', 'Y2')
I <- OpenMx::vec2diag(rep(1, 4))
A_A <- matrix(
c(0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0,
0.6, 0.3, 0.0, 0.0,
0.4, 0.5, 0.0, 0.0
), ncol = 4, byrow = TRUE,
dimnames = list(vars, vars))
A_S <- matrix(
c(1.0, 0.8, 0.0, 0.0,
0.8, 1.0, 0.0, 0.0,
0.0, 0.0, 0.262, 0.1,
0.0, 0.0, 0.1, 0.270
), ncol = 4, byrow = TRUE,
dimnames = list(vars, vars))
A_V <- solve(I - A_A) %*% A_S %*% t(solve(I - A_A))
C_A <- matrix(
c(0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0,
0.7, 0.1, 0.0, 0.0,
0.2, 0.5, 0.0, 0.0
), ncol = 4, byrow = TRUE,
dimnames = list(vars, vars))
C_S <- matrix(
c(1.0, 0.3, 0.0, 0.0,
0.3, 1.0, 0.0, 0.0,
0.0, 0.0, 0.458, 0.1,
0.0, 0.0, 0.1, 0.650
), ncol = 4, byrow = TRUE,
dimnames = list(vars, vars))
C_V <- solve(I - C_A) %*% C_S %*% t(solve(I - C_A))
E_A <- matrix(
c(0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0,
0.2, 0.0, 0.0, 0.0,
0.0, 0.2, 0.0, 0.0
), ncol = 4, byrow = TRUE,
dimnames = list(vars, vars))
E_S <- matrix(
c(1.0, 0.1, 0.0, 0.0,
0.1, 1.0, 0.0, 0.0,
0.0, 0.0, 0.960, 0.0,
0.0, 0.0, 0.0, 0.960
), ncol = 4, byrow = TRUE,
dimnames = list(vars, vars))
E_V <- solve(I - E_A) %*% E_S %*% t(solve(I - E_A))
vcomp <- matrix(
c(0.5, 0.4, 0.6, 0.4,
0.3, 0.2, 0.2, 0.2,
0.2, 0.4, 0.2, 0.4),
ncol = 3,
dimnames = list(vars, c('A', 'C', 'E'))
)
A <- (sqrt(vcomp[, 'A']) %*% t(sqrt(vcomp[, 'A']))) * A_V
C <- (sqrt(vcomp[, 'C']) %*% t(sqrt(vcomp[, 'C']))) * C_V
E <- (sqrt(vcomp[, 'E']) %*% t(sqrt(vcomp[, 'E']))) * E_V
cov_mz <- rbind(cbind(A + C + E, A + C),
cbind(A + C, A + C + E))
cov_dz <- rbind(cbind(A + C + E, 0.5 %x% A + C),
cbind(0.5 %x% A + C, A + C + E))
dimnames(cov_mz) <- dimnames(cov_dz) <-
list(paste0(vars, rep(1:2, each = length(vars))),
paste0(vars, rep(1:2, each = length(vars))))
mz_data <- MASS::mvrnorm(1000,
mu = rep(0, 8),
Sigma = cov_mz)
dz_data <- MASS::mvrnorm(1000,
mu = rep(0, 8),
Sigma = cov_dz)
data <- rbind(cbind(zyg = 'MZ', as.data.frame(mz_data)),
cbind(zyg = 'DZ', as.data.frame(dz_data)))Now we can apply the cross-lag ACE model to the data. The argument defintion is a list with the vectors with variable names from each one measurement occasion. Otherwise, the model works just like any other model from TwinAnalysis package, refer to another vignette. I present here a minimal example of the application of cross-lag ACE.
library(OpenMx)
#> To take full advantage of multiple cores, use:
#> mxOption(key='Number of Threads', value=parallel::detectCores()) #now
#> Sys.setenv(OMP_NUM_THREADS=parallel::detectCores()) #before library(OpenMx)
library(TwinAnalysis)
model <- cross_lag_ace(data, zyg = 'zyg',
definition = list(c('X1', 'Y1'),
c('X2', 'Y2')))
#> Running crosslag with 14 parameters
model <- mxRun(model)
#> Running ACE with 34 parameters
model <- ref_models(model)
#> Running Saturated ACE with 88 parameters
#> Running Independence ACE with 32 parameters
# To fit assumption constrained reference models:
# model <- twin_ref_models(model, run = TRUE)
# Fit statistics
fit_stats(model)
#> model base ep minus2LL df AIC BIC CFI TLI
#> 1 ACE Saturated 34 37170.29 15966 5238.292 -84185.72 1.001177 1.001221
#> RMSEA diffLL diffdf p
#> 1 0 44.13209 54 0.8287449
# Parameter tables
get_output_tables(model)
#> $Variance_components
#> Atot Acom Ares Ctot Ccom Cres
#> X1 0.4648753 0.0000000 0.46487533 0.3298528 0.00000000 0.32985279
#> Y1 0.3177104 0.0000000 0.31771037 0.3038531 0.00000000 0.30385310
#> X2 0.6144000 0.5029067 0.11149331 0.1881319 0.09429262 0.09383925
#> Y2 0.3762794 0.2810327 0.09524676 0.2377194 0.10019022 0.13752920
#> Etot Ecom Eres
#> X1 0.2052719 0.000000000 0.2052719
#> Y1 0.3784365 0.000000000 0.3784365
#> X2 0.1974681 0.006607846 0.1908603
#> Y2 0.3860011 0.007941010 0.3780601
#>
#> $Path_estimates
#> A C E
#> X1 --> X2 0.56489206 0.71724060 0.181666296
#> X1 --> Y2 0.18682441 0.21765763 0.024121778
#> Y1 --> X2 0.39063876 -0.02041916 0.008038069
#> Y1 --> Y2 0.70964189 0.51857464 0.138158582
#> X1 <-> X1 1.00000000 1.00000000 1.000000000
#> X1 <-> Y1 0.78586426 0.46588136 0.135456249
#> Y1 <-> Y1 1.00000000 1.00000000 1.000000000
#> X2 <-> X2 0.18146697 0.49879507 0.966537148
#> X2 <-> Y2 -0.02827336 0.30873028 0.025003808
#> Y2 <-> Y2 0.25312774 0.57853584 0.979427495Output tables:
Variance_components:Atot,CtotandEtotare the standardized estimates of genetic and environmental contributions.Acom,CcomabdEcomare genetic and environmental contributions accounted for by the factors from the previous measurement occasion. For the first measurement occasion, there is no previous one, soAcom,CcomandEcomforX1andX2are zeros.Ares,CresandEresare the factors that remained unaccounted for by the previous measurement occasion. Eachcomandressum up tototandAtot,CtotandEtotsum up to 1, the total variability of the corresponding measure.Path_estimates: Each columnA,CandEstands for the corresponding part of the twin model. Each row represents a parameter from the cross-lag model. For example, the genetic stability ofX, denoted asX1 --> X2, is 0.565. The genetic correlation betweenXandYin the first measurement occasion, denoted asX1 <-> Y1, is 0.786. Mind, that the total variance of the latent variables is constrained to 1 and the variance parameters in the table (X1 <-> X1,Y1 <-> Y1etc) are residual (novel) variability, not total. In the first measurement occasion, all variability is novel, this is why allX1 <-> X1andY1 <-> Y1are 1. It is not so forX2 <-> X2andY2 <-> Y2because some part of variance ofX2andY2is explained byX1andY1.
Note:
at the moment, mxCheckIdentification() doesn’t work on the cross-lag ACE and mxTryHard() doesn’t run the fitting repeatedly if the model status is red.