Hi,
This is a vignette to illustrate an application of the cross-lag ACE model from the TwinAnalysis
package.
Here is a full cross-lag ACE model with four variables. Two traits X
and Y
are measured twice (X1
, Y1
, X2
, Y2
). Only one twin is shown on the path diagram.
The model includes three sets of latent variables that correspond to the structure of genetic, shared environmental and non-shared environmental variability. Each set mirrors the cross-laged structure of the data. Here are the parameters that represent genetic structure of the traits:
The model allows correlations within each assessment (\(r_1\), \(r_2\)) and regression paths from variables in one assessment to the variables in the following one (\(k_{11}\), … , \(k_{22}\)). The total variability of all latent variables is constrained to 1. \(h_{X1}^2\), … , \(h_{Y2}^2\) represent the amount of genetic variability of corresponding traits.
I used the following model to simulate the data:
The lengthy code that generates the data is below. It assumes that you have OpenMx
installed. We will need it later anyway. In the resulting data table, X11
is trait X
at the first assessment in the first twin, X12
is trait X
at the first assessment in the second twin, etc. zyg
stands for zygosity (MZ
or DZ
).
vars <- c('X1', 'Y1', 'X2', 'Y2')
I <- OpenMx::vec2diag(rep(1, 4))
A_A <- matrix(
c(0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0,
0.6, 0.3, 0.0, 0.0,
0.4, 0.5, 0.0, 0.0
), ncol = 4, byrow = TRUE,
dimnames = list(vars, vars))
A_S <- matrix(
c(1.0, 0.8, 0.0, 0.0,
0.8, 1.0, 0.0, 0.0,
0.0, 0.0, 0.262, 0.1,
0.0, 0.0, 0.1, 0.270
), ncol = 4, byrow = TRUE,
dimnames = list(vars, vars))
A_V <- solve(I - A_A) %*% A_S %*% t(solve(I - A_A))
C_A <- matrix(
c(0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0,
0.7, 0.1, 0.0, 0.0,
0.2, 0.5, 0.0, 0.0
), ncol = 4, byrow = TRUE,
dimnames = list(vars, vars))
C_S <- matrix(
c(1.0, 0.3, 0.0, 0.0,
0.3, 1.0, 0.0, 0.0,
0.0, 0.0, 0.458, 0.1,
0.0, 0.0, 0.1, 0.650
), ncol = 4, byrow = TRUE,
dimnames = list(vars, vars))
C_V <- solve(I - C_A) %*% C_S %*% t(solve(I - C_A))
E_A <- matrix(
c(0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0,
0.2, 0.0, 0.0, 0.0,
0.0, 0.2, 0.0, 0.0
), ncol = 4, byrow = TRUE,
dimnames = list(vars, vars))
E_S <- matrix(
c(1.0, 0.1, 0.0, 0.0,
0.1, 1.0, 0.0, 0.0,
0.0, 0.0, 0.960, 0.0,
0.0, 0.0, 0.0, 0.960
), ncol = 4, byrow = TRUE,
dimnames = list(vars, vars))
E_V <- solve(I - E_A) %*% E_S %*% t(solve(I - E_A))
vcomp <- matrix(
c(0.5, 0.4, 0.6, 0.4,
0.3, 0.2, 0.2, 0.2,
0.2, 0.4, 0.2, 0.4),
ncol = 3,
dimnames = list(vars, c('A', 'C', 'E'))
)
A <- (sqrt(vcomp[, 'A']) %*% t(sqrt(vcomp[, 'A']))) * A_V
C <- (sqrt(vcomp[, 'C']) %*% t(sqrt(vcomp[, 'C']))) * C_V
E <- (sqrt(vcomp[, 'E']) %*% t(sqrt(vcomp[, 'E']))) * E_V
cov_mz <- rbind(cbind(A + C + E, A + C),
cbind(A + C, A + C + E))
cov_dz <- rbind(cbind(A + C + E, 0.5 %x% A + C),
cbind(0.5 %x% A + C, A + C + E))
dimnames(cov_mz) <- dimnames(cov_dz) <-
list(paste0(vars, rep(1:2, each = length(vars))),
paste0(vars, rep(1:2, each = length(vars))))
mz_data <- MASS::mvrnorm(1000,
mu = rep(0, 8),
Sigma = cov_mz)
dz_data <- MASS::mvrnorm(1000,
mu = rep(0, 8),
Sigma = cov_dz)
data <- rbind(cbind(zyg = 'MZ', as.data.frame(mz_data)),
cbind(zyg = 'DZ', as.data.frame(dz_data)))
Now we can apply the cross-lag ACE model to the data. The argument defintion
is a list with the vectors with variable names from each one measurement occasion. Otherwise, the model works just like any other model from TwinAnalysis
package, refer to another vignette. I present here a minimal example of the application of cross-lag ACE.
library(OpenMx)
#> To take full advantage of multiple cores, use:
#> mxOption(key='Number of Threads', value=parallel::detectCores()) #now
#> Sys.setenv(OMP_NUM_THREADS=parallel::detectCores()) #before library(OpenMx)
library(TwinAnalysis)
model <- cross_lag_ace(data, zyg = 'zyg',
definition = list(c('X1', 'Y1'),
c('X2', 'Y2')))
#> Running crosslag with 14 parameters
model <- mxRun(model)
#> Running ACE with 34 parameters
model <- ref_models(model)
#> Running Saturated ACE with 88 parameters
#> Running Independence ACE with 32 parameters
# To fit assumption constrained reference models:
# model <- twin_ref_models(model, run = TRUE)
# Fit statistics
fit_stats(model)
#> model base ep minus2LL df AIC BIC CFI TLI
#> 1 ACE Saturated 34 37170.29 15966 5238.292 -84185.72 1.001177 1.001221
#> RMSEA diffLL diffdf p
#> 1 0 44.13209 54 0.8287449
# Parameter tables
get_output_tables(model)
#> $Variance_components
#> Atot Acom Ares Ctot Ccom Cres
#> X1 0.4648753 0.0000000 0.46487533 0.3298528 0.00000000 0.32985279
#> Y1 0.3177104 0.0000000 0.31771037 0.3038531 0.00000000 0.30385310
#> X2 0.6144000 0.5029067 0.11149331 0.1881319 0.09429262 0.09383925
#> Y2 0.3762794 0.2810327 0.09524676 0.2377194 0.10019022 0.13752920
#> Etot Ecom Eres
#> X1 0.2052719 0.000000000 0.2052719
#> Y1 0.3784365 0.000000000 0.3784365
#> X2 0.1974681 0.006607846 0.1908603
#> Y2 0.3860011 0.007941010 0.3780601
#>
#> $Path_estimates
#> A C E
#> X1 --> X2 0.56489206 0.71724060 0.181666296
#> X1 --> Y2 0.18682441 0.21765763 0.024121778
#> Y1 --> X2 0.39063876 -0.02041916 0.008038069
#> Y1 --> Y2 0.70964189 0.51857464 0.138158582
#> X1 <-> X1 1.00000000 1.00000000 1.000000000
#> X1 <-> Y1 0.78586426 0.46588136 0.135456249
#> Y1 <-> Y1 1.00000000 1.00000000 1.000000000
#> X2 <-> X2 0.18146697 0.49879507 0.966537148
#> X2 <-> Y2 -0.02827336 0.30873028 0.025003808
#> Y2 <-> Y2 0.25312774 0.57853584 0.979427495
Output tables:
Variance_components
:Atot
,Ctot
andEtot
are the standardized estimates of genetic and environmental contributions.Acom
,Ccom
abdEcom
are genetic and environmental contributions accounted for by the factors from the previous measurement occasion. For the first measurement occasion, there is no previous one, soAcom
,Ccom
andEcom
forX1
andX2
are zeros.Ares
,Cres
andEres
are the factors that remained unaccounted for by the previous measurement occasion. Eachcom
andres
sum up totot
andAtot
,Ctot
andEtot
sum up to 1, the total variability of the corresponding measure.Path_estimates
: Each columnA
,C
andE
stands for the corresponding part of the twin model. Each row represents a parameter from the cross-lag model. For example, the genetic stability ofX
, denoted asX1 --> X2
, is 0.565. The genetic correlation betweenX
andY
in the first measurement occasion, denoted asX1 <-> Y1
, is 0.786. Mind, that the total variance of the latent variables is constrained to 1 and the variance parameters in the table (X1 <-> X1
,Y1 <-> Y1
etc) are residual (novel) variability, not total. In the first measurement occasion, all variability is novel, this is why allX1 <-> X1
andY1 <-> Y1
are 1. It is not so forX2 <-> X2
andY2 <-> Y2
because some part of variance ofX2
andY2
is explained byX1
andY1
.
Note:
at the moment, mxCheckIdentification()
doesn’t work on the cross-lag ACE and mxTryHard()
doesn’t run the fitting repeatedly if the model status is red.