Hi,

This is a vignette to illustrate an application of the cross-lag ACE model from the TwinAnalysis package.

Here is a full cross-lag ACE model with four variables. Two traits X and Y are measured twice (X1, Y1, X2, Y2). Only one twin is shown on the path diagram.

The model includes three sets of latent variables that correspond to the structure of genetic, shared environmental and non-shared environmental variability. Each set mirrors the cross-laged structure of the data. Here are the parameters that represent genetic structure of the traits:

The model allows correlations within each assessment (\(r_1\), \(r_2\)) and regression paths from variables in one assessment to the variables in the following one (\(k_{11}\), … , \(k_{22}\)). The total variability of all latent variables is constrained to 1. \(h_{X1}^2\), … , \(h_{Y2}^2\) represent the amount of genetic variability of corresponding traits.

I used the following model to simulate the data:

The lengthy code that generates the data is below. It assumes that you have OpenMx installed. We will need it later anyway. In the resulting data table, X11 is trait X at the first assessment in the first twin, X12 is trait X at the first assessment in the second twin, etc. zyg stands for zygosity (MZ or DZ).

vars <- c('X1', 'Y1', 'X2', 'Y2')

I <- OpenMx::vec2diag(rep(1, 4))

A_A <- matrix(
  c(0.0, 0.0, 0.0, 0.0,
    0.0, 0.0, 0.0, 0.0,
    0.6, 0.3, 0.0, 0.0,
    0.4, 0.5, 0.0, 0.0
  ), ncol = 4, byrow = TRUE,
  dimnames = list(vars, vars))

A_S <- matrix(
  c(1.0, 0.8, 0.0,   0.0,
    0.8, 1.0, 0.0,   0.0,
    0.0, 0.0, 0.262, 0.1,
    0.0, 0.0, 0.1,   0.270
  ), ncol = 4, byrow = TRUE,
  dimnames = list(vars, vars))

A_V <- solve(I - A_A) %*% A_S %*% t(solve(I - A_A))


C_A <- matrix(
  c(0.0, 0.0, 0.0, 0.0,
    0.0, 0.0, 0.0, 0.0,
    0.7, 0.1, 0.0, 0.0,
    0.2, 0.5, 0.0, 0.0
  ), ncol = 4, byrow = TRUE,
  dimnames = list(vars, vars))

C_S <- matrix(
  c(1.0, 0.3, 0.0,   0.0,
    0.3, 1.0, 0.0,   0.0,
    0.0, 0.0, 0.458, 0.1,
    0.0, 0.0, 0.1,   0.650
  ), ncol = 4, byrow = TRUE,
  dimnames = list(vars, vars))

C_V <- solve(I - C_A) %*% C_S %*% t(solve(I - C_A))


E_A <- matrix(
  c(0.0, 0.0, 0.0, 0.0,
    0.0, 0.0, 0.0, 0.0,
    0.2, 0.0, 0.0, 0.0,
    0.0, 0.2, 0.0, 0.0
  ), ncol = 4, byrow = TRUE,
  dimnames = list(vars, vars))

E_S <- matrix(
  c(1.0, 0.1, 0.0,   0.0,
    0.1, 1.0, 0.0,   0.0,
    0.0, 0.0, 0.960, 0.0,
    0.0, 0.0, 0.0,   0.960
  ), ncol = 4, byrow = TRUE,
  dimnames = list(vars, vars))

E_V <- solve(I - E_A) %*% E_S %*% t(solve(I - E_A))

vcomp <- matrix(
  c(0.5, 0.4, 0.6, 0.4,
    0.3, 0.2, 0.2, 0.2,
    0.2, 0.4, 0.2, 0.4),
  ncol = 3,
  dimnames = list(vars, c('A', 'C', 'E'))
)

A <- (sqrt(vcomp[, 'A']) %*% t(sqrt(vcomp[, 'A']))) * A_V 
C <- (sqrt(vcomp[, 'C']) %*% t(sqrt(vcomp[, 'C']))) * C_V
E <- (sqrt(vcomp[, 'E']) %*% t(sqrt(vcomp[, 'E']))) * E_V

cov_mz <- rbind(cbind(A + C + E, A + C),
                cbind(A + C, A + C + E))
cov_dz <- rbind(cbind(A + C + E, 0.5 %x% A + C),
                cbind(0.5 %x% A + C, A + C + E))

dimnames(cov_mz) <- dimnames(cov_dz) <- 
  list(paste0(vars, rep(1:2, each = length(vars))), 
       paste0(vars, rep(1:2, each = length(vars))))

mz_data <- MASS::mvrnorm(1000, 
                         mu = rep(0, 8),
                         Sigma = cov_mz)

dz_data <- MASS::mvrnorm(1000, 
                         mu = rep(0, 8),
                         Sigma = cov_dz)

data <- rbind(cbind(zyg = 'MZ', as.data.frame(mz_data)),
              cbind(zyg = 'DZ', as.data.frame(dz_data)))

Now we can apply the cross-lag ACE model to the data. The argument defintion is a list with the vectors with variable names from each one measurement occasion. Otherwise, the model works just like any other model from TwinAnalysis package, refer to another vignette. I present here a minimal example of the application of cross-lag ACE.

library(OpenMx)
#> To take full advantage of multiple cores, use:
#>   mxOption(key='Number of Threads', value=parallel::detectCores()) #now
#>   Sys.setenv(OMP_NUM_THREADS=parallel::detectCores()) #before library(OpenMx)
library(TwinAnalysis)

model <- cross_lag_ace(data, zyg = 'zyg',
                       definition = list(c('X1', 'Y1'),
                                         c('X2', 'Y2')))
#> Running crosslag with 14 parameters

model <- mxRun(model)
#> Running ACE with 34 parameters

model <- ref_models(model)
#> Running Saturated ACE with 88 parameters
#> Running Independence ACE with 32 parameters

# To fit assumption constrained reference models:
# model <- twin_ref_models(model, run = TRUE)

# Fit statistics
fit_stats(model)
#>   model      base ep minus2LL    df      AIC       BIC      CFI      TLI
#> 1   ACE Saturated 34 37170.29 15966 5238.292 -84185.72 1.001177 1.001221
#>   RMSEA   diffLL diffdf         p
#> 1     0 44.13209     54 0.8287449

# Parameter tables
get_output_tables(model)
#> $Variance_components
#>         Atot      Acom       Ares      Ctot       Ccom       Cres
#> X1 0.4648753 0.0000000 0.46487533 0.3298528 0.00000000 0.32985279
#> Y1 0.3177104 0.0000000 0.31771037 0.3038531 0.00000000 0.30385310
#> X2 0.6144000 0.5029067 0.11149331 0.1881319 0.09429262 0.09383925
#> Y2 0.3762794 0.2810327 0.09524676 0.2377194 0.10019022 0.13752920
#>         Etot        Ecom      Eres
#> X1 0.2052719 0.000000000 0.2052719
#> Y1 0.3784365 0.000000000 0.3784365
#> X2 0.1974681 0.006607846 0.1908603
#> Y2 0.3860011 0.007941010 0.3780601
#> 
#> $Path_estimates
#>                     A           C           E
#> X1 --> X2  0.56489206  0.71724060 0.181666296
#> X1 --> Y2  0.18682441  0.21765763 0.024121778
#> Y1 --> X2  0.39063876 -0.02041916 0.008038069
#> Y1 --> Y2  0.70964189  0.51857464 0.138158582
#> X1 <-> X1  1.00000000  1.00000000 1.000000000
#> X1 <-> Y1  0.78586426  0.46588136 0.135456249
#> Y1 <-> Y1  1.00000000  1.00000000 1.000000000
#> X2 <-> X2  0.18146697  0.49879507 0.966537148
#> X2 <-> Y2 -0.02827336  0.30873028 0.025003808
#> Y2 <-> Y2  0.25312774  0.57853584 0.979427495

Output tables:

Variance_components: Atot, Ctot and Etot are the standardized estimates of genetic and environmental contributions. Acom, Ccom abd Ecom are genetic and environmental contributions accounted for by the factors from the previous measurement occasion. For the first measurement occasion, there is no previous one, so Acom, Ccom and Ecom for X1 and X2 are zeros. Ares, Cres and Eres are the factors that remained unaccounted for by the previous measurement occasion. Each com and res sum up to tot and Atot, Ctot and Etot sum up to 1, the total variability of the corresponding measure.
Path_estimates: Each column A, C and E stands for the corresponding part of the twin model. Each row represents a parameter from the cross-lag model. For example, the genetic stability of X, denoted as X1 --> X2, is 0.565. The genetic correlation between X and Y in the first measurement occasion, denoted as X1 <-> Y1, is 0.786. Mind, that the total variance of the latent variables is constrained to 1 and the variance parameters in the table (X1 <-> X1, Y1 <-> Y1 etc) are residual (novel) variability, not total. In the first measurement occasion, all variability is novel, this is why all X1 <-> X1 and Y1 <-> Y1 are 1. It is not so for X2 <-> X2 and Y2 <-> Y2 because some part of variance of X2 and Y2 is explained by X1 and Y1.

Note:

at the moment, mxCheckIdentification() doesn’t work on the cross-lag ACE and mxTryHard() doesn’t run the fitting repeatedly if the model status is red.

Cross-lag ACE model

Ivan Voronin

2020-04-04

Note: