Workshop 2:

An Introduction to Structural Equation Model Trees and Forests


Andreas Brandmaier

Max Planck Institute for Human Development and Max Planck UCL Centre for Computational Psychiatry and Ageing Research


Abstract

Structural Equation Models (SEM) are widely accepted as a hypothesis-driven tool for modeling relations between latent and observed variables. SEM is particularly attractive because it can be seen as a generalization of several multivariate analysis techniques. Decision trees, on the other hand, are data-driven, non-parametric models that learn hierarchical structures from observed data in order to predict outcomes. Structural Equation Model Trees (SEM trees) combine the strengths of SEM and the decision tree paradigm: they build tree structures that recursively separate a dataset into subsets with significantly different parameter estimates in a hypothesized SEM. If a single global SEM does not fit all observations well and additional variables are available, SEM trees may be able to partition the observations with respect to these variables and find a better-fitting model in each resulting group of observations.
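To make the splitting idea concrete, the following is a minimal sketch in Python, not the procedure implemented in the "semtree" package. It assumes the simplest possible "model" (a univariate normal distribution with free mean and variance) in place of a full SEM, a binary covariate coded 0/1, and a naive likelihood-ratio test whose chi-square reference distribution is only an asymptotic approximation. The function names and the toy data are hypothetical.

```python
import math

def gaussian_loglik(values):
    """Maximized log-likelihood of a normal model with free mean and variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # ML variance estimate
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)

def split_test(outcome, covariate):
    """Likelihood-ratio test: one model for all cases vs. one model per group.

    Assumes a binary covariate coded 0/1. The two-group model has two extra
    parameters (mean and variance), so the statistic is compared against a
    chi-square distribution with 2 degrees of freedom.
    """
    left = [y for y, g in zip(outcome, covariate) if g == 0]
    right = [y for y, g in zip(outcome, covariate) if g == 1]
    lr = 2 * (gaussian_loglik(left) + gaussian_loglik(right)
              - gaussian_loglik(outcome))
    p_value = math.exp(-lr / 2)  # chi-square survival function for df = 2
    return lr, p_value

# Toy data (made up for illustration): "group" shifts the mean, "noise" does not.
outcome = [4.1, 3.8, 4.4, 4.0, 3.9, 6.2, 5.9, 6.1, 6.4, 6.0]
group = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
noise = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

lr_group, p_group = split_test(outcome, group)
lr_noise, p_noise = split_test(outcome, noise)
print(f"group: LR = {lr_group:.2f}, p = {p_group:.4f}")
print(f"noise: LR = {lr_noise:.2f}, p = {p_noise:.4f}")
```

A tree-building procedure would apply such a test to every candidate covariate, split on the one with the strongest evidence for differing parameters, and repeat within each resulting subset until no further split is significant.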

SEM forests are ensembles of SEM trees, each built on a random sample of the original data. By aggregating the predictive information across the trees in a forest, researchers obtain a non-parametric estimate of variable importance that is more robust than the corresponding measures from single trees. By combining the flexibility of SEM as a generic modeling technique with the potential of trees and forests to account for diverse and possibly interacting predictors, SEM trees and forests serve as a powerful tool for generating and refining hypotheses in large datasets.
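The ensemble idea can be sketched in the same simplified setting. The Python sketch below is an illustration under strong assumptions and not the variable importance measure computed by the "semtree" package: each "tree" is reduced to a single split on a bootstrap resample, the model is a univariate normal distribution rather than an SEM, and importance is approximated by how often a covariate is chosen as the best split. All names and the simulated data are hypothetical.

```python
import math
import random

def gaussian_loglik(values):
    """Maximized log-likelihood of a normal model with free mean and variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)

def lr_statistic(outcome, covariate):
    """Likelihood-ratio statistic for splitting on a binary (0/1) covariate."""
    left = [y for y, g in zip(outcome, covariate) if g == 0]
    right = [y for y, g in zip(outcome, covariate) if g == 1]
    if len(set(left)) < 2 or len(set(right)) < 2:
        return 0.0  # split cannot be estimated in this resample
    return 2 * (gaussian_loglik(left) + gaussian_loglik(right)
                - gaussian_loglik(outcome))

def selection_counts(outcome, covariates, n_trees=200, seed=1):
    """Count how often each covariate gives the best split across bootstrap resamples."""
    rng = random.Random(seed)
    n = len(outcome)
    counts = {name: 0 for name in covariates}
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample cases with replacement
        y = [outcome[i] for i in idx]
        best = max(
            covariates,
            key=lambda name: lr_statistic(y, [covariates[name][i] for i in idx]),
        )
        counts[best] += 1
    return counts

# Simulated data: only "group" affects the outcome; "noise" is unrelated to it.
data_rng = random.Random(42)
covariates = {
    "group": [i % 2 for i in range(100)],
    "noise": [(i // 2) % 2 for i in range(100)],
}
outcome = [data_rng.gauss(0, 1) + 2 * g for g in covariates["group"]]

counts = selection_counts(outcome, covariates)
print(counts)
```

Because every resample yields a slightly different dataset, aggregating over many resamples gives a more stable picture of which covariates matter than a single tree grown on the full data.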

In this workshop, we will cover the essentials of decision trees and SEM, and show how these ideas can be combined into SEM trees and forests. Using the “semtree” package for the R programming language and Onyx, a graphical tool for SEM, we will learn how to run SEM tree and forest analyses on both simulated and real data.

For the practical part, participants are kindly asked to bring their own laptops with R and a recent version of the Java Runtime Environment installed. A basic understanding of SEM may be helpful.

 

The workshops will take place on Sunday, 15.09.2019 from 9:00 to 17:00. Registration for the workshops can be completed via ConfTool.