Workshop 2:

An Introduction to Structural Equation Model Trees and Forests


Andreas Brandmaier

Max-Planck-Institut für Bildungsforschung and Max Planck UCL Centre for Computational Psychiatry and Ageing Research
 

Abstract

Structural Equation Models (SEM) have become widely accepted as a hypothesis-driven modeling tool for the relations between latent and observed variables. SEM is particularly attractive because it can be seen as a generalization of several multivariate analysis techniques. Decision trees, on the other hand, are data-driven, non-parametric models that learn hierarchical structures from observed data in order to predict outcomes. Structural Equation Model Trees (SEM trees) combine the strengths of SEM and the decision tree paradigm: they build tree structures that recursively split a dataset into subsets whose parameter estimates in a hypothesized SEM differ significantly. If a single global SEM does not fit all observations well and additional variables (covariates) are available, SEM trees may be able to partition the observations with respect to these variables and find a better-fitting model in each resulting group of observations.
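To make the splitting idea concrete, here is a minimal Python sketch of a single candidate split. It is an illustration under strong simplifying assumptions, not the semtree implementation: the "SEM" is reduced to a univariate normal model with a free mean and variance, the covariate is binary, and the function names (`gaussian_loglik`, `split_test`) are hypothetical. The split is evaluated with a likelihood-ratio test that compares one model for all observations against separate models per group.

```python
import math
import random

def gaussian_loglik(ys):
    """Maximized log-likelihood of a normal model with free mean and variance."""
    n = len(ys)
    mean = sum(ys) / n
    var = sum((y - mean) ** 2 for y in ys) / n  # maximum-likelihood variance
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)

def split_test(ys, groups):
    """Likelihood-ratio test: one model for everyone vs. one model per group."""
    left = [y for y, g in zip(ys, groups) if g == 0]
    right = [y for y, g in zip(ys, groups) if g == 1]
    lr = 2 * (gaussian_loglik(left) + gaussian_loglik(right) - gaussian_loglik(ys))
    # The split model has 2 extra parameters; for df = 2 the chi-square
    # survival function has the closed form exp(-x / 2).
    p_value = math.exp(-lr / 2)
    return lr, p_value

# Simulated data (assumption): the mean differs by 1 between the two groups.
random.seed(1)
groups = [i % 2 for i in range(200)]
ys = [random.gauss(0.0 if g == 0 else 1.0, 1.0) for g in groups]
noise = [random.randrange(2) for _ in range(200)]  # covariate unrelated to ys

lr_real, p_real = split_test(ys, groups)
lr_noise, p_noise = split_test(ys, noise)
print(f"informative covariate: LR = {lr_real:.2f}, p = {p_real:.4g}")
print(f"noise covariate:       LR = {lr_noise:.2f}, p = {p_noise:.4g}")
```

A tree would apply such a test to every candidate covariate, split on the most promising one if it is significant, and then repeat the procedure within each resulting subset.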

SEM forests are ensembles of SEM trees, each built on a random sample of the original data. By aggregating the predictive information across the trees in a forest, researchers obtain a non-parametric estimate of variable importance that is more robust than the corresponding measures from single trees. By combining the flexibility of SEM as a generic modeling technique with the ability of trees and forests to account for diverse and possibly interacting predictors, SEM trees and forests serve as a powerful tool for generating and refining hypotheses in large datasets.
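The aggregation step can be sketched in the same simplified setting. The following Python code is a toy illustration, loosely in the spirit of permutation importance, and should not be read as the semtree package's algorithm. Assumptions: each "tree" is a single split (a stump), the model is again a univariate normal, covariates are binary, and all function names are hypothetical. Each stump is grown on a bootstrap sample; importance is the average loss in out-of-bag log-likelihood when the values of the split covariate are randomly permuted.

```python
import math
import random

def fit(ys):
    """ML estimates (mean, variance) of a normal model; a stand-in for fitting an SEM."""
    n = len(ys)
    mean = sum(ys) / n
    var = max(sum((y - mean) ** 2 for y in ys) / n, 1e-12)
    return mean, var

def loglik(ys, mean, var):
    """Log-likelihood of ys under a normal model with the given parameters."""
    return sum(-0.5 * (math.log(2 * math.pi * var) + (y - mean) ** 2 / var) for y in ys)

def grow_stump(rows, covariates, min_n=5):
    """One-split 'tree': pick the binary covariate whose two groups fit best."""
    best = None
    for c in covariates:
        parts = {g: [r["y"] for r in rows if r[c] == g] for g in (0, 1)}
        if min(len(parts[0]), len(parts[1])) < min_n:
            continue
        params = {g: fit(parts[g]) for g in (0, 1)}
        ll = sum(loglik(parts[g], *params[g]) for g in (0, 1))
        if best is None or ll > best[0]:
            best = (ll, c, params)
    return best  # None if no admissible split exists

def oob_loglik(stump, rows, values):
    """Log-likelihood of held-out rows when routed by the given covariate values."""
    params = stump[2]
    return sum(loglik([r["y"]], *params[v]) for r, v in zip(rows, values))

def forest_importance(rows, covariates, n_trees=50, seed=0):
    """Average out-of-bag loss in log-likelihood after permuting each covariate."""
    rng = random.Random(seed)
    importance = {c: 0.0 for c in covariates}
    grown = 0
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        chosen = set(idx)
        oob = [r for i, r in enumerate(rows) if i not in chosen]
        stump = grow_stump([rows[i] for i in idx], covariates)
        if stump is None or not oob:
            continue
        grown += 1
        split_var = stump[1]
        values = [r[split_var] for r in oob]
        shuffled = values[:]
        rng.shuffle(shuffled)
        # In a one-split tree, only the split variable can lose fit when permuted.
        importance[split_var] += (oob_loglik(stump, oob, values)
                                  - oob_loglik(stump, oob, shuffled))
    return {c: total / max(grown, 1) for c, total in importance.items()}

# Simulated data (assumption): "group" shifts the mean, "noise" is unrelated.
data_rng = random.Random(42)
rows = []
for i in range(200):
    group = i % 2
    rows.append({"y": data_rng.gauss(float(group), 1.0),
                 "group": group,
                 "noise": data_rng.randrange(2)})

importance = forest_importance(rows, ["group", "noise"])
print(importance)
```

Averaging over many resampled trees is what makes the importance estimate less dependent on the particular split structure of any single tree.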

In this workshop, we will cover the essentials of decision trees and SEM, and show how these ideas can be combined into SEM trees and forests. Using the “semtree” package for the R programming language and Onyx, a graphical tool for SEM, we will learn how to run SEM tree and forest analyses on both simulated and real data.

For the practical part, participants are kindly asked to bring their own laptops running R and a recent version of the Java runtime environment. A basic understanding of SEM may be helpful.

 

The workshops take place on Sunday, 15 September 2019, each from 9:00 to 17:00. Registration for the workshops is done as part of the registration via ConfTool.