Here is my solution to the Lab assessment for today, based on our Robust Mediation seminar.
Create a dataset based on the observed coefficient estimates from the results on page 13 for the Passion → Locomotion → Grit portion of the model. Set your number of observations equal to 204, which is the same as that reported in the paper, and use set.seed(080203) for your simulation. Then conduct a sensitivity analysis on these data with the mediation package, determining the estimated value of rho at which the indirect effect would be zero. Based on this analysis, what do you conclude about the threat of omitted variable bias in the paper?
library(tidyverse)
library(lavaan)
population.model <- 'Locomotion ~ .49*Passion  # a path
Grit ~ .43*Locomotion  # b path
Grit ~ .0*Passion'  # direct (c-prime) path fixed to zero
# Generate data
set.seed(080203)
my.df <- simulateData(population.model, sample.nobs = 204)
Notice that I specified the population values for the two paths we are interested in (\(a\) and \(b\)), but constrained the direct Passion → Grit path to be equal to zero. This is a judgement call, but as reported in the paper, this path wasn’t hypothesized (though it was freed later).
Any time you generate a dataset based on published results, you want to test your model to see how close you come to what was published. So, we set up a test model.
test.model <- 'Locomotion ~ a * Passion # a path
Grit ~ b * Locomotion # b path
# Defined parameter (indirect effect)
ab := a*b'
test.fit <- sem(test.model, data = my.df)
summary(test.fit)
## lavaan (0.5-23.1097) converged normally after 11 iterations
##
## Number of observations 204
##
## Estimator ML
## Minimum Function Test Statistic 3.285
## Degrees of freedom 1
## P-value (Chi-square) 0.070
##
## Parameter Estimates:
##
## Information Expected
## Standard Errors Standard
##
## Regressions:
## Estimate Std.Err z-value P(>|z|)
## Locomotion ~
## Passion (a) 0.469 0.077 6.116 0.000
## Grit ~
## Locomotion (b) 0.468 0.060 7.857 0.000
##
## Variances:
## Estimate Std.Err z-value P(>|z|)
## .Locomotion 1.079 0.107 10.100 0.000
## .Grit 0.926 0.092 10.100 0.000
##
## Defined Parameters:
## Estimate Std.Err z-value P(>|z|)
## ab 0.220 0.046 4.826 0.000
So a couple of things to note.
While we specified the parameter values, remember this is a simulation, so we’re not going to get those exact values out of a single draw. This is why, to do this correctly, you’d actually not want to use the set.seed function at all; you’d let the draw vary, repeat the simulation, say, 1,000 times, and take the average, which should be statistically indistinguishable from our specified parameters (a rough sketch of that loop follows these notes).
That said, our simulation’s indirect effect is .220, which is only about a hundredth off of the indirect effect implied by the specified model (.49 x .43 ≈ .21), so I’m not too concerned about a substantive difference in the simulation results.
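As a rough sketch of what that repetition could look like (purely illustrative, not part of the assignment; the object names n.reps and ab.estimates are mine), you could wrap the draw-and-fit step in replicate() and average the defined ab parameter across draws:
# Sketch only: re-draw the data many times (no fixed seed), refit the
# test model, and collect the indirect effect from each draw
n.reps <- 1000
ab.estimates <- replicate(n.reps, {
  sim.df <- simulateData(population.model, sample.nobs = 204)
  sim.fit <- sem(test.model, data = sim.df)
  pe <- parameterEstimates(sim.fit)
  pe$est[pe$label == "ab"]  # the defined indirect effect
})
mean(ab.estimates)  # should land very close to .49 * .43 = .2107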
Now let’s set up our model for the simulation using the mediation package. For this one, I’m going to use 1,000 simulations.
library(mediation)
# We first set up our two regression equations
mediator.model <- lm(Locomotion ~ Passion, data = my.df)
outcome.model <- lm(Grit ~ Locomotion + Passion, data = my.df)
# Now we specify the full model using the 'mediate' function
mediation.si.fit <- mediate(mediator.model, outcome.model, treat = "Passion",
mediator = "Locomotion", robustSE = TRUE, sims = 1000)
Now let’s take a look…
summary(mediation.si.fit)
##
## Causal Mediation Analysis
##
## Quasi-Bayesian Confidence Intervals
##
## Estimate 95% CI Lower 95% CI Upper p-value
## ACME 0.2396 0.1478 0.3503 0.00
## ADE -0.1376 -0.2660 -0.0163 0.02
## Total Effect 0.1020 -0.0341 0.2440 0.15
## Prop. Mediated 2.1304 -13.0782 18.9075 0.15
##
## Sample Size Used: 204
##
##
## Simulations: 1000
So again, we’ve got a somewhat higher indirect effect (by about two one-hundredths) than the one reported in the paper. This is fine for our example, but it’s another reason why you would actually run thousands of these simulations to have more confidence.
Now we need to run some simulations using the medsens function.
sensitivity.fit <- medsens(mediation.si.fit, rho.by = 0.1,
effect.type = "indirect", sims = 1000)
summary(sensitivity.fit)
##
## Mediation Sensitivity Analysis for Average Causal Mediation Effect
##
## Sensitivity Region
##
## Rho ACME 95% CI Lower 95% CI Upper R^2_M*R^2_Y* R^2_M~R^2_Y~
## [1,] 0.4 0.0532 -0.0114 0.1178 0.16 0.1021
## [2,] 0.5 -0.0075 -0.0700 0.0550 0.25 0.1596
##
## Rho at which ACME = 0: 0.5
## R^2_M*R^2_Y* at which ACME = 0: 0.25
## R^2_M~R^2_Y~ at which ACME = 0: 0.1596
So we end up with a \(\rho\) of 0.5. We would need to see a correlation of about .5 between the disturbance terms (\(\zeta\)) in order for a simulated omitted variable to lead to a different (not necessarily correct!) nomological conclusion from what is reported in the paper.
Is that a lot? Well, it could be, but then again, it might not be. That’s the rub, and why Imai et al. do not provide a table of “critical values of \(\rho\)” that could be abused by researchers.
Again, this is one simulation. If you were going to do this in real life, you’d want to actually run thousands of simulations, collecting \(\rho\) each time, and then build a confidence interval around the average. That’s for another lab though!
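Just as a teaser, though, here’s a rough sketch of what that loop could look like. Two caveats: I’m assuming the critical value of \(\rho\) is stored in the err.cr.d component of the medsens object (check str(sensitivity.fit) to confirm before trusting this), and the object names are purely illustrative.
# Sketch only: repeat the full pipeline (new data draw, mediate, medsens)
# and collect the critical rho from each run
n.reps <- 100  # keep this modest; each run is itself simulation-heavy
rho.estimates <- replicate(n.reps, {
  sim.df <- simulateData(population.model, sample.nobs = 204)
  med.mod <- lm(Locomotion ~ Passion, data = sim.df)
  out.mod <- lm(Grit ~ Locomotion + Passion, data = sim.df)
  med.fit <- mediate(med.mod, out.mod, treat = "Passion",
                     mediator = "Locomotion", robustSE = TRUE, sims = 1000)
  sens.fit <- medsens(med.fit, rho.by = 0.1,
                      effect.type = "indirect", sims = 1000)
  sens.fit$err.cr.d  # assumed slot holding the rho at which the ACME is zero
})
mean(rho.estimates)                     # average critical rho
quantile(rho.estimates, c(.025, .975))  # a rough interval around it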