Title: | Causal Inference using Multivariate Generalized Propensity Score |
---|---|
Description: | Methods for estimating and utilizing the multivariate generalized propensity score (mvGPS) for multiple continuous exposures described in Williams, J.R., and Crespi, C.M. (2020) <arXiv:2008.13767>. The methods allow estimation of a dose-response surface relating the joint distribution of multiple continuous exposure variables to an outcome. Weights are constructed assuming a multivariate normal density for the marginal and conditional distribution of exposures given a set of confounders. Confounders can be different for different exposure variables. The weights are designed to achieve balance across all exposure dimensions and can be used to estimate dose-response surfaces. |
Authors: | Justin Williams [aut, cre] |
Maintainer: | Justin Williams <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.2.2 |
Built: | 2024-11-03 04:21:17 UTC |
Source: | https://github.com/williazo/mvgps |
Assessing balance between exposure(s) and confounders is key when performing causal analysis using propensity scores. We provide a list of several models to generate weights for causal inference with multivariate exposures, and test the balancing property of these weights using weighted Pearson correlations. In addition, the function returns the effective sample size.
bal( model_list, D, C, common = FALSE, trim_w = FALSE, trim_quantile = 0.99, all_uni = TRUE, ... )
model_list | character string identifying which methods to use when constructing weights. See Details for a list of available models |
D | numeric matrix of dimension n by m containing the values of the m exposures for each of the n units |
C | either a list of numeric matrices of length m, with each matrix containing the confounders for the corresponding exposure, or a single numeric matrix of common confounders when common=TRUE |
common | logical indicator for whether C is a single matrix of common confounders for all exposures. Default is FALSE, meaning C must be specified as a list of confounders of length m |
trim_w | logical indicator for whether to trim weights. Default is FALSE |
trim_quantile | numeric scalar used to specify the upper quantile to trim weights if applicable. Default is 0.99 |
all_uni | logical indicator. If TRUE, all univariate models specified in model_list are estimated for each exposure. If FALSE, weights are estimated only for the first exposure |
... | additional arguments to pass to weightit when estimating univariate weights |
When using propensity score methods for causal inference it is crucial to check the balancing property of the covariates and exposure(s). To do this in the multivariate case we first use a weight generating method from the available list shown below.
"mvGPS": Multivariate generalized propensity score using Gaussian densities
"entropy": Estimating weights using entropy loss function without specifying propensity score (Tübbicke 2020)
"CBPS": Covariate balancing propensity score for continuous treatments which adds balance penalty while solving for propensity score parameters (Fong et al. 2018)
"PS": Generalized propensity score estimated using univariate Gaussian densities
"GBM": Gradient boosting to estimate the mean function of the propensity score, but still maintains Gaussian distributional assumptions (Zhu et al. 2015)
Note that only the mvGPS method is multivariate; all others are strictly univariate. For univariate methods, weights are estimated for each exposure separately using the weightit function given the confounders for that exposure in C when all_uni=TRUE. To estimate weights for only the first exposure, set all_uni=FALSE.
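For reference, a univariate weight fit along these lines might look like the sketch below (the data frame df, confounder names, and the choice of method="cbps" are illustrative assumptions, not the package's internal call):

library(WeightIt)
#hypothetical data with one continuous exposure d and two confounders
df <- data.frame(d = rnorm(100), c1 = rnorm(100), c2 = rnorm(100))
#univariate continuous-treatment weights, here via CBPS
fit <- weightit(d ~ c1 + c2, data = df, method = "cbps")
w_uni <- fit$weights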
It is also important to note that the weights for each method can be trimmed at the desired quantile by setting trim_w=TRUE and setting trim_quantile in [0.5, 1]. Trimming is done at both the upper and lower bounds. For further details on how trimming is performed see mvGPS.
In this package we include three key balancing metrics to summarize balance across all of the exposures.
Euclidean distance
Maximum absolute correlation
Average absolute correlation
Euclidean distance is calculated using the origin point as reference, e.g., for m=2 exposures the reference point is [0, 0]. In this way we are calculating how far the observed set of correlation points is from perfect balance.
Maximum absolute correlation reports the largest single imbalance between the exposures and the set of confounders. It is often a key diagnostic as even a single confounder that is sufficiently out of balance can reduce performance.
Average absolute correlation is the average of the absolute exposure-confounder correlations. This metric summarizes how well, on average, the entire set of exposures is balanced.
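For intuition, the weighted Pearson correlations and the three summary metrics might be computed along the lines of the sketch below (w, D, and C_mat are illustrative stand-ins for a weight vector, the exposure matrix, and a matrix of confounders; this is not the package's internal code):

#weighted Pearson correlation of every exposure-confounder pair via stats::cov.wt
w_cor <- sapply(seq_len(ncol(C_mat)), function(l) {
  sapply(seq_len(ncol(D)), function(j) {
    cov.wt(cbind(D[, j], C_mat[, l]), wt = w, cor = TRUE)$cor[1, 2]
  })
})
euc_dist <- sqrt(sum(w_cor^2)) #distance of the correlation points from the origin
max_cor <- max(abs(w_cor))     #largest single exposure-confounder imbalance
avg_cor <- mean(abs(w_cor))    #average absolute correlation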
Effective sample size, ESS, is defined as

ESS = (\sum_{i=1}^{n} w_i)^2 / \sum_{i=1}^{n} w_i^2,

where w_i are the estimated weights for a particular method (Kish 1965). Note that when w_i = 1 for all units, the ESS is equal to the sample size n. ESS decreases when there are extreme weights or high variability in the weights.
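As a quick illustration (a minimal sketch, with w standing for any vector of estimated weights):

#effective sample size (Kish 1965) for a weight vector w
w <- runif(100, 0.2, 3) #illustrative weights
ess <- sum(w)^2 / sum(w^2)
ess #equals length(w) when all weights are identical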
W: list of weights generated for each model
cor_list: list of weighted Pearson correlation coefficients for all confounders specified
bal_metrics: data.frame with the Euclidean distance, maximum absolute correlation, and average absolute correlation by method
ess: effective sample size for each of the methods used to generate weights
models: vector of models used
Fong C, Hazlett C, Imai K (2018).
“Covariate balancing propensity score for a continuous treatment: application to the efficacy of political advertisements.”
Annals of Applied Statistics, In-Press.
Kish L (1965).
Survey Sampling.
John Wiley & Sons, New York.
Tübbicke S (2020).
“Entropy Balancing for Continuous Treatments.”
arXiv e-prints.
arXiv:2001.06281.
Zhu Y, Coffman DL, Ghosh D (2015).
“A boosting algorithm for estimating generalized propensity scores with continuous treatments.”
Journal of Causal Inference, 3(1), 25-40.
#simulating data
sim_dt <- gen_D(method="u", n=150, rho_cond=0.2, s_d1_cond=2, s_d2_cond=2, k=3,
                C_mu=rep(0, 3), C_cov=0.1, C_var=1, d1_beta=c(0.5, 1, 0),
                d2_beta=c(0, 0.3, 0.75), seed=06112020)
D <- sim_dt$D
C <- sim_dt$C

#generating weights using mvGPS and potential univariate alternatives
require(WeightIt)
bal_sim <- bal(model_list=c("mvGPS", "entropy", "CBPS", "PS", "GBM"), D,
               C=list(C[, 1:2], C[, 2:3]))

#overall summary statistics
bal_sim$bal_metrics

#effective sample sizes
bal_sim$ess

#we can also trim weights for all methods
bal_sim_trim <- bal(model_list=c("mvGPS", "entropy", "CBPS", "PS", "GBM"), D,
                    C=list(C[, 1:2], C[, 2:3]), trim_w=TRUE, trim_quantile=0.9,
                    p.mean=0.5)
#note that in this case we can also pass additional arguments used in the
#WeightIt package for entropy, CBPS, PS, and GBM, such as specifying p.mean

#can check to ensure all the weights have been properly trimmed at the upper
#and lower bounds
all.equal(unname(unlist(lapply(bal_sim$W, quantile, 0.9))),
          unname(unlist(lapply(bal_sim_trim$W, max))))
all.equal(unname(unlist(lapply(bal_sim$W, quantile, 1-0.9))),
          unname(unlist(lapply(bal_sim_trim$W, min))))
Generate exposure from a bivariate normal distribution confounded by a set of variables C = (C1, ..., Ck).
gen_D( method, n, rho_cond, s_d1_cond, s_d2_cond, k, C_mu, C_cov, C_var, C_sigma = NULL, d1_beta, d2_beta, seed = NULL )
method | character value identifying which method to use when generating the bivariate exposure. Options include "matrix_normal", "uni_cond", and "vector_normal". See Details for a brief explanation of each method. |
n | integer value, total number of units |
rho_cond | scalar value identifying the conditional correlation of exposures given covariates, between [0, 1] |
s_d1_cond | scalar value for the conditional standard deviation of D1 |
s_d2_cond | scalar value for the conditional standard deviation of D2 |
k | integer value determining the number of covariates to generate in C |
C_mu | numeric vector of mean values for the covariates. Must be the same length as k |
C_cov | scalar value representing the constant correlation between covariates |
C_var | scalar value representing the constant variance of the covariates |
C_sigma | numeric matrix representing the covariance matrix of the covariates. Default is NULL, in which case C_cov and C_var are used to construct the covariance matrix |
d1_beta | numeric vector of length k specifying the effects of the covariates on exposure D1 |
d2_beta | numeric vector of length k specifying the effects of the covariates on exposure D2 |
seed | integer value setting the seed of the random generator to produce repeatable results. Set to NULL by default |
We assume that there are a total of k confounders that are generated from a multivariate normal distribution with equicorrelation covariance, i.e.,

\Sigma_C = \sigma_C^2 I_k + \rho_C (1_k 1_k^T - I_k),

where 1_k is the column vector with all entries equal to 1, I_k is the identity matrix, \sigma_C is a constant standard deviation for all confounders, and \rho_C is the covariance of any two confounders. Therefore, our random confounders C follow the distribution

C ~ N_k(\mu_C, \Sigma_C).

We draw a total of n samples from this multivariate normal distribution using mvrnorm.
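A minimal sketch of this draw, assuming the argument names above map directly onto the covariance entries (C_var on the diagonal, C_cov off the diagonal):

library(MASS)
k <- 3; n <- 200
C_mu <- rep(0, k); C_var <- 1; C_cov <- 0.1
Sigma_C <- matrix(C_cov, nrow = k, ncol = k) #constant off-diagonal entries
diag(Sigma_C) <- C_var                       #constant variance on the diagonal
C <- mvrnorm(n = n, mu = C_mu, Sigma = Sigma_C)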
The first step when generating the bivariate exposure is to specify the effects of the confounders C. We control this for each exposure using the arguments d1_beta and d2_beta such that

E[D_1 | C] = C \beta_{D1} and E[D_2 | C] = C \beta_{D2}.
Note that by specifying d1_beta and d2_beta separately, the user can control the amount of overlap in the confounders for each exposure, and how many of the variables in C are truly related to the exposures. For instance, the exposures have identical confounding effects when d1_beta=d2_beta, and they have completely separate confounding when there are no non-zero elements in common between d1_beta and d2_beta.
To generate the bivariate conditional distribution of exposures given the set of confounders C we have the following three methods:
"matrix_normal"
"uni_cond"
"vector_normal"
"matrix_normal" uses the function rmatnorm
to
generate all n
samples as
where is a column vector containing
and
, and
is the conditional covariance matrix.
"vector_normal" simply vectorizes the matrix_normal method above to generate
a vector of length .
"uni_cond" specifies the bivariate exposure using univariate conditional factorization, which in the case of bivariate normal results in two univariate normal expressions.
In general, we suggest using the univariate conditional, "uni_cond", method when generating exposures as it is substantially faster than both the matrix normal and vector normal approaches.
Note that the options use regular expression matching and can be specified uniquely using either "m", "u", or "v".
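The "uni_cond" factorization can be sketched as follows (a hand-rolled illustration of conditional bivariate normal sampling, reusing the argument names above and a confounder matrix C such as the one drawn in the previous snippet; it is not the package's exact code):

#D1 | C ~ N(mu1, s_d1_cond^2)
#D2 | C, D1 ~ N(mu2 + rho*(s2/s1)*(D1 - mu1), s_d2_cond^2*(1 - rho^2))
d1_beta <- c(0.5, 1, 0); d2_beta <- c(0, 0.3, 0.75)
rho_cond <- 0.2; s_d1_cond <- 2; s_d2_cond <- 2
mu1 <- drop(C %*% d1_beta)
mu2 <- drop(C %*% d2_beta)
D1 <- rnorm(n, mean = mu1, sd = s_d1_cond)
D2 <- rnorm(n, mean = mu2 + rho_cond * (s_d2_cond / s_d1_cond) * (D1 - mu1),
            sd = s_d2_cond * sqrt(1 - rho_cond^2))
D <- cbind(D1, D2)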
As described above the exposures are drawn conditional on the set C, so the marginal covariance of the exposures follows from the law of total variance,

\Sigma_D = \Sigma_{D|C} + B^T \Sigma_C B,

where B = (\beta_{D1}, \beta_{D2}) is the k x 2 matrix of confounder effects. In our function we return the true marginal covariance \Sigma_D as well as the true marginal correlation \rho_{D_1 D_2}.
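Under that construction, the returned quantities could be reproduced with a few lines (a sketch reusing the names from the previous snippets; not the package's internal code):

Sigma_cond <- matrix(c(s_d1_cond^2,
                       rho_cond * s_d1_cond * s_d2_cond,
                       rho_cond * s_d1_cond * s_d2_cond,
                       s_d2_cond^2), nrow = 2)
B <- cbind(d1_beta, d2_beta)
D_Sigma <- Sigma_cond + t(B) %*% Sigma_C %*% B #true marginal covariance
rho <- cov2cor(D_Sigma)[1, 2]                  #true marginal correlation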
D: n x 2 numeric matrix of the sampled values of the exposures given the set C
C: n x k numeric matrix of the sampled values of the confounding set C
D_Sigma: 2 x 2 numeric matrix of the true marginal covariance of the exposures
rho: numeric scalar representing the true marginal correlation of the exposures
#generate bivariate exposures. D1 confounded by C1 and C2. D2 by C2 and C3
#uses univariate conditional normal to draw samples
sim_dt <- gen_D(method="u", n=200, rho_cond=0.2, s_d1_cond=2, s_d2_cond=2, k=3,
                C_mu=rep(0, 3), C_cov=0.1, C_var=1, d1_beta=c(0.5, 1, 0),
                d2_beta=c(0, 0.3, 0.75), seed=06112020)
D <- sim_dt$D
C <- sim_dt$C

#observed correlation should be close to true marginal value
cor(D); sim_dt$rho

#use vector normal method instead of univariate method to draw samples
sim_dt <- gen_D(method="v", n=200, rho_cond=0.2, s_d1_cond=2, s_d2_cond=2, k=3,
                C_mu=rep(0, 3), C_cov=0.1, C_var=1, d1_beta=c(0.5, 1, 0),
                d2_beta=c(0, 0.3, 0.75), seed=06112020)
To define a proper estimable region with multivariate exposure we construct a convex hull of the data in order to maintain the positivity identifying assumption. We also provide options to create trimmed versions of the convex hull to further restrict to high density regions in multidimensional space.
hull_sample( X, num_grid_pts = 500, grid_type = "regular", trim_hull = FALSE, trim_quantile = NULL )
X | numeric matrix of n by m dimensions. Each row corresponds to a point in m-dimensional space. |
num_grid_pts | integer scalar denoting the number of grid points to sample over the convex hull. Default is 500. |
grid_type | character value indicating the type of grid to sample over the convex hull. Default is "regular". |
trim_hull | logical indicator of whether to restrict the convex hull. Default is FALSE |
trim_quantile | numeric scalar between [0.5, 1] representing the quantile value used to trim the convex hull. Only used if trim_hull is set to TRUE. |
Assume that X is an n x m matrix representing the multivariate exposure of interest. We can define the convex hull of these observations as H. There are two distinct processes for defining H depending on whether m = 2 or m > 2.

If m = 2, we use the chull function to define the vertices of the convex hull. The algorithm implemented is described in Eddy (1977).

If m > 2, we use the convhulln function. This algorithm for obtaining the convex hull in m-dimensional space uses Qhull, described in Barber et al. (1996). Currently this function returns only the vertex set hpts_vs without the grid sample points. There are options to visualize the convex hull using triangular facets, but there are no implementable solutions to sample along convex hulls in higher dimensions.
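For reference, the two paths can be sketched directly with the underlying functions (base chull and geometry::convhulln; illustrative code outside of hull_sample):

library(geometry)
set.seed(1)
X2 <- matrix(rnorm(200), ncol = 2)
vs2 <- chull(X2)                 #m = 2: indices of the hull vertices
X3 <- matrix(rnorm(300), ncol = 3)
facets <- convhulln(X3)          #m > 2: matrix of triangular facet indices (Qhull)
vs3 <- unique(as.vector(facets)) #vertex set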
To restrict the convex hull to higher density regions of the exposure we can apply trimming. To apply trimming set trim_hull=TRUE and specify trim_quantile=q, where q must be in [0.5, 1]. Along each exposure dimension we then calculate the upper and lower bounds using the quantile function, i.e., quantile(q) and quantile(1-q). Any observation with a value above or below these sample quantiles is excluded. The remaining observations, which fall completely within the sample quantiles across all dimensions, are used to estimate the convex hull. We return X, which represents the observations used. If trim_hull=FALSE, then X is unchanged. However, if trimming is applied then X contains only the remaining observations after trimming.
hpts_vs: vertices of the convex hull in m-dimensional space
grid_pts: values of grid points sampled over the corresponding convex hull
X: data used to generate the convex hull, which may be trimmed
Barber CB, Dobkin DP, Huhdanpaa H (1996).
“The quickhull algorithm for convex hulls.”
ACM Transactions on Mathematical Software (TOMS), 22(4), 469-483.
Eddy WF (1977).
“A new convex hull algorithm for planar sets.”
ACM Transactions on Mathematical Software (TOMS), 3(4), 398-403.
#generating exposure with m=3
D <- matrix(unlist(lapply(seq_len(3), function(m) rnorm(100))), nrow=100)

#first using only the first two variables we can return hpts_vs and grid_pts
D_hull <- hull_sample(D[, 1:2])

#when m>2 we only return hpts_vs and grid_pts is NULL
D_hull_large <- hull_sample(D)
is.null(D_hull_large$grid_pts)

#we can also apply trimming to the convex hull and return this reduced matrix
D_hull_trim <- hull_sample(D[, 1:2], trim_hull=TRUE, trim_quantile=0.95)
dim(D_hull$X)
dim(D_hull_trim$X)

#alternatively, we can also define the number of points to sample from for grid_pts
small_grid <- hull_sample(D[, 1:2], num_grid_pts=100)
length(D_hull$grid_pts)
length(small_grid$grid_pts)
Estimate propensity scores for multivariate continuous exposure by assuming joint normal conditional densities. Simultaneously estimate stabilized inverse probability of treatment weights (IPTW) using joint normal density for marginal distribution of exposure.
mvGPS(D, C, common = FALSE, trim_w = FALSE, trim_quantile = 0.99)
D | numeric matrix of dimension n by m containing the values of the m exposures for each of the n units |
C | either a list of numeric matrices of length m, with each matrix containing the confounders for the corresponding exposure, or a single numeric matrix of common confounders when common=TRUE |
common | logical indicator for whether C is a single matrix of common confounders for all exposures. Default is FALSE, meaning C must be specified as a list of confounders of length m |
trim_w | logical indicator for whether to trim weights. Default is FALSE |
trim_quantile | numeric scalar used to specify the upper quantile to trim weights if applicable. Default is 0.99 |
Generalized propensity scores (GPS) were proposed by Hirano and Imbens (2004) and Imai and Van Dyk (2004) to extend propensity scores to handle continuous exposures. The GPS is constructed using the conditional density of the exposure given a set of confounders. These original methods and the subsequent literature have focused on the case of a single continuous exposure where the GPS could be estimated using normal densities, kernel smoothing (Kennedy et al. 2017), gradient boosting (Zhu et al. 2015), and empirical likelihoods (Fong et al. 2018). In this package we extend this literature to allow the propensity score to be estimated for multivariate continuous exposures.
Assume that we have a set of continuous exposures D = (D_1, ..., D_m) of length m collected on n units. Further, we assume that there exists a set of confounders C_j of length k_j for each exposure j = 1, ..., m. The confounders are related to both the exposures and the outcome of interest. Note that the confounders need not be identical for all exposures.

In our function we therefore expect that the argument D is a numeric matrix of dimension n x m, and that C is a list of length m where each element is a matrix of dimension n x k_j. For the case where we assume that all exposures have common confounders we set common=TRUE and C must be a matrix of dimension n x k.
We define the multivariate generalized propensity score, mvGPS, as

mvGPS = f_{D|C}(D_1, ..., D_m | C_1, ..., C_m),

where f_{D|C} represents a joint multivariate conditional density function. For our current development we specify f_{D|C} as multivariate normal, i.e.,

D | C ~ N_m(\mu_{D|C}, \Sigma_{D|C}).

Factorizing this joint density we can rewrite the mvGPS as the product of univariate conditional densities, i.e.,

mvGPS = f_{D_m | C_m, D_{m-1}, ..., D_1} \cdots f_{D_1 | C_1}.

We use this factorized version in our implementation, with parameters for each distribution estimated through least squares.
Following Robins et al. (2000), we use the mvGPS to construct stabilized inverse probability of treatment weights (IPTW). These have been shown to balance confounders and return unbiased estimates of the dose-response. Weights are constructed as

w = f_D(D_1, ..., D_m) / f_{D|C}(D_1, ..., D_m | C_1, ..., C_m),

where the marginal density f_D of the exposures is assumed to be multivariate normal as well. Writing the weights using completely factorized densities we have

w = [f_{D_m | D_{m-1}, ..., D_1} \cdots f_{D_1}] / [f_{D_m | C_m, D_{m-1}, ..., D_1} \cdots f_{D_1 | C_1}].
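As a rough illustration of this factorized construction for m = 2 (a hand-rolled sketch, not the package's exact implementation; D is an n x 2 exposure matrix and C1, C2 are numeric matrices of confounders for each exposure):

#conditional densities f(D1 | C1) and f(D2 | C2, D1), fit by least squares
fit1 <- lm(D[, 1] ~ C1)
fit2 <- lm(D[, 2] ~ C2 + D[, 1])
f_cond <- dnorm(D[, 1], mean = fitted(fit1), sd = sigma(fit1)) *
  dnorm(D[, 2], mean = fitted(fit2), sd = sigma(fit2))

#marginal densities f(D1) and f(D2 | D1) for the stabilized numerator
m1 <- lm(D[, 1] ~ 1)
m2 <- lm(D[, 2] ~ D[, 1])
f_marg <- dnorm(D[, 1], mean = fitted(m1), sd = sigma(m1)) *
  dnorm(D[, 2], mean = fitted(m2), sd = sigma(m2))

w <- f_marg / f_cond #stabilized inverse probability of treatment weights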
Often when using weights based on the propensity score, practitioners are concerned about the effect of extreme weights. It has been shown that an effective way to protect against extreme weights is to trim them at a particular percentile (Crump et al. 2009; Lee et al. 2011). We allow users to specify whether trimmed weights should be returned and at which threshold. To trim weights set trim_w=TRUE and specify the desired percentile as trim_quantile=q. Note that trimming is applied at both the upper and lower percentile thresholds, i.e., at the q and 1-q sample quantiles of the weights.
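A plausible sketch of this two-sided trimming, assuming weights beyond the thresholds are replaced by the threshold values (consistent with the checks in the bal example above; w is a vector of estimated weights):

q <- 0.99
w_trim <- pmin(pmax(w, quantile(w, probs = 1 - q)), quantile(w, probs = q))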
list of score and wts, where score is the mvGPS score values and wts are the corresponding stabilized inverse probability of treatment weights
Crump RK, Hotz VJ, Imbens GW, Mitnik OA (2009).
“Dealing with limited overlap in estimation of average treatment effects.”
Biometrika, 96(1), 187-199.
Fong C, Hazlett C, Imai K (2018).
“Covariate balancing propensity score for a continuous treatment: application to the efficacy of political advertisements.”
Annals of Applied Statistics, In-Press.
Hirano K, Imbens GW (2004).
“The propensity score with continuous treatments.”
In Gelman A, Meng X (eds.), Applied Bayesian Modeling and Causal Inference from Incomplete-Data Perspectives, 73-84.
Imai K, Van Dyk DA (2004).
“Causal inference with general treatment regimes: generalizing the propensity score.”
Journal of the American Statistical Association, 99(467), 854-866.
Kennedy EH, Ma Z, McHugh MD, Small DS (2017).
“Non-parametric methods for doubly robust estimation of continuous treatment effects.”
Journal of the Royal Statistical Society: Series B, 79(4), 1229-1245.
Lee BK, Lessler J, Stuart EA (2011).
“Weight trimming and propensity score weighting.”
PLoS One, 6(3).
Robins JM, Hernan MA, Brumback B (2000).
“Marginal structural models and causal inference in epidemiology.”
Epidemiology, 11(5), 550-560.
Zhu Y, Coffman DL, Ghosh D (2015).
“A boosting algorithm for estimating generalized propensity scores with continuous treatments.”
Journal of Causal Inference, 3(1), 25-40.
#generating confounded bivariate exposure
sim_dt <- gen_D(method="u", n=200, rho_cond=0.2, s_d1_cond=2, s_d2_cond=2, k=3,
                C_mu=rep(0, 3), C_cov=0.1, C_var=1, d1_beta=c(0.5, 1, 0),
                d2_beta=c(0, 0.3, 0.75), seed=06112020)
D <- sim_dt$D
C <- sim_dt$C

#generating weights and mvGPS
out_mvGPS <- mvGPS(D=D, C=list(C[, 1:2], C[, 2:3]))

#can apply trimming with default 99th percentile
out_mvGPS_trim <- mvGPS(D=D, C=list(C[, 1:2], C[, 2:3]), trim_w=TRUE)

#or assume all exposures have equivalent confounders
out_mvGPS_common <- mvGPS(D=D, C=C, common=TRUE)