This exercise continues the analysis of the lung cancer data in Exercise 2.3
Following Exercise 2.3, assume that the survival time \(X\) of the lung cancer patients are independent Weibull distributed \[
X_1,\ldots,X_n \vert \lambda, k \overset{\mathrm{iid}}{\sim} \mathrm{Weibull}(\lambda,k).
\] In Exercise 2.3 the value of \(k\) was fixed; here we will treat both \(\lambda\) and \(k\) as unknown. Since both these parameters are positive we reparameterize first by taking logs: \[\tilde\lambda := \log\lambda \text{ and } \tilde k := \log k,\] which usually improves the normal approximation. We assume prior independence between the two parameters and priors: \[\lambda \sim \mathrm{LogNormal}(5,1^2) \text{ and } k \sim \mathrm{LogNormal}(0,2^2),
\] which by the definition of the log-normal distribution implies that \[\tilde\lambda \sim N(5,1^2) \text{ and } \tilde k \sim N(0,2^2).\]
Compute a bivariate normal approximation of the joint posterior distribution in the log-parameterization \(p(\tilde\lambda, \tilde k \vert x_1,\ldots,x_n)\) using numerical optimization.
Use the results from (a) to obtain a Log-normal approximation for the two marginal posterior distributions \(p(\lambda \vert x_1,\ldots,x_n)\) and \(p(k \vert x_1,\ldots,x_n)\) in the original parameterization. Plot the prior and posterior density for both parameters. hint: remember that the marginal distributions from a bivariat normal distribution are both normal. Also, recall the relationship between normal and log-normal distributions.
If \(X \sim \mathrm{Weibull}(\lambda, k)\) then \[
\mu := \mathbb{E}(X)=\lambda\Gamma(1+1/k),
\] where \(\Gamma()\) is the gamma function. Obtain an approximate posterior distribution for \(\mu\) using the results from (a) and simulation.
Solution Exercise 7.2a
library(tidyverse) # loads data manipulation and visualization packageslibrary(survival) # loads the lung cancer data as `lung`colors =c("#6C8EBF", "#c0a34d", "#780000","#007878","#B5C6DF","#EADAAA","#AE6666")
Set data and set up prior hyperparameters
x = lung$timecensored = (lung$status ==1)log_lambda_mean <-5log_lambda_sd <-1log_k_mean <-0log_k_sd <-2
Set up function that computes the log posterior for any \(\boldsymbol{\theta}=(\tilde\lambda,\tilde k)\) vector. The first argument of this function must be a vector containing all parameters.
# Function for computing the log posterior for any given parameter vector logpost_weibull <-function(theta, x, censored){# Compute the parameters in the original scale lambda =exp(theta[1]) k =exp(theta[2])# Compute the (log) joint prior density logPrior =dnorm(theta[1], log_lambda_mean, log_lambda_sd, log =TRUE) +dnorm(theta[2], log_k_mean, log_k_sd, log =TRUE)# Compute the log-likelihood loglik_uncensored =sum(dweibull(x[-censored], shape = k, scale = lambda, log =TRUE)) loglik_censored =sum(pweibull(x[censored], shape = k, scale = lambda, lower.tail =FALSE, log.p =TRUE)) logLik = loglik_uncensored + loglik_censored# Return the log posteriorreturn(logLik + logPrior) }
Use optim to find the posterior mode and the observed information matrix J
from which we can compute approximate posterior standard deviations for \(\tilde\lambda\) and \(\tilde k\)
postStd
[1] 0.04652876 0.05190422
Solution Exercise 7.2b
Since the marginal distributions from a bivariate normal distribution are both normal, we have that the following normal posterior approximations \[
\begin{align}
\tilde\lambda \vert x_1,\ldots,x_n & \sim N(6.017,0.047) \\
\tilde k \vert x_1,\ldots,x_n & \sim N(0.359,0.052) \\
\end{align}
\] and therefore Log-Normal approximations on the original scale \[
\begin{align}
\lambda \vert x_1,\ldots,x_n & \sim \mathrm{LogNormal}(6.017,0.047) \\
k \vert x_1,\ldots,x_n & \sim \mathrm{LogNormal}(0.359,0.052) \\
\end{align}
\]
# Plot the prior and posterior densitieslambdaGrid <-seq(1, 600, length =1000)kGrid <-seq(0.01, 3, length =1000)# Plot the prior and posterior densitiesprior_dens_lambda <-dlnorm(lambdaGrid, log_lambda_mean, log_lambda_sd)post_dens_lambda <-dlnorm(lambdaGrid, postMode[1], postStd[1])prior_dens_k <-dlnorm(kGrid, log_k_mean, log_k_sd)post_dens_k <-dlnorm(kGrid, postMode[2], postStd[2])df_lambda <-data.frame(paramGrid = lambdaGrid, prior = prior_dens_lambda, posterior = post_dens_lambda)df_k <-data.frame(paramGrid = kGrid, prior = prior_dens_k, posterior = post_dens_k)df_long_lambda <- df_lambda %>%pivot_longer(-paramGrid, names_to ="density_type", values_to ="density")p_lambda =ggplot(df_long_lambda) +aes(x = paramGrid, y = density, color = density_type) +geom_line() +scale_colour_manual(breaks =c("prior", "posterior"), values =c(colors[2], colors[3])) +labs(title ="", x =expression(lambda), y ="Density", color ="") +theme_minimal()df_long_k <- df_k %>%pivot_longer(-paramGrid, names_to ="density_type", values_to ="density")p_k =ggplot(df_long_k) +aes(x = paramGrid, y = density, color = density_type) +geom_line() +scale_colour_manual(breaks =c("prior", "posterior"), values =c(colors[2], colors[3])) +labs(title ="" , x =expression(k), y ="Density", color ="") +theme_minimal()gridExtra::grid.arrange(p_lambda, p_k, nrow =2)
Solution Exercise 7.2c
We can simulate from the multivariate normal posterior approximation of \(\tilde\lambda,\tilde k\) from (a), transform the draws to the orginal parameterization \(\lambda,k\) and the finally compute the mean \(\mu = \lambda\Gamma(1 + 1/k)\) for each draw. Like this:
library(mvtnorm)nDraws =5000thetaDraws =rmvnorm(nDraws, postMode, postCov)lambdaDraws =exp(thetaDraws[,1])kDraws =exp(thetaDraws[,2])muDraws = lambdaDraws*gamma(1+ (1/kDraws))hist(muDraws, 50, col = colors[1], freq =FALSE, xlab =expression(mu), ylab ="density", main ="Posterior for the Weibull mean")
Exercise 7.3
This exercise uses the lung cancer data presented in Exercise 2.3
Here we model the survival times of the lung cancer patients in Exercise 2.3 as independent Weibull distributed with a scale parameter \(\lambda\) that is a function of covariates, i.e. using a Weibull regression model. The response variable time is denoted by \(y\) and is modelled as a function of the three covariates age, sex and ph.ecog (ECOG performance score). The model for patient \(i\) is: \[
y_i \vert \mathbf{x}_i, \boldsymbol{\beta}, k \overset{\mathrm{ind}}{\sim} \mathrm{Weibull}\big(\lambda_i = \exp(\mathbf{x}_i^\top \boldsymbol{\beta}),k\big).
\] where \(\boldsymbol{\beta}\) is the vector with regression coefficients. Note that by the properties of the Weibull distribution, the conditional mean in this model is \(\mathbb{E}(y \vert \mathbf{x}_i) = \lambda_i\Gamma(1+1/k)\), so the regression coefficients do not quite have the usual interpretation of the effect on the conditional mean. The three covariates are placed in a \(n\times p\) matrix \(\mathbf{X}\) with the first column being one for all observations to model the intercept. Use a multivariate normal prior for \(\boldsymbol{\beta} \sim N(\mathbf{0},\tau^2\mathbf{I}_p)\) with the non-informative choice \(\tau = 100\). Reparameterize \(\tilde k := \log k\) and use the prior \(\tilde k \sim N(0,2^2)\). Remove the patients with missing values in the selected covariates.
Compute a normal approximation of the joint posterior distribution \(p(\boldsymbol{\beta}, \tilde k \vert \mathbf{y}, \mathbf{X})\) using numerical optimization. Plot the marginal posteriors of each regression coefficient and the marginal posterior for \(k\) on the original scale.
Use the result from (a) and Monte Carlo simulation to compute the predictive densities for the following two new patients:
80-year female with an ECOG performance score of 0 (0=asymptomatic), i.e. \(\mathbf{x} = (1, 80, 1, 0)^\top\)
80-year female with an ECOG performance score of 4 (4=bedbound) \(\mathbf{x} = (1, 80, 1, 4)^\top\).
Plot the two predictive densities and compare. Compute the predictive probability of living for at least another 1000 days for both patients.
Solution Exercise 7.3a
library(tidyverse) # loads data manipulation and visualization packageslibrary(mvtnorm)colors =c("#6C8EBF", "#c0a34d", "#780000","#007878","#B5C6DF","#EADAAA","#AE6666")
Set data and set up prior hyperparameters
library(survival) # loads the lung cancer data as `lung`lung <- lung %>%select(c("time", "status", "age", "sex", "ph.ecog")) %>%drop_na()y = lung$timeX =cbind(1, lung$age, lung$sex ==2, lung$ph.ecog) # sex = 1 is femalep =dim(X)[2]censored = (lung$status ==1)mu <-rep(0,p) # beta ~ N(mu, tau^2*I)tau <-100log_k_mean <-0log_k_sd <-2
Set up a function that computes the log posterior for any \(\boldsymbol{\theta}=(\boldsymbol{\beta}^\top,\tilde k)^\top\) vector. The first argument of this function must be a vector containing all parameters.
# Function for computing the log posterior for any given parameter vector logpost_weibullreg <-function(theta, y, X, censored){ p =dim(X)[2]# Compute the parameters in the original scale beta_ = theta[1:p] k =exp(theta[p+1])# Compute the (log) joint prior density logPrior =dmvnorm(beta_, mu, tau^2*diag(p), log =TRUE) +dnorm(theta[p+1], log_k_mean, log_k_sd, log =TRUE)# Compute the log-likelihood lambda_uncensored =exp(X[-censored,]%*%beta_) loglik_uncensored =sum(dweibull(y[-censored], shape = k, scale = lambda_uncensored, log =TRUE)) lambda_censored =exp(X[censored,]%*%beta_) loglik_censored =sum(pweibull(y[censored], shape = k, scale = lambda_censored, lower.tail =FALSE, log.p =TRUE)) logLik = loglik_uncensored + loglik_censored# Return the log posteriorreturn(logLik + logPrior) }
Use optim to find the posterior mode and the observed information matrix J:
initVal <-c(5, 0, 0, 0, log_k_mean) # Start optimizer at prior meansOptimResults<-optim(initVal, logpost_weibullreg, gr=NULL, y, X, censored,method =c("BFGS"), control=list(fnscale=-1), hessian=TRUE)
Warning in dweibull(y[-censored], shape = k, scale = lambda_uncensored, : NaNs
produced
Warning in dweibull(y[-censored], shape = k, scale = lambda_uncensored, : NaNs
produced
Warning in dweibull(y[-censored], shape = k, scale = lambda_uncensored, : NaNs
produced
Warning in dweibull(y[-censored], shape = k, scale = lambda_uncensored, : NaNs
produced
Warning in dweibull(y[-censored], shape = k, scale = lambda_uncensored, : NaNs
produced
Warning in dweibull(y[-censored], shape = k, scale = lambda_uncensored, : NaNs
produced
Since the marginal distributions from a multivariate normal distribution are all normal, the marginal posterior for \(\tilde k\) is normal and the posterior for \(k\) is therefore Log-Normal \[
\begin{equation}
k \vert \mathbf{y},\mathbf{X} \sim \mathrm{LogNormal}(0.38,0.052) \\
\end{equation}
\] This marginal posterior is plotted below
kGrid <-seq(0.01, 2, length =1000)prior_dens_k <-dlnorm(kGrid, log_k_mean, log_k_sd)post_dens_k <-dlnorm(kGrid, postMode[p+1], postStd[p+1])plot(kGrid, prior_dens_k, col = colors[1], type ="l", ylim =c(0,5.5), lwd =2)lines(kGrid, post_dens_k, col = colors[3], lwd =2)
The marginal posteriors for the four beta coefficients are plotted below.
The predictive distribution for the lifetime of a new person \(\tilde y\) with covariate vector \(\tilde{\mathbf{x}}\) given the training data \((\mathbf{y}, \mathbf{X})\) is given by the integral: \[
p(\tilde y \vert \tilde{\mathbf{x}}, \mathbf{y}, \mathbf{X}) = \int p(\tilde y \vert \tilde{\mathbf{x}}, \boldsymbol{\beta}, k)
p(\boldsymbol{\beta}, k \vert \mathbf{y}, \mathbf{X})\mathrm{d}\boldsymbol{\beta} \mathrm{d} k
\] where \(p(\boldsymbol{\beta}, k \vert \mathbf{y}, \mathbf{X})\) is the posterior distribution and \(p(\tilde y \vert \tilde{\mathbf{x}}, \boldsymbol{\beta}, k)\) is the Weibull density. A Monte Carlo evaluation of this integral is obtained by:
simulating parameter draws \(\boldsymbol{\beta}^{(i)}, \tilde{k}^{(i)}\) for \(i=1,\ldots,m\) from the multivariate normal approximation in (a)
compute \(k^{(i)} = \exp\big(\tilde{k}^{(i)}\big)\) for each draw
simulate a predictive draw \(\tilde y^{(i)}\) from the Weibull model given those parameters \[
\tilde y^{(i)} \vert \tilde{\mathbf{x}}, \boldsymbol{\beta}^{(i)}, k^{(i)} \overset{\mathrm{ind}}{\sim} \mathrm{Weibull}\big(\lambda_i = \exp(\tilde{\mathbf{x}}^\top \boldsymbol{\beta}^{(i)}), k^{(i)} \big).
\] This code below does exactly this for the two patients. Note that this can be done much efficiently by making all draws in one shot, but this code is connects better to the algorithm.
nDraws =5000x1_tilde =c(1,80,1,0) # first patientx2_tilde =c(1,80,1,5) # second patienty1_tilde =rep(0,nDraws)y2_tilde =rep(0,nDraws)for (i in1:nDraws){# Simulate from multivariate normal posterior approximation theta <-rmvnorm(1, postMode, postCov) beta_ = theta[1:p] k =exp(theta[p+1])# Simulate predictive draws from the model y1_tilde[i] =rweibull(1, shape = k, scale =exp(x1_tilde %*% beta_)) y2_tilde[i] =rweibull(1, shape = k, scale =exp(x2_tilde %*% beta_))}# Plot the predictive densities as kernel density estimateskde1 =density(y1_tilde)kde2 =density(y2_tilde)plot(kde1$x, kde1$y, col = colors[1], type ="l", , lwd =2, ylim =c(0, 0.005),xlab ="days", ylab ="predictive density",main ="predictive distributions for two patients")lines(kde2$x, kde2$y, col = colors[3], lwd =2)legend(x ="topright", inset=.05, legend =c("asymptomatic", "bedbound"), lty =c(1,1), col =c(colors[1], colors[3]), box.lty=1)
The posterior predictive probability of living for at least another 1000 days is
message("The probability that a 80 year female asymptomatic patient lives for at least 1000 days is ", mean(y1_tilde >=1000))
The probability that a 80 year female asymptomatic patient lives for at least 1000 days is 0.106
message("The probability that a 80 year female bedbound patient lives for at least 1000 days is ", mean(y2_tilde >=1000))
The probability that a 80 year female bedbound patient lives for at least 1000 days is 6e-04