Fully Bayesian imputation model for non-random missing data in qPCR


We propose a new statistical approach to obtain differential gene expression of non-detects in quantitative real-time PCR (qPCR) experiments through Bayesian hierarchical modeling. We propose to treat non-detects as non-random missing data, model the missing data mechanism, and use this model to impute Ct values or obtain direct estimates of relevant model parameters. A typical laboratory does not have the resources to perform experiments with a large number of replicates; therefore, we propose an approach that does not rely on large sample theory. We aim to demonstrate the possibilities that exist for analyzing qPCR data in the presence of non-random missingness through the use of Bayesian estimation. Bayesian analysis typically allows for smaller data sets to be analyzed without losing power while retaining precision. The heart of Bayesian estimation is that everything that is known about a parameter before observing the data (the prior) is combined with the information from the data itself (the likelihood), resulting in updated knowledge about the parameter (the posterior). In this work we introduce and describe our hierarchical model and chosen prior distributions, assess the model sensitivity to the choice of prior, perform convergence diagnostics for the Markov Chain Monte Carlo, and present the results of a real data application.

Valeriia Sherina
Principal Statistician at GSK
Matthew N. McCall
Associate Professor of Biostatistics and Biomedical Genetics