List of Speakers and Abstracts

Rod Ball
NZ Forest Research Institute
Rotorua

Bayesian inference in gene discovery: experimental design and measures of evidence


Abstract:
We describe two methods for Bayesian inference in gene discovery: QTL mapping in large experimental crosses, and experimental design for detecting linkage disequilibrium (LD, i.e. associations between loci) through population-level marker-trait associations, with particular interest in detecting small-effect genes for economic traits or complex diseases.
QTL mapping attempts to find the genomic loci associated with variation in quantitative traits from marker-trait associations in experimental crosses. In the early days of QTL mapping, biological researchers experienced spurious associations (traits 'detected' were not verified, if they were re-tested at all) and selection bias (if verified, their effects were smaller than originally estimated). We view QTL mapping as a model selection problem where each model is a linear regression on marker genotypes. We use the BIC (Broman 1997; Broman and Speed 2002), further modified to take prior probabilities into account, to obtain approximate posterior probabilities for models. Selection bias can be large when selecting a single model. By considering all models according to their posterior probabilities (cf. Raftery et al. 1997), we show that it is possible to obtain unbiased estimates of effects and to make inferences about the genetic architecture (Ball 2001, discussed in Sillanpää and Corander 2002; see also Yandell et al. 2002; Bogdan et al. 2004).
QTL mapping uses LD generated in an experimental cross or pedigree. Association mapping, in contrast, uses LD existing in a population. Such LD is usually shorter range and potentially offers higher resolution, but requires even larger sample sizes. Many reported associations may be spurious (Altshuler et al. 2000) due to inadequate statistical evidence, usually in the form of p-values. We modify the deterministic experimental design procedure of Luo (1998) to give the power of designs to detect associations with a given Bayes factor (Ball 2005), and hence a well-defined strength of evidence that is independent of sample size. Relationships between Bayesian inference and frequentist inference in the form of p-values, Type I error rates, power, and the false discovery rate are discussed.
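
As a rough illustration of the model-averaging step (a sketch under assumed inputs, not the authors' code), approximate posterior model probabilities can be formed by weighting exp(-BIC/2) by the prior model probabilities and normalising, with effect estimates then averaged over models:

    import numpy as np

    def model_average(bic, prior, effects):
        # bic     : BIC value for each candidate regression model
        # prior   : prior probability of each model
        # effects : estimate of the effect of interest under each model
        #           (zero where the model excludes that marker)
        logw = -0.5 * np.asarray(bic) + np.log(np.asarray(prior))
        logw -= logw.max()                 # guard against underflow
        post = np.exp(logw)
        post /= post.sum()                 # approximate P(model | data)
        return post, post @ np.asarray(effects)

    # hypothetical toy numbers for three candidate models
    post, avg_effect = model_average([110.2, 108.9, 112.5],
                                     [0.5, 0.25, 0.25],
                                     [0.0, 0.8, 0.6])

Averaging the effect over models in this way, rather than reporting it under the single best model, is what removes the selection bias described above.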


Mik Black
Dept. of Statistics
University of Auckland

Estimating disease prevalence in the absence of a gold standard


Abstract:
When estimating disease prevalence, it is not uncommon to have data from conditionally dependent diagnostic tests. In such a situation, the estimation of prevalence is difficult if none of the tests is considered to be a gold standard. This talk will discuss a Bayesian approach to estimating disease prevalence based on the results of two diagnostic tests, allowing for the possibility that the tests are conditionally dependent, but not conditioning on any particular dependence structure.
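
For concreteness, one common parametrisation of such a model (a sketch of the standard two-test latent class setup with dependence terms, not necessarily the one used in the talk) writes the probability of each pair of test results in terms of the prevalence $\pi$, sensitivities $S_1, S_2$, specificities $C_1, C_2$, and covariances $\gamma_D$ (diseased) and $\gamma_N$ (non-diseased):

\begin{align*}
p_{11} &= \pi\,(S_1 S_2 + \gamma_D) + (1-\pi)\bigl((1-C_1)(1-C_2) + \gamma_N\bigr),\\
p_{10} &= \pi\,\bigl(S_1(1-S_2) - \gamma_D\bigr) + (1-\pi)\bigl((1-C_1)\,C_2 - \gamma_N\bigr),
\end{align*}

and similarly for $p_{01}$ and $p_{00}$; the observed counts are multinomial with these cell probabilities, and setting $\gamma_D = \gamma_N = 0$ recovers the conditionally independent model.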




Bill Bolstad
Department of Statistics
University of Waikato

A Monte Carlo Analysis of a Mixture-based Shrinkage Estimator


Abstract:
In this paper we examine the problem of estimating the means of several populations when one or more of the means has been shifted a long way from the others. The shrinkage estimator from the hierarchical normal model would shrink the shifted mean too far from its sample mean, and this would be inefficient. We develop a hierarchical mean model based on a mixture distribution. We perform a Monte Carlo study comparing the efficiency of the shrinkage estimator based on this model with the shrinkage estimator from the hierarchical normal mean model and a limited translation estimator. We see that the mixture shrinkage estimator has very good properties. It shrinks all the mean estimates towards the overall mean when none of the means is shifted. When one of the means is shifted a large amount relative to the other means, that estimate is not shrunk towards the overall mean very much, while the unshifted means continue to be shrunk towards their overall mean value. The estimator behaves smoothly for shifts between these two extreme cases.
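
The over-shrinkage that motivates the mixture model is easy to reproduce numerically. The following sketch uses hypothetical numbers and implements only the ordinary equal-shrinkage estimator from the hierarchical normal model (not the paper's mixture estimator), showing a shifted mean being dragged far from its sample mean:

    import numpy as np

    rng = np.random.default_rng(1)
    k, sigma, tau = 8, 1.0, 1.0
    mu = np.zeros(k)
    mu[0] = 6.0                        # one mean shifted far from the rest
    y = rng.normal(mu, sigma)          # one observation per population

    B = sigma**2 / (sigma**2 + tau**2)     # common shrinkage factor (0.5)
    shrunk = B * y.mean() + (1 - B) * y    # plug-in overall mean, as a sketch

    print(y[0], shrunk[0])   # the shifted mean loses about half its distance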


Colin Fox
Dept. of Mathematics
University of Auckland

Solving Inverse Problems using MCMC with an Approximation


Abstract:
Inverse problems are inferential problems in which the forward map (from unknowns to observations) is a complex physical relationship and where inversion of the forward map presents special difficulties. Examples of inverse problems include the various modalities of imaging from wave scattering used in non-invasive medical diagnostics, geophysical prospecting, and industrial process monitoring. Since the posterior distribution can be evaluated, in principle it can be sampled via MCMC, allowing summary statistics to be evaluated and effectively solving the inverse problem. However, the need to calculate the posterior, and hence the forward map, at each step of a standard Metropolis-Hastings (MH) algorithm, with typically many thousands or millions of steps required to give sufficiently small variance in estimates, appears computationally prohibitive for realistic inverse problems. Hence considerable improvement in the efficiency of MCMC algorithms for inverse problems is required if the method is to be widely applied. In this talk I give a modified MH algorithm that uses a cheap, perhaps local, approximation to the forward map to generate a viable MCMC with the correct ergodic properties, and that can be several orders of magnitude faster than standard MH dynamics. An example inverse problem demonstrating the algorithm will be shown. This is joint work with Andres Christen.
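
Schematically, the two-stage construction can be sketched as follows (a minimal sketch assuming a symmetric random-walk proposal; logpost is the expensive exact log-posterior and logpost_approx the cheap approximation, both hypothetical stand-ins):

    import numpy as np

    def da_mh(logpost, logpost_approx, x0, n_steps, step=0.5, seed=0):
        # Two-stage Metropolis-Hastings: stage 1 screens proposals with
        # the cheap approximation; the expensive posterior is evaluated
        # only for survivors, yet remains the invariant distribution.
        rng = np.random.default_rng(seed)
        x = np.atleast_1d(np.asarray(x0, dtype=float))
        lp, lpa = logpost(x), logpost_approx(x)
        chain = []
        for _ in range(n_steps):
            y = x + step * rng.standard_normal(x.shape)
            lya = logpost_approx(y)
            if np.log(rng.random()) < lya - lpa:        # stage 1 (cheap)
                ly = logpost(y)                         # expensive call
                if np.log(rng.random()) < (ly - lp) + (lpa - lya):  # stage 2
                    x, lp, lpa = y, ly, lya
            chain.append(x.copy())
        return np.array(chain)

The stage-2 ratio corrects for the approximation, so most rejections cost only a cheap evaluation while the exact posterior is still the stationary distribution.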


Georgy Gimel'farb
Dept. of Computer Science
University of Auckland

MCMC-based texture synthesis by analytic and stochastic approximation of Gibbs potentials


Abstract:
Markov-Gibbs random field models of image textures (or other scalar data arrays supported on an arithmetic lattice) that account only for point-wise and pairwise signal interdependencies allow for an analytic first approximation of both the Gibbs potential functions and the geometric structure of the dependencies. These estimates involve first- and second-order sufficient signal statistics for a given training image. The potentials are then refined by MCMC-based stochastic approximation, which acts as a stochastic generator of images whose sufficient statistics closely approach those of the training image.
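
The refinement step is a Robbins-Monro style stochastic approximation; schematically (a sketch with hypothetical names, where sample_stats runs one MCMC synthesis pass under the current potentials and returns the sufficient statistics of the generated image):

    import numpy as np

    def refine_potentials(theta0, train_stats, sample_stats,
                          n_iter=100, a=1.0):
        # Nudge the potential parameters until the statistics of images
        # sampled from the current model match the training statistics.
        theta = np.asarray(theta0, dtype=float).copy()
        for t in range(1, n_iter + 1):
            stats = sample_stats(theta)               # one MCMC pass
            theta += (a / t) * (train_stats - stats)  # decreasing gain
        return theta

At convergence the generated images reproduce the first- and second-order statistics of the training image, which is the stated goal of the refinement.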


Ville Kolehmainen
Dept. of Applied Physics
University of Kuopio, Finland

Parallelized Bayesian Inversion for three-dimensional dental X-ray imaging


Abstract:
In this talk we will present a summary of our recent work on Bayesian modelling of dental radiology. We implement MAP estimation on parallel computers and present results from real dental data.


Renate Meyer
Dept. of Statistics
University of Auckland

Bayesian Semiparametric Modelling of Stratified Survival Data using Mixtures


Abstract:
A stratified proportional hazards model is commonly used to analyse survival data collected over many strata, for example in multicentre clinical trials. Frailty models can be regarded as a compromise between a stratified and an unstratified analysis. Instead of including frailties, i.e. iid random variables for each stratum, in this paper we consider treating the whole stratum-specific baseline hazard function as random. We use a Bayesian nonparametric approach to estimate the baseline hazards using mixtures of triangular distributions. The number of mixture components is an unknown parameter, estimated simultaneously with the other parameters using a reversible jump Markov chain Monte Carlo algorithm. We illustrate the technique using clinical trial data and compare results to parametric alternatives. (Joint work with Bo Cai.)
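
Schematically (a sketch of the construction, with the notation assumed here rather than taken from the paper), the baseline hazard in stratum $s$ is modelled as

\[ h_{0s}(t) = \sum_{j=1}^{J} w_{sj}\, f(t \mid a_j, c_j, b_j), \]

where $f(\cdot \mid a, c, b)$ is the triangular density with support $[a, b]$ and mode $c$, the $w_{sj}$ are non-negative weights, and the number of components $J$ is itself a parameter, so the sampler must jump between spaces of different dimension; hence reversible jump MCMC.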


Russell Millar
Dept. of Statistics
University of Auckland

A simple case-deletion diagnostic for Bayesian models


Abstract:
For models with observations that are conditionally independent given the unknown parameters, we propose using the posterior variance of the log-likelihood of an observation as a measure of the local sensitivity of posterior inference to that observation. We motivate this using two arguments:
1) by analogy with Cook's D, via calculation of the derivative of the Bayes estimators (of all model parameters) with respect to case weight;
2) by considering the derivative of the Kullback-Leibler divergence between the posterior and the case-deleted posterior.
This is joint work with Wayne Stewart.
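
Given standard MCMC output, the diagnostic reduces to one line per observation. A minimal sketch (assuming a draws-by-observations matrix of pointwise log-likelihoods, the same quantity used in WAIC/LOO-style computations; names are hypothetical):

    import numpy as np

    def case_influence(loglik_draws):
        # loglik_draws: (n_draws, n_obs) array with entry (s, i) equal
        # to log p(y_i | theta_s) for posterior draw theta_s.
        # Returns the posterior variance of each observation's
        # log-likelihood; large values flag influential observations.
        return np.asarray(loglik_draws).var(axis=0, ddof=1)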


Geoff Nicholls
Dept. of Mathematics
University of Auckland

Deposition model-comparison from radiocarbon data


Abstract:
Bayesian methods are now widely used for analysing radiocarbon dates. We find that certain non-informative priors in use in the literature generate a bias towards wider date ranges which does not in general reflect substantive prior knowledge. We recommend instead a prior under which the difference between the earliest and latest dates is uniformly distributed. We show how such priors are derived from a simple physical model of the deposition and observation process. We illustrate this in a case study, examining the effect that various priors have on the reconstructed dates. Bayes factors are used to help decide model choice problems.
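
The source of the bias can be seen from a standard order-statistics fact (a sketch, assuming $n$ deposition dates a priori iid uniform on an interval of length $L$): the induced density of the span $R = \max - \min$ is

\[ f_R(r) = \frac{n(n-1)\, r^{\,n-2}(L - r)}{L^{n}}, \qquad 0 \le r \le L, \]

which concentrates prior mass on wide spans once $n$ is moderately large; a prior that is instead uniform on the span removes this artefact.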


Allen Rodrigo
Bioinformatics Institute
University of Auckland

Bayesian Inference in Evolutionary Genetics


Abstract:
Evolutionary biologists use genetic information to make inferences about historical patterns and processes. The analyses are mathematically complicated and, except in the most trivial cases, there are no closed-form estimators of evolutionary parameters. I will describe Bayesian MCMC inference of these parameters, with examples, and discuss both the good and bad aspects of these analyses.


Angelika van der Linde
Dept. of Mathematics
University of Bremen, Germany

Coefficients of Determination and Predictive Model Choice


Abstract:
In this talk, information-based criteria for predictive model comparison are introduced and discussed. From a Bayesian point of view, the focus is on posterior predictive criteria. In particular, it is shown that universal coefficients of determination lead to the criterion of posterior predictive entropy, which is contrasted with the Deviance Information Criterion (DIC). Thus, on an abstract level, an old discussion (for regression models) of how to trade off 'model fit' against 'model complexity' in model assessment is resumed. The ideas are illustrated for regression and classification problems.
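
For reference, the DIC against which the entropy criterion is contrasted is the standard one (Spiegelhalter et al. 2002):

\[ \mathrm{DIC} = \overline{D(\theta)} + p_D, \qquad p_D = \overline{D(\theta)} - D(\bar{\theta}), \]

where $D(\theta) = -2 \log p(y \mid \theta)$ is the deviance, $\overline{D(\theta)}$ its posterior mean, and $\bar{\theta}$ the posterior mean of the parameters; $p_D$ plays the role of the 'model complexity' term in the trade-off above.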


Tim Watson
NIWA Research Ltd
Auckland

A Hierarchical Bayesian Model and Simulation Software for the Maintenance of Water Pipe Networks


Abstract:
A hierarchical Bayesian model was developed to describe the occurrence of pipe failures within a water pipe network. The Bayesian methodology has several advantages over previous methods: it does not rely heavily on the availability of failure data, it encapsulates engineering knowledge in the form of priors, and it provides formal measures of uncertainty. Further, the hierarchical model enables failure rates of individual pipes to be estimated by exploiting similarities between pipes across the network. The resultant MCMC estimates are then used to calculate optimal pipe replacement ages.
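
A minimal sketch of the kind of hierarchy involved (an illustrative assumption, not the authors' exact specification) models the failure count of pipe $i$ as

\[ y_i \sim \mathrm{Poisson}(\lambda_i t_i), \qquad \log \lambda_i = x_i^{\top}\beta + b_{g(i)}, \qquad b_g \sim N(0, \sigma^2), \]

where $t_i$ is the exposure time, $x_i$ collects pipe covariates such as age, material, and diameter, and the group effects $b_g$ pool information across similar pipes, which is how sparse per-pipe failure histories are handled.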


David Welch
Dept. of Mathematics
University of Auckland

Implementing an MCMC algorithm for a space of random graphs


Abstract:
A stochastic model of a virus moving through a host population results in a graph containing both host and viral genealogies. In this talk, I'll discuss current work on the implementation of an MCMC algorithm to reconstruct these graphs and to estimate some population parameters. I'll look in particular at symmetries in the graph structure that need to be accounted for in the Hastings ratios, and I'll present a sampler for the prior distribution over the graph space.