Department of Statistics
2015 Seminars
Speaker: Javier Cano
Affiliation: Department of Computer Science and Statistics, Universidad Rey Juan Carlos
When: Thursday, 17 December 2015, 3:00 pm to 4:00 pm
Where: Room 303-310
Runway excursions at landing constitute one of the major threats to aviation safety. Although their occurrence rate is very low, the consequences may be severe in terms of lives and financial costs. We analyze the main contributing factors to this event with the aid of a Bayesian network. The issues uncovered suggest several operational recommendations that could be given to crews to reduce the probability of a runway excursion when landing.
Stein's Method and Hospital Inpatient Flow Management
Speaker: Jim Dai
Affiliation: School of Operations Research and Information Engineering, Cornell University
When: Monday, 30 November 2015, 2:00 pm to 3:00 pm
Where: Room 303-310
Diffusion models have been used for steady-state analysis of many stochastic systems. A recent paper (Braverman and Dai 2015) demonstrates that Stein's method is a natural tool for establishing error bounds for steady-state diffusion approximations. It turns out that the method also serves as a practical engineering tool for building robust diffusion models that are accurate in a number of parameter regimes. I will illustrate these advances using a class of new stochastic models capturing patient flows from a hospital emergency department to inpatient wards. This talk is based on joint work with Pengyi Shi at Purdue University.
Pure, predictable, pipeable: creating fluent interfaces with R
Speaker: Hadley Wickham
Affiliation:
When: Wednesday, 18 November 2015, 4:00 pm to 5:00 pm
Where: MLT3 (Room 303-101)
A fluent interface lets you easily express yourself in code. Over time a fluent interface retreats to your subconscious. You don't need to bring it to mind; the code just flows out of your fingers. I strive for this fluency in all the packages I write, and while I don't always succeed, I think I've learned some valuable lessons along the way.
In this talk, I'll discuss three guidelines that make it easier to develop fluent interfaces:
- Pure functions. A pure function only interacts with the world through its inputs and outputs; it has no side-effects. Pure functions make great building blocks because they're easy to reason about and can be easily composed.
- Predictable interfaces. It's easier to learn a function if it's consistent, because you can learn the behaviour of a whole group of functions at once. I'll highlight the benefits of predictability with some of my favourite R "WAT"s (including 'c()', 'sapply()' and 'sample()').
- Pipes. Pure, predictable functions are nice in isolation but are most powerful in combination. The pipe, '%>%', is particularly important when combining many functions because it turns function composition on its head so you can read it from left to right. I'll show you how this has helped me build dplyr, rvest, ggvis, lowliner, stringr and more. (A short illustration follows this list.)
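By way of illustration (an editor's sketch in R, not code from the talk): the same computation written as nested calls and with the magrittr pipe, plus one of the 'sample()' surprises alluded to above.

    library(magrittr)   # home of '%>%' (also re-exported by dplyr)

    x <- c(2.1, 3.5, 7.8, 4.2)

    # Nested composition reads inside-out...
    round(exp(diff(log(x))), 1)

    # ...whereas the pipe reads left-to-right, in the order applied:
    x %>% log() %>% diff() %>% exp() %>% round(1)

    # A 'sample()' WAT: the behaviour changes with the length of its input
    sample(c(5, 10))   # permutes the two values
    sample(c(5))       # surprise: a permutation of 1:5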
This talk will help you make best use of my recent packages, and teach you how to apply the same principles to make your own code easier to use.
Please join us for refreshments afterwards in the break out space on Level 3.
Coherent frameworks for statistical inference serving integrated decision support systems
Speaker: Martine Barons
Affiliation: Department of Statistics, University of Warwick
When: Wednesday, 11 November 2015, 1:00 pm to 2:00 pm
Where: Room 303-412
Please note the unusual time and place
The Survey Octopus: an approach to teaching Total Survey Error
Speaker: Caroline Jarrett
Affiliation: Effortmark Ltd, UK
When: Friday, 6 November 2015, 12:00 pm to 1:00 pm
Where: Room 303-310
Although the concepts of Total Survey Error (TSE) have been widely accepted amongst survey methodologists and statisticians for many years, TSE is still not familiar to many people who commission ad-hoc surveys for business or government. A recent survey (Jablonski, 2015) suggests that even some academics teaching survey methodology at university level do not use TSE in their classes; some were not even aware of the concept.
After many conversations with colleagues and clients in which I tried to explain that a high number of responses was not in itself a guarantee of data quality, I turned to the classic Survey Lifecycle (Groves et al., 2009), with its presentation of Total Survey Error arising from errors across steps in the survey process. From this, I evolved the Survey Octopus, a way of helping non-specialists to get to grips with the issues involved in TSE and to help them make better, and more informed, choices when deciding how to approach a survey.
In this seminar, I will describe the Survey Octopus, show how it relates to Groves et al.'s model of the survey lifecycle, and explain the benefits and problems of each depiction of TSE.
Biography: Caroline Jarrett is a forms specialist who got interested in surveys through her research into best practice in question design. She's based in the UK, and currently works mostly with the UK Government Digital Services, on the forms advice that they provide across UK government. She is the co-author of the textbook "User Interface Design and Evaluation" (2005, Elsevier/The Open University) and of "Forms that work: Designing web forms for usability" (2009, Elsevier). Her book on surveys will be published by Rosenfeld Media in 2016.
www.cs.auckland.ac.nz/en/about/newsandevents/events/events-2015/2015/11/seminar-cs.html
Semi-automatic categorization of open-ended questions
Speaker: Matthias Schonlau
Affiliation: Dept. of Statistics & Actuarial Science and Survey Research Centre, U. Waterloo
When: Wednesday, 28 October 2015, 11:00 am to 12:00 pm
Where: Room 303-310
Text data from open-ended questions in surveys are difficult to analyze and are frequently ignored. Yet open-ended questions are important because they do not constrain respondents' answer choices. Where open-ended questions are necessary, multiple human coders sometimes hand-code each answer into one of several categories. At the same time, computer scientists have made impressive advances in text mining that may allow such coding to be automated. However, automated algorithms do not achieve an overall accuracy high enough to replace humans entirely. We therefore categorize open-ended answers using text mining for the easy-to-categorize answers and human coders for the remainder, with expected accuracies guiding the choice of the threshold that separates "easy" from "hard". The approach is illustrated with examples from open-ended questions about respondents' advice to a patient in a hypothetical dilemma.
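A sketch of the routing idea in R (an editor's illustration, not Schonlau's code; the probability matrix stands in for the output of a hypothetical text-mining classifier): answers are auto-coded when the classifier's confidence clears a threshold chosen from expected accuracies, and sent to human coders otherwise.

    # Route each answer by the classifier's best-category probability
    route_answers <- function(prob_matrix, threshold = 0.9) {
      confidence <- apply(prob_matrix, 1, max)   # highest class probability per answer
      ifelse(confidence >= threshold,
             "auto",     # "easy": accept the machine's category
             "human")    # "hard": send to a human coder
    }

    probs <- matrix(c(0.95, 0.03, 0.02,    # an "easy" answer
                      0.40, 0.35, 0.25),   # a "hard" one
                    nrow = 2, byrow = TRUE)
    route_answers(probs)   # "auto" "human"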
The statistical challenges posed by studying longitudinal clinical data in the presence of measurement error
Speaker: Millie Parsons
Affiliation: MRC Lifecourse Epidemiology Unit, Southampton General Hospital and U. Southampton
When: Thursday, 22 October 2015, 3:00 pm to 4:00 pm
Where: Room 303-310
Longitudinal data are increasingly used in clinical settings to help track disease progression and monitor the natural ageing process, and similar data are used in epidemiological research to explore risk factors for change over time. My PhD focuses on joint space measurements from within the knee; because these measurements are very small they are prone to error, and such errors may prevent clinicians from seeing the real underlying rate of deterioration in a patient. The aim of my PhD research is to describe individual trajectories of change using longitudinal knee joint space width data, and to explore the magnitude of measurement error that may affect the interpretation of patterns of change over time, using methods such as the reliable change index, random-coefficient modelling and Bayesian modelling.
Millie Parsons works as a statistician at the MRC Lifecourse Epidemiology Unit, University of Southampton, and alongside her role as a statistician she is studying for a PhD entitled "Definition and predictors of trajectories of joint space narrowing in knee osteoarthritis in the presence of measurement error".
Which workers are more vulnerable to work intensification? An analysis of two national surveys
Speaker: Peter Boxall
Affiliation: Management & International Business
When: Monday, 19 October 2015, 4:00 pm to 5:00 pm
Where: Fale Pasifika Complex, Bldg 273, Level 1, Rm 104
This presentation will report work conducted with Dr Mark Le Fevre and Associate Professor Keith Macky of AUT University. It will discuss findings from two national-level surveys in New Zealand that help us to identify which groups of workers experience higher levels of work intensity and to analyse the links to their well-being. The primary goal is to identify differences among occupational groups but the surveys also enable us to compare experiences of work intensity across a range of variables. Overall, the analysis addresses the question: which workers are more vulnerable to work intensification?
Is there really a link between low parental income and childhood obesity?
Speaker: Nichola Shackleton
Affiliation: COMPASS
When: Monday, 12 October 2015, 4:00 pm to 5:00 pm
Where: Fale Pasifika Complex, Bldg 273, Level 1, Rm 104
The established association between familial socioeconomic status and child obesity has created the expectation that low familial income is a cause of child obesity. Yet there is very little evidence in the UK to suggest that this is the case. This paper uses data from the Millennium Cohort Study (age 7) to assess whether or not low familial income and family poverty are associated with an increased risk of child obesity.
Applied Time Series Models using Vector Generalized Linear Models and Reduced-Rank Extensions
Speaker: Victor Miranda
Affiliation: Department of Statistics
When: Wednesday, 7 October 2015, 11:00 am to 12:00 pm
Where: Room 303-310
The vector generalized linear and additive models (VGLM/VGAM) statistical framework is shown to confer advantages on some well-known time series (TS) models. In this talk I will present some preliminary results of my work, which has concentrated on fitting TS data structures. In particular, I look at the autoregressive model of order p [AR(p)] and the moving-average process of order q [MA(q)]. This work uses the flexibility of the VGAM package via object-oriented methods. Although numerous R packages to fit such processes are currently available, the majority are limited to univariate and intercept-only models. Using the VGLM/VGAM framework, TS models can be fitted through highly flexible arguments, including parameter constraints and contrasts. Time permitting, I will briefly discuss future work involving non-linear TS, cointegrated TS, and reduced-rank TS.
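For orientation, the two processes named above can be simulated and fitted in a few lines of base R; this sketch uses stats::arima rather than the speaker's VGLM/VGAM machinery, which generalises such fits to covariates and multivariate responses.

    set.seed(1)

    # AR(2): simulate, then recover coefficients near 0.6 and -0.3
    y <- arima.sim(model = list(ar = c(0.6, -0.3)), n = 500)
    arima(y, order = c(2, 0, 0))

    # MA(1): simulate, then recover a coefficient near 0.5
    z <- arima.sim(model = list(ma = 0.5), n = 500)
    arima(z, order = c(0, 0, 1))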
Research Data - Preserve, Share, Reuse, Publish, or Perish
Speaker: Mark Gahegan
Affiliation: Centre for eResearch
When: Monday, 5 October 2015, 4:00 pm to 5:00 pm
Where: Fale Pasifika Complex, Bldg 273, Level 1, Rm 104
Researchers need a variety of data services to support their work, from archives, to backups, through to data sharing and eventually data publishing. It is very clear that our funding agencies will soon follow those in the USA, EU, Australia and elsewhere in requiring researchers to make publicly funded research data available to others in most cases, while of course ensuring the confidentiality of individuals where necessary. The talk will begin by describing some of these services, what we understand researchers and funders to need, and how we go about ensuring such services are provided by our institutions. But this is just the beginning. In the burgeoning era of open (and sometimes data-led) research, new possibilities and challenges for how we describe, find, share and reuse data are waiting around every corner. Some of these may radically change how we conduct research; some could dramatically improve the effectiveness of the research sector at large. What we think of as data, and even as research, will change as a result.
Mark Gahegan is Professor in the Department of Computer Science at the University of Auckland. He directs the university's Centre for e-Research, which hosts a team of more than twenty high-performance computing specialists and five researchers. He was lead author of the successful National eScience Infrastructure (NeSI) proposal to coordinate support for eResearch and high-performance computing across New Zealand. He led the development of the Science Case for the science ministry and the Business Case for government, funded by the Ministry of Science and Innovation in 2011 at $27M over three years, with an additional co-investment of $21M from the five partner institutions.
Navigating the Starpath: Student Achievement and Equity in Secondary Schools in Tamaki and Tai Tokerau
Speaker: Cindy Kiro
Affiliation: Starpath Project and U. Auckland Fac. Education
When: Monday, 28 September 2015, 4:00 pm to 5:00 pm
Where: Fale Pasifika Complex, Bldg 273, Level 1, Rm 104
Starpath has developed evidence-based strategies to transform patterns of educational underachievement for senior secondary students in Years 11, 12 and 13 in low socioeconomic high schools. Starpath has partnered with 39 low decile schools in Auckland and Northland. Achievement in NCEA Levels 1, 2, 3 and University Entrance showed remarkable improvements. There were also significant gains in school practices including expectations of achievement; using data to support achievement; informed student goal setting; tracking of student progress; literacy across the curriculum and whanau/family-school partnerships. Starpath has promoted a sense of responsibility amongst senior students for their own learning and achievement. Starpath has made a positive difference to the relationship between teachers, school leaders and parents. In this presentation we will review the evidence of this approach and share our findings from the matched schools analysis and development of a multi-level analysis to understand what impact, if any, Starpath has had on our partner schools.
Professor Cynthia (Cindy) Kiro is Director of the Starpath Project and also 'Te Tumu' - responsible for Maori/indigenous education in the Faculty of Education at the University of Auckland. Both roles promote educational excellence to increase student engagement and success in tertiary education for indigenous and minority populations. She has worked extensively in roles that improve life outcomes for children and young people who experience social marginalisation or exclusion: focusing upon equity and diversity as a constructive contributor to society. She was New Zealand's 4th Children's Commissioner - establishing the Taskforce for Action on Family Violence.
A knowledge laboratory of the early life course
Speaker: Roy Lay-Yee
Affiliation: COMPASS
When: Monday, 21 September 2015, 4:00 pm to 5:00 pm
Where: Fale Pasifika Complex, Bldg 273, Level 1, Rm 104
The 'Knowledge Lab' micro-simulation project aims to integrate 'best evidence' from systematic reviews and meta-analyses into a working model of the early life course (from birth to age 21). We will describe progress on the Knowledge Lab project, and how it will be used to: (i) test the validity of the underlying behavioural equations and specific knowledge sources (meta-analyses, systematic reviews); and (ii) test policy scenarios by carrying out experiments on the 'virtual cohort' created by the working model.
Roy Lay-Yee is a Senior Research Fellow at the COMPASS Research Centre. He has a background in Sociology and one of his current interests is in using computer simulation to address questions of relevance to public policy.
Media blame and political violence in Northern Ireland 1994-1998
Speaker: Maria Armoudian and Barry Milne
Affiliation: Politics & International Relations; COMPASS
When: Monday, 14 September 2015, 4:00 pm to 5:00 pm
Where: Fale Pasifika Complex, Bldg 273, Level 1, Rm 104
The relationship between media messages and political violence has been postulated but not explored quantitatively. We assess the association between media "blame" of political actors (in articles from a random selection of issues of three Northern Ireland daily newspapers and two partisan periodicals) and subsequent violence during the period January 1994 to May 1998. We find evidence that media blame escalated violence, controlling for prior violence, and that this varied by publication, the political actor blamed, and the partisan group perpetrating the violence.
Dr. Maria Armoudian is a lecturer at the University of Auckland, the author of Kill the Messenger: The Media's Role in the Fate of the World, and the host and producer of the syndicated radio program, The Scholars' Circle. She has served as an environmental commissioner for the City of Los Angeles and worked for the California State Legislature, prior to which she worked as a journalist. Armoudian is currently working on a book about war correspondents.
Dr. Barry Milne is a Senior Research Fellow and Associate Director of the COMPASS Research Centre. He has a master's degree in Psychology from the University of Otago, and a PhD in psychiatric epidemiology from King's College London. His main interests are in longitudinal and life-course research, and in the use of large administrative datasets to answer policy and research questions.
Stationary distribution of the linkage disequilibrium coefficient r^2
Speaker: Joey Zhang
Affiliation: Department of Statistics
When: Thursday, 10 September 2015, 11:00 am to 12:00 pm
Where: Room 303-310
The linkage disequilibrium coefficient r^2 is a measure of the statistical dependence of the alleles possessed by an individual at two genetic loci. It is used to find the positions of disease-causing genes on chromosomes, so understanding the statistical properties of r^2 is an important problem. The maximum entropy principle is a useful tool for approximating the density function of an unknown distribution, given a sequence of the distribution's moments. Here I use this method to approximate the density function of r^2 from stationary moments computed under models for genetic drift. To obtain the sequence of moments, I generalize an analytic method that was originally used to compute the expectation of r^2.
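For reference, the standard definitions (background, not notation from the talk): with haplotype frequency p_{AB} and allele frequencies p_A and p_B at the two loci,

    \[
    D = p_{AB} - p_A\,p_B, \qquad
    r^2 = \frac{D^2}{p_A(1-p_A)\,p_B(1-p_B)}.
    \]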
Queues and cooperative games
Speaker: Moshe Haviv
Affiliation: Dept. Statistics and Federmann Center for the Study of Rationality, Hebrew University of Jerusalem
When: Wednesday, 2 September 2015, 11:00 am to 12:00 pm
Where: Room 303-310
The area of cooperative game theory deals with models in which a number of individuals, called players, can form coalitions so as to improve the utility of their members. In many cases, the formation of the grand coalition is the natural result of some negotiation or bargaining procedure. The main question then is how the players should split the gains from their cooperation among themselves. Various solutions have been suggested, among them the Shapley value, the nucleolus and the core.
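For reference, the core mentioned above is, in its standard form for a profit game v on a player set N (textbook notation, not the talk's):

    \[
    \mathrm{Core}(v) = \Big\{ x \in \mathbb{R}^N :
      \sum_{i \in N} x_i = v(N), \;
      \sum_{i \in S} x_i \ge v(S) \ \text{for all } S \subseteq N \Big\}.
    \]

An allocation in the core gives no coalition S an incentive to break away from the grand coalition.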
Servers in a queueing system can also join forces. For example, they can exchange service capacity among themselves or serve customers who originally seek service at their peers. The overall performance improves, and the question is how the servers should split the gains, or, equivalently, how much each of them needs to pay or be paid in order to cooperate with the others. Our major focus is on the core of the resulting cooperative game and on showing that in many queueing games the core is not empty.
Finally, customers who are served by the same server can also be looked at as players who form a grand coalition, now inflicting damage on each other in the form of additional waiting time. We show how cooperative game theory, specifically the Aumann-Shapley prices, leads to a way in which this damage can be attributed to individual customers or groups of customers.
Sequentially weighted nearest neighbour classifiers
Speaker: Mehdi Soleymani
Affiliation: Department of Statistics and Actuarial Science, The University of Hong Kong
When: Wednesday, 26 August 2015, 11:00 am to 12:00 pm
Where: Room 303-310
We describe an algorithm for sequentially combining nonparametric and weighted bagging estimates. The proposed algorithm improves the stability of bagged classifiers without affecting the bias. Bagging is a well-known device for decreasing the variance of a given statistic, and the new algorithm for bagged classification reduces the variance of the prediction even further. Theoretical properties are given for the nearest neighbour classifier, showing that sequential bagging accelerates convergence of the bagged predictor to the Bayes rule. There is a close connection between the sequentially bagged nearest neighbour classifier and optimal weighted nearest neighbour classifiers. We explore this connection and show that randomisation of the weights in the sequentially bagged nearest neighbour classifier improves the convergence rate of the estimator.
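For contrast, plain (non-sequential) bagging of a nearest neighbour classifier can be written in a few lines of R; this is an editor's baseline sketch using class::knn, not the talk's sequential weighting scheme.

    library(class)   # provides knn()

    # Majority vote of 1-NN classifiers fitted to bootstrap resamples
    bagged_knn <- function(train, cl, test, B = 100, k = 1) {
      votes <- replicate(B, {
        idx <- sample(nrow(train), replace = TRUE)
        as.character(knn(train[idx, , drop = FALSE], test, cl[idx], k = k))
      })
      apply(votes, 1, function(v) names(which.max(table(v))))
    }

    set.seed(1)
    tr <- sample(150, 100)
    pred <- bagged_knn(iris[tr, 1:4], iris$Species[tr], iris[-tr, 1:4])
    mean(pred == iris$Species[-tr])   # out-of-sample accuracy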
Contribution to the accessibility of quantitative skills
Speaker: Martin von Randow
Affiliation: COMPASS
When: Monday, 24 August 2015, 4:00 pm to 5:00 pm
Where: Fale Pasifika Complex, Bldg 273, Level 1, Rm 104
Around the world, many groups have lamented the shortage of quantitative skills among those entering the workforce, especially in the social sciences. Research methods content has been cut from programmes in order to retain student numbers, leading inevitably to a shortage of staff with those skills to pass on. COMPASS has long worked to alleviate these concerns locally, with short courses (NZSSN), data archiving (NZSSDS), and hands-on teaching in both quantitative and qualitative skills at the University of Auckland.
Martin von Randow has worked for the COMPASS Research Centre for more than 10 years. Alongside his extensive analytical work, he has been involved in a number of "outreach" activities for which COMPASS has taken responsibility: he has served as Operations Manager for courses through the New Zealand Social Statistics Network (NZSSN) and for data archiving through the New Zealand Social Science Data Service (NZSSDS), and has been a lab instructor for quantitative methods courses in the social sciences at the University of Auckland since 2005. In these ways, he has been central to what COMPASS has given back to the social science community.
Explaining the low income return for education among Asian New Zealanders
Speaker: Liza Bolton
Affiliation: COMPASS and Department of Statistics
When: Monday, 17 August 2015, 4:00 pm to 5:00 pm
Where: Fale Pasifika Complex, Bldg 273, Level 1, Rm 104
Aotearoa New Zealand has a long and rich history of migration from Asia, with Asians comprising around 10% of the usually resident population. For people of all ethnicities, educational attainment is positively associated with income, but in all qualification categories at the 2013 Census, Asian New Zealanders were earning markedly less than their European, Maori or Pacific Islander counterparts. This investigation uses 2013 New Zealand Census data to create explanatory models that investigate the factors related to this anomalous difference.
Liza Bolton is a PhD Candidate in Statistics at the University of Auckland, working with the Centre of Methods and Policy Application in the Social Sciences (COMPASS). Liza began her PhD in March 2015, under the supervision of Professor Alan Lee (Department of Statistics) and Dr Barry Milne (COMPASS). The work in this seminar is a continuation of her Honours research.
http://www.arts.auckland.ac.nz/en/about/our-research/research-centres-and-archives/compass.html
Life-course predictors of mortality inequalities
Speaker: Dr Barry Milne
Affiliation: COMPASS
When: Monday, 10 August 2015, 4:00 pm to 5:00 pm
Where: Fale Pasifika Complex, Bldg 273, Level 1, Rm 104
New Zealand has made great contributions to the understanding of the effects of socioeconomic factors on mortality, using data from the New Zealand Census Mortality Study. We extend this work by linking mortality data to the New Zealand Longitudinal Census (a linkage of individual Census records from 1981 to 2006) to assess life-course socioeconomic predictors of mortality. The great advantage of this linkage is that it allows socio-economic, social and cultural influences across 25 years of life to be assessed for their importance in association with mortality. I will describe this project, its aims, and some early findings. In particular, I will describe how analysing Census individuals within their family context allows siblings who experienced different socio-economic conditions to be compared in terms of their mortality risk.
Barry Milne is a Senior Research Fellow and Associate Director of the COMPASS Research Centre. He has a master's degree in Psychology from the University of Otago, and a PhD in psychiatric epidemiology from King's College London. His main interests are in longitudinal and life-course research, and in the use of large administrative datasets to answer policy and research questions.
http://www.arts.auckland.ac.nz/en/about/our-research/research-centres-and-archives/compass.html
The COMPASS 'social laboratory'. A knowledge-based inquiry system
Speaker: Peter Davis
Affiliation: COMPASS
When: Monday, 3 August 2015, 4:00 pm to 5:00 pm
Where: Fale Pasifika Complex, Bldg 273, Level 1, Rm 104
What do the books "The Healthy Country?" (Woodward & Blakely) and "The Spirit Level" (Wilkinson & Pickett) have in common? They are big picture, they present novel and stimulating interpretations, they are societal in scope - and they largely rely on aggregated (ecological) data of an observational kind. Can we add methodological precision to these speculations without losing the sense of the "bigger picture"? At COMPASS we have developed a rudimentary inquiry system using simulation methods based on existing research data. We propose over the next two years to extend this system to a societal level using the New Zealand Longitudinal Census.
Peter Davis is Professor of the Sociology of Health and Well-being at the University of Auckland, with cross-appointments in Population Health and Statistics, and founding director of the COMPASS Research Centre, a decade-long grant-funded research group. He has masters degrees in Sociology and Statistics from the London School of Economics, and a PhD in community health from Auckland. His main interests are in applying advanced methodological techniques to social data in addressing policy and substantive questions.
http://www.arts.auckland.ac.nz/en/about/our-research/research-centres-and-archives/compass.html
Less Volume, More Creativity — Introducing R to Beginners
Speaker: Randall Pruim
Affiliation: Calvin College, Grand Rapids, MI
When: Wednesday, 24 June 2015, 11:00 am to 12:00 pm
Where: Room 303-310
Introducing beginners to R can be a daunting task, both for the beginners and for those doing the introducing. Over the past several years, with support from the US National Science Foundation, the Project MOSAIC team (D Kaplan, N Horton, and R Pruim) has been assembling the mosaic R package. Based on experience in our own classes, numerous workshops we have given, and feedback from others who have used the package, we have been refining it to provide a powerful R toolkit with as little cognitive complexity as possible. In addition to discussing how to use the mosaic package to compute numerical and graphical summaries and do simulation-based inference, we'll take a look at some of the guiding principles behind the design of the package and how they might be applied in other contexts.
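A small taste of the package's "less volume" formula template (a sketch assuming the CRAN mosaic package; these lines are an editor's example, not slides from the talk):

    library(mosaic)

    # One formula template serves many tasks
    mean(~ Sepal.Length, data = iris)                # a numerical summary
    favstats(Sepal.Length ~ Species, data = iris)    # grouped summaries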
Reproducibility in Science - A Rethink, or a Crisis?
Speaker: John Maindonald
Affiliation: Mathematical Sciences Institute, Australian National University
When: Wednesday, 17 June 2015, 11:00 am to 12:00 pm
Where: Room 303-310
Reproducibility issues have in the past several years received wide attention in 'Nature' and elsewhere. Particular attention has been directed at animal and drug laboratory studies, and at experimentation in psychology and neuropsychology. The crucial test of reproducibility is whether someone other than the original experimenter can reproduce the results. One recent attempt at reproducing 67 'seminal' drug studies was successful in only 14 instances; this seems not untypical. Selection effects doubtless explain part of the problem: most published results are false positives, from experiments where there is mostly no effect. There are in addition serious issues with study design, analysis and reporting: experiments may not be reported with sufficient accuracy and detail to allow replication. I will comment on where the debate seems headed, and on initiatives designed to address the problem. There are implications for the statistical input to study design and analysis, and for statistical training. P-values are widely misunderstood and misused, to an extent that raises doubts about their usefulness. A really effective answer to the issues raised will require radical changes to funding arrangements, to the reporting of results, to the publication process, and to reward systems. Moves towards larger cooperative studies would help greatly; areas where the nature of the work requires cooperation between scientists, in order to bring together the necessary skills and tools, provide useful models.
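A stylised back-of-envelope version of the selection effect (the numbers are an editor's assumptions, not the speaker's): with a low prior probability that a tested effect is real, the proportion of "significant" results that are true falls quickly as power drops.

    # Positive predictive value of a significant result
    ppv <- function(prior, power, alpha) {
      (power * prior) / (power * prior + alpha * (1 - prior))
    }
    ppv(prior = 0.1, power = 0.8, alpha = 0.05)  # ~0.64
    ppv(prior = 0.1, power = 0.2, alpha = 0.05)  # ~0.31: false positives dominate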
John Maindonald has had wide experience as a quantitative problem solver, working with researchers in diverse areas of science and industry. In 1996 he moved from NZ to Australia, then taking a position at The Australian National University (ANU) in 1998. He is the author of a book on Statistical Computation, and the senior author of "Data Analysis and Graphics Using R" (3rd edition, CUP 2010). Now in semi-retirement, he does occasional consulting, fronts workshops on the R system, and continues to write. He has recently moved back to NZ, to live in Wellington.
A King's Bones under a Carpark? Evaluating the Genetic and Other Evidence
Speaker: David Balding
Affiliation: Schools of BioSciences and of Maths & Stats., U. Melbourne, and UCL Genetics Institute, London
When: Friday, 5 June 2015, 1:00 pm to 2:00 pm
Where: MLT2 (Building 303 Level 1)
Although the evidence that bones found under a Leicester (UK) carpark were those of King Richard III seemed extremely strong even before the full genetic data became available, the great public interest in the story attracted sceptics trying to discredit the evidence, and the publication of the full evidence was expected to attract critical scrutiny. For reasons that I will explain, the genetic evidence was not decisive on its own (although we were unable to stop media reports continuing to say "genetic tests proved ..."). When publishing the full genetic data, the team at the University of Leicester asked me and my UCL colleague Mark Thomas to help quantify as much as we could of the disparate lines of evidence, both genetic and non-genetic, in order to come up with an overall summary of the evidential weight for the claim that the bones are those of the king. There are close parallels with disparate lines of evidence in a criminal trial. It may be interesting to review the assumptions, judgment calls and presentational choices that were made, in order to be well prepared for the next dead-king-in-a-carpark case.
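The general machinery for combining disparate lines of evidence is the product of likelihood ratios (a textbook sketch; the published analysis involved judgment calls about each factor and their dependence):

    \[
    \frac{P(H \mid E_1,\dots,E_k)}{P(\bar{H} \mid E_1,\dots,E_k)}
      = \frac{P(H)}{P(\bar{H})}
        \prod_{i=1}^{k} \frac{P(E_i \mid H)}{P(E_i \mid \bar{H})},
    \]

where H is the hypothesis that the bones are the king's and the E_i are the individual genetic and non-genetic findings, assumed conditionally independent.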
Water Quality Monitoring and Modelling in NZ Lakes
Speaker: Moritz Lehmann
Affiliation: University of Waikato
When: Wednesday, 27 May 2015, 11:00 am to 12:00 pm
Where: Room 303-310
Almost half of New Zealand’s lakes have degraded water quality (MFE 2009) which can manifest through increasing frequency of nuisance algal blooms, contamination with fecal microbes and low oxygen concentrations. Poor water quality is most often caused by complicated interactions of contaminants derived from the catchment, weather and climatic conditions and in-lake ecological processes. The University of Waikato is the home of LERNZ (Lake Ecosystem Restoration New Zealand), a programme under which a series of projects were conducted over the last 10 years towards identifying and remediating threats to lake ecosystems through the development of models, observation technologies and management strategies. Two particular outcomes of the programme are a network of real-time monitoring buoys currently installed in fifteen NZ lakes, and a number of state-of-the-art dynamic coupled bio-physical models of water quality. Both models and buoys produce large amounts of data which, by themselves, have their place in science, management and decision making. The logical next step is the mathematical integration of observations and models. In the future, we aim to implement data assimilation schemes for improved model validation and calibration, short-term forecasting of water quality and quantification of model uncertainty. Discussions about potential collaboration will be welcome.
Are we there yet? The effects of the changing school statistics curriculum
Speaker: Nicola Petty
Affiliation: Statistics Learning Centre
When: Wednesday, 20 May 2015, 11:00 am to 12:00 pm
Where: Room 303-310
Statistics teaching throughout New Zealand is in transition. Those schools and teachers that have embraced the changes are enjoying the new approach to learning and teaching statistics introduced in the new curriculum. Others are less enthused, and some blame the University of Auckland's Department of Statistics for their discomfort. Dr Nic provides resources and professional development to help teachers and students make sense of statistics, and visits a large number of schools and meets many teachers. This seminar will include a report back on responses, joys and challenges from the teachers, some of the implementation issues, and an overview of the Statistics Learning Centre resources.
Biographical info:
Nicola Petty (aka Dr Nic) specialises in the teaching and learning of statistics. After teaching in high school and twenty years lecturing in Operations Research at the University of Canterbury, Nicola set up the Statistics Learning Centre. She blogs, makes videos and resources and leads professional development sessions on statistics and probability.
Improving senior school statistics education in New Zealand
Speaker: Jake Wills
Affiliation: Kapiti College
When: Friday, 8 May 2015, 12:00 pm to 1:00 pm
Where: Room 303-310
We will look at the problems teachers and students face in secondary school statistics, including issues around content, assessment criteria and technology, and at what I have been doing to try to address them. We will look at what led to the creation of www.MathsNZ.com and NZGrapher, and how these are being used in schools to help teachers and students.
We will also look at ways that the information we hold on students can be made more accessible to teachers in order to improve student performance, with a look at some software I have been developing to enable this to happen.
See also http://www.mathsnz.com and http://www.jake4maths.com/grapher/
Automatic adjudication of symptom-based exacerbations in bronchiectasis patients treated with azithromycin
Speaker: Mark Wheldon
Affiliation: AUT
When: Wednesday, 29 April 2015, 11:00 am to 12:00 pm
Where: Room 303-310
The EMBRACE multi-center RCT evaluated the effect of azithromycin on frequency of event-based exacerbations (EBEs), lung function and health-related quality of life in adult patients with non-cystic fibrosis bronchiectasis. The treatment was effective in lowering the rate of EBEs relative to placebo (Wong et al., Lancet 2012 380(9842):660-7). An EBE, a binary outcome, is a sustained worsening of condition requiring treatment with antibiotics. Respiratory condition is defined, partly, in terms of daily sputum volume, sputum purulence, and dyspnoea, recorded daily in diaries kept by participants. Quality of life, as measured by the St. George's respiratory questionnaire, was also recorded in the diaries. Here, we use the daily diary entries to build a statistical model to estimate the probability that an EBE is imminent. A definition of symptom-based exacerbation (SBE) is developed by comparing the ROC curves of logistic regression models with EBE as the response and respiratory condition over different time windows as the predictors. Predictive accuracy is estimated using cross-validation. The new SBE definition is validated against self-reported quality of life.
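A sketch of the model-comparison step in R (an editor's illustration with simulated stand-in data; the variable names are assumptions, not the study's):

    set.seed(42)
    # Simulated diary data: 7-day symptom summaries and a binary EBE outcome
    diary <- data.frame(sputum7 = rnorm(200), dyspnoea7 = rnorm(200))
    diary$ebe <- rbinom(200, 1, plogis(-1 + diary$sputum7))

    # Logistic regression with EBE as the response; candidate symptom
    # windows would be compared on the area under the ROC curve
    fit <- glm(ebe ~ sputum7 + dyspnoea7, family = binomial, data = diary)

    library(pROC)
    auc(roc(diary$ebe, fitted(fit)))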
Combining aggregated and individual data in contingency tables
Speaker: Markus Stein
Affiliation: U. Auckland
When: Wednesday, 15 April 2015, 11:00 am to 12:00 pm
Where: Room 303-310
Likelihood-based approaches for constrained contingency tables are considered in this PhD thesis proposal. Very often only the marginal distributions of the variables of interest are available, and inferences about unit-record effects are subject to a range of biases. Examples of such problems appear in so-called "ecological inference" and in disclosure limitation. In this case, the only reliable way of reducing bias and the impact of uncheckable assumptions is to supplement the marginal distributions with other, e.g. individual-level, information. Our interest is in using unit-record data on subsamples as auxiliary information to the group-level distributions. Maximum likelihood estimation can be a very challenging task, because its calculation requires the enumeration of all tables consistent with the observed data; however, good approximations to the full likelihood can be obtained by sampling possible tables. With the present research, our goal is to develop efficient methods for combining non-monotonic (non-nested) sources of information with minimal losses in efficiency, especially in high-dimensional constrained contingency tables. This talk will compare inferences via the true likelihood in 2xJ tables with estimated likelihoods, using uniform sampling methods and an informative sampling scheme based on independent binomial distributions.
Data Science: Will Computer Science and Informatics Eat Our Lunch?
Speaker: Thomas Lumley
Affiliation: U. Auckland
When: Wednesday, 8 April 2015, 11:00 am to 12:00 pm
Where: Room 303-B11
Mainstream statistics ignored computing for many years, so that students were taught to handle infinite N, but not N of a million. Practical estimation of conditional probabilities and conditional distributions in large data sets was often left to computer science and informatics. Although statistics started behind, we are catching up: many individual statisticians and some statistics departments are taking computing seriously. More importantly, applied statistics has a long tradition of understanding how to formulate questions: large-scale empirical data can tell you a lot of things, but not what your question is. Big Data are not only Big but Complex, Messy, Badly Sampled, and Creepy. These are problems that statistics has thought about for some time, so we have the opportunity to take all the shiny computing technology that other people have developed and use it to re-establish statistics at the centre of data science.
Patterns of cardiovascular disease risk factor laboratory monitoring during follow-up
Speaker: Mugdha Manda
Affiliation: U. Auckland - Dept. of Statistics and Section for Epidemiology and Biostatistics (School of Population Health)
When: Wednesday, 1 April 2015, 11:00 am to 12:00 pm
Where: Room 303-310
The overall goal of this research project is to contribute to the Vascular Informatics Using Epidemiology and the Web (VIEW) HRC 11/800 research programme.
PREDICT is a web-based (electronic) clinical decision support system used by primary care practitioners and contains encrypted personal data on over two hundred thousand New Zealanders (national and regional databases).
We will use this vast cohort to understand the patterns of risk factor monitoring and the quality of monitoring to improve clinical methods for assessing and managing cardiovascular disease (CVD) burden with the aim of addressing disparities in CVD risk groups.
The trajectories of biochemical measures during the development of the adverse outcomes of diabetes and renal failure will be modelled using mixture models for longitudinal data, implemented in the recently developed SAS procedure PROC TRAJ. This talk will cover the fundamentals of modelling developmental trajectories in longitudinal data and the proposed models for these trajectories, including recent work in the development of PROC TRAJ and the distributions it supports.
Introduction to vines and some recent applications
Speaker: Claudia Czado
Affiliation: Technische Universitaet Muenchen
When: Wednesday, 4 March 2015, 11:00 am to 12:00 pm
Where: Room 303-310
Having large multivariate data sets available allows us to model the dependence structure carefully, including tail dependence and asymmetry. For this modeling I will use a copula-based approach and utilize the class of multivariate vine copulas. This class is built using only bivariate copula terms, called pair copulas. I will introduce the class and discuss estimation and model selection. Finally, I will present an application in the area of stress testing, to detect systemic risk of financial and insurance institutions. General information about vine copulas, software and their applications can be found at vine-copula.org
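In three dimensions, for example, a vine decomposition built entirely from pair copulas takes the standard form (with F denoting marginal and conditional distribution functions):

    \[
    f(x_1,x_2,x_3) = f_1(x_1)\,f_2(x_2)\,f_3(x_3)\;
      c_{12}\big(F_1(x_1),F_2(x_2)\big)\,
      c_{23}\big(F_2(x_2),F_3(x_3)\big)\,
      c_{13|2}\big(F(x_1 \mid x_2),F(x_3 \mid x_2)\big).
    \]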
http://www.statistics.ma.tum.de/en/research/vine-copula-models/
Film: Wolfgang Doeblin – a mathematician rediscovered
Speaker: Harrie Willems
Affiliation:
When: Tuesday, 24 February 2015, 11:00 am to 12:00 pm
Where: Room 303-310
The film tells the tragic and amazing story of the mathematician Wolfgang Doeblin and his famous manuscript "Sur l'équation de Kolmogoroff" ("On Kolmogoroff's equation").
Wolfgang Doeblin, one of the great probabilists of the 20th century, was already widely known in the 1950s for his fundamental contributions to the theory of Markov chains. His coupling method became a key tool in later developments at the interface of probability and statistical mechanics. But the full measure of his mathematical stature became apparent only in 2000, when the sealed envelope containing his construction of diffusion processes in terms of a time change of Brownian motion was finally opened, 60 years after it was sent to the Academy of Sciences in Paris. The film by Agnes Handwerk and Harrie Willems documents scientific and human aspects of this amazing discovery and throws new light on the startling circumstances of his death at the age of 25.
Data Fusion Techniques for Questions of Population Connectivity
Speaker: Louise McMillan
Affiliation: U. Auckland
When: Wednesday, 18 February 2015, 11:00 am to 12:00 pm
Where: Room 303-310
Conservation programmes use many different tools and techniques, and since the advent of genetic analysis and sequencing methods many more such tools have become available. One particularly useful tool is assignment testing, which uses genetic data from baseline populations to assess the likely source population of new individuals. This can be used to determine the source of invasive predators, or to understand the population structure of an endangered species.
GenePlot was developed by Rachel Fewster as an improvement on existing assignment software, and the saddlepoint approximation method was added (during my master's project) as an alternative method for characterising the distribution of genotype probabilities for each population.
This talk will cover the basic principles of assignment testing, the improved methods used by GenePlot, and the details of the saddlepoint approximation, including recent work to improve the detection and mitigation of the numerical instability within the approximation to the cumulative distribution function.
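For background, the saddlepoint approximation to a density takes the standard form below (textbook notation with cumulant generating function K; the talk's specific variant for genotype probabilities is not reproduced here):

    \[
    \hat{f}(x) = \frac{1}{\sqrt{2\pi K''(\hat{s})}}
      \exp\!\big\{K(\hat{s}) - \hat{s}x\big\},
    \qquad \text{where } K'(\hat{s}) = x.
    \]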
Potts Models for the Segmentation of Time Series
Speaker: Volkmar Liebscher
Affiliation: University of Greifswald, Germany
When: Thursday, 29 January 2015, 11:00 am to 12:00 pm
Where: Room 303-310
Quite a few time-series-like data sets in the life sciences force us to partition the data into periods of homogeneous behaviour separated by sudden changes.
We approach such data-analytic problems by minimising functionals composed of two terms: one measuring fidelity to the data and one penalising the complexity of the representation, essentially the cardinality of the partition. In particular, we present an overview of algorithmic solutions to the minimisation problem and of the statistical performance of the resulting estimators.
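The prototypical Potts functional for a data sequence y_1, ..., y_n has exactly this two-term form (standard notation, with gamma the penalty parameter):

    \[
    H(x) = \sum_{i=1}^{n} (y_i - x_i)^2
      \;+\; \gamma\, \#\{\, i : x_i \neq x_{i+1} \,\},
    \]

minimised over piecewise constant sequences x; the second term counts the jumps, i.e. the cardinality of the partition minus one.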
http://www.math-inf.uni-greifswald.de/index.php/mitarbeiter/97-volkmar-liebscher
A comparative study on the error rates and power of selection criteria for factor screening
Speaker: Abu Zar Md Shafiullah
Affiliation: University of Auckland
When: Thursday, 15 January 2015, 3:00 pm to 4:00 pm
Where: Room 303-310
In this study we consider the problem of selecting active effects when an orthogonal main-effect design is used for factor screening. We compare the performance of classical model selection criteria and a proposed APC criterion. Our simulation studies use a 12-run Plackett-Burman design and select at most 8 active effects under the assumption of effect sparsity. The proposed APC criterion is an AIC-like criterion that controls error rates at a specified level under three control strategies: individual error rates (IER), experiment-wise error rates (EER) and false detection rates (FDR). This criterion is more flexible than existing model selection criteria in that the experimenter is able to select the type (IER, EER or FDR) and level of error control, which is not possible with existing methods. Effect sizes and the number of active effects affect the performance of all the selection criteria. Our simulation studies show that, in general, there is a tradeoff between error rates and power, and this tradeoff is maintained quite nicely by the APC criterion over a wide range of possible true models.
Quantifying the law of small numbers
Speaker: Andrew Barbour
Affiliation: U. Zurich
When: Monday, 12 January 2015, 10:00 am to 11:00 am
Where: Room 303-310
Until Ladislaus von Bortkewitch published his 'Law of small numbers' in 1898, the Poisson distribution had been largely neglected as a statistical model.
Since then, it has become increasingly popular, in particular as an underlying framework for analyses in insurance mathematics, epidemiology and ecology. In this talk, we discuss reasons for, and limitations on, its application. Stein's method for distributional approximation gives good bounds on approximation by (compound) Poisson distributions, expressed in terms of explicit quantities that can be shown to be small under a variety of practically verifiable assumptions. For simplicity, we concentrate here on Poisson approximation and on elementary examples, such as the hat-check problem, birthday problems, coupon collecting and the problème des ménages.
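As a taste of the elementary examples: in the birthday problem the number of coinciding pairs among n people is approximately Poisson with mean choose(n,2)/365, giving a one-line approximation that can be checked against the exact answer in R (an editor's illustration, not from the talk).

    n <- 23
    exact  <- 1 - prod((365:(365 - n + 1)) / 365)  # P(at least one shared birthday)
    approx <- 1 - exp(-choose(n, 2) / 365)         # Poisson approximation
    c(exact = exact, approx = approx)              # ~0.507 vs ~0.500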
Note that this is a general statistical audience talk. Andrew will subsequently (i.e. in the following days) be delivering a mini-course on Stein's method as part of the Probability and Mathematical Statistics meeting/seminar series.