Abstract: Bayesian statistics is becoming more popular in data science. Data scientists are often not trained in Bayesian statistics and, if they are, it is usually as part of their graduate training. During this talk, we will present an introductory course in Bayesian statistics for learners at the undergraduate level and comparably trained practitioners. We will share tools for teaching (and learning) the first course in Bayesian statistics, specifically the {bayesrules} package that accompanies the open-access book Bayes Rules! An Introduction to Applied Bayesian Modeling. We will provide an outline of the curriculum and examples for novice learners and their instructors.

Speaker: Mine Dogucu, Associate Professor of Teaching and Vice Chair for Undergraduate Studies, Department of Statistics, University of California Irvine

Bio: Mine Dogucu is Associate Professor of Teaching and Vice Chair of Undergraduate Studies in the Department of Statistics at University of California Irvine. Her goal is to create educational resources for statistics and data science that are accessible physically and cognitively. Her work focuses on modern pedagogical approaches in the statistics curriculum, making data science education accessible, and undergraduate Bayesian education. She is the co-author of the book Bayes Rules! An Introduction to Applied Bayesian Modeling. She works on a few projects funded by the United States National Science Foundation and the National Institutes of Health. She writes blog posts about data, pedagogy, and data pedagogy at DataPedagogy.com.

Nonparametric Density Estimation for Compositional Data

Speaker: Jiajin (George) Xie

Affiliation: Department of Statistics, University of Auckland

When: Thursday, 28 November 2024, 12:00 pm to 1:00 pm

Where: 303-310

This study addresses the challenges of density estimation for compositional data, a type of data constrained to reflect relative proportions within a whole. Such data are prevalent across diverse fields, including microbiome analysis, geology, and machine learning. The research develops and evaluates nonparametric methods for high-dimensional compositional data density estimation, focusing on the mixture-based density estimation (MDE) approach. Two types of mixture components are explored: Gaussian distributions applied to log-ratio-transformed compositional data, which offer excellent flexibility, and Dirichlet distributions applied directly to compositions, effectively handling cases with zero values. The performance of these methods is assessed through simulation studies and compared with finite mixture and kernel density estimation techniques. Results demonstrate the superior accuracy and adaptability of the proposed methods in capturing intricate data structures across various scenarios.
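
As a small illustration of the log-ratio idea mentioned above, the sketch below applies a centered log-ratio (clr) transform to a composition and maps it back. The abstract does not say which log-ratio transform is used, so the choice of clr and the function names `clr` and `clr_inverse` are assumptions made for illustration only. It also shows why zero parts are awkward for log-ratio methods: the logarithm is undefined there.

```python
import math


def clr(composition):
    """Centered log-ratio transform of a strictly positive composition."""
    if any(part <= 0 for part in composition):
        # log(0) is undefined, so zero parts cannot be log-ratio transformed.
        raise ValueError("log-ratio transforms require strictly positive parts")
    logs = [math.log(part) for part in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def clr_inverse(coords):
    """Map clr coordinates back to a composition that sums to 1."""
    exps = [math.exp(c) for c in coords]
    total = sum(exps)
    return [e / total for e in exps]


composition = [0.2, 0.3, 0.5]      # hypothetical three-part composition
coords = clr(composition)          # unconstrained coordinates that sum to zero
recovered = clr_inverse(coords)    # back on the simplex
```

Gaussian mixture components could then be fitted to coordinates like `coords`, whereas a composition containing a zero would raise an error here.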

(This is a PYR talk.)

Visualization and Analysis of Suicide Methods in Tokyo Using Interactive Graphs

Speaker: Takafumi KUBOTA

Affiliation: Tama University, Japan

When: Wednesday, 30 October 2024, 11:00 am to 12:00 pm

Where: 303-310

This study aims to visualize the trends in suicide methods in Tokyo, using Japan's regional suicide statistics to provide insights that can inform effective prevention strategies. Suicide is a significant social issue, and analyzing regional data can offer valuable perspectives for targeted interventions. The research focuses on visualizing the trends for different suicide methods by creating bar graphs, line charts, and choropleth maps. These visualizations are generated after data cleaning to clearly depict the occurrence and trends associated with each method.

The application is developed using the R packages shiny and plotly, enabling users to interactively explore the data. With shiny, users can select the items of interest, such as region, time period, or suicide method, from a menu, while plotly allows for the implementation of interactive graphs that dynamically update based on the selected parameters. This approach facilitates the identification of specific regional trends, such as railway suicides or jumps from high-rise buildings that are more prevalent in Tokyo.

Through the development and analysis of this application, the study aims to enhance the understanding of regional and method-specific suicide trends, providing recommendations for suicide prevention measures. The visualized data is expected to serve as a valuable tool for policymakers and researchers, contributing to the strengthening of suicide prevention efforts.

Test of clustering for Neyman-Scott processes

Speaker: Bethany Macdonald

Affiliation: Otago University

When: Wednesday, 23 October 2024, 11:00 am to 12:00 pm

Where: 303-310

Spatial point patterns can arise from a vast array of application areas including epidemiology, ecology and geoscience. A fundamental research question is whether the points within these patterns are independent or clustered. Somewhat surprisingly, there exists no formal statistical test for such a hypothesis. This is largely due to the long recognised fact that the likelihood of the Neyman-Scott process is intractable. Recent developments by Baddeley et al. (2022) have remedied this issue by reparametrising the Neyman-Scott model by cluster strength and cluster scale, where the Poisson process occurs when the cluster strength is zero. Using these developments, we establish a formal test of clustering for the Neyman-Scott process.

Bayesian and deep learning strategies for calibration and denoising in gravitational wave data analysis

Speaker: Ruiting Mao

Affiliation: Department of Statistics, University of Auckland

When: Thursday, 5 September 2024, 10:00 am to 11:00 am

Where: 303-B05

Bayesian statistical methods have played a pivotal role in signal detection and the physical parameter estimation of gravitational waveform models. The future space-based gravitational wave (GW) detector, the Laser Interferometer Space Antenna (LISA), which is sensitive to the millihertz frequency band, makes it possible to detect some promising sources of GWs. However, Bayesian inference for features of interest and noise characterization is often computationally expensive and subject to model misspecification with complex waveforms and nonstationary noise artifacts in the LISA data stream. Through this work, I will present the application of deep learning models to address these challenges inherent in LISA data analysis. Specifically, I will discuss two key issues: 1) Exploring calibration techniques to quantify and correct the approximation errors introduced by using computationally faster but less accurate waveform models in Bayesian parameter estimation, and 2) Investigating deep learning methods to fill in data gaps from the LISA data stream effectively.

(This is a PhD PYR talk.)

Engaging in, and teaching, ethical practice of statistics and data science

Speaker: Rochelle Tractenberg

Affiliation: Georgetown University, Washington DC

When: Tuesday, 23 July 2024, 2:00 pm to 3:00 pm

Where: 303-310

The American Statistical Association's Ethical Guidelines for Statistical Practice define "Statistical Practice" to include designing the collection of, summarizing, processing, analyzing, interpreting, or presenting, data; as well as model or algorithm development and deployment. The Guidelines are intended to support every individual who uses "statistical practice", irrespective of their level, training, degree or job title, to do so in an ethical way. When it comes to encouraging (and teaching) "ethical statistical practice", there are two dimensions that must be recognized:

(i) To practice ethically, i.e., execute each task in accordance with ethical practice standards (like the Guidelines); and

(ii) To identify, and respond to, unethical actions/requests.

In this talk we will explore how a Stakeholder Analysis can be used with the ASA Ethical Guidelines (or any guidance) to practice ethically, and teach ethical statistical practice. We will also consider an Ethical Reasoning paradigm that facilitates identifying and making an informed decision about responding to ethical dilemmas. This paradigm is also useful for both engaging in, and teaching, ethical statistical practice. Both of these tools will be examined in the context of a 7-task "statistics and data science pipeline", which itself can help instructors to reinforce student learning about the scientific method, the Problem, Plan, Data, Analysis, Conclusion cycle, and even the eight-step UN-based Generic Statistical Business Process model, which was developed to support "official statistics", a special case of statistical practice.

Bio: Rochelle Tractenberg is a tenured professor in the Department of Neurology, with appointments in Biostatistics, Bioinformatics & Biomathematics and Rehabilitation Medicine, at Georgetown University in Washington, DC. She is a multi-disciplinary research methodologist and ASA-accredited Professional Statistician (PStat®), as well as a cognitive scientist focused on higher education curriculum design and evaluation. Her clinical and translational work integrates theories and principles of statistics, psychometrics, and domain-specific measurement to problems of assessment and the determination of changes in cognition, brain aging, and other difficult-to-measure constructs, using qualitative and quantitative methods. She is also an internationally recognized expert on ethical statistics and data science practice, having published two books, Ethical Practice of Statistics and Data Science and Ethical Reasoning for a Data-Centered World, in 2022. In addition to ethical statistics and data science practice, she has also contributed to guidelines for ethical mathematical practice (US based) and particularly, on how to integrate ethical content into quantitative courses. She is developing a new edition of Ethical Practice of Statistics and Data Science, specifically for government settings (expected 2025) and is collaborating on a forthcoming UN Handbook on Ethical Practice in Official Statistics. Professor Tractenberg is an elected Fellow of the American Statistical Association, the International Statistical Institute, and the American Association for the Advancement of Science, and was nominated for the 2022 Einstein Foundation Award for Promoting Quality in Research. Each of these nominations highlighted her commitment to, and support for, ethical statistical practice and scientific stewardship.

Designing to Support Doing Data Science and Statistics in Schools

Speaker: Hollylynne Lee

Affiliation: NC State University

When: Tuesday, 16 July 2024, 4:00 pm to 5:00 pm

Where: 303-310

Abstract: The U.S. often looks to New Zealand for resources and research related to teaching and learning statistics. In this talk, Hollylynne will discuss two recent projects situated in the U.S. that are advancing the teaching and learning of statistics and data science for secondary schools. These projects have designed curricula and online professional learning experiences for teachers at all stages of their career, from undergraduate education through life-long learning as a practicing teacher. We collaborate with a team at CODAP to integrate advanced data experiences into classrooms. The presentation will have something for everyone related to research, design of educational materials, and ideas for secondary classrooms.

Bio: Dr. Hollylynne Lee is a Distinguished University Professor of Mathematics and Statistics Education in the STEM Education department at NC State University, Raleigh NC, USA. She is also a Senior Faculty Fellow at the Friday Institute for Educational Innovation where she directs the Hub for Innovation and Research in Statistics and Data Science Education (https://fi.ncsu.edu/teams/hirise/). With experience teaching in elementary, middle, and high school classrooms, she brings a depth of practical perspectives to her research, and ensures her research and designs of educational resources are directly applicable to teachers and students. Her current work includes a focus on teachers’ professional learning for teaching with data using tools like CODAP and transforming undergraduate teacher preparation related to teaching statistics and data science. She loves reading, kayaking, watching volleyball, spending time with family, and her dog and cat. https://ced.ncsu.edu/people/hstohl/

Investigating Statistical Literacy of Health Professionals in Papua New Guinea

Speaker: Deborah Kakis

Affiliation: UoA

When: Tuesday, 11 June 2024, 10:00 am to 11:00 am

Where: 303-310

Abstract:

In today’s healthcare landscape, where evidence-based practice is considered the gold standard, data and statistical literacy are important skills for healthcare professionals. These competencies enable practitioners to collect, store and manage medical data, analyse data, interpret research findings, and make informed decisions. In Papua New Guinea (PNG), a developing nation with unique healthcare challenges, fostering these literacies becomes critical.

Healthcare professionals in PNG deal with data daily, whether patient records, public health data, or administrative information. Ensuring the reliability and utility of this data for evidence-based practice requires strong data literacy skills to guarantee accurate collection and storage, while statistical literacy enables practitioners to extract meaningful insights to inform their practice.

However, many challenges hinder healthcare professionals in PNG from developing strong foundations in these areas, leading to a lack of confidence in their data and statistical literacy skills, which are necessary for evidence-based practice.

To address this gap, this proposed study aims to assess the current data and statistical literacy levels among healthcare professionals in PNG. By evaluating their proficiency, we can identify areas for improvement and tailor targeted training programs accordingly. Enhancing statistical and data literacy equips healthcare professionals to evaluate treatment efficacy confidently, identify emerging trends, and actively contribute to evidence-based care.

This is the PYR seminar.

Modern Variable Selection for Vector Generalized Linear Models

Speaker: Wenqi Zhao

Affiliation: UoA

When: Monday, 27 May 2024, 1:00 pm to 2:00 pm

Where: 303-257

Abstract:

The generalized linear model (GLM) is a framework in statistics for modeling the relationship between a response variable and one or more predictor variables. It is typically used to relate a random response to a linear predictor in order to predict observations. While GLMs offer relatively straightforward interpretation of coefficients, they may not capture complex interactions or nonlinear relationships in the data. Vector generalized linear models (VGLMs) and vector generalized additive models (VGAMs) greatly extend GLMs. VGAM currently implements over 150 family functions and offers a large, flexible framework in which model elements can be varied.

Variable selection is a crucial step in statistical modeling: it identifies the most relevant predictors for the response variable. In the VGLM/VGAM framework, this is usually done by choosing the model with the minimum value of some information criterion (IC), of which the Akaike IC (AIC) and Bayesian IC (BIC) are the most common. VGAMs can also penalize regression splines using P-spline smoothers, which we term ‘P-spline VGAMs’. However, fitting VGAMs with penalized regression splines can be computationally intensive, particularly when dealing with large datasets or high-dimensional predictor spaces. When the variables are greater than the observations,

In this project, we propose to combine the elastic net with the VGLM/VGAM framework to create a new model selection method. Elastic net regularization can help prevent overfitting and handle multicollinearity, and it can result in sparser models with fewer predictors. Its regularization path helps in identifying and handling multicollinearity by favoring models with fewer predictors in the VGLM/VGAM framework.
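
To make the two selection ideas in this abstract concrete, here is a minimal sketch of information-criterion selection alongside the elastic net penalty. The AIC and BIC formulas are the standard ones. The penalty uses the common glmnet-style parametrisation, which is an assumption, since the abstract does not state one. The candidate models, their log-likelihoods, and all function names are hypothetical, and nothing here calls the VGAM package.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k log(n) - 2 log L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def elastic_net_penalty(coefficients, lam, alpha):
    """Elastic net penalty: lam * (alpha * L1 + (1 - alpha) / 2 * squared L2)."""
    l1 = sum(abs(b) for b in coefficients)
    l2_squared = sum(b * b for b in coefficients)
    return lam * (alpha * l1 + (1 - alpha) / 2 * l2_squared)


# Hypothetical fitted models: name -> (maximised log-likelihood, parameter count).
candidates = {"small": (-120.0, 3), "large": (-118.5, 6)}
n_obs = 100  # hypothetical sample size

best_by_aic = min(candidates, key=lambda name: aic(*candidates[name]))
best_by_bic = min(candidates, key=lambda name: bic(*candidates[name], n_obs))

# The penalty that would be added to a negative log-likelihood during fitting.
penalty = elastic_net_penalty([1.0, -2.0, 0.0], lam=0.5, alpha=0.5)
```

Information criteria compare a finite set of already fitted models, whereas the elastic net penalty shrinks coefficients while fitting, which is what lets it produce sparser models along a regularization path.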

This is the PYR seminar.

Childhood Risk and Resilience Factors for Pasifika Youth Respiratory Health: Accounting for Attrition and Missingness

Speaker: Dawson Zhai

Affiliation: UoA

When: Friday, 24 May 2024, 1:00 pm to 2:00 pm

Where: 303-310

Abstract:

In New Zealand, 7% of deaths are related to respiratory diseases, with Pacific people at higher risk. Based on knowledge of lung development, lung function can be damaged in two ways: 1) Lung function reduction: early insults may lower the maximum lung function and/or accelerate its decline after the peak; 2) Predisposition to later respiratory disease: early disease raises the risk of later disease occurring. Conversely, some resilience factors can create beneficial effects on respiratory function and/or provide protection to stop subsequent respiratory diseases; among these factors are childhood levels of physical activity, smoke exposure, immunisation, housing conditions, and breastfeeding.

Using Pacific Island Family Study (PIFS) cohort data, this work will investigate the causal effects of identified early-life factors on early-adulthood lung function, quality of life and comorbidities. The PIFS cohort is a longitudinal cohort, the participants of which were enrolled at birth in Middlemore Hospital (n=1398) between March and December 2000. A respiratory assessment (n=466) was conducted within the cohort when participants were 18 years old. In this PIFS birth cohort respiratory study, the primary respiratory outcome was the z-score of the Forced Expiratory Volume in 1 second (FEV1). Secondary outcomes consisted of FEV1 adjusted for height and sex; the healthy lung function (HLF) indicator, defined as the z-score exceeding -1.64; health-related and respiratory-health-related quality of life scores; and respiratory condition indicators. The attrition and missingness present in the group undergoing respiratory assessment will inform much of the analysis plan, as will the longitudinal character of the risk and protective factors and their confounders.

This is the PYR seminar.

Statistical Methods and Designs for Multi-Wave Validation Studies

Speaker: Gustavo Guimaraes DeCastro Amorim

Affiliation: Vanderbilt University Medical Center

When: Thursday, 23 May 2024, 11:00 am to 12:00 pm

Where: 303-310

Abstract:

Measurement errors are present in many data collection procedures and can harm analyses by biasing estimates. To correct for measurement errors, researchers often validate a subsample of records and then incorporate the information learned from this validation sample into the estimation. In practice, the validation sample is often selected using simple random sampling (SRS). However, SRS leads to inefficient estimates because it ignores information on the error-prone variables, which can be highly correlated to the unknown truth. Applying and extending ideas from the two-phase sampling literature, we propose optimal and nearly-optimal designs for selecting the validation sample in the classical measurement-error framework. We also present novel extensions of estimators that make use of all available data collected in two or more waves. We show through simulations that incorporating information from intermediate steps can lead to substantial gains in efficiency. These works are motivated by and illustrated in Multi-National HIV Research Cohorts.

About the speaker:

Dr Amorim is an Assistant Professor of Biostatistics. His research interests include developing novel statistical methods for problems arising in public health studies, semiparametric models for model misspecification, two-phase designs, measurement-error problems and ordinal data analysis.

https://www.vumc.org/biostatistics/person/gustavo-amorim

COVID-19 vaccine fatigue in Scotland: How do the trends in attrition rates for the second and third doses differ by age, sex, and council area?

Speaker: Robin Muegge

Affiliation: University of Glasgow

When: Thursday, 16 May 2024, 11:00 am to 12:00 pm

Where: 303-310

Abstract: Vaccine fatigue is the propensity for individuals to start but not finish a vaccination program with several doses, which thus means they are less protected. This is an especially important topic for COVID-19, where the vaccination program commonly consists of two doses followed by a booster vaccine dose to get full protection. COVID-19 vaccine hesitancy (the delay in acceptance or refusal of the first dose of the vaccine) has been studied extensively, and a few studies investigated the willingness to receive the booster vaccine dose. In contrast, attrition rates across subsequent doses caused by evolving vaccine fatigue have yet to be examined, which is the novel contribution of this paper. Our study focuses on Scotland, where the vaccine rollout began on 8th December 2020. We model vaccine attrition rates in the first transition (from doses one to two) and the second transition (from doses two to three) for the 32 council areas in Scotland. We estimate the effects of sex and transition, examine trends and patterns in the attrition rates by age group and council area and evaluate if these differ by sex or transition. We model the attrition rates with a hierarchical binomial logistic regression model that allows for flexible autocorrelation estimation for the corresponding neighbourhood and age group structures via correlated random effects models. Inference is based on a Bayesian paradigm, using integrated nested Laplace approximation (INLA). Our main findings are that attrition rates smoothly decrease with increasing age, that they are much higher in the second transition than in the first, that they are generally higher for males than females, and that the variation in attrition rates between age groups is greater for males than females.

At the end of the seminar, I will introduce my current work on outlier detection in areal data, titled “Disease mapping: What if Tobler's First Law of Geography doesn't hold?”

Bio: Robin Muegge is a PhD student in statistics from the University of Glasgow, UK. His research is in spatial and spatio-temporal areal data modelling under the supervision of Duncan Lee, Nema Dean, and Eilidh Jack. Robin completed his B.Sc. Mathematics at the Leibniz University of Hanover in Germany, and his M.Sc. Statistics at Portland State University, USA. He spent 11 weeks at the University of Wollongong, collaborating with Andrew Zammit Mangion, and is visiting the University of Auckland from the 13th to the 17th of May before returning to Glasgow.

Advanced methods for time series data applied to prediction of operating modes and detection of anomalies for wind turbines

Speaker: Hannah Yun

Affiliation: UoA

When: Monday, 13 May 2024, 10:00 am to 11:00 am

Where: 303-257

Abstract:

Wind turbines can be characterised by distinct operating modes that reflect production efficiency. In this talk, we focus on the forecasting problem for univariate discrete-valued time series of operating modes of a wind turbine. We define three prediction strategies to overcome the difficulties associated with missing data. These strategies are evaluated through experiments using five forecasting methods across two real-life datasets. Two of the forecasting methods have been introduced in the statistical literature as extensions of the well-known context algorithm: variable length Markov chains and Bayesian context tree. Additionally, we consider a Bayesian method based on conditional tensor factorisation and two different smoothers from the classical tools for time series forecasting. Each pair of prediction strategy and forecasting method is evaluated in terms of prediction accuracy versus computational complexity. We provide guidance on the methods that are suitable for forecasting the time series of operating modes. The prediction results demonstrate that high accuracy can be achieved with reduced computational resources.

We will also briefly discuss how recent advances in the field of dictionary learning can be tailored to detect equipment health deterioration in the case of wind turbines.

This is the PYR seminar.

New Methods for Fitting Hawkes Models with Large Data

Speaker: Conor Kresin

Affiliation: University of Otago

When: Thursday, 2 May 2024, 11:00 am to 12:00 pm

Where: 303-310

Abstract:

Hawkes processes are concise mathematical representations of diverse point process data, ranging from disease spread and wildfire occurrences to non-physical phenomena such as financial asset price movements. Models for point process data are often fit using maximum likelihood (MLE) or Markov Chain Monte Carlo (MCMC), but such methods are slow or computationally intractable for data with large n. In this talk, I will present a novel estimator based on the Stoyan-Grabarnik (sum of inverse intensity) statistic. Unlike MLE or MCMC approaches, the proposed estimator does not require approximation of a computationally expensive integral. I will show that under quite general conditions, this estimator is consistent for estimating parameters governing spatial-temporal point processes such as the Hawkes process and present simulations demonstrating the performance of the estimator. In the second portion of the talk, I will discuss increasingly flexible parametric Hawkes models, culminating in Continuous Long Short Term Memory (cLSTM) recurrent neural networks.
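
The sum-of-inverse-intensity idea can be illustrated on a toy problem. The sketch below is not the speaker's estimator. It assumes one simple way of using the Stoyan-Grabarnik statistic: for a correctly specified intensity, the sum of 1/intensity over the points in a cell has expectation equal to the cell's size, so a parameter can be chosen to minimise the squared discrepancy summed over cells. The constant-rate model, the cell partition, the point locations, and the name `sg_criterion` are all hypothetical. No integral of the intensity is evaluated.

```python
def sg_criterion(points, rate, cell_edges):
    """Sum over cells of (sum of 1/intensity at points in the cell - cell length)^2.

    The intensity is taken to be a constant `rate` (a homogeneous toy model).
    """
    total = 0.0
    for left, right in zip(cell_edges[:-1], cell_edges[1:]):
        inverse_intensity_sum = sum(1.0 / rate for t in points if left <= t < right)
        total += (inverse_intensity_sum - (right - left)) ** 2
    return total


# Hypothetical event times on [0, 10): two events per unit of time.
points = [0.25 + 0.5 * k for k in range(20)]
cell_edges = [float(k) for k in range(11)]        # ten unit-length cells
candidate_rates = [0.5 * k for k in range(1, 9)]  # grid of rates 0.5, 1.0, ..., 4.0

# Grid search: pick the rate whose inverse-intensity sums best match the cell sizes.
best_rate = min(candidate_rates, key=lambda r: sg_criterion(points, r, cell_edges))
```

For a Hawkes model the constant `rate` would be replaced by the conditional intensity evaluated at each event, with the grid search replaced by a numerical optimiser.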

About the speaker:

Conor Kresin is a lecturer in the Department of Mathematics and Statistics, University of Otago. His research interests include point process theory and applications, stochastic geometry, disease modelling, information theory, and causal inference.

Reproducible inference and model selection using bagged posteriors

Speaker: Jeffrey Miller

Affiliation: Harvard University

When: Thursday, 18 April 2024, 11:00 am to 12:00 pm

Where: 303-310

Abstract:

Under model misspecification, it is known that Bayesian posteriors often do not properly quantify uncertainty about true or pseudo-true parameters. Even more fundamentally, misspecification leads to a lack of reproducibility in the sense that the same model will yield contradictory posteriors on independent data sets from the true distribution. To improve reproducibility, an easy-to-use and widely applicable approach is to apply bagging to the Bayesian posterior ("BayesBag"); that is, to use the average of posterior distributions conditioned on bootstrapped datasets. To define a criterion for reproducible uncertainty quantification under misspecification, we consider the probability that two confidence sets constructed from independent data sets have nonempty overlap, and we establish a lower bound on this overlap probability that holds for any valid confidence sets. We prove that credible sets from the standard posterior can strongly violate this bound, indicating that it is not internally coherent under misspecification, whereas the bagged posterior typically satisfies the bound. We demonstrate on simulated and real data.
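
The bagging step described above can be sketched for a model where the posterior is available in closed form. The example assumes a normal likelihood with known variance and a normal prior on the mean, so each bootstrap posterior is normal and the bagged posterior is an equal-weight mixture of them. The data, prior settings, number of bootstrap datasets, and function names are hypothetical choices for illustration, not taken from the talk.

```python
import random
import statistics


def normal_posterior(data, prior_mean=0.0, prior_var=100.0, noise_var=1.0):
    """Posterior mean and variance for a normal mean with known noise variance."""
    precision = 1.0 / prior_var + len(data) / noise_var
    mean = (prior_mean / prior_var + sum(data) / noise_var) / precision
    return mean, 1.0 / precision


def bagged_posterior(data, n_bootstrap=200, seed=0):
    """Mean and variance of the average of posteriors over bootstrapped datasets."""
    rng = random.Random(seed)
    posteriors = [
        normal_posterior(rng.choices(data, k=len(data)))  # resample with replacement
        for _ in range(n_bootstrap)
    ]
    means = [m for m, _ in posteriors]
    variances = [v for _, v in posteriors]
    # Moments of an equal-weight mixture: mean of means, and
    # mean of variances plus the spread of the component means.
    bagged_mean = statistics.fmean(means)
    bagged_var = statistics.fmean(variances) + statistics.pvariance(means)
    return bagged_mean, bagged_var


data = [0.3, 1.9, -0.4, 2.2, 1.1, 0.7, 3.5, -1.2, 0.9, 1.6]  # hypothetical sample
standard_mean, standard_var = normal_posterior(data)
bagged_mean, bagged_var = bagged_posterior(data)
```

In this conjugate setting every bootstrap posterior has the same variance as the standard posterior, so the bagged posterior is wider by the variability of the posterior mean across bootstrap datasets.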

About the speaker:

Jeff Miller is an Associate Professor of Biostatistics, Harvard University. He is interested in using statistics to understand the molecular mechanisms of diseases of aging. His methodological research focuses on robustness to model misspecification, nonparametric Bayesian models, frequentist analysis of Bayesian methods, and efficient algorithms for inference in complex models.

https://www.hsph.harvard.edu/profile/jeffrey-miller/

Practical Functions: Practically Magic

Speaker: Nicholas Tierney

Affiliation: Telethon Kids Institute

When: Thursday, 21 March 2024, 11:00 am to 12:00 pm

Where: 303-310

Abstract: I think the highest value skillset in statistical programming is knowing how to write good functions. Functions are often taught as a tool to avoid repetition using the mnemonic DRY: Don't Repeat Yourself. Whilst DRY is both true and real, I think functions are at their best when they encapsulate expression and are easy to reason with. That is, DRY is sufficient, but not necessary. Writing good functions is more than esoteric aesthetics. We need to be able to reason with our code in statistics. We often don't have the capacity to write tests to show our code is "correct". Instead, we need to rely on our ability to reason with, trust, and verify that the code works as it should. I believe writing good functions that encapsulate expressions and are able to be reasoned with are how we can ensure our code, and therefore our methods, and our analyses, work as they should. In this talk I will discuss some practical ideas on writing a good function, how to identify bad ones, and how to move between the two states.

About the speaker: I work as a research software engineer with Nick Golding on the greta R package for statistical modelling, and on implementing novel statistical methods for infectious diseases like COVID-19 and malaria. I work at the Telethon Kids Institute, which is based in Perth, Western Australia, but I work remotely in Launceston, Tasmania. I am a strong advocate for free and open source software, and have written several R packages to improve data analysis.

https://www.njtierney.com/about/

Sampling older populations: methods and challenges in the IDEA Programme

Speaker: Ngaire Kerse

Affiliation: UoA

When: Wednesday, 20 March 2024, 11:00 am to 12:00 pm

Where: 303-257

Abstract: Dementia is a global health priority. The IDEA programme is a dementia prevalence study aiming to establish the true prevalence of dementia among older adults in Aotearoa New Zealand, with a particular emphasis on diverse ethnic groups. In this seminar, we will provide an overview of the methods and challenges associated with sampling older adults. We will discuss the various sampling strategies employed for each setting, including the community, retirement villages, and aged residential care.

About the speaker: Ngaire Kerse is the Joyce Cook Chair in Ageing Well, a GP, and Professor of General Practice and Primary Health Care at the University of Auckland. With over 350 publications and 50 research grants, she is an international expert in falls prevention, bi-cultural ageing, and primary health care. Leading multiple research teams, Ngaire spearheads projects such as LiLACS NZ, focusing on equity, health service use, and well-being in advanced age. Her work on fall prevention includes studies on older individuals post-stroke and in residential care. Currently, she heads the IDEA programme, investigating the prevalence and impact of dementia in Aotearoa.

https://profiles.auckland.ac.nz/n-kerse

Two Applications of Regression Averaging

Speaker: Norman Matloff

Affiliation: University of California, Davis

When: Thursday, 7 March 2024, 3:00 pm to 4:00 pm

Where: 303-310

Abstract:

My term "regression averaging" refers to first running a regression estimation procedure, be it a linear model, k-Nearest Neighbors or whatever, then averaging the fitted values over some region. I will present two applications of this. The first is on the topic of dealing with missing values, specifically in a context of prediction rather than effect estimation. The second is in the area of removing bias with respect to sensitive variables, say race or gender, in a prediction model.
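
The two steps in that definition, fit and then average, can be shown in a few lines. This is a minimal sketch assuming a one-predictor least-squares line as the regression procedure and an interval of predictor values as the region. The data and the names `fit_line` and `regression_average` are hypothetical, and the sketch does not reproduce either of the applications in the talk.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def regression_average(xs, ys, in_region):
    """Fit the regression on all data, then average fitted values over a region."""
    intercept, slope = fit_line(xs, ys)
    fitted = [intercept + slope * x for x in xs if in_region(x)]
    if not fitted:
        raise ValueError("no observations fall in the requested region")
    return sum(fitted) / len(fitted)


xs = [float(i) for i in range(10)]   # hypothetical predictor values 0, 1, ..., 9
ys = [2.0 * x + 1.0 for x in xs]     # hypothetical noise-free responses
average = regression_average(xs, ys, lambda x: 2.0 <= x <= 4.0)
```

Any other fitting procedure, such as k-Nearest Neighbors, could replace `fit_line` without changing the averaging step.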

About the speaker:

Norman Matloff is a professor in the Department of Computer Science at the University of California, Davis. Professor Matloff’s research areas include parallel processing (especially software distributed shared memory), statistical computing, and predictive analytics.

https://faculty.engineering.ucdavis.edu/matloff/

An Overview: Data Analysis for Space-based Gravitational Wave Observations

Speaker: Ollie Burke

Affiliation: Laboratoire des 2 infinis - Toulouse (L2IT)

When: Thursday, 7 March 2024, 2:00 pm to 3:00 pm

Where: 303S-561

Abstract:

Current observations of gravitational waves (GWs) through ground-based detectors are having a pronounced effect on our understanding of the universe. Due to the presence of the Earth, ground-based detectors are limited in sensitivity to lower-frequency GWs, losing access to the rich science that can be reaped from higher-mass black hole coalescences. The proposed space-based detector, the Laser Interferometer Space Antenna (LISA), eliminates sources of noise from the Earth and will provide access to observations of GWs in the rich mHz frequency band, and thus to higher-mass binaries. The aim of this talk is to be pedagogical in nature: reviewing GWs up to the first detection, GW150914, and providing an overview of LISA-specific sources with a simple example of Bayesian inference applied to a toy GW model. We will finish on the prospects for the LISA instrument by discussing both current work and future challenges in the context of data analysis.

https://inspirehep.net/authors/1976434

Overview of R Package predictmeans

Speaker: Dongwen Luo

Affiliation: AgResearch

When: Thursday, 7 March 2024, 11:00 am to 12:00 pm

Where: 303-310

Abstract:

The "predictmeans" R package provides a comprehensive set of functions for diagnosis and inference from a range of models common in statistical analysis. These models include those generated by "aov", "lm", "glm", "gls", "lme", "lmer", "glmer", "glmmTMB" and "semireg". Inferences include key statistical metrics such as predicted means and standard errors, contrasts, multiple comparisons, permutation tests, and adjusted R-squared values, as well as graphical representations. This presentation will demonstrate the key capabilities of this package through practical examples, with a particular focus on semiparametric regression techniques and the calculation of adjusted R-squared values for generalized mixed-effects models.

https://www.researchgate.net/scientific-contributions/Dongwen-Luo-2004321262

Healthcare and Public Health Monitoring and Management

Speaker: Kwok-Leung Tsui

Affiliation: Virginia Polytechnic Institute and State University

When: Monday, 4 March 2024, 11:00 am to 12:00 pm

Where: 303-310

Abstract:

Due to advances in computational power, sensor technologies, and data collection tools, the field of healthcare and public health monitoring and management has evolved over the past several decades under different names in different application domains, such as statistical process control (SPC), process monitoring, health surveillance, prognostics and health management (PHM), personalized medicine, etc. There are tremendous opportunities for interdisciplinary research in system monitoring through the integration of SPC, system informatics, data analytics, PHM, and personalized health management. In this talk we will present our views and experience on the evolution of systems monitoring and health management, its challenges and opportunities, as well as its applications in both healthcare surveillance and public health management.

About the speaker:

Kwok L Tsui is a professor in the Grado Department of Industrial and Systems Engineering at Virginia Polytechnic Institute and State University. Tsui’s current research interests include data science and data analytics, surveillance in healthcare and public health, personalized health monitoring, prognostics and systems health management, calibration and validation of computer models, process control and monitoring, and robust design and Taguchi methods.

https://www.ise.vt.edu/people/faculty/tsui.html

Modelling the stochastic gravitational wave background and noise in LISA

Speaker: Nazeela Aimen

Affiliation: UoA

When: Thursday, 29 February 2024, 1:00 pm to 2:00 pm

Where: 303-310

Abstract:

Gravitational waves (GWs) are ripples in space-time produced by some of the universe's most violent and energetic events. Since the first detection through ground-based detectors in 2015, they have opened a new pathway for understanding our universe. A new frequency range for detection will be opened by the Laser Interferometer Space Antenna (LISA), a space-based observatory expected to launch in the 2030s. One potential detection target for LISA is the stochastic gravitational wave background (SGWB), which is the superposition of many unresolved GWs. Detecting the SGWB holds significant promise for insights into the early universe and astrophysical sources. However, one of the most critical challenges is distinguishing it from LISA's stochastic instrumental noise. To address this problem, we investigate the parameter estimation of the SGWB and LISA noise using a Bayesian framework comprising parametric and non-parametric models. We propose three parametric models for the SGWB: power law, broken power law and single peak. We fit the noise with a non-parametric model, using a prior based on a mixture of penalized splines (P-splines). P-splines can estimate spectral densities with sharp peaks and abrupt changes, owing to the flexibility of B-splines with knots placed according to the variation in the data. We combine a fixed number of B-splines with a simple difference penalty, which controls the degree of smoothness of the power spectral density (PSD). We demonstrate accurate estimates of the PSD in a simulation study and in a case of realistic LISA noise, using only P-splines, and will extend our analysis to a full implementation of our model.
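To make the "simple difference penalty" concrete, the sketch below computes the standard P-spline roughness penalty: a smoothing weight times the sum of squared finite differences of the B-spline coefficients. The function names, the default second-order difference, the weight `lam` and the example coefficient vectors are assumptions for illustration; the abstract does not state which difference order or weight the authors use.

```python
def differences(coefs, order=2):
    """Return the `order`-th finite differences of a coefficient sequence."""
    d = list(coefs)
    for _ in range(order):
        d = [b - a for a, b in zip(d, d[1:])]
    return d


def difference_penalty(coefs, lam=1.0, order=2):
    """P-spline roughness penalty: lam * sum of squared differences."""
    return lam * sum(v * v for v in differences(coefs, order))


# Hypothetical B-spline coefficient vectors.
smooth = [1.0, 2.0, 3.0, 4.0, 5.0]   # linear trend: second differences vanish
wiggly = [1.0, 3.0, 1.0, 3.0, 1.0]   # oscillating: heavily penalized

print(difference_penalty(smooth))  # 0.0
print(difference_penalty(wiggly))  # 48.0
```

A larger `lam` pushes neighbouring coefficients towards a smooth sequence, which is how the penalty controls the smoothness of the estimated curve.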

This is the PYR seminar.

Optimising Healthcare Pathways for Elderly Patients: Wellbeing Equity and Efficiency

Speaker: Yvonne Li

Affiliation: UoA

When: Monday, 19 February 2024, 2:00 pm to 3:00 pm

Where: 303-G14

Abstract:

This work explores enhancing healthcare for elderly patients in Aotearoa New Zealand through queueing theory and simulations, responding to the demographic shift towards an aging population. It addresses the need for more effective and equitable healthcare, considering workforce shortages and access disparities. By developing mathematical models for patient flow, waiting time, and resource allocation, this research underscores the necessity of models that adjust priorities and routing to alleviate service congestion, aiming to improve resource use, access equality, and elderly patient wellbeing.
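For readers unfamiliar with queueing theory, the following sketch shows the kind of waiting-time calculation such models build on, using the textbook single-server M/M/1 queue. The choice of an M/M/1 model, the function name and the example rates are assumptions for illustration only; the abstract does not specify which queueing models the research uses.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state metrics for an M/M/1 queue (Poisson arrivals, exponential service)."""
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable: arrival rate must be below service rate")
    utilisation = arrival_rate / service_rate
    wait_in_queue = utilisation / (service_rate - arrival_rate)  # mean time before service starts
    time_in_system = 1.0 / (service_rate - arrival_rate)         # mean waiting plus service time
    return utilisation, wait_in_queue, time_in_system


# Hypothetical clinic: 4 patients arrive per hour, 5 can be served per hour.
rho, wq, w = mm1_metrics(4.0, 5.0)
print(rho, wq, w)
```

Waiting time grows sharply as the arrival rate approaches the service rate, which is why adjusting priorities and routing can matter for congestion.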

This is Yvonne's PYR seminar.

Close-kin mark-recapture methods to estimate demographic parameters of mosquitoes

Speaker: John Marshall

Affiliation: University of California, Berkeley

When: Wednesday, 31 January 2024, 3:00 pm to 4:00 pm

Where: 303-310

Abstract:

Close-kin mark-recapture (CKMR) methods have recently been used to infer demographic parameters such as census population size and survival for fish of interest to fisheries and conservation. These methods have advantages over traditional mark-recapture methods as the mark is genetic, removing the need for physical marking and recapturing that may interfere with parameter estimation. For mosquitoes, the spatial distribution of close-kin pairs has been used to estimate mean dispersal distance, of relevance to vector-borne disease transmission and novel biocontrol strategies. Here, we extend CKMR methods to the life history of mosquitoes and comparable insects. We derive kinship probabilities for mother-offspring, father-offspring, full-sibling and half-sibling pairs, where an individual in each pair may be a larva, pupa or adult. A pseudo-likelihood approach is used to combine the marginal probabilities of all kinship pairs. To test the effectiveness of this approach at estimating mosquito demographic parameters, we develop an individual-based model of mosquito life history incorporating egg, larva, pupa and adult life stages. The simulation labels each individual with a unique identification number, enabling close-kin relationships to be inferred for sampled individuals. Using the dengue vector Aedes aegypti as a case study, we find the CKMR approach provides unbiased estimates of adult census population size, adult and larval mortality rates, and larval life stage duration for logistically feasible sampling schemes. Considering a simulated population of 3,000 adult mosquitoes, estimation of adult parameters is accurate when ca. 40 adult females are sampled biweekly over a three-month period. Estimation of larval parameters is accurate when adult sampling is supplemented with ca. 120 larvae sampled biweekly over the same period. The methods are also effective at detecting intervention-induced increases in adult mortality and decreases in population size.
As the cost of genome sequencing declines, CKMR holds great promise for characterizing the demography of mosquitoes and comparable insects of epidemiological and agricultural significance.

About the speaker:

John Marshall is a Professor in Residence of Biostatistics and Epidemiology whose research broadly supports efforts to control and eliminate mosquito-borne diseases such as malaria, dengue, and Zika virus.

https://publichealth.berkeley.edu/people/john-marshall/

Self-reinforced Knothe–Rosenblatt rearrangements for high-dimensional stochastic computation

Speaker: Tiangang Cui

Affiliation: University of Sydney

When: Wednesday, 31 January 2024, 11:00 am to 12:00 pm

Where: 303-310

Abstract:

Characterizing intractable high-dimensional random variables is a fundamental task in stochastic computation. It has broad applications in statistical physics, machine learning, uncertainty quantification and beyond. The recent surge of transport maps offers new insights into this task by constructing variable transformations that couple intractable random variables with tractable reference random variables. In this talk, we will present numerical methods that build the Knothe–Rosenblatt (KR) rearrangement, a family of transport maps in triangular form, in high dimensions. We first design function approximation tools to realize the KR rearrangement in a way that ensures the order-preserving property with controlled statistical errors. We then introduce a self-reinforced procedure to adaptively precondition the construction of KR rearrangements, significantly expanding their capability of handling random variables with complicated nonlinear interactions and concentrated density functions. We demonstrate the efficiency of the resulting self-reinforced KR rearrangements on applications in statistical learning and uncertainty quantification, including parameter estimation for dynamical systems, PDE-constrained inverse problems, and rare event estimation.

About the speaker:

Tiangang is a Senior Lecturer in the School of Mathematics and Statistics, University of Sydney. His research interests are broadly in computational mathematics for scientific machine learning and data science. He develops mathematically rigorous computational methods for statistical inverse problems, data assimilation and uncertainty quantification. These methods aim to optimally learn hidden structures and driving factors of complex mathematical models from data for issuing certified model predictions and making risk-averse decisions.

https://www.fastfins.org/
