Department of Statistics


2023 Seminars



Modelling Self-Excitement in Ecology

Speaker: Alec van Helsdingen

Affiliation: UoA

When: Friday, 24 November 2023, 11:00 am to 12:00 pm

Where: 303-153

Self-excitement is a phenomenon where one event induces others to occur later (e.g. earthquake aftershocks), and may be modelled with the Hawkes process. This seminar will focus on two applications in ecological statistics.
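As a rough illustration of self-excitement, the sketch below evaluates the conditional intensity of a Hawkes process with an exponential kernel. The kernel form and the parameter values `mu`, `alpha` and `beta` are assumptions chosen for illustration; the abstract does not state the specification used in the talk.

```python
import math

def hawkes_intensity(t, history, mu=0.5, alpha=0.8, beta=1.2):
    """Conditional intensity at time t given earlier event times.

    lambda(t) = mu + sum over t_i < t of alpha * exp(-beta * (t - t_i)),
    where mu is the baseline rate and each past event adds a decaying boost.
    The parameter values are illustrative assumptions only.
    """
    excitation = sum(alpha * math.exp(-beta * (t - ti)) for ti in history if ti < t)
    return mu + excitation

# Hypothetical event times: two events close together, then a later one.
events = [1.0, 1.3, 4.0]
for t in (0.5, 1.5, 3.0, 4.1):
    print(f"t={t:.1f}  intensity={hawkes_intensity(t, events):.3f}")
```

The intensity sits at the baseline before any event, jumps after each event, and decays back towards the baseline in between.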

The first application concerns spatial capture-recapture (SCR), a method for estimating animal populations from sighting data (e.g. from motion-triggered cameras). Standard SCR methods assume animal movements are temporally independent, but this is unrealistic because an animal is most likely to be close to where it was last seen. Our proposed solution is based on the Hawkes process and makes the detection rate of a given animal a function not only of the location but also of where and when the animal was last seen. As a result, a detection of an animal “self-excites” further detections at that or nearby cameras. We will show that our model gives more accurate results than traditional SCR in situations where SCR population estimates may be negatively biased.

The second application models cues emitted by sperm whales. The times of these cues form a temporal point pattern that is clearly self-exciting, but the Hawkes process is too restrictive because the clicks do not adhere to the assumptions of the Poisson process. Specifically, they are more evenly spaced (under-dispersed) than expected. Motivated by this example, we have developed a framework for incorporating under-dispersion and over-dispersion into the Hawkes process. We use a Weibull rather than an exponential distribution to model the time between events, giving more flexibility and a better fit. We use our new formulation to model the cues of an individual whale, confirm our intuition that the cues are under-dispersed, and quantify the relationship between the cue rate and covariates.
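To see why a Weibull distribution can capture both under- and over-dispersion, one can look at the coefficient of variation (CV) of the waiting times. The sketch below uses the standard Weibull moment formulas; it is only an illustration of the dispersion idea and is not the speaker's model, whose details are not given in the abstract.

```python
import math

def weibull_cv(shape):
    """Coefficient of variation of a Weibull distribution with the given shape.

    The CV does not depend on the scale parameter. A shape of 1 recovers the
    exponential distribution (CV = 1); a shape above 1 gives more evenly spaced
    waiting times (CV < 1), and a shape below 1 gives burstier ones (CV > 1).
    """
    m1 = math.gamma(1 + 1 / shape)   # E[X] for unit scale
    m2 = math.gamma(1 + 2 / shape)   # E[X^2] for unit scale
    return math.sqrt(m2 / m1 ** 2 - 1)

for shape in (0.5, 1.0, 2.0):
    print(f"shape={shape}: CV={weibull_cv(shape):.3f}")
```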

This is the PYR seminar.

An automatic method for the identification of cycles in Covid-19 time series data

Speaker: Miaotian (Vivian) Li

Affiliation: UoA

When: Thursday, 23 November 2023, 10:00 am to 11:00 am

Where: 303-148

Abstract :

All previous methods for identifying cycles in Covid-19 daily and weekly data involve a subjective interpretation of the results. This poses difficulties for researchers interested in conducting a comprehensive study that analyzes the presence of cycles for each country/territory/area (CTA). During the first year of PhD studies, we have designed an algorithm that automatically detects the fundamental period T0 and the harmonics T0/2,...,T0/5, where T0=7 days for daily data and T0=52 weeks for weekly data. We have tested the new algorithm by applying it to the time series from 236 CTAs for which the World Health Organization (WHO) collected Covid-19 data. The detection results we have obtained confirm the findings previously reported by other researchers.

In this talk, we will present all the details of our algorithm and comment on the results obtained in experiments with Covid-19 time series data. We will also discuss a proposal for evaluating the dissimilarity between the time series collected for two different CTAs.
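For readers unfamiliar with cycle detection, a plain periodogram is one common starting point. The sketch below is not the algorithm from the talk (its details are not given here); it simply locates the strongest periodic component in a synthetic daily series. The series, its noise level and the function name `dominant_period` are all assumptions made for illustration.

```python
import numpy as np

def dominant_period(series):
    """Return the period (in samples) of the largest periodogram peak."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()                          # remove the constant level
    power = np.abs(np.fft.rfft(x)) ** 2       # raw periodogram
    freqs = np.fft.rfftfreq(len(x), d=1.0)    # cycles per sample
    k = 1 + np.argmax(power[1:])              # skip the zero frequency
    return 1.0 / freqs[k]

# Synthetic daily counts with a weekly cycle (52 full weeks of data).
rng = np.random.default_rng(0)
t = np.arange(364)
daily = 100 + 15 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 5, size=t.size)
print(dominant_period(daily))
```

Picking the largest peak still leaves open how to decide whether a peak is significant, which is the kind of subjective step an automatic method has to replace.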

This is the PYR seminar.

Using Convolutional Autoencoders for Signal Detection of Extreme Mass Ratio Inspirals Detected by the LISA Mission

Speaker: Amin Boumerdassi

Affiliation: UoA

When: Tuesday, 21 November 2023, 2:00 pm to 3:00 pm

Where: 303-148

Extreme Mass Ratio Inspirals (EMRIs) are gravitational wave (GW) events produced by the mergers of pairs of massive objects such as black holes and neutron stars whose mass ratio exceeds 10,000. Generally, GWs are caused by the acceleration of masses, which makes the distance between two points oscillate in time. These oscillations can be detected by measuring the interference of light beams which propagate at a fixed speed but traverse varying distances when a GW passes through. Traditionally, the detection of GW events was performed through matched filtering, in which a detected signal would be compared to millions of variations of a template model to find the closest fit to the detected signal. In the case of EMRIs, this is computationally infeasible owing to the large parameter space of EMRI waveform models, years-long waveform duration, and large file size. My work attempts to overcome these problems by leveraging recent developments in the rapid generation of EMRI waveforms, paired with machine learning (ML) techniques which can perform a given task quickly and with little computational requirement. The ML model of choice is the convolutional autoencoder, which learns to map input data to a low-dimensional representation and back into a reconstruction of the original input. A trained autoencoder is expected to poorly reconstruct data that is not of the same kind as its training data. Hence, the problem of detecting EMRI signals can be framed as an anomaly detection problem in which non-EMRI signals are treated as anomalies to be poorly reconstructed by an autoencoder trained to reproduce EMRIs. This could be used to analyse EMRIs detected by the space-based GW detector LISA, due to be launched in the early 2030s. Successful detections of EMRIs will open the door for novel tests of General Relativity, the theory of gravity which itself led to the prediction of GWs.
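The reconstruction-error idea can be shown on a toy problem. The sketch below swaps the convolutional autoencoder for a linear one (PCA with two components) and swaps EMRI waveforms for simple sinusoids, so it is a stand-in for the principle only, not the model described in the talk. All data and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 64, endpoint=False)

def make_signals(n):
    """Toy 'signal' class: sinusoids with random amplitude and phase."""
    amp = rng.uniform(0.5, 1.5, size=(n, 1))
    phase = rng.uniform(0.0, 2 * np.pi, size=(n, 1))
    return amp * np.sin(2 * np.pi * 4 * t + phase)

# "Train" a linear autoencoder: keep the top two principal directions.
train = make_signals(200)
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
components = vt[:2]

def reconstruction_error(x):
    """Mean squared error after encoding to 2 dimensions and decoding."""
    z = (x - mean) @ components.T       # encode
    x_hat = z @ components + mean       # decode
    return np.mean((x - x_hat) ** 2, axis=1)

signal_err = reconstruction_error(make_signals(50))
noise_err = reconstruction_error(rng.normal(0, 1, size=(50, 64)))
print("typical error on signals:", signal_err.mean())
print("typical error on noise:  ", noise_err.mean())
```

Inputs of the training kind are reconstructed almost perfectly, while pure noise is not, so a threshold on the reconstruction error separates the two classes in this toy setting.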

This is the PYR seminar.

Estimating Customer Impatience in a Service System With Unobserved Balking

Speaker: Prof. Michel Mandjes

Affiliation: Universiteit van Amsterdam

When: Friday, 3 November 2023, 11:00 am to 12:00 pm

Where: 303-G14

Abstract:

In this talk I'll discuss a service system in which arriving customers are provided with information about the delay they will experience. Based on this information, they decide to wait for service or leave the system. Specifically, every customer has a patience threshold, and they balk if the observed delay is above the threshold. The main objective is to estimate the parameters of the customers' patience-level distribution and the corresponding potential arrival rate, using knowledge of the actual queue-length process only. The main complication and distinguishing feature of our setup lies in the fact that customers who decide not to join are not observed, and remarkably, we manage to devise a procedure to estimate the underlying patience and arrival rate parameters. The underlying model is a multiserver queue with a Poisson stream of customers, enabling evaluation of the corresponding likelihood function of the state-dependent effective arrival process.

We establish strong consistency of the MLE and derive the asymptotic distribution of the estimation error. Several applications and extensions of the method are discussed. The performance is further assessed through a series of numerical experiments. By fitting parameters of hyperexponential and generalized hyperexponential distributions, our method provides a robust estimation framework for any continuous patience-level distribution.

The last part of the talk will discuss the setting in which the arrival process is not constant but follows a periodic pattern (say, following a daily pattern) -- in this setup various technical hurdles have to be overcome, primarily related to establishing an appropriate regeneration structure.

About the speaker : Prof. Mandjes is a professor in probability and operations research at the University of Leiden. His research interests include stochastic networks, queueing theory, stochastic processes, operations research, large deviations, simulation, and performance.

https://www.universiteitleiden.nl/en/staffmembers/michel-mandjes#tab-1

Modelling Warranty Claims using Geometric-like Processes

Speaker: Sarah Marshall

Affiliation: UoA

When: Wednesday, 1 November 2023, 2:00 pm to 3:00 pm

Where: 303-310

Abstract:

The geometric process can be used to model the occurrence of events with an underlying monotonic trend. This type of trend can be observed in many practical problems in reliability, in particular in the recurrent failures of ageing repairable systems. When both the operational and repair times are of interest and are impacted by ageing, the alternating geometric process can be used. Two approaches for computing the mean (i.e. the expected number of events by a given time) and the variance of the alternating geometric process are presented and applied to warranty cost analysis. Various extensions of the geometric process have been proposed to provide greater flexibility in situations involving trends. This talk provides an overview of the related geometric-like processes, focusing on the alpha series process. The alternating geometric process and the alternating alpha series process are applied to warranty data from an automotive manufacturer and are shown to be superior to an alternating renewal process.
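A geometric process can be simulated directly from its definition: the gaps X_k are such that ratio^(k-1) * X_k form a renewal process. The sketch below assumes exponential underlying gaps and illustrative parameter values; neither choice comes from the talk.

```python
import numpy as np

def simulate_geometric_process(n_events, ratio, mean_first, rng):
    """Simulate gaps X_k = Y_k / ratio**(k-1) with Y_k i.i.d. exponential.

    A ratio above 1 gives stochastically shrinking gaps (e.g. an ageing
    system failing more often); a ratio of 1 reduces to a renewal process.
    The exponential choice for Y_k is an assumption for illustration.
    """
    y = rng.exponential(mean_first, size=n_events)
    return y / ratio ** np.arange(n_events)

def expected_gaps(n_events, ratio, mean_first):
    """Expected k-th gap: mean_first / ratio**(k-1)."""
    return mean_first / ratio ** np.arange(n_events)

rng = np.random.default_rng(5)
gaps = simulate_geometric_process(10, ratio=1.1, mean_first=100.0, rng=rng)
event_times = np.cumsum(gaps)
print(np.round(event_times, 1))
```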

About the speaker : Dr Sarah Marshall is a Senior Lecturer in the Department of Information Systems and Operations Management at the University of Auckland Business School, New Zealand. Her research focuses on the use of operations research to address problems of interest to business and industry. She has expertise in deterministic and stochastic modelling, simulation, and analytics, and has applied these across a variety of domains, such as remanufacturing, healthcare, rainfall modelling and water resource management. Currently, her work focusses on the use of geometric-like processes to model ageing repairable systems, with applications in reliability and warranty analysis.

https://profiles.auckland.ac.nz/sarah-marshall

An Automatic Finite-Sample Robustness Check: Can Dropping a Little Data Change Conclusions?

Speaker: Tamara Broderick

Affiliation: Massachusetts Institute of Technology

When: Wednesday, 1 November 2023, 11:00 am to 12:00 pm

Where: 303-310

Abstract : Practitioners will often analyze a data sample with the goal of applying any conclusions to a new population. For instance, if economists conclude microcredit is effective at alleviating poverty based on observed data, policymakers might decide to distribute microcredit in other locations or future years. Typically, the original data is not a perfect random sample from the population where policy is applied -- but researchers might feel comfortable generalizing anyway so long as deviations from random sampling are small, and the corresponding impact on conclusions is small as well. Conversely, researchers might worry if a very small proportion of the data sample was instrumental to the original conclusion. So we propose a method to assess the sensitivity of statistical conclusions to the removal of a very small fraction of the data set. Manually checking all small data subsets is computationally infeasible, so we propose an approximation based on the classical influence function. Our method is automatically computable for common estimators. We provide finite-sample error bounds on approximation performance and a low-cost exact lower bound on sensitivity. We find that sensitivity is driven by a signal-to-noise ratio in the inference problem, does not disappear asymptotically, and is not decided by misspecification. Empirically we find that many data analyses are robust, but the conclusions of several influential economics papers can be changed by removing (much) less than 1% of the data.
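The influence-function idea can be sketched for an ordinary least squares slope. The code below is a simplified illustration on simulated data, not the authors' software: it scores each observation by its first-order effect on the slope, drops the few observations predicted to lower the slope most, and compares the prediction with an exact refit. The data-generating values and the choice of `k` are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# First-order (influence-function) estimate of the change in the slope when
# observation i is removed: minus the slope entry of (X'X)^{-1} x_i e_i.
xtx_inv = np.linalg.inv(X.T @ X)
slope_change = -(X @ xtx_inv[:, 1]) * resid

k = 4                                       # drop 2% of the data
drop = np.argsort(slope_change)[:k]         # removals that lower the slope most
approx_slope = beta[1] + slope_change[drop].sum()

# Exact answer for the same subset, obtained by refitting without it.
keep = np.setdiff1d(np.arange(n), drop)
beta_refit, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
exact_slope = beta_refit[1]

print("full-data slope:       ", beta[1])
print("approx. after dropping:", approx_slope)
print("exact after dropping:  ", exact_slope)
```

Sorting the scores avoids enumerating every small subset, which is what makes the check cheap; the refit at the end verifies the approximation for the one subset it selects.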

About the speaker : Tamara is an A/Professor in the Electrical Engineering and Computer Science Department at MIT and has been awarded several honors, including the Evelyn Fix Memorial, the Savage Award, and the National Science Foundation Career Award. She works in the areas of machine learning and statistics, particularly in Bayesian statistics and graphical models with an emphasis on scalable, nonparametric, and unsupervised learning.

https://tamarabroderick.com/

Cutting Feedback with copula models

Speaker: David Nott

Affiliation: National University of Singapore

When: Wednesday, 25 October 2023, 3:00 pm to 4:00 pm

Where: 303-310

Abstract : Complex models are often specified through a collection of coupled submodels or "modules". Bayesian inference for such models is attractive in principle, but the presence of a misspecified module can adversely affect inferences about parameters beyond those appearing in the problematic module. "Cutting feedback" is a modified Bayesian inference method which attempts to mitigate the impact of such suspect modules. In this talk we consider cutting feedback for copula models, which are multivariate models defined by separately specifying marginal distributions and the dependence structure through a copula function. We treat the marginals and the copula function as two distinct modules for modular Bayesian inference, and consider two types of cut posterior distributions. The first limits the influence of a misspecified copula on inference for the marginals, which is a Bayesian analogue of the popular Inference for Margins (IFM) estimator. The second limits the influence of misspecified marginals on inference for the copula parameters by using a rank likelihood to define the cut model. Properties of these cut posterior distributions and their computation will be discussed, and the efficacy of the new methodology demonstrated for a substantive multivariate time series application from macroeconomic forecasting. In the latter, cutting feedback from misspecified marginals improves posterior inference and predictive accuracy greatly, compared to conventional Bayesian inference. This is joint work with Weichang Yu, Michael Smith and David Frazier.

About the speaker : David is an A/Professor in the Department of Statistics and Applied Probability at the National University of Singapore. His research areas are Bayesian model selection, model misspecification, Bayesian nonparametrics, and approximations.

https://scholar.google.com/citations?user=wIG29sYAAAAJ&hl=en

Exploring the deduction of the kinematic properties of globular cluster populations: A Bayesian approach using nested sampling

Speaker: Yuan Li

Affiliation: UoA

When: Wednesday, 25 October 2023, 12:00 pm to 1:00 pm

Where: 303-310

The most appealing star clusters in the universe are the globular clusters. One can learn more about galaxies' properties and formation by studying globular clusters. Here, two galaxies, NGC 1052-DF4 and M31, have been investigated using Bayesian inference. First, it is interesting to see whether the globular clusters of the NGC 1052-DF4 galaxy contain rotational components and how this may help to explain the mass of the galaxy and the dark matter problem, as it has been claimed that NGC 1052-DF4 lacks dark matter. The rotational features of seven globular clusters in the NGC 1052-DF4 system were examined. The results demonstrated that the non-rotation model outperformed the rotation model. A tiny value of amplitude, A, and a large value of velocity dispersion indicate that the estimated mass of the galaxy stays the same as in the previous investigation. Moreover, the result confirms that there is very little dark matter in the NGC 1052-DF4 system. Second, another main goal of this investigation is to explore how metallicity and substructure affect the rotational characteristics of 397 M31 globular clusters. By fitting six Bayesian models to the data, the findings showed that the younger globular clusters of the M31 galaxy rotate faster, and in a different direction, than the older ones. It is also concluded that M31’s younger globular clusters rotate in the same direction as Andromeda’s stellar disk.

This is the PYR seminar.

Is PCA telling statistical lies?

Speaker: Nuwan Weeraratne

Affiliation: University of Waikato

When: Wednesday, 25 October 2023, 11:00 am to 12:00 pm

Where: 303-310

Abstract : The principal component analysis (PCA) is a statistical method that quantifies the relationship between each variable using the covariance matrices, evaluates the direction of the distribution of the data using the eigenvectors, and evaluates the relative significance of those directions using the eigenvalues. But, as the usual covariance estimator does not converge to the true covariance matrix, standard PCA performs poorly in the n less than p high dimensional settings. In this study, inspired by a fundamental issue associated with mean estimation when n less than p, we examine the advantages of employing a multivariate generalization to covariance matrix estimation, of a well-known U-estimator for the (univariate) variance. In simulation experiments we demonstrate (typically small, but) persistent improvements in the estimation of principal components versus known ground truth, with respect to the angular separation between the population and sample principal components (PCs).
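The evaluation metric mentioned above, the angle between a population principal component and its sample estimate, can be computed as in the sketch below. The simulation uses the usual sample covariance (via an SVD of the centred data) in an n < p setting with one dominant direction; it does not implement the U-estimator-based approach from the talk, and the dimensions and variances are assumptions.

```python
import numpy as np

def angle_degrees(u, v):
    """Angle between two directions, ignoring sign, in degrees (0 to 90)."""
    cos = abs(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cos, 0.0, 1.0))))

rng = np.random.default_rng(3)
n, p = 20, 100                        # fewer observations than variables
sd = np.ones(p)
sd[0] = 5.0                           # one high-variance population direction
true_pc = np.zeros(p)
true_pc[0] = 1.0

data = rng.normal(size=(n, p)) * sd
centered = data - data.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
sample_pc = vt[0]                     # leading sample principal component

pc_angle = angle_degrees(true_pc, sample_pc)
print("angle between population and sample PC:", pc_angle)
```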

About the speaker : Nuwan Weeraratne is a PhD student under the supervision of Dr Jason Kurz at the University of Waikato.

https://scholar.google.com/citations?user=KQC-zGsAAAAJ&hl=en

StableMate: a new statistical method to select stable predictors in omics data

Speaker: Professor Kim-Anh Lê Cao

Affiliation: Melbourne Integrative Genomics, School of Mathematics and Statistics, University of Melbourne

When: Wednesday, 18 October 2023, 11:00 am to 12:00 pm

Where: 303-310

Abstract : Inferring reproducible relationships between biological variables remains a challenge in the statistical analysis of omics data. For example, methods that identify statistical associations may lack interpretability or reproducibility. The situation can be greatly improved, however, by introducing the measure of stability into the association, where small perturbations in the data do not affect the association. We developed this concept into a new statistical framework called StableMate. Given data observed in different environments or conditions, such as experimental batches or disease states, StableMate identifies predictors which are environment-agnostic or specific in predicting the response using stabilised regression.

StableMate is a flexible framework that can be applied to a wide range of biological data types and questions. We applied StableMate to 1) RNA-seq data of breast cancer to discover genes and gene modules that consistently predict estrogen receptor expression across disease conditions, 2) metagenomics data to identify fecal microbial species that show persistent association with colon cancer across studies from different countries and 3) single cell RNA-seq data of glioblastoma to discern signature genes associated with development of microglia to a pro-tumour phenotype regardless of cell location in the core. StableMate is an innovative, adaptable and efficient variable selection framework that achieves a comprehensive characterisation of a biological system for a wide range of biological data types for regression and classification analyses.

This work was primarily led by my PhD student Yidi Deng, in collaboration with Dr Jarny Choi (Centre for Stem Cell Systems) and Dr Jiadong Mao (Melbourne Integrative Genomics, School of Mathematics and Statistics).

About the speaker : Professor Kim-Anh Lê Cao develops computational methods, software and tools to interpret big biological data at Melbourne Integrative Genomics, the University of Melbourne. Kim-Anh has a mathematical engineering background and graduated with a PhD in statistics from the Université de Toulouse, France. She has worked as a biostatistician consultant and as a research group leader in biological and biomedical institutes. Kim-Anh has received the Australian Academy of Science's Moran Medal for her contributions to Applied Statistics. She was selected for the international HomewardBound leadership program for women in STEMM, culminating in a trip to Antarctica in 2019, and for the superstars of STEM women program from Science Technology Australia. She has secured three consecutive NHMRC fellowships since 2015.

https://findanexpert.unimelb.edu.au/profile/791255-kim-anh-le-cao

Fast Spectral Density Estimation for Multivariate Time Series

Speaker: Jianan Liu

Affiliation: UoA

When: Wednesday, 27 September 2023, 11:00 am to 12:00 pm

Where: 303-310

Bayesian nonparametric spectral analysis for multivariate time series has received much attention in the past few decades. However, most studies on approximating the posterior of the spectral density are based on Markov chain Monte Carlo (MCMC) and its variants. This approach is accurate, but its computational cost increases significantly when the data size grows or the model is very complex. This study analyses a novel, efficient method that combines stochastic optimization and variational inference as an approximation procedure for the spectral density. We have begun to explore the effect of the learning rate of the stochastic optimization on the estimated spectral density. Through the modification and study of its hyperparameters, the method has the potential to estimate the spectral density of large multivariate time series.

Jianan Liu is a PhD student and this is his PYR seminar.

https://profiles.auckland.ac.nz/jliu812

Teaching knowledge, playfulness, student protagonism, social justice, interdisciplinarity: some results of Brazilian research in statistical education

Speaker: Mauren Porciuncula

Affiliation: Federal University of Rio Grande, Brazil

When: Wednesday, 23 August 2023, 3:00 pm to 4:00 pm

Where: 303-310

At the Federal University of Rio Grande (FURG), in the extreme south of Brazil, the Centre for Innovation in Statistical Education (ICE) carries out research in statistical education with the aim of improving the quality of teaching and learning in this area of knowledge. In order to contribute to the advancement of scientific knowledge, projects of technological development, applied research, teaching and university extension have been initiated. The results of these research projects indicate pathways for improving teaching at universities and schools. The scientific findings synthesize knowledge across the scopes of Teaching Knowledge, Playfulness, Student Protagonism, Social Justice, Interdisciplinarity, among others. In this seminar, results from these Brazilian statistics education research projects will be presented. The talk will cover the context, the theoretical background, and the design methodology, and will highlight the scientific findings themselves. There will also be opportunity for dialogue.

Dr Porciuncula is an Associate Professor at the Federal University of Rio Grande - FURG. Her research interest focuses on statistical education. https://ppgec.furg.br/index.php/corpo-docente/docentes-permanentes/301-mauren-porciuncula.html

Forecasting high-dimensional functional time series: Application to sub-national age-specific mortality

Speaker: Cristian Felipe Jimenez Varon

Affiliation: KAUST

When: Wednesday, 12 July 2023, 11:00 am to 12:00 pm

Where: 330-310

We explore the modeling and forecasting of high-dimensional functional time series (HDFTS), which can be cross-sectionally correlated and temporally dependent. We present a novel two-way functional median polish decomposition, which is robust against outliers, to decompose HDFTS into deterministic and time-varying components. A functional time series forecasting method, based on dynamic functional principal component analysis, is implemented to produce forecasts for the time-varying components. By combining the forecasts of the time-varying components with the deterministic components, we obtain forecast curves for multiple populations. We apply the model to age- and sex-specific mortality rates in the US, France, and Japan, in which there are 51 states, 95 departments, and 47 prefectures, respectively, to illustrate that the proposed model delivers more accurate point and interval forecasts in forecasting multi-population mortality than several benchmark methods.
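For orientation, the classical (scalar) two-way median polish that the functional version builds on can be written in a few lines. The sketch below is Tukey's procedure applied to a small table, not the functional decomposition from the talk; the example table and iteration count are assumptions.

```python
import numpy as np

def median_polish(table, n_iter=10):
    """Decompose a two-way table as overall + row + column + residual.

    Row and column medians are swept out alternately. Using medians rather
    than means is what makes the decomposition robust to outlying cells.
    """
    resid = np.array(table, dtype=float)
    overall = 0.0
    row = np.zeros(resid.shape[0])
    col = np.zeros(resid.shape[1])
    for _ in range(n_iter):
        row_med = np.median(resid, axis=1)
        resid -= row_med[:, None]
        row += row_med
        shift = np.median(col)
        col -= shift
        overall += shift

        col_med = np.median(resid, axis=0)
        resid -= col_med
        col += col_med
        shift = np.median(row)
        row -= shift
        overall += shift
    return overall, row, col, resid

# A purely additive table: overall 10, row effects (-1, 0, 1), column effects (-2, 0, 2).
table = 10 + np.array([-1.0, 0.0, 1.0])[:, None] + np.array([-2.0, 0.0, 2.0])
overall, row, col, resid = median_polish(table)
print(overall, row, col)
```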

Two-Phase Sampling: Automated Imputation in Analysing Subsamples.

Speaker: Keiran Shao

Affiliation: University of Auckland

When: Monday, 19 June 2023, 11:00 am to 12:00 pm

Where: 303-310

Health information techniques, including electronic health records (EHRs), have been widely adopted by medical systems across the globe, offering patients more effective, safer, and enhanced-quality care. However, the presence of measurement errors in EHRs not only introduces individual information biases but also has the potential to invalidate statistical inferences in medical studies, thereby posing a significant challenge to data analysis. Human validation of electronic health record data can improve the quality but is impractical at large scale, so validation of subsamples is an active research area.

I will investigate multiple imputation as an approach to inference with validation subsamples. Parametric or semi-parametric imputation approaches require the user to manually select the ideal statistical model for imputation, which means the user must have a strong background in statistics and data analysis. The goal is to address this restriction by applying semi-automated imputation techniques based on machine learning to measurement-error problems with a validated subsample.

NBA Action, It’s FANtastic (and great for data analysis too!)

Speaker: Ryan Elmore

Affiliation: Associate Professor, Department of Business Information and Analytics, University of Denver

When: Wednesday, 31 May 2023, 11:00 am to 12:00 pm

Where: 303-310

In this talk, I will describe my two most recent statistical problems and solutions related to the National Basketball Association (NBA). In particular, I will discuss (1) the usefulness of a coach calling a timeout to thwart an opposition’s momentum and (2) a novel metric for rating the overall shooting effectiveness of players in the NBA. I will describe the motivation for each problem, how to find data for NBA analyses, modeling considerations, and our results. Lastly, I will describe why I think the analysis of sport, in general, provides an ideal venue for teaching/learning statistical or analytical concepts and techniques.

Towards Fluent Interactive Data Visualization

Speaker: Adam Bartonicek

Affiliation: The University of Auckland

When: Friday, 19 May 2023, 11:00 am to 12:00 pm

Where: 303-310

Humans learn about the world around them by interacting with it. The same applies to data. If we want to learn from our data effectively, we need practical tools for creating and manipulating data visualizations. Currently, there are many options for interactive data visualization within the statistical programming ecosystem, in languages such as R or Python. However, all of the currently available software packages tend to suffer from a common set of drawbacks. Specifically, these packages tend to be either very high-level, such that the users are limited to picking from a small set of ready-made interactive plots, or very low-level, such that a lot of time and effort is required to create even moderately complex interactive figures, and there are no guarantees that the interaction will be predictable or composable. The goal of the presented research is to develop a mid-level framework that would allow the users to create entirely new types of interactive plots, which would nevertheless be guaranteed to behave in consistent ways when combined. To this end, the project incorporates concepts from category theory, an area of mathematics concerned with structure and composition. The findings so far show that, by requiring that the statistical summaries we draw conform to a small set of desirable properties, we can guarantee consistent two-way interaction. The project also seeks to implement the system, and a live demonstration of a prototype will take place as part of the talk.

Unsupervised Statistical Tools for the Detection of Anomalies in Populations

Speaker: Prof. Fabrizio Ruggeri

Affiliation: CNR-IMATI Milano, Italy

When: Wednesday, 8 March 2023, 2:00 pm to 3:00 pm

Where: 303-310

The research is motivated by the increased interest in detecting possible frauds in healthcare systems. We propose some unsupervised statistical tools (Lorenz curve, concentration function, sum of ranks, Gini and Pietra indices) to provide efficient and easy-to-use methods aimed to signal possible anomalous behaviours. A more sophisticated method, based on Bayesian co-clustering, is presented as well.
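Three of the tools named above have short textbook definitions, sketched below. How the speaker applies them to healthcare data is not described in the abstract, so the example values (think of claim amounts per provider) are purely hypothetical.

```python
import numpy as np

def lorenz_curve(values):
    """Cumulative population share vs cumulative share of the total."""
    x = np.sort(np.asarray(values, dtype=float))
    cum_share = np.concatenate([[0.0], np.cumsum(x)]) / x.sum()
    pop_share = np.linspace(0.0, 1.0, len(x) + 1)
    return pop_share, cum_share

def gini_index(values):
    """Gini index: 0 for perfect equality, near 1 for extreme concentration."""
    x = np.sort(np.asarray(values, dtype=float))
    n = len(x)
    ranks = np.arange(1, n + 1)
    return float(np.sum((2 * ranks - n - 1) * x) / (n * x.sum()))

def pietra_index(values):
    """Pietra index: half the relative mean absolute deviation."""
    x = np.asarray(values, dtype=float)
    return float(np.abs(x - x.mean()).sum() / (2 * x.sum()))

# Hypothetical totals: one unit accounts for everything vs an even split.
concentrated = [0, 0, 0, 1]
even = [5, 5, 5, 5]
print(gini_index(concentrated), pietra_index(concentrated))
print(gini_index(even), pietra_index(even))
```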

The propensity score for the analysis of observational studies

Speaker: Prof Markus Neuhaeuser

Affiliation:

When: Tuesday, 28 February 2023, 1:00 pm to 2:00 pm

Where: 303-148

In observational, non-randomized studies, groups usually differ in some baseline covariates. Propensity scores are increasingly being used in the statistical analysis to adjust for those between-group variations. There is great flexibility in how the propensity score can be appropriately used. One possible strategy is stratification, also called subclassification. We present examples and discuss the question of how many strata are useful.
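Stratification on the propensity score can be sketched on simulated data as follows. Everything here is an assumption made for illustration: one confounder, a true treatment effect of 2, a logistic propensity model fitted by Newton's method, and five strata.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000
x = rng.normal(size=n)                              # baseline confounder
treated = rng.random(n) < 1 / (1 + np.exp(-x))      # treatment depends on x
y = 2.0 * treated + 3.0 * x + rng.normal(size=n)    # true effect is 2

def fit_logistic(design, t, n_iter=25):
    """Logistic regression coefficients via Newton-Raphson."""
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 1 / (1 + np.exp(-design @ beta))
        grad = design.T @ (t - p)
        hess = design.T @ (design * (p * (1 - p))[:, None])
        beta += np.linalg.solve(hess, grad)
    return beta

design = np.column_stack([np.ones(n), x])
coef = fit_logistic(design, treated.astype(float))
score = 1 / (1 + np.exp(-design @ coef))            # estimated propensity score

# Cut the score into equal-sized strata and average within-stratum differences.
n_strata = 5
edges = np.quantile(score, np.linspace(0, 1, n_strata + 1))
stratum = np.clip(np.searchsorted(edges, score, side="right") - 1, 0, n_strata - 1)

naive = y[treated].mean() - y[~treated].mean()
stratified = 0.0
for s in range(n_strata):
    in_s = stratum == s
    diff = y[in_s & treated].mean() - y[in_s & ~treated].mean()
    stratified += diff * in_s.mean()                # weight by stratum size

print("naive difference:     ", naive)
print("stratified estimate:  ", stratified)
```

Changing `n_strata` in this sketch shows the trade-off behind the question of how many strata are useful: more strata reduce residual confounding within each stratum but leave fewer treated and control units to compare.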

Moreover, the flexibility might encourage p-value hacking – where several alternative uses of propensity scores are explored and the one yielding the lowest p-value is selectively reported. Although such an approach is scientifically not acceptable, it might occur and therefore we simulate the extent of type I error inflation.

K-12 Data Science or Statistics? Is a distinction needed?

Speaker: Professor Rob Gould

Affiliation: Vice-chair Undergraduate Studies, Department of Statistics, UCLA.

When: Friday, 10 February 2023, 11:00 am to 12:00 pm

Where: 303-G14

For decades now, statistics educators have worked to achieve widespread statistical literacy. And now, well before the task is accomplished, along comes Data Science Education. I’ll explain why, from my perspective, this term is more than just a new label for an old thing, describe updates to the American Statistical Association’s Guidelines for Assessment and Instruction in Statistics Education (GAISE) Pre-K-12 report, and give a brief overview of a high school data science course that I helped design and propagate. I’ll also discuss currents in the US pushing back against data science (and statistics) education.
