Professor Alan Lee
Alan Lee attended Auckland University for his undergraduate and Master's degrees, and the University of North Carolina for his PhD. He joined the staff of the then Mathematics Department at Auckland University in 1974. He has held visiting academic appointments at Indiana University, the University of North Carolina, McGill University and Southampton University. His research interests include the analysis of directional data, non-parametric statistics, statistical computing, the analysis of correlated categorical data, the application of capture-recapture methods to the estimation of population size, regression for response-selective sampling designs and the creation of synthetic data sets.

Recent manuscripts

1. A. J. Lee, A.J. Scott and C.J. Wild (2009) Efficient estimation in multi-phase case-control studies. In this paper we discuss the analysis of multi-phase, or multi-stage, case-control studies and present an efficient semiparametric maximum-likelihood approach that unifies and extends earlier work, including the seminal case-control paper by Prentice & Pyke (1979) as well as work by Breslow & Cain (1988), Scott & Wild (1997), Breslow & Holubkov (1997), and others. The theoretical derivations apply to arbitrary binary regression models, but we present results for logistic regression and show that the approach can be implemented by including additional intercept terms in the logistic model and then making some simple corrections to the score and information equations from the prospective loglikelihood.

2. A. J. Lee (2009) Circular Data. We give a brief survey of the field of circular statistics, including summary statistics, circular distributions, basic inference and models for regression and time series. An extensive bibliography is provided.

3. A. J. Lee (2009) Generating Synthetic Microdata From Published Marginal Tables and Confidentialised Files. We describe several methods for generating synthetic data sets. The methods we describe are based on creating data sets using a combination of publicly available marginal tables and microdata samples. We describe a set of R functions which implement the methods under study, and use these functions to apply the methods to data from the 2001 Census of Population and Dwellings.

4. Alan Lee and Yuchi Hirose (2007) Semi-parametric efficiency bounds for regression models under generalised case-control sampling: the profile likelihood approach. Abstract: We obtain an information bound for estimates of parameters in general regression models where data is collected under a variety of response-selective sampling schemes. The asymptotic variances of the semi-parametric estimates of Scott and Wild (1986, 1997, 2001) are compared to the bound and the estimates are found to be fully efficient.

5. A. J. Lee (2007) On the semi-parametric efficiency of the Scott-Wild estimator under choice-based and two-phase sampling. Using a projection approach, we obtain an asymptotic information bound for estimates of parameters in general regression models under choice-based and two-phase, outcome-dependent sampling. The asymptotic variances of the semi-parametric estimates of Scott and Wild (1997, 2001) are compared to these bounds and the estimates are found to be fully efficient.

6. A. J. Lee (2007) Semi-parametric efficiency bounds for regression models under choice-based sampling. We extend the Bickel-Klaassen-Ritov-Wellner theory of semi-parametric efficiency bounds to the case of sampling from several populations, and discuss the form of the efficient score and efficient influence function in this situation. The theory is applied to obtain an information bound for estimates of parameters in general regression models under case-control sampling.

7. A.J. Lee, A.J. Scott and C.J. Wild (2007) On the Breslow-Holubkov estimator. Abstract: Breslow and Holubkov (1997) developed semiparametric maximum likelihood estimation for two-phase studies with a case-control first phase under a logistic regression model and noted that, apart from the overall intercept term, it was the same as the semiparametric estimator for two-phase studies with a prospective first phase developed in Scott and Wild (1997). In this paper we extend the Breslow-Holubkov result to general binary regression models and show that it has a very simple relationship with its prospective first-phase counterpart. We also explore why the design of the first phase only affects the intercept of a logistic model, simplify the calculation of standard errors, establish the semiparametric efficiency of the Breslow-Holubkov estimator and derive its asymptotic distribution in the general case.

8. Alan Lee (2006) Generating synthetic unit-record data from published marginal tables. Abstract: We survey methods for generating synthetic data sets without making use of unit-record data. The methods we describe are based on creating data sets which match publicly available marginal tables. We describe a set of R functions which implement the methods under study, and apply the methods to data from the 2001 Census of Population and Dwellings.
Contact Details
Phone: +649 3737599 x86893 or x87510