Please use this identifier to cite or link to this item: http://buratest.brunel.ac.uk/handle/2438/14135
Title: Novel regularization models for dynamic and discrete response data
Authors: Hamed, Haseli Mashhadi
Advisors: Vinciotti, V
Yu, K
Keywords: L1 Penalized likelihood;Markov chain Monte Carlo;Discrete Weibull distribution;High dimensional time series;Differentiable penalty
Issue Date: 2017
Publisher: Brunel University London
Abstract: Regularized regression models have gained popularity in recent years. The addition of a penalty term to the likelihood function allows parameter estimation where traditional methods fail, such as in the p ≫ n case. The use of an l1 penalty in particular leads to simultaneous parameter estimation and variable selection, which is convenient in practice. Moreover, computationally efficient algorithms make these methods attractive in many applications. This thesis is inspired by this literature and investigates the development of novel penalty functions and regression methods within this context.
In particular, Chapter 2 deals with linear models for time-dependent response and explanatory variables. This goes beyond the independence framework that is common to many of the existing regularized regression models. We propose to account for the time dependency in the data by explicitly adding autoregressive terms to the response variable together with an autoregressive process for the residuals. In addition, the use of an l1-penalized likelihood approach for parameter estimation leads to automatic order and variable selection and makes this method feasible for high-dimensional data. Theoretical properties of the estimators are provided and an extensive simulation study is performed. Finally, we show the application of the model to air pollution and stock market data and discuss its implementation in the R package DREGAR, which is freely available on CRAN.
In Chapter 3, we develop a new penalty function. Despite all the advantages of the l1 penalty, this penalty is not differentiable at zero, and neither are the alternatives that have been proposed in the literature. The only exception is the ridge penalty, which does not lead to variable selection. Motivated by this gap, and noting the advantages that a differentiable penalty can give, such as increased computational efficiency in some cases and the derivation of more accurate model selection criteria, we develop a new penalty function based on the error function. We study the theoretical properties of this function and of the estimators obtained in a regularized regression context. We then perform a simulation study and use the new penalty to analyse a diabetes and a prostate cancer dataset. The new method is implemented in the R package DLASSO, which is freely available on CRAN.
Finally, Chapter 4 deals with regression models for discrete response data, which are frequently collected in many application areas. In particular, we consider a discrete Weibull regression model that has recently been introduced in the literature. In this chapter, we propose the first Bayesian implementation of this model. We consider a general parametrization, where both parameters of the discrete Weibull distribution can be conditioned on the predictors, and show theoretically that, under a uniform noninformative prior, the posterior distribution is proper with finite moments. In addition, we consider closely the case of Laplace priors for parameter shrinkage and variable selection. A simulation study and the analysis of four real datasets of medical records show the applicability of this approach to the analysis of count data. The method is implemented in the R package BDWreg, which is freely available on CRAN.
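As an illustration of the Chapter 4 setting, the following is a minimal Python sketch of a discrete Weibull likelihood combined with a Laplace prior. It is not taken from the thesis or from BDWreg. It assumes the commonly used type I form of the discrete Weibull distribution, P(Y = y) = q^(y^beta) - q^((y+1)^beta) for y = 0, 1, 2, ..., with 0 < q < 1 and beta > 0. The logit link for q, the fixed beta, and the names `dw_pmf`, `dw_cdf`, `log_posterior` and `lam` are all assumptions made for the example; the record does not state the parametrization used.

```python
import math


def dw_pmf(y, q, beta):
    """Type I discrete Weibull pmf (assumed form): q**(y**beta) - q**((y+1)**beta)."""
    if y < 0:
        return 0.0
    return q ** (y ** beta) - q ** ((y + 1) ** beta)


def dw_cdf(y, q, beta):
    """P(Y <= y); the pmf terms telescope to 1 - q**((y+1)**beta)."""
    if y < 0:
        return 0.0
    return 1.0 - q ** ((y + 1) ** beta)


def log_posterior(theta, X, y, beta, lam):
    """Unnormalized log posterior under a Laplace prior on theta.

    Assumptions for this sketch only: q_i = logistic(x_i . theta),
    beta is held fixed, and lam is a hypothetical prior rate.
    """
    total = 0.0
    for x_i, y_i in zip(X, y):
        eta = sum(t * x for t, x in zip(theta, x_i))
        q_i = 1.0 / (1.0 + math.exp(-eta))  # logit link keeps q_i in (0, 1)
        total += math.log(dw_pmf(y_i, q_i, beta))
    # Laplace prior contributes -lam * |theta_j| per coefficient (up to a constant)
    return total - lam * sum(abs(t) for t in theta)


if __name__ == "__main__":
    # With beta = 1 the distribution reduces to the geometric distribution.
    print(dw_pmf(3, 0.5, 1.0))
    print(log_posterior([0.0, 0.0], [[1, 0], [1, 1]], [0, 2], 1.0, 1.0))
```

The Laplace prior term plays the same role as an l1 penalty on the log-likelihood, which is why it induces shrinkage of the coefficients. A full Bayesian fit would pass `log_posterior` to an MCMC sampler, which is outside the scope of this sketch.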
Description: This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London
URI: http://bura.brunel.ac.uk/handle/2438/14135
Appears in Collections: Dept of Mathematics Theses
Mathematical Sciences

Files in This Item:
File: FulltextThesis.pdf
Size: 7.24 MB
Format: Adobe PDF


Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.