Latent variable modelling and statistical analysis for high-dimensional data
In recent years, advances in technology have made it easier to collect and store high-dimensional data, creating a growing need for effective statistical tools. This thesis presents three related studies that improve existing methods and enhance their practical applicability. Chapter 2 proposes a novel latent variable model tailored for high-dimensional multivariate longitudinal data. The model accommodates mixed data types and missing observations by incorporating unobserved factors that capture dependence across variables and time points, facilitating both statistical inference and prediction. A central limit theorem is established for inference on regression coefficients, and an information criterion is developed to consistently determine the number of factors. The method is applied to grocery shopping data to predict and interpret consumer behaviour. Chapter 3 introduces a stability-based method for selecting the number of latent factors in linear factor models, using principal angles between loading spaces obtained from data splitting. Consistency is established under weaker asymptotic requirements than those of existing approaches. Simulations and real data examples demonstrate the method’s improved accuracy and robustness. Chapter 4 develops a flexible statistical modelling framework for pairwise comparison data, relaxing the stochastic transitivity assumptions of classical models. By imposing an approximately low-dimensional skew-symmetric structure, the method achieves minimax-optimal estimation rates and performs well with sparse data. Its superiority over the traditional Bradley-Terry model is supported by simulations and real-world applications.
| Item Type | Thesis (Doctoral) |
|---|---|
| Copyright holders | © 2025 Sze Ming Lee |
| Departments | LSE > Academic Departments > Statistics |
| DOI | 10.21953/lse.00004941 |
| Supervisor | Chen, Yunxiao; Moustaki, Irini |
| Date Deposited | 26 Jan 2026 |
| URI | https://researchonline.lse.ac.uk/id/eprint/135694 |