In this paper, we relax the assumptions of the traditional hidden Markov model and develop a novel framework for decoding the latent states that generate the dynamics of multivariate financial data. To construct the framework, we model the observed variables as a p-th order vector autoregressive process, allow the latent state to evolve as a semi-Markov chain, and shrink the autoregression and covariance matrices via a penalized maximum likelihood method. Using 50-dimensional simulated data, 12-dimensional 5-minute order book data of the component stocks of the Chinese CSI 300 index, 49-dimensional daily data of U.S. industry portfolios, and 1-dimensional hourly data of four primary foreign exchange rates, our empirical analyses show that the proposed model outperforms the alternative model in accurately recognizing anomalous events and achieves a better Sharpe ratio in a pseudo trading strategy based on the latent states. The superior performance holds across data frequencies (5-minute, hourly, and daily), dimensions (1, 12, 49, and 50), and data types (stocks, foreign exchange rates, and industry portfolios).
- Vector autoregressive model
- Hidden Markov model
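
As a rough illustration of the data-generating structure described in the abstract, the following minimal sketch simulates a vector autoregressive process whose coefficient and covariance matrices switch with a latent semi-Markov state. It is not the paper's implementation: the function name `simulate_hsmm_var1`, the restriction to order p = 1, the shifted-Poisson sojourn times, and all parameter values are assumptions made for illustration, and the penalized likelihood estimation step is not shown.

```python
import numpy as np

def simulate_hsmm_var1(T, A, Sigma, trans, mean_dur, seed=0):
    """Simulate a VAR(1) process driven by a latent semi-Markov state.

    A, Sigma : per-state coefficient and noise covariance matrices
    trans    : state transition matrix with zero diagonal
    mean_dur : per-state mean sojourn time (must be >= 1)
    """
    rng = np.random.default_rng(seed)
    K, d = len(A), A[0].shape[0]
    states = np.empty(T, dtype=int)
    t, s = 0, 0  # assumption: the chain starts in state 0
    while t < T:
        # Explicit sojourn time, here 1 + Poisson (an assumed duration law).
        dur = 1 + rng.poisson(mean_dur[s] - 1)
        end = min(t + dur, T)
        states[t:end] = s
        t = end
        # Jump to a different state: self-transitions are excluded because
        # the time spent in a state is governed by the duration law instead.
        s = rng.choice(K, p=trans[s])
    y = np.zeros((T, d))
    for t in range(1, T):
        k = states[t]
        noise = rng.multivariate_normal(np.zeros(d), Sigma[k])
        y[t] = A[k] @ y[t - 1] + noise  # state-dependent VAR(1) recursion
    return states, y

# Two hypothetical regimes: a calm, persistent one and a volatile one.
A = [0.5 * np.eye(2), -0.3 * np.eye(2)]
Sigma = [0.1 * np.eye(2), 1.0 * np.eye(2)]
trans = np.array([[0.0, 1.0], [1.0, 0.0]])
states, y = simulate_hsmm_var1(200, A, Sigma, trans, mean_dur=[20, 5])
```

Decoding would run in the opposite direction, inferring `states` from `y` alone. The explicit duration draw is what distinguishes the semi-Markov chain from an ordinary Markov chain, whose sojourn times are implicitly geometric.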