Adaptation of k-means to automated forecasting of poorly structured time series of economic dynamics

Economic & mathematical methods and models
Authors:
Abstract:

With the growing volume of data and the increasing complexity of economic interactions, systems with mixed behavior call for more advanced analysis methods and interdisciplinary approaches. Data mining methods used in machine learning and deep learning make it possible to account for complex patterns and nonlinear dependencies in the data. Applied statistics provides reliable approaches to hypothesis testing, model parameter estimation and interpretation of results. Studies of various systems with complex behavior have established that economic processes are often characterized by nonlinearity, instability, and hidden dependencies. Machine learning and deep data analysis methods not only improve forecasting accuracy but also reveal hidden patterns that traditional statistical approaches may overlook. This is especially important in the study of financial markets, where the dynamics can be extremely unstable and are influenced by many external factors. Such methods increase the effectiveness of decision-making under uncertainty and serve as indispensable tools for modern economic research. Research in this area is therefore timely, as shown both by the nature of the series and by the need for more advanced methods of analysis and forecasting. The article presents a preliminary analysis and constructs a forecast based on a linear cellular automaton. Applied statistics and data mining tools were used to analyze the time series and to adapt clustering methods as a means of automating the predictive model. We confirmed that integrating well-known clustering methods into the linear cellular automaton algorithm makes it possible to identify patterns and improve the quality of the forecast.
The object of the study is a financial market time series, since such economic series are shaped by a variety of factors whose influence is hard to detect, such as external shocks, seasonal fluctuations and long-term trends. Our findings indicate that data mining algorithms make it possible to automate the translation of the numerical values of a time series into their linguistic equivalents and to obtain predictive values without loss of quality.
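The translation of numerical values into linguistic terms by clustering can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a one-dimensional k-means with three clusters, the function names `kmeans_1d` and `to_linguistic`, the labels Low, Medium and High, and the sample series are all hypothetical, and the linear cellular automaton forecasting step is not shown.

```python
def kmeans_1d(values, k, iterations=100):
    """Plain Lloyd's k-means on scalars; returns centroids in ascending order."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic initialisation (an assumption of this sketch):
    # pick k evenly spaced order statistics of the data.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Recompute each centroid as the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        updated = [
            sum(c) / len(c) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def to_linguistic(values, centroids, labels):
    """Replace each value with the label of its nearest centroid."""
    return [
        labels[min(range(len(centroids)), key=lambda j: abs(v - centroids[j]))]
        for v in values
    ]


# Hypothetical series standing in for a financial indicator.
series = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8, 9.9, 10.2, 10.0, 5.1]
labels = ["Low", "Medium", "High"]  # hypothetical term set

centroids = kmeans_1d(series, k=3)
terms = to_linguistic(series, centroids, labels)
print(terms)
```

Because the centroids are returned in ascending order, the labels can be assigned by rank, so the boundaries between terms come from the data rather than from thresholds chosen by hand. The resulting sequence of terms is the kind of linguistic time series that a forecasting step could then operate on.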