Identifying impact factors on the communications and telecommunications sector using ensemble machine learning methods

Regional and branch economy
Authors:
Abstract:

Relevance. The communications and telecommunications sector plays a key role in economic development, which makes it especially important to determine the factors that influence it. When studying the Russian market, the heterogeneity of regional development must be taken into account. The purpose of the study is to model the activity of the communications and telecommunications sector and to identify the factors that influence it, taking regional specifics into account. Methods. The empirical basis of the study is quarterly official statistical data for 85 constituent entities of the Russian Federation for 2017–2023. The dependent variable is revenue from communication services. The independent variables are the volume of information transmitted from/to subscribers when accessing the Internet; tariff indices for communication services for legal entities; the average number of employees in the telecommunications sector; the basic consumer price index for goods and services; the average monthly wages of employees in the economy; and postal transfers. The following models were built in Python: linear regression, Ridge, Bagging, RandomForestRegressor, GradientBoostingRegressor and XGBRegressor. The SHAP method was used to interpret the results of the ensemble methods. Results. The analysis confirms that the communications and telecommunications sector is developing unevenly across the country's regions. Among the constructed models, GradientBoostingRegressor achieved the best result. The analysis of SHAP values revealed the influence on revenue from communication services of growth in average monthly wages, growth in the volume of information transmitted (traffic), and tariff indices for communication services for legal entities, together with a negative impact of the basic consumer price index and an insignificant impact of postal transfers. Conclusions. Identifying the factors that affect the communications and telecommunications sector in the context of the Russian economy has both theoretical and practical significance. The results obtained can be used for strategic planning, investment optimization, government regulation, marketing and sales, and forecasting and planning. Directions for further research. Although the quality of the model appears quite high given the number of observations, the fitted models show some overfitting. To reduce it, further research should extend the sample with observations for new periods. It also seems advisable to build neural network models to obtain more accurate estimates of the impact of the various factors on the communications and telecommunications sector, as well as more accurate forecasts.
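
The model comparison described under Methods, and the overfitting noted under Directions for further research, can be illustrated with a minimal Python sketch. It is not the authors' code. The data are synthetic, the feature names and coefficients are hypothetical placeholders, and the random train/test split is a simplification that ignores the regional and quarterly structure of the real sample. XGBRegressor and SHAP are left out only because they require packages beyond scikit-learn. A large gap between training and test R2 is one simple sign of overfitting.

```python
import numpy as np
from sklearn.ensemble import (
    BaggingRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
)
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Hypothetical column names standing in for the six factors in the abstract.
FEATURES = [
    "traffic_volume",
    "tariff_index_legal_entities",
    "employees_telecom",
    "basic_cpi",
    "avg_monthly_wage",
    "postal_transfers",
]


def make_synthetic_panel(n_obs=400, seed=0):
    """Generate synthetic data; the coefficients are arbitrary, not estimates."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_obs, len(FEATURES)))
    noise = rng.normal(scale=0.5, size=n_obs)
    y = (
        2.0 * X[:, 0]
        + 1.5 * X[:, 4]
        - 1.0 * X[:, 3]
        + 0.5 * X[:, 0] * X[:, 4]  # mild non-linearity for the ensembles
        + noise
    )
    return X, y


def compare_models(X, y, seed=0):
    """Fit each model and report train R2, test R2 and their difference."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    models = {
        "LinearRegression": LinearRegression(),
        "Ridge": Ridge(alpha=1.0),
        "BaggingRegressor": BaggingRegressor(random_state=seed),
        "RandomForestRegressor": RandomForestRegressor(random_state=seed),
        "GradientBoostingRegressor": GradientBoostingRegressor(random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        train_r2 = r2_score(y_train, model.predict(X_train))
        test_r2 = r2_score(y_test, model.predict(X_test))
        results[name] = {
            "train_r2": train_r2,
            "test_r2": test_r2,
            "gap": train_r2 - test_r2,  # larger gap suggests overfitting
        }
    return results


X, y = make_synthetic_panel()
results = compare_models(X, y)
for name, scores in results.items():
    print(
        f"{name:26s} train R2={scores['train_r2']:.3f} "
        f"test R2={scores['test_r2']:.3f} gap={scores['gap']:.3f}"
    )
```

With real regional data, a split by period (training on earlier quarters, testing on later ones) would probably give a more honest estimate of forecast quality than the random split used here.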