Publishing House SB RAS:

Address of the Publishing House SB RAS:
Morskoy pr. 2, 630090 Novosibirsk, Russia



Scientific journal “Vestnik NSUEM”

2016, number 1

METHODS OF MACHINE LEARNING TO PREDICT THE SPREAD OF THE INFECTION IN THE NETWORK

P.A. Sulimov
Central Bank of the Russian Federation, Neglinnaya str., 12, Moscow, 107016, Russia
Keywords: social network, model of infection, link prediction problem, random forest

Abstract

The launch of Facebook in 2004 prompted research into how people interact with each other within a social network. More than 10 years have passed since then, and many thematic social networks have appeared: Twitter, Instagram, LinkedIn, Flickr, etc. In all of these networks people exchange information of every kind (photos, links, contacts, etc.). Information can be viewed as a kind of virus that is transmitted from person to person. Accordingly, the author considers the spread of information in a social network from the point of view of an infection model (epidemics in a social network). The goal of the paper is to predict the epidemic threshold (a threshold characteristic of a network, above which the network is certain to become completely infected) at time t+1 on the basis of historical data for the periods t, t-1 and earlier. To reach this goal, it is necessary to know how the network will behave at time t+1: whether the network graph will be connected, which links will be broken, which new links will appear, and so on. All of these factors determine the speed at which an infection spreads through the network, as well as the epidemic threshold. This gives rise to the Link Prediction Problem, which is solved by machine learning methods (Random Forest, Support Vector Machines): pairs of nodes are assigned to the classes “connected” and “not connected”, and the class of each pair at time t+1 is predicted on the basis of topological and factor characteristics of the network's nodes. The result of the research is thus an algorithm for forecasting the spread of infection in a social network by means of machine learning methods.
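
As a rough illustration of the link prediction setup described in the abstract, the following Python sketch builds a labelled dataset of node pairs: features are computed from the network snapshot at time t, and the class label (connected or not connected) is taken from the snapshot at time t+1. The abstract does not list the specific features used, so the three topological features here (common neighbours, Jaccard coefficient, preferential attachment) are assumptions chosen because they are common in link prediction. The function names and the toy edge lists are hypothetical.

```python
from itertools import combinations


def neighbors(edges):
    """Build an adjacency map {node: set of neighbours} from undirected edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def pair_features(adj, u, v):
    """Topological features of the pair (u, v) in the snapshot at time t.

    The feature set is an assumption; the source only says that
    topological and factor characteristics are used.
    """
    nu, nv = adj.get(u, set()), adj.get(v, set())
    common = len(nu & nv)
    union = len(nu | nv)
    return {
        "connected_at_t": int(v in nu),
        "common_neighbors": common,
        "jaccard": common / union if union else 0.0,
        "preferential_attachment": len(nu) * len(nv),
    }


def build_dataset(edges_t, edges_t1):
    """Return (pairs, X, y): features from time t, labels from time t+1.

    Only nodes present at time t are considered; nodes that first
    appear at t+1 are ignored in this sketch.
    """
    adj_t = neighbors(edges_t)
    adj_t1 = neighbors(edges_t1)
    pairs, X, y = [], [], []
    for u, v in combinations(sorted(adj_t), 2):
        pairs.append((u, v))
        X.append(pair_features(adj_t, u, v))
        # Label 1 = "connected" at t+1, 0 = "not connected" at t+1.
        y.append(int(v in adj_t1.get(u, set())))
    return pairs, X, y


# Hypothetical toy snapshots: link c-d breaks, link a-c appears.
edges_t = [("a", "b"), ("b", "c"), ("c", "d")]
edges_t1 = [("a", "b"), ("b", "c"), ("a", "c")]
pairs, X, y = build_dataset(edges_t, edges_t1)
```

The rows of `X` and the labels `y` could then be passed to a classifier such as a random forest or a support vector machine, the two methods named in the abstract. The predicted links for t+1 would in turn give the predicted network graph from which an epidemic threshold can be estimated.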