Abstract
Online Social Networks (OSNs) are an ideal environment for disseminating mis- and disinformation. Anonymity and the ease of sharing allow such content to diffuse rapidly through the so-called spreading cascade without any kind of verification. Users can reshare tweets or posts without having read them in full, and a bot army can replicate a message and share it from thousands of fake accounts. The design of specific tools aimed at tackling this problem is essential to help the population differentiate between information and misinformation. Nevertheless, the implementation of scalable and reliable methods, algorithms and tools able to handle misinformation remains an open challenge in the area of OSN analysis.
Current state-of-the-art research mostly focuses on detecting misinformation already in circulation, assuming that verification takes place after the claim has been disseminated. This late response, however, limits the capacity to counter misinformation and the harm it causes to the population. Researchers such as Zhao et al. (2020) have already explored the propagation of misinformation, showing that false claims can be identified around five hours after being shared for the first time. Previti et al. (2020) propose a method that classifies true and false news using time series-based features extracted from the evolution of the news, together with features of the users involved in spreading it. <<Our goal is to comprehend and detect the emergence of new false claims in Online Social Networks at a very early stage in order to stop their propagation>>. Alerting users to potentially inaccurate information can help to stop, counter and prevent the spread of this harmful information.
Our proposal is to leverage Social Network Analysis (SNA) and Natural Language Processing (NLP) tools to analyze how information spreads in Online Social Networks. On the one hand, SNA techniques can help to infer the patterns by which misinformation spreads, both in terms of interactions between users and of the behavior of specific accounts (such as bots). They can also detect communities of users (accounts) and track how these evolve over time, giving a better understanding of the dynamics of both the community structure and the information flow, as shown in Bello-Orgaz et al. (2017). State-of-the-art research on modelling how mis- and disinformation spreads on social networks has been classified into two categories: explanatory and predictive models. Explanatory models conceive the spread of information as an epidemic; their goal is to understand how a piece of information propagates in a network, the so-called sharing or spreading cascade. Predictive models, in contrast, study the future propagation of information. In this proposal, we leverage both kinds of models to analyze the whole information diffusion process in order to trace its origin. On the other hand, NLP can help to analyze the semantic content and to trace how information evolves and mutates to give rise to false claims.
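To make the notion of a spreading cascade concrete, the following minimal sketch simulates an epidemic-style diffusion on a toy follower graph. It uses the independent cascade model, a common simplification chosen here purely for illustration; it is not the model of this proposal. The names `simulate_cascade`, `TOY_GRAPH` and `share_prob`, as well as the graph itself, are hypothetical.

```python
import random
from collections import deque

# Hypothetical follower graph: each user maps to the accounts that see their posts.
TOY_GRAPH = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": ["e"],
    "e": [],
}


def simulate_cascade(graph, seeds, share_prob, rng):
    """Simulate one independent-cascade run (an assumed, simplified model).

    Every user who shares the claim gets a single chance to pass it on to
    each follower, succeeding with probability `share_prob`.
    Returns a dict mapping each reached user to the cascade depth at which
    that user shared the claim (seeds have depth 0).
    """
    depth = {seed: 0 for seed in seeds}
    frontier = deque(seeds)
    while frontier:
        user = frontier.popleft()
        for follower in graph.get(user, []):
            # A follower can be activated only once.
            if follower not in depth and rng.random() < share_prob:
                depth[follower] = depth[user] + 1
                frontier.append(follower)
    return depth


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the run is reproducible
    cascade = simulate_cascade(TOY_GRAPH, ["a"], share_prob=0.5, rng=rng)
    print(f"users reached: {len(cascade)}, max depth: {max(cascade.values())}")
```

An explanatory analysis would fit such a model to observed cascades, whereas a predictive one would run it forward from the first few shares to estimate how far a claim is likely to travel.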
REFERENCES
Bello-Orgaz, G., Hernandez-Castro, J., & Camacho, D. (2017). Detecting discussion communities on vaccination in Twitter. Future Generation Computer Systems, 66, 125-136.
Previti, M., Rodriguez-Fernandez, V., Camacho, D., Carchiolo, V., & Malgeri, M. (2020, April). Fake news detection using time series and user features classification. In International Conference on the Applications of Evolutionary Computation (Part of EvoStar) (pp. 339-353). Springer, Cham.
Zhao, Z., Zhao, J., Sano, Y., Levy, O., Takayasu, H., Takayasu, M., Li, D., Wu, J., & Havlin, S. (2020). Fake news propagates differently from real news even at early stages of spreading. EPJ Data Science, 9, 7.