Abstract

Propaganda campaigns have proven very successful on social media, as viral messages spread efficiently across users and little (if any) analysis of a message's intention is carried out before it is forwarded (e.g., re-tweeted or re-posted) to one's network. Creating a successful viral post involves engineering a message that reinforces users' perceptions and appeals to their emotions. Regardless of its veracity, a viral propagandist tweet can spread the agenda of malicious stakeholders.

Propaganda is successful when it goes unnoticed. The natural language processing (NLP) community has responded by creating supervised models to uncover propaganda both at the document level [1] and at the snippet level [2,3], mostly focusing on news articles. Nevertheless, except for a few efforts in other languages (e.g., [4,5]), hardly any NLP effort has been invested in automatically identifying propaganda in languages other than English and in social media (indeed, propaganda in social media has been better studied from the social science perspective [6]).

PluriProppy aims to fill this gap by developing resources and technology for the automatic identification of social media propaganda in multiple languages. We aim to collect supervised data in at least two European languages and to produce both multilingual and language-agnostic NLP models to uncover such propaganda.

References

[1] Barrón-Cedeño, Jaradat, Da San Martino, Nakov. Proppy: Organizing News Coverage on the Basis of Their Propagandistic Content. Information Processing and Management, 2019. DOI:10.1016/j.ipm.2019.03.005.

[2] Da San Martino, Yu, Barrón-Cedeño, Petrov, Nakov. Fine-Grained Analysis of Propaganda in News Articles. In Proc. of the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019), Hong Kong, China, November 3-7, 2019.

[3] Da San Martino, Barrón-Cedeño, Wachsmuth, Petrov, Nakov. SemEval-2020 Task 11: Detection of Propaganda Techniques in News Articles. In Proc. of the Fourteenth Workshop on Semantic Evaluation, SemEval’20. Barcelona, Spain, 2020.

[4] Kausar, Tahir, Mehmood. ProSOUL: A Framework to Identify Propaganda From Online Urdu Content. IEEE Access, vol. 8, pp. 186039-186054, 2020. DOI:10.1109/ACCESS.2020.3028131.

[5] Baisa, Herman, Horák. Benchmark Dataset for Propaganda Detection in Czech Newspaper Texts. In Proc. of Recent Advances in Natural Language Processing, RANLP 2019. Varna, Bulgaria: INCOMA Ltd., 2019, pp. 77-83. DOI:10.26615/978-954-452-056-4_010.

[6] Chaudhari, Pawar. Propaganda analysis in social media: a bibliometric review. Information Discovery and Delivery, vol. 49, no. 1, pp. 57-70, 2021. DOI:10.1108/IDD-06-2020-0065.