The term Web 3.0 entered the public vocabulary more than five years ago. While its definition remains unclear, what has become clear over the last half decade is that the web has become a vehicle for social media. Directly from cameras, phones, tablets or computers, users push multimedia data towards their peers and the world at large.
MUCKE addresses this stream of multimedia social data with new and reliable knowledge extraction models designed for multilingual and multimodal data shared on social networks. It departs from current knowledge extraction models, which are mainly quantitative, by placing strong emphasis on the quality of the processed data, in order to protect the user from an avalanche of data that are all equally relevant to the topic. It does so through two central innovations: automatic user credibility estimation for multimedia streams and adaptive multimedia concept similarity. Credibility models for multimedia streams are a highly novel topic; they will be cast as a multimedia information fusion task and will constitute the main scientific contribution of the project. Adaptive multimedia concept similarity departs from existing models by creating a semantic representation of the underlying corpora and associating a probabilistic framework with them. The utility of these two innovations will be demonstrated in an image retrieval system.
Extensive evaluation will be performed to assess the reliability of the extracted knowledge against representative datasets. Additionally, a new shared evaluation task focused on user credibility estimation will be proposed.
The two core innovations rely on innovative text processing, image processing and fusion methods. Text processing will concentrate on tasks such as word sense disambiguation, concept recognition and anaphora resolution. Image processing will include parsimonious content description, large-scale concept detection and detector robustness. Multimedia fusion will focus on a flexible combination of text and image modalities based on a probabilistic framework. All proposed methods will be designed to take advantage of the structural properties of social networks. Particular focus will be placed on proposing scalable algorithms that cope with large-scale, heterogeneous data.
The consortium consists of four partners, three universities and one research institute, with complementary competences that cover the scientific domains associated with the project.
Together, in MUCKE, they will introduce new models for processing noisy multimodal and multilingual data, which will form the basis for innovative services.