Fake or not? Automated detection of COVID-19 misinformation and disinformation in social networks and digital media.
Additional Document Info
With the continuous spread of the COVID-19 pandemic, misinformation poses serious threats and concerns. COVID-19-related misinformation integrates a mixture of health aspects along with news and political misinformation. This mixture complicates the ability to judge whether a claim related to COVID-19 is information, misinformation, or disinformation. With no standard terminology in information and disinformation, integrating different datasets and using existing classification models can be impractical. To deal with these issues, we aggregated several COVID-19 misinformation datasets and compared differences between learning models from individual datasets versus one that was aggregated. We also evaluated the impact of using several word- and sentence-embedding models and transformers on the performance of classification models. We observed that whereas word-embedding models showed improvements in all evaluated classification models, the improvement level varied among the different classifiers. Although our work was focused on COVID-19 misinformation detection, a similar approach can be applied to myriad other topics, such as the recent Russian invasion of Ukraine.