    Semantic Analysis and Topic Modelling of Web-Scrapped COVID-19 Tweet Corpora through Data Mining Methodologies

    Date
    2022
    Author
    Gourisaria, Mahendra Kumar
    Chandra, Satish
    Das, Himansu
    Patra, Sudhansu Shekhar
    Sahni, Manoj
    Leon Castro, Ernesto
    Singh, Vijander
    Kumar, Sandeep
    Publisher
    Healthcare (Switzerland)
    Description
    Article in a SCOPUS - WOS indexed publication
    Abstract
    The evolution of the coronavirus (COVID-19) disease took a toll on the social, healthcare, economic, and psychological prosperity of human beings. In the past couple of months, many organizations, individuals, and governments have adopted Twitter to convey their sentiments on COVID-19, the lockdown, the pandemic, and hashtags. This paper aims to analyze the psychological reactions and discourse of Twitter users related to COVID-19. In this experiment, Latent Dirichlet Allocation (LDA) has been used for topic modeling. In addition, a Bidirectional Long Short-Term Memory (BiLSTM) model and various classification techniques such as random forest, support vector machine, logistic regression, naive Bayes, decision tree, logistic regression with stochastic gradient descent optimizer, and majority voting classifier have been adapted for analyzing the polarity of sentiment. The effectiveness of the aforesaid approaches along with LDA modeling has been tested, validated, and compared with several benchmark datasets and on a newly generated dataset for analysis. To achieve better results, a dual dataset approach has been incorporated to determine the frequency of positive and negative tweets and word clouds, which helps to identify the most effective model for analyzing the corpora. The experimental result shows that the BiLSTM approach outperforms the other approaches with an accuracy of 96.7%. © 2022 by the authors. Licensee MDPI, Basel, Switzerland.
    URI
    http://repositoriodigital.ucsc.cl/handle/25022009/3086
    Collections
    • Artículos Científicos
