Bonde, Aniket Sanjiv (2018-12). Identifying Expert Reviews in the Crowd: Linking Curated and Noisy Domains. Master's Thesis.

abstract

  • Over the past decade, a vast number of online consumer reviews have established a significant presence on the Internet. These reviews play a vital role in consumer awareness of products and deeply influence the consumer's decision-making process. On the one hand, websites like Amazon and Yelp provide huge collections of crowd-sourced reviews, written by consumers who have experience using the product. Many researchers debate the credibility and bias of these reviews. These factors, coupled with the sheer number of reviews for each product, can make it tiring to form a perspective about the product. On the other hand, websites like Wirecutter and Thesweetsetup provide handmade, highly curated, detailed guides on products across various categories. Although these reviews are unbiased expert opinions, they require vigorous reporting, interviewing, and testing by various journalists, scientists, and researchers, which makes them hard to scale. Our aim is to study the possible correlations between the crowd-sourced, noisy-domain reviews and the curated reviews. We take into account meta-features of reviews, context-based textual features of reviews, and word-embedding-based features of words from reviews. In addition, we identify "good reviews", defined as those noisy-domain reviews that align with the curated ones, and use them to propose a general-purpose, extremely streamlined recommender that can provide value to the general public without any personalized inputs. This research will contribute significantly towards identifying unbiased crowd-sourced reviews that align with curated reviews across different categories of products, thereby linking the curated and noisy domains. Our research will also contribute significantly towards understanding the intricacies of good product reviews across different categories.

ETD Chair

publication date

  • December 2018