Stress Detection from Social Media Articles: New Dataset Benchmark and Analytical Study

This repository contains the datasets for classifying stress in text-based social media articles from Reddit and Twitter. The datasets were created for the paper "Stress Detection from Social Media Articles: New Dataset Benchmark and Analytical Study".

Status - Accepted for Oral Presentation at IEEE WCCI 2022, IJCNN track.

Overview of the datasets

We construct four high-quality datasets from text articles on Reddit and Twitter. Each article carries a class label of '0' or '1', where '0' marks a Stress Negative article and '1' marks a Stress Positive article. Annotation was performed with an automated DNN-based strategy described in the aforementioned study.

Each dataset is described below:

  • Reddit Title: Consists of article titles collected from both stress-related and non-stress-related subreddits on Reddit.
  • Reddit Combi: Consists of the title and body text of each article combined into a single text sequence, collected from both stress-related and non-stress-related subreddits on Reddit.
  • Twitter Full: Consists of stress-related and non-stress-related tweets collected from Twitter.
  • Twitter Non-Advert: Consists of the denoised version of the Twitter Full dataset.

For further details on the datasets, refer to the study.
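
As a quick sanity check on the label convention described above, the class balance of a dataset can be counted with a few lines of Python. This is a minimal sketch, not part of the original study: it assumes the data is stored as CSV with a header row, and the column names `text` and `label` are hypothetical, so check the actual files before use. The sample rows are placeholders and are not taken from the datasets.

```python
import csv
import io
from collections import Counter

# Hypothetical column names (assumption); check the real file header.
TEXT_COLUMN = "text"
LABEL_COLUMN = "label"

# Label convention stated in this README.
LABEL_NAMES = {"0": "Stress Negative", "1": "Stress Positive"}


def count_labels(csv_file):
    """Count how many rows carry each class label in an open CSV file."""
    reader = csv.DictReader(csv_file)
    counts = Counter()
    for row in reader:
        label = row[LABEL_COLUMN].strip()
        if label not in LABEL_NAMES:
            raise ValueError(f"unexpected label: {label!r}")
        counts[LABEL_NAMES[label]] += 1
    return counts


# Placeholder rows for illustration only; not real dataset content.
sample = io.StringIO(
    "text,label\n"
    "placeholder article one,1\n"
    "placeholder article two,0\n"
    "placeholder article three,1\n"
)
counts = count_labels(sample)
print(dict(counts))
```

To run this on a real file, open it with `open(path, newline="", encoding="utf-8")` and pass the file object to `count_labels`; adjust `LABEL_COLUMN` if the header differs.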

Citations (To be updated)

@INPROCEEDINGS{rastogi2022stress,
  author={Aryan Rastogi and Liu Qian and Erik Cambria},
  booktitle={2022 IEEE World Congress on Computational Intelligence (WCCI)},
  title={Stress Detection from Social Media Articles: New Dataset Benchmark and Analytical Study},
  year={2022},
  volume={},
  number={},
  pages={},
  doi={},
  ISSN={},
  month={}
}