VacStance

Detecting Stance on Covid-19 Vaccine in a Polarized Media

The project introduces VacStance, a dataset of 2,000 stance-labeled sentences about the Covid-19 vaccine, extracted from 169,432 sentences in 15,750 news articles published by left-leaning and right-leaning media outlets. The leaning of each outlet follows the Media Bias/Fact Check classification. A trained BERT classifier is used to analyze aspects of argumentation, namely how the different sides of the vaccine debate represent their own and each other’s opinions, and to determine whether left-leaning and right-leaning media use framing devices and opinion attribution.

This repository contains code and data for the capstone project as well as work done after capstone submission.

It is based on a paper and code by:

Luo, Y., Card, D. and Jurafsky, D. (2020). DeSMOG: Detecting Stance in Media On Global Warming. In Findings of the Association for Computational Linguistics: EMNLP 2020. GitHub: https://github.com/yiweiluo/GWStance

Getting started

1. Create and activate a Python 3.6 environment.
2. Run pip install -r requirements.txt.
3. Re-install neuralcoref with the --no-binary option:

pip uninstall neuralcoref
pip install neuralcoref --no-binary neuralcoref

4. Download spaCy's English model: python -m spacy download en
5. Update the config.json file with your local OS variables.

Important note: Not all scripts that were run for this project are included in this repository.

Repository structure

The VacStance dataset itself will be provided on request. The dataset contains tab-separated fields for each of the following:
1. sentence: the sentence
2. annotator_0, ..., annotator_3: ratings from each of the 4 annotators for the stance of the sentence
3. disagree: the probability that the sentence expresses disagreement with the target opinion (that the Covid-19 vaccine is safe), as estimated by the Bayesian model
4. agree: the probability that the sentence expresses agreement with the target opinion (that the Covid-19 vaccine is safe)
5. neutral: the probability that the sentence is neutral to the target opinion (that the Covid-19 vaccine is safe)
6. guid: a unique ID for each sentence
7. in_held_out_test: whether the sentence was used in our held-out test set for model and baseline evaluation
The lexicons of framing devices are located in 4_analyses/lexicons.
The sequence of code to replicate the results can be found in the individual READMEs of the numbered sub-directories.
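
Once the dataset has been obtained, the tab-separated fields listed above could be read along the following lines. This is only a minimal sketch, not the project's own loading code. It assumes the file has a header row with the field names listed above, that in_held_out_test is stored as True/False or 1/0, and that taking the most probable of disagree/agree/neutral is a reasonable way to get a single label. The function name load_vacstance and the sample rows are made up for illustration.

```python
import csv
import io

# Probability columns described in the field list above.
STANCE_COLUMNS = ("disagree", "agree", "neutral")


def load_vacstance(handle):
    """Read tab-separated rows and attach the most probable stance to each.

    Assumes a header row naming the fields; annotator ratings are kept as
    raw strings because their encoding is not documented here.
    """
    rows = []
    # QUOTE_NONE keeps quotation marks inside sentences from being
    # interpreted as CSV quoting.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        probs = {name: float(row[name]) for name in STANCE_COLUMNS}
        # Hypothetical derived column: the stance with the highest probability.
        row["stance"] = max(probs, key=probs.get)
        # Assumed encoding of the held-out flag (True/False or 1/0).
        flag = row["in_held_out_test"].strip().lower()
        row["in_held_out_test"] = flag in ("true", "1")
        rows.append(row)
    return rows


# Made-up rows for illustration only; these are not from VacStance.
SAMPLE = (
    "sentence\tannotator_0\tannotator_1\tannotator_2\tannotator_3\t"
    "disagree\tagree\tneutral\tguid\tin_held_out_test\n"
    "Example sentence A.\tagree\tagree\tneutral\tagree\t"
    "0.05\t0.85\t0.10\ts1\tTrue\n"
    "Example sentence B.\tdisagree\tdisagree\tdisagree\tneutral\t"
    "0.80\t0.05\t0.15\ts2\tFalse\n"
)

rows = load_vacstance(io.StringIO(SAMPLE))
held_out = [r for r in rows if r["in_held_out_test"]]
```

With the real file, the same function would be called on an open file handle, for example open(path, encoding="utf-8", newline=""), where path points to wherever the requested dataset was saved.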

Contact: [email protected]
