BiasFinder: Metamorphic Test Generation to Uncover Bias for Sentiment Analysis Systems

Overview

Artificial intelligence systems, such as Sentiment Analysis (SA) systems, typically learn from large amounts of data that may reflect human bias. Consequently, such systems may exhibit unintended demographic bias against specific characteristics (e.g., gender, occupation, country-of-origin, etc.). Such bias manifests in an SA system when it predicts different sentiments for similar texts that differ only in the characteristic of individuals described. To automatically uncover bias in SA systems, this paper presents BiasFinder, an approach that can discover biased predictions in SA systems via metamorphic testing. A key feature of BiasFinder is the automatic curation of suitable templates from any given text inputs, using various Natural Language Processing (NLP) techniques to identify words that describe demographic characteristics. Next, BiasFinder generates new texts from these templates by mutating words associated with a class of a characteristic (e.g., gender-specific words such as female names, 'she', 'her'). These texts are then used to tease out bias in an SA system. BiasFinder identifies a bias-uncovering test case (BTC) when an SA system predicts different sentiments for texts that differ only in words associated with a different class (e.g., male vs. female) of a target characteristic (e.g., gender). We evaluate BiasFinder on 10 SA systems and 2 large scale datasets, and the results show that BiasFinder can create more BTCs than two popular baselines. We also conduct an annotation study and find that human annotators consistently think that test cases generated by BiasFinder are more fluent than the two baselines.

Requirements

To fine-tune the SA systems, we use the HuggingFace library, which provides many pre-trained language models, including BERT, RoBERTa, and XLNet.

For the NLP tasks, please install these libraries:

  • spacy (need en_core_web_lg)
  • pandas
  • numpy
  • scikit-learn
  • nltk
  • neuralcoref
  • fastNLP
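
Before running the pipeline, it can help to confirm that these packages are importable. The snippet below is a minimal sketch and is not part of the repository: the helper name `missing_modules` is hypothetical, and the list assumes the packages' usual import names (for example, scikit-learn is imported as `sklearn`, and the spaCy model `en_core_web_lg` is installed as a Python package).

```python
import importlib.util

# Usual import names for the libraries listed above (assumption: default installs).
REQUIRED = [
    "spacy", "pandas", "numpy", "sklearn", "nltk",
    "neuralcoref", "fastNLP", "en_core_web_lg",
]


def missing_modules(names):
    """Return the names that cannot be located on the current Python path."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


absent = missing_modules(REQUIRED)
print("Missing:", ", ".join(absent) if absent else "none")
```

Anything reported as missing can then be installed before generating mutants.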

For occupation bias, you need StanfordCoreNLP and several libraries:

To prepare the data from genderComputer, please install these libraries:

  • python-nameparser
  • unidecode

Tip: you may use Docker to set up your coding environment faster. https://hub.docker.com/r/pytorch/pytorch/tags provides several versions of PyTorch containers. Please pull the PyTorch container tagged with version 1.9 using this command:

docker pull pytorch/pytorch:1.9.0-cuda10.2-cudnn7-devel

Setup Dataset and BERT fine-tuning model

1) Prepare the dataset

dataset: description

  • asset/imdb/: We use the IMDB movie review dataset, downloaded from Google Drive, as proposed by Zhang et al. (2015).
  • asset/gender_associated_word/: Contains pre-determined values for gender-associated words.
  • asset/gender_computer/: Contains a notebook, asset/gender_computer/genderComputer/prepare_male_female_names.ipynb, that prepares the names for the BiasFinder experiment.
  • asset/predefined_occupation_list/neutral-occupation.csv/: Contains pre-determined words for neutral occupations.

2) Fine-Tune SA systems

Run these commands inside the codes/fine-tuning/ folder to fine-tune the SA models.

bash fine-tune-imdb.sh
bash fine-tune-twitter-s140.sh

Then check the test accuracy of the fine-tuned models:

bash test-imdb.sh
bash test-twitter-s140.sh

Check the accuracy in codes/evaluation/Model-Performance.ipynb

Mutant Generation

Our framework, BiasFinder, can be instantiated to identify different kinds of bias. In this work, we show how BiasFinder can be instantiated to uncover bias in three different demographic characteristics: gender, occupation, and country-of-origin.

BiasFinder automatically identifies and curates suitable texts in a large corpus of reviews, and transforms these texts into templates. Each template can be used to produce a large number of mutant texts by filling in placeholders with concrete values associated with a class (e.g., male vs. female) of a given demographic characteristic (e.g., gender) (see Sections III and IV). Using these mutant texts, BiasFinder then runs the SA system under test, checking whether it predicts the same sentiment for two mutants associated with different classes (e.g., male vs. female) of the given characteristic (e.g., gender). A pair of such mutants is related through a metamorphic relation: a fair SA system should predict the same sentiment for both (see Sections V and VI).
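
To make the template step concrete, here is a minimal sketch of filling placeholders with class-specific values. It is an illustration only, not the repository's implementation: the `{name}` and `{pronoun}` placeholder syntax, the example template, the value lists, and the function `generate_mutants` are all hypothetical.

```python
from itertools import product

# Hypothetical template; the placeholder syntax is illustrative only.
TEMPLATE = "{name} gave a great performance, and I enjoyed {pronoun} acting."

# Illustrative values for each class of the gender characteristic.
VALUES = {
    "male": {"name": ["John", "David"], "pronoun": ["his"]},
    "female": {"name": ["Mary", "Sarah"], "pronoun": ["her"]},
}


def generate_mutants(template, values):
    """Fill every placeholder combination per class; return (class, text) pairs."""
    mutants = []
    for cls, slots in values.items():
        keys = sorted(slots)
        for combo in product(*(slots[key] for key in keys)):
            filled = template.format(**dict(zip(keys, combo)))
            mutants.append((cls, filled))
    return mutants


for cls, text in generate_mutants(TEMPLATE, VALUES):
    print(cls, "|", text)
```

Every mutant produced from the same template differs only in the words tied to the class, which is what allows a fair SA system to be expected to give them all the same sentiment.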

1) Gender Bias

Run this command inside the codes/gender/ folder

bash biasfinder-generate-mutant.sh

Troubleshooting:

  • If you face a problem with neuralcoref, please build the library from the source instead of installing using pip. Check here.

  • If you encounter ModuleNotFoundError: No module named 'en_core_web_lg', run the following commands:

python -m spacy download en
python -m spacy download en_core_web_lg

This script generates mutant texts for gender and saves them in the folder data/biasfinder/gender/.

2) Occupation Bias

Run this command inside the codes/occupation/ folder

python main.py

This script generates mutant texts for occupation and saves them in the folder data/biasfinder/occupation/. Important note: occupation bias needs StanfordCoreNLP to detect occupation terms in the text, so please make sure to serve StanfordCoreNLP as an API (see the Stackoverflow Guide to serve StanfordCoreNLP as an API).
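
Once the server is running, a quick request can confirm that it answers before `main.py` is launched. The sketch below is an assumption-laden illustration rather than the repository's client code: it assumes the server listens on `http://localhost:9000` (the CoreNLP server's default port; your setup may differ), the annotator list is a guess at what NER needs, and the helper names `corenlp_url` and `annotate` are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def corenlp_url(base_url="http://localhost:9000",
                annotators="tokenize,ssplit,pos,lemma,ner"):
    """Build the request URL; the server reads options from the 'properties' query."""
    props = json.dumps({"annotators": annotators, "outputFormat": "json"})
    return base_url + "/?" + urllib.parse.urlencode({"properties": props})


def annotate(text, base_url="http://localhost:9000", timeout=30):
    """POST raw text to the server and return its parsed JSON response.

    Raises OSError (for example urllib.error.URLError) if the server is unreachable.
    """
    request = urllib.request.Request(corenlp_url(base_url), data=text.encode("utf-8"))
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `annotate("The doctor said the film was wonderful.")` should return a dictionary with a `sentences` list whose tokens carry an `ner` field; an `OSError` instead suggests the server is not reachable at the assumed address.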

3) Country-of-origin Bias

Run this command inside the codes/country/ folder

bash generate-country-mutant.sh

This script generates mutant texts for country-of-origin and saves them in the folder data/biasfinder/country/.

Predict The Mutant Texts using Fine-tuned BERT

1) IMDB Experiments

Run this command inside the codes/fine-tuning/ folder

bash predict-imdb.sh

This script produces the predictions for the mutant texts.

2) Twitter Experiments

Run this command inside the codes/fine-tuning/ folder

bash predict-twitter-s140.sh

This script produces the predictions for the mutant texts.

Measuring the Bias Uncovering Test Case (BTC)

Mutants of different classes that are produced from the same template are expected to have the same sentiment. Therefore, if the SA system predicts different sentiments for two mutants of different classes, the pair is evidence of a biased prediction. Such pairs of mutants are output as bias-uncovering test cases (BTCs). A BTC is thus a pair of mutants from two different classes (e.g., male and female for gender bias), together with their predictions, for which the SA system produces different predictions. Example of a BTC for gender bias:

<(male, prediction), (female, prediction)>

<("He is angry", "positive"), ("She is angry", "negative")>
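
The pairing rule can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the logic of the evaluation notebooks: the record layout `(template_id, class, text, predicted_sentiment)`, the sample predictions, and the function `find_btcs` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Illustrative records: (template_id, class, text, predicted_sentiment).
PREDICTIONS = [
    (0, "male", "He is angry", "positive"),
    (0, "female", "She is angry", "negative"),
    (1, "male", "He loved the film", "positive"),
    (1, "female", "She loved the film", "positive"),
]


def find_btcs(predictions):
    """Return mutant pairs from the same template whose classes and predictions differ."""
    by_template = defaultdict(list)
    for template_id, cls, text, label in predictions:
        by_template[template_id].append((cls, text, label))

    btcs = []
    for mutants in by_template.values():
        for (cls_a, text_a, label_a), (cls_b, text_b, label_b) in combinations(mutants, 2):
            # A BTC needs different classes AND different predicted sentiments.
            if cls_a != cls_b and label_a != label_b:
                btcs.append(((text_a, label_a), (text_b, label_b)))
    return btcs


for btc in find_btcs(PREDICTIONS):
    print(btc)
```

With the sample data, only the pair from template 0 qualifies; the mutants from template 1 receive the same prediction and are therefore not a BTC.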

1) Gender Bias

Notebook evaluation/BTC-Gender.ipynb contains the BTC calculation for gender bias targeting mutant texts.

2) Occupation Bias

Notebook evaluation/BTC-Occupation.ipynb contains the BTC calculation for occupation bias targeting mutant texts.

3) Country-of-origin Bias

Notebook evaluation/BTC-Country.ipynb contains the BTC calculation for country-of-origin bias targeting mutant texts.