GuardBench is a Python library for evaluating guardrail models.
It provides a common interface to 40 evaluation datasets, which are downloaded and converted into a standardized format for improved usability.
It also allows you to quickly compare results and export LaTeX tables for scientific publications.
GuardBench's benchmarking pipeline can also be leveraged on custom datasets.
GuardBench was presented at EMNLP 2024.
The related paper is available here.
You can find the list of supported datasets here. A few of them require authorization. Please read this.
If you use GuardBench to evaluate guardrail models for your scientific publications, please consider citing our work.
- 40 datasets for guardrail model evaluation.
- Automated evaluation pipeline.
- User-friendly.
- Extendable.
- Reproducible and sharable evaluation.
- Exportable evaluation reports.
GuardBench requires `python>=3.10` and can be installed with `pip`:

```bash
pip install guardbench
```
```python
from guardbench import benchmark

def moderate(
    conversations: list[list[dict[str, str]]],  # MANDATORY!
    # additional `kwargs` as needed
) -> list[float]:
    # Do moderation here: return one unsafe probability per conversation.
    # Placeholder implementation; replace it with your model's inference.
    return [0.5 for _ in conversations]

benchmark(
    moderate=moderate,  # User-defined moderation function
    model_name="My Guardrail Model",
    batch_size=32,
    datasets="all",
    # Note: you can pass additional `kwargs` for `moderate`
)
```
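For illustration, here is a minimal, self-contained `moderate` implementation: a naive keyword-based scorer rather than a real guardrail model. The keyword list and scoring heuristic are purely illustrative assumptions, and the messages are assumed to follow the usual `{"role": ..., "content": ...}` chat format.

```python
from guardbench import benchmark

# Purely illustrative keyword list; a real guardrail model would run an LLM or classifier.
UNSAFE_KEYWORDS = ("bomb", "weapon", "poison", "malware")

def moderate(conversations: list[list[dict[str, str]]]) -> list[float]:
    scores = []
    for conversation in conversations:
        # Concatenate the message contents of the conversation (assumed chat format).
        text = " ".join(message["content"].lower() for message in conversation)
        hits = sum(keyword in text for keyword in UNSAFE_KEYWORDS)
        # Map keyword hits to a pseudo-probability in [0, 1].
        scores.append(min(1.0, hits / len(UNSAFE_KEYWORDS)))
    return scores

benchmark(
    moderate=moderate,
    model_name="Keyword Baseline (toy example)",
    batch_size=32,
    datasets="all",
)
```

Any model that can produce a per-conversation unsafe probability can be wrapped in the same way, and extra inputs (e.g., a tokenizer or a decision threshold) can be forwarded to `moderate` as additional keyword arguments of `benchmark`.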
- Follow our tutorial on benchmarking Llama Guard with GuardBench.
- More examples are available in the `scripts` folder.
Browse the documentation for more details about:
- The datasets and how to obtain them.
- The data format used by GuardBench.
- How to use the `Report` class to compare models and export results as LaTeX tables (a rough sketch follows below).
- How to leverage GuardBench's benchmarking pipeline on custom datasets.
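As a purely hypothetical sketch of that comparison workflow (the `Report` constructor and `to_latex` method used below are assumptions, not the documented API; please consult the documentation for the actual interface), comparing benchmarked models and exporting a LaTeX table might look roughly like this:

```python
# Hypothetical sketch only: the names below are assumptions, not GuardBench's documented API.
from guardbench import Report

# Assumed: a Report aggregates the results produced by previous `benchmark` runs.
report = Report(model_names=["My Guardrail Model", "Llama Guard"])

# Assumed: export the comparison as a LaTeX table for a publication.
print(report.to_latex())
```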
You can find GuardBench's leaderboard here.
All results can be reproduced using the provided scripts.
If you want to submit your results, please contact us.
- Elias Bassani (European Commission - Joint Research Centre)
```bibtex
@inproceedings{guardbench,
    title = "{G}uard{B}ench: A Large-Scale Benchmark for Guardrail Models",
    author = "Bassani, Elias and
      Sanchez, Ignacio",
    editor = "Al-Onaizan, Yaser and
      Bansal, Mohit and
      Chen, Yun-Nung",
    booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.emnlp-main.1022",
    doi = "10.18653/v1/2024.emnlp-main.1022",
    pages = "18393--18409",
}
```
Would you like to see other features implemented? Please open a feature request.
GuardBench is provided as open-source software licensed under EUPL v1.2.