Skip to content

AmenRa/GuardBench

Repository files navigation

PyPI version Documentation Status License: EUPL-1.2

GuardBench

⚡️ Introduction

GuardBench is a Python library for guardrail models evaluation. It provides a common interface to 40 evaluation datasets, which are downloaded and converted into a standardized format for improved usability. It also allows to quickly compare results and export LaTeX tables for scientific publications. GuardBench's benchmarking pipeline can also be leveraged on custom datasets.

GuardBench was featured in EMNLP 2024. The related paper is available here.

You can find the list of supported datasets here. A few of them requires authorization. Please, read this.

If you use GuardBench to evaluate guardrail models for your scientific publications, please consider citing our work.

✨ Features

🔌 Requirements

python>=3.10

💾 Installation

pip install guardbench

💡 Usage

from guardbench import benchmark

def moderate(
    conversations: list[list[dict[str, str]]],  # MANDATORY!
    # additional `kwargs` as needed
) -> list[float]:
    # do moderation
    # return list of floats (unsafe probabilities)

benchmark(
    moderate=moderate,  # User-defined moderation function
    model_name="My Guardrail Model",
    batch_size=32,
    datasets="all", 
    # Note: you can pass additional `kwargs` for `moderate`
)

📖 Examples

📚 Documentation

Browse the documentation for more details about:

🏆 Leaderboard

You can find GuardBench's leaderboard here. All results can be reproduced using the provided scripts.
If you want to submit your results, please contact us.

👨‍💻 Authors

  • Elias Bassani (European Commission - Joint Research Centre)

🎓 Citation

@inproceedings{guardbench,
    title = "{G}uard{B}ench: A Large-Scale Benchmark for Guardrail Models",
    author = "Bassani, Elias  and
      Sanchez, Ignacio",
    editor = "Al-Onaizan, Yaser  and
      Bansal, Mohit  and
      Chen, Yun-Nung",
    booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.emnlp-main.1022",
    doi = "10.18653/v1/2024.emnlp-main.1022",
    pages = "18393--18409",
}

🎁 Feature Requests

Would you like to see other features implemented? Please, open a feature request.

📄 License

GuardBench is provided as open-source software licensed under EUPL v1.2.

Releases

No releases published

Packages

No packages published