🌐 Website • 🤗 Hugging Face • 📃 Paper
IPEval is a bilingual Intellectual Property (IP) agency consultation evaluation benchmark designed to assess the capabilities of Large Language Models (LLMs) in the intellectual property domain. It is the first benchmark of its kind and comprises 2,657 multiple-choice questions spanning four major capability dimensions: creation, application, protection, and management. More details can be found in our paper.
- Data
- How to Evaluate on IPEval
- How to submit
- Citation
- Acknowledgement
Run the model to obtain its generated text, then use the provided script (extract_answer.py) to extract the answer tokens (i.e., A, B, C, D, E). In zero-shot evaluation, manual screening may still be necessary, because models, especially those that have not undergone instruction tuning, may not follow the instructions closely enough to produce well-formatted outputs.
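As a rough illustration of the extraction step, the sketch below pulls option letters out of a model response. It is not the repository's extract_answer.py; the function name `extract_options` and the regular expressions are assumptions made for this example, based only on the answer formats requested in the prompts ("Answer:" / "答案:").

```python
import re

# Hypothetical helper for illustration; the real logic lives in extract_answer.py.
# Assumes the model marks its answer with "Answer:" or "答案:" (ASCII or full-width colon).
_MARKER = re.compile(r"(?:Answer|答案)\s*[::]", re.IGNORECASE)
# A standalone capital letter A-E, not embedded in an English word.
_OPTION = re.compile(r"(?<![A-Za-z])[A-E](?![A-Za-z])")


def extract_options(generated_text: str) -> str:
    """Return the sorted option letters after the last answer marker, e.g. 'B' or 'AC'.

    Returns an empty string when no marker or no option letter is found,
    which signals that the response needs human screening.
    """
    parts = _MARKER.split(generated_text)
    if len(parts) < 2:
        return ""  # no answer marker at all
    tail = parts[-1].strip()
    if not tail:
        return ""
    first_line = tail.splitlines()[0]  # only read the line holding the answer
    letters = _OPTION.findall(first_line)
    return "".join(sorted(set(letters)))


if __name__ == "__main__":
    for response in ["Answer: B.", "Reasoning... Answer: A, C", "答案:D", "I am not sure."]:
        print(repr(response), "->", repr(extract_options(response)))
```

Responses that come back as an empty string are the ones most likely to need the manual screening mentioned above.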
We use the following prompts for testing:
For Chinese questions:
请你做为一个专利代理律师,以下是中国专利代理人资格考试的单项选择题(多项选择题),请选出四个选项中最(所有)符合题目要求的一个答案。请以下列形式回答,答案: 。
[k-shot demo, note that k is 0 in the zero-shot case]
问题:{question}
A. {A}
B. {B}
C. {C}
D. {D}
答案:
For English questions:
You are a patent attorney, the following is a multiple-choice question for the USPTO Patent Attorney Exam, select only(all) the best(none) answer for this question. The format of answer should be 'Answer: option.(option1, option2, ...)'.
[k-shot demo, note that k is 0 in the zero-shot case]
Question: {question}
A. {A}
B. {B}
C. {C}
D. {D}
Answer:
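The English template above can be filled in programmatically. The following is a minimal sketch, assuming each question is available as a question string plus a mapping from option letter to option text, and that few-shot demonstrations are passed in as already formatted strings. The names `EN_INSTRUCTION` and `build_prompt` are hypothetical and not part of the IPEval code.

```python
from typing import Mapping, Optional, Sequence

# Instruction line copied verbatim from the English prompt in this README.
EN_INSTRUCTION = (
    "You are a patent attorney, the following is a multiple-choice question for "
    "the USPTO Patent Attorney Exam, select only(all) the best(none) answer for "
    "this question. The format of answer should be "
    "'Answer: option.(option1, option2, ...)'."
)


def build_prompt(
    question: str,
    options: Mapping[str, str],
    demos: Optional[Sequence[str]] = None,
) -> str:
    """Assemble a zero-shot (no demos) or few-shot (k demos) prompt.

    Assumption: each demo is a fully formatted example ending with its answer.
    """
    lines = [EN_INSTRUCTION]
    lines.extend(demos or [])  # k = 0 in the zero-shot case
    lines.append(f"Question: {question}")
    for letter in "ABCD":
        lines.append(f"{letter}. {options[letter]}")
    lines.append("Answer:")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder question text, not taken from the benchmark.
    print(build_prompt("Example question?", {"A": "a", "B": "b", "C": "c", "D": "d"}))
```

The Chinese template can be handled the same way by swapping the instruction line and the "问题:" / "答案:" labels.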
For Chinese questions with chain-of-thought (CoT) prompting:
请你做为一个专利代理律师,以下是中国专利代理人资格考试的单项选择题(多项选择题),请选出四个选项中最(所有)符合题目要求的一个答案。请以下列形式回答,答案: 。
[k-shot demo, note that k is 0 in the zero-shot case]
让我们一步一步思考。
问题:{question}
A. {A}
B. {B}
C. {C}
D. {D}
答案:
For English questions with chain-of-thought (CoT) prompting:
You are a patent attorney, the following is a multiple-choice question for the USPTO Patent Attorney Exam, select only(all) the best(none) answer for this question. The format of answer should be 'Answer: option.(option1, option2, ...)'.
[k-shot demo, note that k is 0 in the zero-shot case]
Let's think Step by Step.
Question: {question}
A. {A}
B. {B}
C. {C}
D. {D}
Answer:
Prepare a UTF-8 encoded JSON file with the following structure and send it by email to [email protected].
[
  {
    "index": "question_id",
    "answer": "model's answer with extraction"
  },
  {
    "index": "question_id",
    "answer": "model's answer with extraction"
  },
  ...
]
Please name the file language_task_mode.json, where language is ch or en, task is patent, relation, or english, and mode is zero-shot, few-shot, or CoT, and include the model information.
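The submission format and naming rule can be sketched as follows. This is only an illustration under stated assumptions: the function names `submission_filename` and `write_submission` are hypothetical, the question ids are written out exactly as given, and the allowed values are taken from the naming rule above.

```python
import json
from pathlib import Path
from typing import Mapping

# Allowed values taken from the file-naming rule in this README.
LANGUAGES = {"ch", "en"}
TASKS = {"patent", "relation", "english"}
MODES = {"zero-shot", "few-shot", "CoT"}


def submission_filename(language: str, task: str, mode: str) -> str:
    """Build '<language>_<task>_<mode>.json', rejecting values outside the rule."""
    if language not in LANGUAGES or task not in TASKS or mode not in MODES:
        raise ValueError(f"unexpected value in {(language, task, mode)}")
    return f"{language}_{task}_{mode}.json"


def write_submission(
    answers: Mapping[str, str],
    language: str,
    task: str,
    mode: str,
    out_dir: str = ".",
) -> Path:
    """Write extracted answers as a UTF-8 JSON list of {"index", "answer"} records."""
    records = [{"index": qid, "answer": ans} for qid, ans in answers.items()]
    path = Path(out_dir) / submission_filename(language, task, mode)
    # ensure_ascii=False keeps any Chinese text readable in the output file.
    path.write_text(json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Placeholder ids and answers, not real benchmark data.
    out = write_submission({"q1": "A", "q2": "BD"}, "en", "patent", "zero-shot")
    print("wrote", out)
```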
@article{wang2024ipeval,
title={IPEval: A Bilingual Intellectual Property Agency Consultation Evaluation Benchmark for Large Language Models},
author={Wang, Qiyao and Huang, Jianguo and Lu, Shule and Lin, Yuan and Xu, Kan and Yang, Liang and Lin, Hongfei},
journal={arXiv preprint arXiv:2406.12386},
year={2024}
}
Thanks to DUTIR for their support of this work.
Thanks to E-Eval for providing the README.md file style.