GitHub - QiYao-Wang/ipeval

IPEval

IPEval is a bilingual (Chinese and English) benchmark for evaluating Large Language Models (LLMs) on Intellectual Property (IP) agency and consultation tasks. It is the first benchmark of its kind and contains 2,657 multiple-choice questions divided across four capability dimensions: creation, application, protection, and management. More details can be found in our paper.

News

How to Evaluate on IPEval

You can take the text generated by the model directly and use the provided script (extract_answer.py) to extract the answer tokens (i.e., A, B, C, D, E). In zero-shot evaluation, however, human screening may be necessary, because models, especially those that have not undergone instruction tuning, may not follow the instructions well enough to produce well-formatted output.
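As a rough illustration of the extraction step, the sketch below pulls option letters out of a generation with a regular expression. It is not the repository's extract_answer.py; the function name `extract_answer`, the pattern, and the choice to return `None` for malformed output are all assumptions made for this example.

```python
import re
from typing import Optional

# Hypothetical pattern, not taken from extract_answer.py: it looks for
# "Answer:" or "答案：" followed by one or more option letters, optionally
# separated by commas. The trailing lookahead rejects words like "Because".
ANSWER_RE = re.compile(
    r"(?:Answer|答案)\s*[:：]\s*([A-E]+(?:\s*[,，、]\s*[A-E]+)*)(?![a-z])"
)


def extract_answer(text: str) -> Optional[str]:
    """Return the selected letters (e.g. "A" or "ACD"), or None if none found."""
    match = ANSWER_RE.search(text)
    if match is None:
        # No well-formed answer: leave this output for human screening.
        return None
    letters = re.findall(r"[A-E]", match.group(1))
    return "".join(sorted(set(letters)))
```

Outputs for which this returns `None` are the ones that would need the human screening mentioned above.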

We use the following prompt for testing:

Zero-Shot or Few-Shot prompt

For Chinese questions:

请你做为一个专利代理律师，以下是中国专利代理人资格考试的单项选择题（多项选择题），请选出四个选项中最（所有）符合题目要求的一个答案。请以下列形式回答，答案： 。

[k-shot demo, note that k is 0 in the zero-shot case]

问题：{question}
A. {A}
B. {B}
C. {C}
D. {D}
答案：

For English questions:

You are a patent attorney, the following is a multiple-choice question for the USPTO Patent Attorney Exam, select only(all) the best(none) answer for this question. The format of answer should be 'Answer: option.(option1, option2, ...)'.

[k-shot demo, note that k is 0 in the zero-shot case]

Question: {question}
A. {A}
B. {B}
C. {C}
D. {D}
Answer:
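The English template above could be filled in along the following lines. This is a minimal sketch: the instruction string is copied from the template, but the record keys (`question`, `A` to `D`, `answer`) and the function names are assumptions, since the layout of the data files is not described here.

```python
# Instruction text copied verbatim from the English template.
EN_INSTRUCTION = (
    "You are a patent attorney, the following is a multiple-choice question "
    "for the USPTO Patent Attorney Exam, select only(all) the best(none) answer "
    "for this question. The format of answer should be "
    "'Answer: option.(option1, option2, ...)'."
)


def format_question(item: dict, with_answer: bool = False) -> str:
    """Render one question; demos carry their gold answer, the test item does not."""
    lines = [f"Question: {item['question']}"]
    lines += [f"{key}. {item[key]}" for key in "ABCD"]
    # The "answer" key is a hypothetical field name for the gold label.
    lines.append(f"Answer: {item['answer']}" if with_answer else "Answer:")
    return "\n".join(lines)


def build_prompt(item: dict, demos=()) -> str:
    """Instruction, then k demos (k = 0 for zero-shot), then the test question."""
    parts = [EN_INSTRUCTION]
    parts += [format_question(demo, with_answer=True) for demo in demos]
    parts.append(format_question(item))
    return "\n\n".join(parts)
```

Calling `build_prompt(item)` gives the zero-shot prompt; passing a list of solved examples as `demos` gives the few-shot variant.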

CoT Prompt

For Chinese questions:

请你做为一个专利代理律师，以下是中国专利代理人资格考试的单项选择题（多项选择题），请选出四个选项中最（所有）符合题目要求的一个答案。请以下列形式回答，答案： 。

[k-shot demo, note that k is 0 in the zero-shot case]

让我们一步一步思考。

问题：{question}
A. {A}
B. {B}
C. {C}
D. {D}
答案：

For English questions:

You are a patent attorney, the following is a multiple-choice question for the USPTO Patent Attorney Exam, select only(all) the best(none) answer for this question. The format of answer should be 'Answer: option.(option1, option2, ...)'.

[k-shot demo, note that k is 0 in the zero-shot case]

Let's think Step by Step.

Question: {question}
A. {A}
B. {B}
C. {C}
D. {D}
Answer:
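With CoT prompting, the model may mention several options while reasoning before it settles on one. The sketch below assembles the parts in the order the CoT template shows and then reads the last "Answer:" occurrence in the generation. Taking the last match is an assumption of this example, not a documented rule of the benchmark, and the function names are hypothetical.

```python
import re
from typing import Optional

COT_TRIGGER = "Let's think Step by Step."  # copied from the template above

# Hypothetical pattern; the repository's own script may work differently.
FINAL_RE = re.compile(
    r"(?:Answer|答案)\s*[:：]\s*([A-E]+(?:\s*[,，、]\s*[A-E]+)*)(?![a-z])"
)


def build_cot_prompt(instruction: str, question_block: str, demos: str = "") -> str:
    """Join instruction, optional demos, the CoT trigger, and the question."""
    parts = [instruction, demos, COT_TRIGGER, question_block]
    return "\n\n".join(part for part in parts if part)


def final_answer(generation: str) -> Optional[str]:
    """Return the letters from the last answer statement, or None if absent."""
    matches = FINAL_RE.findall(generation)
    if not matches:
        return None  # candidate for human screening
    return "".join(sorted(set(re.findall(r"[A-E]", matches[-1]))))
```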

How to Submit

Prepare a UTF-8 encoded JSON file with the following structure and send it by email to [email protected].

[
{
   "index": "question_id",
   "answer": "model's answer with extraction"
},
{
   "index": "question_id",
   "answer": "model's answer with extraction"
},
...
]

Please name the file language_task_mode.json, where language is ch or en, task is patent, relation, or english, and mode is zero-shot, few-shot, or CoT (for example, en_patent_CoT.json), and include the model information.
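One way to produce a file in this shape is sketched below. The helper names `submission_name` and `write_submission` are hypothetical, and passing the answers as a mapping from question id to extracted answer is an assumption of this example; only the JSON keys, the UTF-8 encoding, and the naming pattern come from the instructions above.

```python
import json
from pathlib import Path

# Allowed values as listed in the naming instructions.
ALLOWED = {
    "language": {"ch", "en"},
    "task": {"patent", "relation", "english"},
    "mode": {"zero-shot", "few-shot", "CoT"},
}


def submission_name(language: str, task: str, mode: str) -> str:
    """Build the file name language_task_mode.json, rejecting unknown values."""
    for field, value in (("language", language), ("task", task), ("mode", mode)):
        if value not in ALLOWED[field]:
            raise ValueError(f"unexpected {field}: {value!r}")
    return f"{language}_{task}_{mode}.json"


def write_submission(answers: dict, language: str, task: str, mode: str,
                     out_dir: str = ".") -> Path:
    """Write [{"index": ..., "answer": ...}, ...] as UTF-8 JSON and return the path."""
    records = [{"index": qid, "answer": ans} for qid, ans in answers.items()]
    path = Path(out_dir) / submission_name(language, task, mode)
    # ensure_ascii=False keeps any Chinese text readable in the file.
    path.write_text(json.dumps(records, ensure_ascii=False, indent=2),
                    encoding="utf-8")
    return path
```

For instance, `write_submission({"q1": "A"}, "ch", "patent", "zero-shot")` would create `ch_patent_zero-shot.json` in the current directory.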

Citation

@article{wang2024ipeval,
  title={IPEval: A Bilingual Intellectual Property Agency Consultation Evaluation Benchmark for Large Language Models},
  author={Wang, Qiyao and Huang, Jianguo and Lu, Shule and Lin, Yuan and Xu, Kan and Yang, Liang and Lin, Hongfei},
  journal={arXiv preprint arXiv:2406.12386},
  year={2024}
}

Acknowledgement

Thanks to DUTIR for their support of this work.

Thanks to E-Eval for providing the README.md file style.

