IPEval: A Bilingual Intellectual Property Agency Consultation Evaluation Benchmark for Large Language Models
DOI:
https://doi.org/10.70891/JAIR.2025.040011
Keywords:
large language models, benchmark, intellectual property
Abstract
With the rapid development of Large Language Models (LLMs) in vertical domains, attempts have been made to apply them to the field of intellectual property (IP). However, there is currently no evaluation benchmark designed specifically to assess the understanding, application, and reasoning abilities of LLMs in the IP domain. To address this gap, we introduce IPEval, the first capability evaluation benchmark designed for IP agency and consulting tasks. IPEval consists of 2657 multiple-choice questions, divided into four major capability dimensions: creation, application, protection, and management. These questions cover eight areas: patent rights (including inventions, utility models, and designs), trademarks, copyrights, trade secrets, integrated circuit layout design rights, geographical indications, and related laws. We designed three evaluation methods, zero-shot, 5-shot (few-shot), and Chain of Thought (CoT), and applied them to seven kinds of LLMs of varying parameter sizes whose primary language is either English or Chinese. The results indicate that the GPT series and Qwen series models perform best on the English tests, while Chinese-centric LLMs, such as the Qwen series, outperform GPT-4 on the Chinese tests. Specialized legal-domain LLMs, such as fuzi-mingcha and MoZi, still lag significantly behind general-purpose LLMs of comparable parameter sizes in IP performance. This highlights both the necessity and the substantial potential of developing more specialized LLMs with stronger IP abilities. We also analyze the models' capabilities with respect to the regional and temporal aspects of IP, emphasizing that IP-domain LLMs need to clearly understand how IP laws differ across regions and how they change over time. We hope IPEval can provide an accurate assessment of LLM capabilities in the IP domain and encourage researchers interested in IP to develop LLMs with richer IP knowledge.
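The three evaluation settings named in the abstract (zero-shot, few-shot, and CoT) can be illustrated with a minimal sketch of how a multiple-choice item might be turned into a prompt and scored. This is not the authors' implementation: the `MCQ` class, the `build_prompt` and `accuracy` helpers, the prompt wording, and the exact-match scoring rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class MCQ:
    """Hypothetical container for one multiple-choice item (not from the paper)."""
    question: str
    options: dict  # option label -> option text, e.g. {"A": "...", "B": "..."}
    answer: str    # gold option label, e.g. "B"


def _body(q: MCQ) -> str:
    # Question text followed by one line per option, in label order.
    lines = [q.question]
    lines += [f"{label}. {text}" for label, text in sorted(q.options.items())]
    return "\n".join(lines)


def build_prompt(q: MCQ, mode: str = "zero-shot", exemplars=()) -> str:
    """Assumed prompt templates for the three settings."""
    if mode == "zero-shot":
        return _body(q) + "\nAnswer:"
    if mode == "few-shot":
        # Solved exemplars come first (five of them in a 5-shot setting).
        shots = [_body(e) + f"\nAnswer: {e.answer}" for e in exemplars]
        return "\n\n".join(shots + [_body(q) + "\nAnswer:"])
    if mode == "cot":
        # A common CoT trigger phrase; the paper's wording may differ.
        return _body(q) + "\nLet's think step by step."
    raise ValueError(f"unknown mode: {mode}")


def accuracy(predictions, questions) -> float:
    # Assumed scoring rule: exact match on the option label, case-insensitive.
    correct = sum(
        p.strip().upper() == q.answer for p, q in zip(predictions, questions)
    )
    return correct / len(questions)
```

In this sketch the few-shot prompt differs from the zero-shot one only by the solved exemplars prepended to it, which keeps the two settings directly comparable on the same items.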

License
Copyright (c) 2025 Journal of Artificial Intelligence Research

This work is licensed under a Creative Commons Attribution 4.0 International License.