
One paper has been accepted by TETCI.

Our paper entitled "Benchmarking Medical LLMs on Anesthesiology: A Comprehensive Dataset in Chinese" has been accepted by TETCI.

 

Title: Benchmarking Medical LLMs on Anesthesiology: A Comprehensive Dataset in Chinese  

Authors: Bohao Zhou, Yibing Zhan, Zhonghai Wang, Yanhong Li, Chong Zhang, Baosheng Yu, Liang Ding, Hua Jin, Weifeng Liu, Xiongbin Wang, and Dapeng Tao

Abstract: With the recent success of large language models (LLMs), interest in developing them for medical domains has increased. However, due to the lack of benchmark datasets, evaluating the capabilities of medical LLMs remains challenging, particularly in highly specialized fields such as anesthesiology. To address this gap, we introduce a comprehensive anesthesiology benchmark dataset in Chinese, known as the Chinese Anesthesiology Benchmark (CAB). This benchmark facilitates the evaluation of medical LLMs for anesthesiology across three crucial dimensions: knowledge, application, and safety. Specifically, the CAB provides more than 8k questions collected from examinations and books for knowledge-level evaluation; more than 2k questions collected from online anesthesia consultations and hospitals for application-level evaluation; and 136 tests from seven anesthesia medical care scenarios for safety-level evaluation. With the proposed CAB dataset, we conducted a thorough evaluation of six medical LLMs, such as Bianque-2 and HuatuoGPT-13B, and eleven general LLMs, such as Qwen-7B-Chat and GPT-4. The evaluation results revealed that there are still clear gaps between the capabilities of medical LLMs in anesthesiology and those of medical students in the field of anesthesia. We hope that the proposed CAB dataset can facilitate the development of medical LLMs for anesthesiology.

