Clinical Camel

Model Description

Clinical Camel is an open large language model (LLM) fine-tuned from the LLaMA-2 70B base model using QLoRA. It is tailored to medical and clinical research and can process and generate content relevant to those domains.

Review our pre-print for more details: Clinical Camel - Pre-print

Performance

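Clinical Camel demonstrates competitive performance on medical benchmarks. In the five-shot results below, it scores higher than GPT3.5 on every listed dataset and higher than GPT4 on PubMedQA, while trailing GPT4 and Med-PaLM 2 on the remaining datasets.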

Table: Five-Shot Performance of Clinical Camel-70B (C70), GPT3.5, GPT4, and Med-PaLM 2 on Various Medical Datasets

| Dataset | ClinicalCamel-70B | GPT3.5 | GPT4 | Med-PaLM 2 |
| --- | --- | --- | --- | --- |
| MMLU Anatomy | 65.2 | 60.7 | 80.0 | 77.8 |
| MMLU Clinical Knowledge | 72.8 | 68.7 | 86.4 | 88.3 |
| MMLU College Biology | 81.2 | 72.9 | 93.8 | 94.4 |
| MMLU College Medicine | 68.2 | 63.6 | 76.3 | 80.9 |
| MMLU Medical Genetics | 69.0 | 68.0 | 92.0 | 90.0 |
| MMLU Professional Medicine | 75.0 | 69.8 | 93.8 | 95.2 |
| MedMCQA | 54.2 | 51.0 | 72.4 | 71.3 |
| MedQA (USMLE) | 60.7 | 53.6 | 81.4 | 79.7 |
| PubMedQA | 77.9 | 60.2 | 74.4 | 79.2 |
| USMLE Sample Exam | 64.3 | 58.5 | 86.6 | - |
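
To make the comparison in the table easier to check, the following is a minimal sketch that computes the average score gap between Clinical Camel and each other model, and lists the datasets where Clinical Camel scores higher. The scores are copied from the table; the names `SCORES`, `MODELS`, `mean_gap`, and `wins` are illustrative helpers introduced here, not part of any released code.

```python
# Scores copied from the table above, in the order given by MODELS.
# None marks the missing Med-PaLM 2 entry for the USMLE Sample Exam.
MODELS = ("ClinicalCamel-70B", "GPT3.5", "GPT4", "Med-PaLM 2")

SCORES = {
    "MMLU Anatomy": (65.2, 60.7, 80.0, 77.8),
    "MMLU Clinical Knowledge": (72.8, 68.7, 86.4, 88.3),
    "MMLU College Biology": (81.2, 72.9, 93.8, 94.4),
    "MMLU College Medicine": (68.2, 63.6, 76.3, 80.9),
    "MMLU Medical Genetics": (69.0, 68.0, 92.0, 90.0),
    "MMLU Professional Medicine": (75.0, 69.8, 93.8, 95.2),
    "MedMCQA": (54.2, 51.0, 72.4, 71.3),
    "MedQA (USMLE)": (60.7, 53.6, 81.4, 79.7),
    "PubMedQA": (77.9, 60.2, 74.4, 79.2),
    "USMLE Sample Exam": (64.3, 58.5, 86.6, None),
}


def mean_gap(baseline):
    """Return (mean of Clinical Camel minus baseline, number of datasets compared).

    Datasets where the baseline has no reported score are skipped.
    """
    idx = MODELS.index(baseline)
    gaps = [row[0] - row[idx] for row in SCORES.values() if row[idx] is not None]
    return sum(gaps) / len(gaps), len(gaps)


def wins(baseline):
    """Return the datasets where Clinical Camel scores strictly higher than baseline."""
    idx = MODELS.index(baseline)
    return [
        name
        for name, row in SCORES.items()
        if row[idx] is not None and row[0] > row[idx]
    ]


if __name__ == "__main__":
    for model in MODELS[1:]:
        gap, n = mean_gap(model)
        print(f"vs {model}: mean gap {gap:+.2f} points over {n} datasets, "
              f"higher on {len(wins(model))}")
```

Against GPT3.5 the mean gap works out to about +6.15 points across all ten datasets. This is an unweighted average over datasets of very different sizes, so it is a rough summary rather than a pooled accuracy.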

Evaluation Datasets:

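Clinical Camel was benchmarked on the datasets listed in the table above: the MMLU medical subsets (Anatomy, Clinical Knowledge, College Biology, College Medicine, Medical Genetics, and Professional Medicine), MedMCQA, MedQA (USMLE), PubMedQA, and the USMLE Sample Exam.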

Evaluation Reproduction:

To reproduce the evaluations with lm-evaluation-harness, see the 'TaskFiles' folder.