


This model is a fine-tuned version of BioLinkBERT-large on the GLUE MNLI and SNLI datasets.

The results are:

| Model | Dataset | Acc (%) |
|---|---|---|
| Roberta-large-mnli | MNLI dev mm | 90.12 |
| | MNLI dev m | 90.59 |
| | SNLI test | 88.25 |
| BioLinkBERT-large | MNLI dev mm | 33.56 |
| | MNLI dev m | 33.18 |
| | SNLI test | 32.66 |
| BioLinkBERT-large-mnli-snli | MNLI dev mm | 85.75 |
| | MNLI dev m | 85.30 |
| | SNLI test | 89.82 |

The label mapping is "0": "entailment", "1": "neutral", "2": "contradiction".
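
As an illustration of how this mapping is applied, the sketch below turns a vector of three class logits into a label name. The `ID2LABEL` dictionary mirrors the mapping above (with integer keys); the helper names `softmax` and `predict_label` and the example logits are made up for illustration and are not taken from the model.

```python
import math

# Label ids as listed in the model card, with integer keys for indexing.
ID2LABEL = {0: "entailment", 1: "neutral", 2: "contradiction"}


def softmax(logits):
    """Convert raw logits into probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits):
    """Return the (label, probability) pair for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Hypothetical logits for a single premise/hypothesis pair.
label, prob = predict_label([0.2, -1.3, 2.9])
print(label, round(prob, 3))
```

In practice the logits would come from the fine-tuned classifier's output head; only the index-to-name lookup is specific to this model card.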

For finetuning BioLinkBERT-large on

Training procedure

This model checkpoint was created with the following command:

CUDA_VISIBLE_DEVICES=0,1,2 python -m torch.distributed.launch \
    --nproc_per_node 3 \
    --model_name_or_path michiyasunaga/BioLinkBERT-large --task_name mnli --add_snli \
    --do_train --max_seq_length 512 --fp16 --per_device_train_batch_size 16 --gradient_accumulation_steps 2 \
    --learning_rate 3e-5 --warmup_ratio 0.5 --num_train_epochs 10 \
    --output_dir ./biolinkbert_mnli_snli

This creates a folder biolinkbert_mnli_snli that contains the checkpoints.
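
The flags in the command imply an effective batch size and a warmup length that are easy to misread on the command line. The sketch below works them out. `num_examples` is a placeholder (the size of the combined training set is not stated here), and the step counts assume one optimizer step per accumulated batch, which may differ slightly from what the training script actually computes.

```python
import math


def effective_batch_size(num_gpus, per_device_batch, grad_accum):
    """Examples consumed per optimizer step across all processes."""
    return num_gpus * per_device_batch * grad_accum


def schedule_summary(num_examples, num_gpus=3, per_device_batch=16,
                     grad_accum=2, epochs=10, warmup_ratio=0.5):
    """Rough step counts for the flags used in the training command."""
    batch = effective_batch_size(num_gpus, per_device_batch, grad_accum)
    steps_per_epoch = math.ceil(num_examples / batch)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return {
        "effective_batch_size": batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


# 3 GPUs x 16 examples per device x 2 accumulation steps.
print(effective_batch_size(3, 16, 2))

# Placeholder dataset size, chosen only to keep the arithmetic readable.
print(schedule_summary(num_examples=9600))
```

With `--warmup_ratio 0.5`, the learning rate is still ramping up for the first half of training, so the peak value of 3e-5 is reached only at the midpoint.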

All checkpoints are then evaluated on MNLI dev (mismatched and matched) and SNLI test with:

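# $name and $root must be set beforehand; $out is the per-checkpoint
# output directory (its definition is not shown here).
mkdir -p "$PWD/eval/$name"
for run in "$root"/checkpoint-*; do
    # Extract the step number from the checkpoint directory name.
    step=$( echo "$run" | rg "checkpoint-(?P<step>\d+)" -or '$step')
    echo "eval of $step ---- save to $out";

    CUDA_VISIBLE_DEVICES=0 python \
        --model_name_or_path "$run" --task_name mnli --add_snli \
        --do_eval --max_seq_length 512 --fp16 --report_to none \
        --per_device_eval_batch_size 8 --output_dir "$out"
done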

This creates a folder eval/biolinkbert_mnli_snli that contains the evaluation results for MNLI dev and SNLI test.
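
The step extraction done with `rg` in the loop can also be sketched in Python, which may be convenient when checkpoints need to be visited in numeric rather than lexicographic order. The function names `checkpoint_step` and `sorted_checkpoints` are hypothetical helpers, not part of the evaluation script; only the `checkpoint-<step>` naming pattern comes from the command above.

```python
import re
from pathlib import Path

# Same named group as the ripgrep pattern used in the shell loop.
CHECKPOINT_RE = re.compile(r"checkpoint-(?P<step>\d+)")


def checkpoint_step(path):
    """Return the integer step encoded in a checkpoint directory name, or None."""
    match = CHECKPOINT_RE.search(Path(path).name)
    return int(match.group("step")) if match else None


def sorted_checkpoints(root):
    """List checkpoint directories under `root`, ordered by step number."""
    runs = [p for p in Path(root).glob("checkpoint-*")
            if checkpoint_step(p) is not None]
    return sorted(runs, key=checkpoint_step)


print(checkpoint_step("biolinkbert_mnli_snli/checkpoint-1500"))
```

Sorting by the parsed integer avoids the case where `checkpoint-10000` is listed before `checkpoint-2000` by a plain string sort.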

The best checkpoint is then selected according to accuracy on MNLI dev matched (mnli-m):

import json
import os
from pathlib import Path

import pandas as pd
from rich.console import Console
from rich.table import Table

pwd = Path(__file__).parent.resolve()
name = "eval/biolinkbert_mnli_snli"
results = []
for evalckpt in os.listdir(pwd / name):
    # Evaluation folder names are expected to look like "<prefix>_<step>".
    step = evalckpt.split("_")[1]
    with open(pwd / name / evalckpt / "all_results.json") as f:
        data = json.load(f)
    # Store accuracies as percentages.
    results.append([
        int(step),
        data["mnli-mm_eval_accuracy"] * 100,
        data["mnli_eval_accuracy"] * 100,
        data["snli-test_eval_accuracy"] * 100,
    ])
results = pd.DataFrame(results, columns=["step", "mnli-mm", "mnli-m", "snli"]).sort_values(by=["step"])
console = Console()
table = Table(name)
table.add_row(results.to_string(float_format="{:.3f}".format))
console.print(table)
# Row with the highest MNLI matched accuracy.
best = results["mnli-m"].idxmax()
console.print(results.loc[best])
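
The selection rule itself is small enough to sketch without pandas or rich. The version below assumes the per-checkpoint metrics have already been loaded into a dictionary keyed by step; the metric keys match those read from `all_results.json` above, while `best_checkpoint` is a hypothetical helper and the numbers are placeholders for illustration, not real evaluation results.

```python
def best_checkpoint(results, metric="mnli_eval_accuracy"):
    """Return the step whose metrics dict has the highest value for `metric`.

    `results` maps a step number to the metrics loaded for that checkpoint.
    On ties, the first step in iteration order wins.
    """
    return max(results, key=lambda step: results[step][metric])


# Placeholder values for illustration only; not real evaluation results.
example = {
    1000: {"mnli-mm_eval_accuracy": 0.70, "mnli_eval_accuracy": 0.71,
           "snli-test_eval_accuracy": 0.75},
    2000: {"mnli-mm_eval_accuracy": 0.80, "mnli_eval_accuracy": 0.82,
           "snli-test_eval_accuracy": 0.85},
    3000: {"mnli-mm_eval_accuracy": 0.81, "mnli_eval_accuracy": 0.79,
           "snli-test_eval_accuracy": 0.86},
}

print(best_checkpoint(example))
```

Selecting on a single metric means the chosen checkpoint is not necessarily the best on the other two evaluation sets, as the placeholder data shows for step 3000.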

Training hyperparameters

The following hyperparameters were used during training:

Framework versions