cross-encoder-mmarco-german-distilbert-base
Model description:
This model is a cross-encoder fine-tuned on the MMARCO dataset, the machine-translated version of the MS MARCO dataset. As the base model for fine-tuning we use distilbert-base-multilingual-cased.
Model input samples are tuples in one of two formats: <query, positive_paragraph> labeled 1, or <query, negative_paragraph> labeled 0.
The model was trained for 1 epoch.
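The labeled-pair format described above can be sketched as follows; the German query and paragraph strings are made-up illustrations, not taken from the actual MMARCO training data:

```python
# Illustrative training samples in the <query, paragraph, label> format.
# Labels: 1 = positive (relevant) paragraph, 0 = negative paragraph.
train_samples = [
    # <query, positive_paragraph> assigned to 1
    ("wie hoch ist der eiffelturm", "Der Eiffelturm ist 330 Meter hoch.", 1),
    # <query, negative_paragraph> assigned to 0
    ("wie hoch ist der eiffelturm", "Paris ist die Hauptstadt von Frankreich.", 0),
]

for query, paragraph, label in train_samples:
    print(f"label={label}  query={query!r}  paragraph={paragraph!r}")
```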
Model usage
The cross-encoder model can be used like this:
```python
from sentence_transformers import CrossEncoder

model = CrossEncoder('model_name')
scores = model.predict([('Query 1', 'Paragraph 1'), ('Query 2', 'Paragraph 2')])
```
The model will predict scores for the pairs ('Query 1', 'Paragraph 1') and ('Query 2', 'Paragraph 2').
For more details on the usage of cross-encoder models, have a look at the Sentence-Transformers documentation.
Model Performance:
The model was evaluated on 2000 held-out paragraphs from the dataset.
| Accuracy | F1-Score | Precision | Recall |
|---|---|---|---|
| 89.70 | 86.82 | 86.82 | 93.50 |
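For reference, these metrics are the standard binary-classification measures over the relevance labels. A minimal sketch of how they are computed, using toy labels rather than the actual evaluation set:

```python
# Toy gold labels and thresholded cross-encoder predictions
# (illustrative values only, not the real 2000-paragraph eval data).
y_true = [1, 1, 1, 0, 0, 1]  # 1 = relevant paragraph, 0 = not relevant
y_pred = [1, 1, 0, 0, 1, 1]  # model scores thresholded to 0/1

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives

accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
```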