cross-encoder-mmarco-german-distilbert-base
Model description:
This model is a cross-encoder fine-tuned on the MMARCO dataset, the machine-translated version of the MS MARCO dataset. As the base model for fine-tuning we use distilbert-base-multilingual-cased.
Model input samples are tuples in one of two formats: <query, positive_paragraph> labeled 1, or <query, negative_paragraph> labeled 0.
The model was trained for 1 epoch.
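The labeled-pair format described above can be sketched as follows; the German query and paragraph strings are made-up illustrations, not taken from the actual MMARCO training data:

```python
# Illustrative training samples in the <query, paragraph, label> format.
# Labels: 1 = positive (relevant) paragraph, 0 = negative paragraph.
train_samples = [
    # <query, positive_paragraph> assigned to 1
    ("wie hoch ist der eiffelturm", "Der Eiffelturm ist 330 Meter hoch.", 1),
    # <query, negative_paragraph> assigned to 0
    ("wie hoch ist der eiffelturm", "Paris ist die Hauptstadt von Frankreich.", 0),
]

for query, paragraph, label in train_samples:
    print(f"label={label}  query={query!r}  paragraph={paragraph!r}")
```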
Model usage
The cross-encoder model can be used like this:
```python
from sentence_transformers import CrossEncoder

model = CrossEncoder('model_name')
scores = model.predict([('Query 1', 'Paragraph 1'), ('Query 2', 'Paragraph 2')])
```
The model will predict scores for the pairs ('Query 1', 'Paragraph 1') and ('Query 2', 'Paragraph 2').
For more details on the usage of cross-encoder models, have a look at the Sentence-Transformers documentation.
Model Performance:
The model was evaluated on 2000 held-out paragraphs from the dataset.
| Accuracy | F1-Score | Precision | Recall |
|---|---|---|---|
| 89.70 | 86.82 | 86.82 | 93.50 |
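For reference, these metrics are the standard binary-classification measures over the relevance labels. A minimal sketch of how they are computed, using toy labels rather than the actual evaluation set:

```python
# Toy gold labels and thresholded cross-encoder predictions
# (illustrative values only, not the real 2000-paragraph eval data).
y_true = [1, 1, 1, 0, 0, 1]  # 1 = relevant paragraph, 0 = not relevant
y_pred = [1, 1, 0, 0, 1, 1]  # model scores thresholded to 0/1

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives

accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
```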