Opus Tatoeba English-German
*This model was obtained by running the script convert_marian_to_pytorch.py (instructions available here). The original models were trained by Jörg Tiedemann using the MarianNMT library. See all available MarianMTModel models on the profile of the Helsinki NLP group.*
*This is the conversion of checkpoint opus-2021-02-22.zip.*
eng-deu
- source language name: English
- target language name: German
- OPUS readme: README.md
- model: transformer
- source language code: en
- target language code: de
- dataset: opus
- release date: 2021-02-22
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- download original weights: opus-2021-02-22.zip
- Training data:
  - deu-eng: Tatoeba-train (86845165)
- Validation data:
  - deu-eng: Tatoeba-dev, 284809
  - total-size-shuffled: 284809
  - devset-selected: top 5000 lines of Tatoeba-dev.src.shuffled!
- Test data:
  - newssyscomb2009.eng-deu: 502/11271
  - news-test2008.eng-deu: 2051/47427
  - newstest2009.eng-deu: 2525/62816
  - newstest2010.eng-deu: 2489/61511
  - newstest2011.eng-deu: 3003/72981
  - newstest2012.eng-deu: 3003/72886
  - newstest2013.eng-deu: 3000/63737
  - newstest2014-deen.eng-deu: 3003/62964
  - newstest2015-ende.eng-deu: 2169/44260
  - newstest2016-ende.eng-deu: 2999/62670
  - newstest2017-ende.eng-deu: 3004/61291
  - newstest2018-ende.eng-deu: 2998/64276
  - newstest2019-ende.eng-deu: 1997/48969
  - Tatoeba-test.eng-deu: 10000/83347
- test set translations file: test.txt
- test set scores file: eval.txt
- BLEU-scores

|Test set|score|
|---|---|
|newstest2018-ende.eng-deu|46.4|
|Tatoeba-test.eng-deu|45.8|
|newstest2019-ende.eng-deu|42.4|
|newstest2016-ende.eng-deu|37.9|
|newstest2015-ende.eng-deu|32.0|
|newstest2017-ende.eng-deu|30.6|
|newstest2014-deen.eng-deu|29.6|
|newstest2013.eng-deu|27.6|
|newstest2010.eng-deu|25.9|
|news-test2008.eng-deu|23.9|
|newstest2012.eng-deu|23.8|
|newssyscomb2009.eng-deu|23.3|
|newstest2011.eng-deu|22.9|
|newstest2009.eng-deu|22.7|

- chr-F-scores

|Test set|score|
|---|---|
|newstest2018-ende.eng-deu|0.697|
|newstest2019-ende.eng-deu|0.664|
|Tatoeba-test.eng-deu|0.655|
|newstest2016-ende.eng-deu|0.644|
|newstest2015-ende.eng-deu|0.601|
|newstest2014-deen.eng-deu|0.595|
|newstest2017-ende.eng-deu|0.593|
|newstest2013.eng-deu|0.558|
|newstest2010.eng-deu|0.55|
|newssyscomb2009.eng-deu|0.539|
|news-test2008.eng-deu|0.533|
|newstest2009.eng-deu|0.533|
|newstest2012.eng-deu|0.53|
|newstest2011.eng-deu|0.528|
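
The validation data entry says the dev set was selected as the top 5000 lines of a shuffled file. A minimal sketch of that kind of selection is given below, purely as an illustration: the function name `select_devset`, the fixed seed, and the use of Python's `random` module are assumptions, not the procedure actually used to build this model's dev set.

```python
import random
from typing import List, Sequence, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def select_devset(pairs: Sequence[Pair], top_n: int = 5000, seed: int = 0) -> List[Pair]:
    """Shuffle aligned sentence pairs and keep the first `top_n` of them.

    Hypothetical helper: the seed and the shuffle method are assumptions.
    Shuffling whole pairs (not each side separately) keeps source and
    target lines aligned.
    """
    shuffled = list(pairs)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)
    return shuffled[:top_n]


if __name__ == "__main__":
    # Toy stand-in for a parallel dev corpus.
    toy = [(f"en {i}", f"de {i}") for i in range(10)]
    for src, tgt in select_devset(toy, top_n=3):
        print(src, "|||", tgt)
```

With a fixed seed the selection is reproducible, and asking for more lines than exist simply returns the whole shuffled corpus.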