Model

This model is based on nicoladecao/msmarco-word2vec256000-distilbert-base-uncased, which uses a 256k-entry vocabulary whose token embeddings were initialized with word2vec.
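The base checkpoint can be loaded with the `transformers` library and its 256k vocabulary inspected. This is a minimal sketch, not part of the model card; it uses the base model's id shown above, so replace it with this model's own id when loading the fine-tuned weights.

```python
# Sketch: load the base checkpoint and inspect its 256k vocabulary.
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "nicoladecao/msmarco-word2vec256000-distilbert-base-uncased"  # base model; swap in this model's id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

print(len(tokenizer))                             # ~256000 tokens
print(model.get_input_embeddings().weight.shape)  # (256000, hidden_size)
```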

The model was then trained with masked language modeling (MLM) on the MS MARCO corpus collection for 785k steps. See train_mlm.py for the training script. Training ran on 2x V100 GPUs.
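train_mlm.py is not reproduced here; the following is only an illustrative sketch of how such MLM continued pre-training could be set up with the `transformers` Trainer. The corpus file name, sequence length, batch size, and learning rate are placeholder assumptions, not values from the actual script.

```python
# Illustrative MLM training sketch (not the actual train_mlm.py).
# Assumption: MS MARCO passages are stored one per line in "msmarco_passages.txt".
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

model_id = "nicoladecao/msmarco-word2vec256000-distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

dataset = load_dataset("text", data_files={"train": "msmarco_passages.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
    batched=True, remove_columns=["text"],
)

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="mlm-output",
    max_steps=785_000,               # matches the step count stated above
    per_device_train_batch_size=16,  # placeholder hyperparameters
    learning_rate=5e-5,
    fp16=True,
)

Trainer(model=model, args=args, train_dataset=dataset, data_collator=collator).train()
```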

Note: The token embeddings were updated during MLM training (they were not frozen).