
MUmairAB/bert-based-MaskedLM

The training code for this model is available as a notebook on my GitHub.

This model is a fine-tuned version of distilbert-base-uncased on the IMDB movie reviews dataset. Its training and validation losses are plotted below and listed per epoch under Training results.

Training and validation loss during training

<img src="https://huggingface.co/MUmairAB/bert-based-MaskedLM/resolve/main/Loss%20plot.png" style="height: 432px; width:567px;"/>

Model description

DistilBERT-base-uncased

Model: "tf_distil_bert_for_masked_lm"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 distilbert (TFDistilBertMai  multiple                 66362880  
 nLayer)                                                         
                                                                 
 vocab_transform (Dense)     multiple                  590592    
                                                                 
 vocab_layer_norm (LayerNorm  multiple                 1536      
 alization)                                                      
                                                                 
 vocab_projector (TFDistilBe  multiple                 23866170  
 rtLMHead)                                                       
                                                                 
=================================================================
Total params: 66,985,530
Trainable params: 66,985,530
Non-trainable params: 0
_________________________________________________________________
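
The per-layer counts in the summary add up to more than the reported total. A likely explanation, stated here as an assumption rather than something the card confirms, is that the vocab_projector row includes the input embedding block it shares with the distilbert layer, so those weights are listed twice but counted once in the total. The sketch below checks that arithmetic using the standard distilbert-base-uncased sizes (vocabulary 30522, hidden size 768, 512 positions), which are also assumptions not stated in this card.

```python
# Per-layer parameter counts copied from the Keras summary above.
layer_params = {
    "distilbert": 66_362_880,
    "vocab_transform": 590_592,
    "vocab_layer_norm": 1_536,
    "vocab_projector": 23_866_170,
}
reported_total = 66_985_530

# Assumed sizes for distilbert-base-uncased (not stated in this card).
VOCAB_SIZE = 30_522
HIDDEN_SIZE = 768
MAX_POSITIONS = 512

# Embedding block: word embeddings + position embeddings + LayerNorm (scale, shift).
embedding_params = (VOCAB_SIZE + MAX_POSITIONS + 2) * HIDDEN_SIZE

# The projector's only weights of its own would be one bias per vocabulary entry.
projector_bias = VOCAB_SIZE

naive_sum = sum(layer_params.values())
# Remove the embedding block that is (by assumption) listed in two rows.
unique_total = naive_sum - embedding_params

print("sum of rows:      ", naive_sum)
print("shared embeddings:", embedding_params)
print("unique parameters:", unique_total)
print("matches summary:  ", unique_total == reported_total)
```

Under these assumptions the vocab_projector row equals the embedding block plus a vocabulary-sized bias, and removing the duplicated embeddings recovers the reported total.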

Intended uses & limitations

The model was fine-tuned on the IMDB movie reviews dataset, so it inherits the language biases of that dataset. Its masked-token predictions are likely to lean toward movie-review vocabulary and tone.
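
As a rough illustration of how the model could be queried, the sketch below uses the `fill-mask` pipeline from the transformers library. It assumes a transformers release with TensorFlow support is installed and that the repository hosts TensorFlow weights; `top_predictions` and `predict_masked` are hypothetical helper names introduced only for this example.

```python
def top_predictions(results, k=3):
    """Return the k highest-scoring (token, score) pairs from fill-mask output."""
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    return [(r["token_str"], round(r["score"], 4)) for r in ranked[:k]]


def predict_masked(template, k=3):
    """Fill the {mask} placeholder in `template` and return the top-k candidates.

    Downloads the model on first call, so it needs network access.
    """
    # Imported lazily so the helper above can be used without transformers installed.
    from transformers import pipeline

    # Assumption: the checkpoint is loaded through the TensorFlow backend.
    fill_mask = pipeline(
        "fill-mask",
        model="MUmairAB/bert-based-MaskedLM",
        framework="tf",
    )
    # Use the tokenizer's own mask token instead of hard-coding it.
    text = template.format(mask=fill_mask.tokenizer.mask_token)
    return top_predictions(fill_mask(text), k=k)


# Example call (not executed here because it downloads the model):
# predict_masked("This movie was absolutely {mask}.")
```

The candidates returned for a review-style prompt are a quick way to see the dataset bias described above.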

Training and evaluation data

The model was trained on the IMDB movie reviews dataset.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

Training results

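| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 3.0754     | 2.7548          | 0     |
| 2.7969     | 2.6209          | 1     |
| 2.7214     | 2.5588          | 2     |
| 2.6626     | 2.5554          | 3     |
| 2.6466     | 2.4881          | 4     |
| 2.6238     | 2.4775          | 5     |
| 2.5696     | 2.4280          | 6     |
| 2.5504     | 2.3924          | 7     |
| 2.5171     | 2.3725          | 8     |
| 2.5180     | 2.3142          | 9     |
| 2.4443     | 2.2974          | 10    |
| 2.4497     | 2.3317          | 11    |
| 2.4371     | 2.3317          | 12    |
| 2.4377     | 2.3237          | 13    |
| 2.4369     | 2.3338          | 14    |
| 2.4350     | 2.3021          | 15    |
| 2.4267     | 2.3264          | 16    |
| 2.4557     | 2.3280          | 17    |
| 2.4461     | 2.3165          | 18    |
| 2.4360     | 2.3284          | 19    |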
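
To make these numbers easier to interpret, the sketch below finds the epoch with the lowest validation loss and converts losses to perplexity. It assumes the reported loss is the mean natural-log cross-entropy over masked tokens, which is the usual Keras setup but is not confirmed by this card.

```python
import math

# (epoch, train_loss, validation_loss) copied from the results above.
history = [
    (0, 3.0754, 2.7548), (1, 2.7969, 2.6209), (2, 2.7214, 2.5588),
    (3, 2.6626, 2.5554), (4, 2.6466, 2.4881), (5, 2.6238, 2.4775),
    (6, 2.5696, 2.4280), (7, 2.5504, 2.3924), (8, 2.5171, 2.3725),
    (9, 2.5180, 2.3142), (10, 2.4443, 2.2974), (11, 2.4497, 2.3317),
    (12, 2.4371, 2.3317), (13, 2.4377, 2.3237), (14, 2.4369, 2.3338),
    (15, 2.4350, 2.3021), (16, 2.4267, 2.3264), (17, 2.4557, 2.3280),
    (18, 2.4461, 2.3165), (19, 2.4360, 2.3284),
]


def perplexity(loss):
    """Perplexity under the assumption that `loss` is natural-log cross-entropy."""
    return math.exp(loss)


# Epoch with the lowest validation loss, and the last epoch for comparison.
best_epoch, _, best_val = min(history, key=lambda row: row[2])
final_epoch, _, final_val = history[-1]

print(f"best epoch:  {best_epoch} (val loss {best_val}, perplexity {perplexity(best_val):.2f})")
print(f"final epoch: {final_epoch} (val loss {final_val}, perplexity {perplexity(final_val):.2f})")
```

The validation loss stops improving after about epoch 10, so the later epochs add little under this reading.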

Framework versions