
bert-finetuned-squad-v1

This model is a fine-tuned version of bert-base-cased on the Stanford Question Answering Dataset (SQuAD).

Model description:

The bert-finetuned-squad-v1 model is built upon the BERT (Bidirectional Encoder Representations from Transformers) architecture and has been fine-tuned specifically for the task of question-answering on the SQuAD dataset. It takes a passage of text (context) and a question as input and predicts the start and end positions of the answer within the context.
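
For a quick look at what the model does, it can be loaded through the transformers question-answering pipeline. The repository id below is a placeholder; substitute the actual Hub id or local path of the fine-tuned checkpoint:

```python
from transformers import pipeline

# Placeholder repository id; replace with the actual Hub id or local path
# of the fine-tuned checkpoint.
qa_pipeline = pipeline("question-answering", model="bert-finetuned-squad-v1")

context = (
    "The Stanford Question Answering Dataset (SQuAD) is a reading comprehension "
    "dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles."
)
question = "What kind of dataset is SQuAD?"

print(qa_pipeline(question=question, context=context))
# {'score': ..., 'start': ..., 'end': ..., 'answer': '...'}
```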

Intended uses & limitations:

Intended Uses:

The model is intended for extractive question answering: given a context passage and a question, it returns the span of the passage most likely to contain the answer. It can be used directly through the question-answering pipeline or as the reader component of a retrieval-based QA system.

Limitations:

The model can only extract answers that appear verbatim in the supplied context; it cannot generate free-form answers. Because it was fine-tuned on SQuAD v1.1, which contains only answerable questions, it will return a span even when the context does not actually contain the answer. It is limited to English, and performance may degrade on text that differs substantially from the Wikipedia passages in SQuAD.

Training and evaluation data:

The model was trained on the SQuAD v1.1 dataset, which consists of two main splits:

  1. A training split of roughly 88,000 question-answer pairs over Wikipedia passages, used for fine-tuning.

  2. A validation split of roughly 10,600 question-answer pairs, used for evaluation.

Training procedure:

The training process involved several key steps:

  1. Preprocessing: The training data was preprocessed with a BERT tokenizer to convert the question and context text into input IDs, and the start and end token positions of each answer span were generated as labels (a code sketch of steps 1 and 2 follows this list).

  2. Sliding Window: To handle long contexts, a sliding window approach was employed. Long contexts were split into multiple input features with overlapping tokens.

  3. Fine-tuning: The model was fine-tuned on the SQuAD training features, minimizing the cross-entropy loss over the predicted start and end positions of the answer span (a sketch of this step follows the list).

  4. Post-processing: During inference, the model predicts start and end logits for each feature; the highest-scoring valid start/end pair is selected and mapped back to a text span in the original context using the token-to-character offsets (a sketch of this step follows the list).

  5. Evaluation: The model's performance was evaluated on the SQuAD validation set using exact match (EM) and F1, which measure how closely the predicted answers match the reference answers (a sketch of this step follows the list).
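
The sketch below illustrates steps 1 and 2, following the standard transformers approach for SQuAD-style preprocessing. The checkpoint name, max_length, and stride values are assumptions for illustration, not necessarily the exact settings used:

```python
from transformers import AutoTokenizer

# Checkpoint, max_length, and stride are illustrative assumptions.
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
max_length = 384  # maximum length of a (question, context) feature
stride = 128      # overlap between consecutive windows of a long context

def preprocess_training_examples(examples):
    questions = [q.strip() for q in examples["question"]]
    inputs = tokenizer(
        questions,
        examples["context"],
        max_length=max_length,
        truncation="only_second",        # truncate only the context, never the question
        stride=stride,
        return_overflowing_tokens=True,  # long contexts become several overlapping features
        return_offsets_mapping=True,     # character offsets used to locate the answer span
        padding="max_length",
    )

    offset_mapping = inputs.pop("offset_mapping")
    sample_map = inputs.pop("overflow_to_sample_mapping")
    answers = examples["answers"]
    start_positions, end_positions = [], []

    for i, offsets in enumerate(offset_mapping):
        answer = answers[sample_map[i]]
        start_char = answer["answer_start"][0]
        end_char = start_char + len(answer["text"][0])
        sequence_ids = inputs.sequence_ids(i)

        # Find where the context starts and ends inside this feature.
        idx = 0
        while sequence_ids[idx] != 1:
            idx += 1
        context_start = idx
        while sequence_ids[idx] == 1:
            idx += 1
        context_end = idx - 1

        if offsets[context_start][0] > start_char or offsets[context_end][1] < end_char:
            # The answer is not fully contained in this window: label it (0, 0).
            start_positions.append(0)
            end_positions.append(0)
        else:
            # Move to the first/last tokens whose offsets cover the answer characters.
            idx = context_start
            while idx <= context_end and offsets[idx][0] <= start_char:
                idx += 1
            start_positions.append(idx - 1)

            idx = context_end
            while idx >= context_start and offsets[idx][1] >= end_char:
                idx -= 1
            end_positions.append(idx + 1)

    inputs["start_positions"] = start_positions
    inputs["end_positions"] = end_positions
    return inputs
```

In practice this function is applied to the raw SQuAD split with `Dataset.map(..., batched=True)`, so that a long context can expand into several training features.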
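
Step 3 is typically launched with the transformers Trainer. The hyperparameter values below are commonly used placeholders rather than the values actually used for this checkpoint, and the sketch reuses the tokenizer and preprocessing function from the previous block:

```python
from datasets import load_dataset
from transformers import AutoModelForQuestionAnswering, TrainingArguments, Trainer

raw_datasets = load_dataset("squad")

# Apply the preprocessing function from the previous sketch to the training split.
train_dataset = raw_datasets["train"].map(
    preprocess_training_examples,
    batched=True,
    remove_columns=raw_datasets["train"].column_names,
)

model = AutoModelForQuestionAnswering.from_pretrained("bert-base-cased")

# Placeholder hyperparameters typical for SQuAD fine-tuning.
args = TrainingArguments(
    output_dir="bert-finetuned-squad-v1",
    learning_rate=3e-5,
    num_train_epochs=3,
    weight_decay=0.01,
    per_device_train_batch_size=8,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    tokenizer=tokenizer,  # tokenizer from the preprocessing sketch above
)
trainer.train()
```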
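
Step 4 can be sketched as follows. The n_best and max_answer_length values are illustrative, and the offsets list is assumed to contain None for tokens that are not part of the context (the usual convention when preparing validation features):

```python
import numpy as np

n_best = 20              # assumed number of candidate start/end indices to keep
max_answer_length = 30   # assumed maximum answer length in tokens

def extract_answer(start_logits, end_logits, offsets, context):
    """Turn the start/end logits of a single feature back into a text answer."""
    start_indexes = np.argsort(start_logits)[-n_best:][::-1]
    end_indexes = np.argsort(end_logits)[-n_best:][::-1]
    candidates = []

    for start_index in start_indexes:
        for end_index in end_indexes:
            # Skip spans outside the context, inverted spans, and overly long spans.
            if offsets[start_index] is None or offsets[end_index] is None:
                continue
            if end_index < start_index or end_index - start_index + 1 > max_answer_length:
                continue
            candidates.append({
                "text": context[offsets[start_index][0] : offsets[end_index][1]],
                "score": start_logits[start_index] + end_logits[end_index],
            })

    if not candidates:
        return {"text": "", "score": 0.0}
    # The highest-scoring valid span is the predicted answer.
    return max(candidates, key=lambda c: c["score"])
```

When a context was split into several features by the sliding window, the same selection is applied to each feature and the best-scoring span overall is kept.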
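
Step 5 is commonly done with the squad metric from the evaluate library, which computes exact match and F1 from predicted and reference answers. The ids and texts below are placeholders:

```python
import evaluate

squad_metric = evaluate.load("squad")

# Placeholder prediction and reference in the format expected by the metric.
predicted_answers = [
    {"id": "0001", "prediction_text": "a reading comprehension dataset"}
]
reference_answers = [
    {
        "id": "0001",
        "answers": {"text": ["a reading comprehension dataset"], "answer_start": [45]},
    }
]

results = squad_metric.compute(predictions=predicted_answers, references=reference_answers)
print(results)  # {'exact_match': ..., 'f1': ...}
```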

Training hyperparameters

The following hyperparameters were used during training:

Validation results

Framework versions