Flan-T5-Base for Information Extraction from English Legal Files

Overview

This repository contains a fine-tuned Flan-T5-Base model for extracting adjudicating members from legal case files in the domain of housing law in Ontario, Canada. The model was fine-tuned with the HuggingFace Transformers library on a custom dataset specific to this domain.

Model Details

Model Architecture: Flan-T5-Base
Domain: Housing Law, Ontario, Canada
Task: Adjudicating Member Extraction
Language: English
Training Data: Custom dataset of legal case files from the housing law domain in Ontario, Canada, sourced from https://www.canlii.org/.

Performance: The model performs well at extracting adjudicating members from housing law case files (see the BLEU score reported under Evaluation Metric). However, its accuracy can vary with the complexity of the case files and the quality of the input data.
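As a rough illustration of how a model like this could be called, here is a minimal sketch using the standard Transformers seq2seq classes. The instruction wording in `build_prompt`, the helper names, and the `model_id` argument are all assumptions for illustration; they are not the prompts or identifiers used to train this model, so substitute the real ones before relying on the output.

```python
def build_prompt(case_text: str) -> str:
    """Prefix the case text with an instruction.

    The instruction wording here is a placeholder (an assumption),
    not the prompt used during fine-tuning.
    """
    instruction = "Who is the adjudicating member in the following case file?"
    return f"{instruction}\n\n{case_text.strip()}"


def extract_member(case_text: str, model_id: str, max_new_tokens: int = 32) -> str:
    """Generate the adjudicating member's name for one case file.

    `model_id` is whatever local path or Hub ID this model is stored under.
    """
    # Imported lazily so build_prompt can be used without Transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Truncate long case files to a fixed input length (512 tokens is an assumption).
    inputs = tokenizer(
        build_prompt(case_text),
        return_tensors="pt",
        truncation=True,
        max_length=512,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `extract_member(case_text, model_id=...)` with the location of the fine-tuned weights should return the generated name as a plain string. Because inputs are truncated, the member's name needs to appear within the retained portion of the case file.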

Training Methodology

This model was fine-tuned using instruction tuning: each training example paired a prompt with an example of the intended result.

Evaluation Metric: BLEU Score

During training, this model achieved a BLEU score of 99%, which suggests that the fine-tuning was highly effective for the extraction task. In practice, on examples similar to those seen in training, the model performed comparably.
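To make the metric concrete, the following is a simplified, pure-Python sketch of a sentence-level BLEU score (clipped n-gram precision with a brevity penalty). It is not the evaluation code used for this model, which is not shown in this repository. The function names and the example names are hypothetical, and capping the n-gram order for short outputs is a simplification chosen here because extracted names are often only two or three tokens long.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(prediction: str, reference: str, max_n: int = 4) -> float:
    """Simplified BLEU for one prediction against one reference."""
    pred, ref = prediction.split(), reference.split()
    # Cap the n-gram order so very short names are still scorable (a simplification).
    max_order = min(max_n, len(pred), len(ref))
    if max_order == 0:
        return 0.0

    log_precisions = []
    for n in range(1, max_order + 1):
        pred_counts, ref_counts = ngrams(pred, n), ngrams(ref, n)
        # Counter intersection keeps the minimum count, i.e. clipped matches.
        overlap = sum((pred_counts & ref_counts).values())
        if overlap == 0:
            return 0.0
        log_precisions.append(math.log(overlap / sum(pred_counts.values())))

    # Penalize predictions shorter than the reference.
    if len(pred) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(pred))
    return brevity_penalty * math.exp(sum(log_precisions) / len(log_precisions))


# Hypothetical examples: an exact match and a completely wrong name.
print(sentence_bleu("Jane Doe", "Jane Doe"))
print(sentence_bleu("John Smith", "Jane Doe"))
```

With this sketch, an exact match scores 1.0 and a prediction sharing no tokens with the reference scores 0.0, while a partial name is penalized for being too short. For reported results, an established implementation would be a safer choice than this hand-rolled version.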

Limitations and Considerations

The model's performance relies heavily on the quality and relevance of the training data; more diverse and representative data could improve its accuracy. The model's predictions may be inaccurate in cases involving complex legal language, ambiguous contexts, or incomplete information.

Acknowledgments

We would like to acknowledge the HuggingFace Transformers library for providing the base model and training infrastructure, and Dr. Miikka Silfverberg for the notebook file from which the model fine-tuning code was adapted.

Contact Information

For questions, issues, or collaboration opportunities, please contact kmaurinjones@gmail.com.
