# Model Card for Mistral-sci-phi
This model is a fine-tuned version of Mistral-7B, optimized for performance and memory efficiency using the PEFT library and INT4 quantization.
## Model Details

### Model Description

Mistral-sci-phi is fine-tuned from the Mistral-7B base model. It is optimized for a reduced memory footprint via INT4 quantization, making it efficient to run for text-generation tasks. It was trained on the "emrgnt-cmplxty/sciphi-textbooks-are-all-you-need" dataset from the Hugging Face Hub.
- Developed by: Arturo de Pablo
- Trained by: IZX, Hyper88
- Model type: Causal Language Model
- Language(s) (NLP): English
- License: [More Information Needed]
- Finetuned from model: mistralai/Mistral-7B-v0.1
### Model Sources

- Repository: hyper88/ast1test
## Uses

### Direct Use

The model can be used directly for text generation and related NLP tasks.

### Downstream Use

It can also be integrated into larger systems as a component of more complex applications.

### Out-of-Scope Use

The model should not be relied on for tasks outside its training domain, or in settings where factual accuracy must be guaranteed without human review.
## Bias, Risks, and Limitations

The model inherits the biases and limitations of the base Mistral-7B model; users should keep these in mind when using it.

### Recommendations

Users should evaluate the model's performance and biases on their specific use case and make adjustments as necessary.
## How to Get Started with the Model

The model can be loaded for inference using the Hugging Face Transformers and PEFT libraries.
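A minimal loading sketch is shown below, assuming the adapter is hosted at the repository listed above (`hyper88/ast1test`), the base model is `mistralai/Mistral-7B-v0.1`, and a CUDA-capable GPU with `transformers`, `peft`, and `bitsandbytes` installed; the NF4 quantization type is an assumption, not stated in this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "mistralai/Mistral-7B-v0.1"
adapter_id = "hyper88/ast1test"

# Load the base model in 4-bit, matching the INT4 setup described in this card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",          # assumption: common QLoRA default
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
# Attach the fine-tuned PEFT adapter on top of the quantized base model.
model = PeftModel.from_pretrained(base, adapter_id)

inputs = tokenizer("Photosynthesis is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Loading the base in 4-bit keeps peak GPU memory low enough for a 7B model to fit on a single consumer GPU.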
## Training Details

### Training Data

The model was trained on the "emrgnt-cmplxty/sciphi-textbooks-are-all-you-need" dataset, available on the Hugging Face Hub.

### Training Procedure

The model was fine-tuned with the PEFT library, with the base model loaded in INT4 to reduce memory requirements during training.
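The setup above can be sketched as a quantization config plus a LoRA adapter config; note that the LoRA rank, alpha, and target modules below are illustrative assumptions, as the card does not specify them.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# INT4 base weights, as described in the training procedure above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",          # assumption: typical QLoRA choice
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Hypothetical adapter settings; the actual values are not stated in the card.
lora_config = LoraConfig(
    r=16,                               # assumed rank
    lora_alpha=32,                      # assumed scaling
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
```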
#### Training Hyperparameters

- Learning rate: 2e-4
- Batch size: 12
- Epochs: 3
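The hyperparameters above can be expressed as a Transformers `TrainingArguments` sketch; the output directory is hypothetical and all other fields are library defaults.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-sci-phi",       # hypothetical output path
    learning_rate=2e-4,                 # from this card
    per_device_train_batch_size=12,     # from this card
    num_train_epochs=3,                 # from this card
)
```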
## Evaluation

### Testing Data, Factors & Metrics

[More Information Needed]

### Results

[More Information Needed]
## Environmental Impact

Parameter-efficient fine-tuning with INT4 quantization reduces the compute required for training and inference relative to full fine-tuning of a 7B model; specific hardware usage and emissions figures are not available.
## Technical Specifications

### Model Architecture and Objective

The model uses the Mistral-7B decoder-only transformer architecture and was fine-tuned with a causal language modeling objective.

### Compute Infrastructure

Training compute was sponsored by izx.ai.

#### Software

- PEFT 0.6.0.dev0
## More Information

For more details, visit the model repository.

## Model Card Authors

- Arturo de Pablo (https://www.linkedin.com/in/arde88/)

## Model Card Contact

https://discord.gg/KGCeKP4ng9