# Model Card for Mistral-sci-phi
This model is a fine-tuned version of Mistral-7B, optimized for performance and memory efficiency using the PEFT library and INT4 quantization.
## Model Details

### Model Description

Mistral-sci-phi is fine-tuned from the Mistral-7B base model. It is optimized for a reduced memory footprint via INT4 quantization, making it efficient to run for text-generation tasks. It was trained on the "emrgnt-cmplxty/sciphi-textbooks-are-all-you-need" dataset from the Hugging Face Hub.
- Developed by: Arturo de Pablo
- Trained by: IZX, Hyper88
- Model type: Causal Language Model
- Language(s) (NLP): English
- License: [More Information Needed]
- Finetuned from model: mistralai/Mistral-7B-v0.1
### Model Sources

- Repository: hyper88/ast1test
## Uses

### Direct Use

The model can be used directly for text generation and related NLP tasks.

### Downstream Use

It can also be integrated into larger systems as a component of more complex applications.

### Out-of-Scope Use

The model should not be relied on for tasks outside its training domain, or in settings where factual accuracy must be guaranteed without human review.
## Bias, Risks, and Limitations

The model inherits the biases and limitations of the base Mistral-7B model; users should keep these in mind when using it.

### Recommendations

Users should evaluate the model's performance and biases on their specific use case and make adjustments as necessary.
## How to Get Started with the Model

The model can be loaded for inference using the Hugging Face Transformers and PEFT libraries.
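A minimal loading sketch is shown below, assuming the adapter is hosted at the repository listed above (`hyper88/ast1test`), the base model is `mistralai/Mistral-7B-v0.1`, and a CUDA-capable GPU with `transformers`, `peft`, and `bitsandbytes` installed; the NF4 quantization type is an assumption, not stated in this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "mistralai/Mistral-7B-v0.1"
adapter_id = "hyper88/ast1test"

# Load the base model in 4-bit, matching the INT4 setup described in this card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",          # assumption: common QLoRA default
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
# Attach the fine-tuned PEFT adapter on top of the quantized base model.
model = PeftModel.from_pretrained(base, adapter_id)

inputs = tokenizer("Photosynthesis is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Loading the base in 4-bit keeps peak GPU memory low enough for a 7B model to fit on a single consumer GPU.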
## Training Details

### Training Data

The model was trained on the "emrgnt-cmplxty/sciphi-textbooks-are-all-you-need" dataset, available on the Hugging Face Hub.

### Training Procedure

The model was fine-tuned with the PEFT library, with the base model loaded in INT4 to reduce memory requirements during training.
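The setup above can be sketched as a quantization config plus a LoRA adapter config; note that the LoRA rank, alpha, and target modules below are illustrative assumptions, as the card does not specify them.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# INT4 base weights, as described in the training procedure above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",          # assumption: typical QLoRA choice
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Hypothetical adapter settings; the actual values are not stated in the card.
lora_config = LoraConfig(
    r=16,                               # assumed rank
    lora_alpha=32,                      # assumed scaling
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
```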
#### Training Hyperparameters

- Learning rate: 2e-4
- Batch size: 12
- Epochs: 3
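The hyperparameters above can be expressed as a Transformers `TrainingArguments` sketch; the output directory is hypothetical and all other fields are library defaults.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-sci-phi",       # hypothetical output path
    learning_rate=2e-4,                 # from this card
    per_device_train_batch_size=12,     # from this card
    num_train_epochs=3,                 # from this card
)
```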
## Evaluation

### Testing Data, Factors & Metrics

[More Information Needed]

### Results

[More Information Needed]
## Environmental Impact

Parameter-efficient fine-tuning with INT4 quantization reduces the compute required for training and inference relative to full fine-tuning of a 7B model; specific hardware usage and emissions figures are not available.
## Technical Specifications

### Model Architecture and Objective

The model uses the Mistral-7B decoder-only transformer architecture and was fine-tuned with a causal language modeling objective.

### Compute Infrastructure

Training compute was sponsored by izx.ai.

#### Software

- PEFT 0.6.0.dev0
## More Information

For more details, visit the model repository.

## Model Card Authors

- Arturo de Pablo (https://www.linkedin.com/in/arde88/)

## Model Card Contact

https://discord.gg/KGCeKP4ng9