Model Card for bellman-7b-1k

The goal is to see how far you can get with a non-English finetune of the base Llama 2 model.

The name comes from the Swedish bard and poet Carl Michael Bellman, who lived in the 1700s. As with any bard, what this model says should be taken with a grain of salt, even though it has the best of intentions.

Model Details

Model Description

WIP!

A Swedish finetune of NousResearch/Llama-2-7b-chat-hf.

A full QLoRA finetune using the "jeremyc/Alpaca-Lora-GPT4-Swedish" dataset.

Sadly, it was only trained with a 1024-token context length. If this turns out well, I'll aim for another run at 4k.

Currently trained on 100% of the total dataset for 1 epoch.

Model Sources [optional]

<!-- Provide the basic links for the model. -->

Uses

This is an experimental finetune. It should mainly be used for entertainment purposes until it has been tested further. It's trained as an instruct model and can function as an assistant.

Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

[More Information Needed]

Out-of-Scope Use

The model should not be used for medical, financial, or legal advice.

Bias, Risks, and Limitations

The model has no additional alignment tuning and inherits any biases from the base model and the dataset.

In addition, the dataset was machine translated, which affects the model's command of the Swedish language.

Since it only has 7B parameters, it can get confused and mix up facts.

Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

Users (both direct and downstream) should be made aware of the model's risks, biases, and limitations. More information is needed for further recommendations.

How to Get Started with the Model

Prompt format: [INST] Your input. [/INST] Model response.
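
The model follows the Llama 2 instruction format shown above. Below is a minimal inference sketch using Hugging Face transformers; the repo id is a placeholder (substitute the actual model path), and the generation settings are just illustrative defaults.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/bellman-7b-1k"  # placeholder: replace with the actual repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Llama 2 style instruction prompt, as described above.
prompt = "[INST] Vem var Carl Michael Bellman? [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```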

Training Details

Training Data

https://huggingface.co/datasets/jeremyc/Alpaca-Lora-GPT4-Swedish


Training Procedure

Trained on Google Colab on a V100 GPU.
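
For reference, here is a minimal sketch of a QLoRA setup along the lines described above (4-bit quantized base model plus LoRA adapters). The LoRA rank, alpha, target modules, and quantization settings are common defaults for illustration only; the actual hyperparameters used for this model are not documented in this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "NousResearch/Llama-2-7b-chat-hf"

# 4-bit quantization (the "Q" in QLoRA) keeps the 7B base model within V100 memory.
# fp16 compute dtype is used since the V100 does not support bf16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA adapter config; these values are assumptions, not confirmed settings.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Training then proceeds on "jeremyc/Alpaca-Lora-GPT4-Swedish" with sequences
# truncated to 1024 tokens, e.g. via a standard Trainer or SFTTrainer loop.
```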

Preprocessing [optional]

[More Information Needed]

Training Hyperparameters

Speeds, Sizes, Times [optional]

<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

[More Information Needed]

Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

Testing Data, Factors & Metrics

Testing Data

<!-- This should link to a Dataset Card if possible. -->

[More Information Needed]

Factors

<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

[More Information Needed]

Metrics

<!-- These are the evaluation metrics being used, ideally with a description of why. -->

[More Information Needed]

Results

[More Information Needed]

Summary

Model Examination [optional]

<!-- Relevant interpretability work for the model goes here -->

[More Information Needed]

Environmental Impact

<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Technical Specifications [optional]

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

[More Information Needed]

Hardware

[More Information Needed]

Software

[More Information Needed]

Citation [optional]

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]