

bertin-gpt-clara-med

This model is a fine-tuned version of bertin-project/bertin-gpt-j-6B-alpaca on an unknown dataset. The only evaluation figure available is the validation loss logged during training, which was 0.6110 at the last logged step (step 300, epoch 2.25).

Usage

import torch
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

# Read the adapter config to find the base model, then attach the PEFT adapter to it
repo_name = "CLARA-MeD/bertin-gpt"
config = PeftConfig.from_pretrained(repo_name)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    torch_dtype=torch.float16,
    device_map="auto",  # needs the accelerate package installed
)
model = PeftModel.from_pretrained(model, repo_name)

For generation, we can use the model's .generate() method. Remember that the prompt needs a Spanish template:

# Generate a simplified version of a Spanish sentence or term
def generate(text):
    prompt = f"""A continuación hay una instrucción que describe una tarea, junto con una entrada que proporciona más contexto. Escribe una respuesta que complete adecuadamente lo que se pide.

### Instrucción:
Simplifica la siguiente frase

### Entrada:
{text}

### Respuesta:"""

    inputs = tokenizer(prompt, return_tensors="pt")
    # Move the inputs to the model's device instead of assuming CUDA
    input_ids = inputs["input_ids"].to(model.device)
    generation_output = model.generate(
        input_ids=input_ids,
        # temperature and top_p only take effect with do_sample=True;
        # as written, decoding is plain beam search with 4 beams
        generation_config=GenerationConfig(temperature=0.2, top_p=0.75, num_beams=4),
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=256,
    )
    for seq in generation_output.sequences:
        output = tokenizer.decode(seq, skip_special_tokens=True)
        # Keep only the text that follows the response marker
        print(output.split("### Respuesta:")[-1].strip())

generate("Acromegalia")
# La acromegalia es un trastorno causado por un exceso de hormona del crecimiento en el cuerpo.
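
The prompt construction and the response extraction can also be factored out of the generation call, which makes them easy to check without loading the 6B model. The following is a minimal sketch: the names PROMPT_TEMPLATE, RESPONSE_MARKER, build_prompt and extract_response are hypothetical and do not come from this model card, and the decoded string in the usage lines is simulated rather than real model output.

```python
# Hypothetical helpers; the names are illustrative, not part of the model card.

# Same Spanish instruction template as used in the usage code above.
PROMPT_TEMPLATE = (
    "A continuación hay una instrucción que describe una tarea, junto con una "
    "entrada que proporciona más contexto. Escribe una respuesta que complete "
    "adecuadamente lo que se pide.\n\n"
    "### Instrucción:\n"
    "Simplifica la siguiente frase\n\n"
    "### Entrada:\n"
    "{text}\n\n"
    "### Respuesta:"
)
RESPONSE_MARKER = "### Respuesta:"


def build_prompt(text: str) -> str:
    """Insert the text to simplify into the instruction template."""
    return PROMPT_TEMPLATE.format(text=text)


def extract_response(decoded: str) -> str:
    """Return only the text that follows the last response marker."""
    return decoded.split(RESPONSE_MARKER)[-1].strip()


# Simulated round trip: a decoded sequence is the prompt plus the continuation.
prompt = build_prompt("Acromegalia")
simulated_decoded = prompt + " La acromegalia es un trastorno. "
print(extract_response(simulated_decoded))
```

Keeping the template in one constant means the marker used to build the prompt and the marker used to split the output cannot drift apart.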


Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

Training results

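| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 0.5564        | 0.38  | 50   | 0.7804          |
| 0.3879        | 0.75  | 100  | 0.6551          |
| 0.3609        | 1.13  | 150  | 0.6327          |
| 0.3615        | 1.5   | 200  | 0.6179          |
| 0.3371        | 1.88  | 250  | 0.6135          |
| 0.3242        | 2.25  | 300  | 0.6110          |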
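
A few lines of Python help in reading the logged values above. This sketch copies the six logged rows and computes the change in validation loss between consecutive evaluations, the total drop, and the final gap between training and validation loss. The steps-per-epoch figure is only an estimate derived from the rounded epoch values; the card does not report it.

```python
# (training_loss, epoch, step, validation_loss) rows copied from the log above
LOG = [
    (0.5564, 0.38, 50, 0.7804),
    (0.3879, 0.75, 100, 0.6551),
    (0.3609, 1.13, 150, 0.6327),
    (0.3615, 1.5, 200, 0.6179),
    (0.3371, 1.88, 250, 0.6135),
    (0.3242, 2.25, 300, 0.6110),
]

val_losses = [row[3] for row in LOG]

# Change in validation loss between consecutive evaluations (every 50 steps)
deltas = [round(b - a, 4) for a, b in zip(val_losses, val_losses[1:])]

# Overall improvement from the first to the last evaluation
total_drop = round(val_losses[0] - val_losses[-1], 4)

# Gap between validation and training loss at the last logged step
final_gap = round(LOG[-1][3] - LOG[-1][0], 4)

# Rough estimate only: the epoch column is rounded to two decimals
steps_per_epoch = sum(step / epoch for _, epoch, step, _ in LOG) / len(LOG)

print("validation loss change per evaluation:", deltas)
print("total validation loss drop:", total_drop)
print("final validation minus training loss:", final_gap)
print("estimated steps per epoch:", round(steps_per_epoch))
```

Each successive change is smaller than the one before it, which suggests the validation loss was flattening by step 300, although six data points are too few to draw a firm conclusion.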

Framework versions