
mGPT: fine-tune on message data - 2E

Model description

Interesting findings thus far:

Usage in Python

Install the transformers library if you don't already have it:

pip install -U transformers

Load the model into a pipeline object:

from transformers import pipeline
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'
my_chatbot = pipeline(
    'text-generation',
    'pszemraj/mGPT-Peter-2E',
    device=0 if device == 'cuda' else -1,  # GPU index 0 if available, else CPU
)
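By default, a Hugging Face text-generation pipeline returns a list of dicts whose 'generated_text' field contains the prompt followed by the model's continuation. A minimal sketch of stripping the prompt so only the reply remains (the function name and the mocked output below are illustrative assumptions, not part of this model card):

```python
def extract_reply(prompt: str, generated_text: str) -> str:
    """Strip the prompt prefix so only the model's new text remains.

    Text-generation pipelines include the prompt in 'generated_text'
    by default, so the reply is whatever follows it.
    """
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    return generated_text.strip()


# mocked pipeline output, shaped like my_chatbot(prompt) would return
# (avoids downloading the model for this illustration)
fake_output = [{'generated_text': 'hi, how are you? I am fine, thanks!'}]
reply = extract_reply('hi, how are you?', fake_output[0]['generated_text'])
print(reply)  # -> I am fine, thanks!
```

In a real call, you would pass the prompt to my_chatbot along with any sampling parameters you want (e.g. do_sample, max_length) and feed the result to the helper the same way.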

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

Framework versions