conversational

# 12 epochs, batch size 2, gradient accumulation steps 2, tail 20000
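
A rough sketch of what these settings imply for the effective batch size and the number of optimizer steps. The note does not say what "tail 20000" means; the sketch assumes it means training on the last 20000 examples, and it assumes a single device. The function name `training_schedule` and its parameters are made up for illustration.

```python
import math


def training_schedule(num_examples, batch_size, grad_accum_steps, epochs, num_devices=1):
    """Derive step counts from the hyperparameters in the note above.

    Assumption: one optimizer update happens after every `grad_accum_steps`
    micro-batches, which is how gradient accumulation is usually set up.
    """
    # Examples contributing to each optimizer update.
    effective_batch_size = batch_size * grad_accum_steps * num_devices
    # Forward/backward passes per epoch on each device.
    micro_batches_per_epoch = math.ceil(num_examples / (batch_size * num_devices))
    # Optimizer updates per epoch.
    optimizer_steps_per_epoch = math.ceil(micro_batches_per_epoch / grad_accum_steps)
    return {
        "effective_batch_size": effective_batch_size,
        "micro_batches_per_epoch": micro_batches_per_epoch,
        "optimizer_steps_per_epoch": optimizer_steps_per_epoch,
        "total_optimizer_steps": optimizer_steps_per_epoch * epochs,
    }


# Values from the note; num_examples=20000 is the assumed reading of "tail 20000".
schedule = training_schedule(
    num_examples=20000, batch_size=2, grad_accum_steps=2, epochs=12
)
print(schedule)
```

Under those assumptions, each optimizer update sees 2 x 2 = 4 examples. If "tail 20000" means something else (for example a token or sequence-length limit), only the `num_examples` argument needs to change.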