This model was trained on 2M (from 3M to 5M). Training was initialized from the best checkpoint of the most recent training run on 1M corrupted.