Model Card for Model ID
A GPT-NeoX model pretrained on a 2.06 GB English news dataset. Training ran on a p3.16xlarge instance and took about 10 hours to reach 20,000 iterations.
Hyperparameter changed from the default: gradient_accumulation_step = 4
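
For readers unfamiliar with gradient accumulation, the toy sketch below shows the general idea behind an accumulation setting of 4: gradients from four micro-batches are averaged before a single optimizer step, which gives the same gradient as one batch four times larger. This is not the training code used for this model, which the card does not include. The 1-D linear model, the data, and the function names `grad_mse` and `accumulated_grad` are all assumptions made for illustration.

```python
def grad_mse(w, xs, ys):
    """Gradient of the mean squared error for the 1-D model y = w * x."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, accumulation_steps=4):
    """Average the gradients of `accumulation_steps` equal-sized micro-batches."""
    micro = len(xs) // accumulation_steps
    total = 0.0
    for i in range(accumulation_steps):
        lo, hi = i * micro, (i + 1) * micro
        # Scale each micro-batch gradient so the running sum ends up as
        # the mean over the full effective batch.
        total += grad_mse(w, xs[lo:hi], ys[lo:hi]) / accumulation_steps
    return total


# Hypothetical toy data: 8 samples split into 4 micro-batches of 2.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]
w = 0.5

full = grad_mse(w, xs, ys)                              # one large batch
acc = accumulated_grad(w, xs, ys, accumulation_steps=4)  # four micro-batches
print(full, acc)
```

In practice the trade-off is memory for time: each micro-batch fits in GPU memory on its own, while the optimizer sees the gradient of the larger effective batch.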
Model Details
Model Description
- Developed by: Eunyoung Lee
- Model type: GPT-NeoX
- Language(s) (NLP): English