It is this model (https://huggingface.co/distilgpt2), converted to .onnx with the transformers ONNX export (https://huggingface.co/docs/transformers/serialization) via "python -m transformers.onnx --model=distilgpt2 distilgpt2-onnx/", and then optimized with onnx-simplifier (https://github.com/daquexian/onnx-simplifier). This is the fastest GPT-2 inference I have gotten on the web so far. You can try the model here: https://gpt2.handmadeproductions.de/
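For anyone who wants to reproduce the pipeline, the full sequence looks roughly like this (a sketch; the exact package extras and the onnxsim CLI name assume recent versions of transformers and onnx-simplifier, and the output filenames are whatever the exporter writes into distilgpt2-onnx/):

```shell
# Install the exporter, the simplifier, and a runtime to test with
pip install "transformers[onnx]" onnxsim onnxruntime

# Export distilgpt2 from the Hugging Face Hub to ONNX
python -m transformers.onnx --model=distilgpt2 distilgpt2-onnx/

# Simplify the exported graph (constant folding, redundant-op removal)
onnxsim distilgpt2-onnx/model.onnx distilgpt2-onnx/model_simplified.onnx
```

The simplified model can then be loaded with onnxruntime (or the runtime of your choice) in place of the original export.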