This is a standard conversion to the Hugging Face Transformers format of the models from here:
https://huggingface.co/BlinkDL/rwkv-4-pileplus
According to the documentation I found, this model should have seen roughly 1.3 trillion tokens!
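
Since the weights are in Transformers format, loading them should work through the usual auto classes. The following is a minimal sketch, not a tested recipe for this exact repository: `MODEL_ID` is a hypothetical placeholder that needs to be replaced with the real ID of this converted repository, `generate_text` is a helper name made up for illustration, and it assumes that `transformers` and PyTorch are installed.

```python
# Hypothetical placeholder: replace with the actual ID of this converted repository.
MODEL_ID = "your-namespace/rwkv-4-pileplus-hf"


def generate_text(prompt: str, model_id: str = MODEL_ID, max_new_tokens: int = 40) -> str:
    """Load the converted checkpoint and return `prompt` plus a generated continuation."""
    # Imported lazily so that defining this helper does not itself require transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Tokenize the prompt into PyTorch tensors and generate a continuation.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(inputs["input_ids"], max_new_tokens=max_new_tokens)

    # Decode the full sequence (prompt and continuation) back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("Some prompt")` downloads the weights on first use, so it needs network access and enough memory for the chosen model size.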