random-mega-ar-small-4096
This is a random weight init for the following architecture:
- 12 decoder layers, 512 hidden size
- 4096 context length, chunked at 1024
- GPT-NeoX tokenizer
It needs to be trained before it is useful; a minimal loading sketch follows below.
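A minimal sketch of how the checkpoint might be loaded with the standard transformers Auto classes. The repo id below is a placeholder (assumption, not a verified hub path), and this assumes your installed transformers version still ships the MEGA model classes.

```python
# Minimal loading sketch (assumption: placeholder repo id; requires a
# transformers version that includes the MEGA model classes).
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "random-mega-ar-small-4096"  # placeholder, adjust to the actual hub path

tokenizer = AutoTokenizer.from_pretrained(repo_id)    # GPT-NeoX tokenizer
model = AutoModelForCausalLM.from_pretrained(repo_id)

# The weights are random, so the causal-LM loss on any text is near chance;
# this is only a smoke test that the checkpoint loads and runs.
inputs = tokenizer("MEGA: moving average equipped gated attention.", return_tensors="pt")
out = model(**inputs, labels=inputs["input_ids"])
print(f"loss before any training: {out.loss.item():.2f}")
```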
Architecture
Here is an image from the paper. This architecture roughly follows the paper's enwik8 configuration, with some differences:
- dropout rates are set above 0 for attention and the FFN (the goal is to generate, not memorize)
- chunk size set to 1024
- context length / max positions set to 4096 (this may be specific to the Hugging Face implementation)
This results in approximately 70M parameters; a config sketch for checking these numbers is below.
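A rough way to verify the stated sizes and the ~70M total from the config. This is a sketch under assumptions: the repo id is again a placeholder, and MEGA config field names (e.g. `chunk_size`, `max_positions`) may differ across transformers versions.

```python
# Config sanity check (assumptions: placeholder repo id; field names such as
# chunk_size / max_positions may vary across transformers versions).
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("random-mega-ar-small-4096")  # placeholder id
print(config.num_hidden_layers, config.hidden_size)  # expected: 12, 512
print(config.chunk_size, config.max_positions)       # expected: 1024, 4096

# Instantiate from config only: fresh random weights, same architecture.
model = AutoModelForCausalLM.from_config(config)
print(f"{model.num_parameters() / 1e6:.1f}M parameters")  # roughly 70M
```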
Note that the parameter counts in the figure will not match this model (or the other models here), because this model uses the GPT-NeoX tokenizer/vocabulary.
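For a rough sense of scale: with a GPT-NeoX vocabulary of roughly 50K tokens and a 512-dimensional hidden size, the input embedding alone accounts for about 50,000 × 512 ≈ 25.6M parameters, whereas the character-level vocabulary typically used for enwik8 is only a few hundred symbols and contributes well under 1M.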