CLIP Big G Embeddings for Imagenet-1k, text-to-image generative model assays

This is a tensor with dimensions (1001, 42, 77, 1280). Along the first axis, indices 0 through 999 are the ImageNet classes, while index 1000 is the unconditional class (the same embedding repeated 42 times). The embeddings are the unsquished, final layer-normed output of the CLIP text tower for CLIP Big G.
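To make the layout concrete, here is a minimal sketch in plain Python that works only from the stated shape. The axis meanings (class, caption variant, token position, embedding channel) are an assumption inferred from the description, and the helper names are hypothetical. The storage dtype is not stated here, so the size helper takes the bytes per element as a parameter rather than guessing.

```python
import math

# Shape as stated in the description above.
SHAPE = (1001, 42, 77, 1280)

# Index of the unconditional entry along the first axis.
UNCONDITIONAL_CLASS = 1000

# Assumed axis meanings (not spelled out in the source):
#   axis 0: class index (0..999 ImageNet classes, 1000 unconditional)
#   axis 1: caption variant (42 per class)
#   axis 2: token position
#   axis 3: embedding channel


def total_elements(shape=SHAPE):
    """Number of scalar values in a tensor of the given shape."""
    return math.prod(shape)


def size_in_gib(bytes_per_element, shape=SHAPE):
    """Storage size in GiB for a given element width (dtype is an assumption)."""
    return total_elements(shape) * bytes_per_element / 2**30


def embedding_index(class_id, caption_id, shape=SHAPE):
    """Validate and return the leading (class, caption) index pair.

    Indexing a tensor with this pair would select one slice of
    shape (77, 1280).
    """
    if not 0 <= class_id < shape[0]:
        raise IndexError(f"class_id {class_id} outside 0..{shape[0] - 1}")
    if not 0 <= caption_id < shape[1]:
        raise IndexError(f"caption_id {caption_id} outside 0..{shape[1] - 1}")
    return (class_id, caption_id)


def is_unconditional(class_id):
    """True when the class index refers to the unconditional embedding."""
    return class_id == UNCONDITIONAL_CLASS
```

For example, total_elements() gives 4,143,659,520 values, so size_in_gib(2) and size_in_gib(4) estimate the footprint under hypothetical 2-byte and 4-byte element widths. Because the unconditional entry repeats one embedding 42 times, any caption index at class 1000 should select the same slice.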

The embeddings were generated from the captions and functions contained in imagenet_zeroshot_data.py.
