Word2vec/german_model

word2vec

Description

German word embedding model trained by Müller with the following parameter configuration:

a corpus as big as possible (and as diverse as possible without being informal)
filtering of punctuation and stopwords
forming bigram tokens
using skip-gram as the training algorithm with hierarchical softmax
window size between 5 and 10
dimensionality of feature vectors of 300 or more
using negative sampling with 10 samples
ignoring all words with total frequency lower than 50
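The preprocessing steps in the list above can be sketched in plain Python. This is only an illustration, not the author's pipeline: the stopword set, the function names, and the count-based bigram rule are assumptions made for the example (the count threshold is a simplified stand-in for a scored phrase detector). The `W2V_PARAMS` dictionary maps the listed settings onto the keyword names of gensim 4's `Word2Vec`, with `window` and `vector_size` picked from the stated ranges.

```python
import re
from collections import Counter

# Hypothetical stopword set for illustration; the original list is not given here.
STOPWORDS = {"der", "die", "das", "und", "ist"}

# Assumed mapping of the listed settings onto gensim 4 Word2Vec keyword names.
W2V_PARAMS = dict(
    sg=1,             # skip-gram
    hs=1,             # hierarchical softmax
    negative=10,      # negative sampling with 10 samples
    window=5,         # picked from the stated range 5 to 10
    vector_size=300,  # "300 or more"
    min_count=50,     # ignore words with total frequency below 50
)


def tokenize(sentence, stopwords=STOPWORDS):
    """Split on non-word characters (drops punctuation) and remove stopwords."""
    tokens = re.findall(r"\w+", sentence)
    return [t for t in tokens if t.lower() not in stopwords]


def merge_bigrams(sentences, min_pair_count=2):
    """Join adjacent token pairs seen at least `min_pair_count` times with '_'."""
    pair_counts = Counter(pair for s in sentences for pair in zip(s, s[1:]))
    merged = []
    for s in sentences:
        out, i = [], 0
        while i < len(s):
            if i + 1 < len(s) and pair_counts[(s[i], s[i + 1])] >= min_pair_count:
                out.append(s[i] + "_" + s[i + 1])
                i += 2  # both tokens were consumed by the bigram
            else:
                out.append(s[i])
                i += 1
        merged.append(out)
    return merged


def drop_rare(sentences, min_count=50):
    """Remove tokens whose total corpus frequency is below `min_count`."""
    counts = Counter(t for s in sentences for t in s)
    return [[t for t in s if counts[t] >= min_count] for s in sentences]
```

With gensim installed, the resulting token lists could be passed to `Word2Vec(sentences, **W2V_PARAMS)`. In that case `drop_rare` is redundant, since `min_count` performs the same frequency filtering.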

For more information, see https://devmount.github.io/GermanWordEmbeddings/

How to use?

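from gensim.models import KeyedVectors
from huggingface_hub import hf_hub_download
# Download the binary word2vec file from the Hugging Face Hub (cached locally after the first call)
path = hf_hub_download(repo_id="Word2vec/german_model", filename="german.model")
# unicode_errors="ignore" drops bytes that are not valid UTF-8 instead of raising an error
model = KeyedVectors.load_word2vec_format(path, binary=True, unicode_errors="ignore")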
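Once loaded, the vectors can be queried for nearest neighbours. The helper below is a minimal sketch: the name `nearest` is hypothetical, and it assumes `kv` behaves like a gensim `KeyedVectors` object, that is, it supports `in` for vocabulary lookup and provides `most_similar`. It returns an empty list for unknown tokens instead of raising `KeyError`.

```python
def nearest(kv, word, topn=5):
    """Return up to `topn` (token, cosine similarity) pairs, or [] if `word` is unknown."""
    if word not in kv:  # KeyedVectors supports `in` for vocabulary membership
        return []
    return kv.most_similar(positive=[word], topn=topn)
```

For example, `nearest(model, "Haus")` would list the tokens closest to `Haus`. The query word is only illustrative; whether a given token is in the vocabulary depends on how the corpus was preprocessed.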

Citation

@thesis{mueller2015,
  author = {{Müller}, Andreas},
  title  = "{Analyse von Wort-Vektoren deutscher Textkorpora}",
  school = {Technische Universität Berlin},
  year   = 2015,
  month  = jun,
  type   = {Bachelor's Thesis},
  url    = {https://devmount.github.io/GermanWordEmbeddings}
}
