---
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- transformers
---

# {MODEL_NAME}

This is a [sentence-transformers](https://www.SBERT.net) model: it maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for tasks like clustering or semantic search.

<!--- Describe your model here -->

## Usage (Sentence-Transformers)

Using this model is easy once you have [sentence-transformers](https://www.SBERT.net) installed:

```
pip install -U sentence-transformers
```

Then you can use the model like this:

```python
from sentence_transformers import SentenceTransformer

sentences = ["This is an example sentence", "Each sentence is converted"]

model = SentenceTransformer('{MODEL_NAME}')
embeddings = model.encode(sentences)
print(embeddings)
```
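
Because the embeddings live in a dense vector space, semantic search reduces to nearest-neighbor lookup with cosine similarity. Below is a minimal sketch using the `util.cos_sim` helper that ships with sentence-transformers; the query and corpus strings are illustrative only:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('{MODEL_NAME}')

# Illustrative data; substitute your own query and corpus
query_embedding = model.encode("How do I reset my password?", convert_to_tensor=True)
corpus_embeddings = model.encode(
    ["Click 'Forgot password' on the login page.",
     "Our store opens at 9am every day."],
    convert_to_tensor=True,
)

# Cosine similarity between the query and every corpus sentence;
# the highest-scoring sentence is the best semantic match
scores = util.cos_sim(query_embedding, corpus_embeddings)
print(scores)
```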

## Usage (HuggingFace Transformers)

Without sentence-transformers, you can use the model like this: first, pass your input through the transformer model, then apply the right pooling operation on top of the contextualized word embeddings.

```python
from transformers import AutoTokenizer, AutoModel
import torch


# Mean pooling: take the attention mask into account for correct averaging
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0]  # first element of model_output contains all token embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)


# Sentences we want sentence embeddings for
sentences = ['This is an example sentence', 'Each sentence is converted']

# Load model from HuggingFace Hub
tokenizer = AutoTokenizer.from_pretrained('{MODEL_NAME}')
model = AutoModel.from_pretrained('{MODEL_NAME}')

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)

# Perform pooling. In this case, mean pooling.
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])

print("Sentence embeddings:")
print(sentence_embeddings)
```
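
As a quick sanity check, the pooled embeddings can be compared with cosine similarity; since both snippets use the same mean pooling, the score should match what sentence-transformers computes. A minimal sketch (this card does not state that the embeddings are L2-normalized, so we normalize explicitly):

```python
import torch.nn.functional as F

# Normalize, then take the dot product = cosine similarity
normalized = F.normalize(sentence_embeddings, p=2, dim=1)
similarity = (normalized[0] @ normalized[1]).item()
print(f"Cosine similarity between the two sentences: {similarity:.4f}")
```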

## Evaluation Results

The columns report raw accuracy on each evaluation split (id, vn, br, th, my, ph, sg) and the average across all splits; rows are sorted by the average. Rows given as `('model', 'checkpoint')` tuples appear to be OpenCLIP models.

| Model | id_raw_acc | vn_raw_acc | br_raw_acc | th_raw_acc | my_raw_acc | ph_raw_acc | sg_raw_acc | avg |
|---|---|---|---|---|---|---|---|---|
| thtang/SetFit_ALL_200M_itr5 | 74.24% | 64.04% | 58.98% | 67.24% | 70.77% | 70.63% | 70.58% | 68.07% |
| ('ViT-B-16-SigLIP-i18n-256', 'webli') | 69.38% | 57.92% | 47.40% | 56.40% | 65.20% | 65.72% | 65.12% | 61.02% |
| ('xlm-roberta-base-ViT-B-32', 'laion5b_s13b_b90k') | 66.23% | 54.05% | 49.26% | 55.39% | 65.61% | 66.11% | 66.72% | 60.48% |
| ('xlm-roberta-large-ViT-H-14', 'frozen_laion5b_s13b_b90k') | 66.05% | 52.77% | 46.46% | 53.44% | 62.70% | 64.40% | 64.24% | 58.58% |
| ('ViT-L-14', 'commonpool_xl_s13b_b90k') | 65.48% | 53.80% | 46.61% | 51.00% | 62.01% | 64.37% | 63.94% | 58.17% |
| ('ViT-L-14', 'commonpool_xl_clip_s13b_b90k') | 66.73% | 49.82% | 45.25% | 38.32% | 63.64% | 66.17% | 65.29% | 56.46% |
| ('ViT-B-16', 'commonpool_l_s1b_b8k') | 62.14% | 49.25% | 45.20% | 39.47% | 61.15% | 63.03% | 62.63% | 54.69% |
| ('ViT-bigG-14-CLIPA', 'datacomp1b') | 69.21% | 44.39% | 48.25% | 20.54% | 62.83% | 68.15% | 66.48% | 54.26% |
| ('ViT-bigG-14-CLIPA-336', 'datacomp1b') | 69.17% | 44.22% | 48.06% | 20.48% | 62.79% | 67.74% | 66.63% | 54.15% |
| ('ViT-H-14-CLIPA-336', 'datacomp1b') | 68.03% | 42.79% | 47.52% | 20.82% | 62.38% | 67.06% | 66.92% | 53.65% |
| ('ViT-H-14-CLIPA', 'datacomp1b') | 68.18% | 42.82% | 47.33% | 20.68% | 62.31% | 67.26% | 66.56% | 53.59% |
| ('ViT-B-16', 'commonpool_l_clip_s1b_b8k') | 63.68% | 42.24% | 44.87% | 28.59% | 62.04% | 65.18% | 64.97% | 53.08% |
| ('ViT-B-32-256', 'datacomp_s34b_b86k') | 65.44% | 38.94% | 43.57% | 25.11% | 62.39% | 65.82% | 64.94% | 52.32% |
| ('ViT-L-14-CLIPA-336', 'datacomp1b') | 66.99% | 38.69% | 45.25% | 20.36% | 61.47% | 66.78% | 65.56% | 52.16% |
| ('ViT-L-14-CLIPA', 'datacomp1b') | 66.86% | 38.34% | 45.21% | 20.18% | 61.51% | 66.71% | 65.41% | 52.03% |
| ('ViT-H-14-CLIPA-336', 'laion2b') | 64.62% | 35.52% | 44.73% | 21.27% | 61.01% | 67.12% | 65.76% | 51.43% |
| ('ViT-B-32', 'datacomp_xl_s13b_b90k') | 64.57% | 37.26% | 42.06% | 22.61% | 61.96% | 65.59% | 64.63% | 51.24% |
| ('ViT-L-14', 'datacomp_xl_s13b_b90k') | 64.37% | 37.78% | 40.65% | 22.89% | 60.72% | 65.26% | 64.30% | 50.85% |
| ('EVA02-E-14-plus', 'laion2b_s9b_b144k') | 63.51% | 31.79% | 42.52% | 23.71% | 60.74% | 64.74% | 63.97% | 50.14% |
| ('ViT-H-14-quickgelu', 'metaclip_fullcc') | 59.75% | 34.61% | 43.12% | 22.69% | 60.61% | 65.47% | 64.58% | 50.12% |
| ('ViT-B-16', 'datacomp_xl_s13b_b90k') | 63.15% | 36.19% | 39.81% | 22.39% | 60.66% | 63.96% | 63.31% | 49.92% |
| ('ViT-bigG-14', 'laion2b_s39b_b160k') | 63.03% | 31.52% | 41.20% | 23.65% | 60.52% | 65.11% | 63.99% | 49.86% |
| ('ViT-B-16', 'commonpool_l_basic_s1b_b8k') | 62.56% | 36.99% | 40.87% | 22.16% | 59.57% | 63.56% | 63.06% | 49.82% |
| intfloat/multilingual-e5-large | 52.99% | 42.00% | 33.92% | 47.69% | 55.82% | 57.76% | 58.16% | 49.76% |
| intfloat/multilingual-e5-base | 52.06% | 43.21% | 34.17% | 47.41% | 55.28% | 57.38% | 57.45% | 49.57% |
| ('ViT-B-16', 'commonpool_l_image_s1b_b8k') | 61.48% | 36.08% | 40.87% | 22.62% | 59.17% | 63.47% | 62.80% | 49.50% |
| ('convnext_large_d', 'laion2b_s26b_b102k_augreg') | 61.61% | 29.78% | 39.92% | 23.49% | 60.93% | 65.69% | 64.60% | 49.43% |
| ('EVA01-g-14-plus', 'merged2b_s11b_b114k') | 62.34% | 30.29% | 39.02% | 22.80% | 60.83% | 65.19% | 63.49% | 49.14% |
| ('convnext_large_d_320', 'laion2b_s29b_b131k_ft') | 61.18% | 29.24% | 39.09% | 23.23% | 60.65% | 65.64% | 64.12% | 49.02% |
| ('ViT-B-32', 'laion2b_s34b_b79k') | 61.21% | 29.82% | 37.51% | 24.49% | 60.21% | 65.28% | 64.08% | 48.94% |
| ('convnext_large_d_320', 'laion2b_s29b_b131k_ft_soup') | 60.91% | 29.28% | 38.97% | 22.61% | 60.78% | 65.76% | 63.84% | 48.88% |
| ('convnext_xxlarge', 'laion2b_s34b_b82k_augreg_soup') | 61.55% | 30.17% | 38.85% | 22.30% | 60.28% | 64.83% | 63.22% | 48.74% |
| ('ViT-B-32', 'laion2b_e16') | 61.44% | 28.15% | 38.05% | 24.49% | 59.93% | 65.14% | 63.87% | 48.72% |
| ('ViT-B-16', 'datacomp_l_s1b_b8k') | 61.33% | 29.35% | 38.67% | 23.31% | 60.29% | 64.42% | 63.64% | 48.72% |
| ('ViT-H-14', 'laion2b_s32b_b79k') | 61.45% | 29.19% | 38.91% | 22.64% | 60.56% | 64.86% | 63.30% | 48.70% |
| ('EVA02-E-14', 'laion2b_s4b_b115k') | 61.63% | 29.60% | 38.57% | 22.89% | 60.22% | 64.83% | 63.18% | 48.70% |
| ('convnext_xxlarge', 'laion2b_s34b_b82k_augreg_rewind') | 61.24% | 30.22% | 39.04% | 22.40% | 60.02% | 64.75% | 62.99% | 48.67% |
| ('ViT-B-32-quickgelu', 'metaclip_fullcc') | 58.26% | 29.70% | 38.99% | 23.24% | 60.07% | 65.67% | 64.30% | 48.60% |
| ('convnext_xxlarge', 'laion2b_s34b_b82k_augreg') | 60.94% | 29.90% | 39.49% | 22.08% | 60.10% | 64.50% | 63.15% | 48.59% |
| ('ViT-g-14', 'laion2b_s12b_b42k') | 61.46% | 27.70% | 38.23% | 22.46% | 60.65% | 65.68% | 63.87% | 48.58% |
| ('ViT-g-14', 'laion2b_s34b_b88k') | 60.83% | 29.56% | 39.37% | 21.63% | 59.87% | 64.68% | 63.30% | 48.46% |
| ('ViT-L-14-quickgelu', 'metaclip_fullcc') | 56.99% | 31.07% | 40.45% | 23.13% | 59.21% | 64.77% | 63.50% | 48.45% |
| intfloat/multilingual-e5-small | 49.50% | 42.68% | 30.96% | 47.42% | 54.44% | 56.44% | 57.04% | 48.35% |
| ('ViT-B-16-quickgelu', 'metaclip_fullcc') | 58.00% | 28.59% | 37.68% | 23.22% | 59.42% | 65.03% | 64.10% | 48.01% |
| ('ViT-L-14', 'laion2b_s32b_b82k') | 60.18% | 28.09% | 36.28% | 23.70% | 59.89% | 64.86% | 63.01% | 48.00% |
| ('ViT-B-32-quickgelu', 'laion400m_e32') | 59.74% | 25.92% | 36.98% | 25.19% | 59.67% | 64.79% | 63.68% | 48.00% |
| ('ViT-B-32-quickgelu', 'laion400m_e31') | 59.86% | 25.92% | 36.84% | 25.20% | 59.56% | 64.76% | 63.79% | 47.99% |
| ('convnext_base_w', 'laion2b_s13b_b82k_augreg') | 60.97% | 27.03% | 36.75% | 22.90% | 59.70% | 64.78% | 63.46% | 47.94% |
| ('ViT-L-14', 'laion400m_e32') | 60.01% | 24.45% | 37.24% | 23.95% | 59.17% | 65.02% | 63.78% | 47.66% |
| ('EVA01-g-14', 'laion400m_s11b_b41k') | 60.51% | 25.96% | 36.17% | 23.69% | 59.57% | 64.40% | 63.22% | 47.64% |
| ('ViT-B-16-plus-240', 'laion400m_e32') | 59.84% | 25.29% | 36.80% | 23.73% | 59.31% | 64.99% | 63.43% | 47.63% |
| ('ViT-B-16-plus-240', 'laion400m_e31') | 59.69% | 25.22% | 36.79% | 23.69% | 59.44% | 64.92% | 63.53% | 47.61% |
| ('ViT-B-16', 'laion2b_s34b_b88k') | 59.82% | 27.45% | 35.12% | 24.41% | 59.39% | 64.37% | 62.66% | 47.60% |
| ('ViT-L-14', 'laion400m_e31') | 59.91% | 24.26% | 37.53% | 23.84% | 59.08% | 64.90% | 63.64% | 47.60% |
| ('ViT-L-16-SigLIP-256', 'webli') | 65.54% | 20.39% | 44.65% | 15.18% | 60.10% | 64.64% | 62.44% | 47.56% |
| ('roberta-ViT-B-32', 'laion2b_s12b_b32k') | 59.70% | 25.15% | 39.81% | 17.10% | 59.95% | 65.81% | 65.00% | 47.50% |
| ('ViT-L-14', 'commonpool_xl_laion_s13b_b90k') | 58.13% | 26.95% | 34.93% | 23.34% | 59.05% | 64.51% | 63.63% | 47.22% |
| ('ViT-B-16-SigLIP', 'webli') | 64.31% | 19.87% | 44.78% | 14.87% | 58.38% | 65.16% | 62.44% | 47.12% |
| ('ViT-B-16-SigLIP-256', 'webli') | 64.24% | 20.94% | 44.15% | 15.35% | 58.22% | 64.41% | 62.43% | 47.10% |
| ('ViT-B-16-SigLIP-384', 'webli') | 64.36% | 20.06% | 44.41% | 15.11% | 58.03% | 64.68% | 62.10% | 46.96% |
| ('ViT-L-16-SigLIP-384', 'webli') | 64.49% | 20.17% | 44.01% | 14.80% | 58.89% | 64.92% | 61.39% | 46.95% |
| ('ViT-B-32', 'laion400m_e31') | 59.06% | 26.66% | 35.69% | 23.68% | 58.00% | 62.82% | 62.68% | 46.94% |
| ('ViT-B-16-SigLIP-512', 'webli') | 64.28% | 19.61% | 44.17% | 15.09% | 57.71% | 64.83% | 62.44% | 46.88% |
| ('convnext_base_w_320', 'laion_aesthetic_s13b_b82k_augreg') | 57.60% | 26.52% | 35.01% | 24.43% | 57.05% | 64.54% | 62.74% | 46.84% |
| ('ViT-B-16', 'commonpool_l_text_s1b_b8k') | 59.57% | 28.15% | 37.37% | 20.89% | 57.54% | 62.68% | 61.63% | 46.83% |
| ('ViT-B-32', 'laion400m_e32') | 59.05% | 26.62% | 35.44% | 23.54% | 58.00% | 62.74% | 62.27% | 46.81% |
| ('convnext_base_w', 'laion2b_s13b_b82k') | 58.65% | 26.97% | 34.80% | 23.26% | 58.31% | 63.39% | 61.56% | 46.71% |
| sentence-transformers/gtr-t5-xxl | 59.93% | 24.82% | 40.79% | 17.23% | 58.41% | 64.00% | 61.57% | 46.68% |
| ('ViT-B-16', 'laion400m_e32') | 59.01% | 24.34% | 35.07% | 21.84% | 59.04% | 64.58% | 62.73% | 46.66% |
| ('ViT-B-16', 'laion400m_e31') | 58.94% | 24.20% | 34.92% | 21.58% | 59.11% | 64.77% | 63.09% | 46.66% |
| ('convnext_base', 'laion400m_s13b_b51k') | 58.44% | 24.99% | 34.05% | 23.99% | 58.33% | 63.79% | 62.59% | 46.60% |
| ('EVA02-L-14-336', 'merged2b_s6b_b61k') | 59.54% | 23.19% | 34.54% | 22.36% | 59.24% | 63.90% | 63.40% | 46.60% |
| ('coca_ViT-B-32', 'laion2b_s13b_b90k') | 58.70% | 27.10% | 33.22% | 24.13% | 57.53% | 63.56% | 61.87% | 46.59% |
| ('EVA02-L-14', 'merged2b_s4b_b131k') | 59.64% | 23.18% | 34.62% | 22.55% | 59.11% | 63.86% | 63.10% | 46.58% |
| thenlper/gte-large | 55.10% | 28.16% | 33.96% | 18.73% | 59.50% | 65.19% | 63.52% | 46.31% |
| ('ViT-L-14-quickgelu', 'metaclip_400m') | 54.32% | 25.87% | 34.30% | 23.41% | 58.50% | 64.48% | 63.24% | 46.30% |
| ('coca_ViT-L-14', 'laion2b_s13b_b90k') | 57.92% | 25.78% | 33.97% | 24.17% | 57.64% | 63.08% | 61.55% | 46.30% |
| ('coca_ViT-L-14', 'mscoco_finetuned_laion2b_s13b_b90k') | 58.07% | 25.32% | 34.18% | 24.60% | 57.77% | 62.80% | 61.28% | 46.29% |
| ('ViT-B-32-quickgelu', 'metaclip_400m') | 55.85% | 27.37% | 31.91% | 21.76% | 58.64% | 64.69% | 63.11% | 46.19% |
| sentence-transformers/paraphrase-multilingual-mpnet-base-v2 | 49.03% | 32.58% | 32.82% | 38.43% | 55.30% | 57.36% | 57.34% | 46.12% |
| ('convnext_base_w', 'laion_aesthetic_s13b_b82k') | 57.39% | 25.68% | 33.71% | 23.82% | 56.64% | 63.22% | 62.22% | 46.10% |
| ('ViT-B-32', 'commonpool_m_clip_s128m_b4k') | 56.09% | 26.70% | 38.25% | 22.79% | 56.52% | 61.26% | 61.05% | 46.09% |
| ('convnext_base_w_320', 'laion_aesthetic_s13b_b82k') | 56.96% | 25.60% | 33.77% | 24.64% | 56.32% | 63.33% | 61.87% | 46.07% |
| ('ViT-B-16', 'commonpool_l_laion_s1b_b8k') | 56.37% | 25.70% | 31.07% | 23.18% | 58.65% | 63.93% | 63.49% | 46.06% |
| ('ViT-B-16-quickgelu', 'metaclip_400m') | 55.90% | 25.88% | 32.67% | 21.57% | 58.65% | 64.48% | 63.04% | 46.03% |
| intfloat/e5-large | 55.45% | 28.54% | 36.69% | 18.15% | 57.78% | 62.92% | 61.83% | 45.91% |
| ('EVA02-B-16', 'merged2b_s8b_b131k') | 58.08% | 24.45% | 31.80% | 22.36% | 58.45% | 63.25% | 62.44% | 45.83% |
| sentence-transformers/LaBSE | 50.30% | 32.82% | 33.15% | 39.79% | 54.95% | 53.71% | 55.06% | 45.68% |
| thenlper/gte-base | 55.46% | 27.88% | 32.77% | 17.20% | 58.09% | 63.68% | 62.03% | 45.30% |
| intfloat/e5-large-v2 | 55.10% | 28.06% | 35.95% | 17.16% | 57.16% | 61.21% | 60.84% | 45.07% |
| ('ViT-SO400M-14-SigLIP', 'webli') | 60.18% | 29.39% | 38.90% | 13.73% | 52.79% | 59.15% | 56.81% | 44.42% |
| ('ViT-B-32', 'commonpool_m_s128m_b4k') | 50.30% | 32.12% | 37.08% | 23.02% | 53.63% | 57.64% | 56.91% | 44.39% |
| sentence-transformers/sentence-t5-xxl | 50.98% | 18.38% | 36.37% | 16.91% | 59.25% | 64.82% | 63.75% | 44.35% |
| infgrad/stella-base-en-v2 | 52.42% | 26.24% | 30.61% | 18.81% | 56.84% | 63.03% | 61.67% | 44.23% |
| ('RN50x4', 'openai') | 56.39% | 25.77% | 29.99% | 21.48% | 55.31% | 61.02% | 59.42% | 44.20% |
| ('RN50x16', 'openai') | 56.58% | 25.09% | 29.77% | 21.03% | 54.81% | 61.28% | 58.47% | 43.86% |
| ('RN101-quickgelu', 'openai') | 56.57% | 25.83% | 29.66% | 21.09% | 54.50% | 60.18% | 58.74% | 43.80% |
| ('RN101', 'openai') | 56.57% | 25.83% | 29.66% | 21.09% | 54.50% | 60.18% | 58.74% | 43.80% |
| llmrails/ember-v1 | 50.85% | 24.76% | 31.02% | 17.20% | 57.62% | 63.06% | 62.04% | 43.79% |
| sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 | 44.88% | 28.32% | 29.45% | 36.40% | 53.97% | 56.87% | 56.14% | 43.72% |
| BAAI/bge-large-en-v1.5 | 49.81% | 25.55% | 30.68% | 17.41% | 56.89% | 62.87% | 61.72% | 43.56% |
| ('RN50x64', 'openai') | 55.34% | 22.19% | 30.63% | 20.79% | 55.18% | 60.93% | 59.45% | 43.50% |
| ('nllb-clip-large', 'v1') | 48.84% | 23.45% | 33.92% | 32.38% | 53.67% | 55.36% | 56.76% | 43.48% |
| BAAI/bge-base-en-v1.5 | 51.73% | 24.30% | 31.51% | 17.53% | 56.21% | 62.37% | 60.25% | 43.42% |
| intfloat/e5-small | 51.31% | 27.36% | 32.05% | 16.66% | 55.15% | 60.39% | 59.06% | 43.14% |
| BAAI/bge-small-en-v1.5 | 51.37% | 25.16% | 29.99% | 16.13% | 56.17% | 61.69% | 61.01% | 43.07% |
| ('ViT-L-14', 'openai') | 54.57% | 21.44% | 30.13% | 19.50% | 54.99% | 60.94% | 59.59% | 43.02% |
| ('ViT-L-14-336', 'openai') | 54.12% | 21.52% | 30.63% | 19.47% | 55.41% | 60.77% | 58.87% | 42.97% |
| intfloat/e5-small-v2 | 51.41% | 26.82% | 33.04% | 16.30% | 54.97% | 58.66% | 58.68% | 42.84% |
| ('ViT-SO400M-14-SigLIP-384', 'webli') | 62.68% | 15.00% | 32.38% | 7.32% | 56.65% | 64.12% | 61.49% | 42.81% |
| ('RN50-quickgelu', 'openai') | 53.15% | 24.79% | 29.57% | 20.84% | 53.15% | 59.19% | 57.59% | 42.61% |
| ('RN50', 'openai') | 53.15% | 24.79% | 29.57% | 20.84% | 53.15% | 59.19% | 57.59% | 42.61% |
| ('ViT-B-16', 'openai') | 53.31% | 22.22% | 27.96% | 21.22% | 53.68% | 59.47% | 58.45% | 42.33% |
| ('ViT-B-32', 'openai') | 52.93% | 23.44% | 28.70% | 20.78% | 52.96% | 59.38% | 57.93% | 42.30% |
| ('ViT-B-32-quickgelu', 'openai') | 52.93% | 23.44% | 28.70% | 20.78% | 52.96% | 59.38% | 57.93% | 42.30% |
| sentence-transformers/all-MiniLM-L6-v2 | 50.80% | 25.76% | 27.04% | 15.81% | 54.63% | 60.07% | 59.68% | 41.97% |
| ('ViT-B-32', 'commonpool_m_basic_s128m_b4k') | 52.54% | 22.67% | 30.25% | 16.17% | 53.22% | 59.40% | 58.31% | 41.80% |
| sentence-transformers/all-MiniLM-L12-v2 | 48.98% | 24.05% | 25.74% | 16.41% | 54.51% | 60.38% | 58.90% | 41.28% |
| ('ViT-B-32', 'commonpool_m_image_s128m_b4k') | 51.93% | 20.40% | 29.44% | 16.53% | 53.16% | 58.71% | 58.17% | 41.19% |
| sentence-transformers/clip-ViT-B-32-multilingual-v1 | 44.45% | 27.34% | 28.00% | 28.25% | 50.30% | 54.05% | 53.39% | 40.82% |
| sentence-transformers/distiluse-base-multilingual-cased-v2 | 43.51% | 23.86% | 28.41% | 26.90% | 53.14% | 53.54% | 54.38% | 40.53% |
| ('ViT-B-32', 'datacomp_m_s128m_b4k') | 51.60% | 19.45% | 26.58% | 16.46% | 52.54% | 59.03% | 58.03% | 40.53% |
| ('ViT-B-32', 'commonpool_m_text_s128m_b4k') | 50.38% | 20.31% | 27.01% | 16.00% | 52.61% | 58.82% | 58.10% | 40.46% |
| sentence-transformers/all-mpnet-base-v2 | 46.97% | 23.15% | 24.75% | 16.31% | 52.66% | 59.07% | 57.75% | 40.09% |
| ('nllb-clip-base', 'v1') | 42.72% | 23.90% | 29.29% | 33.96% | 48.33% | 49.09% | 51.21% | 39.79% |
| sentence-transformers/paraphrase-mpnet-base-v2 | 46.00% | 20.45% | 26.92% | 14.75% | 52.89% | 58.71% | 58.20% | 39.70% |
| sentence-transformers/all-distilroberta-v1 | 46.74% | 22.34% | 24.06% | 17.59% | 51.49% | 57.54% | 56.45% | 39.46% |
| sentence-transformers/paraphrase-MiniLM-L6-v2 | 44.92% | 23.59% | 26.12% | 14.23% | 51.84% | 57.14% | 56.03% | 39.12% |
| ('ViT-B-32', 'commonpool_m_laion_s128m_b4k') | 42.94% | 19.21% | 19.70% | 17.26% | 50.84% | 57.59% | 56.06% | 37.66% |
| ('RN50-quickgelu', 'cc12m') | 40.71% | 18.10% | 16.78% | 16.23% | 45.55% | 52.89% | 50.77% | 34.43% |
| ('RN50', 'cc12m') | 39.76% | 17.32% | 16.15% | 15.76% | 44.25% | 52.46% | 49.18% | 33.55% |
| ('RN101', 'yfcc15m') | 33.79% | 18.04% | 16.05% | 11.10% | 37.62% | 43.50% | 42.45% | 28.94% |
| ('RN101-quickgelu', 'yfcc15m') | 32.79% | 16.89% | 14.45% | 11.56% | 37.77% | 42.86% | 41.93% | 28.32% |
| ('ViT-B-32', 'commonpool_s_clip_s13m_b4k') | 33.80% | 13.26% | 18.82% | 12.42% | 37.36% | 42.09% | 40.39% | 28.31% |
| ('RN50', 'yfcc15m') | 31.81% | 15.87% | 14.88% | 8.99% | 37.42% | 42.06% | 41.19% | 27.46% |
| ('RN50-quickgelu', 'yfcc15m') | 31.57% | 15.90% | 14.44% | 8.99% | 36.81% | 41.81% | 41.20% | 27.24% |
| ('ViT-B-32', 'commonpool_s_s13m_b4k') | 29.42% | 12.57% | 16.82% | 11.00% | 32.42% | 36.77% | 35.48% | 24.93% |
| ('ViT-B-32', 'commonpool_s_text_s13m_b4k') | 28.02% | 10.61% | 12.49% | 9.85% | 31.18% | 37.10% | 34.85% | 23.44% |
| ('ViT-B-32', 'commonpool_s_basic_s13m_b4k') | 27.87% | 10.72% | 12.67% | 8.16% | 30.11% | 36.13% | 32.68% | 22.62% |
| ('coca_ViT-B-32', 'mscoco_finetuned_laion2b_s13b_b90k') | 12.60% | 7.91% | 5.11% | 9.96% | 17.15% | 20.67% | 20.32% | 13.39% |
| ('ViT-B-32', 'commonpool_s_image_s13m_b4k') | 15.20% | 5.59% | 5.91% | 4.63% | 16.80% | 20.74% | 18.78% | 12.52% |
| ('ViT-B-32', 'datacomp_s_s13m_b4k') | 15.20% | 5.59% | 5.91% | 4.63% | 16.80% | 20.74% | 18.78% | 12.52% |
| ('ViT-B-32', 'commonpool_s_laion_s13m_b4k') | 11.72% | 5.12% | 4.05% | 4.23% | 14.33% | 18.82% | 16.44% | 10.67% |

## Training

The model was trained with the following parameters:

**DataLoader**:

`torch.utils.data.dataloader.DataLoader` of length 1468721 with parameters:

```
{'batch_size': 160, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
```

**Loss**:

`sentence_transformers.losses.CosineSimilarityLoss.CosineSimilarityLoss`

Parameters of the `fit()` method:

```json
{
    "epochs": 1,
    "evaluation_steps": 0,
    "evaluator": "NoneType",
    "max_grad_norm": 1,
    "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
    "optimizer_params": {
        "lr": 2e-05
    },
    "scheduler": "WarmupLinear",
    "steps_per_epoch": null,
    "warmup_steps": 100,
    "weight_decay": 0.01
}
```
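
For reference, the DataLoader, loss, and `fit()` parameters above correspond roughly to a training call like the sketch below. This is a reconstruction under stated assumptions, not the original training script: `train_examples` is a hypothetical placeholder, since the actual training pairs are not published with this card.

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer('{MODEL_NAME}')

# Hypothetical placeholder: CosineSimilarityLoss expects text pairs
# with a float similarity label
train_examples = [
    InputExample(texts=["sentence A", "sentence B"], label=0.8),
]

train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=160)
train_loss = losses.CosineSimilarityLoss(model)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    warmup_steps=100,
    scheduler='WarmupLinear',
    optimizer_params={'lr': 2e-05},
    weight_decay=0.01,
    max_grad_norm=1,
)
```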

## Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
)
```
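
Note the `max_seq_length` of 128: inputs longer than 128 tokens are truncated before pooling. The limit can be inspected at runtime:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('{MODEL_NAME}')
print(model.max_seq_length)  # 128; longer inputs are truncated at encode time
```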

## Citing & Authors

<!--- Describe where people can find more information -->