Tags: bert, roberta, xlmroberta, vietnam, vietnamese, wiki

Vi-XLM-RoBERTa base model (uncased)

MODEL IS BEING TRAINED (Pending...)

Logs

<a href="https://wandb.ai/anhdungitvn/test/reports/perplexity-22-10-21-21-03-10---VmlldzoyODMwOTQ3?accessToken=lk98cqcsbmcdl4ftjw8vu2bfoc6ifigal1p2db3zo27cc7xwzedfivk3o4aeei3h">Perplexity</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/Training-loss-22-10-21-21-04-12---VmlldzoyODMwOTU4?accessToken=8kfl35d92xxa1eockvrf08p8r65zhlrm1dxfl7vhu5avc3cfozl897n030t1q1f6">Train Loss</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/eval_loss-22-10-21-21-04-24---VmlldzoyODMwOTYx?accessToken=3efesxgcurawkzl93anh4w4hz28c1fl5oexrnwb0mo0pz118fz6mappkpr434tp9">Eval Loss</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/lr-22-10-21-21-03-57---VmlldzoyODMwOTU1?accessToken=ggggf7fe4c4rvhexuonybf3m6mhrdn8tz7o4qjqwbml6k0h9f0953zpsbrympktq">Learning Rate</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/global_step-22-10-21-21-04-36---VmlldzoyODMwOTYy?accessToken=cjd65vgz6uviswqqi5tbud14g9v4bxhjpuccal4hqmrm1h4vqfxithrf4wg0u7xq">Global Step</a>

Gradients by layers

<a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-embeddings-LayerNorm-weight-22-12-06-20-31-20---VmlldzozMDk0NTkx?accessToken=vepursxxx3s66rvrnwtiep9jhx8vjkoid1oyk3gd83b6ertq6r70qhotjg9ne36s">Embedding</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-0-attention-output-LayerNorm-weight-22-12-06-20-30-40---VmlldzozMDk0NTg4?accessToken=x975ir3hd7smhqm64dahbgtv76uwklmgf87j7izkkjbcgu41ed9zcfkk3rzb1nmm">L0-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-1-attention-output-LayerNorm-weight-22-12-06-20-30-24---VmlldzozMDk0NTg2?accessToken=gwqxn1asm08k2jrdhnpa2oiprley2cmpqgnazjxctorq53yfb0zq5yqq8p46ubzq">L1-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-2-attention-output-LayerNorm-weight-22-12-06-20-30-09---VmlldzozMDk0NTgz?accessToken=b6kr19k4qutc12i1hsfki4jryr8ou5pzbmlkvyq6zwxj25tw1txe5vsd8t4li4i3">L2-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-3-attention-output-LayerNorm-weight-22-12-06-20-29-49---VmlldzozMDk0NTgx?accessToken=12klldmz2xpfjcw0bi6a5ggqe3ebtcdq9yismzaz562xajxb696zqcgvsdxp3ext">L3-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-4-attention-output-LayerNorm-weight-22-12-06-20-29-28---VmlldzozMDk0NTc2?accessToken=qgbz1njt7vwi0aspw7bh4n6ad4i0bctctgedq7qmsd4lds9l758ibbm5g4ndfs9f">L4-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-5-attention-output-LayerNorm-weight-22-12-06-20-29-11---VmlldzozMDk0NTc1?accessToken=wytlftyuujtgpmxxwk6q3kd30zjre3qiloa6w6h1aw6eb2169udu3lwtgbbyhd8x">L5-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-6-attention-output-LayerNorm-weight-22-12-06-20-28-53---VmlldzozMDk0NTcx?accessToken=k2ni3ba1xx50vpeto3xpnn1hc0ypvxd7im5mrk1pq47krl64fc765pjr1zt14eyc">L6-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-7-attention-output-LayerNorm-weight-22-12-06-20-28-37---VmlldzozMDk0NTY3?accessToken=kwsawb1wu60ecq1pnvtiim7sumqqca9uqn7rcv2o0ize794h49cgvchv680nju1z">L7-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-8-attention-output-LayerNorm-weight-22-12-06-20-28-20---VmlldzozMDk0NTY0?accessToken=uaqu2tqx39ki634m6j3fywcs76tzyp3q9i1riw9af52pehym9p3ukyamsyldq6te">L8-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-9-attention-output-LayerNorm-weight-22-12-06-20-27-57---VmlldzozMDk0NTYz?accessToken=8fvpijkrfvaywa909woxz38qa0dj8jsm6wyjd3tpubemoakzf3u84p3yel3g5b4a">L9-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-10-attention-output-LayerNorm-weight-22-12-06-20-26-45---VmlldzozMDk0NTU3?accessToken=qv18nauts75orcu6b3lpt227g3aa88wfnpy2lnckwn771pp5j4fn270lrnrxvaie">L10-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-11-attention-output-LayerNorm-weight-22-12-06-20-25-42---VmlldzozMDk0NTUz?accessToken=6xc2nhzbbctc7o0h8ztaspy8m1ha9r1qoldav2e1rilvwl1yu48frels9134ittl">L11-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-lm_head-layer_norm-weight-22-12-06-20-32-36---VmlldzozMDk0NTk4?accessToken=t2zroq4rqqu12sak0v8hohmkkqgjuqfgni6apqtearwxa01d3yppglhtux91y4ga">LM Head</a>

This model is a Vietnamese XLM-RoBERTa, base size, uncased.

Model description

Intended uses & limitations

You can use the raw model for masked language modeling, but it is mostly intended to be fine-tuned on a downstream task.

How to use

Here is how to run this model on a given text in PyTorch:

from transformers import AutoTokenizer, XLMRobertaForMaskedLM
tokenizer = AutoTokenizer.from_pretrained('anhdungitvn/vi-xlm-roberta-base')
model = XLMRobertaForMaskedLM.from_pretrained('anhdungitvn/vi-xlm-roberta-base')
text = "Câu bằng tiếng Việt."
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)
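Building on the snippet above, you can mask a token and decode the top predictions of the masked-language-modeling head. This is a minimal sketch; the example sentence is arbitrary and any Vietnamese text containing the tokenizer's mask token works:

from transformers import AutoTokenizer, XLMRobertaForMaskedLM
import torch

tokenizer = AutoTokenizer.from_pretrained('anhdungitvn/vi-xlm-roberta-base')
model = XLMRobertaForMaskedLM.from_pretrained('anhdungitvn/vi-xlm-roberta-base')

# Illustrative sentence: "Hanoi is the capital of <mask>."
text = f"Hà Nội là thủ đô của {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors='pt')
with torch.no_grad():
    logits = model(**inputs).logits

# Locate the mask token and print the top-5 predicted tokens for it.
mask_positions = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_ids = logits[0, mask_positions[0]].topk(5).indices
print(tokenizer.convert_ids_to_tokens(top_ids.tolist()))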

The same basic usage in TensorFlow:

from transformers import AutoTokenizer, TFXLMRobertaForMaskedLM
tokenizer = AutoTokenizer.from_pretrained('anhdungitvn/vi-xlm-roberta-base')
model = TFXLMRobertaForMaskedLM.from_pretrained('anhdungitvn/vi-xlm-roberta-base')
text = "Câu bằng tiếng Việt."
encoded_input = tokenizer(text, return_tensors='tf')
output = model(encoded_input)
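For quick experimentation you can also use the fill-mask pipeline, which wraps tokenization and decoding in one call. A minimal sketch, using an arbitrary example sentence:

from transformers import pipeline

# XLM-RoBERTa tokenizers use "<mask>" as the mask token.
unmasker = pipeline('fill-mask', model='anhdungitvn/vi-xlm-roberta-base')
print(unmasker("Hà Nội là thủ đô của <mask>."))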

Limitations and bias

Even though the training data used for this model could be characterized as fairly neutral, the model can still produce biased predictions.

Training data

Training procedure

The model is being pretrained; details will be added once training completes.
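For orientation only, a typical masked-language-modeling pretraining setup with the Hugging Face Trainer is sketched below. The corpus file, tokenizer, and every hyperparameter are assumptions rather than the values used for this model.

from datasets import load_dataset
from transformers import (AutoTokenizer, DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments, XLMRobertaConfig, XLMRobertaForMaskedLM)

# Illustrative sketch only: corpus file, tokenizer and hyperparameters are assumptions.
tokenizer = AutoTokenizer.from_pretrained('xlm-roberta-base')
model = XLMRobertaForMaskedLM(XLMRobertaConfig(vocab_size=tokenizer.vocab_size))

# Hypothetical plain-text corpus, one document or sentence per line.
dataset = load_dataset('text', data_files={'train': 'vi_wiki.txt'})
dataset = dataset.map(lambda batch: tokenizer(batch['text'], truncation=True, max_length=512),
                      batched=True, remove_columns=['text'])

# Standard MLM objective: the collator masks 15% of tokens on the fly.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
args = TrainingArguments(
    output_dir='vi-xlm-roberta-base',
    per_device_train_batch_size=16,
    learning_rate=1e-4,
    max_steps=100_000,
    save_steps=10_000,
    report_to='wandb',
)
trainer = Trainer(model=model, args=args, train_dataset=dataset['train'], data_collator=collator)
trainer.train()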

Evaluation results

Pretraining metrics and results:

When fine-tuned on downstream tasks, this model achieves the following results:

Task    SC     CN  NC  VSLP_2016_ASC  T  T  T  T
Result  92.67  x   x   x              x  x  x  x

Downstream Task Dataset:

Per-task evaluation results

SC: Sentiment Classification (Phân loại sắc thái bình luận, i.e., classifying the sentiment of comments)

<img width="1024px" src="https://i.postimg.cc/02fLqVDP/sc-cosine-schedule-with-warmup.png">

SC fine-tuning metrics by global step (values rounded to four decimal places):

global_step  train_loss  mcc     tp   tn   fp   fn   auroc   auprc   eval_loss  acc
9            0.3402      0.7380  576  827  115  91   0.9352  0.8837  0.3078     0.8720
10           0.3010      0.7154  644  714  228  23   0.9342  0.8856  0.4522     0.8440
18           0.2708      0.7569  553  867  75   114  0.9547  0.9305  0.2917     0.8825
20           0.3240      0.7954  621  823  119  46   0.9586  0.9351  0.2704     0.8975
27           0.2098      0.8167  619  844  98   48   0.9642  0.9433  0.2319     0.9093
30           0.1791      0.8256  608  864  78   59   0.9660  0.9452  0.2376     0.9149
36           0.1826      0.8376  617  864  78   50   0.9686  0.9500  0.2207     0.9204
40           0.1001      0.8315  616  860  82   51   0.9677  0.9483  0.2241     0.9173
45           0.1039      0.8476  627  861  81   40   0.9688  0.9502  0.2369     0.9248
50           0.0746      0.8403  626  856  86   41   0.9693  0.9509  0.2346     0.9211
54           0.2125      0.8452  627  859  83   40   0.9686  0.9488  0.2354     0.9236
60           0.1118      0.8502  621  870  72   46   0.9680  0.9471  0.2481     0.9267
63           0.0778      0.8399  616  867  75   51   0.9673  0.9453  0.2551     0.9217
70           0.0747      0.8414  618  866  76   49   0.9676  0.9461  0.2638     0.9223
72           0.0181      0.8423  616  869  73   51   0.9677  0.9464  0.2658     0.9229
80           0.0439      0.8381  611  871  71   56   0.9676  0.9461  0.2679     0.9211
81           0.0460      0.8395  613  870  72   54   0.9675  0.9460  0.2676     0.9217
90           0.0371      0.8398  615  868  74   52   0.9674  0.9456  0.2672     0.9217
90           0.0517      0.8398  615  868  74   52   0.9674  0.9456  0.2672     0.9217
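A sentiment-classification fine-tune of this checkpoint might be set up roughly as in the sketch below; the CSV files, column names, label count, and hyperparameters are assumptions, not the exact setup behind the table above. The learning-rate curve in the figure above is a cosine schedule with warmup, set here via lr_scheduler_type.

from datasets import load_dataset
from transformers import (AutoTokenizer, Trainer, TrainingArguments,
                          XLMRobertaForSequenceClassification)

tokenizer = AutoTokenizer.from_pretrained('anhdungitvn/vi-xlm-roberta-base')
model = XLMRobertaForSequenceClassification.from_pretrained(
    'anhdungitvn/vi-xlm-roberta-base', num_labels=2)  # label count is an assumption

# Hypothetical CSV files with "text" and "label" columns.
data = load_dataset('csv', data_files={'train': 'sc_train.csv', 'validation': 'sc_dev.csv'})
data = data.map(lambda batch: tokenizer(batch['text'], truncation=True, max_length=256),
                batched=True)

args = TrainingArguments(
    output_dir='vi-xlm-roberta-base-sc',
    per_device_train_batch_size=32,
    learning_rate=2e-5,
    num_train_epochs=3,
    lr_scheduler_type='cosine',   # matches the cosine-with-warmup curve shown above
    warmup_ratio=0.1,             # warmup fraction is an assumption
    evaluation_strategy='steps',
    eval_steps=10,
)
trainer = Trainer(model=model, args=args,
                  train_dataset=data['train'], eval_dataset=data['validation'])
trainer.train()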

Logs

<a href="https://wandb.ai/anhdungitvn/test_cls/reports/-tp-tn-tp-tn-fp-fn-22-10-21-09-50-27---VmlldzoyODI4MDcz?accessToken=axp3jb8pw01a7susb3n6pu0i5mfbay66v21oeae1dkx93km41rcuhvrkkqfqpar5">Perplexity</a>

BibTeX entry and citation info

@article{2022,
  title={x},
  author={x},
  journal={ArXiv},
  year={2022},
  volume={x}
}

<a href="https://huggingface.co/exbert/?model=anhdungitvn/vi-xlm-roberta-base"> <img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png"> </a>