Tags: bert, roberta, xlmroberta, vietnam, vietnamese, wiki

Vi-XLM-RoBERTa base model (uncased)

MODEL IS BEING TRAINED (Pending...)

Logs

<a href="https://wandb.ai/anhdungitvn/test/reports/perplexity-22-10-21-21-03-10---VmlldzoyODMwOTQ3?accessToken=lk98cqcsbmcdl4ftjw8vu2bfoc6ifigal1p2db3zo27cc7xwzedfivk3o4aeei3h">Perplexity</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/Training-loss-22-10-21-21-04-12---VmlldzoyODMwOTU4?accessToken=8kfl35d92xxa1eockvrf08p8r65zhlrm1dxfl7vhu5avc3cfozl897n030t1q1f6">Train Loss</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/eval_loss-22-10-21-21-04-24---VmlldzoyODMwOTYx?accessToken=3efesxgcurawkzl93anh4w4hz28c1fl5oexrnwb0mo0pz118fz6mappkpr434tp9">Eval Loss</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/lr-22-10-21-21-03-57---VmlldzoyODMwOTU1?accessToken=ggggf7fe4c4rvhexuonybf3m6mhrdn8tz7o4qjqwbml6k0h9f0953zpsbrympktq">Learning Rate</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/global_step-22-10-21-21-04-36---VmlldzoyODMwOTYy?accessToken=cjd65vgz6uviswqqi5tbud14g9v4bxhjpuccal4hqmrm1h4vqfxithrf4wg0u7xq">Global Step</a>

Gradients by layers

<a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-embeddings-LayerNorm-weight-22-12-06-20-31-20---VmlldzozMDk0NTkx?accessToken=vepursxxx3s66rvrnwtiep9jhx8vjkoid1oyk3gd83b6ertq6r70qhotjg9ne36s">Embedding</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-0-attention-output-LayerNorm-weight-22-12-06-20-30-40---VmlldzozMDk0NTg4?accessToken=x975ir3hd7smhqm64dahbgtv76uwklmgf87j7izkkjbcgu41ed9zcfkk3rzb1nmm">L0-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-1-attention-output-LayerNorm-weight-22-12-06-20-30-24---VmlldzozMDk0NTg2?accessToken=gwqxn1asm08k2jrdhnpa2oiprley2cmpqgnazjxctorq53yfb0zq5yqq8p46ubzq">L1-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-2-attention-output-LayerNorm-weight-22-12-06-20-30-09---VmlldzozMDk0NTgz?accessToken=b6kr19k4qutc12i1hsfki4jryr8ou5pzbmlkvyq6zwxj25tw1txe5vsd8t4li4i3">L2-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-3-attention-output-LayerNorm-weight-22-12-06-20-29-49---VmlldzozMDk0NTgx?accessToken=12klldmz2xpfjcw0bi6a5ggqe3ebtcdq9yismzaz562xajxb696zqcgvsdxp3ext">L3-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-4-attention-output-LayerNorm-weight-22-12-06-20-29-28---VmlldzozMDk0NTc2?accessToken=qgbz1njt7vwi0aspw7bh4n6ad4i0bctctgedq7qmsd4lds9l758ibbm5g4ndfs9f">L4-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-5-attention-output-LayerNorm-weight-22-12-06-20-29-11---VmlldzozMDk0NTc1?accessToken=wytlftyuujtgpmxxwk6q3kd30zjre3qiloa6w6h1aw6eb2169udu3lwtgbbyhd8x">L5-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-6-attention-output-LayerNorm-weight-22-12-06-20-28-53---VmlldzozMDk0NTcx?accessToken=k2ni3ba1xx50vpeto3xpnn1hc0ypvxd7im5mrk1pq47krl64fc765pjr1zt14eyc">L6-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-7-attention-output-LayerNorm-weight-22-12-06-20-28-37---VmlldzozMDk0NTY3?accessToken=kwsawb1wu60ecq1pnvtiim7sumqqca9uqn7rcv2o0ize794h49cgvchv680nju1z">L7-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-8-attention-output-LayerNorm-weight-22-12-06-20-28-20---VmlldzozMDk0NTY0?accessToken=uaqu2tqx39ki634m6j3fywcs76tzyp3q9i1riw9af52pehym9p3ukyamsyldq6te">L8-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-9-attention-output-LayerNorm-weight-22-12-06-20-27-57---VmlldzozMDk0NTYz?accessToken=8fvpijkrfvaywa909woxz38qa0dj8jsm6wyjd3tpubemoakzf3u84p3yel3g5b4a">L9-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-10-attention-output-LayerNorm-weight-22-12-06-20-26-45---VmlldzozMDk0NTU3?accessToken=qv18nauts75orcu6b3lpt227g3aa88wfnpy2lnckwn771pp5j4fn270lrnrxvaie">L10-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-roberta-encoder-layer-11-attention-output-LayerNorm-weight-22-12-06-20-25-42---VmlldzozMDk0NTUz?accessToken=6xc2nhzbbctc7o0h8ztaspy8m1ha9r1qoldav2e1rilvwl1yu48frels9134ittl">L11-Attention</a> | <a href="https://wandb.ai/anhdungitvn/test/reports/gradients-lm_head-layer_norm-weight-22-12-06-20-32-36---VmlldzozMDk0NTk4?accessToken=t2zroq4rqqu12sak0v8hohmkkqgjuqfgni6apqtearwxa01d3yppglhtux91y4ga">LM Head</a>

This model is a Vietnamese XLM-RoBERTa, base size, uncased.

Model description

Intended uses & limitations

You can use the raw model for masked language modeling, but it is mostly intended to be fine-tuned on a downstream task.

How to use

Here is how to run this model on a given text in PyTorch:

from transformers import AutoTokenizer, XLMRobertaForMaskedLM
tokenizer = AutoTokenizer.from_pretrained('anhdungitvn/vi-xlm-roberta-base')
model = XLMRobertaForMaskedLM.from_pretrained('anhdungitvn/vi-xlm-roberta-base')
text = "Câu bằng tiếng Việt."
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)
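Building on the snippet above, you can mask a token and decode the top predictions of the masked-language-modeling head. This is a minimal sketch; the example sentence is arbitrary and any Vietnamese text containing the tokenizer's mask token works:

from transformers import AutoTokenizer, XLMRobertaForMaskedLM
import torch

tokenizer = AutoTokenizer.from_pretrained('anhdungitvn/vi-xlm-roberta-base')
model = XLMRobertaForMaskedLM.from_pretrained('anhdungitvn/vi-xlm-roberta-base')

# Illustrative sentence: "Hanoi is the capital of <mask>."
text = f"Hà Nội là thủ đô của {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors='pt')
with torch.no_grad():
    logits = model(**inputs).logits

# Locate the mask token and print the top-5 predicted tokens for it.
mask_positions = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_ids = logits[0, mask_positions[0]].topk(5).indices
print(tokenizer.convert_ids_to_tokens(top_ids.tolist()))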

The same basic usage in TensorFlow:

from transformers import AutoTokenizer, TFXLMRobertaForMaskedLM
tokenizer = AutoTokenizer.from_pretrained('anhdungitvn/vi-xlm-roberta-base')
model = TFXLMRobertaForMaskedLM.from_pretrained('anhdungitvn/vi-xlm-roberta-base')
text = "Câu bằng tiếng Việt."
encoded_input = tokenizer(text, return_tensors='tf')
output = model(encoded_input)
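For quick experimentation you can also use the fill-mask pipeline, which wraps tokenization and decoding in one call. A minimal sketch, using an arbitrary example sentence:

from transformers import pipeline

# XLM-RoBERTa tokenizers use "<mask>" as the mask token.
unmasker = pipeline('fill-mask', model='anhdungitvn/vi-xlm-roberta-base')
print(unmasker("Hà Nội là thủ đô của <mask>."))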

Limitations and bias

Even though the training data used for this model could be characterized as fairly neutral, the model can still produce biased predictions.

Training data

Training procedure

The model is being pretrained; details will be added once training completes.
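For orientation only, a typical masked-language-modeling pretraining setup with the Hugging Face Trainer is sketched below. The corpus file, tokenizer, and every hyperparameter are assumptions rather than the values used for this model.

from datasets import load_dataset
from transformers import (AutoTokenizer, DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments, XLMRobertaConfig, XLMRobertaForMaskedLM)

# Illustrative sketch only: corpus file, tokenizer and hyperparameters are assumptions.
tokenizer = AutoTokenizer.from_pretrained('xlm-roberta-base')
model = XLMRobertaForMaskedLM(XLMRobertaConfig(vocab_size=tokenizer.vocab_size))

# Hypothetical plain-text corpus, one document or sentence per line.
dataset = load_dataset('text', data_files={'train': 'vi_wiki.txt'})
dataset = dataset.map(lambda batch: tokenizer(batch['text'], truncation=True, max_length=512),
                      batched=True, remove_columns=['text'])

# Standard MLM objective: the collator masks 15% of tokens on the fly.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
args = TrainingArguments(
    output_dir='vi-xlm-roberta-base',
    per_device_train_batch_size=16,
    learning_rate=1e-4,
    max_steps=100_000,
    save_steps=10_000,
    report_to='wandb',
)
trainer = Trainer(model=model, args=args, train_dataset=dataset['train'], data_collator=collator)
trainer.train()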

Evaluation results

Pretraining metrics and results:

When fine-tuned on downstream tasks, this model achieves the following results:

Task    SC     CN  NC  VSLP_2016_ASC  T  T  T  T
Result  92.67  x   x   x              x  x  x  x

Downstream Task Dataset:

Per-task evaluation results

SC: Sentiment Classification (Phân loại sắc thái bình luận, i.e., classifying the sentiment of comments)

<img width="1024px" src="https://i.postimg.cc/02fLqVDP/sc-cosine-schedule-with-warmup.png">

SC fine-tuning metrics by global step (values rounded to four decimal places):

global_step  train_loss  mcc     tp   tn   fp   fn   auroc   auprc   eval_loss  acc
9            0.3402      0.7380  576  827  115  91   0.9352  0.8837  0.3078     0.8720
10           0.3010      0.7154  644  714  228  23   0.9342  0.8856  0.4522     0.8440
18           0.2708      0.7569  553  867  75   114  0.9547  0.9305  0.2917     0.8825
20           0.3240      0.7954  621  823  119  46   0.9586  0.9351  0.2704     0.8975
27           0.2098      0.8167  619  844  98   48   0.9642  0.9433  0.2319     0.9093
30           0.1791      0.8256  608  864  78   59   0.9660  0.9452  0.2376     0.9149
36           0.1826      0.8376  617  864  78   50   0.9686  0.9500  0.2207     0.9204
40           0.1001      0.8315  616  860  82   51   0.9677  0.9483  0.2241     0.9173
45           0.1039      0.8476  627  861  81   40   0.9688  0.9502  0.2369     0.9248
50           0.0746      0.8403  626  856  86   41   0.9693  0.9509  0.2346     0.9211
54           0.2125      0.8452  627  859  83   40   0.9686  0.9488  0.2354     0.9236
60           0.1118      0.8502  621  870  72   46   0.9680  0.9471  0.2481     0.9267
63           0.0778      0.8399  616  867  75   51   0.9673  0.9453  0.2551     0.9217
70           0.0747      0.8414  618  866  76   49   0.9676  0.9461  0.2638     0.9223
72           0.0181      0.8423  616  869  73   51   0.9677  0.9464  0.2658     0.9229
80           0.0439      0.8381  611  871  71   56   0.9676  0.9461  0.2679     0.9211
81           0.0460      0.8395  613  870  72   54   0.9675  0.9460  0.2676     0.9217
90           0.0371      0.8398  615  868  74   52   0.9674  0.9456  0.2672     0.9217
90           0.0517      0.8398  615  868  74   52   0.9674  0.9456  0.2672     0.9217
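A sentiment-classification fine-tune of this checkpoint might be set up roughly as in the sketch below; the CSV files, column names, label count, and hyperparameters are assumptions, not the exact setup behind the table above. The learning-rate curve in the figure above is a cosine schedule with warmup, set here via lr_scheduler_type.

from datasets import load_dataset
from transformers import (AutoTokenizer, Trainer, TrainingArguments,
                          XLMRobertaForSequenceClassification)

tokenizer = AutoTokenizer.from_pretrained('anhdungitvn/vi-xlm-roberta-base')
model = XLMRobertaForSequenceClassification.from_pretrained(
    'anhdungitvn/vi-xlm-roberta-base', num_labels=2)  # label count is an assumption

# Hypothetical CSV files with "text" and "label" columns.
data = load_dataset('csv', data_files={'train': 'sc_train.csv', 'validation': 'sc_dev.csv'})
data = data.map(lambda batch: tokenizer(batch['text'], truncation=True, max_length=256),
                batched=True)

args = TrainingArguments(
    output_dir='vi-xlm-roberta-base-sc',
    per_device_train_batch_size=32,
    learning_rate=2e-5,
    num_train_epochs=3,
    lr_scheduler_type='cosine',   # matches the cosine-with-warmup curve shown above
    warmup_ratio=0.1,             # warmup fraction is an assumption
    evaluation_strategy='steps',
    eval_steps=10,
)
trainer = Trainer(model=model, args=args,
                  train_dataset=data['train'], eval_dataset=data['validation'])
trainer.train()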

Logs

<a href="https://wandb.ai/anhdungitvn/test_cls/reports/-tp-tn-tp-tn-fp-fn-22-10-21-09-50-27---VmlldzoyODI4MDcz?accessToken=axp3jb8pw01a7susb3n6pu0i5mfbay66v21oeae1dkx93km41rcuhvrkkqfqpar5">Perplexity</a>

BibTeX entry and citation info

@article{2022,
  title={x},
  author={x},
  journal={ArXiv},
  year={2022},
  volume={x}
}

<a href="https://huggingface.co/exbert/?model=anhdungitvn/vi-xlm-roberta-base"> <img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png"> </a>