spacy dacy danish token-classification pos tagging morphological analysis lemmatization dependency parsing named entity recognition coreference resolution named entity linking named entity disambiguation

<a href="https://github.com/centre-for-humanities-computing/Dacy"><img src="https://centre-for-humanities-computing.github.io/DaCy/_static/icon.png" width="175" height="175" align="right" /></a>

DaCy small

DaCy is a Danish language processing framework that provides state-of-the-art pipelines as well as functionality for analysing Danish pipelines. DaCy's largest pipeline has achieved state-of-the-art performance on part-of-speech tagging and dependency parsing for Danish on the Danish Dependency Treebank, as well as competitive performance on named entity recognition, named entity disambiguation and coreference resolution. To read more, check out the DaCy repository for material on how to use DaCy and reproduce the results. DaCy also contains guides on using the package as well as behavioural tests for biases and robustness of Danish NLP pipelines.
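
As a rough illustration of what using this pipeline can look like, the sketch below loads it through spaCy and prints the token-level annotations and named entities. It assumes the `da_dacy_small_trf` package (and its dependencies) is already installed; the helper names `token_rows` and `entity_rows` and the example sentence are made up for this illustration and are not part of DaCy.

```python
PIPELINE_NAME = "da_dacy_small_trf"  # package name taken from the model card


def token_rows(tokens):
    """Collect (text, POS, lemma, dependency label) for each token."""
    return [(t.text, t.pos_, t.lemma_, t.dep_) for t in tokens]


def entity_rows(entities):
    """Collect (text, label) for each named entity span."""
    return [(e.text, e.label_) for e in entities]


def main():
    try:
        import spacy  # imported lazily so the helpers work without spaCy

        nlp = spacy.load(PIPELINE_NAME)
    except (ImportError, OSError) as err:
        # spaCy raises OSError when the pipeline package cannot be found.
        print(f"Could not load {PIPELINE_NAME}: {err}")
        return

    doc = nlp("Kenneth Enevoldsen arbejder i Aarhus.")  # illustrative sentence
    for row in token_rows(doc):
        print(row)
    print(entity_rows(doc.ents))


if __name__ == "__main__":
    main()
```

The helpers only read standard spaCy attributes (`text`, `pos_`, `lemma_`, `dep_`, `label_`), so they should work with any spaCy pipeline that sets them.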

| Feature | Description |
| --- | --- |
| Name | `da_dacy_small_trf` |
| Version | `0.2.0` |
| spaCy | `>=3.5.2,<3.6.0` |
| Default Pipeline | `transformer`, `tagger`, `morphologizer`, `trainable_lemmatizer`, `parser`, `ner`, `coref`, `span_resolver`, `span_cleaner`, `entity_linker` |
| Components | `transformer`, `tagger`, `morphologizer`, `trainable_lemmatizer`, `parser`, `ner`, `coref`, `span_resolver`, `span_cleaner`, `entity_linker` |
| Vectors | 0 keys, 0 unique vectors (0 dimensions) |
| Sources | UD Danish DDT v2.11 (Johannsen, Anders; Martínez Alonso, Héctor; Plank, Barbara)<br />DaNE (Rasmus Hvingelby, Amalie B. Pauli, Maria Barrett, Christina Rosted, Lasse M. Lidegaard, Anders Søgaard)<br />DaCoref (Buch-Kromann, Matthias)<br />DaNED (Barrett, M. J., Lam, H., Wu, M., Lacroix, O., Plank, B., & Søgaard, A.)<br />jonfd/electra-small-nordic (Jón Friðrik Daðason) |
| License | Apache-2.0 |
| Author | Kenneth Enevoldsen |
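
The spaCy entry above pins the compatible version range to `>=3.5.2,<3.6.0`. The following is a minimal sketch of how such a range can be checked against an installed version string. The helpers `parse_version` and `satisfies` are hypothetical names written for this example, and they only handle plain numeric versions such as `3.5.4`; a dedicated library such as `packaging` is likely the better choice for full version specifiers.

```python
import operator

# Comparison operators supported by this sketch; two-character symbols are
# listed first so that ">=" is not mistaken for ">".
OPERATORS = [
    (">=", operator.ge),
    ("<=", operator.le),
    ("==", operator.eq),
    (">", operator.gt),
    ("<", operator.lt),
]


def parse_version(text):
    """Turn '3.5.2' into (3, 5, 2) so versions compare as tuples."""
    return tuple(int(part) for part in text.strip().split("."))


def satisfies(version, spec):
    """Return True if `version` meets every comma-separated clause in `spec`."""
    installed = parse_version(version)
    for clause in spec.split(","):
        clause = clause.strip()
        for symbol, compare in OPERATORS:
            if clause.startswith(symbol):
                bound = parse_version(clause[len(symbol):])
                if not compare(installed, bound):
                    return False
                break
        else:
            raise ValueError(f"Unsupported clause: {clause!r}")
    return True


SPACY_RANGE = ">=3.5.2,<3.6.0"  # copied from the model card
for candidate in ("3.5.1", "3.5.4", "3.6.0"):
    print(candidate, satisfies(candidate, SPACY_RANGE))
```

In practice the string to check could come from `spacy.__version__`, which is left out here so that the sketch runs without spaCy installed.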

Label Scheme

<details>

<summary>View label scheme (211 labels for 4 components)</summary>

Component Labels
tagger ADJ, ADP, ADV, AUX, CCONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, PUNCT, SCONJ, SYM, VERB, X
morphologizer AdpType=Prep|POS=ADP, Definite=Ind|Gender=Com|Number=Sing|POS=NOUN, Mood=Ind|POS=AUX|Tense=Pres|VerbForm=Fin|Voice=Act, POS=PROPN, Definite=Ind|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part, Definite=Def|Gender=Neut|Number=Sing|POS=NOUN, POS=SCONJ, Definite=Def|Gender=Com|Number=Sing|POS=NOUN, Mood=Ind|POS=VERB|Tense=Pres|VerbForm=Fin|Voice=Act, POS=ADV, Number=Plur|POS=DET|PronType=Dem, Degree=Pos|Number=Plur|POS=ADJ, Definite=Ind|Gender=Com|Number=Plur|POS=NOUN, POS=PUNCT, NumType=Ord|POS=ADJ, POS=CCONJ, Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN, POS=VERB|VerbForm=Inf|Voice=Act, Case=Acc|Gender=Neut|Number=Sing|POS=PRON|Person=3|PronType=Prs, Degree=Sup|POS=ADV, Degree=Pos|POS=ADV, Gender=Com|Number=Sing|POS=DET|PronType=Ind, Number=Plur|POS=DET|PronType=Ind, POS=ADP, POS=ADV|PartType=Inf, Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs, Mood=Ind|POS=AUX|Tense=Past|VerbForm=Fin|Voice=Act, Definite=Def|Degree=Pos|Number=Sing|POS=ADJ, Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs, Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Act, POS=ADP|PartType=Inf, Definite=Ind|Degree=Pos|Gender=Com|Number=Sing|POS=ADJ, NumType=Card|POS=NUM, Degree=Pos|POS=ADJ, Definite=Ind|Number=Sing|POS=AUX|Tense=Past|VerbForm=Part, POS=PART|PartType=Inf, Case=Acc|POS=PRON|Person=3|PronType=Prs|Reflex=Yes, Definite=Def|Gender=Com|Number=Plur|POS=NOUN, Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN, Number[psor]=Plur|POS=DET|Person=3|Poss=Yes|PronType=Prs, POS=VERB|Tense=Pres|VerbForm=Part, Case=Nom|Number=Plur|POS=PRON|Person=3|PronType=Prs, Case=Gen|Definite=Def|Gender=Com|Number=Sing|POS=NOUN, Definite=Def|Degree=Sup|Number=Plur|POS=ADJ, Case=Acc|Number=Plur|POS=PRON|Person=3|PronType=Prs, POS=AUX|VerbForm=Inf|Voice=Act, Definite=Ind|Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ, Definite=Ind|Degree=Cmp|Number=Sing|POS=ADJ, Degree=Cmp|POS=ADJ, POS=PRON|PartType=Inf, Definite=Ind|Degree=Pos|Number=Sing|POS=ADJ, 
Case=Nom|Gender=Com|POS=PRON|PronType=Ind, Number=Plur|POS=PRON|PronType=Ind, POS=INTJ, Gender=Com|Number=Sing|POS=DET|PronType=Dem, Case=Gen|Number=Plur|POS=DET|PronType=Ind, Mood=Ind|POS=VERB|Tense=Pres|VerbForm=Fin|Voice=Pass, Definite=Def|Gender=Neut|Number=Plur|POS=NOUN, Degree=Cmp|POS=ADV, Number=Plur|Number[psor]=Plur|POS=PRON|Person=1|Poss=Yes|PronType=Prs|Style=Form, Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=3|PronType=Prs, Number=Plur|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes, Case=Gen|POS=PROPN, Gender=Neut|Number=Sing|POS=PRON|PronType=Ind, Number=Plur|POS=VERB|Tense=Past|VerbForm=Part, Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes, Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs, Definite=Def|Degree=Sup|POS=ADJ, Gender=Neut|Number=Sing|POS=DET|PronType=Ind, Case=Gen|Definite=Ind|Gender=Neut|Number=Sing|POS=NOUN, Gender=Neut|Number=Sing|POS=DET|PronType=Dem, Definite=Def|Number=Sing|POS=VERB|Tense=Past|VerbForm=Part, POS=PRON|PronType=Dem, Degree=Pos|Gender=Com|Number=Sing|POS=ADJ, Number=Plur|POS=NUM, POS=VERB|VerbForm=Inf|Voice=Pass, Definite=Def|Degree=Sup|Number=Sing|POS=ADJ, Number=Sing|POS=PRON|PronType=Int,Rel, Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=1|PronType=Prs, Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs, Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs, POS=PRON, Definite=Ind|Number=Sing|POS=NOUN, Definite=Ind|Number=Sing|POS=NUM, Case=Gen|Definite=Ind|Gender=Com|Number=Sing|POS=NOUN, Foreign=Yes|POS=ADV, POS=NOUN, Case=Gen|Definite=Def|Gender=Neut|Number=Sing|POS=NOUN, Gender=Com|Number=Plur|POS=NOUN, Gender=Neut|Number=Sing|POS=PRON|PronType=Int,Rel, Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs, Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs, Gender=Com|Number=Sing|POS=PRON|PronType=Ind, Case=Gen|Definite=Ind|Gender=Com|Number=Plur|POS=NOUN, 
Degree=Pos|Gender=Neut|Number=Sing|POS=ADJ, Degree=Sup|POS=ADJ, Degree=Pos|Number=Sing|POS=ADJ, Mood=Imp|POS=VERB, Case=Nom|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs, Case=Acc|Gender=Com|POS=PRON|Person=2|Polite=Form|PronType=Prs, POS=X, Case=Gen|Definite=Def|Gender=Com|Number=Plur|POS=NOUN, Number=Plur|POS=PRON|PronType=Dem, Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=1|PronType=Prs, Number=Plur|POS=PRON|PronType=Int,Rel, Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes, Degree=Cmp|Number=Plur|POS=ADJ, Number=Plur|Number[psor]=Sing|POS=DET|Person=1|Poss=Yes|PronType=Prs, Gender=Com|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form, Case=Nom|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs, Case=Acc|Gender=Com|Number=Sing|POS=PRON|Person=2|PronType=Prs, Gender=Com|POS=PRON|PronType=Int,Rel, Case=Gen|Degree=Pos|Number=Plur|POS=ADJ, Gender=Neut|Number=Sing|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes, POS=VERB|VerbForm=Ger, Gender=Com|Number=Sing|POS=PRON|PronType=Dem, Case=Gen|POS=PRON|PronType=Int,Rel, Mood=Ind|POS=VERB|Tense=Past|VerbForm=Fin|Voice=Pass, Abbr=Yes|POS=X, Case=Gen|Definite=Ind|Gender=Neut|Number=Plur|POS=NOUN, Gender=Com|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs, Definite=Ind|Number=Plur|POS=NOUN, Foreign=Yes|POS=X, Number=Plur|POS=PRON|PronType=Rcp, Case=Nom|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs, Case=Gen|Degree=Cmp|POS=ADJ, Case=Gen|Definite=Def|Gender=Neut|Number=Plur|POS=NOUN, Case=Acc|Gender=Com|Number=Plur|POS=PRON|Person=2|PronType=Prs, Gender=Neut|Number=Sing|POS=PRON|PronType=Dem, Number=Plur|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form, Gender=Neut|Number=Sing|Number[psor]=Plur|POS=DET|Person=1|Poss=Yes|PronType=Prs|Style=Form, Number=Plur|Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs|Reflex=Yes, 
Number[psor]=Sing|POS=PRON|Person=3|Poss=Yes|PronType=Prs, Case=Gen|Number=Plur|POS=PRON|PronType=Rcp, POS=DET|Person=2|Polite=Form|Poss=Yes|PronType=Prs, POS=SYM, POS=DET|PronType=Dem, Gender=Com|Number=Sing|POS=NUM, Number[psor]=Plur|POS=DET|Person=2|Poss=Yes|PronType=Prs, Case=Gen|Number=Plur|POS=VERB|Tense=Past|VerbForm=Part, Definite=Def|Degree=Abs|POS=ADJ, POS=VERB|Tense=Pres, Definite=Ind|Gender=Neut|Number=Sing|POS=NUM, Degree=Abs|POS=ADV, Case=Gen|Definite=Def|Degree=Pos|Number=Sing|POS=ADJ, Gender=Com|Number=Sing|POS=PRON|PronType=Int,Rel, POS=VERB|Tense=Past|VerbForm=Part, Definite=Ind|Degree=Sup|Number=Sing|POS=ADJ, Gender=Neut|Number=Sing|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs, Gender=Com|Number=Sing|Number[psor]=Sing|POS=PRON|Person=1|Poss=Yes|PronType=Prs, Number=Plur|Number[psor]=Sing|POS=DET|Person=2|Poss=Yes|PronType=Prs, Number[psor]=Plur|POS=PRON|Person=3|Poss=Yes|PronType=Prs, Definite=Ind|POS=NOUN, Case=Gen|Gender=Com|Number=Sing|POS=DET|PronType=Ind, Definite=Ind|Gender=Com|Number=Sing|POS=NUM, Definite=Def|Number=Plur|POS=NOUN, Case=Gen|POS=NOUN, POS=AUX|Tense=Pres|VerbForm=Part
parser ROOT, acl:relcl, advcl, advmod, advmod:lmod, amod, appos, aux, case, cc, ccomp, compound:prt, conj, cop, dep, det, expl, fixed, flat, iobj, list, mark, nmod, nmod:poss, nsubj, nummod, obj, obl, obl:lmod, obl:tmod, punct, xcomp
ner LOC, MISC, ORG, PER

</details>
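
The morphologizer labels in the label scheme are `|`-separated `Name=Value` pairs in the Universal Dependencies feature notation, with the part of speech included as a `POS` feature. The sketch below shows one way to split such a label into a dictionary; `parse_morph_label` is a hypothetical helper written for this example, not a DaCy or spaCy function.

```python
def parse_morph_label(label):
    """Split a label such as 'Number=Sing|POS=NOUN' into a feature dict."""
    features = {}
    for pair in label.split("|"):
        # partition keeps everything after the first "=" as the value
        name, _, value = pair.partition("=")
        features[name] = value
    return features


# Labels copied from the morphologizer row of the label scheme.
examples = [
    "Definite=Ind|Gender=Com|Number=Sing|POS=NOUN",
    "Number=Sing|POS=PRON|PronType=Int,Rel",
    "Number[psor]=Sing|POS=DET|Person=3|Poss=Yes|PronType=Prs",
]

for label in examples:
    print(parse_morph_label(label))
```

Some values, such as `PronType=Int,Rel`, contain a comma themselves, so a list of labels should not be split on bare commas.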

Accuracy

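| Type | Score |
| --- | --- |
| `TOKEN_ACC` | 99.92 |
| `TOKEN_P` | 99.70 |
| `TOKEN_R` | 99.77 |
| `TOKEN_F` | 99.74 |
| `SENTS_P` | 92.96 |
| `SENTS_R` | 95.75 |
| `SENTS_F` | 94.33 |
| `TAG_ACC` | 98.47 |
| `POS_ACC` | 98.42 |
| `MORPH_ACC` | 97.73 |
| `MORPH_MICRO_P` | 98.94 |
| `MORPH_MICRO_R` | 98.33 |
| `MORPH_MICRO_F` | 98.64 |
| `DEP_UAS` | 89.79 |
| `DEP_LAS` | 87.02 |
| `ENTS_P` | 83.06 |
| `ENTS_R` | 81.72 |
| `ENTS_F` | 82.38 |
| `LEMMA_ACC` | 94.67 |
| `COREF_LEA_F1` | 42.18 |
| `COREF_LEA_PRECISION` | 44.79 |
| `COREF_LEA_RECALL` | 39.86 |
| `NEL_SCORE` | 35.20 |
| `NEL_MICRO_P` | 84.62 |
| `NEL_MICRO_R` | 22.22 |
| `NEL_MICRO_F` | 35.20 |
| `NEL_MACRO_P` | 87.68 |
| `NEL_MACRO_R` | 24.76 |
| `NEL_MACRO_F` | 37.52 |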
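
The `_F` rows in the accuracy table can be related to their `_P` and `_R` rows through the usual F1 definition, the harmonic mean of precision and recall. The sketch below recomputes a few of them from the rounded values in the table, so differences of about 0.01 are expected. The macro-averaged rows are left out on the assumption that a macro F-score is averaged per class rather than derived from the macro precision and recall.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall, on the same scale as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F) copied from the accuracy table.
reported = {
    "SENTS": (92.96, 95.75, 94.33),
    "ENTS": (83.06, 81.72, 82.38),
    "COREF_LEA": (44.79, 39.86, 42.18),
    "NEL_MICRO": (84.62, 22.22, 35.20),
}

for name, (precision, recall, f_reported) in reported.items():
    recomputed = f_score(precision, recall)
    print(f"{name}: recomputed F = {recomputed:.2f}, reported F = {f_reported:.2f}")
```

The entity linking row shows how the harmonic mean is pulled towards the weaker of the two values: a precision of 84.62 combined with a recall of 22.22 yields an F-score of only 35.20.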

Training

This model was trained using spaCy and logged to Weights & Biases. You can find all the training logs here.