Text statistics including readability and formality.
| Feature | Description |
|---|---|
| Name | en_statistics |
| Version | 0.0.1 |
| spaCy | >=3.1.1,<3.2.0 |
| Default Pipeline | tok2vec, tagger, parser, attribute_ruler, lemmatizer, syllables, formality, readability |
| Components | tok2vec, tagger, parser, senter, attribute_ruler, lemmatizer, syllables, formality, readability |
| Vectors | 684830 keys, 20000 unique vectors (300 dimensions) |
| Sources | OntoNotes 5 (Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, Mohammed El-Bachouti, Robert Belvin, Ann Houston)<br />ClearNLP Constituent-to-Dependency Conversion (Emory University)<br />WordNet 3.0 (Princeton University)<br />GloVe Common Crawl (Jeffrey Pennington, Richard Socher, and Christopher D. Manning) |
| License | MIT |
| Author | Chris Knowles |
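
The spaCy row above pins the package to `>=3.1.1,<3.2.0`. As a rough illustration of what that range allows, here is a minimal sketch of the check in plain Python. The helper names `parse_version` and `is_compatible` are hypothetical and are not part of spaCy or of this package. The sketch assumes plain numeric release strings such as `3.1.4`; it does not handle pre-release or development suffixes.

```python
def parse_version(text):
    """Turn a plain release string such as '3.1.4' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_compatible(spacy_version, lower="3.1.1", upper="3.2.0"):
    """Return True if lower <= spacy_version < upper (bounds taken from the table)."""
    return parse_version(lower) <= parse_version(spacy_version) < parse_version(upper)


if __name__ == "__main__":
    # The lower bound is inclusive, the upper bound is exclusive.
    for candidate in ("3.1.0", "3.1.1", "3.1.4", "3.2.0"):
        print(candidate, is_compatible(candidate))
```

In an environment where spaCy is installed, the same function could be called with `spacy.__version__` to check the installed release, provided it is a plain numeric version.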
Label Scheme
<details>
<summary>View label scheme (96 labels for 3 components)</summary>
| Component | Labels |
|---|---|
| tagger | `$`, `''`, `,`, `-LRB-`, `-RRB-`, `.`, `:`, `ADD`, `AFX`, `CC`, `CD`, `DT`, `EX`, `FW`, `HYPH`, `IN`, `JJ`, `JJR`, `JJS`, `LS`, `MD`, `NFP`, `NN`, `NNP`, `NNPS`, `NNS`, `PDT`, `POS`, `PRP`, `PRP$`, `RB`, `RBR`, `RBS`, `RP`, `SYM`, `TO`, `UH`, `VB`, `VBD`, `VBG`, `VBN`, `VBP`, `VBZ`, `WDT`, `WP`, `WP$`, `WRB`, `XX`, ``` `` ``` |
| parser | `ROOT`, `acl`, `acomp`, `advcl`, `advmod`, `agent`, `amod`, `appos`, `attr`, `aux`, `auxpass`, `case`, `cc`, `ccomp`, `compound`, `conj`, `csubj`, `csubjpass`, `dative`, `dep`, `det`, `dobj`, `expl`, `intj`, `mark`, `meta`, `neg`, `nmod`, `npadvmod`, `nsubj`, `nsubjpass`, `nummod`, `oprd`, `parataxis`, `pcomp`, `pobj`, `poss`, `preconj`, `predet`, `prep`, `prt`, `punct`, `quantmod`, `relcl`, `xcomp` |
| senter | `I`, `S` |
</details>