Lexical diversity

Consider a speaker who uses the term allow multiple times throughout a speech, compared to another speaker who uses the terms allow, concur, acquiesce, accede, and avow for the same idea. The latter speech has more lexical diversity than the former. Lexical diversity is widely regarded as an important parameter for rating a document in terms of textual richness and effectiveness.

Lexical diversity, in simple terms, is a measurement of the breadth and variety of vocabulary used in a document. The different measures of lexical diversity are TTR, MSTTR, MATTR, C, R, CTTR, U, S, K, Maas, HD-D, MTLD, and MTLD-MA.
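To make the two-speaker comparison concrete, here is a minimal base R sketch that counts tokens and types in two short made-up speeches and compares their type-token ratios. The helper names simple_tokens and type_token_ratio and the sample sentences are illustrative assumptions, not part of koRpus, and the tokenizer is deliberately naive.

```r
# Hypothetical helper: lowercase the text and split it into word tokens.
simple_tokens <- function(text) {
  words <- unlist(strsplit(tolower(text), "[^a-z']+"))
  words[nzchar(words)]  # drop empty strings left over from the split
}

# Type-token ratio: distinct words (types) divided by all words (tokens).
type_token_ratio <- function(text) {
  tokens <- simple_tokens(text)
  length(unique(tokens)) / length(tokens)
}

# Made-up speeches: the first repeats "allow", the second varies the verb.
speech_a <- "I allow this. I allow that. I allow it all."
speech_b <- "I allow this. I concur with that. I accede to it all."

ttr_a <- type_token_ratio(speech_a)
ttr_b <- type_token_ratio(speech_b)

# The speech with the more varied vocabulary gets the higher ratio.
print(c(speech_a = ttr_a, speech_b = ttr_b))
```

A ratio this simple is sensitive to text length, which is one reason the longer list of measures above exists.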

The koRpus package in R provides functions to estimate lexical diversity or complexity.

If N is the total number of tokens and V is the number of types:

| Measure | Description | Wrapper function (koRpus package in R) |
|---|---|---|
| TTR | Type-token ratio | TTR |
| MSTTR | Mean segmental type-token ratio | MSTTR |
| C | LogTTR | C.ld |
| R | Root TTR | R.ld |
| CTTR | Corrected TTR | CTTR |
| U | Uber index | U.ld |
| S | Summer's index | S.ld |
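MSTTR is the one measure in this list that is not a simple function of the overall counts, so a small sketch may help. The base R function below is a hypothetical illustration, not the koRpus implementation: it cuts the token vector into segments of equal size, computes the type-token ratio of each full segment, drops any leftover tokens at the end, and averages the results. The function name msttr_sketch and the toy token vector are assumptions.

```r
# Hypothetical sketch of a mean segmental type-token ratio.
# tokens:  character vector of word tokens
# segment: number of tokens per segment (incomplete final segment is dropped)
msttr_sketch <- function(tokens, segment = 100) {
  n_segments <- length(tokens) %/% segment
  if (n_segments == 0) {
    stop("text is shorter than one segment")
  }
  ttr_per_segment <- vapply(seq_len(n_segments), function(i) {
    chunk <- tokens[((i - 1) * segment + 1):(i * segment)]
    length(unique(chunk)) / segment
  }, numeric(1))
  mean(ttr_per_segment)
}

# Toy input: two full segments of four tokens, plus one leftover token.
toy_tokens <- c("a", "b", "a", "b",
                "a", "b", "c", "d",
                "e")
msttr_toy <- msttr_sketch(toy_tokens, segment = 4)
print(msttr_toy)
```

Averaging over fixed-size segments makes the value less dependent on total text length than plain TTR.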

Analyse lexical diversity

The lex.div function computes all of the lexical diversity measures described previously. If you are only interested in one of the measures, you can use the corresponding wrapper function listed in the table instead of lex.div:

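library(koRpus)
# tagged.text is assumed to be a tagged text object, e.g. from tokenize() or treetag()
lex.div(tagged.text)
# wrapper for a single measure; char=TRUE also calculates data for characteristic curves
ttr.res <- TTR(tagged.text, char=TRUE)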

Calculate lexical diversity

This function is a reduced version of lex.div: as arguments it requires only the number of tokens and the number of types, from which it calculates lexical diversity. Measures such as TTR, C, R, CTTR, U, S, and Maas can be estimated with this function:

lex.div.num(N, V)
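To show why two counts are enough, the following base R sketch computes several of these measures directly from N and V using their commonly cited formulas. It is an illustration under stated assumptions, not the koRpus code: the function name lexdiv_from_counts is hypothetical, base-10 logarithms are assumed, and Maas is left out.

```r
# Hypothetical helper, not part of koRpus: ratio-based measures computed
# from the token count N and the type count V, with base-10 logarithms.
lexdiv_from_counts <- function(N, V) {
  # N > 10 keeps log10(log10(N)) away from zero in the S formula.
  stopifnot(N > 10, V > 1, V <= N)
  c(
    TTR  = V / N,                           # type-token ratio
    C    = log10(V) / log10(N),             # LogTTR
    R    = V / sqrt(N),                     # root TTR
    CTTR = V / sqrt(2 * N),                 # corrected TTR
    # U is undefined when every token is unique (V == N).
    U    = if (V < N) log10(N)^2 / (log10(N) - log10(V)) else Inf,
    S    = log10(log10(V)) / log10(log10(N))
  )
}

# Example: 100 tokens, 50 of them distinct.
counts_res <- lexdiv_from_counts(N = 100, V = 50)
print(round(counts_res, 3))
```

Because every formula here depends only on N and V, none of them needs the tagged text itself, which is what makes a counts-only function possible.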

Readability

Readability indices provide quantitative measures for analyzing the complexity and quality of a text document.

Automated readability index

The function does not count syllables. When the parameter is set to "NRI", the Navy Readability Index is calculated; when it is set to "simple", the simplified formula is calculated.
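As a rough illustration of why no syllable count is needed, the commonly cited standard ARI formula uses only characters, words, and sentences: 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43. The base R sketch below applies it with a naive tokenizer. The function names ari_from_counts and ari_sketch and the sample text are assumptions, and the result may differ from what koRpus reports for the same text, since koRpus works on a tagged text object.

```r
# Standard ARI formula applied to raw counts (hypothetical helper).
ari_from_counts <- function(n_chars, n_words, n_sentences) {
  4.71 * (n_chars / n_words) + 0.5 * (n_words / n_sentences) - 21.43
}

# Naive text wrapper: sentences end at . ! or ?, words are runs of
# letters, digits, or apostrophes. Real tokenizers are more careful.
ari_sketch <- function(text) {
  sentences <- unlist(strsplit(text, "[.!?]+"))
  sentences <- sentences[nzchar(trimws(sentences))]
  words <- unlist(strsplit(text, "[^A-Za-z0-9']+"))
  words <- words[nzchar(words)]
  ari_from_counts(sum(nchar(words)), length(words), length(sentences))
}

# Made-up sample: 7 words, 22 characters, 2 sentences.
ari_sample <- ari_sketch("The cat sat. The dog ran fast.")
print(ari_sample)
```

Very short, simple texts like this sample can produce negative scores; the index is meant for longer passages.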

Apart from ARI, the koRpus package provides functions for estimating other readability indices, such as bormuth, Degrees of Reading Power (DRP), Easy Listening Formula (ELF), dickes.steiwer, danielson.bryan, and dale.chall.
