Apparently, the 100 most common words in the English language account for about 50% of what we say, hear, and read. This really highlights how narrow our everyday vocabulary is, despite there being thousands of words in the English dictionary.
This pattern in language is known as Zipf’s Law. In short, the law describes the relationship between a word’s rank in the frequency table and how often that word appears in any given language.
More specifically, a word’s frequency is inversely proportional to its rank: the word of rank r appears roughly 1/r as often as the most common word. In the English language, “the” is the most common word. The next most common word is “of”, followed by “and”, and so on. On a frequency table, “of” shows up about half as often as “the”, while “and” shows up about one-third as often. So, given a sample of 100 words, say 30 of them are the word “the”.
Based on that ratio, “of” should appear about 15 times and “and” about 10 times within the sample. Though this toy example isn’t perfectly realistic, many novels follow this distribution closely, which raises the question: why does this happen? Answering it means turning to linguistics and to how humans acquire and adapt to language.
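To make the arithmetic concrete, here is a minimal Python sketch of the rank-frequency rule. The first function predicts counts from the rank-1 count alone; the second half tallies an actual text and prints rank × frequency, which Zipf’s Law predicts should stay roughly constant. The filename novel.txt is a placeholder, and splitting on whitespace is a deliberately crude tokenizer.

```python
from collections import Counter

def zipf_expected_counts(top_count, num_ranks):
    """Under Zipf's law, the word of rank r appears about
    1/r as often as the most common (rank-1) word."""
    return [round(top_count / rank) for rank in range(1, num_ranks + 1)]

# The toy sample from above: "the" appears 30 times in 100 words.
print(zipf_expected_counts(top_count=30, num_ranks=3))  # -> [30, 15, 10]

# Checking a real text: if the law holds, rank * frequency
# should hover near the same value for the top-ranked words.
words = open("novel.txt").read().lower().split()  # placeholder file
for rank, (word, freq) in enumerate(Counter(words).most_common(10), start=1):
    print(f"{rank:>2}  {word:<10}  freq={freq:<6}  rank*freq={rank * freq}")
```

Running the second half on any reasonably long novel tends to show the rank × frequency product staying within the same order of magnitude across the top words, which is the simplest empirical signature of the law.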
Part of what makes Zipf’s Law so important is that it is used in many fields of statistics, such as population trends, neural networks, and even predictions of wartime casualties. This gives us a relationship we don’t see often: linguistics and math are typically treated as polar opposites, yet Zipf’s Law connects the two. Though the law seems to reside in the branch of linguistics, it actually bridges two completely different ends of the spectrum.