On the comparison of diversity of parts of a distribution

The literature on diversity measures, regardless of the metric used (e.g., Gini-Simpson index, Shannon entropy), has a notable gap: little has been done to connect these measures back to the shape of the original distribution, or to use them to compare the diversity of parts of a distribution with the diversity of the whole. As such, the precise quantification of the relationship between the probability of each type $p_i$ and the diversity D in non-uniform distributions, both among parts of a distribution and for the whole, remains unresolved. This is particularly true for Hill numbers, despite their usefulness as 'effective numbers'. This gap is problematic because most real-world systems (e.g., income distributions, economic complexity indices, rankings, ecological systems) have unequal distributions, varying frequencies, and comprise multiple diversity types with unknown frequencies that can change. To address this issue, we connect case-based entropy, an approach to diversity we developed, to the shape of a probability distribution. This allows us to show that the original probability distribution $g_1$, the case-based entropy curve $g_2$, and the $c_{\{1,k\}}$ versus $c_{\{1,k\}}\ln A_{\{1,k\}}$ curve $g_3$, which we call the slope of diversity, are in one-to-one (injective) correspondence: a different probability distribution $g_1$ gives different curves $g_2$ and $g_3$. Hence, a different permutation of the original probability distribution $g_1$ (one that leads to a different shape) will uniquely determine the graphs $g_2$ and $g_3$. By proving the injective nature of our approach, we establish a unique way to measure the degree of uniformity of parts, as measured by $D_P/c_P$ for a given part P of the original probability distribution, and also show a unique way to compute $D_P/c_P$ for various shapes of the original distribution and, for comparison, for different curves.

Hill numbers condense the diversity of a distribution into a single numerical index. These indices allow for the comparison and ranking of diverse systems, such as ecological communities, species populations, or even mathematical databases.
Hill numbers are characterized by a parameter q that favors types with lower or higher frequencies, depending on whether 0 < q < 1 or q > 1, respectively. When q = 1, ${}^1D$ weights each type in proportion to its relative frequency, ultimately resulting in $e^H$, where H is the Shannon entropy of the distribution. A more recent book on an axiomatic approach to defining and characterizing diversity is (Leinster 2021).
The interpretation of Hill numbers and mathematical diversity depends on the specific context in which they are applied. In ecology, Hill numbers are often used to characterize biodiversity in ecological communities. They provide a way to summarize the distribution of species abundances and assess the relative importance of rare versus common species.
When analyzing species data using Hill numbers ${}^qD$, the values obtained can be interpreted as follows: 1. Hill number with q = 0: This represents species richness, which counts the number of unique species present in the community. A higher value indicates greater species richness.
2. Hill number with q = 1: This is known as the exponential of the Shannon entropy and reflects both species richness and evenness. It captures the distribution of abundances among species, with higher values indicating a more even distribution.
3. Hill number with q = 2: Also referred to as the inverse Simpson index, it emphasizes the dominance of abundant species. A lower value indicates greater dominance of a few dominant species, while a higher value suggests a more equitable distribution of abundances among species.
4. Hill number with q → ∞: In this limit the Hill number reduces to $1/\max_i p_i$, the reciprocal of the largest relative abundance (the Berger-Parker index expressed as an effective number). It is determined entirely by the single most dominant type; a higher value signifies that even the most abundant type is relatively rare, i.e., a more even community.
Interpreting Hill numbers in other contexts depends on the application and the specific definition of diversity being used. For example, in mathematical datasets, Hill numbers can be employed to assess the diversity of numerical values, patterns, or structures. In this case, higher Hill numbers indicate a greater variety and complexity in the dataset. Overall, Hill numbers provide a unified framework to measure and interpret diversity by incorporating multiple dimensions of richness, evenness, and dominance. They enable researchers to compare and quantify diversity across different systems, identify patterns of variation, and evaluate the impacts of disturbances or interventions on diversity.
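As a concrete illustration of the interpretations above, the Hill numbers ${}^qD = \left(\sum_i p_i^q\right)^{1/(1-q)}$ (with the limits q = 1 and q → ∞ handled separately) can be computed directly. The sketch below, in Python, is illustrative rather than part of the paper's method; the function name is ours:

```python
import math

def hill_number(p, q):
    """Hill number (effective number of types) of order q for probabilities p."""
    p = [x for x in p if x > 0]
    if q == 1:
        # q -> 1 limit: exponential of the Shannon entropy.
        return math.exp(-sum(x * math.log(x) for x in p))
    if math.isinf(q):
        # q -> infinity limit: reciprocal of the largest probability.
        return 1.0 / max(p)
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))

# A small community: one dominant type and three rarer ones.
p = [0.7, 0.1, 0.1, 0.1]
richness = hill_number(p, 0)       # q = 0: counts the types
shannon = hill_number(p, 1)        # q = 1: exp(Shannon entropy)
simpson = hill_number(p, 2)        # q = 2: inverse Simpson concentration
berger = hill_number(p, math.inf)  # q -> infinity: 1 / max(p)
```

For a uniform distribution all orders coincide with the number of types, which is the intuition behind 'effective numbers'; for the skewed example above the values decrease as q grows, reflecting the increasing weight placed on dominant types.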

The challenge
Despite the usefulness of Hill numbers as 'effective numbers,' the exact relationship between the probability of each type in a distribution and the Hill number itself remains unexplored. Furthermore, the original notion of diversity due to Hill and Jost is insensitive to permutations, i.e., if we rearrange the probabilities of the original distribution $g_1$, the diversity of the entire distribution remains unchanged.
These issues are particularly problematic since most real-world systems have unequal distributions, varying frequencies, and comprise multiple diversity types with unknown frequencies that can change. Such systems include income distributions, economic complexity indices, ecological systems, species diversity, and ranking systems, from genes and exposomic biological assays to measures of economic and health inequality. An excellent example is the Gini coefficient. Despite being one of the most widely used measures of economic inequality, it has several serious flaws. For our purposes, the most important is that it can assign the same coefficient to different income distributions, so that several countries can have different income distributions but the same Gini index. As this example illustrates, the Gini index and other measures of diversity struggle with the precise quantification of the relationship between the probability of each type $p_i$ and the diversity D in non-uniform distributions, both among parts of a distribution and for the whole. As a result, while highly important, this issue remains unresolved.

Purpose of current study
To address this gap, we will explicitly connect case-based entropy, an approach to diversity that we developed, to the shape of a probability distribution. We made initial progress on this gap in (Rajaram and Castellani 2020) by proving a result relating the probabilities $p_i$ in a distribution with K types (including J types whose frequencies can be changed) to the total diversity $D_K$. In the current paper, we will show that the original probability distribution $g_1$, the case-based entropy curve $g_2$ and the $c_{\{1,k\}}$ versus $c_{\{1,k\}}\ln A_{\{1,k\}}$ curve $g_3$ are one-to-one: a different permutation of the original probability distribution $g_1$ (one that leads to a different shape) will uniquely determine the graphs $g_2$ and $g_3$. By proving the injective nature of our approach, we will have established a unique way to measure the degree of uniformity of parts, as measured by $D_P/c_P$, and also have shown a unique way to compute $D_P/c_P$ for various shapes of the original distribution. As our case study, we will consider a general probability distribution of a random variable X as shown in table 1 (signifying different types or categories), where $x_i$ denotes the i-th type, with probability $p_i$ and frequency $f_i$. We note that the random variable X under study can be quantitative as well as qualitative. For our case study, we will ask the following question: can we establish a relationship (direct or indirect) between the probabilities $p_i$ and the case-based entropy curve $C_c$ as a function of the cumulative probability c? More specifically, what, if any, is the relationship between the shape of the case-based entropy curve ($C_c$ versus c) and the original probability distribution shown in table 1?

A formal introduction to diversity
Diversity is commonly used as a measure to assess the 'richness,' or number of types in a distribution, and its 'evenness,' or equality of the probabilities of occurrence among diversity types, as reported by several studies (MacArthur 1965, Hill 1973, Peet 1974, Jost 2006). This definition of diversity is based on the intuition that if all types in the distribution occur with the same probability, diversity should equal the number of types K. Conversely, any deviation from uniformity in probabilities will always result in a lower diversity value.
Definition 2.1. (Shannon diversity, corresponding to q = 1 for Hill numbers) Given an ordered set of types numbered by $i \in \mathbb{N}$ and their corresponding probabilities $p_i$, the diversity of the entire distribution ${}^1D_K$ is defined as the number of equiprobable types needed to yield the same value of the Shannon entropy H.
Shannon entropy is defined as below:
$$H = -\sum_{i=1}^{K} p_i \ln p_i. \tag{1}$$
It was shown (MacArthur 1965, Hill 1973, Peet 1974, Jost 2006, Rajaram and Castellani 2016) that definition 2.1 implies that the total diversity ${}^1D_K$ is given by:
$${}^1D_K = e^H = \exp\Big(-\sum_{i=1}^{K} p_i \ln p_i\Big). \tag{2}$$
Furthermore, we denote the diversity of the first i types (or partial diversity) by ${}^1D_{\{1,i\}}$, where $i = 1, \ldots, K$. Writing $c_{\{1,i\}} = \sum_{j=1}^{i} p_j$ for the cumulative probability, the partial diversity up to the first i types is given by:
$${}^1D_{\{1,i\}} = \exp\Big(-\sum_{j=1}^{i} \frac{p_j}{c_{\{1,i\}}} \ln \frac{p_j}{c_{\{1,i\}}}\Big). \tag{3}$$
We note that equations (2) and (3) can be rewritten in terms of the frequencies $f_i$, with $F_i = \sum_{j=1}^{i} f_j$, as below:
$${}^1D_{\{1,i\}} = \exp\Big(-\sum_{j=1}^{i} \frac{f_j}{F_i} \ln \frac{f_j}{F_i}\Big). \tag{4}$$
We will continue to use the modified equation (4) in our exposition.
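A minimal numerical sketch of the total and partial diversities, assuming (our reading of equation (3)) that partial diversities are computed with probabilities renormalized by the cumulative probability; function names are ours:

```python
import math

def diversity(p):
    """Total diversity D_K = exp(H), as in equation (2)."""
    return math.exp(-sum(x * math.log(x) for x in p if x > 0))

def partial_diversity(p, i):
    """Diversity of the first i types, with probabilities renormalized
    by the cumulative probability c_{1,i}, as in equation (3)."""
    c = sum(p[:i])
    return math.exp(-sum((x / c) * math.log(x / c) for x in p[:i] if x > 0))

p = [0.5, 0.25, 0.125, 0.125]
D_K = diversity(p)
partials = [partial_diversity(p, i) for i in range(1, len(p) + 1)]
# The last partial diversity equals D_K, since c_{1,K} = 1.
```

Note that for a uniform prefix the partial diversity is exactly the number of types included, matching the 'effective number' intuition.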
In this paper, we have two objectives: 1. We make a case for the ratio $D_P/c_P$, i.e., the diversity of a part relative to its cumulative probability, as a way to measure the degree of uniformity of the part P, and we show a way to compute this ratio for arbitrary parts from the graph of the slope of diversity curve ($c_{\{1,k\}}$ versus $c_{\{1,k\}}\ln A_{\{1,k\}}$). This will prove to be an important way to measure the extent of uniformity of parts of a distribution.
2. We prove some results that relate the case-based entropy curve, i.e., $c_{\{1,k\}}$ versus $C_{\{1,k\}}$, to the original probability distribution, again by using the graph of the slope of diversity curve ($c_{\{1,k\}}$ versus $c_{\{1,k\}}\ln A_{\{1,k\}}$). This will close the gap of relating the Hill numbers back to the shape of the original distribution.
The paper is organized as follows. In section 3 we lay the foundation for using the ratio $D_P/c_P$ as a means to compare the degree of uniformity of parts of a distribution. In section 4, we show a way to compute $D_P/c_P$ for parts of a given distribution using a new curve that plots $c_{\{1,k\}}$ versus $c_{\{1,k\}}\ln A_{\{1,k\}}$, which we call the slope of diversity. In section 5, we prove some results related to comparing the ratio $D_P/c_P$ for different parts of a distribution. In section 6 we relate the case-based entropy curve to the original probability distribution given in table 1. In section 7 we use the geometric distribution as an example to demonstrate some of our results. In section 8, we conclude the paper with some remarks on the results.

The ratio $D_P/c_P$ for parts P of a distribution

We recall the following two important 'parts-to-whole' formulae that were proved in (Rajaram and Castellani 2020).
Theorem 3.1. Given a probability distribution as in table 1, the diversity of the entire distribution ${}^qD_K$ for some complex system or dataset, the diversities of its disjoint parts ${}^qD_{P_i}$, and their respective cumulative probabilities $c_{P_i}$ are related as follows:
$${}^1D_K = \prod_i \left(\frac{{}^1D_{P_i}}{c_{P_i}}\right)^{c_{P_i}}, \tag{5}$$
$${}^qD_K = \left[\sum_i c_{P_i} \left(\frac{{}^qD_{P_i}}{c_{P_i}}\right)^{1-q}\right]^{\frac{1}{1-q}}, \quad q \ne 1. \tag{6}$$
We note that equations (5) and (6) are simply the weighted geometric and arithmetic means (of order 1 − q), respectively, of the ratio ${}^qD_{P_i}/c_{P_i}$, with weights $c_{P_i}$. The following corollary can be easily proved.
Corollary 3.1. Given a probability distribution as in table 1 and a part P with a disjoint partition $P = \bigcup_i P_i$, we have:
$$\left(\frac{{}^1D_P}{c_P}\right)^{c_P} = \prod_i \left(\frac{{}^1D_{P_i}}{c_{P_i}}\right)^{c_{P_i}}. \tag{7}$$
Proof. The proof is obtained by re-normalizing the probability of the type j in part $P_i$ as $\tilde p_j = p_j / c_{P_i}$ and using the formulas (5) and (6) in a recursive fashion.
Remark 3.1. If we consider each part $P_i$ in the derivation of the above theorem to consist of exactly one type, i.e., $P_i = \{i\}$ for $i = 1, \ldots, K$, then $c_{P_i} = p_i$ and ${}^1D_{P_i} = 1$, and equation (5) reduces to equation (2).
Remark 3.2. We can restrict ourselves to a portion of the distribution from l = 1 to, say, l = k. Then theorem 3.1 holds for this restriction and for any sub-partition of it; in this case, the probabilities must be re-normalized as $p_l / c_{\{1,k\}}$ for $l = 1, \ldots, k$. We can re-imagine the given probability distribution (or, as we will see, any of its parts) as an abstract uniform distribution having the same entropy (or conditional entropy of parts, if we are looking at parts). Hence, there is a close relationship between the diversity of a distribution (or of a part) and uniformity. In both theorem 3.1 and corollary 3.1 we notice the repeated occurrence of the ratio $D_{P_i}/c_{P_i}$, suggesting that this ratio should play an important part in comparing the degree of uniformity, or closeness to the uniform distribution, among parts of a given distribution.
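The q = 1 parts-to-whole identity of theorem 3.1 (equation (5)) can be checked numerically. The sketch below, with an illustrative distribution and partition of our own choosing, verifies that the total diversity is the weighted geometric mean of the part ratios $D_{P_i}/c_{P_i}$:

```python
import math

def part_diversity(part):
    """Diversity of a part, with probabilities renormalized by its mass c_P."""
    c = sum(part)
    return math.exp(-sum((x / c) * math.log(x / c) for x in part if x > 0))

p = [0.4, 0.2, 0.15, 0.15, 0.1]
parts = [p[:2], p[2:]]          # two disjoint parts that cover the distribution

D_K = part_diversity(p)         # for the whole distribution, c = 1
geo_mean = 1.0
for part in parts:
    c = sum(part)
    geo_mean *= (part_diversity(part) / c) ** c   # equation (5), term by term

# The total diversity is the weighted geometric mean of the part ratios.
assert abs(D_K - geo_mean) < 1e-12
```

The identity holds for any choice of disjoint parts covering the distribution, which is why the ratio $D_{P_i}/c_{P_i}$ keeps reappearing.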
From this point in the paper, we focus on q = 1 for the Hill number ${}^qD$, since for q = 1 the weight given to each type is proportional to the abundance of the type. Accordingly, we will omit the left superscript 1 while writing the diversity D.
3.1. An abstract visualization of the part P of a distribution
It is well known (MacArthur 1965, Hill 1973, Peet 1974, Jost 2006, Gaggiotti et al 2018, Jost 2019) that a non-uniform distribution with a diversity of $D_K$ can be abstractly redrawn as a uniform distribution with $D_K$ types, each having a probability of $\frac{1}{D_K}$. The abstract uniform distribution has the same Shannon entropy as the original distribution, and hence the same degree of probabilistic uncertainty. The number $D_K$ of types of the abstract equivalent distribution need not be an integer. We call these $D_K$ types Shannon Equivalent Equiprobable (SEE) types.
In a similar way, we consider a part P of a distribution with diversity $D_P$ and cumulative probability $c_P$, where, for example, $P = \{k_1, k_2\}$ is the part between indices $k_1$ and $k_2$. For this part P, we can associate an equivalent abstract uniform distribution with $D_P$ SEE types, each of which has a probability of $\frac{c_P}{D_P}$. This abstract equivalent uniform distribution has the same entropy as the conditional entropy of the original distribution given the part P; in other words, it has the same degree of uncertainty as the part P.
Hence, we have the following: given a distribution consisting of disjoint parts $P_i$, with diversities $D_{P_i}$ and cumulative probabilities $c_{P_i}$, so that $\bigcup_i P_i$ is the entire distribution, each part $P_i$ can be redrawn as an abstract uniform distribution with $D_{P_i}$ SEE types, each having a probability of $\frac{c_{P_i}}{D_{P_i}}$. The abstract equivalent has the same entropy as the conditional entropy of the original distribution given the part $P_i$, and its total cumulative probability is also equal to $c_{P_i}$. We will refer to this abstract equivalent uniform distribution henceforth as the SEE equivalent of the part $P_i$. More generally, as shown in figure 1, each of the disjoint parts $P_i$ of a given distribution can be equivalently replaced with an abstract uniform distribution having diversity $D_{P_i}$ and cumulative probability $c_{P_i}$. As we will see, this is a very important equivalence: it allows us to compare the uniformity of the abstract equivalent SEE types of the original parts instead of the original parts themselves. Comparing the former is much easier, because the abstract equivalent SEE types are uniformly distributed even though the original parts themselves may not be.
3.2. The case for using the ratio D_P/c_P to compare degrees of uniformity
Given the conclusion of the last section, it is clear that comparing degrees of uniformity of parts of a distribution boils down to comparing degrees of uniformity of the abstract SEE (Shannon Equivalent Equiprobable) equivalents of those parts.
We look at an example of a distribution where the SEE equivalents of three parts I, II, and III are shown as in figure 2. We assume that these three parts are the SEE equivalent types of three parts of a given distribution. The probability values are fictitious and are used to make an important point i.e., the ratio D P /c P for each abstract equivalent SEE part (and hence the same ratio for the original part itself) is a measure of how much more or less uniformly distributed a given part is compared to other parts, thereby showing that the ratio D P /c P is a relative measurement of degree of uniformity of parts of the original distribution.
It is easy to calculate the ratio $D_P/c_P$ for each part: in part I, 33.33 SEE types are assigned per unit cumulative frequency; in part II, 50 SEE types; and in part III, 100 SEE types. Thus, part III has the largest number of SEE types per unit cumulative frequency, followed by part II and then part I.
The main point is this: SEE types are uniformly distributed. If we are talking about the entire distribution (which has a cumulative probability of 1), then the diversity of the entire distribution (in this case D = 54.51) is an indication of the extent of uniformity of the entire distribution. Hence, the diversities D of entire distributions can be compared to indicate their degrees of uniformity. However, if we are talking about parts of a distribution (where the cumulative probability of the given part is no longer equal to 1 but c < 1), then the number of SEE types per unit cumulative frequency (which is the same as $D_P/c_P$) is a better indication of the degree of uniformity. We note that $D_P/c_P = D$ if $c_P = 1$, and hence $D_P/c_P$, as a measure of degree of uniformity, is a generalization of the total diversity D to parts of a distribution.
If we think of each type as having the same width (amount) of money, then part III has the most Shannon Equivalent Equiprobable (SEE) money per person, i.e., 100 multiplied by the width of each bin. In fact, part III has twice as much SEE money per person as part II. Since $D_{III}/c_{III}$ is the largest, part III should be treated as the most uniformly distributed, followed by part II and then part I. In fact, we also note that $D_{III}/c_{III} > D$, and hence part III is actually more uniformly distributed than the entire distribution itself. Similarly, $D_{I}/c_{I} < D$ and $D_{II}/c_{II} < D$ mean that parts I and II are less uniformly distributed than the entire distribution. Finally, the diversity D of the entire distribution is a measure of the degree of uniformity of the entire distribution, and theorem 3.1 and corollary 3.1 say that the degree of uniformity of the entire distribution is a weighted geometric mean of the degrees of uniformity of the parts of the distribution; in other words, the total diversity D is a weighted geometric mean of the ratios $D_{P_i}/c_{P_i}$. We used the example above to lay down the intuition for why $D_P/c_P$ is a good measure of the degree of uniformity. Now we consider the general case, where a part of a distribution from index $k_1$ to $k_2$ has diversity $D_{\{k_1,k_2\}}$ and cumulative probability $c_{\{k_1,k_2\}}$. The ratio $D_{\{k_1,k_2\}}/c_{\{k_1,k_2\}}$ is the number of SEE types (or bins) assigned per unit cumulative frequency, and can be used to measure the degree of uniformity of the distribution between parts. The higher the $D_P/c_P$ ratio for a part, the more diversity per unit cumulative frequency compared to another part with a lower value. Furthermore, this ratio is a true proportion as far as the part $\{k_1, k_2\}$ is concerned: since the redrawn SEE equivalent is actually uniform, any portion $P_j$ of this part with cumulative probability $c_{P_j}$ will contain exactly $c_{P_j}\,D_{\{k_1,k_2\}}/c_{\{k_1,k_2\}}$ SEE types.
The same intuition that we built using the example above holds in the general case as well: the ratio indicates the number of SEE types per unit cumulative frequency, and hence is a measure of the degree of uniformity of the part of the distribution from index $k_1$ to $k_2$. We focus on computing the ratio $D_P/c_P$ for a given part P next. Figure 3 below shows how a part of the distribution of the type $\{k_1, k_2\}$ is mapped between the case-based entropy curve and the original distribution.
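The comparison of parts via $D_P/c_P$ can be sketched as follows. The three parts below are hypothetical (they are not the values of figure 2) and merely illustrate the ranking logic:

```python
import math

def part_diversity(part):
    """Diversity of a part, probabilities renormalized by the part's mass."""
    c = sum(part)
    return math.exp(-sum((x / c) * math.log(x / c) for x in part if x > 0))

# Hypothetical parts of a distribution (illustrative values only):
parts = {
    "I":   [0.15, 0.15, 0.10],        # c = 0.40, uneven
    "II":  [0.12, 0.12, 0.12, 0.04],  # c = 0.40, more even
    "III": [0.05, 0.05, 0.05, 0.05],  # c = 0.20, perfectly uniform
}

# SEE types per unit cumulative frequency for each part: D_P / c_P.
ratios = {name: part_diversity(part) / sum(part) for name, part in parts.items()}
# Part III (uniform) gets the highest ratio, then II, then I.
```

Note that part III, despite having the smallest cumulative probability, ranks highest precisely because the ratio normalizes by the part's mass.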

Computing the ratio $D_P/c_P$ for a part of a distribution
A Lorenz-type curve called case-based entropy was introduced to compare distributions in (Rajaram and Castellani 2016). The case-based entropy of a part $P = \{1, k\}$ is defined as
$$C_{\{1,k\}} = \frac{D_{\{1,k\}}}{D_K},$$
where $D_{\{1,k\}}$ is the diversity of the part P and $D_K$ is the total diversity of all K types. It is clear from the last section that the ratio $D_P/c_P$ for a part P is a way to measure the degree of uniformity of the distribution on the part P. In this section, we show how the case-based entropy curve can be used to compute the ratio $D_P/c_P$ for a given part P of a distribution.
We consider the parts denoted by indices $P_1 = \{1, k_1\}$ and $P_2 = \{1, k_2\}$ with $k_1 < k_2$. From equation (7), we know the following:
$$\left(\frac{D_{\{1,k_2\}}}{c_{\{1,k_2\}}}\right)^{c_{\{1,k_2\}}} = \left(\frac{D_{\{1,k_1\}}}{c_{\{1,k_1\}}}\right)^{c_{\{1,k_1\}}}\left(\frac{D_{\{k_1,k_2\}}}{c_{\{k_1,k_2\}}}\right)^{c_{\{k_1,k_2\}}},$$
where $\{k_1, k_2\}$ denotes the part between the first $k_1$ types and the first $k_2$ types, with $c_{\{k_1,k_2\}} = c_{\{1,k_2\}} - c_{\{1,k_1\}}$. We define the slopes of the secants on the case-based entropy curve joining the points (0, 0) and $(c_{\{1,k\}}, C_{\{1,k\}})$ as $A_{\{1,k\}} = C_{\{1,k\}}/c_{\{1,k\}}$. Dividing by the total diversity $D_K$ raised to the power $c_{\{1,k_2\}}$, the above equation can be rewritten as:
$$A_{\{1,k_2\}}^{c_{\{1,k_2\}}} = A_{\{1,k_1\}}^{c_{\{1,k_1\}}}\left(\frac{D_{\{k_1,k_2\}}}{c_{\{k_1,k_2\}}\,D_K}\right)^{c_{\{1,k_2\}} - c_{\{1,k_1\}}}.$$
Note that $D_K$ is the total diversity and is known separately. In fact, everything on the right-hand side of the above equation except the ratio $D_{\{k_1,k_2\}}/c_{\{k_1,k_2\}}$ is known.
Solving for the unknown ratio in the last equation above, we have:
$$\frac{D_{\{k_1,k_2\}}}{c_{\{k_1,k_2\}}\,D_K} = \left(\frac{A_{\{1,k_2\}}^{c_{\{1,k_2\}}}}{A_{\{1,k_1\}}^{c_{\{1,k_1\}}}}\right)^{\frac{1}{c_{\{1,k_2\}} - c_{\{1,k_1\}}}}.$$
We now take the natural logarithm of both sides of the above equation to obtain a logarithmic interpolation formula:
$$\ln\frac{D_{\{k_1,k_2\}}}{c_{\{k_1,k_2\}}\,D_K} = \frac{c_{\{1,k_2\}}\ln A_{\{1,k_2\}} - c_{\{1,k_1\}}\ln A_{\{1,k_1\}}}{c_{\{1,k_2\}} - c_{\{1,k_1\}}} =: S_{\{k_1,k_2\}}.$$
If we plot $c_{\{1,k\}}$ against $c_{\{1,k\}}\ln A_{\{1,k\}}$, then the above formula is the slope of the secant line of the curve joining the points A and B in figure 4. We note that this curve starts at (0, 0) and ends at (1, 0). We name this curve the slope of diversity curve.
In figure 4, points A and B have the coordinates $(c_{\{1,k_1\}}, c_{\{1,k_1\}}\ln A_{\{1,k_1\}})$ and $(c_{\{1,k_2\}}, c_{\{1,k_2\}}\ln A_{\{1,k_2\}})$, respectively. Taking exponentials, we have:
$$\frac{D_{\{k_1,k_2\}}}{c_{\{k_1,k_2\}}} = D_K\, e^{S_{\{k_1,k_2\}}}.$$
Then we have the following equivalence:
$$S_{\{k_1,k_2\}} > S_{\{k_3,k_4\}} \iff \frac{D_{\{k_1,k_2\}}}{c_{\{k_1,k_2\}}} > \frac{D_{\{k_3,k_4\}}}{c_{\{k_3,k_4\}}}.$$
This means, as shown in figure 5, that the ordering of the slopes of secants of parts of the slope of diversity curve preserves the ordering of the ratios $D_P/c_P$ for the corresponding parts in the original distribution. Hence the $c_{\{1,k\}}$ versus $c_{\{1,k\}}\ln A_{\{1,k\}}$ (slope of diversity) curve is a way to measure the relative degree of uniformity of parts of the original distribution. For example, if $S_{\{k_1,k_2\}} > S_{\{k_3,k_4\}}$, then the part $\{k_1, k_2\}$ has more SEE types per unit cumulative frequency than the part $\{k_3, k_4\}$, and hence is more uniformly distributed than the part $\{k_3, k_4\}$. Alternatively, we could form the ratio $e^{S_{\{k_1,k_2\}} - S_{\{k_3,k_4\}}}$, which tells us how much more or less uniformly distributed the part $\{k_1, k_2\}$ is relative to the part $\{k_3, k_4\}$. Hence, with the slope of diversity curve, we have definitively created a quantitative way to compare the degree of uniformity of parts of a given distribution using the slopes of its secants.
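Under our reading of the derivation above, the part ratio satisfies $D_P/c_P = D_K\,e^{S_P}$. The sketch below builds the slope of diversity curve for an illustrative distribution and checks a secant slope against the directly computed ratio (the indexing convention for the part between $k_1$ and $k_2$ is our own):

```python
import math

def renorm_diversity(part):
    """Diversity of a part with probabilities renormalized by its mass."""
    c = sum(part)
    return math.exp(-sum((x / c) * math.log(x / c) for x in part if x > 0))

p = [0.4, 0.25, 0.15, 0.1, 0.06, 0.04]
D_K = renorm_diversity(p)

# Slope of diversity curve: x = c_{1,k}, y = c_{1,k} * ln(A_{1,k}), plus (0, 0).
xs, ys = [0.0], [0.0]
for k in range(1, len(p) + 1):
    c = sum(p[:k])
    A = (renorm_diversity(p[:k]) / D_K) / c     # A_{1,k} = C_{1,k} / c_{1,k}
    xs.append(c)
    ys.append(c * math.log(A))

# Secant slope over the part consisting of types k1+1 .. k2, and the
# identity D_P / c_P = D_K * exp(S_P).
k1, k2 = 2, 5
S = (ys[k2] - ys[k1]) / (xs[k2] - xs[k1])
part = p[k1:k2]
assert abs(renorm_diversity(part) / sum(part) - D_K * math.exp(S)) < 1e-9
```

The curve indeed starts at (0, 0) and ends at (1, 0), since $A_{\{1,K\}} = 1$.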
Having established the importance of the ratio D P /c P as a way to measure the degree of uniformity of a part of a distribution, and created a way to compute D P /c P , we now explore ways to compare the ratio D P /c P for different parts of a distribution, and its relationship to the original distribution.

Some results related to comparing $D_P/c_P$ for parts
In this section, we summarize our findings about the comparison of the $D_P/c_P$ ratio of parts in the form of theorems.
Theorem 5.1. Let $P_1 = \{1, k_1\}$ and $P_2 = \{1, k_2\}$ be two parts of a probability distribution as in table 1. Then we have the following equivalence:
$$A_{P_1} > A_{P_2} \iff \frac{D_{P_1}}{c_{P_1}} > \frac{D_{P_2}}{c_{P_2}}.$$
Theorem 5.1 is illustrated in figure 6. The key point here is that for $P_1 = \{1, k_1\}$ and $P_2 = \{1, k_2\}$, $A_{P_1}$ and $A_{P_2}$ are slopes of secants on the case-based entropy curve between the points (0, 0) and $(c_{P_1}, C_{P_1})$, and between (0, 0) and $(c_{P_2}, C_{P_2})$, respectively. So the degree of uniformity of parts that look like $P_1 = \{1, k_1\}$ and $P_2 = \{1, k_2\}$ can be directly compared from slopes of secants of the original case-based entropy curve.
However, if the parts are of the type $P_1 = \{k_1, k_2\}$ and $P_2 = \{k_3, k_4\}$ with $k_1 \ne 1$ and $k_3 \ne 1$, then the case-based entropy curve cannot be used directly to compare their degrees of uniformity. For these types of general parts that do not start at index 1, we need to plot the slope of diversity curve as in figure 5. We summarize the development of the previous section in the form of the following theorem:
Theorem 5.2. Let $P_1 = \{k_1, k_2\}$ and $P_2 = \{k_3, k_4\}$ be general disjoint parts of a probability distribution as in table 1. Then we have the following equivalence:
$$S_{P_1} > S_{P_2} \iff \frac{D_{P_1}}{c_{P_1}} > \frac{D_{P_2}}{c_{P_2}}.$$
Proof. Already explained in the previous section. We refer to figure 5 for an illustration as well.
Remark 5.1. We note that theorem 5.1 is subsumed by theorem 5.2. This is because we could choose parts of the form $P_1 = \{1, k_1\}$ and $P_2 = \{1, k_2\}$; since the slope of diversity curve passes through (0, 0), the corresponding secants emanate from the origin and $S_{P_i} = \ln A_{P_i}$, which shows that theorem 5.1 is subsumed by theorem 5.2.
We now state and prove an explicit relationship between the probabilities in the original distribution and the slope of diversity curve. We note that this is the first time that the diversity of a distribution is directly related to the individual probabilities in the distribution, thereby establishing a connection between the diversity and the shape of the original distribution.
Theorem 5.3. Given a probability distribution as in table 1, we have the following for every k:
$$p_k = \frac{e^{-S_{\{k-1,k\}}}}{D_K}.$$
Proof. We already showed the following in equation (9):
$$\frac{D_{\{k_1,k_2\}}}{c_{\{k_1,k_2\}}} = D_K\, e^{S_{\{k_1,k_2\}}}.$$
For the single-type part between indices k − 1 and k, we have $D_{\{k-1,k\}} = 1$ (a single type is trivially uniform) and $c_{\{k-1,k\}} = p_k$. This implies that equation (9) becomes $\frac{1}{p_k} = D_K\,e^{S_{\{k-1,k\}}}$, and solving for $p_k$ gives us the result of theorem 5.3.
This means that we can completely reconstruct the original distribution simply by reading off the slopes $S_{\{k-1,k\}}$ for all $k = 1, \ldots, K$ and computing $p_k = e^{-S_{\{k-1,k\}}}/D_K$. This is a key reconstruction result, illustrated in figure 7, which shows that there is a one-to-one (injective) correspondence between the original distribution and the case-based entropy curve via the slope of diversity curve. This is a new result.
It also means that two different distributions will give two different case-based entropy curves, each unique to the shape of its distribution, and likewise two different slope of diversity curves.
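Theorem 5.3's reconstruction can be demonstrated numerically: from the consecutive secant slopes of the slope of diversity curve (and the total diversity $D_K$), every probability is recovered in the form $p_k = e^{-S_{\{k-1,k\}}}/D_K$, our reading of the formula. An illustrative sketch:

```python
import math

def renorm_diversity(part):
    c = sum(part)
    return math.exp(-sum((x / c) * math.log(x / c) for x in part if x > 0))

p = [0.35, 0.25, 0.2, 0.12, 0.08]
D_K = renorm_diversity(p)

# Points of the slope of diversity curve: (c_{1,k}, c_{1,k} * ln A_{1,k}).
cs, ys = [0.0], [0.0]
for k in range(1, len(p) + 1):
    c = sum(p[:k])
    cs.append(c)
    ys.append(c * math.log(renorm_diversity(p[:k]) / (c * D_K)))

# Each probability is recovered from a consecutive secant slope.
recovered = []
for k in range(1, len(p) + 1):
    S = (ys[k] - ys[k - 1]) / (cs[k] - cs[k - 1])
    recovered.append(math.exp(-S) / D_K)

assert all(abs(a - b) < 1e-9 for a, b in zip(recovered, p))
```

Because the x-coordinates of the curve are the cumulative probabilities, the curve determines the distribution, and vice versa, which is the injectivity claimed above.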
Theorem 5.4. Given a probability distribution as in table 1, let $\mathcal{G}_1$ be the set of graphs of the original probability distributions, $\mathcal{G}_2$ the set of graphs of the corresponding case-based entropy curves, and $\mathcal{G}_3$ the set of graphs of the corresponding slope of diversity curves, with $g_1$, $g_2$ and $g_3$ denoting elements (graphs) in $\mathcal{G}_1$, $\mathcal{G}_2$ and $\mathcal{G}_3$ respectively. In addition, let $T_{j \to k}$ be the map from $\mathcal{G}_j$ to $\mathcal{G}_k$, where $j, k \in \{1, 2, 3\}$. Then every map $T_{j \to k}$ is injective (or one-to-one).
Remark 5.3. We note that since the number of points on the original distribution curve, the case-based entropy curve and the slope of diversity curve are equal, the map $T_{j \to k}$ is defined to be the natural map between the points in the same order as they appear from left to right.

Proof.
(1) $T_{1\to2}$: Let $g_1, g_1' \in \mathcal{G}_1$ with $g_1 \ne g_1'$. Then the two distributions differ in at least one probability, so the corresponding cumulative probabilities and partial diversities, and hence the points of the case-based entropy curves, differ. This shows that the map $T_{1\to2}$ is injective.
(2) $T_{2\to3}$: Let $g_2, g_2' \in \mathcal{G}_2$ with $g_2 \ne g_2'$. Since the slope of diversity curve is obtained pointwise from $(c_{\{1,k\}}, C_{\{1,k\}})$ as $(c_{\{1,k\}}, c_{\{1,k\}}\ln A_{\{1,k\}})$, differing case-based entropy curves yield differing slope of diversity curves. This shows that the map $T_{2\to3}$ is injective.
(3) $T_{3\to1}$: By theorem 5.3, the probabilities $p_k$ are recovered from the secant slopes $S_{\{k-1,k\}}$ of the slope of diversity curve, so different slope of diversity curves correspond to different original distributions. This shows that the map $T_{3\to1}$ is injective.
Remark 5.4. We note that the inverse of an injective map is also injective. Hence, we could have shown that the inverses of the maps $T_{j\to k}$ are injective, and that would also have proved theorem 5.4. The key to all of these proof steps is that both coordinates must match for two points to be equal, which forces the uniqueness of points, because the equality of one of the coordinates (typically the x) leads to an equality of indices, probabilities or cumulative probabilities.
Remark 5.5. Theorem 5.4 is a significant improvement over the original notion of diversity introduced by (Hill 1973) and (Jost 2006) in its own right. This is because the Hill numbers ${}^qD$ are insensitive to rearrangements of the original distribution: any permutation of the probabilities in the original distribution leads to the same value of ${}^qD$. This may be acceptable for qualitative distributions, such as species in a forest, since in such a context we are only interested in the diversity of the distribution modulo permutations. However, the shape of the original probability distribution becomes extremely important in the context of a quantitative distribution such as income, where we are interested in quantifying and comparing the amount of inequality (or degree of uniformity) that exists in different parts of the distribution. Given that the graph of $c_{\{1,k\}}$ versus $c_{\{1,k\}}\ln A_{\{1,k\}}$, i.e., the slope of diversity curve $g_3 \in \mathcal{G}_3$, is extremely useful for directly reading off the $D_P/c_P$ ratios (which measure the degree of uniformity of parts) from slopes of secants, such comparisons can also be made from the graph of $g_2 \in \mathcal{G}_2$ indirectly by using $g_3 \in \mathcal{G}_3$. In this context, establishing the injectivity of the maps $T_{j\to k}$ is an important step that allows us to go back and forth between the graphs in $\mathcal{G}_j$ and $\mathcal{G}_k$.
In essence, we have shown in theorem 5.4 that the shape of the original probability distribution $g_1 \in \mathcal{G}_1$ uniquely determines the case-based entropy curve $g_2 \in \mathcal{G}_2$ and the slope of diversity curve $g_3 \in \mathcal{G}_3$, and vice versa.

Results relating the case-based entropy curve with the original probability distribution
In this section, we present some results that relate the case-based entropy curve, i.e., the $c_{\{1,k\}}$ versus $C_{\{1,k\}}$ curve, to the original probability distribution. The importance of these results stems from the fact that, for the first time, we have a connection between the variation of diversity of a given distribution in table 1, as measured by the case-based entropy curve ($c_{\{1,k\}}$ versus $C_{\{1,k\}}$) and the slope of diversity curve ($c_{\{1,k\}}$ versus $c_{\{1,k\}}\ln A_{\{1,k\}}$), and the probabilities $p_k$ in the original distribution. This achieves the objective of connecting the Hill numbers (Jost 2006, Leinster and Cobbold 2012, Chao and Jost 2015, Hsieh et al 2016, Pavoine et al 2016, Jost 2019) to the shape of the original distribution.
Theorem 6.1. Given a probability distribution as in table 1 with a disjoint partition into parts $P_i$, the average case-based entropies per unit cumulative probability $A_{P_i}$ are related as follows:
$$\prod_i A_{P_i}^{c_{P_i}} = 1.$$
Proof. Divide equation (5) by $D_K$ and rewrite using the fact that $\sum_i c_{P_i} = 1$.
Corollary 6.1. Given a probability distribution as in table 1, and a part P with a disjoint partition $P = \bigcup_i P_i$, the average case-based entropies per unit cumulative probability $A_P$ and $A_{P_i}$ are related as follows:
$$A_P^{c_P} = \prod_i A_{P_i}^{c_{P_i}}.$$
Proof. This follows from dividing equation (7) in corollary 3.1 by $D_K^{c_P}$.
Hence, specializing theorem 6.1 to the two parts $\{1, k\}$ and $\{k+1, K\}$, we have the following, which proves the theorem:
$$A_{\{1,k\}}^{c_{\{1,k\}}}\, A_{\{k+1,K\}}^{1 - c_{\{1,k\}}} = 1. \tag{14}$$
Remark 6.1. Rearranging equation (14), we have
$$A_{\{k+1,K\}} = A_{\{1,k\}}^{-\frac{c_{\{1,k\}}}{1 - c_{\{1,k\}}}}.$$
This means that $A_{\{1,k\}} < 1$ if and only if $A_{\{k+1,K\}} > 1$. In other words, if the average case-based entropy per unit cumulative frequency, i.e., the slope of the line joining the points (0, 0) and $(c_{\{1,k\}}, C_{\{1,k\}})$ on the case-based entropy curve, is less than 1, then the portion of the original probability distribution with indices $\{1, k\}$ is less uniformly distributed than the portion with indices $\{k+1, K\}$.
Theorem 6.3. Given a probability distribution as in table 1, we have the following for any fixed k: From the last equation above, this means the following: This proves the theorem.
Theorem 6.4. Given a probability distribution as in table 1, and given a fixed index k, we have the following:
$$A_{\{1,k\}}^{c_{\{1,k\}}} = A_{\{1,k-1\}}^{c_{\{1,k-1\}}}\left(\frac{1}{p_k\,D_K}\right)^{p_k}. \tag{17}$$
Proof. We rewrite equation (5) by using remark 3.2, restricting ourselves to the partition $\{1, k-1\} \cup \{k, k\}$ of the part $\{1, k\}$, and divide by $D_K^{c_{\{1,k\}}}$.
Remark 6.2. We remark that the following variation of equation (17) is true as well; the proof follows the same lines, using the fact that $p_1 = c_{\{1,1\}}$.
Theorem 6.5. Given a probability distribution as in table 1, the case-based entropy curve is the straight line joining (0, 0) and (1, 1) if and only if the original distribution is uniform. We use theorem 6.4 to prove this: if $A_{\{1,k\}} = 1$ for all k, equation (17) forces $p_k = 1/D_K$ for every k, i.e., a uniform distribution, and conversely a uniform distribution gives $C_{\{1,k\}} = c_{\{1,k\}}$ for all k.
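The uniform-distribution characterization in theorem 6.5 is easy to check numerically: for a uniform distribution the points $(c_{\{1,k\}}, C_{\{1,k\}})$ fall on the diagonal, and for a skewed one they do not. A small sketch (the distributions are illustrative):

```python
import math

def renorm_diversity(part):
    c = sum(part)
    return math.exp(-sum((x / c) * math.log(x / c) for x in part if x > 0))

def cbe_curve(p):
    """Points (c_{1,k}, C_{1,k}) of the case-based entropy curve."""
    D_K = renorm_diversity(p)
    return [(sum(p[:k]), renorm_diversity(p[:k]) / D_K)
            for k in range(1, len(p) + 1)]

uniform = [0.2] * 5
skewed = [0.5, 0.2, 0.15, 0.1, 0.05]

on_diagonal = all(abs(C - c) < 1e-9 for c, C in cbe_curve(uniform))
off_diagonal = any(abs(C - c) > 1e-6 for c, C in cbe_curve(skewed))
```

For the uniform case, $D_{\{1,k\}} = k$ and $D_K = K$, so $C_{\{1,k\}} = k/K = c_{\{1,k\}}$ exactly.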
Theorem 6.6. Given a probability distribution as in table 1, we have the following: We also note that $A_{\{1,k\}} \to 1$. This means that there exists some $M > L$ so that the sequence is increasing and converges to 1. Now, we have the following from equation (17): Going the other direction, we have an increasing sequence that converges to 1. This proves the theorem.
First, we note from remark 6.2, theorem 6.1 and theorem 6.3 that: We also note that $A \to 1$. This means that there exists some $M > L$ beyond which the sequence is increasing and converges to 1. Now, we have the following from remark 6.2: Going the other direction, we have that the sequence is increasing and converges to 1. This proves the theorem.
Remark 6.4. In particular, if the sequence of average case-based entropies per unit frequency increases to 1 very fast (or very slowly), then the probabilities $\{p_k\}_{k=1}^{M}$ will form a left-tail sequence of increasing probabilities that increases quickly (or slowly).
Remark 6.5. We note that theorems 6.6 and 6.7, while giving us a sense of the existence of tails on the right and left ends of the original probability distribution, only give conservative estimates of what exactly those probabilities could be. We will see below that we can improve this significantly using the $c_{\{1,k\}}$ versus $c_{\{1,k\}} \ln A_{\{1,k\}}$ curve.

Theorem 6.8. Given a probability distribution like in table 1: The entropy H and the diversity D can be easily calculated.
We truncate up to i = K and renormalize the probabilities $\hat{p}_i$ (so that they add up to 1) to get: So, interestingly enough, for the geometric distribution with infinite support, we have an explicit expression for $c_{\{1,k\}} \ln A_{\{1,k\}}$ as a function of $c_{\{1,K\}}$, and it is independent of p or q. Let K be the index corresponding to $c_{\{1,K\}} = 0.5$.
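This p, q-independence can be verified numerically. The sketch below assumes one plausible reading of the quantity involved: measured relative to the total diversity $D_K$, the slope-of-diversity quantity reduces to the excess $H_{\{1,k\}} - c_{\{1,k\}} \ln D_K$, where $H_{\{1,k\}} = -\sum_{i\le k} p_i \ln p_i$; under that reading, the expression collapses to $(1-c)\ln(1-c)$ for every choice of p and q:

```python
import math

def geometric_excess(q, k):
    """For p_i = p q^(i-1) with p = 1 - q, return (c_{1,k}, H_{1,k} - c_{1,k} * H).

    Assumed reading: relative to the total diversity D_K, the slope-of-diversity
    quantity c ln A reduces to this excess, which for the geometric distribution
    works out to (1 - c) ln(1 - c), independent of p and q.
    """
    p = 1 - q
    H = -(p * math.log(p) + q * math.log(q)) / p   # total Shannon entropy
    c = Hc = 0.0
    for i in range(1, k + 1):
        pi = p * q ** (i - 1)
        c += pi
        Hc -= pi * math.log(pi)
    return c, Hc - c * H

# Different q, same functional form (1 - c) ln(1 - c):
for q, k in [(0.3, 3), (0.7, 8)]:
    c, e = geometric_excess(q, k)
    print(f"q={q}: c={c:.4f}, excess={e:.6f}, (1-c)ln(1-c)={(1 - c) * math.log(1 - c):.6f}")
```

At each printed point the excess and $(1-c)\ln(1-c)$ agree to machine precision, regardless of q, consistent with the independence claim in the text.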
This means that for every secant line to the left of x = 0.5, we can find a matching secant line to the right of x = 0.5 with the same slope. That is, for any subset of types $\{k_1, \dots, k_2\}$ to the left of x = 0.5, there is an equivalent subset of types $\{k_3, \dots, k_4\}$ to the right of x = 0.5 (possibly with more types) that has the same number of SEE types, as shown in figure 9. Since

So, in fact the
Hence, for the truncated and normalized geometric distribution, the formula for $c_{\{1,k\}} \ln A_{\{1,k\}}$ is not independent of p and q. The blue graph in figure 10 corresponds to N approaching infinity, for any p and q. The red graph corresponds to N = 8 and q = 0.7.
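The contrast with the infinite-support case can be illustrated numerically, using the same assumed reading as before (relative to the total diversity, the slope-of-diversity quantity reduces to the excess $H_{\{1,k\}} - c_{\{1,k\}} \ln D$; for infinite support this equals $(1-c)\ln(1-c)$, but after truncation and renormalization the identity fails):

```python
import math

def truncated_excess(q, N, k):
    """H_{1,k} - c_{1,k} * H for a geometric distribution truncated at N
    and renormalized.

    Assumed reading, as for the infinite case: relative to the total
    diversity, c ln A reduces to this excess. With infinite support it
    equals (1 - c) ln(1 - c); after truncation that identity no longer holds.
    """
    raw = [(1 - q) * q ** (i - 1) for i in range(1, N + 1)]
    Z = sum(raw)
    p = [v / Z for v in raw]            # renormalized probabilities p-hat
    H = -sum(v * math.log(v) for v in p)
    c = sum(p[:k])
    Hk = -sum(v * math.log(v) for v in p[:k])
    return c, Hk - c * H

# The figure's truncated case: N = 8 and q = 0.7.
c, e = truncated_excess(q=0.7, N=8, k=3)
print(f"c={c:.4f}  excess={e:.6f}  (1-c)ln(1-c)={(1 - c) * math.log(1 - c):.6f}")
```

The two printed values now visibly disagree, confirming that truncation breaks the p, q-independence.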

Conclusion
Given the real-world challenges of measuring diversity, we had two objectives for this study. First, to introduce and justify the ratio D P /c P as a measure of the degree of uniformity of a part of a given distribution like in table 1. Second, to prove results that concretely link the case-based entropy curve and the original probability distribution (via the slope of diversity curve), thereby establishing, for the first time, an explicit and concrete link between the diversity of parts of a distribution and the original probabilities themselves.

We have achieved both objectives in this paper, and have also demonstrated how to compute some of these quantities, such as the $c_{\{1,k\}}$ versus $c_{\{1,k\}} \ln A_{\{1,k\}}$ curve (which we call the slope of diversity) for the geometric distribution. These two results are an important step towards concretely comparing and contrasting the degrees of uniformity of parts of a given probability distribution, within and across different distributions, given that most real-world systems have unequal distributions, varying frequencies, and comprise multiple diversity types with unknown frequencies that can change. Such systems, as we mentioned in the introduction, include income distributions, economic complexity indices, ecological systems, species diversity, and ranking systems, from genes and exposomic biological assays to measures of economic and health inequality.

For example, returning to the Gini coefficient from the introduction, our approach allows for several advances. First, because our approach does not conflate different distributions with the same coefficient, we can provide a unique case-based entropy or slope of diversity curve for each and every income distribution.
Second, for any given country's income distribution, we can also provide a precise quantification of the relationship between the probability of each income level $p_i$ and the total income diversity D, both among parts of the income distribution and for the whole, using the case-based entropy and slope of diversity curves. The Gini index, for example, cannot do that.
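To make the Gini point concrete, here is a small hypothetical sketch: two three-person income vectors share the same Gini coefficient, yet their income-share distributions produce different case-based entropy curves (again assuming, per the authors' earlier work, that $C_{\{1,k\}}$ is the partial Shannon entropy over the total entropy):

```python
import math

def gini(x):
    # Mean absolute difference over twice the mean (standard discrete Gini).
    n, mu = len(x), sum(x) / len(x)
    return sum(abs(a - b) for a in x for b in x) / (2 * n * n * mu)

def entropy_curve(p):
    # Assumed definition: C_{1,k} = partial Shannon entropy / total entropy.
    H = -sum(q * math.log(q) for q in p)
    c = Hc = 0.0
    out = []
    for q in p:
        c += q
        Hc -= q * math.log(q)
        out.append((c, Hc / H))
    return out

# Two hypothetical 3-person income vectors with the same Gini coefficient (2/9).
x1, x2 = [1, 2, 3], [1, 1, 2.5]
print(gini(x1), gini(x2))

# Their income-share distributions differ, so their curves differ.
p1 = [v / sum(x1) for v in x1]
p2 = [v / sum(x2) for v in x2]
print(entropy_curve(p1) != entropy_curve(p2))   # True
```

The Gini coefficient assigns both vectors the value 2/9, while the case-based entropy curve distinguishes them, illustrating the uniqueness advantage described above.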
Third, we have also established a concrete quantitative means of comparing the degree of uniformity of parts of a distribution. Such a comparison is extremely important in studying the prevalence of inequality (as in the case of incomes, for example) in whole distributions and their parts. A quantifiable measure of the degree of uniformity (or inequality) for quantitative variables such as income and resources will pave the way to formulating policies that lead to equity in the distribution of resources, and also to measuring such achievement using the D P /c P ratio.
Fourth, we have closed the gap that exists in the literature on diversity measures by explicitly relating the diversity of parts of a distribution to the probabilities in the original distribution. We have also shown, to repeat a point, in theorem 5.4 that the shape of the original distribution uniquely determines the diversity of its parts and vice versa. Furthermore, we have also shown how to explicitly compute the individual probabilities of the original distribution, as in the case of income for example, from the case-based entropy curve. This is a significant step towards linking the concept of diversity to the shape of the original distribution which, as we have commented in remark 5.5, is extremely important in quantifying and locating regions in the original distribution that are more or less unequally distributed.

So in general, whenever there is a partition $P_i$ as a subscript, it means that we are dividing the probability (or cumulative probability) in the base by $c_{P_i}$.

(12) $\mathcal{G}_1$: This is the set of all probability distributions like in table 1, with elements denoted by $g_1$.
(13) $\mathcal{G}_2$: This is the set of all case-based entropy curves, with elements denoted by $g_2$.
(14) $\mathcal{G}_3$: This is the set of all slope of diversity curves, with elements denoted by $g_3$.