Measures of diversity (or uniformity or inequality), such as biodiversity in ecology or income inequality in economics. Examples are entropy and Simpson's indices.
Shannon's entropy is the negative of the sum of the probabilities of each outcome multiplied by the logarithm of probabilities for each outcome. What purpose does the logarithm serve in this equation?
An intuitive or visual answer (as opposed to a…
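As a concrete reference for the definition quoted above, here is a minimal Python sketch of Shannon's entropy, $H = -\sum_i p_i \log p_i$. The function name `shannon_entropy` and the two example distributions are assumptions made for illustration; they do not come from the question.

```python
import math

def shannon_entropy(probs, base=math.e):
    """Return -sum(p * log(p)) over outcomes with p > 0.

    Zero-probability outcomes are skipped, following the usual
    convention that 0 * log(0) is treated as 0.
    """
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# Hypothetical distributions, chosen only for illustration.
uniform = [0.25, 0.25, 0.25, 0.25]   # four equally likely outcomes
skewed = [0.97, 0.01, 0.01, 0.01]    # one outcome dominates

h_uniform = shannon_entropy(uniform, base=2)  # in bits
h_skewed = shannon_entropy(skewed, base=2)    # in bits
```

With base 2 the uniform case comes out at 2 bits (four equally likely outcomes), and the skewed case is lower, which is the sense in which entropy measures evenness.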
This is similar to Bootstrap: estimate is outside of confidence interval
I have some data that represents counts of genotypes in a population. I
want to estimate genetic diversity using Shannon's index and also
generate a confidence interval using…
Stack Exchange, as we all know, is a collection of Q&A sites on diverse topics. Assuming the sites are independent of one another, and given a user's stats, how can one compute that user's "well-roundedness" compared to the next user? What is…
Background: My organization currently compares its workforce diversity statistics (ex. % persons with disabilities, % women, % veterans) to the total labor force availability for those groups based on the American Community Survey (a surveying…
Say we have the following 5 cities, each with the same population
CityA with 20% each of 5 ethnicities
CityB with 99% of one ethnicity, but 100 different ethnicities in the remaining 1%
CityC with 40% of one ethnicity and the remaining 60%…
I am having some difficulty understanding the difference between and identifying an observational vs quasi-experimental design. From my understanding, an observational study is one in which the researcher does not influence the system and only…
The Herfindahl–Hirschman Index (HHI) is a concentration measure defined as
$$H = \sum_i p_i^2,$$
where $p_i$ is the market share of firm $i$. It's maximized when one firm has a monopoly and minimized when all firms have equal market…
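The behavior of $H = \sum_i p_i^2$ at the two extremes can be checked with a short sketch. The function name `hhi` and the example markets are hypothetical; the inputs are normalized to shares inside the function, so raw sales figures would also work.

```python
def hhi(shares):
    """Herfindahl-Hirschman Index: sum of squared market shares.

    Inputs are rescaled to sum to 1 before squaring.
    """
    total = sum(shares)
    return sum((s / total) ** 2 for s in shares)

# Hypothetical markets, for illustration only.
monopoly = [1.0]                  # one firm holds the whole market
equal_four = [1, 1, 1, 1]         # four firms with equal shares
uneven = [0.7, 0.1, 0.1, 0.1]     # one dominant firm, three small ones

h_monopoly = hhi(monopoly)
h_equal = hhi(equal_four)
h_uneven = hhi(uneven)
```

A monopoly gives the maximum of 1, while $n$ firms with equal shares give $1/n$, which is the smallest value possible for a market of $n$ firms.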
I've got an educational game in the works that challenges players to redraw U.S. state borders in a way that reduces the inequality between big-population and small-population states. As they play, they need to get real-time feedback on the effect of…
I would like to compare two Simpson Indices from two different populations. I have calculated their variance, as it is done in the original paper by Simpson regarding measures of diversity and I have calculated a confidence interval for each of them…
I'm looking for a measure of dispersion, such as standard deviation, that can be used when distributing to an unordered set.
Specifically: A bucket distribution assigns a non-negative value to each bucket in a finite set. The sum of all assigned…
So I'm trying to run an NMDS (using the vegan package in R) for a species × site matrix. Unfortunately the abundances of species are not inherently equal (if you can use that word), meaning that not all the sites had equal sampling effort for this…
I am looking at the Shannon index formula for diversity, and I am having trouble following part of it. For example, there are 50 foxes at site 1, 60 foxes at site 2 and 100 foxes at site 3. Across all sites there are 210 foxes.
50 / 210 = 0.23809. Then get that…
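The arithmetic in this example can be followed step by step with a small sketch. It assumes, as the excerpt's first step does, that each site's share of the 210 foxes is used as $p_i$, and that the index uses the natural logarithm (the common choice in ecology, but an assumption here). The variable names are made up for illustration.

```python
import math

# Fox counts per site, taken from the question.
counts = [50, 60, 100]
total = sum(counts)  # 210 foxes across all sites

# Step 1: convert each count to a proportion p_i (50 / 210 = 0.23809...).
proportions = [c / total for c in counts]

# Step 2: multiply each proportion by its natural log, giving p_i * ln(p_i).
# Each term is negative because ln(p) < 0 for 0 < p < 1.
terms = [p * math.log(p) for p in proportions]

# Step 3: sum the terms and flip the sign to get H = -sum(p_i * ln(p_i)).
shannon_index = -sum(terms)

print([round(p, 5) for p in proportions])
print(round(shannon_index, 4))
```

The sign flip in the last step is only there so that the index is reported as a positive number.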
We have a random vector $X\sim p(X)$, and a set of realizations of the random vector $S=\{X_i\}_{i=1}^N$. The random vector has $n$ continuous and $m$ categorical features. I want to cluster $S$ so that datapoints with similar values of the…
I would like to compare differences in fish diets between sampling sites using a Bray-Curtis dissimilarity matrix and non-metric multidimensional scaling techniques. My raw data consists of counts of items in each taxonomic category in each stomach,…
Current problem: We have a batch of $n$ items for which we capture their details with $m$ attributes. It could look something like this:
The goal is to compute an "index" that says how "similar" this batch is (or how "diverse" it is…