Questions tagged [diversity]

Measures of diversity (or uniformity or inequality), such as biodiversity in ecology or income inequality in economics. Examples are entropy and Simpson's index.

Some information can be found here: https://en.wikipedia.org/wiki/Biodiversity#Measuring_biodiversity, https://en.wikipedia.org/wiki/Diversity_index, https://en.wikipedia.org/wiki/Income_inequality_metrics

88 questions
106
votes
17 answers

What is the role of the logarithm in Shannon's entropy?

Shannon's entropy is the negative of the sum of the probabilities of each outcome multiplied by the logarithm of probabilities for each outcome. What purpose does the logarithm serve in this equation? An intuitive or visual answer (as opposed to a…
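As a minimal sketch of the formula described in this excerpt, the snippet below computes the entropy of a discrete distribution and checks one property the logarithm provides: entropies of independent sources add. The function name `shannon_entropy`, the base-2 default, and the coin example are illustrative assumptions, not taken from the question.

```python
import math

def shannon_entropy(probs, base=2.0):
    """Return -sum(p * log(p)) over the nonzero probabilities."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# One fair coin (assumed example).
coin = [0.5, 0.5]

# Two independent fair coins: joint probabilities are products of the marginals.
joint = [p * q for p in coin for q in coin]

h_coin = shannon_entropy(coin)
h_joint = shannon_entropy(joint)

# Because log(p * q) = log(p) + log(q), the joint entropy equals the sum
# of the two individual entropies.
additive = math.isclose(h_joint, h_coin + h_coin)
```

Under these assumptions `additive` is true: the logarithm turns products of probabilities into sums of information.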
15
votes
2 answers

Biased bootstrap: is it okay to center the CI around the observed statistic?

This is similar to "Bootstrap: estimate is outside of confidence interval". I have some data that represents counts of genotypes in a population. I want to estimate genetic diversity using Shannon's index and also generate a confidence interval using…
ZNK
  • 201
  • 3
  • 6
13
votes
5 answers

How to measure the "well-roundedness" of SE contributors?

Stack Exchange, as we all know, is a collection of Q&A sites on diverse topics. Assuming that the sites are independent of each other, and given the stats a user has, how can one compute his "well-roundedness" as compared to the next guy? What is…
Graviton
  • 845
  • 1
  • 15
  • 28
11
votes
3 answers

How would re-weighting American Community Survey diversity data affect its margins of error?

Background: My organization currently compares its workforce diversity statistics (e.g., % persons with disabilities, % women, % veterans) to the total labor force availability for those groups based on the American Community Survey (a surveying…
8
votes
5 answers

Is there a way to compute diversity in a population?

Say we have the following 5 cities, each with the same population: CityA, with 20% each of 5 ethnicities; CityB, with 99% of one ethnicity but 100 different ethnicities in the remaining 1%; CityC, with 40% of one ethnicity and the remaining 60%…
Scott Weinstein
  • 341
  • 1
  • 7
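One hedged way to put numbers on the first two cities in this excerpt is the inverse Simpson index, read as an "effective number" of equally common groups. The sketch assumes CityB's remaining 1% is split evenly across the 100 other ethnicities, which the excerpt does not state; the names `inverse_simpson`, `city_a`, and `city_b` are illustrative.

```python
def inverse_simpson(shares):
    """Effective number of groups: 1 / sum(p_i ** 2)."""
    return 1.0 / sum(p ** 2 for p in shares)

# CityA: five ethnicities at 20% each.
city_a = [0.20] * 5

# CityB: one ethnicity at 99%, plus 100 others sharing the last 1%
# (an even split is ASSUMED here for illustration).
city_b = [0.99] + [0.01 / 100] * 100

richness_a, richness_b = len(city_a), len(city_b)
effective_a = inverse_simpson(city_a)
effective_b = inverse_simpson(city_b)
```

Under that assumption CityB has far more groups present (101 versus 5) but an effective number close to 1, while CityA's effective number is 5. A raw count of groups and an abundance-weighted index can therefore rank the same cities in opposite order.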
8
votes
5 answers

Observational vs quasi-experimental design?

I am having some difficulty understanding the difference between observational and quasi-experimental designs, and identifying which is which. From my understanding, an observational study is one in which the researcher does not influence the system and only…
elduderino260
  • 163
  • 1
  • 1
  • 6
6
votes
4 answers

How is the Herfindahl-Hirschman index different from entropy?

The Herfindahl–Hirschman Index (HHI) is a concentration measure defined as $$H = \sum_i p_i^2,$$ where $p_i$ is the market share of firm $i$. It's maximized when one firm has a monopoly and minimized when all firms have equal market…
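To make the comparison concrete, here is a small sketch that evaluates both $H = \sum_i p_i^2$ and Shannon entropy on the two extreme cases the excerpt mentions. The four-firm market and the function names are assumptions chosen for illustration.

```python
import math

def hhi(shares):
    """Herfindahl-Hirschman index: sum of squared market shares."""
    return sum(p ** 2 for p in shares)

def shannon_entropy(shares):
    """Shannon entropy in nats, skipping zero shares."""
    return -sum(p * math.log(p) for p in shares if p > 0)

# Hypothetical four-firm markets.
monopoly = [1.0, 0.0, 0.0, 0.0]
equal = [0.25, 0.25, 0.25, 0.25]

hhi_monopoly, hhi_equal = hhi(monopoly), hhi(equal)
ent_monopoly, ent_equal = shannon_entropy(monopoly), shannon_entropy(equal)
```

In this sketch the two measures move in opposite directions: HHI is 1 for the monopoly and 1/4 for equal shares, while entropy is 0 for the monopoly and $\ln 4$ for equal shares. HHI measures concentration; entropy measures diversity.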
6
votes
2 answers

How to express "inequality" of a distribution in one number?

I've got an educational game in the works that challenges players to redraw U.S. state borders in a way that reduces the inequality between big-population and small-population states. As they play, they need to get real-time feedback on the effect of…
5
votes
2 answers

Comparison of two Simpson indices using t-test

I would like to compare two Simpson indices from two different populations. I have calculated their variances, as is done in Simpson's original paper on measures of diversity, and I have calculated a confidence interval for each of them…
Barrett
  • 131
  • 1
  • 8
5
votes
1 answer

Measure of dispersion over unordered set

I'm looking for a measure of dispersion, such as standard deviation, that can be used when distributing to an unordered set. Specifically: A bucket distribution assigns a non-negative value to each bucket in a finite set. The sum of all assigned…
5
votes
1 answer

Can you use Bray-Curtis distances to evaluate standardized abundances in an NMDS?

So I'm trying to run an NMDS (using the vegan package in R) for a species-by-site matrix. Unfortunately the abundances of species are not inherently equal (if you can use that word), meaning that not all the sites had equal sampling efforts for this…
Leo Ohyama
  • 304
  • 1
  • 18
5
votes
1 answer

Why does the Shannon index take the log of number and then multiply the number by that log?

I am looking at the Shannon index formula for diversity, and I am having trouble following part of it. For example: 50 foxes at site 1, 60 foxes at site 2, and 100 foxes at site 3. Across all sites there are 210 foxes. 50 / 210 = 0.23809. Then get that…
cara
  • 51
  • 4
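The arithmetic in this excerpt can be followed step by step in a short sketch. It uses the fox counts quoted above and assumes the natural logarithm and the usual form $H = -\sum_i p_i \ln p_i$; the excerpt is truncated, so the remaining steps shown here are an assumption about where the calculation goes, and the variable names are illustrative.

```python
import math

# Fox counts per site, as quoted in the excerpt.
counts = [50, 60, 100]
total = sum(counts)  # 210 foxes across all sites

# Step 1: convert each count to a proportion, e.g. 50 / 210.
proportions = [c / total for c in counts]

# Step 2: weight each proportion by its own logarithm. The log measures
# how "surprising" a site is; multiplying by p averages that surprise.
terms = [p * math.log(p) for p in proportions]

# Step 3: sum the terms and flip the sign so the index is non-negative.
shannon_index = -sum(terms)

# Upper bound for comparison: three equally used sites would give ln(3).
max_index = math.log(len(counts))
```

With these counts the index comes out at roughly 1.05, slightly below the maximum of $\ln 3$ that three equal sites would give.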
4
votes
1 answer

Clustering on n features while maximizing the heterogeneity on m remaining features

We have a random vector $X\sim p(X)$, and a set of realizations of the random vector $S=\{X_i\}_{i=1}^N$. The random vector has $n$ continuous and $m$ categorical features. I want to cluster $S$ so that datapoints with similar values of the…
DeltaIV
  • 15,894
  • 4
  • 62
  • 104
4
votes
1 answer

What is the most appropriate transformation method for performing analyses on species composition data?

I would like to compare differences in fish diets between sampling sites using a Bray-Curtis dissimilarity matrix and non-metric multidimensional scaling techniques. My raw data consists of counts of items in each taxonomic category in each stomach,…
sjames
  • 41
  • 3
4
votes
0 answers

What can I use to compute a similarity (or diversity) index for a sample with "multidimensional" attributes?

Current problem: We have a batch of $n$ items for which we capture their details with $m$ attributes. It could look something like this: The goal is to compute an "index" that says how "similar" this batch is (or how "diverse" it is…
PhD
  • 13,429
  • 19
  • 45
  • 47