Mutations in known DNA sequence: which test to use?

Question

I am analysing a double-knockout mouse and trying to see whether restoring one or the other of the two missing genes preferentially affects specific positions in a third gene.

The data* look like this (a non-DNA simplification for non-biologists is given further below):

Position    DoubleKO      DoubleKO+1    DoubleKO+2
Base:       A     G       A     G       A     G

1           1     2       3     4       5     6
2           7     8       9     10      11    12
...
N=~500      1     2       3     4       5     6

* Counted per position from multiple runs of Sanger sequencing (so not NGS), and the counts are low.

So the task is to find the position or positions (out of positions 1 to 500) at which the G:A ratio (I'm only considering these two bases, not all four in the DNA) differs in a statistically significant manner for

[(DoubleKO+1) vs (DoubleKO)] and [(DoubleKO+2) vs (DoubleKO)]

Note the following contrast is not of interest:

[(DoubleKO+1) + (DoubleKO+2)] vs (DoubleKO)
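
To make the two contrasts concrete: at each position, each contrast reduces to a 2x2 table of A and G counts (DoubleKO vs one rescued line). The sketch below assumes Fisher's exact test as one candidate for such low-count tables; that choice is an assumption for illustration, not an established answer to the question. The counts are the placeholder numbers from the table above, and the names `fisher_exact_two_sided` and `contrast_pvalues` are hypothetical helpers.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables that are no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)

# Placeholder (A, G) counts per position, copied from the example table.
counts = {
    1: {"DoubleKO": (1, 2), "DoubleKO+1": (3, 4), "DoubleKO+2": (5, 6)},
    2: {"DoubleKO": (7, 8), "DoubleKO+1": (9, 10), "DoubleKO+2": (11, 12)},
}

def contrast_pvalues(counts):
    """One p-value per (position, rescued line), each tested against DoubleKO."""
    results = {}
    for pos, row in counts.items():
        a0, g0 = row["DoubleKO"]
        for rescued in ("DoubleKO+1", "DoubleKO+2"):
            a1, g1 = row[rescued]
            results[(pos, rescued)] = fisher_exact_two_sided(a0, g0, a1, g1)
    return results

pvalues = contrast_pvalues(counts)
for key in sorted(pvalues):
    print(key, round(pvalues[key], 4))
```

With ~500 positions and two contrasts this yields roughly 1000 p-values, which is why the multiple-testing question matters.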

Questions:

1. Which statistical test or tests should I use?

2. How should I correct for multiple testing in this experiment?
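
For the second question, one commonly used candidate is the Benjamini-Hochberg false discovery rate adjustment applied across all per-position p-values from both contrasts. The sketch below is a minimal illustration under that assumption; the procedure assumes independent (or positively dependent) tests, which may not hold for neighbouring positions, and the input p-values are made up for the example.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values, one per (position, contrast) test.
raw = [0.01, 0.04, 0.03, 0.5]
adjusted = benjamini_hochberg(raw)
print(adjusted)
```

Positions whose adjusted p-value falls below the chosen FDR threshold (0.05 is a common choice) would then be reported.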

Please note:

Yes, NGS is more informative. This isn't technically feasible at the moment.
I can also get a count for each of the bases (A, T, G, C) plus uncalled (N) at each position, which covers every value that "base" can take. So if this information is needed, it is available, but the main interest is in these two bases, since they are chemically converted into each other and can't become C or T in this context.
I have read the following other questions/links, and can't see quite how they should be applied to my problem:
- https://www.broadinstitute.org/cancer/cga/mutect
- https://www.broadinstitute.org/cancer/cga/mutsig
  - optimised for lots of cancer/normal samples
  - I am, essentially, trying to compare mutations in cancer type A vs mutations in cancer type B.
  - I am also working with genetically identical mice, not people, so the background noise in my mutation rate can be assumed to be 0 (i.e. I do not expect spontaneous changes that are not caused by my treatments)
- Comparing mutation frequency between a case and a pool of controls
  - here the person had one tumour vs multiple controls, i.e. a different experimental design.

Thanks in advance!

Non-DNA simplification:

Imagine that there are ~500 houses in a neighbourhood, in each of which we can have a family of mum + dad + 2 kids living. We want to identify houses which are more conducive to getting kid A or kid B (but not both) to help out with the housework, relative to just mum and dad:

FamilyStructure   JustMumAndDad          Mum+Dad+KidA           Mum+Dad+KidB
House number      HoursMum   HoursDad    HoursMum   HoursDad    HoursMum   HoursDad
1                 1          2           3          4           5          6
2                 3          4           9          5           7          6
...
500               1          2           3          4           5          6
