I am doing some data analysis for my master's, and I have data that are normally distributed but violate the assumption of homogeneity of variance and have unequal sample sizes. From my reading, the Games-Howell post hoc test can handle this type of data, but I cannot find any code or algorithm for it in R. Does anyone know of code for it, or of a different post hoc test that does the same thing and is supported in R? Thanks

Caroline
  • http://aoki2.si.gunma-u.ac.jp/R/src/tukey.R Be sure to remove Japanese characters. Or... http://www.gcf.dkf.unibe.ch/BCB/files/BCB_10Jan12_Alexander.pdf –  Mar 04 '14 at 18:33
  • Definitely check the results of this script. I am getting the same results as I would get with the Tukey test. – Kevin Mar 21 '14 at 12:29
  • I was questioning the above script provided by user41318 and I was having difficulty getting it to work. Is there any breakdown of that script? It currently isn't producing any results... Any help would be great! –  Jun 23 '14 at 19:15
  • But Games-Howell is a non-parametric post hoc test and we have a normal distribution. Wouldn't it be better if we had a post hoc test that would also use the 'information' from assuming normality? – labros labrou Nov 30 '19 at 16:16
  • I believe GH does assume normally distributed residuals, but not necessarily equal variances among groups, like Welch's t test. – Sal Mangiafico Nov 30 '19 at 16:43
  • See https://stats.stackexchange.com/questions/38207/what-are-the-assumptions-of-the-games-howell-multiple-comparisons-procedure – Sal Mangiafico Nov 30 '19 at 16:49

1 Answer

I had the same issue, and I edited the function. It now provides Games-Howell by default, and the output is a bit clearer. I'll put it in the next version of the 'userfriendlyscience' package. Until then, here you go:

posthoc.tgh <- function(y, x, method=c("games-howell", "tukey"), digits=2) {
  ### Based on http://www.psych.yorku.ca/cribbie/6130/games_howell.R
  method <- tolower(method);
  tryCatch(method <- match.arg(method), error=function(err) {
    stop("Argument for 'method' not valid!");
  });

  res <- list(input = list(x=x, y=y, method=method, digits=digits));

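  ### Keep complete cases only, so that missing values cannot
  ### propagate into the group sizes, means, and variances
  complete <- complete.cases(x, y);
  res$intermediate <- list(x = factor(x[complete]),
                           y = y[complete]);
  res$intermediate$n <- tapply(res$intermediate$y, res$intermediate$x, length);
  res$intermediate$groups <- length(res$intermediate$n);
  res$intermediate$df <- sum(res$intermediate$n) - res$intermediate$groups;
  res$intermediate$means <- tapply(res$intermediate$y, res$intermediate$x, mean);
  res$intermediate$variances <- tapply(res$intermediate$y, res$intermediate$x, var);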

  res$intermediate$pairNames <- combn(levels(res$intermediate$x),
                                      2, paste0, collapse=":");

  res$intermediate$descriptives <- cbind(res$intermediate$n,
                                         res$intermediate$means,
                                         res$intermediate$variances);
  rownames(res$intermediate$descriptives) <- levels(res$intermediate$x);
  colnames(res$intermediate$descriptives) <- c('n', 'means', 'variances');

  ### Start on Tukey
  res$intermediate$errorVariance <-
    sum((res$intermediate$n-1) * res$intermediate$variances) /
    res$intermediate$df;
  res$intermediate$t <- combn(res$intermediate$groups, 2, function(ij) {
    abs(diff(res$intermediate$means[ij]))/
      sqrt(res$intermediate$errorVariance*sum(1/res$intermediate$n[ij]));
  } );
  res$intermediate$p.tukey <- ptukey(res$intermediate$t*sqrt(2),
                                     res$intermediate$groups,
                                     res$intermediate$df,
                                     lower.tail=FALSE);
  res$output <- list();
  res$output$tukey <- cbind(res$intermediate$t,
                            res$intermediate$df,
                            res$intermediate$p.tukey);
  rownames(res$output$tukey) <- res$intermediate$pairNames;
  colnames(res$output$tukey) <- c('t', 'df', 'p');

  ### Start on Games-Howell
  res$intermediate$df.corrected <- combn(res$intermediate$groups, 2, function(ij) {               
    sum(res$intermediate$variances[ij] /
          res$intermediate$n[ij])^2 / 
      sum((res$intermediate$variances[ij] /
             res$intermediate$n[ij])^2 / 
            (res$intermediate$n[ij]-1));
  } );
  res$intermediate$t.corrected <- combn(res$intermediate$groups, 2, function(ij) {               
    abs(diff(res$intermediate$means[ij]))/
      sqrt(sum(res$intermediate$variances[ij] /
                 res$intermediate$n[ij]));
  } );    
  res$intermediate$p.gameshowell <- ptukey(res$intermediate$t.corrected*sqrt(2),
                                           res$intermediate$groups,
                                           res$intermediate$df.corrected,
                                           lower.tail=FALSE)  
  res$output$games.howell <- cbind(res$intermediate$t.corrected,
                                   res$intermediate$df.corrected,
                                   res$intermediate$p.gameshowell);
  rownames(res$output$games.howell) <- res$intermediate$pairNames;
  colnames(res$output$games.howell) <- c('t', 'df', 'p');

  ### Set class and return object
  class(res) <- 'posthocTukeyGamesHowell';
  return(res);

}

print.posthocTukeyGamesHowell <- function(x, digits=x$input$digits, ...) {
  print(x$intermediate$descriptives, digits=digits);
  cat('\n');
  if (x$input$method == 'tukey') {
    print(x$output$tukey, digits=digits);
  }
  else if (x$input$method == 'games-howell') {
    print(x$output$games.howell, digits=digits);
  }
}
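
In formulas, the Games-Howell part of the function above appears to compute the following for each pair of groups i and j. The notation is introduced here for illustration only: the group means, variances and sizes are written as ȳ, s² and n, k is the number of groups, and Q is a studentized range variable with k means and ν degrees of freedom (what `ptukey` evaluates).

```latex
t_{ij} = \frac{\lvert \bar{y}_i - \bar{y}_j \rvert}
              {\sqrt{\dfrac{s_i^2}{n_i} + \dfrac{s_j^2}{n_j}}},
\qquad
\nu_{ij} = \frac{\left(\dfrac{s_i^2}{n_i} + \dfrac{s_j^2}{n_j}\right)^2}
                {\dfrac{(s_i^2/n_i)^2}{n_i - 1} + \dfrac{(s_j^2/n_j)^2}{n_j - 1}},
\qquad
p_{ij} = \Pr\!\left(Q_{k,\,\nu_{ij}} > \sqrt{2}\, t_{ij}\right)
```

The first two expressions are the same statistic and degrees of freedom used by Welch's t test; only the reference distribution for the p-value differs.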

An example, using the `diamonds` dataset from the 'ggplot2' package:

> posthoc.tgh(y=diamonds$y, x=diamonds$cut);
              n means variances
Fair       1610   6.2      0.91
Good       4906   5.9      1.11
Very Good 12082   5.8      1.22
Premium   13791   5.9      1.59
Ideal     21551   5.5      1.15

                     t    df       p
Fair:Good         11.8  2985 0.0e+00
Fair:Very Good    16.0  2221 5.9e-11
Fair:Premium       9.1  2316 1.6e-11
Fair:Ideal        26.6  1925 3.1e-11
Good:Very Good     4.5  9497 7.7e-05
Good:Premium       5.1 10243 3.4e-06
Good:Ideal        19.8  7419 1.3e-11
Very Good:Premium 11.9 25871 0.0e+00
Very Good:Ideal   20.1 24473 0.0e+00
Premium:Ideal     32.7 26011 0.0e+00
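
If you want to sanity-check the numbers independently of the function above, here is a minimal base-R sketch of the same Games-Howell computation. It is not part of the function or the package: the simulated data, the seed, and the names `gh` and `w` are all made up for illustration. It also compares one pair against `t.test`, since the pairwise t and df should match Welch's t test.

```r
# Simulated data (an assumption for illustration): three normal groups
# with unequal sample sizes and unequal variances.
set.seed(42)
y <- c(rnorm(10, mean = 0, sd = 1),
       rnorm(25, mean = 1, sd = 3),
       rnorm(40, mean = 2, sd = 5))
x <- factor(rep(c("a", "b", "c"), times = c(10, 25, 40)))

# Per-group size, mean and variance, as plain vectors in level order
n <- as.vector(tapply(y, x, length))
m <- as.vector(tapply(y, x, mean))
v <- as.vector(tapply(y, x, var))
k <- nlevels(x)

# One row per pair of groups: t, Welch-Satterthwaite df, and the
# p-value from the studentized range distribution
gh <- t(apply(combn(k, 2), 2, function(ij) {
  se2 <- v[ij] / n[ij]                          # variance of each group mean
  t.stat <- abs(diff(m[ij])) / sqrt(sum(se2))
  df <- sum(se2)^2 / sum(se2^2 / (n[ij] - 1))
  p <- ptukey(t.stat * sqrt(2), nmeans = k, df = df, lower.tail = FALSE)
  c(t.stat, df, p)
}))
colnames(gh) <- c("t", "df", "p")
rownames(gh) <- combn(levels(x), 2, paste, collapse = ":")
print(gh)

# Cross-check one pair against Welch's t test: t and df should agree,
# while the p-values differ (Games-Howell accounts for all k groups)
w <- t.test(y[x == "a"], y[x == "b"], var.equal = FALSE)
all.equal(unname(abs(w$statistic)), unname(gh["a:b", "t"]))
all.equal(unname(w$parameter), unname(gh["a:b", "df"]))
```
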
Matherion
  • Update: this function is now available in the 'userfriendlyscience' package. – Matherion Mar 27 '15 at 13:09
  • There is a nice description of the Games-Howell test, complete with a working implementation by Aaron Schlegl, at https://rpubs.com/aaronsc32/games-howell-test . – Laryx Decidua Jun 02 '17 at 11:41
  • Thanks so much for adding this to the package. I'm a bit confused about the difference between the p and p adjusted columns, since I was under the impression that the Games-Howell test is an adjustment in itself. – Adam B Jun 07 '17 at 23:19
  • I must admit that I don't remember the rationale for having added that. I've changed the default from "holm" to "none" in [commit 51f61fb](https://github.com/Matherion/userfriendlyscience/commit/51f61fb47979768a3853fe45213d3309bca114f8); the change will be in version 0.6-2. – Matherion Jun 16 '17 at 16:13
  • Update: based on https://stackoverflow.com/questions/48280985/letters-group-games-howell-post-hoc-in-r, optional lettering of the groups/differences will be available in the next version of `userfriendlyscience`. – Matherion Jan 17 '18 at 15:30