I'm currently looking into the gender pay gap using data from Glassdoor (found via Kaggle). The dataset has columns for gender, age, performance evaluations, seniority, pay, etc.
For context: I have learned a lot of Data Science/Machine learning/programming over the past few years, and am just doing a few of my own basic portfolio projects for practice, before applying for jobs.
I have done a fairly naive t-test comparing average pay for men vs. average pay for women. I am now looking to add controls, comparing similar age groups, seniority levels, education levels, etc. I want to run more t-tests, as well as a chi-squared test and/or ANOVA.
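For reference, the naive test I ran looks roughly like this. The data below is simulated and the group sizes, means, and spreads are stand-ins I made up, not values from the Glassdoor dataset:

```python
# Minimal sketch of the naive two-group comparison on simulated data.
# The real dataset's columns (gender, pay) are only assumed here.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Simulated stand-ins for the two pay samples
pay_men = rng.normal(loc=98_000, scale=25_000, size=500)
pay_women = rng.normal(loc=92_000, scale=25_000, size=480)

# Welch's t-test (equal_var=False) avoids assuming equal variances
t_stat, p_value = stats.ttest_ind(pay_men, pay_women, equal_var=False)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```

Once I add controls, the idea would be to run the same test within each subgroup (e.g. per age band) rather than on the pooled samples.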
As I run multiple A/B tests, I want to avoid p-hacking. I have a few hypotheses; for example, I expect the pay gap to be greater for older age groups. But this is mostly exploring the data: I don't have a single hypothesis I am looking to prove for the entire study, nor do I have a political agenda.
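One thing I was considering, to guard against the multiple-testing problem, is applying a correction such as Holm-Bonferroni across all the subgroup tests. A sketch of that step-down procedure (the p-values below are hypothetical, standing in for results from several subgroup t-tests):

```python
# Pure-Python Holm-Bonferroni step-down correction.
def holm_correction(p_values, alpha=0.05):
    """Return a list of booleans: reject H0 for each test, in input order."""
    m = len(p_values)
    # Sort p-values ascending, remembering their original positions
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # Compare the k-th smallest p-value against alpha / (m - k)
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # step-down: stop at the first non-rejection
    return reject

# Hypothetical p-values from, say, five age-group comparisons
pvals = [0.001, 0.04, 0.03, 0.20, 0.008]
print(holm_correction(pvals))  # → [True, False, False, False, True]
```

Note how 0.03 and 0.04 would each pass an uncorrected 0.05 threshold but are no longer significant once the five comparisons are accounted for.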
I'm not sure it would really count as p-hacking as long as I choose the comparisons up front and report everything. I would think it's only p-hacking if I selectively reported the t-test results that support a hypothesis. Is this fair?
And another question (forgetting my data for a moment): since ANOVA compares multiple groups at once to look for significance, is that not itself p-hacking?