I've been caught in these circular loops many times. I think the bottom line of your question is:
"In summary, how can each flight be considered an independent variable, but also have an increasing chance of something occurring?"
As you said, we can use the geometric distribution, interpreted as the number of trials $k$ (or flights) survived before the first failure (crash), with per-trial probability of crashing $p$. The probability mass function, for $k = 0, 1, 2, \dots$, is:
$P(X = k) = (1 - p)^k \cdot p$.
So in our case the probability of a plane crash is $2.6/10^6$. Since every flight is independent from the prior, the probability of crashing on the first flight would be
$P(X=0)=(1-2.6/10^6)^0 \cdot 2.6/10^6 = 2.6/10^6$.
But we knew that... So what about crashing on the second flight? Now it's the probability of two independent outcomes: surviving the first flight, $1-2.6/10^6=0.9999974$ (pretty good), times the probability of a crash on the second flight, $2.6/10^6$; exactly the geometric distribution:
$P(X=1)=(1-2.6/10^6)^1 \cdot 2.6/10^6 \approx 2.6/10^6$. So you are right: for every additional trip the exponent of $(1-2.6/10^6)$ increases by $1$, shrinking the probability (an ever smaller fraction of $p$). By the same token, the probability of surviving all $n$ flights, $(1-p)^n$, also shrinks as $n$ grows, so we have better odds of surviving two flights than ten.
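As a quick numeric sketch of the pmf above (the variable and function names are mine, not from the question):

```python
p = 2.6 / 10**6  # per-flight crash probability from the question

def geom_pmf(k, p):
    """P(X = k): survive k flights, then crash on the next one."""
    return (1 - p) ** k * p

# Both values are approximately p, but P(X = 1) is slightly smaller
# because of the extra (1 - p) survival factor.
print(geom_pmf(0, p))  # exactly p
print(geom_pmf(1, p))  # a hair below p
```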
But you knew that. Your question is more philosophical, am I right? Along the lines of: why are we not discussing a memoryless process, since the events are independent? If so, I wonder if the answer lies in the difference between the elementary building blocks or atoms of the sample space (the so-called outcomes), i.e. $\Omega =\{ L=\text{land},\,C=\text{crash} \}$, a binary-outcome situation in our case, versus the events formed from ordered sequences of these elementary outcomes.

Of note, when we repeat the experiment we expand the sample space accordingly: each repetition draws from the initial sample space, and in aggregate a new sample space is formed in which order matters (experiment 1 is followed by experiment 2), denoted $\Omega^n$. Since each position takes one of our two outcomes, there are $2^n$ ordered tuples, with $n$ the number of repetitions. Whether we are referring to a single or a repeated experiment, an event is a subset of the sample space. For instance, in $\Omega^3=\{LLL,LLC,LCL,LCC,CLL,CLC,CCL,CCC\}$, the event "crashing on the third flight" is $\text{Third}=\{(L,\,L,\,C)\}$, and this event (comprising three flights) carries a probability of its own, different from the probability of each elementary outcome of $\Omega$. The reality of our example is such that any sequence with the outcome $C$ positioned before the last one, such as $(L,C,L)$, has probability zero: there is no flight after a crash.
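To make the sample-space bookkeeping concrete, here is a small illustration of my own that enumerates $\Omega^3$ and filters out the physically impossible sequences:

```python
from itertools import product

omega = ("L", "C")                       # elementary outcomes: land, crash
omega3 = list(product(omega, repeat=3))  # the 2^3 = 8 ordered tuples

def possible(seq):
    # A crash ends the experiment, so C may only occupy the last position.
    return "C" not in seq[:-1]

# Only LLL (still flying) and LLC (crash on the third flight) remain.
print([s for s in omega3 if possible(s)])
```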
Let me see if we can separate these ideas from the known memoryless property of the geometric distribution.
We have gotten to the point where we agree that it's easier to make it home if you have to survive two flights rather than four. Is this fair, based on the prior discussion? OK, so we take two flights... we go on vacation to Amsterdam with direct flights from JFK. The probability of making it alive is $P(X\geq 2)=(1-2.6/10^6)^2=0.9999948$, the total probability of all events with $k\geq 2$. Proof:
$\displaystyle P(X\geq k) =\sum_{i=k}^\infty\,(1-p)^i \,p = (1-p)^k\,p\,\sum_{i=0}^{\infty}\,(1-p)^i$. Since $\displaystyle \sum_{i=0}^{\infty}(1-p)^i$ is a convergent geometric series, it equals $\frac{1}{1-(1-p)}=\frac{1}{p}$. Hence $P(X\geq k) =(1-p)^k$.
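A quick numeric sanity check of the identity $P(X \geq k) = (1-p)^k$; I deliberately use a larger $p$ than the crash rate so that a short partial sum is effectively the full infinite sum:

```python
p, k = 0.3, 4  # illustrative values, not the crash rate

closed_form = (1 - p) ** k
# Partial sum of (1 - p)^i * p for i = k, ..., k + 199; the neglected tail
# is on the order of (1 - p)^200, i.e. negligible.
partial_sum = sum((1 - p) ** i * p for i in range(k, k + 200))

print(abs(closed_form - partial_sum) < 1e-12)  # True
```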
Now, critically, this corresponds to the sum of the probabilities of the events $\{LLC\}$, $\{LLLC\}$, $\{LLLLC\}$, $\{LLLLLC\}$, and so on ad infinitum. We have seen that each of these probabilities is slightly lower than the one before, and we add them all up because any of these events is favorable to us: we make it back home.
On the other hand, if we had a layover at Heathrow, our chances would be infinitesimally reduced to $P(X\geq 4)=(1-2.6/10^6)^4=0.9999896$, but still excellent by any account. These two statements should be a point of agreement. This probability corresponds to the sum of the probabilities of the events $\{LLLLC\}$, $\{LLLLLC\}$, and onwards to infinity. Evidently, then, this probability can't be the same as the probability of making it home without a layover; it is minimally lower.
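Plugging in the numbers for the two itineraries (a sketch with my own variable names):

```python
p = 2.6 / 10**6  # per-flight crash probability

direct = (1 - p) ** 2   # JFK -> AMS round trip, two flights: ~0.9999948
layover = (1 - p) ** 4  # via Heathrow, four flights: ~0.9999896

print(direct > layover)  # True: the direct routing is (barely) safer
```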
Now to the memorylessness... The previous discussion is not in contradiction with the idea that our probability of making it to Amsterdam (leaving aside whether or not we have a layover in Heathrow) is exactly the same whether we simply took a cab from Manhattan, or we live in Boston and had to take an additional flight to JFK, knowing that we made it to JFK. In other words, just because we survived the flight from Boston, we haven't altered our odds of making it to Amsterdam. This is the memorylessness:
$P(X \geq m + n \mid X \geq m) = P(X \geq n)$, where $m$ counts the flights already survived (BOS to NYC) and $n$ the flights still ahead (NYC to AMS).
Or more formally derived here.
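The memoryless identity can also be verified numerically (a sketch; $m$ and $n$ are my own labels for the flights already survived and the flights still ahead):

```python
p = 2.6 / 10**6  # per-flight crash probability
m, n = 1, 1      # one flight BOS -> NYC survived, one flight NYC -> AMS to go

def tail(k):
    """P(X >= k) = (1 - p)^k: survive at least k flights."""
    return (1 - p) ** k

# P(X >= m + n | X >= m) = P(X >= m + n) / P(X >= m)
conditional = tail(m + n) / tail(m)
print(abs(conditional - tail(n)) < 1e-15)  # True
```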
I know anybody still reading has concluded that the odds of survival have plummeted at the thought of a taxi ride, but I hope this is formally correct-ish.