Language theoretic comparison of LL and LR grammars

69

38

People often say that LR(k) parsers are more powerful than LL(k) parsers. These statements are vague most of the time; in particular, should we compare the classes for a fixed $k$ or the union over all $k$? So how is the situation really? In particular, I am interested in how LL(*) fits in.

As far as I know, the respective sets of grammars LL and LR parsers accept are orthogonal, so let us talk about the languages generated by the respective sets of grammars. Let $LR(k)$ denote the class of languages generated by grammars that can be parsed by an $LR(k)$ parser, and similar for other classes.

I am interested in the following relations:

  • $LL(k) \overset{?}{\subseteq} LR(k)$
  • $\bigcup_{i=1}^{\infty} LL(k) \overset{?}{\subseteq} \bigcup_{i=1}^{\infty} LR(k)$
  • $\bigcup_{i=1}^{\infty} LL(k) \overset{?}{=} LL(*)$
  • $LL(*) \overset{?}{\circ} \bigcup_{i=1}^{\infty} LR(k)$

Some of these are probably easy; my goal is to collect a "complete" comparison. References are appreciated.

Raphael

Posted 2012-03-07T00:32:31.947

Reputation: 69 237

2

Maybe this can help you! Grammar Hierarchy image

– Andrea Tucci – 2012-07-11T15:32:29.130

1@AndreaTucci: Yes, but that only covers the grammars, not the generated languages. – Raphael – 2012-07-11T18:06:20.200

Answers

62

There are numerous containments known. Let $\subseteq$ denote containment and $\subset$ proper containment. Let $\times$ denote incomparability.

Let $LL = \bigcup_k LL(k)$, $LR = \bigcup_k LR(k)$.

Grammar level

For LL

  • $LL(0) \subset LL(1) \subset LL(2) \subset LL(2) \subset \cdots \subset LL(k) \subset \cdots \subset LL \subset LL(*)$
  • $SLL(1) = LL(1), SLL(k) \subset LL(k), SLL(k+1) \times LL(k)$

Most of these are proven in Properties of deterministic top down grammars by Rosenkrantz and Stearns. $SLL(k+1) \times LL(k)$ is a rather trivial exercise. This presentation by Terence Parr places $LL(*)$ on slide 13. The paper LL-regular grammars by Jarzabek and Krawczyk show $LL \subset LLR$, and their proof trivially extends to $LL \subset LL(*)$

For LR

  • $LR(0) \subset SLR(1) \subset LALR(1) \subset LR(1)$
  • $SLR(k) \subset LALR(k) \subset LR(k)$
  • $SLR(1) \subset SLR(2) \subset \cdots \subset SLR(k)$
  • $LALR(1) \subset LALR(2) \subset \cdots \subset LALR(k)$
  • $LR(0) \subset LR(1) \subset LR(2) \subset \cdots \subset LR(k) \subset \cdots \subset LR$

These are all simple exercises.

LL versus LR

  • $LL(k) \subset LR(k)$ (Properties of deterministic top down grammars plus any left recursive grammar)
  • $LL(k) \times SLR(k), LALR(k), LR(k-1)$ (simple exercise)
  • $LL \subset LR$ (any left recursive grammar)
  • $LL(*) \times LR$ (left recursion versus arbitrary lookahead)

Language level

For LL

  • $LL(0) \subset LL(1) \subset LL(2) \subset \cdots \subset LL(k) \subset \cdots \subset LL \subset LL(*)$
  • $SLL(k) = LL(k)$

Most of these are proven in Properties of deterministic top down grammars. The equivalence problem for LL- and LR-regular grammars by Nijholt makes references to papers showing $LL(k) \subset LL(*)$. The paper LL-regular grammars by Jarzabek and Krawczyk show $LL \subset LLR$, and their proof trivially extends to $LL \subset LL(*)$

For LR

  • $LR(0) \subset SLR(1) = LALR(1) = LR(1) = SLR(k) = LALR(k) = LR(k) = LR$

Some of these were proven by Knuth in his paper On the Translation of Languages from Left to Right in which he introduced LR(k), the rest are proven in Transforming LR(k) Grammars to LR(1), SLR(1), and (1,1) Bounded Right-Context Grammars by Mickunas et al.

LL versus LR

  • $LL \subset LR(1)$ (containment follows from the above, $\{ a^i b^j | i \geq j \}$ is the canonical example for strict containment)
  • $LL(*) \times LR$ (the language $\{ a^i b^j | i \geq j \}$ shows half the claim, and the introduction of The equivalence problem for LL- and LR-regular grammars by Nijholt makes references to papers showing the other half)
  • $LR(1) = DCFL$ (see e.g. reference here).

Alex ten Brink

Posted 2012-03-07T00:32:31.947

Reputation: 8 716

Surprised at how languages have equality when grammars have containment. For grammars, we have $LR(0)\subset SLR(1)\subset LALR(1)\subset LR(1)$. For languages, we have $LR(0)\subset SLR(1) = LALR(1)=LR(1)$. Why is this so? – anir – 2017-12-16T07:12:09.940

What does it mean "$SLL(k+1) \times LL(k)$ is a rather trivial exercise."? Which statement is trivial? What does "$\times$" mean here? If it is the Cartesian product operation, or any other operation, the sentence does not make sense. – Alexey – 2018-11-03T12:09:58.003

@alexey: $\times$ in this answer means "no containment relationship". In other words, $A\times B$ means that both $A-B$ and $B-A$ are non-empty. – rici – 2019-02-15T15:39:49.077

@rici, where does this use of "times" symbol come from? This is not what it normally means. In fact, i've never heard of using "times" for a relation instead of an operation... – Alexey – 2019-02-15T19:04:02.207

@alexey: it should be a big X. (As in Not, or crossed-out). There is probably something more X-like than the times sign, but I've seen it used that way before. – rici – 2019-02-15T21:29:04.467

@rici, the symbol used here is times, but even for X, where does this usage come from? I doubt this to be coming from mathematics... – Alexey – 2019-02-15T21:35:09.247

@akexey I guess it comes from X mark often used in opposition to check mark. So it has a negative connotation in common usage. There are some very clunky symbols for "incomparable"; according to Wikipedia, it is $\mid\mid$ but I wonder if that would have been obvious to a casual reader. Indeed, writing the two bars crossed feels more incomparable to me :-)

– rici – 2019-02-15T22:48:53.257

Excellent answer though, I had already upvoted. I would have thought Frank deRemer proved LALR <= LR in his original LALR paper? (1969?) – user207421 – 2012-04-11T18:43:25.120

@EJP: $LALR(k) \subseteq LR(k)$ is immediate if you define $LALR$ by collapsing $LR$ states. deRemer only proved that his construction created the same parser. The grammar classes are not equal, which is a nice exercise (or even a nice question for this site). The language class equality is a lot more tricky: read the paper if you really want the details ;) – Alex ten Brink – 2012-04-11T22:58:34.867

1@AlextenBrink I did read the paper, and was taught by Frank de Remer, but it's 30+ years ago ;-) Thanks for all the detail. – user207421 – 2012-04-12T00:13:57.423

It might be nice to collect example grammars for each inequality. – o11c – 2015-05-31T05:51:55.543

1@o11c I think that would overburden a single answer. My impression is that Alex gave good references where necessary; he states "easy exercise" for some. I guess if a reader can not come up with a grammar, they can post a new question asking for that specific case. – Raphael – 2015-06-18T10:33:15.510