The derivation of a string from a Context-Free Grammar (CFG) can be visualized as a parse tree. The root of the parse tree is always the start variable. Each branch corresponds to a rule in the CFG.
Derivations and Parse Trees
Consider the following grammar
This grammar can produce the string using the following derivation
An equivalent parse-tree for the above derivation is
The same string can have another derivation
However, this derivation generates the same parse-tree as above.
Thus, two different derivations may point to the same parse-tree.
Ambiguity
The above grammar can have two different parse trees below to generate .
A left-most derivation is a derivation in which the left-most variable is replaced in each step.
Every left-most derivation yields a parse-tree. Every parse-tree yields a left-most derivation. These are in one-to-one correspondence.
A CFG is ambiguous if some string in its language has two or more parse-trees (left-most derivations).
Ambiguity is a property of the grammar, not of the language. Some languages are inherently ambiguous. It is impossible to define an unambiguous grammar for these languages.
Chomsky Normal Form
A CFG in Chomsky Normal Form (CNF) will generate a parse-tree which is a binary tree. This is because each rule is of the form or , where and are variables and is a terminal.
Each internal node of the binary tree will be a variable. The leaves of the binary tree will be a terminal.
Consider a parse-tree to generate a string of length
This parse-tree will have leaves – one for each symbol in the string. These correspond to type rules. A binary tree with leaves will have internal nodes. These correspond to type rules.
Thus, in total, there will be application of rules to generate a string of length .
If however, , the string generated is . In this case, there will be only one application of the rule