Definition
A Context-Free Grammar (CFG) is in Chomsky Normal Form (CNF) if every rule is of the form
or
or
where , , and is the start variable. Note that the start variable cannot be present in the RHS of any rule.
CFG to a CFG in CNF
Consider the CFG to generate strings with balanced paranthesis
Every CFG can be converted to a CFG in CNF by following the steps below.
1. No start variable on RHS of a rule
This can be done by adding a new start variable .
2. Null Variables
a. Determine which variables are nullable
A variable is nullable if it can generate . The nullable variables for the above grammar are
b. Whenever and is nullable, add rule
In essence, replace every instance of a nullable variable with and consider all possiblities.
c. Remove all rules except for (if present)
Now that all nullable variables generating is accounted for, these rules can be removed.
3. Eliminate Unit Rules
Unit rules are of the type where both and are variables. Replacing with ‘s RHS expansion will eliminate this unit rule.
4. Ensure that every RHS of length at least contains only variables
This step involves creating new rules of the type where is a variable and is a terminal. Define the following new rules
The grammar is then modified to
5. Breakup RHS of length greater than
RHS of length greater than can broken down into sequential chunks of . For example can be written as
The final grammar in CNF is as follows