原文連接:https://www.cs.uic.edu/~spopuri/cparser.htmlhtml
Satya Kiran Popuri
Graduate Student
University of Illinois at Chicago
Chicago, IL 60607
spopur2 [at] uic [dot] edu
Wed Sep 13 12:24:25 CDT 2006redis
Bison is a general-purpose parser generator that converts an annotated context-free grammar into an LALR(1) or GLR parser for that grammar. The generated parser is implemented as a C or C++ program with a parsing function that can be called from application programs.express
It can be useful to understand the generated parser code in some situations. You may want to hand optimize the parser, or split the big yyparse() function into several smaller functions, or you want to understand how a real world parser is implemented as compared to the `theoretical' ways of compiler design books.app
This document is an attempt to describe the implementation of an LALR(1) parser in C as generated by Bison 2.3. I used a simple grammar to demonstrate the working of the parser and the nature of parsing tables. You will also find a comparison of these tables to the uncompressed tabular scheme given in the popular book "Compilers - Principles, Techniques and Tools" by Aho, Sethi and Ullman, also called as the "Dragon Book" and many other books on compiler design.less
I am assuming that you are familiar with context free grammars and LR parsing techniques; a little hands on experience with Bison will also help. You must be familiar with the theory and the usual terminology like shifts, reductions, parse stacks and error handling. If you are not familiar with Bison at all, I strongly recommend that you read the Bison user manual before reading this document.dom
An LR parser is essentially a shift-reduce (bottom up) parsing algorithm driven by parsing tables and a stack. For an overview of the LR parsing algorithm, you can refer to the Dragon book or this excellent wikipedia entry.electron
The parsing algorithm is same (at least in theory) for all LR parsers. They differ only in the tables generated by the parser generator. In our case, Bison generates LALR(1) parsing tables in the form of C/C++ arrays. The generated yyparse() function implements a parsing routine that uses these tables and a stack to guide the parse.ide
I will use the following little grammar as an example to demonstrate the internals of a C parser generated by Bison. It will also be used to show the traditional "theoretical" tables and the real world parsing tables and the relationship between them.oop
A Bison grammar specification for this grammar is like this:ui
%% L : L ';' E | E ; E : E ',' P | P ; P : 'a' | (M) ; M : /* nothing */ | L ; %%
The (uncompressed) tabular scheme described in the popular Dragon book and many other books on compiler theory serves well to gain a theoretical understanding of the working of an LR parser. Let us examine this scheme for the grammar given in the previous section.
The action and GOTO parts of the table are shown below. This LALR(1) table is calculated by hand.
action | GOTO | ||||||||||
state | ; | , | a | ( | ) | $ | L | E | P | M | |
0 | s12 | s6 | 1 | 8 | 9 | ||||||
1 | s2 | acc | |||||||||
2 | s12 | s6 | 3 | 9 | |||||||
3 | r1 | s4 | r1 | r1 | |||||||
4 | s12 | s6 | 5 | ||||||||
5 | r3 | r3 | r3 | r3 | |||||||
6 | s12 | s6 | r7 | 7 | 8 | 9 | 10 | ||||
7 | s2 | r8 | |||||||||
8 | r2 | s4 | r2 | r2 | |||||||
9 | r4 | r4 | r4 | r4 | |||||||
10 | s11 | ||||||||||
11 | r6 | r6 | r6 | ||||||||
12 | r5 | r5 | r5 | r5 |
The resulting automaton has 13 states. A string belongs to the language generated by this grammar if we encounter the end of string while in state 1. The working of an LR parsing algorithm based on this type of table is fairly simple and is clearly explained in the wikipedia entry and in standard compiler theory books;
We can see that most of the table is either empty or filled with repeated entries. The empty slots in the action part of the table are errors; The GOTO part of the table has only 4 valid entries with the rest of the cells empty. These empty slots are not errors.. they are simply unreachable; It is impossible to detect an error on a non-terminal symbol because of the correctness of the LR algorithm.
A real world useful grammar would end up with hundreds of states and thousands of empty slots in such a table. There is a lot of scope for optimization in a practical parser generator. In fact many books including the Dragon book suggest some scheme for compression of parsing tables. Bison takes a different approach to table compression and so far I haven't come across this method in any of the standard text books.
Instead of generating a big sparse matrix like the one shown above, Bison generates several smaller one dimensional tables as C arrays. Not all tables are used directly by the parsing routine - some are used to print debugging information and error recovery. Before diving into the essential tables, I must describe some unique concepts utilized by Bison that are not used in theoretical table computing methods.
Bison augments your input grammar with the following rule:
$accept : <start-symbol> $end
$accept and $end are fictitious symbols. This is somewhat similar to augmenting the grammar with a new start symbol as described by most books. In addition to the new start symbol $accept, Bison also has a terminal symbol $end which will help generate a "final" state of the automaton. This final state always has just one production:
$accept : <start-symbol> $end .
with the dot at the end of the augmented rule. This helps the parsing routine recognize when to accept the input string just by looking at the current state. Also it obviates the necessity to represent "accept" as an entry in the LALR tables. There is no notion of "final state" in theoretical LALR parsing (although it exists in an LR(0) characteristic FSM), but Bison uses one state as the final state by introducing the $end symbol.
One important implication of this augmentation is that the "Rule numbering" will now change for Bison generated tables. The $accept rule is considered rule #1, so in the tables produced by Bison, an r6 will be denoted by r7 and so on.
In states where there is a reduction, the action for many symbols will be the same. For example in the table above, states 5, 8, 9, and 11 have a common reduce action for different terminal symbols. Some substantial space saving is made by making the most common reduction in that state a "default" action. For example state 5 can have the default action as r3. The parser generated by Bison will do a reduction using rule #3 if it finds itself in state 5 regardless of the look-ahead symbol! This might lead to some illegal reductions. But it will not matter because the error will be caught ultimately before another shift can take place.
Taking the above two aspects into consideration, our dragon book style table will now change considerably. There are now 14 states due to the additional $end terminal symbol. And there is no "accept" entry in the table any more. State 8 simply is an accepting state.
The table given below is constructed directly from the .output report file generated by Bison 2.3. You can generate this report file by using the command line option --report=all. This is not the table that will be used for parsing in a Bison generated parser. This table here is only to gain a perspective on how the final tables will look like.
Table constructed from Bison report | Hand calculated table - rearranged | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
The table constructed from the Bison report file is matched with our hand computed table. State 1 generated by Bison corresponds to state 12 in our table and so on. Of course there is no state equivalent to state 8 in the hand computed table because we didn't have the $end symbol and the notion of a final state.
Bison assigns a symbol number to each grammar symbol (terminal or non-terminal). Of course Bison has its own symbol table like any other compiler (Bison is a compiler compiler remember). As a rule, Bison always assigns numbers to terminal symbols first and then proceeds to non-terminals. i.e Non-terminals symbols always have higher numbers than terminals. Also, Bison introduces 4 new symbols in the symbol table:
$accept : The new start (non-terminal) symbol of the grammar.
$end : The fictitious end symbol. Symbol number 0 is reserved for this symbol.
error : The 'built-in' error symbol used for error productions. Symbol number 1 is reserved for it.
$undefined : A symbol that represents a token that Bison cannot recognize. (It is not a part of the grammar or not defined using %token). Symbol number 2 is reserved for this symbol.
Numbers for terminals begin from 3 onwards. You can check the symbol numbers assigned to various symbols by looking at the yytname array generated in the output parser. For our example grammar, the symbol numbers look like this:
Symbol | Number |
$end | 0 |
error | 1 |
$undefined | 2 |
';' | 3 |
',' | 4 |
'a' | 5 |
'(' | 6 |
')' | 7 |
$accept | 8 |
L | 9 |
E | 10 |
P | 11 |
M | 12 |
Symbols 0 through 7 are terminals and the rest are non-terminals. These symbol numbers are used to index various tables as we will see in a moment.
Let us look at how we can optimally represent our 'modified traditional table'. The table is still a sparse matrix with a lot of white space and repeated information. We can start by collecting all repeated data at one place. The default reductions are our prime targets. They can all be listed in an array indexed by state number:
default_reductions[] = {none,r5,r7,none,r2,r4,r8,none,none,none,none,r6,r1,r3}
We did waste some space for states where there are no default reductions, but that is far less than all the locations used up for repeating reduce actions. Some of the states have only reduce actions and nothing else. So, these rows are pretty much "taken care of" by the above array. Observe that we have listed a default reduction for state 2 (r7) even though it had shift actions on a couple of symbols. Hence, this array may be used only after ensuring that there are no shift actions for the current state on the current look-ahead symbol.
The next obvious target would be the GOTO part of the table which is mostly empty. Since most states do not have a GOTO part, the best space saving scheme here would be to have an array indexed by non-terminal (column) instead of state (row). But each non-terminal can have a different GOTO state depending on the current state of the automaton. So, we choose to list out the most popular GOTO state of each non-terminal in an array like this:
default_gotos[] = { 3, 4, 5, 7 }
That's one entry each for non-terminals L, E, P, M. We still have to take care of the other GOTO states for L, E and P. For now this table has saved us a lot of wasted GOTO white space.
The task now has been reduced to representing the shift actions properly for each state and some GOTOs. This is what is left of the action and GOTO tables. I have zapped default reductions and also states that have only default reductions. Also the terminals and non-terminals are now replaced by their symbol numbers and instead of representing shifts as s1, s2 and so on, I have used only the state numbers as there are no reduction entries now:
|
|
To compress this part of the table, Bison follows a method described by Tarjan and Yao [3]. It is a fairly complex method combining the idea of Trie data structures and "double displacement" with some of the authors' own ideas thrown in to improve time and space complexity results.
Double displacement is very straight forward. We flatten out the above 2-D table into a simple one dimensional array by mapping all non-white-space entries into some location of the array. We keep the order of elements in each row intact and displace each row by some amount such that no two non-white-space entries in each row occupy the same position in the one-dimensional array. Our mapping will be defined by a set of displacements D(i) for each row i; position (i,j) in the above table is mapped to position D(i) + j in our one-dimensional array. Here is a sample displacement table and the corresponding one dimensional table:
D = {4,6,0,0,12,9,8,0,13} and T = {9,10,1,2,11,8,1,2,1,2,1,2,9,11,10}
The displacement table D will act as a directory to one dimensional table T that contains the actual entries from the 2-D table; the offset into T for element (i,j) is calculated by D[i] + j. e.g., to find the entry at (3,1)(action for state 4 on symbol 4) in the 2-D table above, we compute D[3] + 1 = 0+1 = 1 and hence T[1] = 10 is the entry we really want.
Even with the above scheme we have a lot of repeated entries in the table T which are really the same states (e.g, states 1 and 2). So this method is combined with column displacements and "Tries" to obtain a more compressed table organization. If this discussion has whet your appetite for more, you can refer to [3] for more details.
The final objective of Bison is to build a one-dimensional table T that specifies what to do in each state. This table will be indexed by a (one dimensional) directory table D. In the above discussion, displacements into T are indexed by the 2-D array index (i,j). But we would rather want to index into T by state number and symbol number instead (since the rows and columns of the 2D table are headed by states and symbols). Bison indexes D by state number and displacement into T by symbol number. If the next look-ahead symbol has number k, table entry for current state can be retrieved as follows:
Action for state n on symbol y = T[ D[n] + k ]
As an example, the action for state 0 is s1 on symbol number 5 ('a'). So if T[0] = 1 then D[0] = -5; to account for action s2 on symbol number 6, we would have T[ D[0] + 6] = T[1] = 2 and so on;
But there is a special case that we need to take care of. There can be some 'explicit' errors in the action table that cannot be over written by default reductions. These errors are due to any %nonassoc declarations in the grammar. In that case, we would have some reductions in the action part "not taken care of" by the default_reductions[] array.
To represent these reductions in the same table T, Bison generates negated rule numbers in T. The negative sign is just to differentiate the shifts (which are positive and represent state numbers) from the reductions. The explicit errors will be represented by another negative number which is out of range of rule numbers and defined explicitly by a macro (see YYTABLE_NINF below). We do not have this situation in our grammar.
Also D will have a specially defined negative value that will indicate that the current state has only a default reduction (like state 1 for example). The parser would always go for a default reduction if this value is encountered in D.
A similar directory table ND (say) can be built for the non-terminals symbols that have non-default GOTO states. This table will contain one reference entry for each non-terminal symbol. These entries are displacements in T, but indexed by state number.
If ND[P] = 2, and the current state is 10, we refer to T[12] to get the GOTO of state 10 on P. But given that current state S, the parser needs to choose between ND and default_gotos to pick the GOTO state. Here we can make ND[S] to be a value that is out of bounds of table T and hence force the parser to pick the GOTO from default_gotos.
Bison also has a guard table that is checked to see if we are within legal bounds of table T. Read on for the actual descriptions of these tables to connect all the dots.
Bison generates parsing tables as linear arrays as seen in previous sections. All variable names in the output parser begin with 'yy' to avoid naming conflicts with user programs. What follows is a brief description of each table with some examples referring to the sample grammar.
This table maps lexical token numbers to their symbol numbers. If you have %token declarations in your grammar, Bison assigns token numbers to the different tokens; If you just use character representations like we have done in our grammar above, Bison just maps their ASCII values to the symbol numbers. So, in our case, yytranslate[97] = 5 which is 'a' (see symbol table above). The full listing of yytranslate is given below.
/* YYTRANSLATE[YYLEX] -- Bison symbol number corresponding to YYLEX. */ static const yytype_uint8 yytranslate[] = { 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 6, 7, 2, 2, 4, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 5, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2 };
Most entries point to symbol #2 which is $undefined. Only the symbols defined in our grammar have been given valid symbol numbers. Whenever yyparse() requires a token, it calls yylex and then translates the returned token number into its internal symbol number using this table.
This table lists default reductions for each state. yydefact[state] = rule number to use for a default reduction in that state. Here is the yydefact table produced for our grammar:
/* YYDEFACT[STATE-NAME] -- Default rule to reduce with in state STATE-NUM when YYTABLE doesn't specify something else to do. Zero means the default is an error. */ static const yytype_uint8 yydefact[] = { 0, 6, 8, 0, 3, 5, 9, 0, 1, 0, 0, 7, 2, 4 };
I will get to YYTABLE (mentioned in the comment) in a moment. Note that a rule number 0 in this table means error. Rule number 0 is not used internally by Bison because of some limitations of the internal representation of the input grammar.
If you take a look back at our modified dragon book style table, it is easy see what yydefact is trying to say. One important observation is in order here - rule numbers are incremented by 1 in this table because of the additional rule ($accept → L $end). So what was really r5 for state 1, has become r6 in yydefact.
This is a compressed form of the GOTO part of our traditional table. It has as many entries as there are non-terminals in the grammar. Each entry specifies the stateto transition to on each non-terminal. So here goes:
static const yytype_int8 yydefgoto[] = { -1, 3, 4, 5, 7 };
yydefgoto[nth non-terminal] = most common GOTO state for the nth non-terminal.
n starts from zero. An index into this array is obtained by subtracting the number of terminal symbols from the symbol number of the non-terminal. For example, the symbol number for E is 10 and number of tokens in the grammar is 8, so:
Thus yydefgoto[E] = yydefgoto[10-8] = state 4.
yydefgoto is consulted whenever the parser reduces stack contents using a rule. Later we will see that yydefgoto is consulted only after checking with another goto table, and that will explain how we manage to go to state 6 from state 2 on L instead of going to state 3 as this table specifies for L.
As a final note observe that the entry for the zeroth non-terminal ($accept) is -1. The stack will never be reduced with the $accept rule.
yyr1 specifies the symbol number of the LHS of each rule. Remember that 0 is never used as a rule number, so this table has NRULES + 1 entries, where NRULES is the number of rules in the grammar (Which is 9 in our case). Here is the listing:
/* YYR1[YYN] -- Symbol number of symbol that rule YYN derives. */ static const yytype_uint8 yyr1[] = { 0, 8, 9, 9, 10, 10, 11, 11, 12, 12 };
So, rule #1 has $accept as LHS, and hence yyr1[1] = 8 (see symbol table given previously) and so on. When a reduction takes place, We need to know the LHS symbol of the rule used for reduction to transition to an appropriate state. That is where this table comes into use.
yyr2 specifies the length (number of symbols) of the right hand side of each rule. Here is a listing produced by Bison:
/* YYR2[YYN] -- Number of symbols composing right hand side of rule YYN. */ static const yytype_uint8 yyr2[] = { 0, 2, 3, 1, 3, 1, 1, 3, 0, 1 };
Rule #2 (L → L;E) has 3 symbols on the RHS, and hence yyr2[2] = 3. This table is also used at the time of a reduction. The number of states to be popped off the stack is same as the number of symbols on the right hand side of the reducing rule.
This table is a mixed bag of state numbers and rule numbers in some pre-calculated order. This is the table T we discussed in the previous section. yytable works closely with yycheck, yypact and yypgoto tables (described below) to indicate to the parser either the state to pushed next on to the parse stack or a rule to use for reduction. I will just dump the table listing here. The entries won't make sense until we look at yypact and yypgoto.
/* YYTABLE[YYPACT[STATE-NUM]]. What to do in state STATE-NUM. If positive, shift that token. If negative, reduce the rule which number is the opposite. If zero, do what YYDEFACT says. If YYTABLE_NINF, syntax error. */ #define YYTABLE_NINF -1 static const yytype_uint8 yytable[] = { 8, 1, 2, 9, 11, 10, 9, 6, 12, 0, 0, 0, 13 };
One thing worth noting here is the definition of YYTABLE_NINF - the "negative infinity" value for yytable is the highest negative entry which in our case is -1 (since there are no negative values in yytable). This value is used to determine explicit error situations.
This table gives a reference to GOTO entries for non-terminals that can transition the automaton to different states based on previous state. For example, the symbol L can take the automaton to state 3 if the present state is 0, but the GOTO on L for state 2 is state 6. Similarly for symbols E and P there are different GOTO states based on the current state. The most common GOTO is already defined in YYDEFGOTO. The job of YYPGOTO is to indicate the anomalies. Lets take a look:
/* YYPGOTO[NTERM-NUM]. */ static const yytype_int8 yypgoto[] = { -5, 5, -1, 2, -5 };
That's our ND table that we discussed in the compressing parsing tables section. It is indexed by non-terminal symbol number; One entry each for $accept, L, E, P, M.
Let us say the current state on top of the stack after reduction by rule #5 (P → a) is state 10 (see example parse in the appendix). Now the GOTO transition for state 10 on P is really state 13 (according to our traditional tables), which is not the yydefgoto entry for P.
The parser adds yypgoto[P] i.e yypgoto[3] to the current exposed state number (state 10). The result is 12 (since yypgoto[3]=2). Now yytable[12] happens to be 13, so this is the new state to be pushed on to the stack.
How does the parser know that it has to pick the state value from yytable (via yypgoto) and not from yydefgoto? For this, there is a guard table called yycheck that will indicate this fact. It is only after checking with this table that a proper pick is made. We will see this operation in the section describing the parsing routine yyparse().
This table specifies the portion of yytable that describes what to do in state S. It is indexed by the symbol number of the token symbols. This is like the directory table D that was described in the previous section on compressing parsing tables.
#define YYPACT_NINF -5 static const yytype_int8 yypact[] = { -4, -5, -4, 0, 1, -5, 3, -3, -5, -4, -4, -5, 1, -5 };
This is the first table consulted by the parsing loop. If yypact[cur-state] = YYPACT_NINF, its time for a default reduction using yydefact. This means that the state has only reductions as in state 1 (r5); otherwise the entry in yypact[cur-state] is to be added to the symbol number of the current look-ahead token and the resulting number is used as an index into yytable to get the next action (see comments in yytable code).An example:
Suppose we are in state zero and the look-ahead token is 'a'(symbol number 5). Now yypact[0] is -4, so the index into yytable is 5-4 = 1. yytable[1] is 1 which means "shift this symbol and go to state 1". See yycheck below for some more checking information. If the yytable entry specifies a negative value say -3, it means that we should be reducing the stack with rule #3.
This is like a guard table. This table is used for various checks. Again this table is another mixed bag - of symbol numbers and state numbers. There is a very good explanation for this table inside Bison source code (src/tables.h). So I will just quote that here:
YYCHECK = a vector indexed in parallel with YYTABLE. It indicates, in a roundabout way, the bounds of the portion you are trying to examine. Suppose that the portion of YYTABLE starts at index P and the index to be examined within the portion is I. Then if YYCHECK[P+I] != I, I is outside the bounds of what is actually allocated, and the default (from YYDEFACT or YYDEFGOTO) should be used. Otherwise, YYTABLE[P+I] should be used.
Here is the table listing:
static const yytype_int8 yycheck[] = { 0, 5, 6, 3, 7, 4, 3, 2, 9, -1, -1, -1, 10 };
Examples of usage:
Suppose we are in state 0 and the look-ahead token in 'a'. yypact[0] = -4 and this, added to symbol number of 'a' (5) gives the index in yytable as we have seen before. Before accessing yytable[1] and proceeding to shift the token, the parser must check that yycheck[1] has the symbol number for the current token. If the test fails, it is time for a default reduction! But in our case, yycheck[1] contains 5 which really is the symbol number 'a', so we can safely consult yytable for what to do next.
yycheck also helps with special case reductions where the GOTO on a non-terminal for the current state is not the most common state (specified by yydefgoto); Consider the same example given in the yypgoto section: we were in state #10 and the reduction rule was rule #5 (P → a). We added yypgoto[P]=2 to current state number (10) to get 12. Before consulting yytable the parser checks with yycheck[12]. If this value is 10, then we know that state# 10 is a special case, otherwise, we use yydefgoto to decide the transition.
There are other tables that Bison will output to help with printing debug information and parsing error recovery and verbose output. Here is a list with brief descriptions.
yyrhs - A -1 separated list of RHS symbol numbers of all rules. yyrhs[n] = first symbol on the RHS of rule #n.
yyprhs[n] - Index in yyrhs of the first RHS symbol of rule n.
yyrline[n] - Line # in the .y grammar source file where rule n is defined.
yytname[n] - A string specifying the symbol for symbol number n. yytoknum[n] - Token number of token n.
Here is a one line summary of the main tables:
yytranslate: maps token numbers returned by yylex() to Bison's internal symbol numbers.
yydefact: lists default reduction rules for each state. 0 represents an error.
yydefgoto: lists default GOTOs for each non-terminal symbol. It is only used after checking with yypgoto.
yyr1: symbol number of lhs of each rule. Used at the time of a reduction to find the next state.
yyr2: length of rhs of each rule. Used at the time of reduction to pop the stack.
yytable: a highly compressed representation of the actions in each state. Negative entries represent reductions. There is a negative infinity to detect errors.
yypgoto: accounts for non-default GOTOs for all non-terminal symbols.
yypact: directory into yytable indexed by state number. The displacements in yytable are indexed by symbol number.
yycheck: guard table used to check legal bounds within portions of yytable.
Lets get down to the operation of yyparse - the LR parsing routine based on the tables just described. To demystify the routine, I have copied here snippets of code from the generated C parser file and altered them slightly. The most important change here is that I have cut out error handling to make the basic operation very clear. Many checks and macro calls have been removed to expose the bare parsing algorithm. All explanation of the code is found in comments.
/*These are global variables*/ /* The look-ahead symbol. */ int yychar; /* The semantic value of the look-ahead symbol. */ YYSTYPE yylval; int yyparse() { int yystate; /* current state */ int yyn; /* This is an all purpose variable! One time it may represent a state, the next time, it might be a rule! */ int yyresult; /* Result of parse to be returned to the called */ int yytoken=0; /* current token */ /* The state stack: This parser does not shift symbols on to the stack. Only a stack of states is maintained. */ int yyssa[YYINITDEPTH]; /*YYINITDEPTH is 200 */ int *yyss = yyssa /* Bottom of state stack */ int *yyssp; /* Top of state stack */ /* The semantic value stack: This stack grows parallel to the state stack. At each reduction, semantic values are popped * off this stack and the semantic action is executed */ YYSTYPE yyvsa[YYINITDEPTH]; YYSTYPE *yyvs = yyvsa; /* Bottom of semantic stack */ YYSTYPE *yyvsp; /* Top of semantic stack */ /* POP the state and semantic stacks by N symbols - useful for reduce actions */ #define YYPOPSTACK(N) (yyvsp -= (N), yyssp -= (N)) int yylen = 0; /* This variable is used in reduce actions to store the length of RHS of a rule; */ /* Ok done declaring variables. Set the ball rolling */ yystate = 0; /* Initial state */ yychar = YYEMPTY /* YYEMPTY is -2 */ yyssp = yyss; /* Top = bottom for state stack */ yyvsp = yyvs; /* Same for semantic stack */ goto yysetstate; /* Well, gotos are used for extracting maximum performance. */ /* Each label can be thought of as a function */ yynewstate: /* Push a new state on the stack */ yyssp ++; /*Just increment the stack top; actual 'pushing' will happen in yysetstate */ yysetstate: *yyssp = yystate; /* Ok pushed state on state stack top */ goto yybackup; /* This is where you will find some action */ yybackup: /* The main parsing code starts here */ yyn = yypact[yystate]; /* Refer to what yypact is saying about the current state */ if ( yyn == YYPACT_NINF) /* If negative infinity its time for a default reduction */ goto yydefault; /* This label implements default reductions; see below */ /* Check if we have a look-ahead token ready. This is LALR(1) parsing */ if (yychar == YYEMPTY) yychar = YYLEX; /* Macro YYLEX is defined as yylex() */ if (yychar <= YYEOF) /* YYEOF is 0 - the token returned by lexer at end of input */ yychar = yytoken = YYEOF; /* set all to EOF */ else yytoken = yytranslate[yychar]; /* Translate the lexer token into internal symbol number */ /* Now we have a look-ahead token. Let the party begin ! */ yyn = yyn + yytoken; /* This is yypact[yystate] + yytoken */ /* Observe this check carefully. We are checking that yyn is within the bounds of yytable * and also if yycheck contains the current token number. */ if ( yyn < 0 || YYLAST < yyn || yycheck[yyn] != yytoken ) /* YYLAST is the highest index in yytable */ goto yydefault; /* Its time for a default reduction */ /* Ok, yyn is within bounds of yytable */ yyn = yytable[yyn]; /* This is yytable[ yypact[yystate] + yytoken ] */ if (yyn <= 0) /* If yytable happens to contain a -ve value, its not a shift - its a reduce */ { if (yyn == 0 || yyn == YYTABLE_NINF) /* But check for out of bounds condition*/ goto yyerrlab; /* Label to handle errors */ /* Other wise reduce with rule # -yyn */ yyn = -yyn; goto yyreduce; /* Label to implement reductions */ } /* Last check: See if we reached final state! */ if (yyn == YYFINAL) /* YYFINAL is 8 in our case */ YYACCEPT; /* macro deined as 'goto acceptlab - a label to finish up */ /* That completes all checks; If we reached here, there is no other option but to shift */ yystate = yyn; /* Now, yyn (= yytable[ yypact[yystate] + yytoken ]) is a state that has to be pushed */ *++yyvsp = yylval; /* Push the semantic value of the symbol on the semantic stack */ goto yynewstate; /* This will increment state stack top and the following yysetstate that will do the pushing */ yydefault: /* A label to implement default reductions */ yyn = yydefact[yystate]; /* Get the default reduction rule for this state */ if ( yyn == 0 ) /* This state has no default reduction. Something is wrong */ goto yyerrlab; goto yyreduce; /* Ok, got the default reduction rule # in yyn; go ahead and reduce the stack */ yyreduce: /* A lablel that implements reductions on stack. */ /* By the time we are here, yyn contains the rule# to use for reducing the stack. */ /* Steps for reduction: * 1. Find the length of RHS of rule #yyn * 2. Execute any semantic actions by taking the values from the semantic stack * 3. POP 'length' symbols from the state stack and 'length' values from semantic stack * 4. Find the LHS of rule #yyn * 5. Find the GOTO of state currently on top of stack on LHS symbol * 6. Push that state on top of stack * */ yylen = yyr2[yyn]; /* Get length of RHS */ /* Default semantic action - $$=$1 */ yyval = yyvsp[1-yylen]; /* Execute semantic actions */ switch ( yyn ) /* Each rule has its own semantic action */ { default: break; /* We didn't have any semantic actions in the grammar.*/ } YYPOPSTACK (yylen); /* This will pop both state and semantic stacks. See definition of this macro above */ yylen = 0; /* re-initialize yylen */ *++yyvsp = yyval; /* Push the result of semantic evaluation on top of semantic stack */ /* Now shift the result of reduction (steps 4 - 6) */ yyn = yyr1[yyn]; /* Reuse yyn at every opportunity. For now, yyn is the LHS symbol (number) of the rule */ /* First check for anomalous GOTOs, otherwise use Default GOTO (YYDEFGOTO) * * Observe that if we subtract no. of terminals (YYNTOKENS) from symbol number of a nonterminal, we get * an index into yypgoto or yydefgoto for that non-terminal. */ yystate = yypgoto[yyn - YYNTOKENS] + *yyssp; /* A couple of checks are needed before we know this is not a default GOTO * 1. yystate must be within bounds of yytable. ( 0 to YYLAST ) * 2. yycheck must contain the state currently on top of the stack */ if ( 0 <= yystate && yystate <= YYLAST && yycheck[yystate] = *yyssp) yystate = yytable[yystate]; /* Take the GOTO from yytable */ else yystate = yydefgoto[yyn - YYNTOKENS]; /* Otherwise use the default GOTO */ goto yynewstate; /* Simply push the newly found state on top of stack and continue */ } /* End of yyparse() */
That completes a very basic and simplified parsing routine based on the tables produced by Bison. The actual generated parser does other things like checking the stack for overflows, relocating the stacks when necessary and a lot of error checking. If you use error productions, the parser has to get back to a sane state where the error token can be shifted next. Also, it has to print appropriate verbose output and be able to call yyerror(); There is also cleanup code for symbols that are popped off the semantic stack to avoid memory leaks. I have ignored all this code here to focus on the outline of the parsing routine yyparse().
Bison has a facility to enable parse tracing. If you set yydebug=1 before calling yyparse(), the parser will produce a trace of shifts and reductions as the input string is being processed. I produced such a trace for a sample input string in the language generated by our example grammar. Following this trace with the parsing routine and the tables can help in appreciating the table scheme and the operation of yyparse() function.
Sample string: a,a;a,a$
$ is my EOF character. Here is the trace listing.
Starting parse Entering state 0 Reading a token: Next token is token 'a' () Shifting token 'a' () Entering state 1 Reducing stack by rule 5 (line 26): $1 = token 'a' () -> $$ = nterm P () Stack now 0 Entering state 5 Reducing stack by rule 4 (line 24): $1 = nterm P () -> $$ = nterm E () Stack now 0 Entering state 4 Reading a token: Next token is token ',' () Shifting token ',' () Entering state 10 Reading a token: Next token is token 'a' () Shifting token 'a' () Entering state 1 Reducing stack by rule 5 (line 26): $1 = token 'a' () -> $$ = nterm P () Stack now 0 4 10 Entering state 13 Reducing stack by rule 3 (line 23): $1 = nterm E () $2 = token ',' () $3 = nterm P () -> $$ = nterm E () Stack now 0 Entering state 4 Reading a token: Next token is token ';' () Reducing stack by rule 2 (line 21): $1 = nterm E () -> $$ = nterm L () Stack now 0 Entering state 3 Next token is token ';' () Shifting token ';' () Entering state 9 Reading a token: Next token is token 'a' () Shifting token 'a' () Entering state 1 Reducing stack by rule 5 (line 26): $1 = token 'a' () -> $$ = nterm P () Stack now 0 3 9 Entering state 5 Reducing stack by rule 4 (line 24): $1 = nterm P () -> $$ = nterm E () Stack now 0 3 9 Entering state 12 Reading a token: Next token is token ',' () Shifting token ',' () Entering state 10 Reading a token: Next token is token 'a' () Shifting token 'a' () Entering state 1 Reducing stack by rule 5 (line 26): $1 = token 'a' () -> $$ = nterm P () Stack now 0 3 9 12 10 Entering state 13 Reducing stack by rule 3 (line 23): $1 = nterm E () $2 = token ',' (). $3 = nterm P () -> $$ = nterm E () Stack now 0 3 9 Entering state 12 Reading a token: Now at end of input. Reducing stack by rule 1 (line 20): $1 = nterm L () $2 = token ';' () $3 = nterm E () -> $$ = nterm L () Stack now 0 Entering state 3 Now at end of input. Stack now 0 3 Cleanup: popping nterm L ()
[1] Nigel P. Chapman "LR Parsing Theory and practice". Cambridge University Press 1987. This is an excellent book on the theory and implementation of LR parsers. Unfortunately this book is out of print now. The example grammar used through out this article is taken from this book.
[2] Aho, Sethi, Ullman "Compilers - Principles Techniques and Tools". Pearson Education. The "red Dragon book" is a classic. It is the best book I have read on compiler design.
[3] Robert Endre Tarjan and Andrew Chi-Chih Yao in their paper "Storing a Sparse Table", Communications of the ACM, Volume 22, issue 11, pages 606-611. Availablehere.
[4] GNU Bison source code. http://ftp.gnu.org/pub/gnu/bison/
[5] GNU Bison user manual. http://www.gnu.org/software/bison/manual
Copyright ©2006 Satya Kiran Popuri.
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled ``GNU Free Documentation License''.
GNU Free Documentation License Version 1.2, November 2002 Copyright (C) 2000,2001,2002 Free Software Foundation, Inc. 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. 0. PREAMBLE The purpose of this License is to make a manual, textbook, or other functional and useful document "free" in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others. This License is a kind of "copyleft", which means that derivative works of the document must themselves be free in the same sense. It complements the GNU General Public License, which is a copyleft license designed for free software. We have designed this License in order to use it for manuals for free software, because free software needs free documentation: a free program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals; it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. We recommend this License principally for works whose purpose is instruction or reference. 1. APPLICABILITY AND DEFINITIONS This License applies to any manual or other work, in any medium, that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. Such a notice grants a world-wide, royalty-free license, unlimited in duration, to use that work under the conditions stated herein. The "Document", below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as "you". You accept the license if you copy, modify or distribute the work in a way requiring permission under copyright law. A "Modified Version" of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or translated into another language. A "Secondary Section" is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document's overall subject (or to related matters) and contains nothing that could fall directly within that overall subject. (Thus, if the Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial, philosophical, ethical or political position regarding them. The "Invariant Sections" are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License. If a section does not fit the above definition of Secondary then it is not allowed to be designated as Invariant. The Document may contain zero Invariant Sections. If the Document does not identify any Invariant Sections then there are none. The "Cover Texts" are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License. A Front-Cover Text may be at most 5 words, and a Back-Cover Text may be at most 25 words. A "Transparent" copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, that is suitable for revising the document straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup, or absence of markup, has been arranged to thwart or discourage subsequent modification by readers is not Transparent. An image format is not Transparent if used for any substantial amount of text. A copy that is not "Transparent" is called "Opaque". Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML, PostScript or PDF designed for human modification. Examples of transparent image formats include PNG, XCF and JPG. Opaque formats include proprietary formats that can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not generally available, and the machine-generated HTML, PostScript or PDF produced by some word processors for output purposes only. The "Title Page" means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. For works in formats which do not have any title page as such, "Title Page" means the text near the most prominent appearance of the work's title, preceding the beginning of the body of the text. A section "Entitled XYZ" means a named subunit of the Document whose title either is precisely XYZ or contains XYZ in parentheses following text that translates XYZ in another language. (Here XYZ stands for a specific section name mentioned below, such as "Acknowledgements", "Dedications", "Endorsements", or "History".) To "Preserve the Title" of such a section when you modify the Document means that it remains a section "Entitled XYZ" according to this definition. The Document may include Warranty Disclaimers next to the notice which states that this License applies to the Document. These Warranty Disclaimers are considered to be included by reference in this License, but only as regards disclaiming warranties: any other implication that these Warranty Disclaimers may have is void and has no effect on the meaning of this License. 2. VERBATIM COPYING You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3. You may also lend copies, under the same conditions stated above, and you may publicly display copies. 3. COPYING IN QUANTITY If you publish printed copies (or copies in media that commonly have printed covers) of the Document, numbering more than 100, and the Document's license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects. If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages. If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of added material. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public. It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document. 4. MODIFICATIONS You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version: A. Use in the Title Page (and on the covers, if any) a title distinct from that of the Document, and from those of previous versions (which should, if there were any, be listed in the History section of the Document). You may use the same title as a previous version if the original publisher of that version gives permission. B. List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this requirement. C. State on the Title page the name of the publisher of the Modified Version, as the publisher. D. Preserve all the copyright notices of the Document. E. Add an appropriate copyright notice for your modifications adjacent to the other copyright notices. F. Include, immediately after the copyright notices, a license notice giving the public permission to use the Modified Version under the terms of this License, in the form shown in the Addendum below. G. Preserve in that license notice the full lists of Invariant Sections and required Cover Texts given in the Document's license notice. H. Include an unaltered copy of this License. I. Preserve the section Entitled "History", Preserve its Title, and add to it an item stating at least the title, year, new authors, and publisher of the Modified Version as given on the Title Page. If there is no section Entitled "History" in the Document, create one stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence. J. Preserve the network location, if any, given in the Document for public access to a Transparent copy of the Document, and likewise the network locations given in the Document for previous versions it was based on. These may be placed in the "History" section. You may omit a network location for a work that was published at least four years before the Document itself, or if the original publisher of the version it refers to gives permission. K. For any section Entitled "Acknowledgements" or "Dedications", Preserve the Title of the section, and preserve in the section all the substance and tone of each of the contributor acknowledgements and/or dedications given therein. L. Preserve all the Invariant Sections of the Document, unaltered in their text and in their titles. Section numbers or the equivalent are not considered part of the section titles. M. Delete any section Entitled "Endorsements". Such a section may not be included in the Modified Version. N. Do not retitle any existing section to be Entitled "Endorsements" or to conflict in title with any Invariant Section. O. Preserve any Warranty Disclaimers. If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version's license notice. These titles must be distinct from any other section titles. You may add a section Entitled "Endorsements", provided it contains nothing but endorsements of your Modified Version by various parties--for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard. You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one. The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version. 5. COMBINING DOCUMENTS You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice, and that you preserve all their Warranty Disclaimers. The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work. In the combination, you must combine any sections Entitled "History" in the various original documents, forming one section Entitled "History"; likewise combine any sections Entitled "Acknowledgements", and any sections Entitled "Dedications". You must delete all sections Entitled "Endorsements". 6. COLLECTIONS OF DOCUMENTS You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects. You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document. 7. AGGREGATION WITH INDEPENDENT WORKS A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, is called an "aggregate" if the copyright resulting from the compilation is not used to limit the legal rights of the compilation's users beyond what the individual works permit. When the Document is included in an aggregate, this License does not apply to the other works in the aggregate which are not themselves derivative works of the Document. If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one half of the entire aggregate, the Document's Cover Texts may be placed on covers that bracket the Document within the aggregate, or the electronic equivalent of covers if the Document is in electronic form. Otherwise they must appear on printed covers that bracket the whole aggregate. 8. TRANSLATION Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License, and all the license notices in the Document, and any Warranty Disclaimers, provided that you also include the original English version of this License and the original versions of those notices and disclaimers. In case of a disagreement between the translation and the original version of this License or a notice or disclaimer, the original version will prevail. If a section in the Document is Entitled "Acknowledgements", "Dedications", or "History", the requirement (section 4) to Preserve its Title (section 1) will typically require changing the actual title. 9. TERMINATION You may not copy, modify, sublicense, or distribute the Document except as expressly provided for under this License. Any other attempt to copy, modify, sublicense or distribute the Document is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 10. FUTURE REVISIONS OF THIS LICENSE The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/. Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License "or any later version" applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation. ADDENDUM: How to use this License for your documents To use this License in a document you have written, include a copy of the License in the document and put the following copyright and license notices just after the title page: Copyright (c) YEAR YOUR NAME. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License". If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts, replace the "with...Texts." line with this: with the Invariant Sections being LIST THEIR TITLES, with the Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST. If you have Invariant Sections without Cover Texts, or some other combination of the three, merge those two alternatives to suit the situation. If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software.