I. Context-Free Grammars
# Formal Definition of Context-Free Grammars
The language defined above is not a context-free language, and this can be proven rigorously using the pumping lemma for context-free languages. By contrast, the language {a^n b^n | n ≥ 1} (one or more a's followed by the same number of b's) is context-free, as it can be defined by the grammar G2 with V = {S}, Σ = {a, b}, S the start symbol, and the following production rules:
- 1. S → aSb
- 2. S → ab
A context-free language can be recognized in O(n^3) time (see Big O notation) by an algorithm such as Earley's algorithm. That is, for every context-free language, a machine can be built that takes a string as input and determines in O(n^3) time whether the string is a member of the language, where n is the length of the string.[7] Further, some important subsets of the context-free languages can be recognized in linear time using other algorithms.
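As an illustration of cubic-time recognition, here is a minimal sketch of the CYK algorithm (a different O(n^3) recognizer than Earley's, which assumes the grammar is given in Chomsky normal form). The rule encoding and the helper name `cyk` are our own assumptions, not a standard API:

```python
def cyk(word, rules, start="S"):
    # CYK recognition for a grammar in Chomsky normal form.
    # rules: (lhs, rhs) pairs where rhs is either a single terminal
    # character or a pair of nonterminals.
    n = len(word)
    table = {}  # table[(i, j)] = nonterminals deriving word[i:j]
    for i in range(n):
        table[(i, i + 1)] = {lhs for lhs, rhs in rules if rhs == word[i]}
    for length in range(2, n + 1):           # O(n) span lengths
        for i in range(n - length + 1):      # O(n) start positions
            j = i + length
            cell = set()
            for k in range(i + 1, j):        # O(n) split points
                for lhs, rhs in rules:
                    if (len(rhs) == 2 and rhs[0] in table[(i, k)]
                            and rhs[1] in table[(k, j)]):
                        cell.add(lhs)
            table[(i, j)] = cell
    return n > 0 and start in table[(0, n)]

# A Chomsky-normal-form grammar for {a^n b^n | n >= 1}:
# S -> AT | AB, T -> SB, A -> a, B -> b
rules = [("S", ("A", "T")), ("S", ("A", "B")), ("T", ("S", "B")),
         ("A", "a"), ("B", "b")]
print(cyk("aabb", rules))  # True
print(cyk("aab", rules))   # False
```

The three nested loops over span length, start position, and split point give the O(n^3) bound directly.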
# Example of a Formal Grammar
For these examples, formal languages are specified using set-builder notation.
Consider the grammar G where N = {S, B}, Σ = {a, b, c}, S is the start symbol, and P consists of the following production rules:
- 1. S → aBSc
- 2. S → abc
- 3. Ba → aB
- 4. Bb → bb

Some examples of the derivation of strings in L(G) are:
- S ⇒_2 **abc**
- S ⇒_1 **aBSc** ⇒_2 aB**abc**c ⇒_3 a**aB**bcc ⇒_4 aa**bb**cc

(Note on notation: P ⇒_i Q reads "string P generates string Q by means of production i", and the generated part is each time indicated in bold type.)

This grammar defines the language {a^n b^n c^n | n ≥ 1}, where a^n denotes a string of n consecutive a's. Thus, the language is the set of strings that consist of 1 or more a's, followed by the same number of b's, followed by the same number of c's. Note that rules 3 and 4 have more than one symbol on their left-hand side, so G itself is not a context-free grammar; indeed, this language cannot be generated by any context-free grammar.
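Since the grammar rewrites substrings rather than single nonterminals, its derivations can be explored with a general string-rewriting search. The sketch below reconstructs a shortest derivation of aabbcc by brute force, assuming the four rules are S → aBSc, S → abc, Ba → aB, Bb → bb; the encoding and function name are illustrative, not standard:

```python
from collections import deque

def derivations(rules, start, target):
    # Breadth-first search for a derivation of target, allowing rules whose
    # left-hand side is a string of several symbols (as in rules 3 and 4).
    # Assumes no rule shrinks a sentential form, so forms longer than the
    # target can be pruned.
    prev = {start: None}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form == target:
            path = []
            while form is not None:
                path.append(form)
                form = prev[form]
            return path[::-1]
        if len(form) > len(target):
            continue
        for lhs, rhs in rules:
            i = form.find(lhs)
            while i != -1:
                new = form[:i] + rhs + form[i + len(lhs):]
                if new not in prev:
                    prev[new] = form
                    queue.append(new)
                i = form.find(lhs, i + 1)
    return None

# Hypothetical encoding of the four rules above
rules = [("S", "aBSc"), ("S", "abc"), ("Ba", "aB"), ("Bb", "bb")]
print(derivations(rules, "S", "aabbcc"))
# ['S', 'aBSc', 'aBabcc', 'aaBbcc', 'aabbcc']
```

The path printed matches the second derivation shown above.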
# Designing Context-Free Grammars
• Some basic techniques:
– Matching – enforcing the fact that your grammar generates matching pairs of
symbols.
Example: L = {0^n 1^2n | n ≥ 0}. A context-free grammar for L is
S → 0S11 | ε
The grammar forces every 0 to match with 11.
What about L = {0^n 1^m | 2n ≤ m ≤ 3n}?
Another example: L = {w | w ∈ {0, 1}* and w is of even length}.
S → 0S0 | 0S1 | 1S0 | 1S1 | ε
The grammar forces every symbol to match another symbol.
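A grammar like these can be checked experimentally by enumerating the short strings it generates. Below is a rough sketch of such an enumerator; the rule encoding, the uppercase-letters-as-nonterminals convention, and the function name are our own assumptions:

```python
from collections import deque

def generated(rules, start="S", max_len=6):
    # Enumerate terminal strings derivable from start, breadth-first.
    # rules: dict mapping each nonterminal (an uppercase letter) to a
    # list of right-hand sides; "" encodes an epsilon rule.
    seen = {start}
    queue = deque([start])
    out = set()
    while queue:
        form = queue.popleft()
        nts = [i for i, c in enumerate(form) if c.isupper()]
        if not nts:
            out.add(form)
            continue
        i = nts[0]  # always expand the leftmost nonterminal
        for rhs in rules[form[i]]:
            new = form[:i] + rhs + form[i + 1:]
            terminals = sum(1 for c in new if not c.isupper())
            # crude bounds to keep the search finite
            if terminals <= max_len and len(new) <= 2 * max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(out, key=len)

# S -> 0S11 | epsilon generates {0^n 1^2n | n >= 0}
print(generated({"S": ["0S11", ""]}))  # ['', '011', '001111']
```

Running it on the even-length grammar S → 0S0 | 0S1 | 1S0 | 1S1 | ε similarly yields exactly the even-length binary strings up to the length bound.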
– Using recursive relationships – when you can represent strings in the language in
terms of shorter strings in the language.
Example: L – the language of all strings of balanced brackets. For example,
(()())() ∈ L, but (() ∉ L.
Every string w in L can be viewed as either
- uv, for u ∈ L, v ∈ L, or
- (u), for u ∈ L.
This gives the following grammar:
S → SS | (S) | ε
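The recursive description translates directly into a (naive, exponential-time) membership test. This is a sketch mirroring the uv / (u) decomposition, with our own function name:

```python
def balanced(w):
    # w is in L iff it is empty, has the form (u) with u in L,
    # or splits as uv with both u and v in L
    if w == "":
        return True
    # try w = (u) with u in L
    if w[0] == "(" and w[-1] == ")" and balanced(w[1:-1]):
        return True
    # try w = uv with both proper parts in L
    return any(balanced(w[:i]) and balanced(w[i:])
               for i in range(1, len(w)))

print(balanced("(()())()"))  # True
print(balanced("(()"))       # False
```

Every recursive call is on a strictly shorter string, so the test terminates; a production implementation would use a single counter instead.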
– Thinking of each nonterminal A as representing the language of all strings derivable from A, then combining these languages into one using the rules.
Example: L = {0^n 1^m | n ≠ m}.
Let L1 = {0^n 1^m | n > m} and L2 = {0^n 1^m | n < m}. Then L = L1 ∪ L2, so it suffices to write grammars for L1 and L2 separately and combine their start symbols with rules of the form S → S1 | S2.
# Ambiguity
Example: the grammar G5, whose rules are:
E → E + E | E ∗ E | (E) | a
generates some arithmetic expressions ambiguously. For instance, a + a ∗ a has two distinct parse trees, one grouping the expression as (a + a) ∗ a and the other as a + (a ∗ a).
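Ambiguity can be demonstrated mechanically by enumerating all parses of a string under G5. The brute-force sketch below does this for strings over a, +, *, and parentheses; the bracketing notation and helper names are our own, and `*` stands in for ∗:

```python
from functools import lru_cache

def parses(w):
    # Enumerate the parse trees of w under E -> E + E | E * E | (E) | a,
    # rendered as fully bracketed strings.
    @lru_cache(maxsize=None)
    def p(i, j):  # all readings of w[i:j] as an E
        out = []
        if j - i == 1 and w[i] == "a":
            out.append("a")                       # E -> a
        if j - i >= 3 and w[i] == "(" and w[j - 1] == ")":
            out += ["(" + t + ")" for t in p(i + 1, j - 1)]  # E -> (E)
        for k in range(i + 1, j - 1):             # split at a binary operator
            if w[k] in "+*":
                out += ["[" + left + w[k] + right + "]"
                        for left in p(i, k) for right in p(k + 1, j)]
        return out
    return p(0, len(w))

print(parses("a+a*a"))  # ['[a+[a*a]]', '[[a+a]*a]']
```

The string a+a*a has two parses, so G5 is ambiguous, while the fully parenthesized (a+a)*a has only one.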
# Chomsky Normal Form
In computer science, a context-free grammar is said to be in Chomsky normal form if all of its production rules are of the form:
- A → BC, or
- A → α, or
- S → ε

where A, B and C are nonterminal symbols, α is a terminal symbol (a symbol that represents a constant value), S is the start symbol, and ε is the empty string. Also, neither B nor C may be the start symbol.
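This definition can be checked mechanically rule by rule. The following small sketch assumes our own encoding (rules as (lhs, rhs) pairs, uppercase single letters for nonterminals):

```python
def is_cnf(rules, start="S"):
    # Checks the three allowed forms: A -> BC, A -> terminal, S -> epsilon,
    # where neither B nor C may be the start symbol.
    for lhs, rhs in rules:
        if len(rhs) == 2 and all(s.isupper() for s in rhs) and start not in rhs:
            continue  # A -> BC
        if len(rhs) == 1 and not rhs[0].isupper():
            continue  # A -> terminal
        if lhs == start and rhs == ():
            continue  # S -> epsilon
        return False
    return True

# An equivalent CNF grammar for {a^n b^n | n >= 1}, with a fresh start
# symbol S so that S never appears on a right-hand side:
# S -> AT | AB, T -> CB, C -> AT | AB, A -> a, B -> b
cnf_rules = [("S", ("A", "T")), ("S", ("A", "B")), ("T", ("C", "B")),
             ("C", ("A", "T")), ("C", ("A", "B")),
             ("A", ("a",)), ("B", ("b",))]
print(is_cnf(cnf_rules))                 # True
print(is_cnf([("S", ("a", "S", "b"))]))  # False: S -> aSb is not in CNF
```

Note how the grammar duplicates the start symbol's rules into C so that the start symbol itself never occurs on a right-hand side, as the definition requires.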
Every grammar in Chomsky normal form is context-free, and conversely, every context-free grammar can be transformed into an equivalent one which is in Chomsky normal form. Several algorithms for performing such a transformation are known. Transformations are described in most textbooks on automata theory, such as (Hopcroft and Ullman, 1979). As pointed out by Lange and Leiß, the drawback of these transformations is that they can lead to an undesirable bloat in grammar size. Using |G| to denote the size of the original grammar G, the size blow-up in the worst case may range from |G|^2 to 2^(2|G|), depending on the transformation algorithm used (Lange and Leiß, 2009).
Another way to define Chomsky normal form (e.g., Hopcroft and Ullman 1979, and Hopcroft et al. 2006) is:
A formal grammar is in Chomsky normal form if all of its production rules are of the form:
- A → BC, or
- A → α

where A, B and C are nonterminal symbols, and α is a terminal symbol. When using this definition, B or C may be the start symbol.
II. Pushdown Automata
# Formal Definition of a Pushdown Automaton
A PDA is formally defined as a 7-tuple:
M = (Q, Σ, Γ, δ, q0, Z, F), where
- Q is a finite set of states
- Σ is a finite set which is called the input alphabet
- Γ is a finite set which is called the stack alphabet
- δ is a finite subset of Q × (Σ ∪ {ε}) × Γ × Q × Γ*, the transition relation
- q0 ∈ Q is the start state
- Z ∈ Γ is the initial stack symbol
- F ⊆ Q is the set of accepting states

An element (p, a, A, q, α) ∈ δ is a transition of M. It has the intended meaning that M, in state p ∈ Q, with a ∈ Σ ∪ {ε} on the input and with A ∈ Γ as topmost stack symbol, may read a, change the state to q ∈ Q, and pop A, replacing it by pushing α ∈ Γ*. The letter ε (epsilon) denotes the empty string, and the ε component of the transition relation is used to formalize that the PDA can either read a letter from the input, or proceed leaving the input untouched.

In many texts the transition relation is replaced by an (equivalent) formalization, where
δ: Q × (Σ ∪ {ε}) × Γ → P(Q × Γ*)
is the transition function, mapping Q × (Σ ∪ {ε}) × Γ into finite subsets of Q × Γ*.
Here δ(p, a, A) contains all possible actions in state p with A on the stack, while reading a on the input. One writes (q, α) ∈ δ(p, a, A) for the function precisely when (p, a, A, q, α) ∈ δ for the relation. Note that finite in this definition is essential.
# Examples of Pushdown Automata
The following is the formal description of the PDA which recognizes the language {0^n 1^n | n ≥ 0} by final state:
M = (Q, Σ, Γ, δ, p, Z, F), where Q = {p, q, r}, Σ = {0, 1}, Γ = {A, Z}, F = {r}, and δ consists of the following six instructions:
- (p, 0, Z, p, AZ),
- (p, 0, A, p, AA),
- (p, ε, Z, q, Z),
- (p, ε, A, q, A),
- (q, 1, A, q, ε), and
- (q, ε, Z, r, Z).

In words, in state p for each symbol 0 read, one A is pushed onto the stack. Pushing symbol A on top of another A is formalized as replacing the top A by AA (and similarly, pushing A on top of Z as replacing Z by AZ). In state q for each symbol 1 read one A is popped. At any moment the automaton may move from state p to state q, while it may move from state q to accepting state r only when the stack consists of a single Z.

There seems to be no generally used representation for PDAs. Here we have depicted the instruction (p, a, A, q, α) by an edge from state p to state q labelled by a; A/α (read a; replace A by α).
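The behaviour of such a PDA can be simulated directly from its instructions. The sketch below is a generic breadth-first simulator under our own encoding conventions (instructions as 5-tuples with "" for ε, stack top as the first character of a string), applied to the six instructions of the {0^n 1^n} example:

```python
def accepts(word, delta, start, stack_init, finals):
    # Nondeterministic PDA, acceptance by final state. delta is a set of
    # instructions (p, a, A, q, alpha), with a == "" for an epsilon move.
    # Here pushes only happen while reading input, so the search is finite.
    frontier = {(start, word, stack_init)}
    seen = set()
    while frontier:
        nxt = set()
        for state, rest, stack in frontier:
            if not rest and state in finals:
                return True  # input consumed in an accepting state
            if (state, rest, stack) in seen or not stack:
                continue
            seen.add((state, rest, stack))
            for p, a, A, q, alpha in delta:
                if p != state or A != stack[0]:
                    continue
                if a == "":                      # epsilon move
                    nxt.add((q, rest, alpha + stack[1:]))
                elif rest and rest[0] == a:      # read one input symbol
                    nxt.add((q, rest[1:], alpha + stack[1:]))
        frontier = nxt
    return False

# The six instructions of the PDA above, for {0^n 1^n | n >= 0}
delta = {("p", "0", "Z", "p", "AZ"), ("p", "0", "A", "p", "AA"),
         ("p", "", "Z", "q", "Z"), ("p", "", "A", "q", "A"),
         ("q", "1", "A", "q", ""), ("q", "", "Z", "r", "Z")}
print(accepts("0011", delta, "p", "Z", {"r"}))  # True
print(accepts("0010", delta, "p", "Z", {"r"}))  # False
```

The set-valued frontier tracks all configurations reachable nondeterministically; a word is accepted as soon as some branch has consumed the input in state r.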
# Equivalence with Context-Free Grammars