Let Σ be an alphabet.
We define Σk to be
the set of all strings of length k, each of whose symbols is in Σ.
Powers of an alphabet:
Σ = {a, e, t}
Σ0 = {ε}
Σ1 = {a, e, t}
Σ2 = {aa, ae, at,
ea, ee, et, ta, te, tt}
Σ* = Σ0
U Σ1 U Σ2 U Σ3 U Σ4 U . . .
Σ+ = Σ1
U Σ2 U Σ3 U Σ4 U . . .
Σ* = Σ0 U Σ+ = {ε}
U Σ+
More precisely if string x and y are as:
x = a1 a2 a3 . . . ai
and
y = b1 b2 b3 . . . bj
then
x y = a1 a2 a3 . . . ai b1 b2 b3 . . . bj
Language:
If Σ is an alphabet and L is
a subset of Σ*,
then L is a language over
alphabet Σ.
Example of Language:
ex. a language over {a, e, t}
{ a, at, ate, eat, tat, tea,
tee }
Sometimes it is difficult to
specify the strings of a language.
The language of all strings
consisting of n 0's followed by n 1's for some n ≥ 0. { empty, 0 1, 0 0 1 1, 0 0 0 1 1 1, . . . }
2. The set of strings of 0's
and 1's with an equal number of each.
3. The set of binary numbers
whose value is a prime.
4. Σ* for any alphabet Σ.
5. Φ, the empty language, is
a language over any alphabet.
6. {ε}, the language
consisting of only the empty string is a language over any alphabet. Notice
that Φ ≠ {ε}.
In automata theory, a problem
is the question of deciding whether a given string is a member of some
particular language.
If Σ is an alphabet and L is
a language over Σ , then the problem L is:
Given a string w in Σ*,
decide whether or not w is in L.
Notes from: Book Chapter 1
Sections 1.2 to 1.4, 1.5 and Chapter 2
To be added later
--------###########-----------
Components of a finite
automaton
automaton has a finite set of
states
control moves from state to
state in response to external inputs
Kinds of finite automata:
Deterministic (it is also a
finite automata)
automaton cannot not be in
more than one state at one time
nondeterministic (it is
also a finite automata)
automaton may be in several
states at once
right linear grammars:
HMU, p. 182
Right-linear: if each
production body has at most one variable, and that variable is at the right
end. That is, all productions of a right-linear grammar are of the form A ->
wB or A -> w, where A and B are variables and w some string of zero or more
terminals.
Exactly the same languages can be:
• recognized by a finite
state automaton (deterministic or nondeterministic)
• defined by a regular
expression
• defined by a right linear
grammar (or a left linear grammar)
A deterministic finite
automaton is a 5-tuple
A = (Q, Σ, δ, q0, F)
where
Q is a
finite set of states,
Σ is a
finite set of input symbols,
δ is a transition
function
δ: Q x Σ → Q
q0 ε Q is
the start state (or initial state), and
F is a
subset of Q called the set of final states or accepting states.
The language of a
deterministic finite automaton
A = (Q, Σ, δ, q0, F)
is
L(A) = { w | δ'(q0,w) is in
F, w is in Σ* }
If L = L(A) for some dfa A,
we say that L is a regular language.
i.e., the language of A is the set of strings w that
take the start state q0 to one of the accepting states. If L is L(A) for some
DFA A, then we say L is a regular language.
transition table (or state
table)
A transition table is
a tabular representation of a transition function.
Each row corresponds to a
state.
Each column corresponds to an
input symbol.
The entry in the row
corresponding to state q and the column corresponding to input symbol a is the
state δ(q,a).
An IDENTIFIER is a string of
characters (typically letters or digits) that refers to (identifies) an entity,
such as a data object, a procedure, a class, or a type.
All identifiers are names, but
not all names are identifiers.
NAMES can be expressions.
x.y might denote the field y
of a structure denoted by x.
x and y are identifiers; x.y
is a name but not an identifier
Composite names like x.y are
called QUALIFIED NAMES.
A VARIABLE refers to a particular
location of the store.
It is common for the same
identifier to be declared more than once; each such declaration introduces a
new variable.
Even if each identifier is
declared just once, an identifier local to a recursive procedure will refer to
different locations of the store at different times.
IDENTIFIER
BASIC (early)
letter or letter followed by
digit
FORTRAN IV
1-6 letters or digits, 1st
character is letter
ALGOL 60
1 or more letters or digits,
1st character is letter
Java - Identifier
"an identifier is one or
more characters selected from alpha, digit, underscore, or dollar
sign. The only restriction is the first character can't be a digit."
http://www.cs.umd.edu/~clin/MoreJava/Intro/var-ident.html
Any missing transitions are
assumed to lead to an error state.
Once the DFA is in the error
state, it remains in that error state.
Example DFAs and NDFAs on
Final slides in L02CS422-2014D.pptx
Some problems to solve (basedon
HMU, pp. 52-55)
State: the
purpose of a state is to remember the relevant portion of the system’s history.
In finite automata, since
there are finite number of states and entire history of the system can not be
remembered, so the system must be designed carefully to remember what is
important and forget what is not.
The advantage of finite automaton is that we can implement the system
with a fixed set of resources because the number of states is finite.
A language L, which is cannot be defined by a regular
expression or finite automaton is NOT a Regular Language.
IN designing system: what
should be important about the problem is what sequences of events can happen,
not who is allowed to initiate them.
Deterministic: On
each input there is one and only one state to which the automaton can
transition from its current state. A nondeterministic
automaton can be in several states at once.
---------##########--------------
Nondeterministic finite
automaton:
Difference from Deterministic
Finite Automaton is:
δ: Q x Σ → 2Q
2Q = { A | A is a
subset of Q }
ex. Q = { a, b, c }
2Q = { Φ, {a},
{b}, {c}, {a,b}, {a,c}, {b,c}, {a,b,c} }
Why this notation?
|2Q| = 2|Q|
ex. |2{a,b,c} | =
2|{a,b,c}| = 23 =
8
Dead States:
A dead state is a
nonaccepting state that goes to itself on every possible input symbol. It corresponds to Φ, the empty set of states.
We can add a dead state to
any automaton that has no more than one transition for any state and input
symbol. Then add a transition to the
dead state from each other state q, on all input symbols for which q has no other
transition.
HMU, p. 67
(Some we call this an
"error" state or a "trap" state.)
Ways of thinking about
execution of a NFA (nondeterministic finite automaton)
At each time step, the
automaton can be in multiple states.
2. When a transition would
send the automaton to multiple next states, make a separate copy for each legal
state and continue (with each copy only in a single state).
3. Perform a depth-first
search, quitting with success when one path is in a final state when all input
has been read or quitting with failure when all paths have been followed with
none ending in final state.
The deterministic finite
automata and nondeterministic finite automata are equivalent in power.
By definition every
deterministic finite automaton is a nondeterministic finite automaton.
The "subset
construction" on the previous slide shows that for every nondeterministic
finite automaton we can construct a deterministic finite automaton that
recognizes exactly the same language.
To Convert NDFSA to DFSA
--replace all transitions from a state that puts
the FSA into multiple other states with one transition that puts it into a
*new* single state (that will represent the multiple states) and will take all
transitions off those multiple states.
It is an open research question whether
deterministic and nondeterministic linear bounded automata are equivalent
in power. (type 1)