Tuesday, February 24, 2015

Automata

Let Σ be an alphabet. 
We define Σk to be the set of all strings of length k, each of whose symbols is in Σ.

Powers of an alphabet:
Σ = {a, e, t}
Σ0 = {ε}
Σ1 = {a, e, t}
Σ2 = {aa, ae, at, ea, ee, et, ta, te, tt}

Σ*  =  Σ0 U Σ1 U Σ2 U Σ3 U Σ4 U . . .
Σ+  =  Σ1 U Σ2 U Σ3 U Σ4 U . . .
Σ*  =  Σ0  U  Σ+  =  {ε} U  Σ+

More precisely if  string x and y are as:
x = a1 a2 a3 . . . ai
and
y = b1 b2 b3 . . . bj
then
x y = a1 a2 a3 . . . ai  b1 b2 b3 . . . bj

Language:
If Σ is an alphabet and L is a subset of Σ*,
then L is a language over alphabet Σ.

Example of Language:
ex. a language over {a, e, t}
{ a, at, ate, eat, tat, tea, tee }

Sometimes it is difficult to specify the strings of a language.

The language of all strings consisting of n 0's followed by n 1's for some n ≥ 0. { empty, 0 1,  0 0 1 1, 0 0 0 1 1 1, . . . }
2. The set of strings of 0's and 1's with an equal number of each.
3. The set of binary numbers whose value is a prime.
4. Σ* for any alphabet Σ.
5. Φ, the empty language, is a language over any alphabet.
6. {ε}, the language consisting of only the empty string is a language over any alphabet. Notice that Φ ≠ {ε}.

In automata theory, a problem is the question of deciding whether a given string is a member of some particular language.
If Σ is an alphabet and L is a language over Σ , then the problem L is:
Given a string w in Σ*, decide whether or not w is in L.

Notes from: Book Chapter 1 Sections 1.2 to 1.4, 1.5 and Chapter 2

To be added later

--------###########-----------
Components of a finite automaton
automaton has a finite set of states
control moves from state to state in response to external inputs

Kinds of finite automata:
Deterministic (it is also a finite automata)
automaton cannot not be in more than one state at one time

nondeterministic (it is also a finite automata)
automaton may be in several states at once

right linear grammars:
HMU, p. 182
Right-linear: if each production body has at most one variable, and that variable is at the right end. That is, all productions of a right-linear grammar are of the form A -> wB or A -> w, where A and B are variables and w some string of zero or more terminals.

Exactly the same languages can be:
• recognized by a finite state automaton (deterministic or nondeterministic)
• defined by a regular expression
• defined by a right linear grammar (or a left linear grammar)

A deterministic finite automaton is a 5-tuple
A = (Q, Σ, δ, q0, F)
where
Q is a finite set of states,
Σ is a finite set of input symbols,
δ is a transition function
δ: Q x  Σ →  Q
q0  ε Q  is the start state (or initial state), and
F is a subset of Q called the set of final states or accepting states.





The language of a deterministic finite automaton
A = (Q, Σ, δ, q0, F)
is
L(A) = { w | δ'(q0,w) is in F, w is in Σ* }
If L = L(A) for some dfa A, we say that L is a regular language.
i.e., the language of A is the set of strings w that take the start state q0 to one of the accepting states. If L is L(A) for some DFA A, then we say L is a regular language.


transition table (or state table)
A transition table is a tabular representation of a transition function.
Each row corresponds to a state.
Each column corresponds to an input symbol.
The entry in the row corresponding to state q and the column corresponding to input symbol a is the state δ(q,a).

An IDENTIFIER is a string of characters (typically letters or digits) that refers to (identifies) an entity, such as a data object, a procedure, a class, or a type.
All identifiers are names, but not all names are identifiers.
NAMES can be expressions.
x.y might denote the field y of a structure denoted by x.
x and y are identifiers; x.y is a name but not an identifier
Composite names like x.y are called QUALIFIED NAMES.
A VARIABLE refers to a particular location of the store. 
It is common for the same identifier to be declared more than once; each such declaration introduces a new variable. 
Even if each identifier is declared just once, an identifier local to a recursive procedure will refer to different locations of the store at different times.

IDENTIFIER
BASIC (early)
letter or letter followed by digit
FORTRAN IV
1-6 letters or digits, 1st character is letter
ALGOL 60
1 or more letters or digits, 1st character is letter

Java - Identifier
"an identifier is one or more characters selected from alpha, digit, underscore, or dollar sign. The only restriction is the first character can't be a digit."
http://www.cs.umd.edu/~clin/MoreJava/Intro/var-ident.html


Any missing transitions are assumed to lead to an error state.
Once the DFA is in the error state, it remains in that error state.



Example DFAs and NDFAs on Final slides in L02CS422-2014D.pptx

Some problems to solve (basedon HMU, pp. 52-55)

State: the purpose of a state is to remember the relevant portion of the system’s history.

In finite automata, since there are finite number of states and entire history of the system can not be remembered, so the system must be designed carefully to remember what is important and forget what is not.

The advantage of finite automaton is that we can implement the system with a fixed set of resources because the number of states is finite.

A language L, which is cannot be defined by a regular expression or finite automaton is NOT a Regular Language.

IN designing system: what should be important about the problem is what sequences of events can happen, not who is allowed to initiate them.

Deterministic: On each input there is one and only one state to which the automaton can transition from its current state. A nondeterministic automaton can be in several states at once.

---------##########--------------

Nondeterministic finite automaton:
Difference from Deterministic Finite Automaton is:
δ: Q x  Σ →  2Q

2Q = { A | A is a subset of Q }
ex. Q = { a, b, c }
2Q = { Φ, {a}, {b}, {c}, {a,b}, {a,c}, {b,c}, {a,b,c} }
Why this notation?
|2Q| = 2|Q| 
ex. |2{a,b,c} | = 2|{a,b,c}|  = 23 = 8

Dead States:
A dead state is a nonaccepting state that goes to itself on every possible input symbol.  It corresponds to Φ, the empty set of states.
We can add a dead state to any automaton that has no more than one transition for any state and input symbol.  Then add a transition to the dead state from each other state q, on all input symbols for which q has no other transition.
HMU, p. 67
(Some we call this an "error" state or a "trap" state.)

Ways of thinking about execution of a NFA (nondeterministic finite automaton)

At each time step, the automaton can be in multiple states.
2. When a transition would send the automaton to multiple next states, make a separate copy for each legal state and continue (with each copy only in a single state).
3. Perform a depth-first search, quitting with success when one path is in a final state when all input has been read or quitting with failure when all paths have been followed with none ending in final state.


The deterministic finite automata and nondeterministic finite automata are equivalent in power.
By definition every deterministic finite automaton is a nondeterministic finite automaton.
The "subset construction" on the previous slide shows that for every nondeterministic finite automaton we can construct a deterministic finite automaton that recognizes exactly the same language.



To Convert NDFSA to DFSA
--replace all transitions from a state that puts the FSA into multiple other states with one transition that puts it into a *new* single state (that will represent the multiple states) and will take all transitions off those multiple states.


It is an open research question whether deterministic and nondeterministic linear bounded automata are  equivalent in power.  (type 1)



0 comments: