Powerset construction algorithm
INPUT: A nondeterministic finite automaton N.
OUTPUT: A DFA D that recognizes the same language that N. Each state of D registers all the states that N may have reached at each step in the process of recognizing the input string. So each state of D is associated with a set of states of N.
METHOD:
- Compute the ε-closure of the initial state of N, this will give the set of states in which N may initially be and constitutes the initial state of D.
- For this state and for the next ones; for each state S and alphabet's characters a:
- Compute T = m(S, a) = { t | t : states of N that can be reached from a state s in S following a transition labeled with the symbol a }
- Compute εT = ε-closure(T)
- This defines a state (new or already generated in a previous transition) and also a new transition:
- Loop until no new states are created.
- Mark as final states all states of D whose associated set contains any final state of N.
The ε closure of a single state s is the set of all states reachable from s following only ε-transitions (transitions that are not labeled with symbols from the language).
The ε closure of a set of states, in the union of the ε closures of all states within the set.
First of all, it is necessary to know the concept epsilon closure of the automaton. The epsilon closure of an automaton is the set of states that can be reached without consuming any input symbol, i.e. the set of nodes that can be reached by following epsilon transitions.
The first step of the algorithm is to compute the epsilon closure of the initial state. The following figure shows the epsilon closure of the initial state, in the epsilon closure are all the colored states (in orange the initial state, that is also in the epsilon closure, the rest in yellow).
The set of marked nodes corresponds with a new state in the new finite automaton that is being built. After this, it is computed the states that can be reached from the new state by following the first symbol of the input. These states, together with those that can be reached by following epsilon transitions, are combined to create a new state in the finite automaton that is being built.
The picture shows the states that can be reached after the consumption of the symbol c in the states of the previous picture. The states with a red border are the NDFA states within the departure DFA state. The state with a thick red border is the only state from which departs a transition labeled with the symbol c. All the states in the epsilon closure of the state that we can reach following that transition are colored: in orange the state to which we arrived directly (also in the epsilon closure), in yellow the rest of the states that we can reach following epsilon transitions.
The process is repeated until no new states are created. Every single set corresponds with one state of the new deterministic finite automaton that is being built.
The new automaton is built state by state in each step. The transition table of the new automaton is built at the same time.
Instructions
- Parenthesis: ( )
- Closure: *
- Concatenation .
- Union: |
- Symbols: lowercase letters [a-z]
- Epsilon: ε, E, €