Powerset construction algorithm

The Powerset Construction algorithm, defines a procedure for transforming an NDFA into a deterministic automaton (DFA). Both automata are equivalent in that they both recognize the same language. The idea behind this algorithm is simply to have a state in the DFA for each combination of states in the NFA. All combinations will never be possible. The algorithm determines the transitions of the states in the DFA according to the transition patterns in the NFA.

INPUT: A nondeterministic finite automaton N.

OUTPUT: A DFA D that recognizes the same language that N. Each state of D registers all the states that N may have reached at each step in the process of recognizing the input string. So each state of D is associated with a set of states of N.

METHOD:

  1. Compute the ε-closure of the initial state of N, this will give the set of states in which N may initially be and constitutes the initial state of D.
  2. For this state and for the next ones; for each state S and alphabet's characters a:
  3.    Compute T = m(S, a) = { t | t : states of N that can be reached from a state s in S following a transition labeled with the symbol a }
  4.    Compute εT = ε-closure(T)
  5.    This defines a state (new or already generated in a previous transition) and also a new transition: S -a-> eps T
  6. Loop until no new states are created.
  7. Mark as final states all states of D whose associated set contains any final state of N.

The ε closure of a single state s is the set of all states reachable from s following only ε-transitions (transitions that are not labeled with symbols from the language).

The ε closure of a set of states, in the union of the ε closures of all states within the set.

First of all, it is necessary to know the concept epsilon closure of the automaton. The epsilon closure of an automaton is the set of states that can be reached without consuming any input symbol, i.e. the set of nodes that can be reached by following epsilon transitions.

The first step of the algorithm is to compute the epsilon closure of the initial state. The following figure shows the epsilon closure of the initial state, in the epsilon closure are all the colored states (in orange the initial state, that is also in the epsilon closure, the rest in yellow).

Ejemplo de la cerradura épsilon del estado inicial de un autómata

The set of marked nodes corresponds with a new state in the new finite automaton that is being built. After this, it is computed the states that can be reached from the new state by following the first symbol of the input. These states, together with those that can be reached by following epsilon transitions, are combined to create a new state in the finite automaton that is being built.

The picture shows the states that can be reached after the consumption of the symbol c in the states of the previous picture. The states with a red border are the NDFA states within the departure DFA state. The state with a thick red border is the only state from which departs a transition labeled with the symbol c. All the states in the epsilon closure of the state that we can reach following that transition are colored: in orange the state to which we arrived directly (also in the epsilon closure), in yellow the rest of the states that we can reach following epsilon transitions.

Estados tras consumir el símbolo c

The process is repeated until no new states are created. Every single set corresponds with one state of the new deterministic finite automaton that is being built.

The new automaton is built state by state in each step. The transition table of the new automaton is built at the same time.

To obtain the automaton, in JSON format, just use the address/subsetConstructionJson with get expr parameter, where expr is the regular expression that you want.

For example: a|b.c*

Instructions

Characters to use in precedence order:
  • Parenthesis: ( )
  • Closure: *
  • Concatenation .
  • Union: |
  • Symbols: lowercase letters [a-z]
  • Epsilon: ε, E, €

Try a regular expression