Room Escapee

In this example, we introduce another simple but comprehensive problem.

Problem Description
Suppose we have 5 rooms in a building, connected by doors as shown in the figure below. The rooms are numbered 0 to 4. The outside of the building can be thought of as one big room, numbered 5. The agent can reach room 5 (outside) from room 1 or room 4, as indicated in the figure.

For this example, an agent is placed in an arbitrary room, and from there it needs to escape from the building and get outside (in other words, room 5 is the target).

The rooms and doors can be represented as a graph, with each room as a node and each door as a link.

Note that the doors are two-way: for example, room 2 leads to room 3 and room 3 leads back to room 2. Room 5 is a bit special: the agent can stay in room 5, which explains the loop from room 5 to itself.
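As a minimal sketch, the graph can be written down as an adjacency list. The names kDoors and connected are hypothetical and not part of GAMCS, and the list assumes that rooms 0, 1 and 2 have no doors beyond the ones mentioned in the text.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Room layout as described in the text: rooms 0-4 are inside, room 5 is outside.
// kDoors[r] lists the rooms reachable from room r in one move.
// Assumption: rooms 0, 1 and 2 have no doors other than those named in the text.
const std::vector<std::vector<int>> kDoors = {
    {4},        // room 0
    {3, 5},     // room 1
    {3},        // room 2
    {1, 2, 4},  // room 3
    {0, 3, 5},  // room 4
    {1, 4, 5},  // room 5 (outside); the 5 in its own list is the self-loop
};

// True if the agent can move from room `from` to room `to` in a single step.
bool connected(int from, int to) {
    assert(from >= 0 && from < static_cast<int>(kDoors.size()));
    const std::vector<int>& next = kDoors[from];
    return std::find(next.begin(), next.end(), to) != next.end();
}
```

Because the doors are two-way, connected(a, b) and connected(b, a) agree for every pair of rooms, and connected(5, 5) holds because of the loop on room 5.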

Another thing to note is that the agent we put in a room starts out completely ignorant. It has no knowledge of the room layout described above and doesn't know which sequence of doors leads to the outside.

The problem is how to model an agent placed in any room so that it can learn through experience and find, by itself, the way to reach the outside of the building and stay there, no matter what the layout of rooms and doors is.

Modelling and Coding
Modelling the agent in GAMCS is straightforward. Each room corresponds to a "state", and the agent's movement from one room to another is an "action".

For example, suppose the agent is in state 2. From state 2, it can go to state 3 because room 2 is connected to room 3. However, from state 2 it cannot go directly to state 1, because there is no door connecting room 1 and room 2 (thus no such action). From state 3, it can go to state 1 or 4, or back to 2. If the agent is in state 4, the possible actions are to go to state 0, 5 or 3. In state 5 (outside), it can go to state 1 or 4, or choose to stay in state 5.

In GAMCS, each state is assigned a payoff. To drive the agent out of the building, we assign a negative payoff to the rooms inside and 0 to room 5 (outside).

Here is the "Escapee.h":

Nothing new here: it is still the four virtual functions that need to be implemented, as discussed in other examples. The current_room variable indicates which room the agent is currently in (it is initialized to 2, which means the agent starts from room 2).

The perception function is quite simple: it just needs to return the current room.

As discussed above, the action handling is also straightforward and simple; the code speaks for itself.

For the payoff, we set state 5 to 0 and every other room to -10 (any negative value will do).
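To make the four responsibilities described above concrete (perception, available actions, performing an action, payoff), here is a standalone sketch. It is not the original Escapee.h and it does not use the GAMCS base class or its function signatures: the class name EscapeeSketch and all method names are hypothetical, and the door list assumes the layout described in the text.

```cpp
#include <cassert>
#include <vector>

// Standalone illustration only; NOT the original Escapee.h and NOT the GAMCS API.
// Every name in this class is hypothetical.
class EscapeeSketch {
public:
    // The agent starts in room 2 by default, as in the text.
    explicit EscapeeSketch(int start_room = 2) : current_room(start_room) {}

    // Perception: the state is simply the room the agent is currently in.
    int perceive() const { return current_room; }

    // Available actions in a state; each action is the room to move to.
    std::vector<int> actions(int state) const {
        assert(state >= 0 && state <= 5);
        static const std::vector<std::vector<int>> doors = {
            {4}, {3, 5}, {3}, {1, 2, 4}, {0, 3, 5}, {1, 4, 5}};
        return doors[state];
    }

    // Performing an action moves the agent through a door.
    // Returns false and leaves the agent in place if no such door exists.
    bool perform(int target_room) {
        for (int room : actions(current_room)) {
            if (room == target_room) {
                current_room = target_room;
                return true;
            }
        }
        return false;
    }

    // Payoff: 0 outside (room 5), -10 for every room inside.
    float payoff(int state) const { return state == 5 ? 0.0f : -10.0f; }

private:
    int current_room;
};
```

With a default-constructed EscapeeSketch, perform(1) is rejected because rooms 1 and 2 share no door, while perform(3) moves the agent to room 3.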

Finally, here is the main.cpp:

Everything is done now. Compile the code with the GAMCS library, and let's wake up the Escapee!

Observing and Analysis
Run the code and plot the rooms the escapee entered over time. Below is a typical result.

We can see from the figure that the escapee started from room 2 and at first wandered randomly through the connected rooms. Eventually, however, it found room 5 and stayed there forever (it successfully escaped from the building). That's exactly what we wanted to model.
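The explore-then-settle behaviour described above can be reproduced without the library. The sketch below swaps in plain tabular Q-learning (learning rate 1, discount 0.8) as a stand-in; it is not GAMCS's own learning algorithm, and the names explore, escape and kRooms, the seed and the step count are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// Layout as described in the text; an action is the room to move to.
const std::vector<std::vector<int>> kRooms = {
    {4}, {3, 5}, {3}, {1, 2, 4}, {0, 3, 5}, {1, 4, 5}};

// Payoff of a state: 0 outside (room 5), -10 inside.
double payoff(int room) { return room == 5 ? 0.0 : -10.0; }

using QTable = std::vector<std::vector<double>>;  // q[room][target room]

// Best value currently known for any move out of `room`.
double bestValue(const QTable& q, int room) {
    double best = q[room][kRooms[room][0]];
    for (int next : kRooms[room]) best = std::max(best, q[room][next]);
    return best;
}

// Exploration: wander randomly for `steps` moves starting in `start`,
// updating the table after every move (Q-learning stand-in, not GAMCS).
QTable explore(int start, int steps, unsigned seed = 1, double gamma = 0.8) {
    assert(start >= 0 && start < 6);
    QTable q(6, std::vector<double>(6, 0.0));
    std::mt19937 rng(seed);
    int room = start;
    for (int i = 0; i < steps; ++i) {
        const std::vector<int>& doors = kRooms[room];
        int next = doors[rng() % doors.size()];
        q[room][next] = payoff(next) + gamma * bestValue(q, next);
        room = next;
    }
    return q;
}

// Exploitation: always take the best-valued door until outside.
// Returns the sequence of rooms visited, starting with `start`.
std::vector<int> escape(const QTable& q, int start, int max_moves = 10) {
    std::vector<int> path = {start};
    int room = start;
    while (room != 5 && static_cast<int>(path.size()) <= max_moves) {
        int best = kRooms[room][0];
        for (int next : kRooms[room])
            if (q[room][next] > q[room][best]) best = next;
        room = best;
        path.push_back(room);
    }
    return path;
}
```

A typical use is to call explore(2, 5000) once and then escape(q, start) for each starting room. Once every door has been tried at least once, staying in room 5 has a higher value than going back inside, which mirrors the "stayed there forever" part of the plot.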

Now let's put the "experienced" agent back into each room in turn and see how it behaves.

From the figures shown above, we can clearly see that after exploring the rooms for the first time, the agent has built a map of the room layout in its mind. Wherever it is put in the building, it can find the shortest path to the outside.
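To check the "shortest path" claim independently of any learning, we can compute the minimum number of moves from every room to the outside with a breadth-first search. This is only a verification aid, not part of the original example; the names movesToOutside and kLayout are hypothetical, and the layout is the one described in the text.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Layout as described in the text; kLayout[r] lists the rooms next to room r.
const std::vector<std::vector<int>> kLayout = {
    {4}, {3, 5}, {3}, {1, 2, 4}, {0, 3, 5}, {1, 4, 5}};

// Breadth-first search outward from `target`. Because every door is two-way,
// the distance from the target to a room equals the distance from that room
// to the target. Rooms that cannot reach the target keep the value -1.
std::vector<int> movesToOutside(const std::vector<std::vector<int>>& doors,
                                int target) {
    assert(target >= 0 && target < static_cast<int>(doors.size()));
    std::vector<int> dist(doors.size(), -1);
    std::queue<int> frontier;
    dist[target] = 0;
    frontier.push(target);
    while (!frontier.empty()) {
        int room = frontier.front();
        frontier.pop();
        for (int next : doors[room]) {
            if (dist[next] == -1) {  // not reached yet
                dist[next] = dist[room] + 1;
                frontier.push(next);
            }
        }
    }
    return dist;
}
```

For this layout, movesToOutside(kLayout, 5) gives 2, 1, 3, 2, 1, 0 moves for rooms 0 to 5. An experienced agent that needs exactly that many moves from each room has indeed found a shortest path.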