A Mouse in a Tunnel

The first example is quite simple, but it shows the general procedure for using GAMCS and some of its main features.

Problem Description
Suppose there is a mouse in a narrow tunnel with both ends blocked. The mouse can only move from side to side, as shown in the following image.

In the meantime, we have both cheese and nails in hand, and can put them at any position in the tunnel. The mouse likes cheese and fears nails, as a real mouse would.

OK, those are all the props we will use in this example. But before we start coding with GAMCS, we need to build the model quantitatively. Let's assume that the tunnel is 20 meters long, ranging from -10 to 10, and that the mouse can take one of three actions at each step: -1 for moving 1 meter left, 1 for moving 1 meter right, and 0 for staying still. As for the cheese and nails, let's say that a piece of cheese gives the mouse 1 unit of joy, and a stab from a nail gives it 1 unit of pain.

Imagine for a moment how a real mouse would behave in a similar situation. What would it do if we put a piece of cheese somewhere, and what would it do if we put a nail there instead?

Now let GAMCS take the stage.

Modelling and Coding
This is Mouse.h:

First, the necessary header files are included and the gamcs namespace is imported. Then we create a new class called Mouse that derives from Avatar. It has the four familiar virtual functions to be implemented in the private section. In addition, it has a member variable called position, which records the current position of the mouse in the tunnel. The position is initialized to 0, which means the mouse starts at position 0.

Now we have to implement the 4 virtual functions.

As discussed above, the mouse can perform one of three actions at each position in the tunnel. So in the availableActions function, we create an empty action space and add the three actions to it for every incoming state. In the performAction function, as agreed above, action -1 means moving 1 meter left, 0 means staying, and 1 means moving 1 meter right, so we can simply use position += act to perform an action. Moreover, the tunnel is blocked at both ends, so the mouse can't move beyond them; that is what the two checks in the if clauses do.
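The movement rule described above can be tried out without any GAMCS types. The snippet below is only a minimal sketch, not the tutorial's Mouse.h: the names kLeftEnd, kRightEnd, availableMoves and moveMouse are made up for illustration, and plain int stands in for whatever state and action types GAMCS uses.

```cpp
#include <vector>

// Tunnel bounds taken from the problem description (hypothetical names).
constexpr int kLeftEnd = -10;
constexpr int kRightEnd = 10;

// The same three moves are offered at every position.
std::vector<int> availableMoves() {
    return {-1, 0, 1};
}

// Apply a move, then clamp the result to the blocked ends of the tunnel.
int moveMouse(int position, int act) {
    position += act;
    if (position < kLeftEnd)  position = kLeftEnd;
    if (position > kRightEnd) position = kRightEnd;
    return position;
}
```

Calling moveMouse(10, 1) leaves the mouse at 10, because the right end is blocked, while moveMouse(0, -1) moves it to -1.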

The percieveState function is even simpler: it just returns the current position.

The originalPayoff function assigns a payoff value to each state, which represents how much the mouse originally likes or dislikes that state. In the above code, the payoff is 0 for every state, which means the mouse treats all states equally; none of them is better or worse than any other.

The mouse model is now complete. To run it, we have to create a main function. Here is main.cpp.

As usual, the necessary header files are included at the top. Here we encounter a new class, CSOSAgent, which is short for Computer Simulation of Open Storage Agent (quite a long name). It is the default built-in implementation of Agent provided by GAMCS, and you can use it directly. At line 6, we create an instance of CSOSAgent with its id set to 1, its discount rate set to 0.9, and its threshold set to 0.01 (there is no need to worry about these arguments for now). Next, we create an instance of the Mouse class we just built and connect it to the agent. The mouse is now ready to run, so we build a while loop and run the mouse 500 times in it.

Everything is in place now. Compile your code against the GAMCS library, and let's wake up the mouse!

Observation and Analysis
Let's run the mouse and see how it behaves. Here is a typical result, which plots the position of the mouse over time.

We can see from the graph that the mouse walks back and forth in the tunnel at random. Since no state is better or worse than any other, the mouse is completely aimless.
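To get a feel for what such an aimless trace looks like, here is a small stand-alone sketch. It does not use GAMCS or CSOSAgent at all: it simply assumes that, with all payoffs equal, each of the three moves is picked with equal probability. The function name randomWalk and the seed parameter are hypothetical.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Simulate an aimless mouse: every step picks -1, 0 or +1 with equal
// probability, and the position is clamped to the tunnel [-10, 10].
// Returns the position after each step.
std::vector<int> randomWalk(std::size_t steps, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> pick(-1, 1);
    std::vector<int> positions;
    positions.reserve(steps);
    int position = 0;  // the mouse starts in the middle of the tunnel
    for (std::size_t t = 0; t < steps; ++t) {
        position += pick(rng);
        if (position < -10) position = -10;
        if (position > 10)  position = 10;
        positions.push_back(position);
    }
    return positions;
}
```

Calling randomWalk(500, 42) gives 500 positions that can be plotted against the step number and compared with the graph above.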

The allure of cheese
Now let's change the originalPayoff function a bit, as highlighted below.

We set the payoff of position 5 to 1, which, as discussed in the first section, represents a piece of cheese placed there. Recompile and run again. After this modification, the mouse's behavior becomes regular, as shown below.
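Since cheese and nails can be placed anywhere in the tunnel, one way to express such a payoff is a small lookup table. The snippet below is only a sketch of that idea, not the tutorial's originalPayoff: payoffAt and props are hypothetical names, and plain int and float stand in for the GAMCS types.

```cpp
#include <map>

// Hypothetical helper: props maps a position to its payoff
// (+1 for a piece of cheese, -1 for a nail). Any position that is
// not listed is neutral and gets payoff 0.
float payoffAt(const std::map<int, float>& props, int position) {
    const auto it = props.find(position);
    return it == props.end() ? 0.0f : it->second;
}
```

With props holding the single entry {5, 1.0f}, this matches the cheese at position 5 described above, and every other position keeps a payoff of 0.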

In both plots, the mouse wandered around randomly at first, but after reaching position 5, where the cheese was placed, it tried to stay there or walk around it. In the second plot, the mouse walks around position 5 instead of staying on it (which would bring more joy) because of a phenomenon we call a local optimum. Once the mouse finds a local optimum, it gets stuck there and misses the global optimum (which is to stay at position 5 in this example). We'll talk about this problem and its solution in the next example.

Anyway, we see that the mouse is attracted by the cheese: once it has touched the cheese, it is never willing to leave. Increase the number of loops if you have any doubt about this.



---

It's interesting to see how the mouse behaves if we put it back at position 0 with its memory preserved. Let's find out. Instead of moving the same mouse to position 0, we simply create a new mouse and connect it to the existing memory, which has the same effect as putting the mouse back at position 0. The modified code is shown below.

Recompile and run. Here is the position of the new mouse over time.

We can see that the new mouse runs directly to position 5 without any hesitation. With the memory and experience inherited from the previous mouse, the new mouse is born knowing that a piece of cheese is at position 5, and it also knows the path to get there. That's the great power of memory!
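Why would a remembered payoff be enough to find the way? The sketch below illustrates one possible reading, and it is an assumption rather than a description of how GAMCS picks actions: the mouse remembers a payoff for each position that shrinks with distance from the cheese, and at every step it moves to the neighbouring position with the highest remembered payoff. The names rememberedPayoff and greedyWalk, and the concrete payoff values, are made up for illustration.

```cpp
#include <cmath>
#include <cstdlib>
#include <vector>

// Assumed remembered payoff of a position: highest at the cheese and
// shrinking by a factor of 0.9 per meter of distance (illustrative only).
double rememberedPayoff(int position, int cheesePos) {
    return 10.0 * std::pow(0.9, std::abs(position - cheesePos));
}

// Walk greedily: at each step take the move whose destination has the
// highest remembered payoff. Returns every visited position, start included.
std::vector<int> greedyWalk(int start, int cheesePos, int steps) {
    std::vector<int> path{start};
    int position = start;
    for (int t = 0; t < steps; ++t) {
        int bestNext = position;  // staying put is the default choice
        for (int act : {-1, 0, 1}) {
            int next = position + act;
            if (next < -10) next = -10;  // blocked left end
            if (next > 10)  next = 10;   // blocked right end
            if (rememberedPayoff(next, cheesePos) >
                rememberedPayoff(bestNext, cheesePos)) {
                bestNext = next;
            }
        }
        position = bestNext;
        path.push_back(position);
    }
    return path;
}
```

Starting from 0 with the cheese at 5, greedyWalk(0, 5, 8) visits 0, 1, 2, 3, 4, 5 and then stays at 5, which is the kind of straight run seen in the plot.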

---

So far, all we have seen is the external behavior of the mouse. What does its inner memory look like? GAMCS provides functionality to view and modify an Avatar's memory. Let's take a look.

To do this, some additional code needs to be added to main.cpp.

First, DotViewer.h is included; DotViewer is a class that can print the memory of an agent in Graphviz dot format. Then, before leaving the main function, we create an instance of DotViewer, attach the viewer to the agent, and print out the memory. Recompile and run again. Once you have the memory output, plot it with the dot command. Here is what the memory looks like:

In the above plot, a hollow node corresponds to a state (a position in this example) with its payoff in brackets below it, and a black filled node is a hidden/intermediate state; hidden states in the same information set are connected by a dashed line. An arrow linking two nodes corresponds to a pair of an agent action (blue) and an environment action (red). For each environment action, a count is shown in brackets beside it.

From this memory, we can see clearly that position 5 is the mouse's favorite state, with a payoff of 9.90 (note that this is an accumulated value, not the original payoff, which is 1). Position 4 is the second favorite, with a payoff of 8.91, and so on. The closer a position is to position 5, the bigger its payoff. That is why the mouse likes to stay at position 5, and also why the new mouse goes directly to position 5. Moreover, the payoff values have converged: increasing the number of loops no longer changes them. Try it for yourself (a rigorous mathematical proof of this is still needed).
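One way to see where accumulated values of this size might come from is a discounted recursion, in which the payoff of a position is its original payoff plus 0.9 times the best payoff reachable in one move. This recursion is our assumption; the text does not spell out the formula GAMCS uses. The function name accumulatedPayoffs and its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical helper: long-run payoff of every tunnel position under the
// assumed recursion V(s) = r(s) + discount * max over moves of V(next).
// Index i corresponds to position i - 10, so index 15 is position 5.
std::vector<double> accumulatedPayoffs(double discount, int cheesePos) {
    const int left = -10;
    const int right = 10;
    const int n = right - left + 1;
    std::vector<double> v(n, 0.0);
    for (int sweep = 0; sweep < 10000; ++sweep) {
        std::vector<double> next(n);
        double maxChange = 0.0;
        for (int i = 0; i < n; ++i) {
            const double reward = (i + left == cheesePos) ? 1.0 : 0.0;
            // Moves -1, 0 and +1, clamped at the blocked ends.
            const int lo = std::max(i - 1, 0);
            const int hi = std::min(i + 1, n - 1);
            const double best = std::max({v[lo], v[i], v[hi]});
            next[i] = reward + discount * best;
            maxChange = std::max(maxChange, std::fabs(next[i] - v[i]));
        }
        v = next;
        if (maxChange < 1e-12) break;  // values have stopped changing
    }
    return v;
}
```

With a discount of 0.9 and the cheese at position 5, this sketch settles at 10 for position 5 and 9 for position 4. Those values are close to, but not the same as, the 9.90 and 8.91 shown in the memory; in both cases the ratio between neighbouring positions is 0.9. How GAMCS arrives at its exact figures is not covered here.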

The fear of nails
After seeing the great allure of "cheese", let's find out the effect of "nails".

What happens if we remove the cheese at position 5 and put a nail there instead? Let's try it. The corresponding code change is highlighted below.

It's quite easy: we just change the payoff of position 5 from 1 to -1. Recompile and run. Here are two typical results.



In either case, it clearly looks as if there were a virtual "wall" at position 5. The mouse is afraid of position 5 once it has been stabbed by the nail there. Here is the memory of this poor mouse. All its behaviors can be explained by the memory.

Conclusion
In this example, we demonstrated how to use GAMCS to model a simple but vivid mouse. With fewer than 100 lines of code, we created a mouse that shows complex behavior: it is attracted by food and tries to escape from danger, just as a real mouse does. You can modify the code and run more experiments yourself. It's quite fun to raise a "mouse" in your computer and watch it show all kinds of surprising behaviors.