Demonstration page for the BIMM (Brain-Inspired Memory Model) neural net
of Kary FRÄMLING
Brain-Inspired Memory Model is the current working name for a neural
reinforcement learning model that uses notions of short- and long-term
working memory for performing animal- and human-like trial-and-error learning.
This page gives access to demonstration programs using this technique. For
the moment there is only one of them, a maze route-finding program.
Background
This artificial neural net (ANN) was born from the
idea to develop a model that would do problem solving and learning in
similar ways as humans and animals do. The model would also correspond to
some very rough-level ideas and knowledge about how the brain operates,
i.e. activations and connections between different areas of the brain and
notions of short- and long-term working memory.
Animal problem solving mainly seems to be based on
trial-and-error learning. The success or failure of a trial modifies behavior
in the "right" direction after some number of trials, where "some number"
ranges from one (e.g. learning how to turn on the radio with the "power"
button) to infinity (e.g. learning how to grab things, which is a life-long
adaptation procedure).
Such behavior is currently studied mainly in the scientific research
area called reinforcement learning (RL). RL methods have been successfully
applied to many problems where more "conventional" methods are difficult
to use due to factors such as missing data about the environment, which forces
the neural net to explore its environment and learn interactively. Exploring
is a procedure where the agent has to take actions without a priori knowledge
about how good or bad the action is, which may be known only much later
when the goal is reached or when the task failed.
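To make the idea of exploring with delayed feedback more concrete, here is a
minimal sketch of tabular Q-learning, one standard RL method, on a tiny grid
maze. It is not BIMM and it is not taken from the maze solver. The maze layout,
the class and method names, and the parameter values (alpha, gamma, epsilon)
are all assumptions chosen for illustration, and the code is written for a
current JDK. The agent only receives a reward when it reaches the goal, so the
value of earlier actions becomes known only as that reward propagates backwards
over many episodes.

```java
import java.util.Random;

/** Tabular Q-learning on a tiny grid maze. Illustrative only; this is not BIMM. */
class QLearningMazeSketch {
    // Hypothetical maze: 'S' start, 'G' goal, '#' wall, '.' free cell.
    static final String[] MAZE = {
        "S..#",
        ".#..",
        "...G"
    };
    static final int[] DR = {-1, 1, 0, 0}; // row change for up, down, left, right
    static final int[] DC = {0, 0, -1, 1}; // column change for the same actions

    /** Cell reached from (r, c) by action a; moves into walls or off the grid stay put. */
    static int[] step(int r, int c, int a) {
        int nr = r + DR[a], nc = c + DC[a];
        boolean blocked = nr < 0 || nr >= MAZE.length || nc < 0
                || nc >= MAZE[0].length() || MAZE[nr].charAt(nc) == '#';
        return blocked ? new int[] {r, c} : new int[] {nr, nc};
    }

    /** Index of a highest-valued action; ties are broken at random. */
    static int pickGreedy(double[] q, Random rnd) {
        int best = 0, ties = 1;
        for (int a = 1; a < q.length; a++) {
            if (q[a] > q[best]) {
                best = a;
                ties = 1;
            } else if (q[a] == q[best] && rnd.nextInt(++ties) == 0) {
                best = a;
            }
        }
        return best;
    }

    /** Runs epsilon-greedy Q-learning; the only reward is 1.0 on reaching the goal. */
    static double[][][] train(int episodes, long seed) {
        double alpha = 0.5, gamma = 0.9, epsilon = 0.2; // assumed, untuned parameters
        Random rnd = new Random(seed);
        double[][][] q = new double[MAZE.length][MAZE[0].length()][4];
        for (int e = 0; e < episodes; e++) {
            int r = 0, c = 0; // the start cell 'S' is the top-left corner
            while (MAZE[r].charAt(c) != 'G') {
                // Explore with probability epsilon, otherwise use current estimates.
                int a = rnd.nextDouble() < epsilon ? rnd.nextInt(4) : pickGreedy(q[r][c], rnd);
                int[] n = step(r, c, a);
                boolean atGoal = MAZE[n[0]].charAt(n[1]) == 'G';
                double[] next = q[n[0]][n[1]];
                // Reward 1.0 at the goal, otherwise the discounted best next estimate.
                double target = atGoal ? 1.0 : gamma * next[pickGreedy(next, rnd)];
                q[r][c][a] += alpha * (target - q[r][c][a]);
                r = n[0];
                c = n[1];
            }
        }
        return q;
    }

    /** Steps taken when always following the best learned action, or -1 if the goal is missed. */
    static int greedySteps(double[][][] q) {
        Random rnd = new Random(0);
        int r = 0, c = 0;
        for (int steps = 0; steps <= MAZE.length * MAZE[0].length(); steps++) {
            if (MAZE[r].charAt(c) == 'G') return steps;
            int[] n = step(r, c, pickGreedy(q[r][c], rnd));
            r = n[0];
            c = n[1];
        }
        return -1;
    }

    public static void main(String[] args) {
        double[][][] q = train(2000, 42L);
        System.out.println("Greedy path length after training: " + greedySteps(q));
    }
}
```

The shortest route in this particular maze takes 5 moves, so a well-converged
run should report a length close to that. Note how many episodes even this
very small maze is given here.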
Maze solver
This application is used in the following publication:
[1] FRÄMLING, Kary. Reducing state space exploration
in reinforcement learning problems by rapid identification of initial solutions
and progressive improvement of them. To appear in Proceedings of the 3rd
WSES International Conference on Neural Networks and Applications (NNA'02).
Interlaken, Switzerland, 11-15 February 2002.
Maze route finding is commonly used in studies of animal
learning and behavior. Animals have to explore the maze and construct an
internal model of the maze in order to reach the goal. The more maze runs
the animal performs, the quicker it reaches the goal, since solutions become
better memorized.
Maze route finding is not a very complicated problem
to solve with many existing methods, such as depth-first and breadth-first
search or reinforcement learning methods like temporal difference (TD)
learning or Q-learning. Ordinary search methods usually require a model of
the problem being solved and are not good at handling non-static problems,
i.e. problems that change over time. Many reinforcement
learning methods do not require a model of the problem space and they are
also usually capable of handling changes in the problem space. The main
problem of currently existing reinforcement learning methods is their need
for exhaustive exploration of the entire problem space, which often means
very long learning times. As the number of possible states grows, current
methods become unusable. Big mazes are one example of such problems, where
state space reduction by generalisation and function approximation is not
possible.
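As a point of comparison, here is a minimal sketch of breadth-first search on
a small grid maze. It is not part of the maze solver; the maze encoding ('S',
'G', '#', '.') and the class and method names are assumptions made for this
example, and the code is written for a current JDK. The search reads the whole
maze layout directly, which is exactly the model of the problem that ordinary
search methods need and that an exploring agent does not have.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

/** Breadth-first search for the shortest route in a grid maze (illustrative sketch). */
class BfsMazeSketch {
    /**
     * Returns the number of moves on a shortest route from 'S' to 'G',
     * or -1 if the goal cannot be reached. '#' marks a wall.
     */
    static int shortestPath(String[] maze) {
        int rows = maze.length, cols = maze[0].length();
        int[][] dist = new int[rows][cols];
        for (int[] row : dist) {
            Arrays.fill(row, -1); // -1 means "not visited yet"
        }
        ArrayDeque<int[]> queue = new ArrayDeque<>();
        // The search needs the full maze up front, starting with the start cell.
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                if (maze[r].charAt(c) == 'S') {
                    dist[r][c] = 0;
                    queue.add(new int[] {r, c});
                }
            }
        }
        int[] dr = {-1, 1, 0, 0}; // up, down, left, right
        int[] dc = {0, 0, -1, 1};
        while (!queue.isEmpty()) {
            int[] cur = queue.poll();
            if (maze[cur[0]].charAt(cur[1]) == 'G') {
                return dist[cur[0]][cur[1]];
            }
            for (int a = 0; a < 4; a++) {
                int nr = cur[0] + dr[a], nc = cur[1] + dc[a];
                if (nr < 0 || nr >= rows || nc < 0 || nc >= cols) continue;
                if (maze[nr].charAt(nc) == '#' || dist[nr][nc] != -1) continue;
                dist[nr][nc] = dist[cur[0]][cur[1]] + 1;
                queue.add(new int[] {nr, nc});
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String[] maze = {
            "S..#",
            ".#..",
            "...G"
        };
        System.out.println("Shortest route: " + shortestPath(maze) + " moves");
    }
}
```

If a wall is added or removed, the search has to be given the new layout and
run again from scratch, which is the weakness with changing problems mentioned
above.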
BIMM's goal is to reduce the state space exploration to a minimum so
that a "usable" solution is identified very rapidly (often within a single
episode). Better solutions can then be searched for when there "is time
for it" by balancing between old knowledge stored in long-term working memory
and random exploration favored by short-term working memory.
Use of maze solver
The maze solver demonstration applet is started by the button below
(if you have a browser that supports JDK 1.3 compiled applets):
It is actually a Java application that merely modifies its behaviour when
run as an applet. It is also a program developed purely for research
purposes and is therefore far from bug-free. It should actually be
completely rewritten, like most computer programs written for testing
different research ideas, but I'll do that when I retire :-).
The following steps are suggested to get started:
- Click on the button above.
- Either 1) select "New Maze" from the "Maze" menu to automatically create
an initial maze of the size you like or 2) select "Sutton&Barto Maze"
from the "Applet Demonstration" menu.
- Select "Go to goal" from the "Neural Net" menu, which sends the ANN
"agent" on its first episode. Once finished, you get a dialog where you
should choose "Store in long-term memory" and then "Close".
- Select "Reset" from the "Maze" menu to put the agent back at the start
position.
- Select "Go to goal" again, which sends the ANN "agent" on its second
episode. Once finished, you get the same dialog box as the previous time,
but this time the number of steps is typically much smaller and close to
"usable".
- Searching for optimal solutions can be done in two ways, either
by the command "Find optimal path by multiple agents..." or by the command
"Find optimal path by exploring agent..." in the "Neural Net" menu. More
information about the use of these commands can be found in [1].
You can also modify the maze as you like and do some other interesting
things, but more about that later...
Important! The current application is compiled with JDK 1.3.1, so
it might not work with older browsers such as Netscape 4.x. It
should, however, work fine with Netscape 6.2 and probably also with the
newest versions of Internet Explorer. An alternative solution is to
download the JAR file, which contains the whole application, and then start
it either with the command "java fi.kf.maze.MazeSolverFrame", which runs it
as an application, or with the command "appletviewer BIMMdemo.html", which
runs it exactly as it would run in a browser. However, this requires that you
have a Java Development Kit 1.3 (1.2 might also work) installed, or at least
the necessary parts of it.
Bugs
- When finding the optimal path with multiple agents, the best one
found is not always used in the first run made with it after closing the
dialog window. It does appear to be used in the following runs, but this
remains to be studied.
Future work
To improve this page, among other things...
The main challenge for the moment is to apply BIMM to problems with
continuous-valued inputs instead of lookup-table state spaces.
Last updated 9 January 2002.