Week 13: Final Project


This project experiments with various uses of a reinforcement machine learning algorithm called Q-learning.  A common (and simple) Q-learning example is the Frozen Lake problem.  This example consists of a grid of tiles.  The first tile is the start tile and the last tile is the goal tile.  Each of the remaining tiles is either frozen (safe) or a hole (unsafe).  An agent is placed on the start tile and has the goal of reaching the goal tile without stepping in any holes.  This is a simple problem where the Q-learning algorithm can be applied.  From a given tile the agent can move in any of four directions: left, up, right, or down.  A Q-table is a matrix of values where each row represents a tile (state) and each column represents a possible direction (action).  An example of a two-by-two frozen lake and its Q-table is shown in Figure 1.

Figure 1

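In code, the Q-table for a two-by-two lake is just a 4x4 matrix of zeros: four states (tiles) by four actions (directions).  A minimal JavaScript sketch (the function and action names here are my own, not the actual project code):

```javascript
// Build a Q-table for an n-by-n frozen lake: one row per tile (state),
// one column per action (left, up, right, down), all entries starting at 0.
const ACTIONS = ["left", "up", "right", "down"];

function makeQTable(gridSize) {
  const numStates = gridSize * gridSize;
  return Array.from({ length: numStates }, () =>
    new Array(ACTIONS.length).fill(0)
  );
}

const qTable = makeQTable(2); // 2x2 lake -> 4 states x 4 actions
```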

Each element in the Q-table is initialized to zero.  Thus, for any given state, the agent is equally likely to take any of the four actions.  The agent uses the Q-table to guide it through the frozen lake map.  If the agent falls in a hole, it goes back to the start tile and tries again.  If the agent finds its way to the goal, it is given one point of reward for that episode.  The values in the Q-table then begin to change depending on the state and action.  Once the agent has attempted to complete the map a certain number of times (episodes), the Q-table gives higher values to safe actions and lower values to unsafe actions, and the agent can solve the map.
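The core of the training loop is the standard Q-value update rule used in Juliani's tutorial.  A sketch in JavaScript, where the learning rate and discount factor are typical tuning values I've assumed, not the project's actual settings:

```javascript
// One Q-learning update:
//   Q[s][a] += lr * (reward + gamma * max(Q[s']) - Q[s][a])
// lr (learning rate) controls how fast values move; gamma (discount)
// controls how much future reward matters.
function updateQ(qTable, state, action, reward, nextState, lr = 0.8, gamma = 0.95) {
  const bestNext = Math.max(...qTable[nextState]); // best value from next tile
  qTable[state][action] += lr * (reward + gamma * bestNext - qTable[state][action]);
}
```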

Arthur Juliani has created a wonderful tutorial that uses TensorFlow to implement Q-learning examples.  Aidan Nelson, a fellow ITP student, has been working to translate these tutorial examples into JavaScript.  I used these examples to begin experimenting further with Q-learning tables.  A big thank you to Arthur and Aidan for providing the baseline code for this project.

For this project I built an interactive website where users can build a frozen lake map and train the agent on the map.  I also began to play around with the idea of using this frozen lake example as a way to visualize data. 


For much of this project I used the resources I had at ITP to build an understanding of tf.js and reinforcement Q-learning.  I began by following Aidan's example and creating a simple frozen lake program where the agent can learn the map (Figure 2).

I then added a user interaction to this example.  Figure 3 shows an example of this interaction.  The user can make a custom map and submit it for the agent to be trained on.  This example can also be played with at this link.

I then created a webpage where you could choose to make a custom map or to use a data map.  For the data map I uploaded my own data about my gym activities.  Safe spots on the map were days I went to the gym, while unsafe spots were days that I did not go to the gym.  The idea was that the agent would have to navigate through my data to get to the goal.  The more I went to the gym, the easier it would be for the agent to find the goal.  This was a simple experiment to explore how data could be implemented into this example.  I would like to explore this concept further in future iterations of the application.
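The data-to-map translation can be sketched roughly like this (the tile letters and function name are illustrative stand-ins, not the actual site code):

```javascript
// Turn a run of daily gym data (true = went to the gym) into frozen-lake
// tiles: "S" start, "G" goal, "F" frozen/safe, "H" hole/unsafe.
function dataToMap(wentToGym) {
  const tiles = wentToGym.map(went => (went ? "F" : "H"));
  return ["S", ...tiles, "G"]; // start tile, then data tiles, then goal
}
```

More gym days mean more "F" tiles, so the agent needs fewer episodes to find a safe path through the data.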

Embedded below is the final webpage. This example can also be viewed here.

Conclusion/Future Steps:

While this project acted as an exploration into Q-learning and its potential implementations, I would like to go much further.  One feature that I would like to make clearer in a future iteration is the training process of Q-learning.  While I attempted to do this and was unsuccessful, I would like to try to visualize the Q-table as the agent is training.  This would more clearly show the user how reinforcement learning works and explain the algorithm to users who have never heard of it before.  Mostly, I would like to further explore this example as a way to visualize data.  The gym data set was a good start, but I would like to see how I could improve on this concept.  First and foremost, I would need to implement a neural network in order to allow the frozen lake table to scale to a higher resolution.  Then I could perhaps visualize larger amounts of data and introduce multiple agents.  Perhaps this visualization could turn into a habit motivation game where you keep track of a habit and your agent “friend” gains lives when it is able to reach the goal.  Overall, this project was a great first step into exploring these Q-learning implementations.  I would definitely like to continue working on the data side of this project in order to build an interesting visualization.

Links to project:

Final Webpage

Final Code 


  1. https://github.com/AidanNelson/reinforcement-learning-experiments/tree/master/simple-rl-tutorials/0-q-learning-agents

  2. https://medium.com/emergent-future/simple-reinforcement-learning-with-tensorflow-part-0-q-learning-with-tables-and-neural-networks-d195264329d0

  3. https://www.youtube.com/playlist?list=PLRqwX-V7Uu6YIeVA3dNxbR9PYj4wV31oQ

Week 12: Final Project Progress

This week I began by adding a user interaction to the example I made last week (Figure 1).  The user can make their own custom map by selecting a tile.  When they are ready to submit their map they can train and run the agent through this map space. 

After this test I began to move the code to a more user-friendly webpage.  The webpage was built to prepare for user testing.  Figure 2 shows the user interaction.  The user can choose from two menu options.  If they choose to make their own custom map, they will be guided to build and train a custom map.  If they choose to use a data map, they can select a data set about my gym habits.  This data is translated into a map that the agent can be trained on.

In the future I plan to add more data sets.  After user testing I plan to get feedback on the different approaches I am taking and perhaps see how the design can be improved.


Example Code

Play Example

Week 11: Final Project Progress

This week I spent my time learning about the Q-learning algorithm and implementing it in JavaScript.  I found Arthur Juliani's article series to be the best explanation of simple Q-learning.  Using his Python script and Aidan Nelson's implementation of this code in JavaScript, I implemented a simple Q-learning table for my own purposes (Figure 1).

A big thank you to Aidan Nelson for his example as it provided a baseline for my implementation.  To view my code visit this link and to view my example visit this link (press t on the keyboard to train and then r on the keyboard to run).

Now that I better understand the Q-table algorithm, I have a few steps that I would like to take in the coming weeks.  First, I would like to add an interactive component to the sketch so that, before training the agent, the user can change the maze environment.  Then I would like to use this method to visualize data.  I would like to develop a series of environments based on real data and run the agent through them.  For example, the data could be information about my gym habits.  If I go to the gym on a given day, the maze would show a safe tile at the corresponding spot; if I do not go, the maze would show an unsafe tile there.  If I went to the gym a lot, there would be many safe spots and the agent would have an easy time completing the maze (less training required).  If I skipped the gym a lot, there would be many unsafe spots and the agent would have a very difficult time completing the maze (more training required).  I would like to play around with this idea of how the maze can visualize your data in an interesting way and motivate your actions.  Ultimately, I would like to scale this up so that the maze could be much larger than 5x5 or 4x4.  In that case I may need to implement a neural network to more efficiently work through the Q-table.


  1. https://github.com/AidanNelson/reinforcement-learning-experiments/tree/master/simple-rl-tutorials/0-q-learning-agents

  2. https://gist.github.com/awjuliani/9024166ca08c489a60994e529484f7fe#file-q-table-learning-clean-ipynb

  3. https://medium.com/emergent-future/simple-reinforcement-learning-with-tensorflow-part-0-q-learning-with-tables-and-neural-networks-d195264329d0

Week 10: Final Project Proposal


I was inspired by last week's discussion on reinforcement Q-learning and wanted to look more into this concept.  I see potential for many applications of Q-learning and want to take time applying it to my own scenario to understand the process.  My plan is to use the example we saw last week in class by Aidan Nelson as a baseline and build on his example.  Hopefully, I will be able to make the project interactive, so that users can manipulate the Q-table.  Ultimately, this will be an exploration for myself to learn about this type of machine learning while also providing an opportunity for others to see an agent learn using reinforcement Q-learning in various scenarios.

Source Material/Code:

  1. Aidan Nelson’s code

  2. Reinforcement Learning Tutorial

  3. My starter code

Implementation options:

  • grid/ automated maze

  • automated frogger

  • other ideas?

Current one sentence description:

This project will create a maze field that the user can manipulate, and an automated agent that will learn to navigate it using reinforcement Q-learning.

Week 9: Neural Networks

In addition to Nature of Code, I am also taking the Machine Learning for the Web class at ITP.  Conveniently, this week in both classes we went over the Teachable Machine and different methods like regression and KNN classification. Thus, this week I took some time going through the Teachable Machine tutorials and examples and built my own image classifiers. Ultimately, I built a program that allows you to type a word on a screen using sign language.  Figure 1 shows the process of training this program.

Figure 2 shows the program after it has been trained.  The user signs the letters and writes the word “hello.” 

Overall, this program works well; however, I wonder how well it would work if I saved a model trained on my hand and then someone else tried to use the program without adding any new images to the model.  Ideally, after this program is trained, a user could sign without ever having to touch the keyboard.  The way the program is built now, when the machine detects a letter it is printed below the canvas.  A letter is only added to the text on the canvas when the user presses the enter key.  Perhaps in future iterations an interaction can be made to tell the program when to add a letter without the use of the keyboard. With more experimentation I can see how this project could have the potential to more intelligently type using sign language. Perhaps the machine could also dictate what the user is signing.


The Example

The Code

Week 8: Genetic Algorithm

This week I reviewed the genetic algorithm materials and followed this video to create my own target-seeking genetic algorithm.  In order to better see how each new population relates to the previous population, I tracked the particles and used some alpha (Figure 1).

I think this example does a good job of showing how a particle with a good fitness closely, but not exactly, follows the genes of a previous particle.  This way of looking at the genetic algorithm makes it clearer to me that everything is working properly and how exactly the populations are changing.
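The "closely but not exactly" behavior comes from crossover plus mutation when a new population is bred.  A sketch in the spirit of the video's target-seeking example, where each gene is a steering angle; the midpoint split and mutation rate are assumptions on my part:

```javascript
// Midpoint crossover: the child takes the first half of parent A's genes
// and the second half of parent B's.
function crossover(parentA, parentB) {
  const mid = Math.floor(parentA.length / 2);
  return [...parentA.slice(0, mid), ...parentB.slice(mid)];
}

// Mutation: each gene has a small chance of being replaced with a new
// random steering angle, so children never copy their parents exactly.
function mutate(genes, rate = 0.01) {
  return genes.map(g => (Math.random() < rate ? Math.random() * Math.PI * 2 : g));
}
```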


Click here to view this example.

Click here to view the code. 

Weeks 5, 6 & 7: Simulation Experiments

For this project I played with a number of different experiments.  First I followed this sample code to practice building a simple flocking system.  Figure 1 shows the resulting flocking system.  The code for this example can be found here.

I decided to build on this flocking system by adding an interactive element to the screen.  Using ml5.js poseNet, I added a flee behavior away from the user's nose.  As the user moved, the camera would detect where their nose was, and the particles would experience a strong force repelling them from it (Figure 2).  This example can be viewed here and the code can be viewed here.
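The flee behavior boils down to a force pointing away from the nose that gets stronger as a particle gets closer.  A rough sketch with plain-object vectors (in the actual sketch this would be done with p5.Vector; the strength and radius values are illustrative):

```javascript
// Repel a particle from a target point (e.g. the nose position from poseNet).
// Returns a zero force when the particle is outside the repel radius.
function fleeForce(particle, target, strength = 5, radius = 100) {
  const dx = particle.x - target.x;
  const dy = particle.y - target.y;
  const d = Math.sqrt(dx * dx + dy * dy);
  if (d === 0 || d > radius) return { x: 0, y: 0 }; // out of range: no push
  const mag = strength / d; // stronger when closer
  return { x: (dx / d) * mag, y: (dy / d) * mag }; // points away from target
}
```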

At this point, I decided to shift gears a bit.  I was noticing various posts on the r/creativecoding subreddit of examples using complex attractors.  I was particularly drawn to this example, where the particles almost look like they are moving through a complex attractor flow field.  I set out to recreate this example in hopes of building on it for a future project.  I recognized that this may be a longer-term project, so I simply began by building a simple flow field example following this sample code. I began to play around with the equations that generate the vectors behind the flow field to get different effects.  Figure 3 shows a test where I manipulated the flow field randomly but started the particles at the top and bottom of the screen to get a nice growing effect.

Figure 4 shows the same example, but I adjusted the flow field equation to make a more predictable visual.  The code for this example can be found here and the example can be viewed here.
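A flow field is just a grid of angles turned into unit vectors; each particle steers along whichever cell it is currently in.  A minimal sketch, where the angle formula is my own stand-in for the noise/attractor equations I was experimenting with:

```javascript
// Build a cols-by-rows flow field stored as a flat array of unit vectors.
// The angle formula here sweeps smoothly across the grid; swapping it out
// is what changes the visual character of the field.
function buildFlowField(cols, rows) {
  const field = [];
  for (let y = 0; y < rows; y++) {
    for (let x = 0; x < cols; x++) {
      const angle = (x / cols) * Math.PI + (y / rows) * Math.PI;
      field.push({ x: Math.cos(angle), y: Math.sin(angle) }); // unit vector
    }
  }
  return field;
}
```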

From here it was clear that I needed to understand the behaviors of the complex attractor I wanted to use in order to implement its equation in a flow field.  Thus, I took this example from the creative coding subreddit and added sliders to manipulate the constants.  Figure 5 shows how you can interact with this example.  The code can be found here and the example can be found here. All of these steps were certainly beginning steps for a larger project where I build with these complex attractors and create an interactive simulation.

Week 4: Systems and Inheritance

This week I experimented with particle systems and inheritance to develop a rain cloud system.  First I practiced making one particle system with particle lifespans (Figure 1).

I then added multiple particle system instances inside a particle system class (Figure 2).

Figure 2

This example can be found here and the code can be found here. I then began to make the rain/cloud sketch.  I started by making a generic particle class.  I then made a raindrop class that extended the particle class and changed its appearance and motion to look more like rain (Figure 3).

Figure 3
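The inheritance pattern can be sketched like this (the specific speeds and lifespan values are illustrative, not the sketch's actual numbers):

```javascript
// A generic particle with a lifespan, and a raindrop that inherits from it
// but overrides the motion to fall faster.
class Particle {
  constructor(x, y) {
    this.x = x;
    this.y = y;
    this.lifespan = 255; // used as alpha, fades to "dead"
  }
  update() {
    this.y += 1;        // generic particles drift down slowly
    this.lifespan -= 2;
  }
  isDead() {
    return this.lifespan <= 0;
  }
}

class Raindrop extends Particle {
  update() {
    this.y += 8;        // rain falls faster than a generic particle
    this.lifespan -= 2;
  }
}
```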

Next, I created a cloud particle class that also extended the generic particle class.  I then made two systems, one for rain and one for the clouds.  In the main draw loop, 10 systems of each type are made and displayed on the screen to make a rainy weather landscape. This example can be found here and the code can be found here. Figure 4 shows the final result.

Week 3: Oscillations & Rotation

This week I experimented with making spirals, manipulating them with concepts from Fourier transforms, and using oscillations.  I began with a simple sketch where I used polar coordinates to build a spiral (Figure 1). 

Figure 1

I then manipulated the radius of the spiral using concepts from this 3Blue1Brown video.  Figures 2 & 3 show patterns that I developed using these concepts.  You can view these examples here and find the code here.

Figure 2

Figure 3
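The polar-to-Cartesian step behind the spiral can be sketched as follows (the growth constant and step size are arbitrary choices):

```javascript
// Archimedean spiral via polar coordinates: the radius grows with the
// angle, then each (r, theta) pair converts to (x, y) for drawing.
function spiralPoints(turns, step = 0.1, growth = 2) {
  const pts = [];
  for (let theta = 0; theta < turns * Math.PI * 2; theta += step) {
    const r = growth * theta; // radius grows linearly with angle
    pts.push({ x: r * Math.cos(theta), y: r * Math.sin(theta) });
  }
  return pts;
}
```

Manipulating the `r = growth * theta` line (for example, modulating the radius with extra sine terms) is what produced the Fourier-style patterns.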

At this point I changed gears and began to focus on oscillations.  I saw this post on the creative coding subreddit and was inspired to combine the idea with the oscillating symbol that appears on the Apple Watch when you use the Breathe app.  I began by creating a simple oscillating circle that grows and shrinks (Figure 4).

Figure 4

I then added an angular acceleration to this object (Figure 5).

Figure 5

From here I created a class for each square and added many squares to the canvas.  Each square had a unique acceleration and amplitude (Figure 6).
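Each square's oscillation can be sketched as an angle driven by an angular velocity that itself grows by an angular acceleration, with the size following the sine of that angle (the numbers here are illustrative):

```javascript
// An oscillator whose size follows sin(angle); the angle advances by a
// velocity that grows by an angular acceleration each frame, so the
// oscillation speeds up over time.
class Oscillator {
  constructor(amplitude, acceleration) {
    this.amplitude = amplitude;
    this.acceleration = acceleration;
    this.velocity = 0;
    this.angle = 0;
  }
  step() {
    this.velocity += this.acceleration;
    this.angle += this.velocity;
    return this.amplitude * Math.sin(this.angle); // current size
  }
}
```

Giving every square its own `amplitude` and `acceleration` is what makes the field of squares drift in and out of phase.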

Finally, I added a gradient to the canvas using this code as a reference. The final result reminds me of a star in the night sky (Figure 7).  You can view this example here and find the code here.

Week 2: Vectors & Forces

This week I experimented with createVector() and applying forces to objects in p5.js.  I began by making a simple bouncing ball with an acceleration down to simulate gravity (Figure 1). 

Figure 1

Then I converted the acceleration to a gravity force and used an applyForce() function to apply gravity.  With this setup, I could add additional forces to the ball like a wind force with the click of my mouse (Figure 2).  The code for this example can be found here.

Figure 2

From here I created many instances of balls falling and incorporated mass into the acceleration equation.  This created a nice rain effect (Figure 3).

Figure 3
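With mass in the mix, applyForce() divides each incoming force by mass before accumulating it into acceleration (Newton's second law, a = F/m).  A sketch with plain-object vectors; the real sketch does this with p5.Vector:

```javascript
// Accumulate forces into acceleration with a = F / m, then integrate
// velocity and position, clearing the acceleration each frame.
class Ball {
  constructor(mass) {
    this.mass = mass;
    this.pos = { x: 0, y: 0 };
    this.vel = { x: 0, y: 0 };
    this.acc = { x: 0, y: 0 };
  }
  applyForce(f) {
    this.acc.x += f.x / this.mass; // heavier balls accelerate less
    this.acc.y += f.y / this.mass;
  }
  update() {
    this.vel.x += this.acc.x;
    this.vel.y += this.acc.y;
    this.pos.x += this.vel.x;
    this.pos.y += this.vel.y;
    this.acc.x = 0; // forces only act for the frame they are applied
    this.acc.y = 0;
  }
}
```

Gravity and wind are then just two calls to `applyForce()` per frame, which is what makes adding the mouse-triggered wind force so simple.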

For my final sketch I used these concepts to create a repelling field.  As the mouse moves through a field of particles, they experience a repelling force away from the mouse.  The code for this example can be found here.  Some examples of this sketch are shown in Figures 4 & 5.

Figure 4

Figure 5

Week 1: Random Walk & Perlin Noise

A note on my workflow:

Since I was in ICM last semester, I decided to challenge myself by creating a new workflow and trying to step away from the p5.js web editor.  While I was not able to get the Node.js live-server working (I plan to in the coming weeks), I was able to start a local server and use my text editor to develop my code.  Through this process I made a folder system that worked well with this workflow.  I have a template p5.js folder with all the necessary files (index.html, sketch.js, etc.) that I will keep as a template to copy throughout the semester.  I then created another folder with a copy of all of those files and named it runningCode.  This is the directory that I used to start my local server.  Since I wanted to keep different parts of my homework code as reference, I swapped out the sketch.js file in the runningCode folder.  This way I could work on a simple random walk example, then experiment with Perlin noise, and keep these examples in two separate sketch.js files.  I enjoyed developing this process and anticipate modifying it more throughout the semester.

My process:

This week I developed my code in stages. I began by building a generic random walk example as shown in Figure 1.

Figure 1: Random Walker

In this example, every time I load the page it begins the random walk in a random spot.  The code constrains the random walk from beginning too close to the edges of the sketch.  Each time the page is loaded the step amount and dot size is randomly generated as well.  The code for this example can be found here

Then I recreated a random walk in another sketch but limited the movement to the left and right (Figure 2).

Figure 2: One Dot Random Walk

Using the Perlin noise method I smoothed out this movement (Figure 3).

Figure 3: One Dot Perlin Noise
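The difference from the random walk is that the position is sampled from a smooth function of time instead of jumping by a random step each frame.  In p5.js this is just `noise(t)` with a slowly increasing `t`; since that function only exists inside a sketch, the stand-in below is a tiny 1D value noise of my own (a deterministic lattice with smoothstep interpolation), just to show the idea:

```javascript
// A minimal 1D value noise: pseudo-random values at integer points,
// smoothly interpolated in between, so nearby inputs give nearby outputs.
function valueNoise1D(t) {
  const i = Math.floor(t);
  const f = t - i;
  // Deterministic pseudo-random value in [0, 1) for each integer lattice point.
  const hash = n => {
    const s = Math.sin(n * 127.1) * 43758.5453;
    return s - Math.floor(s);
  };
  const u = f * f * (3 - 2 * f); // smoothstep easing between lattice points
  return hash(i) * (1 - u) + hash(i + 1) * u;
}
```

Mapping this output to the canvas width, with `t` incremented a little each frame, gives the smooth side-to-side glide in Figure 3 rather than the jitter of Figure 2.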

I enjoyed this motion and thought I could get an interesting effect with many of these slider-type circles on the screen.  Thus, I added an array of walkers that appear on the screen as I click on the canvas (Figure 4).

Figure 4: Array of Perlin Noise Dots

As you can see, the alpha of each circle is also changing using the random walk method.  From here I began to experiment with this concept but added color and changed the circles to long rectangles (Figure 5).

As the canvas became more populated with walkers, I enjoyed the generative pattern it was creating.  Thus I continued to experiment with this sketch to see what other patterns I could make.  Eventually I adjusted the x and y Perlin noise movement of the walkers and came up with the pattern shown in Figure 6.

Figure 6a: Horizontal Color Modification

Figure 6b: Vertical Color Modification

Figure 7 is a black and white version of the previous images which I tend to like more.

Figure 7a: Vertical Grayscale Modification

Figure 7b: Horizontal Grayscale Modification

Figure 8 shows the final result.

Figure 8a: Final Result (faster alpha random walk modification)

Figure 8b: Final Result (slower alpha random walk modification)


Overall, I am happy with the pattern that results from this iterative process.  Since I spent some time setting up my workflow, I did not get into too much detail with this project; however, I see potential for more iterations of this idea.  In the future I may want to add music or notes to amplify this pattern.  It also may be interesting to create a series of these moving patterns for a larger display.