Assignment 1 (Autumn Term 2024)
Questions
1. Vacuum cleaner
Consider the Simple Reflex Agent in the Two-square Vacuum-cleaner World studied in w01_vacuum_world_revised.ipynb in your seminar
(linked here, see code blocks 30 and 31 for the environment and agent definitions). We modify the environment so that after dirt has been cleaned in a square, new dirt will be generated in the same square after Y time steps, where Y ∼ Geom(p) for p = 1/15 (i.e. if the square is clean at any time point, the probability of having dirt in the square at the next time point isp). Let RT denote the average reward in T time steps for this Simple Reflex Agent.
i. Write down the PEAS description of the task environment (using the reward function as in the IPython notebook). Is the environment fully observable? Is the environment deterministic?
ii. Compute limT→ ∞ T −1ERT . You may derive it mathematically or write a piece of code to provide an approximate solution (to two decimal places) for this part. You do not need to submit any code used.
iii. Is the simple reflex agent described in (ii) rational? Why or why not?
iv. Explain why model-based agents can be more suitable for this particular environment.
2. Maze exploration
Consider the following maze in a 7-by-7 grid world. An agent is exploring the maze from
At each square, the agent will always try forward , turn left and turn right in this order. For example, after reaching square X , the agent will try first to move East, and then try to move North (which it cannot move due to a wall), before trying to move South.
i. If the agent explores using a depth-first graph search algorithm, write down the order in which it visits squares A , B , C , D , E and F before reaching the Finish (not all squares may be visited).
ii. If the agent explores using a breadth-first graph search algorithm, write down the order in which it visits squares A , B , C , D , E and F before reaching the Finish (not all squares may be visited).
iii. In the previous two parts, will your answers change if the agent used depth-first tree search and breadth-first tree search respectively? Why?
iv. The agent now uses an A* search algorithm with the heuristic function
h(x) = √−(x1(−)−−f1(−−)−)2(−)−(x2(−−)f2(−−),
where x = (x1, x2) is the coordinate value of the current square (with (0,0) for the top-left square) and f = (f1, f2) = (6, 3) is the coordinate value of the Finish square. Is this heuristic function admissible? Why?
3. Sokoban
In this task, you are required to submit Python code that solves a variant of the Sokoban (Storekeeper) game using the A* search algorithm. You will modify the provided sokoban.py file to solve the puzzle based on the game rules and provided input.
Game Rules
Game setup: The game is initialised with a 7x7 grid world where each square can either be an obstacle * , a free square (blank space), a storekeeper S , a box B or a target location # . You may assume that there is only one S , one B and one # , and that the entire
perimeter of the grid world are lined with * .
Movement: You control the storekeeper. In each step, your action is one of ‘N’ (move north), ‘S’ (move south), ‘E’ (move east), ‘W’ (move west). The storekeeper cannot walk into an obstacle ( * ). The storekeeper can push the box B (by walking towards it) only if the square behind the box (in the direction of the push) is free (not an obstacle).
Cost: The floor of the store is sloped so it is difficult to push the box North. Each push in the North has a cost of 0.2, and a push in South, East or West has a cost of 0.1. In addition, each non-pushing movement of the storekeeper incurs a cost of 0.01.
Example
The 7x7 character array in input0.txt encodes the initial game configuration as shown in the leftmost panel below:
A sequence of moves NNWNNESSSNNNEESSSSWW would have solved the puzzle (though not necessarily optimally) with a total cost of 0.2 × 1 + 0.1 × 4 + 0.01 × 15 = 0.75. Some intermediate game states are shown above.
Your Task
You are provided with three files on Moodle:
. sokoban.py : main code file for solving the Sokoban puzzle using A* search.
search4e.py : a slight modification of code in search4e.ipynb in your AIMA library that implements various search algorithms. input0.txt : an example input instance.
Please place all three files in the same working directory. Your task is to fill in the blanks in sokoban.py so that the code reads in an input in the same format as input0.txt and outputs a character string encoding the sequence of actions for solving the Sokoban puzzle. You only need to submit sokoban.py , renamed using your candidate number according to the submission instruction below. Please do not modify code outside designated blanks in sokoban.py and do not change search4e.py , otherwise your submitted code may not run properly.