Commit 4c72f4ed authored by MIRANDA GONZALES Marcelo

Delete compare_all_randoms.ipynb

parent e84e046e
%% Cell type:markdown id: tags:
<h1 style="background-color: gray;
color: black;
padding: 20px;
text-align: center;">INFO</h1>
In this script, we compare players `Random1`, `Random2` and `Random3` in a game where there is only one cheese to catch in a maze without mud. \
All programs are evaluated on the same game configurations. \
We do not show the game interface here, to make the script faster. \
The goal is to compare the performance of the different random players under the same conditions.
%% Cell type:markdown id: tags:
<h1 style="background-color: gray;
color: black;
padding: 20px;
text-align: center;">IMPORTS</h1>
%% Cell type:code id: tags:
``` python
# External imports
import sys
import os
import tqdm.auto as tqdm
import matplotlib.pyplot as pyplot
import scipy.stats as scstats
# Add needed directories to the path
sys.path.append(os.path.join("..", "players"))
# PyRat imports
from pyrat import Game, GameMode
from Random1 import Random1
from Random2 import Random2
from Random3 import Random3
```
%% Cell type:markdown id: tags:
<h1 style="background-color: gray;
color: black;
padding: 20px;
text-align: center;">CONSTANTS</h1>
In this script, we play multiple independent games. \
The goal is to collect enough statistics to draw conclusions about which algorithm performs better. \
This constant defines how many games are played for each player.
%% Cell type:code id: tags:
``` python
# Determines how many games will be played for each player
NB_GAMES = 1000
```
%% Cell type:markdown id: tags:
Let's configure the game with a dictionary. \
Note that we set the game mode to `SIMULATION` to run all games as fast as possible.
%% Cell type:code id: tags:
``` python
# Customize the game elements
CONFIG = {"mud_percentage": 0.0,
"nb_cheese": 1,
"game_mode": GameMode.SIMULATION}
```
%% Cell type:markdown id: tags:
<h1 style="background-color: gray;
color: black;
padding: 20px;
text-align: center;">RUN THE GAMES</h1>
Let us now run all games. \
For each game, we record the number of turns needed to complete it.
%% Cell type:code id: tags:
``` python
# Players to test (keys are legends to appear in the plot)
players = {"Random 1": {"class": Random1, "args": {}},
"Random 2": {"class": Random2, "args": {}},
"Random 3": {"class": Random3, "args": {}}}
# Run the games for each player
results = {player: [] for player in players}
for key in players:
    for seed in tqdm.tqdm(range(NB_GAMES), desc=key):
        # Make the game with given seed
        game = Game(random_seed=seed, **CONFIG)
        player = players[key]["class"](**players[key]["args"])
        game.add_player(player)
        stats = game.start()
        # Store the number of turns needed
        results[key].append(stats["turns"])
```
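%% Cell type:markdown id: tags:
Before plotting, it can help to glance at a few summary numbers per player. \
The following is a minimal sketch, not part of the original notebook: `example_results` is a hypothetical stand-in with made-up values that has the same shape as `results`, and `summarize` is an illustrative helper name.
%% Cell type:code id: tags:
```python
import statistics

# Hypothetical stand-in for the `results` dictionary built above
# (made-up numbers, not actual game outcomes)
example_results = {"Random 1": [120, 340, 95, 410, 230],
                   "Random 2": [80, 150, 60, 200, 110]}

def summarize(turns_per_player):
    """Return the mean, median and maximum number of turns for each player."""
    summary = {}
    for name, turns in turns_per_player.items():
        summary[name] = {"mean": statistics.mean(turns),
                         "median": statistics.median(turns),
                         "max": max(turns)}
    return summary

# Print one line of summary statistics per player
for name, values in summarize(example_results).items():
    print(name, values)
```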
%% Cell type:markdown id: tags:
<h1 style="background-color: gray;
color: black;
padding: 20px;
text-align: center;">ANALYZE THE RESULTS</h1>
Now that all games have been played, we plot the percentage of games completed as a function of the number of turns elapsed.
%% Cell type:code id: tags:
``` python
# Visualization of cumulative curves of numbers of turns taken per program
max_turn = max([max(results[player]) for player in results])
pyplot.figure(figsize=(10, 5))
for player in results:
    turns = [0] + sorted(results[player]) + [max_turn]
    games_completed_per_turn = [len([turn for turn in results[player] if turn <= t]) * 100.0 / NB_GAMES for t in turns]
    pyplot.plot(turns, games_completed_per_turn, label=player)
pyplot.title("Comparison of turns needed to complete all %d games" % (NB_GAMES))
pyplot.xlabel("Turns per game")
pyplot.ylabel("% of games completed")
pyplot.xscale("log")
pyplot.legend()
pyplot.show()
```
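%% Cell type:markdown id: tags:
The cumulative curve above is an empirical distribution: for each threshold `t`, it gives the share of games finished in at most `t` turns. \
As a hedged sketch of the same computation on a small made-up sample, the hypothetical helper `percent_completed` below sorts the turn counts once and uses `bisect.bisect_right` to count the values that do not exceed each threshold, which avoids rescanning the whole list for every point.
%% Cell type:code id: tags:
```python
import bisect

def percent_completed(turn_counts, thresholds):
    """Percentage of games finished in at most t turns, for each t in thresholds."""
    ordered = sorted(turn_counts)
    total = len(ordered)
    # bisect_right returns how many values in `ordered` are <= t
    return [100.0 * bisect.bisect_right(ordered, t) / total for t in thresholds]

# Made-up turn counts, for illustration only
sample_turns = [12, 7, 30, 7, 18]
print(percent_completed(sample_turns, [0, 7, 12, 18, 30]))
```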
%% Cell type:markdown id: tags:
Visualizing is great, but it may be hard to draw conclusions from a plot alone. \
Here, we perform a statistical test that gives more insight into whether one algorithm is better than another.
%% Cell type:code id: tags:
``` python
# Formal statistics to check if the differences between these curves are statistically significant
for i, player_1 in enumerate(results):
    for j, player_2 in enumerate(results):
        if j > i:
            test_result = scstats.mannwhitneyu(results[player_1], results[player_2], alternative="two-sided")
            print("Mann-Whitney U test between turns of program '%s' and of program '%s':" % (player_1, player_2), test_result)
```
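%% Cell type:markdown id: tags:
To build intuition for what the test measures, the U statistic can be computed by hand on a tiny sample. \
This is an illustrative sketch, not SciPy's implementation: the helper `u_statistic` and the sample values are hypothetical. It counts the pairs in which a value from the first sample exceeds a value from the second, with ties counting for one half. Dividing by the number of pairs gives the share of pairs in which the first sample is larger; a share near 0.5 suggests that neither sample tends to need more turns.
%% Cell type:code id: tags:
```python
def u_statistic(sample_a, sample_b):
    """Count pairs (a, b) with a > b; each tie counts for one half."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

# Made-up turn counts, for illustration only
turns_a = [40, 55, 60, 90]
turns_b = [30, 55, 35]

u = u_statistic(turns_a, turns_b)
# Share of pairs in which the first sample needed more turns
share = u / (len(turns_a) * len(turns_b))
print(u, share)
```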