Parameter Search

The ParameterSearch executable class is one of the main entry points to using TAG. It can be used to tune either AI agent parameters (see here), or Game parameters (see here). This page attempts to document all the run-time options available. For the most up-to-date summary, you can also look at the code or run:

java -jar ParameterSearch.jar --help

The optimisation algorithm used is NTBEA (the N-Tuple Bandit Evolutionary Algorithm); full details can be found in the following paper:

Lucas, Simon M., Jialin Liu, and Diego Perez-Liebana. 2018. ‘The N-Tuple Bandit Evolutionary Algorithm for Game Agent Optimisation’. In IEEE Congress on Evolutionary Computation (CEC). Rio de Janeiro. https://doi.org/10.1109/CEC.2018.8477869.

The default model is set to use the same settings as this paper, i.e. 1-, 2- and N-Tuples, a kExplore of 1.0 and a neighbourhood of 50. These are not necessarily the best settings for all environments. In particular, kExplore can often fruitfully be set rather lower, at 0.1 or 0.3, especially if the number of iterations is small relative to the size of the parameter space.
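To illustrate why kExplore matters, the sketch below adds a UCB-style exploration bonus of the kind described in the paper to a mean reward. The class name, the method name and the value of epsilon are illustrative assumptions, and the exact form used inside TAG may differ.

```java
// Hypothetical sketch, not TAG code: shows how kExplore scales the exploration bonus.
final class UcbSketch {

    /**
     * Mean reward plus a UCB-style bonus: mean + k * sqrt(ln(N) / (n + epsilon)).
     * totalVisits (N) must be at least 1; tupleVisits (n) may be 0.
     */
    static double ucb(double mean, double kExplore, int totalVisits, int tupleVisits) {
        double epsilon = 0.5; // assumed constant that avoids division by zero for unvisited tuples
        double exploration = Math.sqrt(Math.log(totalVisits) / (tupleVisits + epsilon));
        return mean + kExplore * exploration;
    }

    public static void main(String[] args) {
        // Same statistics, different kExplore: the bonus shrinks in proportion to k.
        for (double k : new double[] {1.0, 0.3, 0.1}) {
            System.out.printf("k=%.1f -> %.3f%n", k, ucb(0.2, k, 100, 5));
        }
    }
}
```

With few iterations every visit count stays small, so the bonus term is large for every tuple. A lower kExplore keeps it from swamping the observed means.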

All arguments are specified by command line tags in the table below.

The steps below provide an overview of the full process within which NTBEA is run. Only the first two are mandatory:

For each repeat a total of iterations evaluations are run across the specified search space.
This gives repeat candidates for the final recommendation.
[optional] Each of these is further evaluated with evalGames additional trials to get a better estimate of its true performance.
[optional] A round robin tournament can be run between the repeat candidates in order to recommend the best one (using a budget of tournamentGames games). If used, this recommendation takes precedence over the one from the previous step.
[optional] From this final recommendation, a possible last step is to explore the immediate surroundings in the search space for better settings. This uses OSDBudget games, and will look for settings that are better with at least a confidence level of OSDConfidence. This approach makes most sense when the search space is very big compared to the total budget. In this case the initial NTBEA runs find a good candidate in the space, and the OSD process searches the immediate neighbourhood of that candidate more exhaustively.
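As a rough illustration of how the steps above add up, the following sketch estimates the total number of games played in one run. The class and method are hypothetical. The formula is an assumption derived from the step descriptions and the evalsPerTrial tag in the table below, not from the TAG source code.

```java
// Hypothetical sketch, not TAG code: estimates the total game budget of one ParameterSearch run.
final class SearchBudgetSketch {

    /** Parameter names mirror the command-line tags; the formula itself is an assumption. */
    static long totalGames(int repeats, int iterations, int evalsPerTrial,
                           int evalGames, int tournamentGames, int osdBudget) {
        long ntbea = (long) repeats * iterations * evalsPerTrial; // NTBEA runs (mandatory steps)
        long evaluation = (long) repeats * evalGames;             // optional re-evaluation of each candidate
        return ntbea + evaluation + tournamentGames + osdBudget;  // optional tournament and OSD
    }

    public static void main(String[] args) {
        // 10 repeats of 1000 iterations, 200 evaluation games per candidate, no tournament or OSD
        System.out.println(totalGames(10, 1000, 1, 200, 0, 0)); // prints 12000
    }
}
```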

Tag	Description	Default
`searchSpace=`	A filename for a Search Space definition in json format. See details here or here	None
`iterations=`	The number of NTBEA iterations to run in each trial.	None
`game=`	The game for which we are optimising an AI agent, or which we are tuning directly, e.g. TicTacToe, Uno, or LoveLetter.	None
`nPlayers=`	The total number of players in each game	The minimum permitted
`evalGames=`	The number of games to run after each NTBEA run with the best predicted setting to estimate its true value.	20% of NTBEA iterations
`repeats=`	The number of times NTBEA should be run from scratch, using the full set of iterations in each, to find a single best recommendation. Any one NTBEA run can give a poor recommendation, and given a budget of X total iterations it is empirically better to run 10 runs, each using X/10 iterations. (See this paper for detailed experiments on test domains that show this.)	1
`evalsPerTrial=`	This determines how many games are run on each individual NTBEA iteration. These are then averaged to obtain a score for that setting. The idea is that larger values will reduce the variance of the signal, but in practice the default setting of 1 is almost always best.	1
`matchups=`	If this is non-zero, then after all `repeats`, a final Tournament will be run with all the recommended agents from each run. This can be used to provide a final recommendation.	0
`tuneGame=`	If `true` then we are tuning game parameters. If `false`, then we are optimising an AI agent.	`false`
`useTwoTuples=`	If `true` then NTBEA will use 2-Tuples in its model.	`true`
`useThreeTuples=`	If `true` then NTBEA will use 3-Tuples in its model.	`false`
`useNTuples=`	If `true` then NTBEA will use N-Tuples in its model.	`false`
`verbose=`	If `true` then the results marginalised to each dimension will be logged, plus the Top 10 best tuples for each run. This can be useful to get a feel for the important dimensions in the search space.	false
`opponent=`	The agent(s) used as opponent. Detailed options below.	random
`budget=`	If this is set to a positive number then this will be used to set the time (or other) budget of all agents (where they are AnyTime agents, such as MCTS or RHEA). The default will use the budget specified in the Search space JSON description.	0
`evalMethod=`	Defines the objective to optimise. When tuning an AI agent this is one of `Score`, `Ordinal`, `Heuristic`, `Win`. For tuning a game this is a json file detailing an `IGameHeuristic` (see Game Tuning for details)	`Win`
`kExplore=`	The k value used by NTBEA. This should be scaled to be appropriate to the range of the objective function. The default works well for values in {-1, 0, 1}	1.0
`neighbourhood=`	The size of neighbourhood to look at in NTBEA during each iteration.	`min(50, |searchSpace| * 0.01)`
`simpleRegret=`	If `true` then \(\frac{\sqrt{N}}{1+n}\) is used in place of the UCB equation to determine the exploration level for each tuple. This is intended to optimise for simple regret instead of the cumulative regret of UCB. In practice this seems to work poorly.	`false`
`noiseCombination=`	This defines how the exploration levels from each tuple are combined. The default is to take the simple mean of these across all tuples. This parameter is the exponent to use in a generalised mean for this process (1.0 is the simple mean). Values larger than 1.0 give additional weight to large values (increasing exploration), and values smaller than 1.0 do the opposite. Specifically, if noiseCombination is set to p, then: \(\mu_p = \left(\frac{1}{N}\sum\limits_i (x_i)^p \right) ^{\frac{1}{p}}\)	1.0
`OSDBudget=`	If set then this will use this number of games to run a number of trials on all possible one-step deviations from the final NTBEA recommendation. A one-step deviation is a set of parameters that has exactly one of them set to a different value. This is an iterative process that searches the immediate environs of the point in parameter space for a result that is significantly better.	0
`OSDConfidence=`	This defines the statistical confidence required for a result to be significantly better than the NTBEA recommendation. All statistical analysis will correct this for the number of tests being run (one per one-step deviation) to avoid accidental p-hacking, so this refers to the overall confidence of the process, and not each individual test.	0.90
`OSDTournament=`	Instead of using `OSDBudget` for trials on all one-step deviations, this will run a round robin tournament between them. This works in much the same way as the `tournamentGames` tournament between the winners of each `repeat`, which picks the best final recommendation.	`false`
`seed=`	Random seed for Game use (not used by NTBEA itself).	`System.currentTimeMillis()`
`gameParams=`	A json file detailing the game parameters to use (only if optimising an AI agent; this is ignored if tuning a game). See here for details.	Game defaults
`listener=`	The location of a JSON file from which a listener can be instantiated. A pipe-delimited list can be provided. See section below on Listeners	`metrics/MetricsGameListener.json`
`metrics=`	(Optional) The full class name of an `IMetricsCollection` implementation. The recommended usage is to include these in the JSON file that defines the listener.	`evaluation.metrics.GameMetrics`
`destDir=`	The directory to which any reported metrics will be written. If this is run for multiple games and/or player counts, then a sub-directory will be created for each. The standard output will include a JSON file for each recommended agent for each repeat of NTBEA.	`metrics\out`
`NTBEAMode=`	NTBEA is the default. StableNTBEA runs P (number of players) games with a fixed random seed for each trial, with the tuned agent in each position in turn. This is useful for games with strong positional or random seed effects, as it reduces variance. CoopNTBEA will tune one agent for all players (for coop games).	NTBEA
`byTeam=`	Determines if the same Agent is applied to all players on the same team (the default). This is only relevant for team-based games such as Resistance (2 teams)	true
`config=`	If this is provided it overrides all other parameters, and should be the only one provided. This is then the name of a JSON file which details the parameters to use, with each parameter as a name-value JSON pair.	None
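
The generalised mean behind `noiseCombination=` can be sketched as follows. This is a minimal, stand-alone illustration of the formula in the table. The class name and the sample exploration levels are made up and are not part of TAG.

```java
// Hypothetical sketch, not TAG code: the generalised (power) mean with exponent p.
final class GeneralisedMeanSketch {

    /** Power mean of non-negative values with exponent p (p must not be 0). */
    static double generalisedMean(double[] values, double p) {
        double sum = 0.0;
        for (double v : values) {
            sum += Math.pow(v, p);
        }
        return Math.pow(sum / values.length, 1.0 / p);
    }

    public static void main(String[] args) {
        double[] exploration = {1.0, 9.0}; // made-up exploration levels from two tuples
        System.out.println(generalisedMean(exploration, 1.0)); // simple mean
        System.out.println(generalisedMean(exploration, 2.0)); // pulled towards the larger value
        System.out.println(generalisedMean(exploration, 0.5)); // pulled towards the smaller value
    }
}
```

For the two sample values the simple mean is 5, an exponent of 2 gives roughly 6.4, and an exponent of 0.5 gives 4, which matches the direction of the effect described for `noiseCombination=`.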

Opponent

The value provided for opponent= can be any of:

one of mcts, rmhc, random, osla to use the default implementations, with default parameters. (Useful for random or osla; not recommended for mcts or rmhc, in which case you should define a json-file instead.)
the full classname of a class that extends AbstractPlayer and has a no-argument constructor
the name of a json-format file that details the parameters of an agent to use (see details).
the name of a directory that contains one or more json-format files, each detailing an agent. The players/opponents in a game will then be sampled from this set of agents.

If tuneGame is set, then the opponent argument must be provided, and will be used for all players.

Tuning an Agent

In this mode, each NTBEA iteration will sample a different set of agent parameters (as defined in the Search Space). This will be used for one player (at a random position), with all the other players using agents defined by the opponent= setting. The exception to this is NTBEAMode=coop, in which case the single NTBEA sample is used for all players. (If a game has teams, such as in Resistance, then the same agent will be used for all the members of one team, unless byTeam is false.)
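
The seating rule described above can be pictured with the following sketch. It is an illustration only: the class, the method and the "tuned"/"opponent" labels are hypothetical, and it ignores the coop and team cases.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch, not TAG code: one tuned agent at a random seat, opponents elsewhere.
final class SeatAssignmentSketch {

    /** Returns one label per seat: "tuned" at a single random position, "opponent" at all others. */
    static List<String> assignSeats(int nPlayers, Random rnd) {
        int tunedSeat = rnd.nextInt(nPlayers); // uniform over 0..nPlayers-1
        List<String> seats = new ArrayList<>();
        for (int i = 0; i < nPlayers; i++) {
            seats.add(i == tunedSeat ? "tuned" : "opponent");
        }
        return seats;
    }

    public static void main(String[] args) {
        // Each call stands in for one NTBEA iteration with a freshly sampled agent.
        Random rnd = new Random();
        for (int iteration = 0; iteration < 3; iteration++) {
            System.out.println(assignSeats(4, rnd));
        }
    }
}
```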

Tuning a Game

In this mode opponent= is a mandatory setting. On each NTBEA iteration a new set of game parameters is used, and all of the players are sampled from the opponent setting.

last updated: Feb 2025