Tournament System

This document describes the tournament system for Reinforce Tactics, which allows running round-robin tournaments between different bot types.

Tip: Looking for tournament results? Check out the Bot Tournaments page!

Overview

The tournament system automatically discovers and runs competitions between:

  • SimpleBot: Built-in basic rule-based bot (always included)
  • MediumBot: Built-in improved rule-based bot with advanced strategies (always included)
  • LLM Bots: OpenAI, Claude, and Gemini bots (if API keys configured)
  • Model Bots: Trained Stable-Baselines3 models (from models/ directory)

Quick Start

Run a tournament with default settings:

python3 scripts/tournament.py

This will:

  • Use the maps/1v1/6x6_beginner.csv map
  • Discover all available bots
  • Run 4 games per matchup (2 per side)
  • Save results to tournament_results/

Command-Line Options

python3 scripts/tournament.py [OPTIONS]

Options

  • --map PATH: Path to map file (default: maps/1v1/6x6_beginner.csv)
  • --models-dir PATH: Directory containing trained models (default: models/)
  • --output-dir PATH: Directory for results and replays (default: tournament_results/)
  • --games-per-side INT: Number of games per side in each matchup (default: 2)
  • --test: Test mode; adds duplicate SimpleBots for testing

Examples

Run a tournament on a different map:

python3 scripts/tournament.py --map maps/1v1/10x10_easy.csv

Run more games per matchup:

python3 scripts/tournament.py --games-per-side 5

Save results to a custom directory:

python3 scripts/tournament.py --output-dir my_tournament

Test the tournament system:

python3 scripts/tournament.py --test --games-per-side 1

Bot Discovery

SimpleBot & MediumBot

Both built-in bots are always included. No configuration needed.

  • SimpleBot: Basic strategy with single-unit purchases and simple targeting
  • MediumBot: Advanced strategy with coordinated attacks and maximized unit production

LLM Bots

Automatically included if:

  1. API key is configured in settings.json
  2. Required package is installed (openai, anthropic, or google-generativeai)
  3. API connection test passes
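
In code terms, discovery boils down to those three checks. The sketch below is a hypothetical illustration, not the actual implementation; the settings.json key names follow the format shown later on this page, and the final check is simplified to keep the sketch self-contained.

import importlib.util
import json

def llm_bot_available(provider: str, package: str) -> bool:
    """Hypothetical sketch of the three LLM bot discovery checks."""
    # 1. API key is configured in settings.json
    with open('settings.json') as f:
        settings = json.load(f)
    if not settings.get('llm_api_keys', {}).get(provider):
        return False

    # 2. Required package is installed (openai, anthropic, or google-generativeai)
    if importlib.util.find_spec(package) is None:
        return False

    # 3. API connection test passes -- the real code makes a cheap test call;
    #    returning True here keeps the sketch runnable on its own
    return True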

Supported Models

OpenAI (Default: gpt-4o-mini)

  • GPT-4o family: gpt-4o, gpt-4o-mini (recommended for cost-effectiveness)
  • GPT-4 Turbo: gpt-4-turbo, gpt-4-turbo-2024-04-09
  • GPT-4: gpt-4, gpt-4-0613
  • GPT-3.5 Turbo: gpt-3.5-turbo, gpt-3.5-turbo-0125
  • O1 Reasoning: o1, o1-mini, o1-preview
  • O3: o3-mini (if available)

Anthropic Claude (Default: claude-3-5-haiku-20241022)

  • Claude 4: claude-sonnet-4-20250514
  • Claude 3.5: claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022 (recommended)
  • Claude 3: claude-3-opus-20240229, claude-3-sonnet-20240229, claude-3-haiku-20240307

Google Gemini (Default: gemini-2.0-flash)

  • Gemini 2.0: gemini-2.0-flash (recommended), gemini-2.0-flash-thinking-exp
  • Gemini 1.5: gemini-1.5-pro, gemini-1.5-flash, gemini-1.5-flash-8b
  • Gemini 1.0: gemini-1.0-pro, gemini-pro

Configure API keys in settings.json:

{
  "llm_api_keys": {
    "openai": "sk-...",
    "anthropic": "sk-ant-...",
    "google": "AIza..."
  }
}

You can also specify custom models by setting environment variables or modifying bot initialization code.

Model Bots

Automatically discovered from the models/ directory:

  1. Place trained .zip model files in models/
  2. Models must be Stable-Baselines3 compatible (PPO, A2C, or DQN)
  3. Models must be trained on the Reinforce Tactics environment

Example model file: models/ppo_best_model.zip
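
Discovery amounts to scanning that directory for .zip files and checking that they load, roughly like this sketch (trying only PPO is a simplification; the real code also needs to handle A2C and DQN):

from pathlib import Path
from stable_baselines3 import PPO

# Sketch: treat every .zip under models/ as a candidate model bot.
for path in sorted(Path('models').glob('*.zip')):
    try:
        model = PPO.load(path)  # real discovery would also try A2C and DQN
        print(f'Discovered model bot: {path.stem}')
    except Exception:
        pass  # not a loadable Stable-Baselines3 model; skip it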

Tournament Format

Round-Robin Structure

Every bot plays a matchup against every other bot exactly once.

Matchup Structure

Each matchup consists of 2 × games-per-side games:

  • games-per-side games with Bot A as Player 1
  • games-per-side games with Bot B as Player 1

This accounts for first-move advantage.

Example with --games-per-side 2:

  • Game 1: Bot A (P1) vs Bot B (P2)
  • Game 2: Bot A (P1) vs Bot B (P2)
  • Game 3: Bot B (P1) vs Bot A (P2)
  • Game 4: Bot B (P1) vs Bot A (P2)
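
Generating that schedule is a round-robin over bot pairs with sides swapped, as in this minimal sketch (bot names are illustrative):

from itertools import combinations

bots = ['SimpleBot', 'MediumBot', 'OpenAIBot']
games_per_side = 2

schedule = []
for bot_a, bot_b in combinations(bots, 2):  # every pair plays one matchup
    schedule += [(bot_a, bot_b)] * games_per_side  # bot_a as Player 1
    schedule += [(bot_b, bot_a)] * games_per_side  # bot_b as Player 1

print(len(schedule))  # 3 matchups x 4 games = 12 games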

Game Execution

  • All games run in headless mode (no rendering) for speed
  • Maximum 500 turns per game (prevents infinite games)
  • Games end when:
    • One player wins (captures enemy HQ or eliminates all enemy units)
    • Turn limit reached (counts as draw)
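
Put as pseudocode, the driver enforces those rules roughly like the sketch below (the method and attribute names are placeholders, not the actual game API):

MAX_TURNS = 500

def run_game(game):
    """Placeholder sketch of headless execution with a turn cap."""
    for turn in range(MAX_TURNS):
        game.play_turn()           # hypothetical method: both bots act
        if game.winner is not None:
            return game.winner     # HQ captured or all enemy units eliminated
    return None                    # turn limit reached: recorded as a draw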

ELO Rating System

The tournament tracks ELO ratings for all bots:

  • Starting rating: 1500 for all bots
  • K-factor: 32 (standard chess rating adjustment)
  • Ratings are updated after each game based on expected vs actual outcomes
  • Final rankings include ELO rating and change from initial rating
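
Concretely, the standard update computes an expected score E_A = 1 / (1 + 10^((R_B - R_A) / 400)) and adjusts R_A by K * (S_A - E_A), where S_A is 1 for a win, 0.5 for a draw, and 0 for a loss. A self-contained sketch:

K = 32  # standard chess K-factor

def update_elo(rating_a, rating_b, score_a):
    """Return new (rating_a, rating_b); score_a is 1.0 win, 0.5 draw, 0.0 loss."""
    expected_a = 1 / (1 + 10 ** ((rating_b - rating_a) / 400))
    delta = K * (score_a - expected_a)
    return rating_a + delta, rating_b - delta

# Example: two 1500-rated bots, A wins -> A moves to 1516, B drops to 1484
print(update_elo(1500, 1500, 1.0))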

Output Files

The tournament generates the following outputs:

tournament_results/tournament_results.json

Complete tournament data in JSON format:

{
  "timestamp": "2025-12-10T22:09:52.145889",
  "map": "maps/1v1/6x6_beginner.csv",
  "games_per_side": 2,
  "rankings": [
    {
      "bot": "SimpleBot",
      "wins": 5,
      "losses": 1,
      "draws": 2,
      "total_games": 8,
      "win_rate": 0.625,
      "elo": 1564,
      "elo_change": 64
    }
  ],
  "matchups": [...],
  "elo_history": {...}
}
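
With the schema above, post-processing is straightforward; for example, printing the final standings:

import json

with open('tournament_results/tournament_results.json') as f:
    results = json.load(f)

for rank, entry in enumerate(results['rankings'], start=1):
    print(f"{rank}. {entry['bot']}: {entry['elo']} ELO "
          f"({entry['wins']}W-{entry['losses']}L-{entry['draws']}D)")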

tournament_results/tournament_results.csv

Simple CSV format for spreadsheet import:

Bot,Wins,Losses,Draws,Total Games,Win Rate,Elo,Elo Change
SimpleBot,5,1,2,8,0.625,1564,+64
OpenAIBot,3,3,2,8,0.375,1436,-64

tournament_results/replays/

Replay files for every game:

  • Format: matchup{N}_game{M}_{BotA}_vs_{BotB}.json
  • Example: matchup001_game01_SimpleBot_vs_OpenAIBot.json
  • Can be played back using the game's replay system
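
For scripting against the replay directory, the naming pattern can be reproduced with an f-string; the zero-padding widths are inferred from the example above:

matchup, game, bot_a, bot_b = 1, 1, 'SimpleBot', 'OpenAIBot'
print(f'matchup{matchup:03d}_game{game:02d}_{bot_a}_vs_{bot_b}.json')
# matchup001_game01_SimpleBot_vs_OpenAIBot.json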

ModelBot Integration

The ModelBot class allows trained Stable-Baselines3 models to participate in tournaments.

Creating Compatible Models

Train a model using the Reinforcement Learning environment:

from stable_baselines3 import PPO
from reinforcetactics.rl.gym_env import StrategyGameEnv

# Create environment
env = StrategyGameEnv(
    map_file='maps/1v1/6x6_beginner.csv',
    opponent='bot',
    render_mode=None
)

# Train model
model = PPO('MultiInputPolicy', env, verbose=1)
model.learn(total_timesteps=100000)

# Save model
model.save('models/my_trained_bot')

The saved model will be automatically discovered and used in tournaments.

Action Translation

ModelBot automatically translates between:

  • Model actions (MultiDiscrete format)
  • Game actions (create_unit, move, attack, seize, heal)

Action format: [action_type, unit_type, from_x, from_y, to_x, to_y]
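
A hypothetical sketch of that mapping, assuming the six-field layout above (the ordering of action types is illustrative; see model_bot.py for the real mapping):

# Illustrative ordering -- the actual mapping lives in model_bot.py
ACTION_TYPES = ['create_unit', 'move', 'attack', 'seize', 'heal']

def translate_action(action):
    """Sketch: map a MultiDiscrete vector to a game action dict."""
    action_type, unit_type, from_x, from_y, to_x, to_y = action
    return {
        'type': ACTION_TYPES[action_type],
        'unit_type': unit_type,
        'from': (from_x, from_y),
        'to': (to_x, to_y),
    }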

Troubleshooting

"Need at least 2 bots for a tournament"

  • Only SimpleBot was found
  • Add LLM API keys or train some models
  • Or use --test flag to add a duplicate SimpleBot

LLM bot not discovered

  • Check API key in settings.json
  • Install required package: pip install openai (or anthropic, google-generativeai)
  • Verify API key is valid and has credits

Model bot not discovered

  • Ensure .zip file is in models/ directory
  • Verify model is Stable-Baselines3 compatible
  • Check that stable-baselines3 is installed: pip install stable-baselines3

Games ending in draws

  • Map may be too large or defensive positions too strong
  • Try a smaller map or increase turn limit in code
  • Check bot logic is aggressive enough

Testing

Run the test suite:

python3 -m pytest tests/test_tournament.py -v

Quick tournament test:

python3 scripts/tournament.py --test --games-per-side 1 --output-dir /tmp/test

Architecture

Key Components

  1. BotDescriptor: Describes a bot and knows how to instantiate it
  2. TournamentRunner: Manages tournament execution
  3. ModelBot: Wrapper for Stable-Baselines3 models
  4. Bot discovery: Automatic detection of available bots
  5. Results tracking: Win/loss/draw statistics
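
As a rough picture of the first component, a descriptor pairs a display name with a factory that can build the bot on demand. This dataclass is a sketch, not the actual class definition:

from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class BotDescriptor:
    """Sketch: a bot's display name plus a zero-argument factory."""
    name: str
    factory: Callable[[], Any]

    def instantiate(self) -> Any:
        return self.factory()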

Code Structure

scripts/
  tournament.py          # Main tournament script
reinforcetactics/
  game/
    bot.py               # SimpleBot implementation
    llm_bot.py           # LLM bot implementations
    model_bot.py         # ModelBot for trained models
    __init__.py          # Exports all bot types
tests/
  test_tournament.py     # Tournament system tests

Docker Tournament Runner

For more advanced tournament features, see the Docker-based tournament runner in docker/tournament/:

cd docker/tournament
docker-compose up --build

The Docker tournament runner includes:

  • ELO rating system: Tracks bot skill ratings throughout the tournament
  • Concurrent game execution: Run multiple games in parallel (configurable 1-32)
  • Resume capability: Continue interrupted tournaments from where they left off
  • Google Cloud Storage: Upload results to GCS for cloud deployments
  • Multi-map tournaments: Play across multiple maps with per-map configuration
  • LLM API rate limiting: Configurable delay between API calls

See docker/tournament/README.md for detailed configuration options.

Future Enhancements

Possible improvements:

  • Swiss-system tournament format
  • Real-time progress visualization
  • Tournament brackets for elimination format
  • Head-to-head statistics
  • Performance profiling per bot