OpenEnv documentation
Connect4 Environment
Get Started
Guides
Tutorials
OverviewHello WorldTrain a reasoning model with TRLMCP EnvironmentsRubricsRL Training with TRLRL Training with UnslothEvaluating with Inspect AIRL Training with an Agentic HarnessSFT Training with Environments
Environments
EnvironmentsEchoCodingJupyterTerminusCoding ToolsChatAtariOpenSpielSUMO-RLFinRLTextArenaGitDIPG SafetySnakeWeb SearchBrowserGymREPLCalendarCARLAChessConnect4DM ControlFinQAGrid WorldJuliaKernRLMazeOpenAppQED MathReasoning GymTBench2UnityWildfireAgent World ModelOpenCodeSophistry Bench Sprint
API Reference
Project
You are viewing main version, which requires installation from source. If you'd like
regular pip install, checkout the latest stable version (v0.4.1).
Connect4 Environment
A classic Connect Four board game environment for training agents on turn-based strategy with a 6×7 grid. Players alternate dropping pieces into columns, aiming to connect four in a row horizontally, vertically, or diagonally.
Quick Start
import asyncio
from connect4_env import Connect4Action, Connect4Env
async def main():
async with Connect4Env(base_url="http://localhost:8000") as client:
obs = await client.reset()
print(f"Board: {obs.board}")
print(f"Legal moves: {obs.legal_actions}")
# Drop a piece in column 3
result = await client.step(Connect4Action(column=3))
print(f"Reward: {result.reward}, Done: {result.done}")
asyncio.run(main())Action Space
| Field | Type | Description |
|---|---|---|
column | int | Column index (0–6) where the piece will be dropped |
Invalid moves (out-of-range or full column) result in a reward of -1 and end the episode.
Observation Space
| Field | Type | Description |
|---|---|---|
board | list[list[int]] | 6×7 grid — 1 = current player, -1 = opponent, 0 = empty |
legal_actions | list[int] | Column indices that are valid moves |
Rewards
| Outcome | Reward |
|---|---|
| Win (4 in a row) | +1.0 |
| Draw (board full) | 0.0 |
| Invalid move | -1.0 |
| Otherwise | 0.0 |
State
| Field | Type | Description |
|---|---|---|
episode_id | str | Unique ID for the current game |
board | list[list[int]] | Current board state |
next_player | int | Whose turn it is (1 or -1) |
step_count | int | Number of steps taken |