This project was started with very common motive. I was scrolling through my phone when I stumbled upon the Water Sort Puzzle game. It’s simple rule caught me and sparkled my curiosity.

•

Is there a systematic method to solve any given level of this game?

•

What algorithms could guarantee the most efficient solution?

•

With the advancements in artificial intelligence, could machine learning or reinforcement learning offer a better approach?

The idea of combining my interest with cutting-edge trendsi nAI was exhilarating. I realized that this simple game could serve as a perfect playground to explore and compare different problem-solving methodologies.

Why this matters

In an era where AI is transforming industries and solving complex problems, applying these technologies to a puzzle game might seem trivial. However, puzzles like the Water Sort Puzzle encapsulate fundamental challenges in computer science and AI:

•

State Space Exploration
The game has a vast number of possible states, especially at higher levels, making it a non-trivial task to find the optimal solution.

•

Combinatorial Optimization
Finding the sequence of moves that leads to the solution with the least number of steps is a classic optimization problem.

•

Decision-Making Under Constraints
Each move must adhere to the game's rules, similar to constraints in real-world applications.

By tackling these challenges, we can gain insights into how different approaches perform, not just in games but in broader applications requiring efficient problem-solving strategies.

Journey ahead

Motivated by this realization, I decided to embark on a project with the following objectives:

Develop and Implement Various Solution Methods:

•

Algorithmic Approach: Utilize traditional search algorithms like BFS, DFS, and A* to solve the puzzles.

•

Machine Learning Approach: Train models to predict the next best move based on patterns learned from data.

•

Reinforcement Learning Approach: Develop agents that learn optimal strategies through trial and error interactions with the game environment.

Compare Efficiency and Effectiveness:

•

Evaluate each method based on solution optimality, computational resources, development effort, and scalability.

Share Findings with the Community:

•

Document the entire process through this blog, providing insights, code snippets, and lessons learned.

•

Contribute to the understanding of how AI techniques can be applied to problem-solving in engaging and accessible ways.

Progresses ahead

Literature review and research direction concretization

•

Case study of algorithm/ml/dl/rl application cases for the water sort puzzle or similar games

•

Clearly establish points of differentiation from existing research to ensure originality

Problem definition and experiment design

•

Creating a program that automatically generates games using difficulty level system

•

Establish clear evaluation criteria and indicators for comparison

Implementation and evaluation of algorithms

•

Implement and apply algorithms such as BFS, DFS, A*, etc

•

Measure resource usage and performance of each algorithms

Develop and evaluate machine learning model

•

Dataset creation

•

Model training and performance evaluation (e.g. accuracy, prediction time)

•

Record the effort and cost during the training process (data preparation time, training time, etc.)

Development and evaluation of reinforcement learning agents

•

Environment and agent design

•

Learning algorithm implementation

•

Performance evaluation and measurement of effort and cost in the learning process

Compare and analyze results

•

Compare performance and cost for all approaches using the same criteria

•

Present results visually using graphs and tables

•

Verify significance of results using statistical methods

Case study and direction concretization

Similar puzzles that are mainly studied

•

Sliding tile puzzles (8-puzzle, 15-puzzle, …)

•

Rubik’s cube

•

Tower of hanoi

•

Sokoban

•

Sudoku

•

Peg solitaire

Papers about similar topics

Algorithm approaches - Classic / Easy Algorithms

Breadth-First Search (BFS)

•

Explores all possible moves level by level, ensuring that the shortest path to the solution is found if one exists

•

BFS is straightforward to implement and provides a baseline for comparing other algorithms. It's particularly effective for puzzles with shallow solution depths.

Depth-First Search (DFS)

•

Explores as deep as possible along each branch before backtracking. It's useful for puzzles where solutions may be found far from the root node.

•

DFS is simple and memory-efficient. It can serve as a contrast to BFS in terms of performance and resource usage.

A* Search Algorithm

•

An informed search algorithm that uses heuristics to estimate the cost from the current state to the goal, optimizing both time and space.

•

A* is powerful for finding the least-cost path efficiently. Implementing it will introduce me to heuristic design, which is crucial in AI.

Greedy Best-First Search (Greedy BFS)

•

Selects the next move based solely on which appears to be closest to the goal, according to a heuristic.

•

It's faster than A* in some cases and easier to implement, though it doesn't guarantee the optimal solution. It provides a good comparison point for heuristic effectiveness.

Simulated Annealing

•

A probabilistic technique that explores the search space by occasionally accepting worse solutions to escape local minima.

•

Simulated Annealing introduces you to metaheuristic optimization without the complexity of genetic algorithms. It's suitable for large search spaces.

Algorithm approaches - Recent / Complex Algorithms

Genetic Algorithms

•

Mimics the process of natural selection by evolving a population of solutions over generations using selection, crossover, and mutation.

•

Genetic Algorithms are powerful for global optimization problems. They can handle large and complex search spaces, providing a different perspective from deterministic algorithms.

Reinforcement Learning (Q-Learning)

•

An agent learns optimal actions through trial and error by maximizing cumulative rewards. Q-Learning is a value-based method that doesn't require a model of the environment.

•

Reinforcement Learning introduces you to AI methods that learn from interaction with the environment. It's applicable to sequential decision-making tasks like puzzles.

Monte Carlo Tree Search (MCTS)

•

Uses random sampling of the search space to build a search tree, balancing exploration and exploitation to find optimal moves.

•

MCTS is effective in large, complex state spaces where traditional search methods are impractical. It's widely used in game AI, such as in Go and chess engines.

Deep Reinforcement Learning (DQN)

•

Combines neural networks with reinforcement learning, allowing the agent to learn value functions directly from high-dimensional inputs.

•

Deep RL can handle complex patterns and generalize well to unseen states. Implementing DQN will expose you to cutting-edge AI techniques.

Constraint Satisfction Problems (CSP) Approach

•

Models the puzzle as a CSP where variables must satisfy specific constraints. Techniques like backtracking and constraint propagation are used.

•

CSPs are powerful for problems with clear constraints. This approach can efficiently prune the search space and is a fundamental concept in AI.

Problem Definition

Game Components

There are 3 main variables to create a game.

•

Bottles

B = \{b_1, b_2, ..., b_n\}

•

Colors

C = \{1,2,...,m\}

represent the set of colors. Each color is represented by an integer for simplicity, and actual colors can be assigned randomly in code.

•

Capacity of a bottle

k

is a fixed capacity of each bottle.

Representing maximum number of color units it can hold.

State Representation

A state in the game can be represented by the configuration of colors in each bottle. We can model the state as a tuple or an array where each element represents a bottle’s content from bottom to top.

•

State S=[s1,s2,...,sn]S = [s_1, s_2, ..., s_n]S=[s1​,s2​,...,sn​]

◦

sis_isi​ represents the content of bottle bib_ibi​, which is an ordered list of colors from bottom to top

◦

If s1=[2,2,1]s_1=[2,2,1]s1​=[2,2,1], bottle b1b_1b1​ has color 2 at the bottom, color 2 in the middle, and color 1 at the top.

Actions (Moves)

An action involves pouring the topmost color from one to another, following the game’s rules.

Definition of an Action

•

Action aaa: Pour from bottle bib_ibi​ to bottle bjb_jbj​

•

Pouring Amount ppp: The number of color units to pour, determined by the number of consecutive same-colored units at the top of the source bottle.

Determining the Pouring Amount

•

Source Bottle sis_isi​

◦

Let top_color=si[−1]top\_color = s_i[-1]top_color=si​[−1]

The last element in

s_i

represents the top color unit

◦

Consecutive Same-Color Units at Top:

▪

Starting from the top of sis_isi​, count the number of consecutive units equal to top_colortop\_colortop_color.

▪

This gives the maximum number of units ppp that can be poured.

Valid Move Conditions

None-Empty Source Bottle

s_i

is not empty.

None-Full Destination Bottle

len(s_j) < k

Color Matching or Empty Destination Bottle

s_j

is empty, any color can be poured.

s_j

is not empty, tope top color of

s_j

must equal to

top\_color

Capacity Constraint

len(s_j)+p≤k

The destination bottle must have enough space to accommodate all

p

units being poured.

Action Execution

•

Pouring Process Steps:

Determine ppp as the number of consecutive top units in sis_isi​ that are equal to top_colortop\_colortop_color.

Ensure len(sj)+p≤klen(s_j)+p≤klen(sj​)+p≤k.

Move the top ppp units from sis_isi​ to sjs_jsj​.

State Transition Function

•

T(S,a)=S′T(S,a)=S'T(S,a)=S′

◦

Applies action aaa to state SSS, resulting in new state S′S'S′.

◦

State Update:

▪

si′=sis'_i=s_isi′​=si​ with the top ppp units removed.

▪

sj′=sjs'_j=s_jsj′​=sj​ with the ppp units added to the top.

Goal State

The goal state is achieved when all bottles are either empty or contain units of only one color.

•

Goal Condition:

◦

For every bottle bib_ibi​ in B:

▪

sis_isi​ is empty, or 

▪

All units in sis_isi​ are the same color.  ex: len(set(si))=1)len(set(s_i))=1)len(set(si​))=1)

Constraints and Rules Formalization

Capacity Constraint

•

For All Bottles:

len(s_i)≤k

Pouring Constraint

•

Valid Move Conditions Recap:

◦

None-Empty Source: si≠[ ]s_i \neq[\ ]si​=[ ]

◦

Not Full Destination: len(sj)<klen(s_j)<klen(sj​)<k

◦

Color Matching or Empty Destination:

▪

If si=[ ]s_i =[\ ]si​=[ ], any color can be poured.

▪

Else, sj[−1]=top_colors_j[-1]=top\_colorsj​[−1]=top_color

◦

Capacity Check: len(sj)+p≤klen(s_j)+p\leq klen(sj​)+p≤k

Pouring Amount p Calculation

•

Algorithm to Compute ppp:

def calculate_pouring_amount(s_i):
    if not s_i:
        return 0
    top_color = s_i[-1]
    p = 1
    for unit in reversed(s_i[:-1]):
        if unit == top_color:
            p += 1
        else:
            break
    return p
Python
복사

Starts from the top of

s_i

, count how many consecutive units are of the same color.

Same Color Constraint in Goal State

•

For Each Bottle bib_ibi​:

◦

si=[ ]s_i = [\ ]si​=[ ], or

◦

∀u∈si,u=si[0]∀u∈s_i, u=s_i[0]∀u∈si​,u=si​[0]

Algorithmic Strategies

State Representation

•

Efficient Data Structures:

◦

Use tupes of tupes to represent states for immutability and hashability.

◦

Example: S=((2,2,1),(3,),(),...)S = ((2,2,1),(3,),(),...)S=((2,2,1),(3,),(),...)

Successor Function

•

Generating Valid Moves:

◦

For each pair of bottles (bi, bj)(b_i,\ b_j)(bi​, bj​):

▪

Check if moving from bib_ibi​ to bjb_jbj​ is valid according to the rules.

▪

Generate the new state S′S'S′ by applying this action.

Heuristic Design

•

Considering Multiple Units Poured:

◦

When estimating the cost to the goal, account for the possibility of moving multiple units in one action.

•

Possible Heuristic:

◦

Total Misplaced Units:

▪

Sum the number of units not in their target bottles.

▪

This accounts for the efficiency of moving multiple units at once.

Algorithm Adjustments

•

For Search Algorithms (BFS, DFS, A*):

◦

Update the successor function to generate actions based on pouring multiple units.

◦

Adjust heuristics to reflect the possibility of moving multiple units.

•

For Metaheuristic Algorithms (Simulated Annealing, Genetic Algorithms):

◦

Ensure that state mutations or neighbor generation respect the corrected pouring rules.

•

For Reinforcement Learning:

◦

Define actions that involve pouring multiple units.

◦

Update the environment dynamics to reflect the accurate state transitions.

Game Experiment Design

High-Level Design Overview

Core Components

•

Game Engine

Manages the overall game flow

•

State Representation

Encapsulates the state of the puzzle at any given time

•

Bottles and Colors

Classes represnting the bottles and their contents

•

Move Generation

Logic to generate valid moves

•

Level Generator

Creates puzzles with specified difficulty levels

•

Solvability Checker

Ensures generated puzzles are solvable using mathematical logic

Not using any of algorithmic way (ex: BFS/DFS) on this part due to speed issue

•

Export/Import System

Allows saving and loading puzzles in a format (ex: csv) suitable for algorithms

Algorithm-Specific Modules

Seperate modules or classes for each algorithm that may require unique data structures or representations.

Performance Considerations

•

Efficient data structure (ex: tupes, lists)

•

Minimizing redundant calculations

•

Lazy evaluations where appropriate

Detailed Component Structure

Classes and Data Structures

Color

Colors are represented by unique integer identifiers, where

m

is total number of colors in the game.

There will be extra Color Generator for game play in pygame mode.

Bottle

•

Represents a bottle comtaining color units.

•

Attributes:

◦

capacity int Maximum number of color units

◦

contents list<int> List representing color IDs from bottom to top.

•

Methods:

◦

is_full() Checks if the bottle is full.

◦

is_empty() Checks if the bottle is empty.

◦

top_color() Returns the color ID on top.

◦

can_receive(color_id, amount) Checks if the bottle can receive the given amount of the specified color.

◦

pour_into(target_bottle) Executes the pouring action.

•

Implement

class Bottle:
    def __init__(self, capacity):
        self.capacity = capacity
        self.contents = []  # List of color IDs (integers)

    def is_full(self):
        return len(self.contents) >= self.capacity

    def is_empty(self):
        return len(self.contents) == 0

    def top_color(self):
        return self.contents[-1] if not self.is_empty() else None

    def can_receive(self, color_id, amount):
        if self.is_full():
            return False
        if self.is_empty():
            return (len(self.contents) + amount) <= self.capacity
        return (self.top_color() == color_id) and ((len(self.contents) + amount) <= self.capacity)

    def pour_into(self, target_bottle):
        # TODO: Logic to pour contents into target bottle
        pass
Python
복사

GameState

•

Represents the states of the game at any point.

•

Attributes:

◦

bottles list<Bottle>

◦

history list<Move>

•

Methods:

◦

generate_valid_moves() Returns a list of valid moves from the current state with selected algorithm. check needed

◦

is_goal_state() Checks if the current state is the goal state.

◦

apply_move(move) Returns a new GameState after apply a move

◦

encode_state() Encodes the state into a hashable format for storage.

•

Implement:

class GameState:
    def __init__(self, _bottles, _history=None):
        self.bottles = _bottles
        self.history = _history or []

    def generate_valid_moves(self):
        pass

    def is_goal_state(self):
        pass

    def apply_move(self, _move):
        new_bottles = copy.deepcopy(self.bottles)
        source_bottle = new_bottles[_move.source]
        destination_bottle = new_bottles[_move.destination]

        color_to_move = source_bottle.top_color()
        amount_to_move = _move.amount

        source_bottle.contents = source_bottle.contents[:-amount_to_move]
        destination_bottle.contents.extend([color_to_move] * amount_to_move)

        new_history = self.history + [_move]
        return GameState(new_bottles, new_history)

    def encode_state(self):
        return tuple(tuple(bottle.contents) for bottle in self.bottles)
Python
복사

Move

•

Represents an action of pouring from one bottle to another.

•

Attributes:

◦

source int Index of the source bottle

◦

destination int Index of the destination bottle

◦

amount int number of color units poured.

•

Methods:

◦

None (acts as a data container or structure)

•

Implement:

class Move:
    def __init__(self, _source, _destination, _amount):
        self.source = _source
        self.destination = _destination
        self.amount = _amount
Python
복사

LevelGenerator and SolvabilityChecker

LevelGenerator

Generate capacity, num_colors, num_bottles with given level.

•

Attributes:

◦

level int

•

Implements:

from base.bottle import Bottle
from base.gameState import GameState
from base.solvabilityChecker import SolvabilityChecker

class LevelGenerator:
    def __init__(self, level):
        if not level.isnumeric():
            assert TypeError("level must be number")
        self.level = level
        self.capacity = 4  # Fixed
        self.num_colors = None
        self.num_bottles = None
        self.generate_parameters()

    def generate_parameters(self):
        # 레벨에 따른 num_colors와 num_bottles 결정
        self.num_colors = min(2 + self.level // 5, 12)  # 최대 12색
        extra_bottles = max(2 - self.level // 10, 1)    # 빈 병의 수를 레벨에 따라 조절
        self.num_bottles = self.num_colors + extra_bottles

    def generate_level(self):
        # 해결 가능한 GameState 생성
        initial_state = self.create_initial_state()
        if SolvabilityChecker.is_solvable(initial_state):
            return initial_state
        else:
            self.generate_parameters()
            return self.generate_level()

    def create_initial_state(self):
        import random
        total_units_per_color = self.capacity  # 각 색상은 병 하나를 채움
        colors = []
        for color_id in range(1, self.num_colors + 1):
            colors.extend([color_id] * total_units_per_color)

        random.shuffle(colors)

        bottles = []
        color_index = 0
        num_filled_bottles = self.num_bottles - (self.num_bottles - self.num_colors)
        for _ in range(num_filled_bottles):
            bottle = Bottle(self.capacity)
            bottle.contents = colors[color_index:color_index + self.capacity]
            color_index += self.capacity
            bottles.append(bottle)

        # 빈 병 추가
        for _ in range(self.num_bottles - num_filled_bottles):
            bottles.append(Bottle(self.capacity))

        # 병 순서 섞기
        random.shuffle(bottles)

        return GameState(bottles)
Python
복사

SolvabilityChecker

class SolvabilityChecker:
    @staticmethod
    def is_solvable(game_state):
        # Mathematical logic to check solvability
        color_counts = {}
        for bottle in game_state.bottles:
            for color_id in bottle.contents:
                color_counts[color_id] = color_counts.get(color_id, 0) + 1

        # Verify that each color can fill a bottle
        for count in color_counts.values():
            if count != bottle.capacity:
                return False
        
        # Check are there enough empty spaces
        total_space = sum(bottle.capacity - len(bottle.contents) for bottle in game_state.bottles)
        if total_space < bottle.capacity:
            return False

        # Additional mathematical & logical checks...
        
        return True
Python
복사