The class of games considered here includes two-player games with complete information. The two players alternate moves as adversaries; each sees the other's mistakes and his own successes. At each turn, the rules of the game define the set of legal moves available. A game begins from a specified initial state and ends in a position that can be declared a win, a loss, or a draw. The problem of game playing has always attracted the interest of researchers in AI.
Among the most important milestones are the following. In 1951 Alan Turing wrote the first computer program capable of playing a full game of chess, although this program never actually ran on a computer. In the early sixties Arthur Samuel succeeded in building the first major operational game-playing program; it played checkers and could learn from its mistakes. Several chess-playing machines have been developed, including Deep Thought 2, which by the mid-1990s had reached an Elo rating of 2600 (Kasparov reached 2805). On February 10, 1996, the program Deep Blue beat Garry Kasparov; this program evaluates 50,000 million positions in three minutes.
Among the reasons for this interest are:
1. They provide a structured task in which it is very easy to measure success or failure.
2. The applicable strategies are useful for solving complex problems in various domains of real life.
3. The strength of good game-playing software can be measured by playing it against humans.
From the perspective of a search problem, a game tree is an explicit representation of all possible paths of a game. The root node is the starting position of the game; its successors are the positions that the first player can reach; the successors of those nodes are the positions resulting from the second player's replies, and so on. The terminal nodes, or leaves, represent winning, losing, or drawn positions. Each path from the root to a leaf represents a complete game. The problem is the size of this game tree: for instance, in chess it has been determined that the average branching factor is around 35 and that in an average game each player makes 50 moves; therefore, to examine the complete game tree it would be necessary to examine 35^100 positions.
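To put this number in perspective:

$$35^{100} = 10^{100\,\log_{10} 35} \approx 10^{154}$$

positions, far more than any machine could ever examine exhaustively.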
Therefore, it is clear that a program that performs a blind search cannot select even the first move; a heuristic search procedure is needed. The technique used is as follows: develop a part of the game tree to a certain depth, depending on the available computing power and the stability of the position, statically evaluate the positions reached, and then propagate these values to the root of the tree. Various tricks are used so that the evaluation algorithms explore only a small percentage of the possible moves in the search tree. The best-known procedure for propagating the values estimated at the leaves of the tree up to the root is MINIMAX, and the oldest trick for simplifying the tree is Alpha-Beta pruning.
Formalizing the problem
The six rules that define a game from the perspective of game theory are:
• 1. There are at least two players.
• 2. The game begins with one or more players making a choice among the alternatives specified.
• 3. After the first move, a certain situation results. This situation determines who will make the next move and what his alternatives are.
• 4. The choices made by the players may or may not become known.
• 5. If a game is described in terms of successive moves, then there is a termination rule.
• 6. Each game ends in a certain situation. Each player receives a payoff.
With regard to rule four, if all players know all the possible alternatives of a particular player at any moment, the game is called a game of perfect information. The particular class of games to be studied here are called two-player zero-sum games with perfect information. These are characterized by the above rules under the following restrictions:
* 1. There are exactly two players.
* 2. It has perfect information.
* 3. The payoffs sum to zero.
The third restriction means that if player A receives a payoff of p, then the payoff of player B will be -p. A game can be formally defined as a kind of search problem with the following components, sketched in code after the list:
* The initial state.
* A set of operators which define the legal moves a player can make.
* A termination criterion that determines when the game ends.
* A utility function which gives a numerical value to the resulting state of a game.
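A minimal sketch of these components in Python; the interface and method names (initial_state, moves, result, is_terminal, utility) are illustrative assumptions, not a standard API:

```python
class Game:
    """Abstract two-player zero-sum game posed as a search problem."""

    def initial_state(self):
        """The initial state of the game."""
        raise NotImplementedError

    def moves(self, state):
        """Operators: the legal moves of the player to move in 'state'."""
        raise NotImplementedError

    def result(self, state, move):
        """The state that results from applying a legal move."""
        raise NotImplementedError

    def is_terminal(self, state):
        """Termination criterion: True when the game is over."""
        raise NotImplementedError

    def utility(self, state):
        """Numerical value of a terminal state from Max's standpoint,
        e.g. +1 (Max wins), -1 (Min wins), 0 (draw)."""
        raise NotImplementedError
```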
For example, in chess the utility function can take the values +1, -1, or 0. The two players are called Max and Min; Max plays first, and then they take turns. If this were a normal search problem, all Max would have to do is find a sequence of moves leading to a terminal state that is a win (according to the utility function). However, Max is not alone: Min, for his part, will also try to win.
So Max needs to find a strategy that leads to a winning terminal state regardless of what Min does; the strategy includes the right move for Max in reply to each possible move by Min. In practice it is impossible to generate the whole game tree and apply the utility function to the terminal nodes. To resolve this problem, the termination criterion is replaced by a cutoff at a given depth, and the utility function by a heuristic evaluation function applied to the leaves of the tree. An evaluation function computes an estimate of the value of a position from the standpoint of one player, in this case Max. The idea was proposed by Shannon and is based on the fact that players of chess and other games have developed ways of judging the prospects of a position.
The outcome of the game depends highly on the quality of the evaluation function. The evaluation function must agree with the utility function on terminal states, and it should be computable with reasonable effort; there must be a compromise between the accuracy of the evaluation function and its cost in time. A third characteristic is that the evaluation function must accurately reflect the real chances of winning.
One way to construct the evaluation function is the linear expression: E(p) = w1*f1 + w2*f2 + ... + wn*fn, where the wi are weights and the fi are features of a particular position. For example, in chess the weights could be 1 for the pawn, 3 for the bishop, 3 for the knight, 5 for the rook, and 9 for the queen, and the fi would be the number of pieces of each type. Turing proposed another evaluation function for chess: add the values of the black pieces (N), add the values of the white pieces (B), and compute the ratio B/N. Samuel's checkers program used a more sophisticated approach in which the evaluation function is a combination of several simple features, including material advantage, the ability to advance, center control, mobility, etc., and takes the form: c1*Advantage + c2*Advance + c3*CenterControl + ... The ci were adjusted by a simple learning process which increased the weights of those components that had suggested moves leading to wins, while the weights of those that had led to losses were decreased. For the game of tic-tac-toe an evaluation function is: E(p) = (number of rows, columns, or diagonals still open for Max) - (number of rows, columns, or diagonals still open for Min). If p is a win for Max, E(p) = +∞; if p is a win for Min, E(p) = -∞.
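As a concrete illustration, a minimal Python version of the tic-tac-toe evaluation function above; the board encoding (a 3x3 nested list holding 'X', 'O', or None, with Max playing 'X') is an assumption made for the sketch:

```python
import math

LINES = (
    [[(r, c) for c in range(3)] for r in range(3)] +               # rows
    [[(r, c) for r in range(3)] for c in range(3)] +               # columns
    [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]]   # diagonals
)

def evaluate(board, max_mark='X', min_mark='O'):
    """E(p) = open lines for Max - open lines for Min; +/-inf on a win."""
    open_max = open_min = 0
    for line in LINES:
        marks = {board[r][c] for (r, c) in line}
        if marks == {max_mark}:
            return math.inf        # Max has already won
        if marks == {min_mark}:
            return -math.inf       # Min has already won
        if min_mark not in marks:  # no Min mark: line still open for Max
            open_max += 1
        if max_mark not in marks:  # no Max mark: line still open for Min
            open_min += 1
    return open_max - open_min
```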
Note that the precision of the value that the evaluation function gives for a position, as an estimate of Max's chances of winning from it, increases as the position gets closer to the end of the game; that is, the precision grows with increasing cutoff depth. The cutoff depth can be fixed, or it can vary dynamically using an approach similar to iterative deepening; however, both approaches can be disastrous. An appropriate criterion for deciding whether the cut can be made at a given level is the stability of the game situation. That is, the cut should be made at a stable state, i.e., a level at which no drastic change happens at the next level. Positions of this type are called quiescent.
Minimax Procedure
So far we have discussed two major knowledge-based components of a good game-playing program: a good move generator and a good static evaluation function. Both incorporate a great deal of knowledge about the particular game being played. The third necessary component is a procedure to propagate the values assigned by the evaluation function to the leaf nodes of the search tree, generated to a given level, up to the root of the tree. The classical procedure for this purpose is the minimax rule (also written as an equation after the list):
* 1. The value V(j) of a node j on the frontier of the tree is equal to its static evaluation E(j); otherwise:
* 2. If j is a Max node, V(j) is equal to the maximum of the values of all its successors.
* 3. If j is a Min node, V(j) is equal to the minimum of the values of all its successors.
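The same rule written compactly, where S(j) denotes the set of successors of node j:

$$V(j) = \begin{cases} E(j) & \text{if } j \text{ is on the frontier,} \\ \max_{k \in S(j)} V(k) & \text{if } j \text{ is a Max node,} \\ \min_{k \in S(j)} V(k) & \text{if } j \text{ is a Min node.} \end{cases}$$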
This rule is based on the assumption that the values propagated from the frontier to the root give a more precise estimate for that node than could be obtained by applying the evaluation function to it directly. A depth-first search procedure based on this rule is as follows (a code sketch follows the steps):
* 1. If j is terminal, return V(j) = E(j).
* 2. Generate the successors of j: j1, j2, ..., jb.
* 3. Evaluate V(j1), V(j2), ..., V(jb) from left to right.
* 4. If j is a Max node, return V(j) = max[V(j1), V(j2), ..., V(jb)].
* 5. If j is a Min node, return V(j) = min[V(j1), V(j2), ..., V(jb)].
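A direct Python sketch of these five steps, written against the hypothetical Game interface sketched earlier, with a depth parameter playing the role of the cutoff:

```python
def minimax(game, state, depth, is_max, evaluate):
    """Depth-first minimax: apply E at the frontier and propagate
    max/min values toward the root."""
    if game.is_terminal(state) or depth == 0:
        return evaluate(state)                       # V(j) = E(j)
    values = [minimax(game, game.result(state, m), depth - 1,
                      not is_max, evaluate)          # left to right
              for m in game.moves(state)]
    return max(values) if is_max else min(values)
```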
This procedure is recursive: in step 3, if ji is not terminal, the procedure is applied to it to compute its value, and since the evaluation runs from left to right the resulting search is depth-first. Obviously it is not necessary to generate all the successors at once and keep them in memory until they are evaluated. The search can be run using backtracking, so that the successors are generated and evaluated one by one and the value of the parent node is updated every time a child is evaluated. The resulting procedure is as follows (again followed by a code sketch):
* 1. If j is terminal, return V(j) = E(j).
* 2. For k = 1, 2, ..., b do:
* a. Generate jk, the k-th successor of j.
* b. Evaluate V(jk).
* c. If k = 1, set CV(j) ← V(j1). Otherwise, for k ≥ 2, set
CV(j) ← max[CV(j), V(jk)] if j is a Max node, or
CV(j) ← min[CV(j), V(jk)] if j is a Min node.
* 3. Return V(j) = CV(j).
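The backtracking version in the same Python sketch: successors are generated and evaluated one at a time, and the current value CV of the parent is updated after each child:

```python
def minimax_backtracking(game, state, depth, is_max, evaluate):
    """Minimax with backtracking: no need to hold all successors
    in memory; CV(j) is updated as each child is evaluated."""
    if game.is_terminal(state) or depth == 0:
        return evaluate(state)
    cv = None
    for move in game.moves(state):                   # one successor at a time
        v = minimax_backtracking(game, game.result(state, move),
                                 depth - 1, not is_max, evaluate)
        if cv is None:
            cv = v                                   # CV(j) <- V(j1)
        else:
            cv = max(cv, v) if is_max else min(cv, v)
    return cv                                        # V(j) = CV(j)
```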
The variable CV(j) represents the current value of node j, updated as the search proceeds. Step 2(b) is executed by calling minimax recursively. Note that in both versions of minimax the evaluation of j is not complete until all its successors have been evaluated; consequently this procedure examines all terminal nodes. If the cutoff depth is m and there are b legal moves at each point, then the time complexity is O(b^m). For real-time play this cost is completely impractical, but this algorithm is the basis for more realistic methods and for the mathematical analysis of games. The minimax procedure is an optimal method for selecting a move from a given search tree, provided that the evaluations of the terminal nodes are exactly correct. In reality, these evaluations are usually crude estimates of the value of a position and can lead to large errors. One way to address this problem is to use probability distributions.
Horizon Effect
If we assign a value x to a node at depth d, it is possible that expanding one more level, to depth d+1, would yield a very different assessment of the situation. This is because the value x is only an approximation of the true value. Unfortunately, it is always possible to look one move further ahead and find a different assessment. This is called the horizon effect: a pernicious, inevitable event can be postponed by delaying techniques, because the insertion of delaying moves pushes a certain loss of material beyond the horizon of the program (the maximum search depth), so that the loss is hidden. For example, suppose we make an exchange in which the opponent needs two moves to capture our piece. On the opponent's next turn, he will probably make the first of these moves. But then, on our turn, we could start an attack somewhere else on the board.
The opponent must respond to the attack, so he cannot complete the exchange of pieces, but the exchange is still awaiting completion. On our next turn we can mount an attack on yet another front, requiring another response from the opponent. If the search stops at that point, we never notice that the loss of one of our pieces is inevitable. The horizon effect is closely related to a key issue in implementing the search method: how to select the cutoff depth? Because of the horizon effect it is desirable to have a dynamic cutoff depth, determined by detecting states of quiescence, rest, or stability. These are states whose value does not change dramatically on making another move. The horizon effect is less apparent in programs with a procedure for finding quiescent states at which to make the cut. The horizon effect can be described as negative or positive. The negative horizon effect is essentially the one described above: a form of self-deception in which the program pushes an unpleasant result beyond the horizon of the search by means of a series of delaying moves; in essence, it is an unsuccessful attempt to prevent an unpleasant result. The positive horizon effect is a different form of self-deception, in which the program tries to achieve a desired result within the search horizon even though the result would be much better if it were postponed a few moves.
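A common way to implement such a quiescence test, sketched here under the same hypothetical interface (game.capture_moves, an assumed generator of capturing moves, stands in for whatever "non-quiet" move detector a real program would use):

```python
def quiescence(game, state, is_max, evaluate):
    """Past the nominal cutoff, search only 'noisy' moves (captures)
    so that the static evaluation is applied at a quiescent position."""
    stand_pat = evaluate(state)                  # value if we stop here
    noisy = list(game.capture_moves(state))      # assumed helper
    if not noisy:
        return stand_pat                         # quiet position: cut here
    values = [quiescence(game, game.result(state, m), not is_max, evaluate)
              for m in noisy]
    best = max(values) if is_max else min(values)
    # The side to move may also decline every capture ("stand pat").
    return max(stand_pat, best) if is_max else min(stand_pat, best)
```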
Alpha – Beta Procedure
The Alpha-Beta procedure prunes branches of the search tree that are unnecessary to explore. It maintains two bounds, α and β, which serve as reference values for the cuts. These bounds are adjusted dynamically and passed top-down as follows:

Bound α: the cutoff bound for a Min node j is a lower bound α equal to the highest current value of all Max predecessors of j. The exploration of j may be terminated as soon as its current value CV falls to α or below.

Bound β: the cutoff bound for a Max node j is an upper bound β equal to the lowest current value of all Min predecessors of j. The exploration of j may be terminated as soon as its current value CV reaches β or above.

The procedure V(j, α, β) recursively evaluates V(j) for parameters α < β. It returns V(j) when α < V(j) < β, returns α if V(j) ≤ α, and returns β if V(j) ≥ β. Clearly, if j is the root of a game tree, its minimax value is obtained by calling V(j, -∞, +∞). The α-β procedure V(j, α, β) is as follows (a code sketch follows the steps):
1. If j is terminal, return V(j) = E(j). Otherwise, let j1, ..., jb be the successors of j, set k ← 1, and if j is a Max node go to step 2; otherwise go to step 2'.
2. Set α ← max[α, V(jk, α, β)].
3. If α ≥ β, return β; else continue.
4. If k = b, return α; else set k ← k + 1 and go to step 2.
2'. Set β ← min[β, V(jk, α, β)].
3'. If β ≤ α, return α; else continue.
4'. If k = b, return β; else set k ← k + 1 and go to step 2'.
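The same procedure in Python, in the fail-hard form just described (returning α or β whenever the true value falls outside the window), again using the hypothetical interface from the earlier sketches:

```python
import math

def alphabeta(game, state, depth, alpha, beta, is_max, evaluate):
    """Fail-hard alpha-beta: returns V(j) if alpha < V(j) < beta,
    otherwise the bound that was hit."""
    if game.is_terminal(state) or depth == 0:
        return evaluate(state)
    for move in game.moves(state):
        v = alphabeta(game, game.result(state, move), depth - 1,
                      alpha, beta, not is_max, evaluate)
        if is_max:
            alpha = max(alpha, v)
            if alpha >= beta:
                return beta       # cut: a Min ancestor will avoid this line
        else:
            beta = min(beta, v)
            if beta <= alpha:
                return alpha      # cut: a Max ancestor will avoid this line
    return alpha if is_max else beta

# The minimax value of the root is obtained with the full window:
# alphabeta(game, root, depth, -math.inf, math.inf, True, evaluate)
```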
The effectiveness of the α-β algorithm depends on the order in which the successors are examined. For any set of leaf nodes there is an ordering for which the algorithm performs no cuts at all, so that all leaf nodes must be evaluated and the algorithm runs in its worst case. Although alpha-beta is the paradigm of game-playing algorithms and one of the finest artificial intelligence algorithms, it has several limitations, including:
1. It assumes that the estimate given by the heuristic evaluation of the value of a node is perfect, which is not necessarily so.
2. It seeks the maximum or minimum of the estimates that the evaluation function assigns to the nodes; however, paths leading to several good alternatives may be preferable to the path to a single alternative that merely seems better.
3. All knowledge of the game is summarized in a scalar value, which necessarily causes a loss of information that is potentially useful for the search.
4. The algorithm expands the nodes in a depth-first way that depends only on the order in which the successors of a node are generated.
5. It is assumed that both players use the same heuristic.
The game of chess
A typical chess program contains three elements: a representation of the board, tree search with cutoffs, and evaluation of positions.
a) Data of interest.
At the start of a game of chess each side has 20 possible moves, so there are 400 different possible positions after the first exchange of moves. In the middle game, the number of possible moves for each player is approximately 50.
As mentioned, the branching factor (b) for chess is about 35 and the depth (d) of the tree is about 100.
b) Evaluation function.
The static evaluation function for this game takes the form:
E(p) = a*B + b*R + c*M + d*C + e*P + f*A
The parameter B is the material balance existing in the position, using the values: pawn = 1, knight = bishop = 3, rook = 5, queen = 10. So if Max has the advantage of a bishop and Min that of a pawn, B = +3 - 1 = +2.
R: The relative safety of the king. The coefficient b may vary throughout the game, becoming 0 when the major pieces have been removed from the game and the king can join the attack rather than being protected by its own pieces.
M: Mobility of the pieces, measured by the number of squares to which each piece can move; this number is weighted by a factor depending on the type of piece.
C: Control of the center. The coefficient d takes a high value in the middle game.
P: Pawn structure. Contributions are positive for Max's protected, advanced, and passed pawns, and negative for isolated and unprotected ones.
A: A measure of the relative chances of attack, taking into account pieces advanced beyond the fourth row, rooks on open rows or columns, potential threats, etc. (A code sketch of this evaluation follows.)
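A minimal sketch of this linear evaluation in Python; the coefficient values, the feature names, and the position attributes (max_pieces, min_pieces) are illustrative assumptions, not the values or interface of any particular chess program:

```python
# Assumed coefficients a..f for the six features; purely illustrative.
COEFFS = {'B': 1.0, 'R': 0.5, 'M': 0.1, 'C': 0.3, 'P': 0.2, 'A': 0.2}

PIECE_VALUES = {'pawn': 1, 'knight': 3, 'bishop': 3, 'rook': 5, 'queen': 10}

def material_balance(position):
    """B: sum of Max's piece values minus the sum of Min's.
    position.max_pieces / position.min_pieces are assumed attributes."""
    return (sum(PIECE_VALUES[p] for p in position.max_pieces) -
            sum(PIECE_VALUES[p] for p in position.min_pieces))

def evaluate(position, features):
    """E(p) = a*B + b*R + c*M + d*C + e*P + f*A as a weighted sum.
    'features' maps each feature name ('B', 'R', ...) to an extractor
    function like material_balance above."""
    return sum(COEFFS[name] * extractor(position)
               for name, extractor in features.items())
```

c) Search method.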