A common problem with hill climbing:
An alternative way of specifying the problem is that the states are boards with 8 queens already on them, and an action is a movement of one of the queens. In this case, our agent can use an evaluation function and do hill climbing. That is, it counts the number of pairs of queens where one can take the other, and only moves a queen if the movement reduces that number. When there is a choice of movements resulting in the same decrease, the agent can choose one randomly from the choices. In the 8-queens problem, each of the 8 queens can move to any of the 64 - 8 = 56 empty squares, so there are only 56 * 8 = 448 possible ways to move one queen, and our agent only has to calculate the evaluation function 448 times at each stage. If it only chooses moves where the situation with respect to the evaluation function improves, it is doing hill climbing (or gradient descent, if it's better to think of the agent going downhill rather than uphill).
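As a rough illustration of the evaluation function and a single hill-climbing move, here is a minimal Python sketch. The names `attacking_pairs`, `neighbours` and `hill_climb_step` are made up for this example, and it rests on two assumptions not stated above: a pair of queens counts as attacking whenever they share a row, column or diagonal (queens standing in between are ignored), and the agent always takes the largest available decrease, breaking ties at random.

```python
import random
from itertools import combinations

N = 8  # board size, and also the number of queens


def attacking_pairs(queens):
    """Evaluation function: number of pairs of queens sharing a row,
    column or diagonal (assumption: blocking queens are ignored)."""
    return sum(
        r1 == r2 or c1 == c2 or abs(r1 - r2) == abs(c1 - c2)
        for (r1, c1), (r2, c2) in combinations(queens, 2)
    )


def neighbours(queens):
    """Yield every board reachable by moving one queen to an empty square.
    With 8 queens and 56 empty squares this yields 8 * 56 = 448 boards."""
    occupied = frozenset(queens)
    for q in occupied:
        for r in range(N):
            for c in range(N):
                if (r, c) not in occupied:
                    yield (occupied - {q}) | {(r, c)}


def hill_climb_step(queens, rng=random):
    """Return a board with strictly fewer attacking pairs, or None if
    no single move improves the evaluation."""
    current = attacking_pairs(queens)
    scored = [(attacking_pairs(n), n) for n in neighbours(queens)]
    best = min(score for score, _ in scored)
    if best >= current:
        return None  # stuck: no move reduces the number of pairs
    # Several moves may give the same decrease; pick one at random.
    return rng.choice([n for score, n in scored if score == best])
```

For instance, starting from `frozenset((0, c) for c in range(N))`, with all eight queens on the top row, `attacking_pairs` gives 28 and `hill_climb_step` returns a board with a lower count.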
A common problem with this search strategy is local maxima: the search has not yet reached a solution, but it can only go downhill in terms of the evaluation function. For example, we could get to the stage where only two queens can take each other, but moving any queen increases this number to at least three. In cases like this, the agent can do a random re-start, whereby it randomly chooses a state to start the whole process from again. This search strategy has the appeal of never requiring the agent to store more than one state at any one time (the part of the hill the agent is on). Russell and Norvig make the analogy that this kind of search is like trying to climb Mount Everest in the fog with amnesia, but they do concede that it is often the search strategy of choice for some industrial problems.
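The random re-start idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the `max_restarts` cap and the `seed` parameter are invented for the example, attacks are counted by shared row, column or diagonal (ignoring blocking queens), and the agent takes the largest available decrease at each step.

```python
import random
from itertools import combinations

N = 8  # board size, and also the number of queens


def attacking_pairs(queens):
    """Number of pairs of queens sharing a row, column or diagonal."""
    return sum(
        r1 == r2 or c1 == c2 or abs(r1 - r2) == abs(c1 - c2)
        for (r1, c1), (r2, c2) in combinations(queens, 2)
    )


def random_state(rng):
    """Place N queens on N distinct, randomly chosen squares."""
    squares = [(r, c) for r in range(N) for c in range(N)]
    return frozenset(rng.sample(squares, N))


def best_neighbours(queens):
    """Return (score, boards) for the lowest-scoring single-queen moves."""
    best_score, best_states = None, []
    for q in queens:
        rest = queens - {q}
        for r in range(N):
            for c in range(N):
                if (r, c) in queens:
                    continue  # occupied squares are not legal targets
                candidate = rest | {(r, c)}
                score = attacking_pairs(candidate)
                if best_score is None or score < best_score:
                    best_score, best_states = score, [candidate]
                elif score == best_score:
                    best_states.append(candidate)
    return best_score, best_states


def random_restart_hill_climb(max_restarts=1000, seed=None):
    """Hill climb from random boards until a solution is found.
    Returns (board, restarts_used), or (None, max_restarts) on failure."""
    rng = random.Random(seed)
    for restart in range(max_restarts):
        # Only the current board is carried from one step to the next.
        state = random_state(rng)
        score = attacking_pairs(state)
        while score > 0:
            best_score, best_states = best_neighbours(state)
            if best_score >= score:
                break  # no move improves the board, so re-start
            state, score = rng.choice(best_states), best_score
        if score == 0:
            return state, restart
    return None, max_restarts
```

Calling `random_restart_hill_climb(seed=0)` returns a board together with the number of re-starts that were needed; how many that is depends on the random choices made along the way.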