Minimax Backed-Up Scores

RL Environments, Games & Applications DS practice problem on Onlearn.

Difficulty: medium.

Topics: Minimax Backed-Up Scores, Alpha-Beta Pruning, Zero-Sum Equilibrium, Bellman Backup Operator, Minimax Value Propagation, Terminal State Utility, Game Theory, Reinforcement Learning, Search Algorithms, Decision Theory, Computational Complexity, Adversarial Search, Markov Decision Processes, Recursive State Evaluation, Heuristic Function Design, Tree Traversal Strategies.

Implement a function minimax score that computes the minimax backed up value for a game tree and returns the optimal path of play. In two player zero sum games, the minimax algorithm is used to determine the best strategy for each player. The game tree is represented as a nested list structure where: A number (int or float) represents a leaf node (terminal evaluation score) A list represents an internal node whose elements are its children Players alternate turns at each depth level of the tree. At a maximizing node, the player selects the child with the highest backed up value. At a minimizing node, the player selects the child with the lowest backed up value. Leaf node values propagate upward through the tree according to these selection rules. When multiple children share the same optimal backed up value at any node, the child with the lowest index should be selected. Args: tree: A nested list structure representing the game tree. Numbers are leaf scores, lists are internal nodes. is maximizing: Boolean indicating whether the root node is a maximizing player (default True). Returns: A tuple (value, optimal path) where: value: float, the minimax backed up score at the root, rounded to 4 decimal places optimal path: list of integers, the sequence of child indices chosen along the optimal line of play from root to leaf