《强化学习导论》第二版源代码（python）.rar (Reinforcement Learning: An Introduction, 2nd edition, Python source code)

Size: 4.06MB

File type: .rar

Coins: 2

Downloads: 0

Published: 2023-11-10
Language: Python
Tags: reinforcement learning


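Resource description

The original English book: a PDF of Reinforcement Learning: An Introduction (second edition) together with a Python implementation of its examples. The archive also includes a Chinese translation of Chapters 2 through 10 of the first edition.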


Code snippet and file information

#######################################################################
# Copyright (C)                                                       #
# 2016 - 2018 Shangtong Zhang(zhangshangtong.cpp@gmail.com)           #
# 2016 Jan Hakenberg(jan.hakenberg@gmail.com)                         #
# 2016 Tian Jun(tianjun.cpp@gmail.com)                                #
# 2016 Kenta Shimada(hyperkentakun@gmail.com)                         #
# Permission given to modify the code as long as you keep this        #
# declaration at the top                                              #
#######################################################################

import numpy as np
import pickle

BOARD_ROWS = 3
BOARD_COLS = 3
BOARD_SIZE = BOARD_ROWS * BOARD_COLS

class State:
    def __init__(self):
        # the board is represented by an n * n array
        # 1 represents a chessman of the player who moves first
        # -1 represents a chessman of the other player
        # 0 represents an empty position
        self.data = np.zeros((BOARD_ROWS, BOARD_COLS))
        self.winner = None
        self.hash_val = None
        self.end = None

    # compute the hash value for one state; it's unique
    def hash(self):
        if self.hash_val is None:
            self.hash_val = 0
            for i in self.data.reshape(BOARD_ROWS * BOARD_COLS):
                if i == -1:
                    i = 2
                self.hash_val = self.hash_val * 3 + i
        return int(self.hash_val)

    # check whether a player has won the game, or it's a tie
    def is_end(self):
        if self.end is not None:
            return self.end
        results = []
        # check rows
        for i in range(0, BOARD_ROWS):
            results.append(np.sum(self.data[i, :]))
        # check columns
        for i in range(0, BOARD_COLS):
            results.append(np.sum(self.data[:, i]))

        # check diagonals
        results.append(0)
        for i in range(0, BOARD_ROWS):
            results[-1] += self.data[i, i]
        results.append(0)
        for i in range(0, BOARD_ROWS):
            results[-1] += self.data[i, BOARD_ROWS - 1 - i]

        for result in results:
            if result == 3:
                self.winner = 1
                self.end = True
                return self.end
            if result == -3:
                self.winner = -1
                self.end = True
                return self.end

        # whether it's a tie
        sum = np.sum(np.abs(self.data))
        if sum == BOARD_ROWS * BOARD_COLS:
            self.winner = 0
            self.end = True
            return self.end

        # game is still going on
        self.end = False
        return self.end

    # @symbol: 1 or -1
    # put chessman symbol in position (i, j)
    def next_state(self, i, j, symbol):
        new_state = State()
        new_state.data = np.copy(self.data)
        new_state.data[i, j] = symbol
        return new_state

    # print the board
    def print_state(self):
        ...  # the snippet shown on this page is cut off here
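
The hash in the snippet above reads the board as a nine-digit base-3 number (0 for an empty cell, 1 for the first player, and 2 standing in for -1), and the end-of-game check looks for a row, column, or diagonal that sums to 3 or -3. A minimal standalone sketch of both ideas follows. `board_hash` and `line_sums` are hypothetical helper names introduced here for illustration; they are not functions from the archive.

```python
import numpy as np

def board_hash(board):
    """Encode a 3x3 board of {0, 1, -1} as a base-3 integer (-1 becomes digit 2)."""
    value = 0
    for cell in np.asarray(board).reshape(-1):
        digit = 2 if cell == -1 else int(cell)
        value = value * 3 + digit
    return value

def line_sums(board):
    """Return the sums of all rows, all columns, and both diagonals."""
    board = np.asarray(board)
    sums = list(board.sum(axis=1)) + list(board.sum(axis=0))
    sums.append(np.trace(board))             # main diagonal
    sums.append(np.trace(np.fliplr(board)))  # anti-diagonal
    return [int(s) for s in sums]

# Illustrative boards (assumed values, not taken from the archive)
empty = np.zeros((3, 3))
first_row = np.array([[1, 1, 1],
                      [0, -1, 0],
                      [-1, 0, 0]])

print(board_hash(empty))          # 0
print(board_hash(first_row))      # 9657
print(3 in line_sums(first_row))  # True: the first player completed the top row
```

Because every cell contributes one base-3 digit, two different boards can never share a hash, which is what lets the hash serve as a dictionary key for states.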

Attributes       Size  Date       Time   Name
----------- ---------  ---------- -----  ----
    .......        40  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\.gitignore
    .......       148  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\.travis.yml
    .......     11292  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter01\tic_tac_toe.py
    .......      9151  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter02\ten_armed_testbed.py
    .......      3808  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter03\grid_world.py
    .......      7391  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter04\car_rental.py
    .......      8716  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter04\car_rental_synchronous.py
    .......      2445  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter04\gamblers_problem.py
    .......      3436  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter04\grid_world.py
    .......     13167  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter05\blackjack.py
    .......      1814  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter05\infinite_variance.py
    .......      9355  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter06\cliff_walking.py
    .......      4269  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter06\maximization_bias.py
    .......      6574  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter06\random_walk.py
    .......      4018  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter06\windy_grid_world.py
    .......      4249  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter07\random_walk.py
    .......      1627  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter08\expectation_vs_sample.py
    .......     23252  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter08\maze.py
    .......      4892  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter08\trajectory_sampling.py
    .......     15793  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter09\random_walk.py
    .......      4262  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter09\square_wave.py
    .......      9605  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter10\access_control.py
    .......     13681  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter10\mountain_car.py
    .......     11839  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter11\counterexample.py
    .......     12140  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter12\mountain_car.py
    .......      9679  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter12\random_walk.py
    .......      8012  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\chapter13\short_corridor.py
    .......     36003  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\images\example_13_1.png
    .......    238133  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\images\example_6_2.png
    .......     31488  2019-03-14 04:56  《强化学习导论》第二版源代码（python）\images\example_8_4.png

............ 67 more file entries omitted
