Reinforcement Learning Experiment: Mountain Car Example Program

Size: 6 KB

File type: .rar

Published: 2021-06-08
Language: Other
Tags: reinforcement learning, C++, mountain car

Resource description

The mountain car example from Sutton's textbook, a basic simulation program for reinforcement learning experiments.

Code snippet and file information

/*
This is an example program for reinforcement learning with linear 
function approximation.  The code follows the pseudo-code for linear 
gradient-descent Sarsa(lambda) given in Figure 8.8 of the book 
"Reinforcement Learning: An Introduction" by Sutton and Barto.
One difference is that we use the implementation trick mentioned on 
page 189 to only keep track of the traces that are larger 
than "min-trace". 

Before running the program you need to obtain the tile-coding 
software, available at http://envy.cs.umass.edu/~rich/tiles.C and tiles.h
(see http://envy.cs.umass.edu/~rich/tiles.html for documentation).

The code below is in three main parts: 1) Mountain Car code, 2) General 
RL code, and 3) top-level code and misc.

Written by Rich Sutton 12/19/00
 */

#include <math.h>      // cos() in MCarStep; the names of the other system headers are missing from this listing
#include "tiles.h"
#include "stdio.h"
#include "stdlib.h"

//////////     Part 1: Mountain Car code     //////////////

// Global variables:
float mcar_position, mcar_velocity;      // position and velocity of the car

#define mcar_min_position -1.2
#define mcar_max_position 0.6
#define mcar_max_velocity 0.07            // the negative of this is the minimum velocity
#define mcar_goal_position 0.5

#define POS_WIDTH (1.7 / 8)               // the tile width for position
#define VEL_WIDTH (0.14 / 8)              // the tile width for velocity

// Prototypes
void MCarInit();                              // initialize car state
void MCarStep(int a);                         // update car state for given action
bool MCarAtGoal();                            // is car at goal?

void MCarInit()
// Initialize state of car
  { mcar_position = -0.5;
    mcar_velocity = 0.0;}

void MCarStep(int a)
// Take action a (0, 1, or 2) and update the state of the car
  { mcar_velocity += (a-1)*0.001 + cos(3*mcar_position)*(-0.0025);
    if (mcar_velocity > mcar_max_velocity) mcar_velocity = mcar_max_velocity;
    if (mcar_velocity < -mcar_max_velocity) mcar_velocity = -mcar_max_velocity;
    mcar_position += mcar_velocity;
    if (mcar_position > mcar_max_position) mcar_position = mcar_max_position;
    if (mcar_position < mcar_min_position) mcar_position = mcar_min_position;
    // the car stops when it hits the left wall
    if (mcar_position==mcar_min_position && mcar_velocity<0) mcar_velocity = 0;}

bool MCarAtGoal()
// Is car within goal region?
  { return mcar_position >= mcar_goal_position;}
  
  
//////////     Part 2: Semi-General RL code     //////////////

#define MEMORY_SIZE 10000                        // number of parameters in theta (memory size)
#define NUM_ACTIONS 3                            // number of actions
#define NUM_TILINGS 10                           // number of tilings

// Global RL variables:
float Q[NUM_ACTIONS];                            // action values
float theta[MEMORY_SIZE];                        // modifiable parameter vector, aka memory, weights
float e[MEMORY_SIZE];                            // eligibility traces
int F[NUM_ACTIONS][NUM_TILINGS];                 // active tile indices, one set per action
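
The excerpt stops at the global declarations, so the code that uses them is not shown. As a rough illustration only, the sketch below shows one way such arrays are typically used with linear function approximation: an action value is the sum of the weights of the active tiles, and traces are decayed each step (by gamma times lambda in Sarsa(lambda)) and dropped once they fall below a "min-trace" threshold, as the header comment describes. The names `kMemorySize`, `kNumTilings`, `ActionValue`, and `SparseTraces` are hypothetical and do not come from mountaincar1.cpp, and the array sizes are shrunk for readability.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sizes for illustration (the listing uses 10000 and 10).
const int kMemorySize = 16;
const int kNumTilings = 4;

// Linear action value: the sum of the weights of the active tiles,
// one active tile index per tiling.
float ActionValue(const float theta[], const int active[]) {
    float q = 0.0f;
    for (int j = 0; j < kNumTilings; ++j) q += theta[active[j]];
    return q;
}

// Traces stored densely, plus a list of the indices that are currently
// nonzero so that updates only touch those entries.
struct SparseTraces {
    float e[kMemorySize] = {0};
    std::vector<int> nonzero;   // indices i with e[i] != 0

    // Set trace i to a positive value (for example 1 for replacing traces).
    void Set(int i, float value) {
        if (e[i] == 0.0f) nonzero.push_back(i);
        e[i] = value;
    }

    // Multiply every stored trace by `decay` and forget those that
    // fall below `min_trace`.
    void Decay(float decay, float min_trace) {
        std::size_t kept = 0;
        for (std::size_t k = 0; k < nonzero.size(); ++k) {
            int i = nonzero[k];
            e[i] *= decay;
            if (e[i] < min_trace) {
                e[i] = 0.0f;            // drop the trace
            } else {
                nonzero[kept++] = i;    // keep tracking it
            }
        }
        nonzero.resize(kept);
    }
};
```

Keeping the list of nonzero indices means the per-step cost depends on the number of live traces rather than on the full memory size, which is presumably the point of the trick; the archive's actual bookkeeping may differ.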

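The Part 1 dynamics can also be tried without any learning code or the tile-coding files. The sketch below copies the constants and the update rule from the listing into a small struct and drives the car with a hand-written rule that pushes in the direction of motion. That rule, and the names `CarState`, `Step`, `PushWithVelocity`, and `RunEpisode`, are assumptions made for illustration and are not part of the archive.

```cpp
#include <cmath>

// Constants copied from the listing.
const double kMinPosition = -1.2;
const double kMaxPosition = 0.6;
const double kMaxVelocity = 0.07;
const double kGoalPosition = 0.5;

struct CarState {
    double position;
    double velocity;
};

// Same update rule as MCarStep: action 0 = reverse, 1 = coast, 2 = forward.
void Step(CarState& s, int a) {
    s.velocity += (a - 1) * 0.001 + std::cos(3 * s.position) * (-0.0025);
    if (s.velocity > kMaxVelocity) s.velocity = kMaxVelocity;
    if (s.velocity < -kMaxVelocity) s.velocity = -kMaxVelocity;
    s.position += s.velocity;
    if (s.position > kMaxPosition) s.position = kMaxPosition;
    if (s.position < kMinPosition) s.position = kMinPosition;
    if (s.position == kMinPosition && s.velocity < 0) s.velocity = 0;
}

// Hypothetical fixed policy: always push in the direction the car is moving.
int PushWithVelocity(const CarState& s) {
    return s.velocity >= 0 ? 2 : 0;
}

// Run one episode from the start state used by MCarInit.
// Returns the number of steps taken to reach the goal,
// or -1 if the goal is not reached within max_steps.
int RunEpisode(int max_steps) {
    CarState s = {-0.5, 0.0};
    for (int t = 0; t < max_steps; ++t) {
        if (s.position >= kGoalPosition) return t;
        Step(s, PushWithVelocity(s));
    }
    return -1;
}
```

Calling `RunEpisode(10000)` from a `main` function and printing the result gives a baseline step count to compare against a learned policy. The forward thrust (0.001) is weaker than the strongest gravity term (0.0025), so the car cannot drive straight up the hill and has to rock back and forth to build speed.
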
Attribute   Size       Date        Time   Name
----------- ---------  ----------  -----  ----
File            13541  2015-08-24  10:28  mountaincar1.cpp
File             4306  2015-08-24  10:29  tiles.C
File              339  2015-08-20  09:42  tiles.h
----------- ---------  ----------  -----  ----
Total           18186                     3 files
