Low dimensional state representation learning with robotics priors in continuous action spaces
Nicolo Botteghi et al., University of Twente, IROS 2021
State representation learning
- Three major categories
- Reconstruction-based: methods that encode information into low-dimensional spaces by relying only on observation reconstruction.
E.g. AE, VAE, denoising AE
- Problem: small objects in the observations tend to be ignored, even though they can be relevant for solving the task. The reconstructions themselves are not useful, and the decoder may be unnecessary.
- Solution
- Model-based
- Forward transition model, reward model, inverse model (?).
- Problem: may collapse to trivial solutions, especially with sparse rewards
- Prior-based: methods that loosely constrain the state space by injecting prior knowledge, in the form of auxiliary loss functions, into the training of the encoder networks
- Ref: Learning state representations with robotic priors, Autonomous Robots, 2015
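A worked example of the reconstruction problem noted above, as a minimal sketch. The 64x64 image size and the 2x2 object are hypothetical numbers chosen for illustration, not taken from the paper. A reconstruction that drops the object entirely still gets a per-pixel MSE of about 0.001, so the loss gives the encoder little reason to keep the object in the state.

```python
import numpy as np

# Hypothetical 64x64 grayscale observation: empty background plus a 2x2 "object".
obs = np.zeros((64, 64))
obs[30:32, 30:32] = 1.0

# A reconstruction that captures only the background and drops the object.
recon_without_object = np.zeros_like(obs)

# Per-pixel mean squared error, a common reconstruction loss for an AE.
mse = np.mean((obs - recon_without_object) ** 2)

# Fraction of the image covered by the object (4 of 4096 pixels).
object_fraction = obs.sum() / obs.size

print(f"MSE when the object is ignored: {mse:.6f}")
print(f"Fraction of pixels covered by the object: {object_fraction:.6f}")
```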
Method
- State and action
- Agent’s state changes are directly related to the magnitude of the action taken.
- Observation -> state prediction
- State
- Use the action magnitude to connect state prediction -> state distance
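- Simplicity prior: the state space can always be reduced.
- Temporal coherence prior: states that are close in time should also be close in distance; the effect of the action magnitude should be taken into account.
- Proportionality prior: similar actions should cause similar state changes.
- Repeatability prior: for actions, direction should be considered in addition to changes in magnitude.
- Causality prior
- Two-stage network structure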
-> network structure. REF: Low dimensional state representation learning with reward-shaped priors. The sensor modalities are processed independently by convolutional layers, flattened, and merged through fully connected layers to produce the final low-dimensional state prediction, of dimension 5 (?).
-> mapping state to action
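The priors listed above can be sketched as auxiliary losses on a batch of transitions. This is a rough illustration, not the paper's exact formulation. The 2015 reference compares pairs of samples with the same discrete action; here that condition is replaced by a soft weight exp(-||a_i - a_j||^2), which is an assumption made for continuous actions. The exp(-||a||) weighting in the temporal coherence term is also an assumption. All function names are hypothetical. The simplicity prior has no loss term in this sketch; it is reflected only in the small state dimension.

```python
import numpy as np

def pairwise_action_weight(a):
    """Soft 'same action' weight exp(-||a_i - a_j||^2) for every pair (assumption)."""
    diff = a[:, None, :] - a[None, :, :]
    return np.exp(-np.sum(diff ** 2, axis=-1))

def pairwise_state_similarity(s):
    """exp(-||s_i - s_j||^2) for every pair of states."""
    diff = s[:, None, :] - s[None, :, :]
    return np.exp(-np.sum(diff ** 2, axis=-1))

def temporal_coherence_loss(s, s_next, a):
    """Consecutive states stay close; larger actions are penalized less (assumed weighting)."""
    ds_sq = np.sum((s_next - s) ** 2, axis=1)
    return float(np.mean(ds_sq * np.exp(-np.linalg.norm(a, axis=1))))

def proportionality_loss(s, s_next, a):
    """Similar actions should change the state by a similar amount."""
    ds_norm = np.linalg.norm(s_next - s, axis=1)
    gap = (ds_norm[:, None] - ds_norm[None, :]) ** 2
    return float(np.mean(pairwise_action_weight(a) * gap))

def repeatability_loss(s, s_next, a):
    """Similar actions in similar states should change the state in the same direction."""
    ds = s_next - s
    ds_gap = np.sum((ds[:, None, :] - ds[None, :, :]) ** 2, axis=-1)
    return float(np.mean(pairwise_action_weight(a) * pairwise_state_similarity(s) * ds_gap))

def causality_loss(s, a, r):
    """Similar actions with different rewards should come from states that are far apart."""
    different_reward = (r[:, None] != r[None, :]).astype(float)
    return float(np.mean(pairwise_action_weight(a) * pairwise_state_similarity(s) * different_reward))

# Usage on a random batch: 8 transitions, 5-D states, 2-D continuous actions.
rng = np.random.default_rng(0)
s = rng.normal(size=(8, 5))
s_next = s + 0.1 * rng.normal(size=(8, 5))
a = rng.normal(size=(8, 2))
r = rng.integers(0, 2, size=8).astype(float)

total = (temporal_coherence_loss(s, s_next, a)
         + proportionality_loss(s, s_next, a)
         + repeatability_loss(s, s_next, a)
         + causality_loss(s, a, r))
print("total prior loss:", total)
```

In training, a sum like `total` would be minimized with respect to the encoder parameters that produce `s` and `s_next`; the equal weighting of the four terms here is arbitrary.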
EXP
- RGB and 2D LiDAR data merging.
Q: Is this universal? To what range of settings can it be applied?
Q: Do both architectures have three fully connected hidden layers of dimension 512?
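The RGB and 2D LiDAR merging described in the notes can be sketched in PyTorch as below. This is only an illustration of the "separate convolutional branches, flatten, merge through fully connected layers" idea. The state dimension of 5 and the hidden width of 512 come from the notes and are both marked uncertain there; the class name, channel counts, kernel sizes, pooling sizes, and input shapes are hypothetical.

```python
import torch
import torch.nn as nn

class FusionEncoder(nn.Module):
    """Hypothetical two-branch encoder: RGB image + 2D LiDAR scan -> low-dimensional state."""

    def __init__(self, state_dim=5, hidden=512):
        super().__init__()
        # RGB branch: 2D convolutions, pooled to a fixed 4x4 map, then flattened (32*4*4 = 512).
        self.rgb = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        )
        # LiDAR branch: 1D convolutions over the range scan, pooled to length 8 (32*8 = 256).
        self.lidar = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(8), nn.Flatten(),
        )
        # Merge: three fully connected hidden layers, then the state prediction.
        self.head = nn.Sequential(
            nn.Linear(512 + 256, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, rgb, lidar):
        # Each modality is processed independently, then the features are concatenated.
        features = torch.cat([self.rgb(rgb), self.lidar(lidar)], dim=1)
        return self.head(features)

# Usage with dummy inputs: batch of 2, 64x64 RGB images and 360-beam scans.
encoder = FusionEncoder()
state = encoder(torch.zeros(2, 3, 64, 64), torch.zeros(2, 1, 360))
print(state.shape)
```

The adaptive pooling layers are a convenience so the sketch accepts different image and scan sizes; the notes do not say whether the original network does this.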