Symbolic learning and representation learning
- What form should the representation take? Explicit, or implicit (latent tokens)?
2024
Keypoint Action Tokens Enable In-Context Imitation Learning in Robotics
Norman Di Palo et al., Imperial College London, RSS2024
- Uses existing text-based transformers (GPT-4 Turbo) to serialize the task's visual observations and actions into Keypoint Action Tokens (KAT)
- 3D keypoint tokens: extract K salient descriptors with DINO-ViTs, map these keypoints back into 3D space, then feed the keypoints and their 3D coordinates to the LLM, which outputs an action sequence
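The keypoint-token step can be sketched roughly as follows: given a saliency map (e.g. derived from DINO-ViT descriptors) and a depth image, pick the K most salient pixels, back-project them to 3D with a pinhole camera model, and serialize the coordinates as text tokens. The intrinsics and token format below are invented placeholders, not the paper's actual scheme.

```python
import numpy as np

def keypoint_tokens(saliency, depth, K=8, fx=500.0, fy=500.0, cx=64.0, cy=64.0):
    """Pick the K most salient pixels, back-project them with a pinhole
    model, and serialize the 3D coordinates as text tokens. The camera
    intrinsics and token format here are made-up placeholders."""
    H, W = saliency.shape
    top = np.argsort(saliency.ravel())[::-1][:K]      # indices of top-K pixels
    vs, us = np.unravel_index(top, (H, W))
    tokens = []
    for u, v in zip(us, vs):
        z = depth[v, u]
        x = (u - cx) * z / fx                          # pinhole back-projection
        y = (v - cy) * z / fy
        tokens.append(f"<kp {x:.3f} {y:.3f} {z:.3f}>")
    return tokens

# toy inputs: random saliency, unit depth
rng = np.random.default_rng(0)
toks = keypoint_tokens(rng.random((128, 128)), np.ones((128, 128)), K=4)
print(toks)
```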
CORN: Contact-based Object Representation for Nonprehensile Manipulation of General Unseen Objects
Yoonyoung Cho et al., KAIST, ICLR2024
- Nonprehensile: objects that are hard to grasp (manipulation without grasping)
- An encoder module is trained from point-cloud data and hand poses; the policy module is trained with PPO
2021
Low dimensional state representation learning with robotics priors in continuous action spaces
Nicolo Botteghi et al., University of Twente, IROS 2021
State representation learning
- Three major categories
- Reconstruction-based: methods that encode information into low-dimensional spaces by relying only on observation reconstruction.
E.g. AE, VAE, denoising AE
- Problem: these methods tend to ignore small objects in the observations, even though such objects can be relevant for solving the task. The reconstructions themselves are not useful, and the decoder may be unnecessary.
- Model-based
- Forward transition model, reward model, inverse model (?).
- Problem: may collapse to trivial solutions, especially in the case of sparse rewards
- Prior-based: all methods that loosely constrain the state space with auxiliary loss functions, injecting prior knowledge into the training of the encoder networks
- Ref: Learning state representations with robotic priors, Autonomous Robots, 2015
Method
- State and action
- Agent’s state changes are directly related to the magnitude of the action taken.
- Observation → state prediction → state: use the magnitude of the action connecting two states to constrain the distance between their state predictions
- Simplicity prior: the state space can always be compressed
- Temporal coherence prior: temporally close states should be close in state space, accounting for the magnitude of the action taken
- Proportionality prior: similar actions should produce state changes of similar magnitude
- Repeatability prior: for the same action, not only the magnitude but also the direction of the state change should be consistent
- Causality prior
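As a reference point, the original robotic priors (from the 2015 "Learning state representations with robotic priors" paper that this work extends) can be written as simple batch losses. The continuous-action version in this paper additionally accounts for action magnitudes, which this sketch omits; array shapes and pairing conventions are assumptions.

```python
import numpy as np

def temporal_coherence(s, s_next):
    # consecutive states should be close in the learned state space
    return np.mean(np.sum((s_next - s) ** 2, axis=1))

def proportionality(ds1, ds2):
    # the same action should cause state changes of similar magnitude
    # (ds1, ds2: state changes from pairs of experiences with equal actions)
    return np.mean((np.linalg.norm(ds2, axis=1)
                    - np.linalg.norm(ds1, axis=1)) ** 2)

def repeatability(s1, s2, ds1, ds2):
    # for the same action taken in similar states, the state change
    # should also point in a similar direction
    w = np.exp(-np.sum((s2 - s1) ** 2, axis=1))
    return np.mean(w * np.sum((ds2 - ds1) ** 2, axis=1))

def causality(s1, s2):
    # states where the same action yielded different rewards
    # should be pushed apart
    return np.mean(np.exp(-np.sum((s2 - s1) ** 2, axis=1)))

ds = np.array([[1.0, 0.0], [0.0, 1.0]])
print(proportionality(ds, ds))  # 0.0: identical magnitudes incur no loss
```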
- Two-level network structure
-> Network structure (ref: Low dimensional state representation learning with reward-shaped priors): the sensor modalities are independently processed by convolutional layers, flattened, and merged through fully connected layers to create the final low-dimensional state prediction of dimension 5 (?). A second network maps state to action.
Experiments
- RGB and 2D LiDAR data merging.
Q: is this universal? To what range of sensor modalities can it be applied? Both architectures present three fully connected hidden layers of dimension 512?
2020
Learning Markov State Abstractions for Deep Reinforcement Learning
Cameron Allen et al., NeurIPS2020
Main contributions:
- **Sufficient conditions for extracting an abstract MDP state representation from rich observations**
- The basic assumption is that the learned state representation satisfies the Markov property
- Sufficient conditions for obtaining a Markov abstract state representation:
- The agent’s policy
- The inverse dynamics model:
compared with the transition model, the two can be converted into each other via Bayes' theorem
- A density ratio
(a more precise definition still needs to be found)
- The above conditions can be approximately satisfied using a combination of two popular representation learning objectives:
- Inverse model estimation: predicts the action distribution that explains two consecutive states
- Temporal contrastive learning: determines whether two states were in fact consecutive
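The two objectives can be sketched numerically as a cross-entropy over actions plus a binary cross-entropy over state pairs; here the network outputs are replaced by precomputed logits (in the paper these heads are trained end-to-end on top of the encoder).

```python
import numpy as np

def inverse_model_loss(action_logits, actions):
    """Cross-entropy for predicting which discrete action connected
    (s_t, s_{t+1}); the logits would come from a head on top of the
    learned encoder (faked here)."""
    z = action_logits - action_logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -np.mean(log_probs[np.arange(len(actions)), actions])

def contrastive_loss(pair_logits, is_consecutive):
    """Binary cross-entropy: each logit scores whether a pair (s, s')
    came from consecutive timesteps (label 1) or was drawn at random
    (label 0)."""
    p = 1.0 / (1.0 + np.exp(-pair_logits))
    eps = 1e-9
    return -np.mean(is_consecutive * np.log(p + eps)
                    + (1.0 - is_consecutive) * np.log(1.0 - p + eps))

# confident, correct predictions -> both losses near zero
print(inverse_model_loss(np.array([[5.0, 0.0], [0.0, 5.0]]), np.array([0, 1])))
print(contrastive_loss(np.array([5.0, -5.0]), np.array([1.0, 0.0])))
```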
Evaluating Learned State Representations for Atari
Adam Tupper et al., IVCNZ2020
- Main contribution:
- Examines the quality of state representations learned with different autoencoders (a family of neural networks used for representation learning), and proposes methods for evaluating learned state representations
- Autoencoders used
- AEs: Autoencoders
- VAEs: Variational autoencoders
- β-VAEs: Disentangled variational autoencoders
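As a toy illustration of reconstruction-based evaluation: a linear autoencoder is equivalent to PCA, so SVD gives its optimum directly, and the reconstruction error as a function of bottleneck size shows whether the representation captures the data's true dimensionality. The data here are synthetic, not Atari frames.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "observations": 16-D vectors that really live on a 2-D subspace,
# plus a little noise
basis = rng.normal(size=(2, 16))
X = rng.normal(size=(200, 2)) @ basis + 0.01 * rng.normal(size=(200, 16))

def linear_ae_error(X, k):
    """Reconstruction MSE of the optimal linear autoencoder with a
    k-dimensional bottleneck (equivalent to PCA via SVD)."""
    mu = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mu, full_matrices=False)
    Z = (X - mu) @ Vt[:k].T          # encode: project onto top-k components
    Xhat = Z @ Vt[:k] + mu           # decode: map back to observation space
    return np.mean((Xhat - X) ** 2)

print(linear_ae_error(X, 2))  # tiny: a 2-D bottleneck suffices
print(linear_ae_error(X, 1))  # much larger: bottleneck too small
```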
2019
A Theory of Abstraction in Reinforcement Learning
David Abel, Thesis 2020/AAAI 2019
- Thesis and paper
2018
Visual Rationalizations in Deep Reinforcement Learning for Atari Games
Laurens Weitkamp et al., BNAIC2018
- Main contribution: uses Grad-CAM to visualize the RL agent's decision process, creating action-specific activation maps that highlight the regions providing the most important evidence
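Grad-CAM itself is only a few lines once the layer activations and the gradient of the chosen action's score with respect to them are available; here both are faked as numpy arrays, whereas in practice they come from the agent's convolutional layers.

```python
import numpy as np

def grad_cam(activations, gradients):
    """activations, gradients: arrays of shape (C, H, W) for one conv
    layer, taken w.r.t. the chosen action's score. Returns an H x W
    heatmap in [0, 1]."""
    weights = gradients.mean(axis=(1, 2))            # GAP over spatial dims
    cam = np.einsum("c,chw->hw", weights, activations)
    cam = np.maximum(cam, 0)                         # ReLU: keep positive evidence
    if cam.max() > 0:
        cam /= cam.max()                             # normalize to [0, 1]
    return cam

# toy example: channel 0 is active at (1, 1) and positively weighted
A = np.zeros((2, 4, 4)); A[0, 1, 1] = 1.0
G = np.zeros((2, 4, 4)); G[0] = 1.0; G[1] = -1.0
cam = grad_cam(A, G)
print(cam[1, 1])  # 1.0
```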
State Abstractions for Lifelong Reinforcement Learning
David Abel et al., ICML2018
Symbolic algorithms for graphs and Markov Decision Processes with fairness objectives
Krishnendu Chatterjee et al., International conference on computer aided verification 2018
2017
FFRob: Leveraging symbolic planning for efficient task and motion planning
Caelan Reed Garrett et al.
- Related work: manipulation planning, symbolic planning, task and motion planning
- Symbolic planning representation: SAS+ (equivalent of STRIPS)
- Relaxed evaluation: ignores delete effects
- Condition tests: dynamic programming + lazy collision checking
- Search algorithms: any standard search
- Search heuristics
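The "relaxed evaluation" step (ignoring delete effects) amounts to a simple reachability fixpoint over positive facts. A minimal sketch with a made-up STRIPS-style domain:

```python
def relaxed_reachable(init, goal, actions):
    """Delete relaxation: apply every action whose preconditions hold,
    add its positive effects, ignore delete effects; iterate to a
    fixpoint. actions: list of (preconditions, add_effects) frozensets."""
    facts = set(init)
    changed = True
    while changed:
        changed = False
        for pre, add in actions:
            if pre <= facts and not add <= facts:
                facts |= add
                changed = True
    return goal <= facts

# toy domain (invented for illustration)
acts = [
    (frozenset({"at_a"}), frozenset({"at_b"})),              # move a -> b
    (frozenset({"at_b"}), frozenset({"has_key"})),           # pick up key at b
    (frozenset({"at_b", "has_key"}), frozenset({"door_open"})),
]
print(relaxed_reachable({"at_a"}, {"door_open"}, acts))  # True
```

Since facts are never deleted, the fixpoint over-approximates reachability, which is exactly what makes it a cheap pruning test and heuristic basis.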
2016
Combining task and motion planning: A culprit detection problem
Fabien Lagriffoul et al.
CTAMP: Combined Task and Motion Planning. Logical constraints are handled by a task planner based on answer set programming (ASP; Lifschitz, 2008).
Key components:
1. A geometric reasoner capable of analyzing the cause of geometric failures.
2. A common language between the geometric reasoner and the task planner to describe the cause of failure, beyond a mere "success/failure".
Question: how to find the minimal explanation for geometric failures
The problem this paper tries to solve: if a task is not feasible, how can the failure be "explained" geometrically?
- Benefit: stops the high-level search early (otherwise there is an infinite number of infeasible plans obtained by permuting the temporary locations of blocks, permuting the order of actions, or increasing the number of actions)
Isolating the minimal number of factors explaining the failure is the culprit detection problem that we propose to solve
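The culprit-detection idea can be sketched as deletion-based shrinking of an infeasible constraint set, with the geometric reasoner abstracted into a feasibility oracle; the oracle and constraint names below are invented for illustration.

```python
def minimal_culprit(constraints, feasible):
    """Deletion-based shrinking: drop each constraint in turn; if the
    remainder is still infeasible, the dropped one is not needed in the
    explanation. `feasible(subset)` plays the role of the geometric
    reasoner (an oracle here)."""
    assert not feasible(constraints)
    core = list(constraints)
    i = 0
    while i < len(core):
        candidate = core[:i] + core[i + 1:]
        if not feasible(candidate):
            core = candidate       # constraint i was not needed
        else:
            i += 1                 # constraint i is part of the culprit
    return core

# toy oracle: infeasible iff both conflicting facts are present
def feasible(cs):
    return not ({"block_on_target", "target_occupied"} <= set(cs))

print(minimal_culprit(["a", "block_on_target", "b", "target_occupied"], feasible))
```

Each kept constraint is individually necessary, so the result is a minimal (though not necessarily minimum-cardinality) explanation of the failure.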