FFRob: Leveraging symbolic planning for efficient task and motion planning
Caelan Reed Garrett et al.
- Related work: manipulation planning, symbolic planning, task and motion planning
- Symbolic planning representation: SAS+ (equivalent of STRIPS)
- Relaxed evaluation: ignores delete effects (a toy sketch follows this list)
- Condition tests: dynamic programming + lazy collision checking
- Search algorithms: any standard search
- Search heuristics
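A toy illustration of the delete-relaxation idea in Python (my own sketch, not FFRob's code): when delete effects are ignored, facts only accumulate, so a simple fixed-point computation decides whether the goal is relaxed-reachable. The `Action` structure and the pick/place facts are invented for the example.

```python
from collections import namedtuple

# Hypothetical STRIPS-like action: preconditions and add effects only
# (delete effects are dropped, which is exactly the relaxation).
Action = namedtuple("Action", ["name", "preconditions", "add_effects"])

def relaxed_reachable(initial_facts, actions, goal):
    """Return True if `goal` is reachable when delete effects are ignored."""
    facts = set(initial_facts)
    changed = True
    while changed:
        changed = False
        for a in actions:
            if a.preconditions <= facts and not a.add_effects <= facts:
                facts |= a.add_effects  # facts are never removed: delete relaxation
                changed = True
    return goal <= facts

# Toy usage with made-up pick-and-place facts.
acts = [
    Action("pick",  {"handempty", "at_obj_A"}, {"holding_obj"}),
    Action("place", {"holding_obj"},           {"at_obj_B", "handempty"}),
]
print(relaxed_reachable({"handempty", "at_obj_A"}, acts, {"at_obj_B"}))  # True
```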
Combining task and motion planning: A culprit detection problem
Fabien Lagriffoul et al.
CTAMP: Combined Task and Motion Planning; the logical constraints are handled by a task planner based on answer set programming (ASP; Lifschitz, 2008)
Key components:
1. A geometric reasoner capable of analyzing the cause of geometric failures.
2. A common language between the geometric reasoner and the task planner that describes the cause of failure rather than a mere “success/failure” signal.
Question: how to find the minimal explanation for geometric failures
The problem this paper tries to solve: if a task is infeasible, how can the infeasibility be “explained” geometrically?
- Benefit: the high-level search can stop early (otherwise there is an infinite number of infeasible plans, obtained by permuting the temporary locations of blocks, permuting the order of actions, or increasing the number of actions)
Isolating the minimal number of factors explaining the failure is the culprit detection problem the paper proposes to solve (a toy minimization sketch follows).
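A rough sketch of how a deletion-based culprit minimization could work (my own illustration in Python, not the paper's algorithm): drop constraints one at a time and keep only those whose removal restores feasibility, so every remaining constraint is needed to explain the failure. Here `is_feasible` is a hypothetical stand-in for a call to the geometric reasoner.

```python
def minimize_culprits(constraints, is_feasible):
    """Return a subset of `constraints` that still explains the failure,
    such that removing any single remaining constraint restores feasibility."""
    core = list(constraints)
    i = 0
    while i < len(core):
        candidate = core[:i] + core[i + 1:]
        if not is_feasible(candidate):
            core = candidate  # constraint i was not needed to explain the failure
        else:
            i += 1            # constraint i is part of the culprit set
    return core

# Toy usage: the "failure" occurs whenever both c1 and c3 are present.
culprits = minimize_culprits(
    ["c1", "c2", "c3", "c4"],
    is_feasible=lambda cs: not ({"c1", "c3"} <= set(cs)),
)
print(culprits)  # ['c1', 'c3']
```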
Learning Markov State Abstractions for Deep Reinforcement Learning
Cameron Allen et al., NeurIPS2020
- Tags: #mdp #george_konidaris
- Progress: 40%
- Link, Github
Main contributions:
- **Sufficient conditions for extracting an abstract MDP state representation from rich observations**
- The basic assumption is that the learned state representation satisfies the Markov property
- Sufficient conditions for obtaining a Markov abstract state representation:
- The agent’s policy
- The inverse dynamics model: compared with the (forward) transition model, the two can be converted into each other via Bayes’ theorem (see the relation after this list)
- A density ratio
- (Note: a more precise definition still needs to be found)
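One way to write the Bayes'-theorem relation between the inverse dynamics model $I$, the policy $\pi$, and the forward transition model $T$ (notation is mine, for a discrete action space):

```latex
I(a \mid s, s') = \frac{T(s' \mid s, a)\,\pi(a \mid s)}{\sum_{a'} T(s' \mid s, a')\,\pi(a' \mid s)}
```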
- The above conditions can be approximately satisfied using a combination of two popular representation learning objectives (a training-loss sketch follows this list):
- Inverse model estimation: predicts the action distribution that explains two consecutive states
- Temporal contrastive learning: determines whether two states were in fact consecutive
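A minimal PyTorch sketch of the two objectives (my own illustration, not the authors' code) for a discrete-action agent; the `Encoder`, layer sizes, and batch layout are assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Maps rich observations to abstract states z (assumed flat observations)."""
    def __init__(self, obs_dim, z_dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                 nn.Linear(128, z_dim))
    def forward(self, x):
        return self.net(x)

class MarkovHeads(nn.Module):
    def __init__(self, z_dim, n_actions):
        super().__init__()
        # Inverse model: predicts the action that explains (z_t, z_{t+1}).
        self.inverse = nn.Sequential(nn.Linear(2 * z_dim, 128), nn.ReLU(),
                                     nn.Linear(128, n_actions))
        # Contrastive head: scores whether a pair of states is truly consecutive.
        self.contrastive = nn.Sequential(nn.Linear(2 * z_dim, 128), nn.ReLU(),
                                         nn.Linear(128, 1))

def markov_losses(encoder, heads, obs, action, next_obs):
    """action: LongTensor of action indices, shape (batch,)."""
    z, z_next = encoder(obs), encoder(next_obs)

    # Inverse model estimation: cross-entropy over the action distribution.
    inv_logits = heads.inverse(torch.cat([z, z_next], dim=-1))
    inverse_loss = F.cross_entropy(inv_logits, action)

    # Temporal contrastive learning: real next states vs. shuffled negatives.
    z_neg = z_next[torch.randperm(z_next.size(0))]
    pos = heads.contrastive(torch.cat([z, z_next], dim=-1))
    neg = heads.contrastive(torch.cat([z, z_neg], dim=-1))
    labels = torch.cat([torch.ones_like(pos), torch.zeros_like(neg)])
    contrastive_loss = F.binary_cross_entropy_with_logits(
        torch.cat([pos, neg]), labels)

    return inverse_loss, contrastive_loss
```

In practice the two losses would be weighted and minimized jointly alongside (or before) the RL objective.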
Evaluating Learned State Representations for Atari
Adam Tupper et al., IVCNZ2020
- Main contributions:
- Evaluates the quality of state representations learned by different autoencoders (a type of neural network used for representation learning) and proposes a method for evaluating learned state representations
- Autoencoders used (a minimal VAE sketch follows this list):
- AEs: Autoencoders
- VAEs: Variational autoencoders
- β-VAEs: Disentangled variational autoencoders
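A minimal VAE sketch in PyTorch (illustrative only, not the architectures evaluated in the paper), showing the reparameterisation trick and the ELBO loss; `beta = 1` gives the plain VAE, `beta > 1` the disentangled (β-VAE) variant.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, obs_dim, z_dim=32):
        super().__init__()
        self.enc = nn.Linear(obs_dim, 256)
        self.mu = nn.Linear(256, z_dim)
        self.logvar = nn.Linear(256, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(),
                                 nn.Linear(256, obs_dim))

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterise
        return self.dec(z), mu, logvar

def vae_loss(x, recon, mu, logvar, beta=1.0):
    recon_loss = F.mse_loss(recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + beta * kl  # beta > 1 encourages disentanglement (β-VAE)
```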
Visual Rationalizations in Deep Reinforcement Learning for Atari Games
Laurens Weitkamp et al., BNAIC2018
- Main contribution: uses Grad-CAM to visualize the RL agent’s decision process, creating action-specific activation maps that highlight the regions providing the most important evidence (a Grad-CAM sketch follows)
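A minimal Grad-CAM sketch in PyTorch (the standard Grad-CAM recipe, not the paper's exact code): take the gradient of the chosen action's logit with respect to a convolutional feature map, average it spatially to get per-channel weights, and form a ReLU-ed weighted sum. `model.conv` is an assumed attribute naming the last convolutional block of the policy network.

```python
import torch
import torch.nn.functional as F

def grad_cam(model, obs, action_index):
    """obs: a single observation batch of shape (1, C, H, W)."""
    feats = {}

    def hook(_, __, output):
        feats["act"] = output
        output.retain_grad()  # keep gradients of the conv feature map

    handle = model.conv.register_forward_hook(hook)
    logits = model(obs)                 # forward pass records conv activations
    handle.remove()

    model.zero_grad()
    logits[0, action_index].backward()  # gradient of the selected action's score

    activations = feats["act"]                                   # (1, C, H, W)
    weights = activations.grad.mean(dim=(2, 3), keepdim=True)    # spatial average
    cam = F.relu((weights * activations).sum(dim=1))             # (1, H, W)
    cam = cam / (cam.max() + 1e-8)      # normalise to [0, 1] for overlaying
    return cam.detach()
```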
State Abstractions for Lifelong Reinforcement Learning
David Abel et al., ICML2018
- Progress: . link
A Theory of Abstraction in Reinforcement Learning
David Abel, Thesis 2020/AAAI 2019
- Progress: . link to the thesis, link to the paper
Symbolic algorithms for graphs and Markov Decision Processes with fairness objectives
Krishnendu Chatterjee et al., International conference on computer aided verification 2018
- Progress: . link