FFRob: Leveraging symbolic planning for efficient task and motion planning

Caelan Reed Garrett et al.

  • Related work: manipulation planning, symbolic planning, task and motion planning

Symbolic planning representation:

  • SAS+: an equivalent of STRIPS
  • **Relaxed evaluation**: ignores delete effects
  • Condition tests: dynamic programming + lazy collision checking
  • Search algorithms: any standard search
  • Search heuristics
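
The “relaxed evaluation” above is the standard delete relaxation from classical planning: pretending actions never delete facts makes reachability monotone and cheap to test, which is what makes it usable as a search heuristic. Below is a minimal sketch of delete-relaxed reachability, assuming a STRIPS-like encoding (the fact and action names are illustrative, not from the paper):

```python
from typing import FrozenSet, List, Tuple

# A STRIPS-like action: (preconditions, add effects, delete effects).
Action = Tuple[FrozenSet[str], FrozenSet[str], FrozenSet[str]]

def relaxed_reachable(init: FrozenSet[str], goal: FrozenSet[str],
                      actions: List[Action]) -> bool:
    """Fixed-point reachability under the delete relaxation: because delete
    effects are ignored, the set of reachable facts only ever grows."""
    facts = set(init)
    changed = True
    while changed:
        changed = False
        for pre, add, _delete in actions:  # _delete is deliberately unused
            if pre <= facts and not add <= facts:
                facts |= add
                changed = True
    return goal <= facts

# Toy pick-and-place instance (hypothetical fact names).
actions = [
    (frozenset({"at(A,t1)", "handempty"}), frozenset({"holding(A)"}),
     frozenset({"at(A,t1)", "handempty"})),
    (frozenset({"holding(A)"}), frozenset({"at(A,t2)", "handempty"}),
     frozenset({"holding(A)"})),
]
print(relaxed_reachable(frozenset({"at(A,t1)", "handempty"}),
                        frozenset({"at(A,t2)"}), actions))  # True
```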

Combining task and motion planning: A culprit detection problem

Fabien Lagriffoul et al.

CTAMP: Combined Task and Motion Planning. Logical constraints are handled by a task planner based on answer set programming (ASP; Lifschitz, 2008).

Key components:

  1. A geometric reasoner capable of analyzing the cause of geometric failures.
  2. A common language between the geometric reasoner and the task planner for describing the cause of a failure, rather than mere “success/failure”.

Question: how to find the minimal explanation for geometric failures

The problem this paper tries to solve: when a task is infeasible, how can that infeasibility be “explained” geometrically?

  • Benefit: the high-level search can be stopped early (otherwise there are infinitely many infeasible plans, obtained by permuting the temporary locations of blocks, permuting the order of actions, or increasing the number of actions)

Isolating the minimal number of factors explaining the failure is the “culprit detection” problem the authors propose to solve.
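
The paper frames culprit detection through ASP; purely as an illustration of the underlying idea, here is a generic deletion-based minimization sketch (the function names and the toy failure test are my assumptions, not the paper's algorithm):

```python
from typing import Callable, List, Set

def minimal_culprits(constraints: List[str],
                     fails: Callable[[Set[str]], bool]) -> Set[str]:
    """Deletion-based minimization: drop any constraint whose removal keeps
    the subproblem infeasible. Returns a minimal (not necessarily minimum)
    explanation. Assumes `fails` is monotone: supersets of a failing
    constraint set also fail."""
    core = set(constraints)
    for c in constraints:
        if fails(core - {c}):
            core.remove(c)
    return core

# Toy failure: the goal region only has room for one of blocks B and C.
def fails(cs: Set[str]) -> bool:
    return {"place(B,goal)", "place(C,goal)"} <= cs

print(minimal_culprits(["place(A,goal)", "place(B,goal)", "place(C,goal)"],
                       fails))  # {'place(B,goal)', 'place(C,goal)'}
```

The structure is the same as minimal-unsatisfiable-core extraction in SAT/ASP solving; in the CTAMP setting each feasibility test would invoke the geometric reasoner.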


Learning Markov State Abstractions for Deep Reinforcement Learning

Cameron Allen et al., NeurIPS 2020

Main contributions:

  • **Sufficient conditions for extracting an abstract state representation of an MDP from rich observations**
  • The basic assumption is that the learned state representation satisfies the Markov property
  • Sufficient conditions for obtaining a Markov abstract state representation:
    1. The agent’s policy
    2. The inverse dynamics model: the counterpart of the (forward) transition model; the two can be converted into one another via Bayes’ theorem
    3. A density ratio: a more precise definition still needs to be pinned down
  • The conditions above are approximately satisfied using a combination of two popular representation-learning objectives (see the sketch after this list):
    1. Inverse model estimation: predicts the action distribution that explains two consecutive states.

    2. Temporal contrastive learning: determines whether two states were in fact consecutive.
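
A minimal PyTorch-style sketch of combining the two objectives, assuming simple MLP heads (the layer sizes, the equal loss weighting, and the shuffled-batch negatives are my assumptions, not the authors' released training code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

obs_dim, act_dim, z_dim = 32, 4, 16  # illustrative sizes

encoder = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, z_dim))
# Inverse model: predicts the action from a pair of consecutive abstract states.
inverse = nn.Sequential(nn.Linear(2 * z_dim, 64), nn.ReLU(), nn.Linear(64, act_dim))
# Contrastive head: scores whether (z, z') really were consecutive.
contrast = nn.Sequential(nn.Linear(2 * z_dim, 64), nn.ReLU(), nn.Linear(64, 1))

def markov_losses(obs, next_obs, actions):
    z, z_next = encoder(obs), encoder(next_obs)
    # 1. Inverse model estimation: which action explains z -> z_next?
    inv_loss = F.cross_entropy(inverse(torch.cat([z, z_next], -1)), actions)
    # 2. Temporal contrastive learning: real successors vs. shuffled negatives.
    z_neg = z_next[torch.randperm(z_next.size(0))]
    pairs = torch.cat([torch.cat([z, z_next], -1), torch.cat([z, z_neg], -1)])
    labels = torch.cat([torch.ones(len(z), 1), torch.zeros(len(z), 1)])
    con_loss = F.binary_cross_entropy_with_logits(contrast(pairs), labels)
    return inv_loss + con_loss  # equal weighting is an assumption

batch = 8
loss = markov_losses(torch.randn(batch, obs_dim), torch.randn(batch, obs_dim),
                     torch.randint(0, act_dim, (batch,)))
loss.backward()
```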


Evaluating Learned State Representations for Atari

Adam Tupper et al., IVCNZ 2020

  • Main contributions:
    • Examines how well state representations are learned by different autoencoders (a class of neural networks used for representation learning), and proposes a method for evaluating learned state representations
    • Autoencoders considered (see the sketch after this list):
      • AEs: Autoencoders
      • VAEs: Variational autoencoders
      • β-VAEs: Disentangled variational autoencoders
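
For reference, a β-VAE is simply a VAE whose KL term is scaled by β > 1 to encourage disentanglement, so β = 1 recovers the plain VAE; this covers the VAE/β-VAE part of the list above. A minimal sketch, with the single-layer encoder/decoder and β = 4 as illustrative choices rather than the paper's architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BetaVAE(nn.Module):
    """Minimal beta-VAE: beta > 1 pressures the latent code toward
    disentanglement; beta = 1 recovers the standard VAE."""
    def __init__(self, x_dim=784, z_dim=32, beta=4.0):
        super().__init__()
        self.beta = beta
        self.enc = nn.Linear(x_dim, 2 * z_dim)  # outputs mean and log-variance
        self.dec = nn.Linear(z_dim, x_dim)

    def loss(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        recon = F.mse_loss(self.dec(z), x)
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return recon + self.beta * kl

vae = BetaVAE()
print(vae.loss(torch.rand(16, 784)).item())
```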

Visual Rationalizations in Deep Reinforcement Learning for Atari Games

Laurens Weitkamp et al., BNAIC 2018

  • Main contribution: uses Grad-CAM to visualize the RL agent's decision process, creating action-specific activation maps that highlight the regions providing the most important evidence (a rough sketch follows)
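
A rough sketch of Grad-CAM on a toy Atari-style network (the two-layer conv net and six-action head are stand-ins, not the paper's agent; the Grad-CAM recipe itself is from Selvaraju et al., 2017):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Tiny stand-in for an Atari agent: 4 stacked frames in, 6 action logits out.
conv = nn.Sequential(nn.Conv2d(4, 16, 8, 4), nn.ReLU(),
                     nn.Conv2d(16, 32, 4, 2), nn.ReLU())
head = nn.Linear(32 * 9 * 9, 6)

def grad_cam(frames, action):
    """Action-specific activation map: weight the last conv feature maps by
    the gradient of the chosen action's logit, then ReLU and upsample."""
    feats = conv(frames)          # (1, 32, 9, 9)
    feats.retain_grad()           # keep gradients for a non-leaf tensor
    logits = head(feats.flatten(1))
    logits[0, action].backward()
    weights = feats.grad.mean(dim=(2, 3), keepdim=True)  # GAP over gradients
    cam = F.relu((weights * feats).sum(dim=1, keepdim=True))
    return F.interpolate(cam, size=frames.shape[-2:],
                         mode="bilinear", align_corners=False)

cam = grad_cam(torch.randn(1, 4, 84, 84), action=2)
print(cam.shape)  # torch.Size([1, 1, 84, 84])
```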

State Abstractions for Lifelong Reinforcement Learning

David Abel et al., ICML 2018


A Theory of Abstraction in Reinforcement Learning

David Abel, PhD thesis 2020 / AAAI 2019


Symbolic algorithms for graphs and Markov Decision Processes with fairness objectives

Krishnendu Chatterjee et al., International Conference on Computer Aided Verification (CAV) 2018