Nature Communications (Nov 2019)
Optimizing agent behavior over long time scales by transporting value
Abstract
People are able to mentally time travel to distant memories and reflect on the consequences of those past events. Here, the authors show how a mechanism that connects learning from delayed rewards with memory retrieval can enable AI agents to discover links between past events to help decide better courses of action in the future.