Andrew Barto - Intrinsically Motivated Learning of Hierarchical Collections of Skills (2004)

Created: February 15, 2018 / Updated: March 22, 2020 / Status: finished / 3 min read (~415 words)

  • In their example, complexity comes from composition, but a skill is sometimes harder because it demands more precision, not because it is more complex

  • Today's machine learning algorithms fall far short of the possibilities for machine learning
    • They are typically applied to single, isolated problems for each of which they have to be hand-tuned and for which training data sets have to be carefully prepared
    • They do not have the generative capacity required to significantly extend their abilities beyond initially built-in representations
    • They do not address many of the reasons that learning is so useful in allowing animals to cope flexibly with new problems as they arise over extended periods of time
  • An agent's activity is said to be intrinsically motivated if the agent engages in it for its own sake rather than as a step toward solving a specific problem

  • Autonomous mental development should result in a collection of reusable skills
  • An option is something like a subroutine. It consists of:
    • an option policy that directs the agent's behavior for a subset of the environment states
    • an initiation set consisting of all the states in which the option can be initiated
    • a termination condition, which specifies the conditions under which the option terminates
  • An option is not a sequence of actions; it is a closed-loop control rule, meaning that it is responsive to ongoing state changes
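
The three parts of an option listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Option`, `run_option`, and `corridor_step`, and the six-state corridor, are hypothetical and not taken from the paper, and the termination condition is simplified to a deterministic yes/no test.

```python
from dataclasses import dataclass
from typing import Callable, List, Set

State = int   # assumption: states are small integers
Action = int  # assumption: actions are integer moves (-1 or +1)


@dataclass
class Option:
    policy: Callable[[State], Action]    # option policy: maps the current state to an action
    initiation_set: Set[State]           # states in which the option can be initiated
    terminates: Callable[[State], bool]  # termination condition (deterministic here)


def run_option(option: Option, state: State,
               step: Callable[[State, Action], State],
               max_steps: int = 100) -> List[State]:
    """Run an option until it terminates and return the visited states."""
    if state not in option.initiation_set:
        raise ValueError("option cannot be initiated in this state")
    trajectory = [state]
    for _ in range(max_steps):
        if option.terminates(state):
            break
        # Closed loop: the action is chosen from the state actually reached,
        # not read from a fixed, precomputed sequence of actions.
        action = option.policy(state)
        state = step(state, action)
        trajectory.append(state)
    return trajectory


def corridor_step(state: State, action: Action) -> State:
    """Hypothetical environment: a corridor with states 0..5."""
    return max(0, min(5, state + action))


go_to_end = Option(
    policy=lambda s: 1,               # always move right
    initiation_set={0, 1, 2, 3, 4},   # can start anywhere except the goal
    terminates=lambda s: s == 5,      # stop once the end is reached
)

trajectory = run_option(go_to_end, 2, corridor_step)
print(trajectory)
```

Because the policy is consulted at every step, the same option still behaves sensibly if the environment pushes the agent to an unexpected state, which is the difference between a closed-loop rule and a stored action sequence.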

  • It is clear that children accumulate skills while they engage in intrinsically motivated behavior, e.g., while at play
  • When they notice that something they can do reliably produces an interesting consequence, they remember it in a form that lets them bring that consequence about later, when they think it might contribute to a specific goal
  • Whatever the details of how intrinsic reward is defined, it should diminish with continued repetition of the activity that generates it
  • Similarly, exploration of regions about which the agent is not ready to learn should be aversive to the agent
  • This process will naturally produce what Utgoff and Stracuzzi called "many-layered" learning, in which the agent learns what is easy to learn first, then uses this knowledge to learn harder things
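
The requirement that intrinsic reward diminish with repetition can be made concrete with a small sketch. It assumes, purely for illustration, that the reward is proportional to how poorly the agent currently predicts the interesting consequence and that the prediction improves each time the consequence is observed. The function names, the `scale` factor, and the learning rate are hypothetical.

```python
def intrinsic_reward(predicted_prob: float, scale: float = 1.0) -> float:
    """Reward is large when the outcome is poorly predicted, small once it is expected."""
    return scale * (1.0 - predicted_prob)


def update_prediction(predicted_prob: float, learning_rate: float = 0.5) -> float:
    """After observing the outcome, move the predicted probability toward 1."""
    return predicted_prob + learning_rate * (1.0 - predicted_prob)


predicted_prob = 0.0  # the agent initially does not expect the outcome at all
rewards = []
for _ in range(5):    # repeat the same activity five times
    rewards.append(intrinsic_reward(predicted_prob))
    predicted_prob = update_prediction(predicted_prob)

print(rewards)  # [1.0, 0.5, 0.25, 0.125, 0.0625]
```

Under these assumptions each repetition halves the reward, so the agent loses interest in an activity once it has learned to predict its outcome and is pushed toward things it has not yet mastered.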

  • Barto, Andrew G., Satinder Singh, and Nuttapong Chentanez. "Intrinsically motivated learning of hierarchical collections of skills." Proceedings of the 3rd International Conference on Development and Learning. 2004.