Teaching Robots with Positive Reinforcement at Johns Hopkins University

Image by Gerd Altmann from Pixabay

Dr. Todd A. Ward


Researchers at Johns Hopkins University are successfully using positive reinforcement procedures to help robots learn. Reinforcers in the form of points are given to a robot when it successfully completes a block-stacking task. If the robot makes a mistake, no reinforcers are given.
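For readers who like to see the contingency written out, here is a minimal Python sketch of the idea: points for a successful attempt, nothing for a mistake. It is not the Johns Hopkins team's code. The action names, the learning rate, and the simple running-average update are all assumptions chosen for illustration.

```python
def update_value(value, reward, learning_rate=0.5):
    """Move the stored value for an action part of the way toward the reward just received."""
    return value + learning_rate * (reward - value)


def train(outcomes, learning_rate=0.5):
    """Build a table of action values from a list of (action, succeeded) attempts."""
    values = {}
    for action, succeeded in outcomes:
        # Hypothetical contingency: one point for a success, none for a mistake.
        reward = 1.0 if succeeded else 0.0
        current = values.get(action, 0.0)
        values[action] = update_value(current, reward, learning_rate)
    return values


# Hypothetical attempts: two successful placements and one mistake.
outcomes = [
    ("place_on_stack", True),
    ("knock_over_stack", False),
    ("place_on_stack", True),
]
values = train(outcomes)
print(values)
```

In this sketch the reinforced action ends up with a higher stored value than the action that never earns points, so a learner that picks the highest-valued action would come to favor it.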

Though the term isn’t mentioned in the video below, the team effectively uses differential reinforcement procedures to cut training that would otherwise take a month down to two days. They stress that the block-stacking task is used for convenience, and that the methods can be generalized to many other applications, such as self-driving cars and surgical robotics.

For more on their work, and to meet the team, check out the video below.

Todd A. Ward, PhD, BCBA-D, LBA is a science writer, social philosopher, behavioral systems analyst, and the President and Founder of bSci21Media, LLC, which aims to connect behavioral science to the world in an engaging, non-academic way.  Dr. Ward received his PhD in behavior analysis from the University of Nevada, Reno under Dr. Ramona Houmanfar.  He has served as a Guest Associate Editor of the Journal of Organizational Behavior Management, and as an Editorial Board member of Behavior and Social Issues.  His publications follow a theme of behavioral systems analysis, organizational performance, theory & philosophy, and language & cognition.  He has also provided ABA services to children and adults with various developmental disabilities in day centers, in-home, residential, and school settings, and previously served as Faculty Director of Behavior Analysis Online at the University of North Texas.  Dr. Ward can be reached at [email protected]
