Collective behavior is widespread in nature. To date, however, we lack task-performing models that, like real animals, learn collective behavior from high-dimensional sensory inputs and act on them. Here we used deep reinforcement learning and curiosity-driven learning—two mechanisms deeply rooted in psychological and neuroscientific research—to build artificial agents that develop collective behavior. We built artificial brains, embodied those brains in virtual animal bodies as “agents,” and raised those agents in realistic virtual reality environments. Like animals, our agents spontaneously learned to group from high-dimensional visual inputs in naturalistic environments. Our agents also learned to group without hardcoded interaction rules, using intrinsic motivation (curiosity) to drive learning. Thus, deep reinforcement learning and curiosity-driven learning are sufficient to generate collective behavior in visually realistic environments. This work bridges the divide between high-dimensional sensory inputs and collective action, resulting in a “pixels-to-actions” model of collective behavior.
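The text names the two learning mechanisms but not their implementation. As a rough, self-contained illustration of what a curiosity signal of this kind can look like in code, the sketch below computes a prediction-error intrinsic reward in the spirit of an intrinsic curiosity module; the network sizes, the 64x64 RGB observation format, the four-action space, and all module names are illustrative assumptions rather than the authors' architecture.

```python
# Minimal sketch of a curiosity-style intrinsic reward: the agent is rewarded by how
# poorly a learned forward model predicts its next visual embedding. All sizes and
# names are illustrative assumptions, not the paper's model.
import torch
import torch.nn as nn


class VisualEncoder(nn.Module):
    """Embed raw pixels (3x64x64) into a compact feature vector."""
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.fc = nn.Linear(64 * 6 * 6, feat_dim)  # 6x6 spatial map for 64x64 input

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.fc(self.conv(obs))


class ForwardModel(nn.Module):
    """Predict the next feature vector from the current features and the action."""
    def __init__(self, feat_dim: int = 128, n_actions: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + n_actions, 256), nn.ReLU(),
            nn.Linear(256, feat_dim),
        )

    def forward(self, feat: torch.Tensor, action_onehot: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([feat, action_onehot], dim=-1))


def intrinsic_reward(encoder, forward_model, obs, action_onehot, next_obs):
    """Curiosity reward = forward-model prediction error in feature space."""
    with torch.no_grad():
        feat, next_feat = encoder(obs), encoder(next_obs)
        pred_next = forward_model(feat, action_onehot)
        # Surprising (hard-to-predict) observations, such as other moving agents,
        # yield larger rewards.
        return 0.5 * (pred_next - next_feat).pow(2).mean(dim=-1)


# Example: one batch of 8 transitions with 64x64 RGB frames and 4 discrete actions.
encoder, forward_model = VisualEncoder(), ForwardModel()
obs = torch.rand(8, 3, 64, 64)
next_obs = torch.rand(8, 3, 64, 64)
actions = torch.nn.functional.one_hot(torch.randint(0, 4, (8,)), num_classes=4).float()
r_int = intrinsic_reward(encoder, forward_model, obs, actions, next_obs)  # shape (8,)
```

In a full pipeline, the encoder and forward model would be trained on the agent's own transitions, and this intrinsic reward would stand in for (or be added to) the extrinsic reward consumed by the deep reinforcement learning algorithm.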
First, we trained and tested 10 artificial agents together within a plain white room. The agents received no external rewards from the environment. Instead, they were intrinsically motivated by curiosity to seek informative experience in the environment (e.g., other agents). As a result, our agents showed more pronounced grouping behavior than randomly moving baseline agents, indicating that they developed both ego-motion and grouping behavior.
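The comparison to a random baseline implies some quantitative grouping measure, which is not spelled out here. One common convention in collective-behavior studies is mean pairwise distance between individuals over an episode, compared against independent random walkers; the sketch below is an illustrative version of that comparison, and the arena size, step size, and trajectory format are placeholders rather than the paper's protocol.

```python
# Illustrative grouping metric: mean pairwise distance between agents over time,
# compared against randomly moving baseline agents. The baseline parameters and
# trajectory format are assumptions, not the paper's protocol.
import numpy as np


def mean_pairwise_distance(positions: np.ndarray) -> float:
    """positions: array of shape (timesteps, n_agents, 2) with x/y coordinates."""
    n = positions.shape[1]
    dists = []
    for i in range(n):
        for j in range(i + 1, n):
            dists.append(np.linalg.norm(positions[:, i] - positions[:, j], axis=-1))
    return float(np.mean(dists))


def random_walk(timesteps: int, n_agents: int, step: float = 0.1,
                arena: float = 10.0, seed: int = 0) -> np.ndarray:
    """Baseline: independent random walkers confined to a square arena."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(0, arena, size=(n_agents, 2))
    out = np.empty((timesteps, n_agents, 2))
    for t in range(timesteps):
        pos = np.clip(pos + rng.normal(0, step, size=pos.shape), 0, arena)
        out[t] = pos
    return out


# Agents "group" in this sense if their mean pairwise distance is reliably smaller
# than that of the random baseline:
baseline = random_walk(timesteps=1000, n_agents=10)
print("baseline mean pairwise distance:", mean_pairwise_distance(baseline))
# agent_positions = ...  # (timesteps, 10, 2) trajectories logged from trained agents
# print("trained agents:", mean_pairwise_distance(agent_positions))
```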
In natural visual environments, the appearance of agents can change radically due to changes in viewpoint, illumination, position, etc. Thus, to group successfully, animals must build invariant representations of groupmates that tolerate natural image variation. Did the agents learn robust grouping behaviors that could generalize to novel illumination conditions? To test this, we examined the grouping behavior of our agents under varying illumination conditions. The agents showed robust grouping behavior, even though their appearance in the test conditions differed markedly from their appearance in the training condition. This suggests that autonomous artificial agents can develop a form of invariant object recognition using purely self-supervised learning rules.
Next, we tested whether the agents could also learn to recognize group members in complex, natural environments. Natural environments introduce a number of challenges for biological and artificial agents. For instance, they make it more difficult for agents to detect and recognize objects, because objects must be parsed from cluttered backgrounds. We observed pronounced grouping behavior when the agents were trained and tested together in the complex environment. By contrast, grouping behavior was not observed when the agents were trained separately from one another and then tested together as a group.
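The key difference between these two conditions lies in the training protocol rather than the test: either each agent learns in a shared room where groupmates are visible, or it learns alone and the group is only assembled at test time. The sketch below makes that contrast explicit; `MultiAgentRoom` and `CuriosityAgent` are hypothetical stand-ins for the actual simulator and learner, and only the protocol structure is the point.

```python
# Schematic contrast between the two training regimes described above.
from typing import List


class MultiAgentRoom:
    """Stub environment holding some number of embodied agents."""
    def __init__(self, n_agents: int):
        self.n_agents = n_agents

    def reset(self) -> List[float]:
        return [0.0] * self.n_agents  # placeholder observations

    def step(self, actions: List[int]) -> List[float]:
        return [0.0] * self.n_agents  # placeholder next observations


class CuriosityAgent:
    """Stub learner; `update` would apply the curiosity-driven RL update."""
    def act(self, obs: float) -> int:
        return 0

    def update(self, obs: float, action: int, next_obs: float) -> None:
        pass


def train_together(n_agents: int = 10, steps: int = 100) -> List[CuriosityAgent]:
    """Co-training: all agents share one room, so each sees the others while learning."""
    env = MultiAgentRoom(n_agents)
    agents = [CuriosityAgent() for _ in range(n_agents)]
    obs = env.reset()
    for _ in range(steps):
        actions = [agent.act(o) for agent, o in zip(agents, obs)]
        next_obs = env.step(actions)
        for agent, o, a, o2 in zip(agents, obs, actions, next_obs):
            agent.update(o, a, o2)
        obs = next_obs
    return agents


def train_separately(n_agents: int = 10, steps: int = 100) -> List[CuriosityAgent]:
    """Control: each agent learns alone in its own room and never sees a groupmate."""
    agents = []
    for _ in range(n_agents):
        env = MultiAgentRoom(n_agents=1)
        agent = CuriosityAgent()
        obs = env.reset()
        for _ in range(steps):
            action = agent.act(obs[0])
            next_obs = env.step([action])
            agent.update(obs[0], action, next_obs[0])
            obs = next_obs
        agents.append(agent)
    return agents


# In both regimes the resulting agents are then placed together in the test
# environment and their grouping behavior is measured.
```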
In sum, we present a pixels-to-actions model of collective behavior, which indicates that we have isolated a set of learning mechanisms that are sufficient to produce this behavior in autonomous artificial agents. Two generic learning mechanisms—deep reinforcement learning and curiosity-driven learning—can generate the core skills needed to support collective behavior in realistic visual environments.