Scalable agent architecture for distributed training

The tasks are designed to be as varied as possible. They differ in the goals they target, from learning, to memory, to navigation. They vary visually, from brightly coloured, modern-styled textures, to the subtle browns and greens of a desert at dawn, midday, or by night. And they contain physically different settings, from open, mountainous terrain, to right-angled mazes, to open, circular rooms.

In addition, some of the environments include ‘bots’ with their own internal, goal-oriented behaviours. Equally importantly, the goals and rewards differ across the levels, from following language commands, using keys to open doors, and foraging for mushrooms, to plotting and following a complex, irreversible path.

However, at a basic level, the environments all share the same action and observation space, allowing a single agent to be trained to act in every environment in this highly varied set. More details about the environments can be found on the DeepMind Lab GitHub page.
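Because every level exposes one observation format and one action set, a single control loop can drive the whole suite. The sketch below illustrates the idea with a toy environment; the `Level` class, level names, and shapes are hypothetical stand-ins, not the actual DeepMind Lab API.

```python
import random

class Level:
    """Toy environment exposing the suite-wide observation/action contract."""
    NUM_ACTIONS = 9  # the same discrete action set for every level

    def __init__(self, name):
        self.name = name

    def reset(self):
        # Dummy 72x96 RGB observation; the shape is identical across levels.
        return [[(0, 0, 0)] * 96 for _ in range(72)]

    def step(self, action):
        assert 0 <= action < self.NUM_ACTIONS
        done = random.random() < 0.01  # episodes end stochastically here
        return self.reset(), random.random(), done

def act(observation):
    """One policy suffices because the spaces match in every level."""
    return random.randrange(Level.NUM_ACTIONS)

# The identical loop drives every task in the varied set.
for name in ["keys_doors", "explore_goals", "language_commands"]:
    env = Level(name)
    obs, done = env.reset(), False
    while not done:
        obs, reward, done = env.step(act(obs))
```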

Importance Weighted Actor-Learner Architectures (IMPALA)

To tackle the challenging DMLab-30 suite, we developed a new distributed agent, the Importance Weighted Actor-Learner Architecture (IMPALA), which maximises data throughput with an efficient distributed architecture built on TensorFlow.
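IMPALA achieves this throughput by decoupling acting from learning: each actor streams complete trajectories of experience to a centralised learner and immediately continues acting. Below is a minimal sketch of that data flow using Python threads; the queue size, trajectory length, and all names are illustrative stand-ins, not the actual implementation.

```python
import queue
import threading
import time

TRAJECTORY_LEN = 20
trajectory_queue = queue.Queue(maxsize=64)
stop = threading.Event()

def actor(actor_id):
    """Roll out trajectories and hand them off without waiting for updates."""
    while not stop.is_set():
        # Unroll a fixed-length trajectory using the actor's policy copy.
        trajectory = [("obs", "action", "reward")] * TRAJECTORY_LEN
        try:
            trajectory_queue.put((actor_id, trajectory), timeout=0.1)
        except queue.Full:
            continue  # learner is busy; re-check the stop flag and retry

def learner(num_batches=100):
    """Consume trajectories and (here, only pretend to) compute updates."""
    for _ in range(num_batches):
        actor_id, trajectory = trajectory_queue.get()
        time.sleep(0.001)  # stands in for the learner's gradient computation
    stop.set()

threads = [threading.Thread(target=actor, args=(i,)) for i in range(4)]
threads.append(threading.Thread(target=learner))
for t in threads:
    t.start()
for t in threads:
    t.join()
```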

IMPALA is inspired by the popular A3C architecture, which uses multiple distributed actors to learn the agent’s parameters. In models like this, each actor uses a clone of the policy parameters to act in the environment. Periodically, actors pause their exploration to share the gradients they have computed with a central parameter server, which applies the updates (see figure below).
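Here is a minimal sketch of that A3C-style pattern, assuming a toy parameter vector and a dummy gradient in place of a real RL loss; the class and function names are illustrative only.

```python
import random
import threading

class ParameterServer:
    """Holds the canonical parameters and applies incoming gradients."""

    def __init__(self, size, lr=0.01):
        self.params = [0.0] * size
        self.lr = lr
        self.lock = threading.Lock()

    def apply_gradients(self, grads):
        with self.lock:
            self.params = [p - self.lr * g for p, g in zip(self.params, grads)]

    def get_params(self):
        with self.lock:
            return list(self.params)

def actor(server, steps=50, rollout_len=5):
    local_params = server.get_params()  # the actor's clone of the policy
    grads = [0.0] * len(local_params)
    for step in range(1, steps + 1):
        # Act with local_params and accumulate a gradient; a real actor
        # would derive this from the RL loss over its rollout.
        grads = [g + random.gauss(0.0, 1.0) for g in grads]
        if step % rollout_len == 0:
            # Pause exploration: ship gradients, then resync parameters.
            server.apply_gradients(grads)
            local_params = server.get_params()
            grads = [0.0] * len(local_params)

server = ParameterServer(size=8)
actors = [threading.Thread(target=actor, args=(server,)) for _ in range(4)]
for t in actors:
    t.start()
for t in actors:
    t.join()
print("final params:", [round(p, 3) for p in server.get_params()])
```

The key point is that each actor must stop acting while its gradients are applied; avoiding that synchronisation cost is what motivates IMPALA’s trajectory-passing design sketched earlier.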

Source: https://deepmind.com/blog/article/impala-scalable-distributed-deeprl-dmlab-30
