Scalable agent architecture for distributed training

The tasks are designed to be as varied as possible. They differ in the goals they target, from learning, to memory, to navigation. They vary visually, from brightly coloured, modern-styled textures, to the subtle browns and greens of a desert at dawn, midday, or by night. And they contain physically different settings, from open, mountainous terrain, to right-angled mazes, to open, circular rooms.

In addition, some of the environments include ‘bots’ with their own internal, goal-oriented behaviours. Equally importantly, the goals and rewards differ across the levels, from following language commands, using keys to open doors, and foraging for mushrooms, to plotting and following a complex, irreversible path.

However, at a basic level, the environments all share the same action and observation space, allowing a single agent to be trained to act in every environment in this highly varied set. More details about the environments can be found on the DeepMind Lab GitHub page.
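Because every level exposes one observation format and one action set, a single control loop can drive the whole suite. The sketch below illustrates the idea with a toy environment; the `Level` class, level names, and shapes are hypothetical stand-ins, not the actual DeepMind Lab API.

```python
import random

class Level:
    """Toy environment exposing the suite-wide observation/action contract."""
    NUM_ACTIONS = 9  # the same discrete action set for every level

    def __init__(self, name):
        self.name = name

    def reset(self):
        # Dummy 72x96 RGB observation; the shape is identical across levels.
        return [[(0, 0, 0)] * 96 for _ in range(72)]

    def step(self, action):
        assert 0 <= action < self.NUM_ACTIONS
        done = random.random() < 0.01  # episodes end stochastically here
        return self.reset(), random.random(), done

def act(observation):
    """One policy suffices because the spaces match in every level."""
    return random.randrange(Level.NUM_ACTIONS)

# The identical loop drives every task in the varied set.
for name in ["keys_doors", "explore_goals", "language_commands"]:
    env = Level(name)
    obs, done = env.reset(), False
    while not done:
        obs, reward, done = env.step(act(obs))
```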

Importance Weighted Actor-Learner Architectures (IMPALA)

To tackle the challenging DMLab-30 suite, we developed a new distributed agent, the Importance Weighted Actor-Learner Architecture (IMPALA), which maximises data throughput with an efficient distributed architecture built on TensorFlow.
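IMPALA achieves this throughput by decoupling acting from learning: each actor streams complete trajectories of experience to a centralised learner and immediately continues acting. Below is a minimal sketch of that data flow using Python threads; the queue size, trajectory length, and all names are illustrative stand-ins, not the actual implementation.

```python
import queue
import threading
import time

TRAJECTORY_LEN = 20
trajectory_queue = queue.Queue(maxsize=64)
stop = threading.Event()

def actor(actor_id):
    """Roll out trajectories and hand them off without waiting for updates."""
    while not stop.is_set():
        # Unroll a fixed-length trajectory using the actor's policy copy.
        trajectory = [("obs", "action", "reward")] * TRAJECTORY_LEN
        try:
            trajectory_queue.put((actor_id, trajectory), timeout=0.1)
        except queue.Full:
            continue  # learner is busy; re-check the stop flag and retry

def learner(num_batches=100):
    """Consume trajectories and (here, only pretend to) compute updates."""
    for _ in range(num_batches):
        actor_id, trajectory = trajectory_queue.get()
        time.sleep(0.001)  # stands in for the learner's gradient computation
    stop.set()

threads = [threading.Thread(target=actor, args=(i,)) for i in range(4)]
threads.append(threading.Thread(target=learner))
for t in threads:
    t.start()
for t in threads:
    t.join()
```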

IMPALA is inspired by the popular A3C architecture, which uses multiple distributed actors to learn the agent’s parameters. In models like this, each actor uses a clone of the policy parameters to act in the environment. Periodically, actors pause their exploration to share the gradients they have computed with a central parameter server, which applies the updates (see figure below).
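Here is a minimal sketch of that A3C-style pattern, assuming a toy parameter vector and a dummy gradient in place of a real RL loss; the class and function names are illustrative only.

```python
import random
import threading

class ParameterServer:
    """Holds the canonical parameters and applies incoming gradients."""

    def __init__(self, size, lr=0.01):
        self.params = [0.0] * size
        self.lr = lr
        self.lock = threading.Lock()

    def apply_gradients(self, grads):
        with self.lock:
            self.params = [p - self.lr * g for p, g in zip(self.params, grads)]

    def get_params(self):
        with self.lock:
            return list(self.params)

def actor(server, steps=50, rollout_len=5):
    local_params = server.get_params()  # the actor's clone of the policy
    grads = [0.0] * len(local_params)
    for step in range(1, steps + 1):
        # Act with local_params and accumulate a gradient; a real actor
        # would derive this from the RL loss over its rollout.
        grads = [g + random.gauss(0.0, 1.0) for g in grads]
        if step % rollout_len == 0:
            # Pause exploration: ship gradients, then resync parameters.
            server.apply_gradients(grads)
            local_params = server.get_params()
            grads = [0.0] * len(local_params)

server = ParameterServer(size=8)
actors = [threading.Thread(target=actor, args=(server,)) for _ in range(4)]
for t in actors:
    t.start()
for t in actors:
    t.join()
print("final params:", [round(p, 3) for p in server.get_params()])
```

The key point is that each actor must stop acting while its gradients are applied; avoiding that synchronisation cost is what motivates IMPALA’s trajectory-passing design sketched earlier.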

Source: https://deepmind.com/blog/article/impala-scalable-distributed-deeprl-dmlab-30
