Developer Diary: AlphaStar Chases the Singularity

Developer Diary – You Heard It Here First – Saturday, 9 February 2019

Something extremely important has just happened in the field of AI which has significant implications for mankind. After about 2 years of effort, DeepMind has figured out how to succeed at StarCraft. On December 10, 2018, their StarCraft AI called "AlphaStar" beat a strong amateur for the first time, and two weeks ago DeepMind showed it winning 10 of 11 games against two professional players, losing only the single game that was played live. AlphaStar uses a new method of neural network training that promises to give computers the ability to create genuine strategies for the first time, the last remaining serious obstacle to achieving artificial intelligence.

I have examined some of the games from the challenge matches and found that although the AI relied heavily on superhuman micromanagement of units to win, its planning and decision making were good enough to be comparable to human capability. It was evident from the games that although AlphaStar is still inferior in some respects to the best human players, it is only a question of time before it achieves a superhuman skill level.

This development is very important because it means that computers are now capable of complex strategic tasks that formerly could be performed only by a human being. For example, a computer could be used to make strategic decisions in a war. Tasks requiring planning and forethought, such as managing a satellite, can now be fully automated.

The main limitation of the technology is that the task must be susceptible to simulation, because the approach uses self-play reinforcement learning, which depends on accurate simulation of the environment and processes. This limitation applies to any technology that learns by self-play. It is also worth mentioning that real-world tasks that require biological sensors, like the human eye, are still out of reach for a computer. A camera does not compare to the capabilities of biological eyes.
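To make the dependence on simulation concrete, here is a minimal sketch of self-play learning on a toy game. Everything in it is my own assumption for illustration and has nothing to do with DeepMind's code: the game (a pile of stones, each player takes 1 or 2, whoever takes the last stone wins), the function names, and the simple tabular update. The point is only that every training step calls the game rules, so the rules must be exactly simulable.

```python
import random

def legal_moves(stones):
    # Hypothetical toy game: a player may take 1 or 2 stones.
    return [m for m in (1, 2) if m <= stones]

def simulate_game(values, start_stones, epsilon, rng):
    """Play one game of the agent against itself.

    Returns the (stones, move) pairs chosen by each player and the winner.
    """
    stones, player = start_stones, 0
    history = [[], []]
    while True:
        moves = legal_moves(stones)
        if rng.random() < epsilon:
            move = rng.choice(moves)  # explore
        else:
            # Exploit: pick the move with the highest learned value.
            move = max(moves, key=lambda m: values.get((stones, m), 0.0))
        history[player].append((stones, move))
        stones -= move  # the "simulator": the exact rules of the game
        if stones == 0:
            return history, player  # taking the last stone wins
        player = 1 - player

def train(num_games=5000, start_stones=7, lr=0.1, epsilon=0.2, seed=0):
    """Tabular self-play: nudge each chosen move toward the game result."""
    rng = random.Random(seed)
    values = {}
    for _ in range(num_games):
        history, winner = simulate_game(values, start_stones, epsilon, rng)
        for player in (0, 1):
            reward = 1.0 if player == winner else -1.0
            for key in history[player]:
                old = values.get(key, 0.0)
                values[key] = old + lr * (reward - old)
    return values

if __name__ == "__main__":
    learned = train()
    # With 2 stones left, taking both wins and taking one loses.
    print(learned[(2, 2)], learned[(2, 1)])
```

If the line that applies the rules were only an approximation of the real process, the learned values would describe the approximation rather than the real task, which is the limitation described above.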

The implications of this development are revolutionary and the potential effects incalculable. To name just three applications:

(1) Protein folding can now be computed automatically. Since the binding of organic molecules can be fully simulated, it is now possible to compute protein folds automatically, probably the single most important challenge in biology right now.

(2) Innovation in process chemistry can now be automated. Computers can be used to discover critical new reactions and industrial processes for chemical and drug synthesis. This means that the rate of invention of new drugs and chemicals can potentially be increased dramatically.

(3) Innovation in mathematics can now be automated. "Experimental mathematics" as first practiced by Jonathan Borwein has been used to compute new discoveries in mathematics since the mid-1990s. However, in each case the discovery process has to be painstakingly designed and managed by a human mathematician. AlphaStar's methods could allow a computer to become an automated mathematician that can prove theorems or develop new computational algorithms that implement a human policy without a corresponding human design. The implication of this for AI should be obvious: AlphaStar can be adapted to develop improved training strategies and network architectures for its own operation. In other words, it can automatically improve itself. Building such a self-improving system would require significant effort, but it is only a matter of work. No further discoveries need to be made.

In my opinion, the development of AlphaStar is critically important and may inaugurate a new chapter in human history because it makes the Singularity a real possibility for the first time. Up until now, the Singularity was just an idea and nobody knew if it was really possible or just imagination. Since AlphaStar has shown a genuine capability to strategize, the Singularity is now viable. Mankind now has the ability to greatly accelerate innovation and the growth of technology using fully automatic methods. The immediate question is how fast these methods can be adapted and put to work solving practical problems.

How AlphaStar Works

AlphaStar is built around a deep long short-term memory (LSTM) network, a well-known method that has the capability to learn sequences. The key innovations that make AlphaStar different are that it incorporates a relatively new invention called a pointer network and that it uses a creative method of parallelized actor-critic training, in the same family as DeepMind's earlier Asynchronous Advantage Actor-Critic (A3C) algorithm.

In machine learning, the training technology is often more important than the structure of the AI engine itself. In other words, how data gets fed to the system is more important than the design of the system. For example, the whole reason Amazon Alexa works is not the way she is designed (which is well known), but the sophistication of the training technique used to train her (which Amazon keeps secret). AlphaStar started by learning from games played by human players and then ramped up by self-play, i.e., playing against itself over and over again millions of times. Learning from self-play in this way is a form of reinforcement learning.

The DeepMind team has provided the outlines of their technology in a blog post published last month entitled "AlphaStar: Mastering the real-time strategy game StarCraft II" and in a paper published over a year ago outlining the problem, "StarCraft II: A New Challenge for Reinforcement Learning". They also released an introductory video called "AlphaStar: The Inside Story" showing their team discussing the new technology and showing it beating a professional-caliber player for the first time.
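For readers who have not met a pointer network, the following is a minimal sketch of the idea and not AlphaStar's actual code. Instead of choosing from a fixed list of outputs, the network scores each item in its input (for example, each unit on the screen) and "points" at one of them. The scoring rule below is the additive form from the original pointer network paper; the weight matrices, the two-number unit features, and all names are made up by me for illustration.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(mat, vec):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(a * b for a, b in zip(row, vec)) for row in mat]

def pointer_attention(encoder_states, decoder_state, w_enc, w_dec, v):
    """Score each input item j as u_j = v . tanh(W_enc e_j + W_dec d),
    then turn the scores into a probability of pointing at each item."""
    dec_proj = matvec(w_dec, decoder_state)
    scores = []
    for e in encoder_states:
        enc_proj = matvec(w_enc, e)
        hidden = [math.tanh(a + b) for a, b in zip(enc_proj, dec_proj)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    return softmax(scores)

# Hypothetical example: three units described by two features each.
units = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
query = [1.0, 0.0]                    # stand-in for the decoder state
identity = [[1.0, 0.0], [0.0, 1.0]]   # untrained placeholder weights
probs = pointer_attention(units, query, identity, identity, v=[1.0, 0.0])
chosen = max(range(len(units)), key=lambda j: probs[j])
print(chosen, probs)
```

Because the output is a distribution over the inputs themselves, the same network can handle 5 units or 50 without any change, which is presumably why the approach suits a game where the number of units varies constantly.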

John Chamberlain · info@johnchamberlain.com · Revised Saturday, 9 February 2019