• DeepMind AI beats Champion Chess Program after only 4 hours of training via self play
[QUOTE]“Starting from random play, and given no domain knowledge except the game rules, AlphaZero achieved within 24 hours a superhuman level of play in the games of chess and shogi [a similar Japanese board game] as well as Go, and convincingly defeated a world-champion program in each case,” said the paper’s authors that include DeepMind founder Demis Hassabis, who was a child chess prodigy reaching master standard at the age of 13. “It’s a remarkable achievement, even if we should have expected it after AlphaGo,” former world chess champion Garry Kasparov told Chess.com. “We have always assumed that chess required too much empirical knowledge for a machine to play so well from scratch, with no human knowledge added at all.” ... After winning 25 games of chess versus Stockfish 8 starting as white, with first-mover advantage, a further three starting with black and drawing a further 72 games, AlphaZero also learned shogi in two hours before beating the leading program Elmo in a 100-game matchup. AlphaZero won 90 games, lost eight and drew 2. [/QUOTE] [url]https://www.theguardian.com/technology/2017/dec/07/alphazero-google-deepmind-ai-beats-champion-program-teaching-itself-to-play-four-hours[/url] Singularity now please.
Seems like a rudimentary example of how AIs would teach themselves based on the information they're fed. What makes this AI different from the hundreds of current ones? [quote] “We have always assumed that chess required too much empirical knowledge for a machine to play so well from scratch, with no human knowledge added at all.”[/quote] This seems like a naive statement to make.
[QUOTE=ZombieDawgs;52955301]Seems like a rudimentary example of how AI's would teach themselves based on information it's fed. What makes this AI different to the hundreds of current ones? This seems like a naive statement to make.[/QUOTE] It had absolutely zero training data. Normally these AIs are fed tens of thousands of games played by actual humans in order to train their algorithms. This AI got itself up to the level of a chess grandmaster by playing against itself for 4 hours and continuously training itself off of those games, starting off with completely random moves. That is to say, it wasn't fed any information outside of the rules of the game. DeepMind is probably the least rudimentary AI firm out there. In terms of what makes it different from other AIs like Watson, DeepMind's deep reinforcement learning is much, much more general. They can give it literally raw pixels as input, a set of outputs, and some metric to optimize, and it will very quickly determine an optimal strategy for many games. [video]https://youtu.be/TmPfTpjtdgg[/video] For example, for this Breakout training, DeepMind was given the screen as input (literally raw pixels - it wasn't told which pixels correspond to which object, or even what objects are in the game), pressing left, right, or staying still as output, and was given the score of its match at the end of each round and told to increase that score. Just based off of those very simple rules, it discovered advanced techniques in a trivial amount of time without need for any true human guidance.
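That score-as-the-only-feedback setup can be sketched at toy scale. The snippet below is a generic tabular Q-learning loop on a made-up five-state corridor - an illustrative stand-in, not DeepMind's actual Atari pipeline (which used a deep network over pixels). The agent knows only its state, its two actions, and the score, yet discovers the "always move right" strategy on its own:

```python
import random

def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning on a 5-state corridor: reward only at state 4."""
    rng = random.Random(seed)
    # Q-values for every (state, action) pair; actions are -1 (left), +1 (right).
    q = {(s, a): 0.0 for s in range(5) for a in (-1, 1)}
    for _ in range(episodes):
        s = 0
        while s != 4:
            # Epsilon-greedy: mostly exploit current estimates, sometimes explore.
            if rng.random() < epsilon:
                a = rng.choice((-1, 1))
            else:
                a = max((-1, 1), key=lambda act: q[(s, act)])
            s2 = min(max(s + a, 0), 4)          # walls clamp movement
            r = 1.0 if s2 == 4 else 0.0         # the "score" is the only feedback
            # Standard Q-learning update toward reward plus discounted best future value.
            q[(s, a)] += alpha * (r + gamma * max(q[(s2, -1)], q[(s2, 1)]) - q[(s, a)])
            s = s2
    return q

q = train()
# Greedy policy per non-terminal state: +1 means "move right".
policy = [max((-1, 1), key=lambda act: q[(s, act)]) for s in range(4)]
```

Starting from all-zero estimates (i.e., random-looking behavior), the learned greedy policy moves right in every state, purely from the reward signal.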
[QUOTE=Harbie;52955282] Singularity now please.[/QUOTE] No thanks.
[QUOTE=ZombieDawgs;52955301]Seems like a rudimentary example of how AI's would teach themselves based on information it's fed. What makes this AI different to the hundreds of current ones? This seems like a naive statement to make.[/QUOTE] This is more of a testament to the accelerating power of machine learning than to a specific chess AI.
tbh, imagine limitless machine learning through an online game. Apply this to a far more complex game, like Unreal Tournament, and it could control a full team of bots all at once, machine-learning the absolute best strategies.
Can't wait for AI tournaments to become mainstream, the Starcraft ones (even if they aren't learning AIs 99% of the time) are so damn cool
[QUOTE=J!NX;52955327] Apply this to a far more complex game though, like Unreal Tournament, and it can control a full team of bots all at once. Machine learning the absolute best strategies.[/QUOTE] I've actually read some interesting thinkpieces about judging team-based AI play, specifically in the context of OpenAI's Dota 2 bots. Some have argued that unless all five heroes are controlled by five separate AIs, with a delay/limit on their ability to pass information between one another to simulate the limited rate at which human players can communicate over voice chat or pings, it's not a true measure of AI's ability to play a team-based game. [editline]7th December 2017[/editline] [QUOTE=archival;52955359]Can't wait for AI tournaments to become mainstream, the starcraft ones (even if they aren't learning AI's 99% of the time) is so damn cool[/QUOTE] Get excited then, SC2 is DeepMind's next focus. They've co-operated with Blizzard on it. [url]https://www.wired.com/story/googles-ai-declares-galactic-war-on-starcraft-/[/url] They've released a paper on it that said their AI is able to optimise individual micro play at or above human levels, but macro-level play is still being worked on.
[QUOTE=archival;52955359]Can't wait for AI tournaments to become mainstream, the starcraft ones (even if they aren't learning AI's 99% of the time) is so damn cool[/QUOTE] Funnily, at some point in the future when neural networks and machine learning have advanced far enough, the game's outcome will be decided the moment the first move is made. So it might not be super exciting, but it'd be cool to watch nonetheless lol. This is already the case with checkers, among a bunch of other games: with perfect play, the game's result is known before the first piece is even moved. This is referred to as a [URL="https://en.wikipedia.org/wiki/Solved_game"]Solved Game.[/URL] There are also AIs in development (or were in development?) for games like chess, which have significantly more depth to them. I'm not sure if realtime games like Starcraft even [B]can[/B] become solved games, but given DeepMind's ability to learn the best possible strategy, it's not unrealistic to think that a similar outcome would occur for Starcraft and games like it.
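For a concrete sense of what "solved" means, here's an illustrative brute-force minimax over tic-tac-toe - a hypothetical mini example, vastly smaller than checkers but solved by the same exhaustive-search principle. Evaluating the empty board proves the game is a draw under perfect play, before anyone has moved:

```python
from functools import lru_cache

# All eight winning lines on a 3x3 board, indexed 0..8.
LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(board):
    """Return 'X' or 'O' if someone has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != '.' and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def value(board, player):
    """Game value with `player` to move: +1 = X wins, -1 = O wins, 0 = draw."""
    w = winner(board)
    if w == 'X':
        return 1
    if w == 'O':
        return -1
    moves = [i for i, cell in enumerate(board) if cell == '.']
    if not moves:
        return 0  # full board, no winner: draw
    results = (value(board[:i] + player + board[i+1:],
                     'O' if player == 'X' else 'X') for i in moves)
    # X maximizes the value, O minimizes it.
    return max(results) if player == 'X' else min(results)

# The empty position evaluates to 0: tic-tac-toe is a draw when solved.
result = value('.' * 9, 'X')
```

Checkers works the same way in principle, just with an astronomically larger state space (hence it took years of computation to solve).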
[QUOTE=WitheredGryphon;52955511] I'm not sure if games like Starcraft that are realtime even [B]can[/B] become solved games, but given Deepmind's ability to learn the best possible strategy, it's not unrealistic to think that a similar outcome would occur for Starcraft and games like it.[/QUOTE] Unlike many of the tabletop games that AIs are often shown playing, StarCraft is not a perfect information game. That is to say, each player is not perfectly informed about all of the events in play. I'm pretty sure that makes it all but unsolvable outside of theoretical scenarios with endless resources, especially when combined with its relative complexity, different maps, and real-time gameplay.
[QUOTE=Harbie;52955554]Unlike many of the tabletop games that AIs are often shown playing, StarCraft is not a perfect information game. That is to say, each player is not perfectly informed about all of the events in play. I'm pretty sure that makes it all but unsolvable outside of theoretical scenarios with endless resources, especially when combined with it's relative complexity, different maps, and real time gameplay.[/QUOTE] Yes, but an AI can always make the perfect play regardless of the opponent's moves. It's a concept related to solved games called "perfect play." IOW, at some point there'll likely be a most optimal opening strategy AIs will develop, up to the point where the opponent makes themselves visible or the AI makes its first move.
[QUOTE=ZombieDawgs;52955301]Seems like a rudimentary example of how AI's would teach themselves based on information it's fed. What makes this AI different to the hundreds of current ones? This seems like a naive statement to make.[/QUOTE] Right? Chess has a fairly concrete set of rules and limited possible outcomes based on the moves that were made previously and the moves that could potentially be made.
[QUOTE=Zero-Point;52955735]Right? Chess has a fairly concrete set of rules and limited possible outcomes based on the moves that were made previously and the moves that could potentially be made.[/QUOTE] That is still a monumental number of potential game-states. Given that there are immediate negative outcomes from certain moves (or series of moves) though, it would likely reduce the magnitude of possible game-states for an iterative learning AI.
[QUOTE=Zero-Point;52955735]Right? Chess has a fairly concrete set of rules and limited possible outcomes based on the moves that were made previously and the moves that could potentially be made.[/QUOTE] What Kasparov is saying is that he expected humans to at least give it some guidance. At the beginning of training, all DeepMind's AI knew was that it A. Needs to win the game, B. Has the current state of the board as input, and C. Has some outputs it controls, expressed as a list of all possible moves it can make. It did not know what triggers wins or losses, or how each of the moves it can make maps to actual changes on the board/game state, much less any actual chess strategy. In 4 hours, it created internal models that not only approximate all of the things I listed above, but play at a skill level above any other player, human or AI - all from playing itself over and over again. It's like locking someone who's never heard of chess in their life in a room and telling them nothing except whether a move is valid and when they've won/lost a game.
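That "locked in a room" setup can be sketched at toy scale. The code below is an illustrative self-play learner for the pile game Nim (take 1-3 objects per turn, taking the last one wins) - a made-up stand-in for AlphaZero's neural-network version, with a lookup table instead of a network. It starts from near-random play, is told only the legal moves and who won, and rediscovers the known perfect strategy (always leave your opponent a multiple of four):

```python
import random

def self_play_train(n=21, games=20000, epsilon=0.2, lr=0.1, seed=1):
    """Learn Nim by self-play: no strategy given, only legal moves and outcomes."""
    rng = random.Random(seed)
    # v[s] = estimated chance that the player *to move* with pile size s wins.
    v = {s: 0.5 for s in range(n + 1)}
    v[0] = 0.0  # no objects left: the player to move has already lost
    for _ in range(games):
        s, history = n, []
        while s > 0:
            moves = [m for m in (1, 2, 3) if m <= s]
            if rng.random() < epsilon:
                m = rng.choice(moves)            # exploration
            else:
                # A move is good if it leaves the opponent a bad position.
                m = min(moves, key=lambda mv: v[s - mv])
            history.append((s, m))
            s -= m
        # Train on the game just played: my winning chance is the complement
        # of the opponent's chance in the position I handed them.
        for s, m in history:
            v[s] += lr * ((1.0 - v[s - m]) - v[s])
    return v

v = self_play_train()
# Positions the learner rates as losing for the player to move.
losing = [s for s in range(1, 22) if v[s] < 0.5]
```

After training purely on its own games, the table marks exactly the multiples of four (4, 8, 12, 16, 20) as lost positions - the textbook perfect-play result - without ever being told that rule.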
I imagine this sort of AI will eventually plan wars. Of course the variables would be a lot less hard and definite, but I'm sure an AI could work through it with enough passes.
[QUOTE=Firecat;52955840]I hope in starcraft the APM is fucking insane, even if it's not needed sometimes. I want it to look as crazy as possible.[/QUOTE] They've deliberately capped the APM. They want it to be better at strategizing than players, not just infinitely faster.
I’ve grown weary of DeepMind’s absolute inability to publish anything with enough information to let you reproduce their results. The DQN paper, I'll give them that. But having had to implement A3C and Reactor from scratch, their papers really show how information-sparse they are. I really adored Ali Rahimi’s test-of-time award presentation at NIPS this year. I feel like DeepMind is one of the main contributors to the unfounded craze of just smashing things together because it seems to work best, without any rigorous explanation
[QUOTE=Anglor;52955882]I’ve grown weary of deepmind’s absolute inability to publish anything with enough information to let you reproduce their results. Regarding dqn, I give them that. But having had to implement a3c and reactor from scratch, their papers really show how information sparse they are. I really adored Ali Rahimi’s test-of-time award presentation at nips this year. I feel like deepmind is one of the main contributors to the unfounded craze of just smashing things together just because it seems to work best without any rigorous explanation[/QUOTE] I'm from an engineering rather than academic background, so forgive me if this is a dumb question. Is it possible that lack of explanation is a result of trying to keep proprietary secrets?
[QUOTE=Harbie;52955892]I'm from an engineering rather than academic background, so forgive me if this is a dumb question. Is it possible that lack of explanation is a result of trying to keep proprietary secrets?[/QUOTE] We have had that theory at the office as well. But if that were the case, you wouldn't promise to release your source code just to ignore it. It’s like they are trying to [I]appear[/I] all open-source-y for all that fresh goodwill. It’s inherently unscientific to release a paper without spilling the beans - why not just keep it secret then? [editline]8th December 2017[/editline] And that’s probably why the DQN paper is alright - it’s one of the few papers of theirs that actually made it to Nature [editline]8th December 2017[/editline] Why wouldn’t you equate engineering with academia?
[QUOTE=Anglor;52955908] Why wouldn’t you equate engineering with academia?[/QUOTE] Academia in the sense of research. I've never worked on research or contributed to a paper, so I don't have too much of an idea of what's expected. Friends of mine who do research make a point to distinguish between being an engineer and a scientist/researcher, so I'm just maintaining that distinction. There's clearly a lot of overlap.
[QUOTE=Harbie;52955366]I've actually read some interesting thinkpeices about judging team-based AI play, specifically in the context of OpenAI's Dota 2 bots. Some have argued that unless all five heroes are controlled by five separate AIs with a delay/limit on their ability to pass information between one another to simulate the limited rate at which human players can pass information to one another over voice chat or pings, it's not a true measure of AI's ability to play a team based game. [editline]7th December 2017[/editline] Get excited then, SC2 is DeepMind's next focus. They've co-operated with Blizzard on it. [url]https://www.wired.com/story/googles-ai-declares-galactic-war-on-starcraft-/[/url] They've released a paper on it that said their AI is able to optimise individual micro play at or above human levels, but macro level play is still being worked on.[/QUOTE] I'm super excited for this and OpenAI's Dota 2 bots, that shit is wild and amazing.
[QUOTE=ZombieDawgs;52955301]Seems like a rudimentary example of how AI's would teach themselves based on information it's fed. What makes this AI different to the hundreds of current ones? This seems like a naive statement to make.[/QUOTE] DeepMind is far from a rudimentary AI team. It is my honest opinion that no other group on the planet has an AI as powerful or smart as theirs. Most AI research work I see, hear about, or read gives me the impression of 'oh, that's really interesting.' Every time I've read any of the papers by the DeepMind team, it has left my lower jaw hanging. [editline]7th December 2017[/editline] Mark my words, Demis Hassabis will be known as the Einstein of our age. His impact on computing is going to be significant and everything DeepMind has done and continues to do is historic. [editline]7th December 2017[/editline] Someone forgot to post DeepMind learning Starcraft II: [media]https://www.youtube.com/watch?v=-fKUyT14G-8[/media]
[QUOTE=J!NX;52955327]... Apply this to a far more complex game though, like Unreal Tournament...[/QUOTE] There was an old story about something a bit similar happening. IIRC some guy had forgotten about a server he had running that was all bots; it ran for something silly like a full year with nothing but the internal learning ability of the bots deciding what they should do. He later found it while digging around and decided to log on, only to find that every single bot was just standing still. They completely ignored all other enemies, even him jumping about. The very moment he killed one, the server crashed. :v: EDIT: Awe sad, apparently it's fake. ;-;
[QUOTE=ubersoldier;52956016]There was an old story of something a bit similar like that happening, IIRC some guy had forgotten about a server that he had running that was bots, it ran for something silly like a full year with nothing but the internal learning ability of the bots deciding what they should do, he later found it while digging around, decided to log on only to find that every single bot was just standing still. Completely ignored all other enemies, even him jumping about. They very moment he killed one, the server crashed. :v: I'll edit this if I do, but I'm in the process of finding it.[/QUOTE] Pretty sure that's a Quake 3 story [sp]IIRC it turned out to be fake[/sp]
[QUOTE=Octopod;52956023]Pretty sure that's a Quake 3 story [sp]IIRC it turned out to be fake[/sp][/QUOTE] I think it changes what game it's supposed to be. I recall hearing it about either Counter Strike or CSS.
I was initially skeptical hearing the 4-hour figure for how long it took to train to be stronger than Stockfish. In my mind, "4 hours" is a little misleading: I imagined the 4 hours of calculation being done on something like my desktop. I read the paper, and apparently they used 5,000 TPUs (Tensor Processing Units) to train the network. A TPU is a lot like the GPU in my computer, but instead of being good at drawing images to the screen, it is really good at matrix calculations. So good, in fact, that the TPUs they used were likely 20x faster/more powerful than a GTX Titan. It's probably better to say that they used a supercomputer for 4 hours to beat Stockfish (they actually trained the program for 9 total hours). Still a great accomplishment that they were able to improve on an already strong engine!
[QUOTE=Byndley;52956450]I was initially skeptical hearing the 4 hours figure for how long it took to train to be stronger than stockfish. In my mind "4 hours" is a little misleading. I imagine the 4 hours of calculation being done on something like my desktop. I read the paper and apparently they used 5,000 TPUs (Tensor Processing Unit) to train the network. 1 TPU is a lot like the GPU in my computer, but instead of being good at drawing images to the screen, it is really good at matrix calculations. So good in fact, that the TPUs they used were likely 20x faster/more powerful than a GTX Titan. It's probably better to say that they used a supercomputer for 4 hours to beat stockfish (they actually trained the program for 9 total hours). Still a great accomplishment that they were able to improve on an already strong engine![/QUOTE] Fair point, but the 5,000 first-gen TPUs were just used to generate the self-play games. 64 second-gen TPUs were used for the actual training, and the AI ran on a machine with just 4 TPUs while actually playing the game. And yeah, it's fair to point out that measures of real time don't matter that much when discussing training time. Reminds me of OpenAI talking about training its Dota bots in only 2 weeks.
[QUOTE=WitheredGryphon;52955511]Funnily, at some point in the future when neural networks and machine learning have advanced far enough, the first move will immediately decide the game's winner. So it might not be super exciting, but it'd be cool to watch nonetheless lol. This is already the case with Checkers for AI in addition to a bunch more. The moment the first piece is moved, the game is already decided. This is referred to as a [URL="https://en.wikipedia.org/wiki/Solved_game"]Solved Game.[/URL] There are also AIs in development (or were in development?) for games like Chess which have significantly more depth to them. I'm not sure if games like Starcraft that are realtime even [B]can[/B] become solved games, but given Deepmind's ability to learn the best possible strategy, it's not unrealistic to think that a similar outcome would occur for Starcraft and games like it.[/QUOTE] Dota 2, during last International, had a machine-learned bot showoff which, in the scenario it was put in (1v1 Mid on 1 hero - Shadow Fiend) was utterly thrashing the game. The precision of micro was fucking insane. The guys who did this promised a live 5v5 game next International, so Starcraft is defo doable, considering Dota has a lot more variables to it.
[QUOTE=CruelAddict;52958198]Dota 2, during last International, had a machine-learned bot showoff which, in the scenario it was put in (1v1 Mid on 1 hero - Shadow Fiend) was utterly thrashing the game. The precision of micro was fucking insane. The guys who did this promised a live 5v5 game next International, so Starcraft is defo doable, considering Dota has a lot more variables to it.[/QUOTE] So there are a lot of reasons not to take that Dota 2 bot at face value. For one thing, it's limited to 1v1 SF mid with item restrictions. From what we've seen, micro in these sorts of games is where deep-learning-based bots excel. Additionally, Shadow Fiend's abilities are very easy for an AI to grasp and use - they deal damage at X units away after Y seconds. Try to have the bots run a character like, I don't know, Tinker, where to play effectively you need to know not only what each of your individual abilities does, but how they work together with item choice and each other. tl;dr laning with Shadow Fiend is a relatively simple task for AI; 5v5 Dota 2 is an entirely different ballgame. I remain doubtful about how effective their 5v5 team will be at the International 2018, or whether it will be present at all.
My prediction: it will use a subset of heroes, a subset of abilities, and probably still not use items, but will show some amazing cooperation and micro.