A New Model for Diplomacy in 4x Games

Started by FarAway Sooner, March 29, 2014, 12:53:22 AM


James Sterrett

Quote from: Mr. Bigglesworth on April 02, 2014, 09:34:10 PM
And yet players typically develop some basic rules of thumb; they don't go calculating huge tables of permutations. What scared devs, I think, is how the options multiply. You don't have to go through all the options; the key is tree pruning.

Even a chess engine might only search 5-10 moves deep for each likely branch. If you add human-style rules of thumb to a pruned search, you get through it quickly without a lot of number crunching.

Interesting article on the state of AI for Go: http://www.newyorker.com/online/blogs/elements/2014/03/the-electronic-holy-war.html

Even tree pruning takes number crunching, and the less clear-cut the measurement of success is, the harder it gets.

Mr. Bigglesworth

That does not apply. Chess and Go are fairly flat: in chess the center has more value; in Go the edges have more value. In any strategy game based on a real map there are huge gradients in value for areas with choke points, resources, capital cities, all kinds of things. Combine that with leadership 'personality': an expansionist/aggressive leader translates to a zero-sum game, while for a cooperative/alliance builder you can use the Shapley value.
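For what it's worth, the Shapley value is straightforward to compute directly for a small number of states: average each state's marginal contribution over every order in which the coalition could form. A minimal Python sketch, with a made-up three-state alliance payoff table (the states and payoffs are purely hypothetical):

```python
from itertools import permutations

def shapley_values(players, value):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            with_p = frozenset(coalition | {p})
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: totals[p] / len(orders) for p in players}

# Hypothetical alliance payoffs: A and B together unlock a trade route,
# C brings nothing to the table.
payoff = {
    frozenset(): 0, frozenset("A"): 10, frozenset("B"): 10, frozenset("C"): 0,
    frozenset("AB"): 40, frozenset("AC"): 10, frozenset("BC"): 10,
    frozenset("ABC"): 40,
}
print(shapley_values("ABC", payoff.__getitem__))
# → {'A': 20.0, 'B': 20.0, 'C': 0.0}
```

An AI using this to split the spoils would (rightly) offer C nothing, since C adds no value to any coalition it joins.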

You can include a balance-of-power philosophy for some states. Some states only care about getting the most now. Next year is another problem, maybe even another politician's problem. Others think in terms of long-term trade growth.

Outlook changes based on trustworthiness. It changes again based on the level of threat to the state.

In simple terms, there are all kinds of tests that let you prove that no possibility within a certain branch can yield more than what you already have. The whole branch is then wiped out, with no need to search it.
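That branch-wiping argument is essentially what alpha-beta pruning formalizes: once a branch provably can't beat a bound already established elsewhere, you stop searching it. A minimal sketch over a toy game tree (the tree shape and leaf scores are invented for illustration):

```python
import math

def alphabeta(node, alpha, beta, maximizing):
    """Minimax with alpha-beta cutoffs. Leaves are plain numbers
    (precomputed position scores); internal nodes are lists."""
    if not isinstance(node, list):
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, best)
            if beta <= alpha:
                break  # opponent already has a better option: prune
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True))
        beta = min(beta, best)
        if beta <= alpha:
            break  # we already have a better option elsewhere: prune
    return best

tree = [[3, 5], [6, [9, 1]], [1, 2]]
print(alphabeta(tree, -math.inf, math.inf, True))  # → 6
```

The pruning never changes the answer; it only skips subtrees that provably can't matter, which is exactly the "no need to search it" point above.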
"Once more unto the breach, dear friends, once more; "
- Shakespeare's Henry V, Act III, 1598

Bletchley_Geek

Quote from: James Sterrett on April 03, 2014, 09:38:00 AM

Even tree pruning takes number crunching, and the less clear-cut the measurement of success is, the harder it gets.

Not a bad article, but it contains a huge lie:

Quote
To say that Go is more complex than chess, though, is a little like saying that one infinity is larger than another. While technically true—and mathematically possible—it does not fully explain why computers, which can't fully compute chess or Go, have become good at one and not the other. "A few hundred orders of magnitude don't matter when you're up in the ten to the one hundred and twenty," Murray Campbell, a member of the I.B.M. Deep Blue team, told me. The key lies in Go's structure. Deep Blue was able to exploit a weakness in chess's armor: at the grandmaster level, to tell who is winning, you add up the pieces on the board and consider their positions. Campbell explained that, to win, you just stay ahead the whole time, vastly reducing the number of moves for computers to consider.

The truth is that present-day computer chess uses two kinds of 'heuristics' to evaluate moves: positional heuristics (basically, an algorithm that assigns a numeric value to a given game configuration) and, more importantly, a cleverly crafted 'pattern database' that maps game states into a vast - but very compactly represented - database of chess endgames (think of all the chess problems you find in the newspaper, compiled together). What positional heuristics (or rather, ensembles of such heuristics) do is speed up the search through the midgame. The reason is that, as fewer pieces remain on the board, the values of such positional heuristics become less informed (meaning it is harder to tell good states from bad ones). The solution to that was enabled by 'pattern database heuristics', which basically rate states according to how similar they are to game states leading to a known endgame (that is, a sequence of moves where the player has victory guaranteed).
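A toy sketch of those two stages - a cheap material/positional heuristic for the midgame, plus an exact endgame-table lookup once few pieces remain - might look like the following. The piece values and the tiny 'table' are purely illustrative, nothing like a real engine's:

```python
# Toy sketch of a two-stage chess evaluator (illustrative values only).

PIECE_VALUE = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 0}

# Hypothetical precomputed endgame table: position class -> exact outcome.
ENDGAME_TABLE = {"K+R vs K": 1.0, "K vs K+R": -1.0, "K vs K": 0.0}

def evaluate(pieces, key=None):
    """Score a position from our side's perspective.

    `pieces` is a list of (piece, sign) pairs, sign +1 for ours and -1
    for the opponent's. When few pieces remain and the position falls
    into a known endgame class, return the exact tabled outcome instead
    of a heuristic guess.
    """
    if len(pieces) <= 3 and key in ENDGAME_TABLE:
        return ENDGAME_TABLE[key]  # exact: no search needed from here
    # Midgame: plain material balance.
    return sum(PIECE_VALUE[p] * s for p, s in pieces)

mid = [("Q", +1), ("R", -1), ("P", +1), ("P", -1), ("K", +1), ("K", -1)]
print(evaluate(mid))  # → 4 (queen vs. rook material edge)
print(evaluate([("K", +1), ("R", +1), ("K", -1)], key="K+R vs K"))  # → 1.0
```

The point of the second stage is exactly the one made above: where the heuristic goes blind, a precompiled database supplies certainty.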

This strategy doesn't work for Go, for several reasons. First, the game tree is exponentially bigger than that of chess, so weaknesses in positional heuristics (all heuristics, by definition, have a 'blind spot') early in the search can lead you astray very easily. Second, the Go 'endgame' isn't as well documented as that of chess, so nobody has come up with good pattern database heuristics (and there are well-founded doubts about the technical feasibility of building such pattern databases in the first place).

In order to succeed at Go, Go programs had to abandon the algorithmic framework used in chess ever since Claude Shannon wrote his seminal paper on the subject (presenting the Min-Max algorithm). The problem with Min-Max (and its evolutions, like Alpha-Beta Min-Max or NegaScout) is that it is too systematic, even with pruning rules. That systematicity - which guarantees optimal play provided enough time to compute the moves - had to be abandoned. And here we come to the Monte Carlo approaches, which are derived from methods for solving the "n-armed bandit problem" (that is, how to figure out the strategy that maximizes profit when playing a bank of slot machines).
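The bandit connection can be sketched with the classic UCB1 rule: always pull the arm with the best observed mean payoff plus an exploration bonus that shrinks as an arm gets sampled. The three machines and their win probabilities below are made up for illustration:

```python
import math
import random

def ucb1(pull, n_arms, rounds):
    """UCB1: pick the arm maximizing mean payoff + exploration bonus."""
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1  # try every arm once to initialize
        else:
            arm = max(range(n_arms), key=lambda a:
                      sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward
    return counts

random.seed(1)
# Three hypothetical slot machines with win probabilities 0.2, 0.5, 0.8.
probs = [0.2, 0.5, 0.8]
counts = ucb1(lambda a: float(random.random() < probs[a]), 3, 2000)
print(counts)  # the 0.8 arm ends up pulled far more often than the others
```

The same trade-off - exploit what looks good so far, but keep spending some samples on under-explored options - is what drives move selection in Monte Carlo Go programs.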

These methods are based on 'rolling out' a base strategy (which, in the case of a board game, encodes the kind of positional or lookahead reasoning of chess heuristics) and adjusting that base strategy by sampling possible executions (i.e. the program applies the strategy until reaching a terminal - win or loss - state) and updating the expected payoffs accordingly. The 'magic' of these methods lies in being able to account for the variance in payoffs, so the program prioritizes those strategies which 1) hold the promise of better payoffs and 2) haven't been explored as often. And the level of play of these programs is proportional to the number of strategy evaluations they can make per unit of time.
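A flat Monte Carlo version of that rollout idea, on a toy Nim game instead of Go (take 1-3 stones per turn, whoever takes the last stone wins; everything here is illustrative): score each legal move by the mean payoff of random playouts from the resulting position, then pick the best.

```python
import random

def random_playout(pile, to_move, me):
    """Play uniformly random moves to the end of toy Nim.
    Returns 1.0 if `me` takes the last stone, else 0.0."""
    while pile > 0:
        pile -= random.randint(1, min(3, pile))
        if pile == 0:
            return 1.0 if to_move == me else 0.0
        to_move = 1 - to_move

def monte_carlo_move(pile, me=0, samples=1000):
    """Flat Monte Carlo: rate each legal move by the mean payoff of
    random playouts from the resulting position, then pick the best."""
    scores = {}
    for take in range(1, min(3, pile) + 1):
        if take == pile:
            scores[take] = 1.0  # taking the last stone wins outright
            continue
        wins = sum(random_playout(pile - take, 1 - me, me)
                   for _ in range(samples))
        scores[take] = wins / samples
    return max(scores, key=scores.get)

random.seed(0)
print(monte_carlo_move(6))  # → 2 (leaves a pile of 4, a loss for the opponent)
```

No positional knowledge of Nim went into this; the sampling alone discovers the right move, which is the sense in which these programs can find lines of play nobody taught them.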

The fact that they play so differently from how humans commonly play raises an interesting 'metaphysical' question. Games - all games - can be formalized into a mathematical model (which can be more or less 'workable'). We humans play these games, but usually only get a glimpse of all the possible developments of the game. We are taught to play these games in very particular ways: as in chess, strategies are to a great degree a cultural construct built around the game. A Monte Carlo Go-playing program doesn't carry this cultural baggage (which is useful for playing the game effectively) and explores ways of playing that have never crossed our minds. It's a bit like having these programs look for your birthday in the decimal digits of the number pi - you're guaranteed to find it, infinitely often. But it is also true that you can find patterns there that nobody has thought of yet (or assigned any meaning to).

FarAway Sooner

Bletchley, in simple terms, then, why is it so easy to find certain exploits that AI will fall for so consistently in most turn-based strategy games?

Mr. Bigglesworth

In simple terms humans are experts at pattern recognition.
"Once more unto the breach, dear friends, once more; "
- Shakespeare's Henry V, Act III, 1598

Bletchley_Geek

Quote from: FarAway Sooner on April 03, 2014, 10:13:26 PM
Bletchley, in simple terms, then, why is it so easy to find certain exploits that AI will fall for so consistently in most turn-based strategy games?

Quote from: Mr. Bigglesworth
In simple terms humans are experts at pattern recognition.

What Mr. Biggles said, basically: we're very good at figuring out the intentions behind the moves of others. That allows us to pre-empt future opponent moves or to set up a 'trap' to bait them. Purely reactive AIs - that is, ones that only consider the current state of the game - are simply incapable of imitating the kind of reasoning we do when we contrast our opponents' moves against the state the game was in and where it makes sense for the opponent to take it. Expert human players tend to be very good at masking their actual intentions, misleading or distracting their opponents.

bbmike

Quote from: Mr. Bigglesworth on April 03, 2014, 11:15:32 PM
In simple terms humans are experts at pattern recognition.

You watched Cosmos the other night!  ;D
"My life is spent in one long effort to escape from the commonplace of existence."
-Sherlock Holmes

"You know, just once I'd like to meet an alien menace that wasn't immune to bullets."
-Brigadier Lethbridge-Stewart

"There's a horror movie called Alien? That's really offensive. No wonder everyone keeps invading you!"
-The Doctor

"Before Man goes to the stars he should learn how to live on Earth."
-Clifford D. Simak

Mr. Bigglesworth

I wish, I don't have cable. Internet only.
"Once more unto the breach, dear friends, once more; "
- Shakespeare's Henry V, Act III, 1598

Martok

Quote from: Anguille on April 03, 2014, 12:00:07 AM
Quote from: Martok on April 01, 2014, 03:29:04 PM
Quote from: FarAway Sooner on March 30, 2014, 12:06:54 AM
I'd be happy to just see a game where my people's attitudes towards an enemy were tracked separately from my own status with them (i.e., War, Peace, Non-Aggression Pact, etc.).  As was the case in the EU games, if you go to war with somebody where there's not a proper casus belli, you pay the price in domestic turmoil. 
Armada 2526 has something like this.  It sadly doesn't have any true effect, however (i.e., declaring war on a nation your people like lowers your colonies' collective happiness/loyalty).

If I am not mistaken, it's the case in Star Trek: Birth of the Federation. No?
BOTF only shows another empire's feelings towards you.  Armada does that, *plus* displays your own people's feelings towards that empire as well. 

"Like we need an excuse to drink to anything..." - Banzai_Cat
"I like to think of it not as an excuse but more like Pavlovian Response." - Sir Slash

"At our ages, they all look like jailbait." - mirth

"If we had lines here that would have crossed all of them. For the 1,077,986th time." - Gusington

"Government is so expensive that it should at least be entertaining." - airboy

"As long as there's bacon, everything will be all right." - Toonces