This is giving strong "Using TDD to build a Sudoku solver" vibes[1].
Let's use some hot tools to solve a problem. Sorta solve a problem. Well, AI doesn't actually know abstract strategy, so maybe it'll work later with a different AI. But it autogenerated some basic logic that only had a few bugs!
There is no strategic understanding of this tool (or the Sudoku one) by any party, neither the human "creator" nor the AI agent. Human knowledge is not advanced at all. The LLM can't tell you _why_ it makes its choices - maybe it could pretend to, but the descriptions themselves would probably be hallucinatory.
It's "just a game" but obviously this also applies to AI decision making in much more consequential settings. We should not strive just to come up with "the right answer" but ask _why_ it is the right answer.
Slay the Spire is my favorite video game of all time. I have 400+ hours on my Switch, 100+ hours on my phone, and 100+ hours on my PC. I'm obsessed.
I think creating an AI for this game is quite difficult, but I'd love a chat bot to discuss decisions with - especially a bot that could take in the current state.
It isn't enough to just take in the cards, but you should also take in relics, the counters on the relics, the potions, and what ascension you are at.
For example, you may have the cards to kill an enemy now - but it may be more beneficial to wait - either to increment the counter on a relic, or to draw a card that does an effect on a fatal hit.
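As a rough sketch of what "taking in the current state" could mean for such a chat bot, here is one way to bundle cards, relics (with counters), potions, and ascension into a text summary. The class and field names (`RunState`, `Relic`, `describe`) are made up for illustration and do not come from the game or any mod API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Relic:
    name: str
    counter: Optional[int] = None  # None for relics that have no counter


@dataclass
class RunState:
    # Hypothetical container; field names are assumptions, not a real API.
    character: str
    ascension: int
    hp: int
    max_hp: int
    deck: List[str] = field(default_factory=list)
    relics: List[Relic] = field(default_factory=list)
    potions: List[str] = field(default_factory=list)


def describe(state: RunState) -> str:
    """Render the state as plain text that could be pasted into a chat prompt."""
    relics = ", ".join(
        r.name if r.counter is None else f"{r.name} (counter {r.counter})"
        for r in state.relics
    )
    return "\n".join([
        f"{state.character}, ascension {state.ascension}, HP {state.hp}/{state.max_hp}",
        f"Deck: {', '.join(state.deck) or 'empty'}",
        f"Relics: {relics or 'none'}",
        f"Potions: {', '.join(state.potions) or 'none'}",
    ])
```

The point of carrying the counter explicitly is that two states with identical cards can still call for different plays.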
The main thing that’s fun about these games is figuring out how to gain insights about how to play better. I think offloading the theorizing and problem solving in any way would make the game less fun. See also: most (not all) use of strategy guides and walkthroughs.
I 100% agree. At A20 I constantly find myself in situations where I have to pick a card I'd never pick because it is my best option at that time. This leads to me learning more strategies and insights.
For example, in a previous A20 run I took Metallicize as Ironclad late in Act 1 because I was facing the Guardian and hadn't found any other good block options. Prior to that, I tended to avoid Metallicize as too slow, saving the spot for other block cards instead. This situation made me realize that having a Metallicize in your deck isn't that bad, especially if you need block NOW and you can counteract having it in your deck later on with draw/exhaust.
Metallicize is also strong because Ironclad has many powerful strategies that want a very thin deck - two Dropkicks go infinite if you can get your deck small enough. Metallicize adds a solid block strategy to any deck and doesn't increase the effective deck size, because powers don't really take a spot once played.
> For example, you may have the cards to kill an enemy now - but it may be more beneficial to wait - either to increment the counter on a relic, or to draw a card that does an effect on a fatal hit.
How is this any different than chess?
Solving this using an LLM is novel and interesting, but I'd feel confident in claiming that writing an AI bot for STS using "classic" AI methods would be pretty easy, actually.
I believe chess evaluation functions always measure the difference between you and your opponent - i.e. taking a piece is "the same" as avoiding them taking yours. You simply have to find any path to checkmate.
STS, on the other hand, is about accumulating some resources while spending others. Which resources you want to spend and which you want to accumulate depend a great deal on your character and build. Skillful play often involves recognizing which resources you would benefit from losing at the current stage of the game. Training is also made much more difficult by how long you need to go to determine if you are on a winnable path. Power spikes are non-linear and many deck compositions would optimally play in a 'losing' way for a certain period of time before getting the pieces they need.
The game tree is both broader and deeper in Slay, so tree search is much harder. It's also very irregular: in chess you have very few situational decisions (e.g. en passant, castling), whereas many decisions in Slay are one-offs (events) whose EV is quite hard to weigh. Finally, you don’t have perfect information in Slay, which makes things WAY harder.
You’d probably want to look at Go AI instead of Chess, but better would be the Dota / StarCraft AIs. Very different architecture.
There's just a lot more variability due to the sheer number of "pieces" and the fact that you can combo moves... but maybe it's just a matter of scaling
There are more possible moves to consider in STS than Go? I've dumped a hundred hours into the former and thousands into the latter and I wouldn't agree.
Almost all the games you play on PCs have much bigger state spaces than Go.
To give an intuitive example of why StS's state space is large, imagine these two situations: in both cases you have the exact same map, same deck, same HP, same potions and same relics. The only difference is that in one case you have 100 gold and in the other you have 101 gold.
They're still two different states. And all the following states are different.
The size of the state space doesn't necessarily translate to how difficult the game feels. Fundamentally, Go has zero randomness while StS is full of randomness, so it's a bit apples to oranges.
Absolutely. Assuming you don't crack the deck PRNG, you have to branch the decision tree for every possible order of cards whenever you shuffle the deck - that quickly gets to a number of states larger than the number of atoms in the observable universe.
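To put rough numbers on the shuffle branching, here is a back-of-the-envelope sketch. It counts distinguishable deck orders (identical cards are interchangeable) and how many reshuffles it takes for the number of order sequences to pass 10^80, a commonly quoted rough estimate for atoms in the observable universe. The function names and the 10-card example deck are just for illustration, and this ignores everything else that branches (enemy intents, rewards, events).

```python
from collections import Counter
from math import factorial

ATOMS_ESTIMATE = 10**80  # rough, commonly quoted figure; an assumption here


def distinct_orderings(deck):
    """Distinguishable orders of a deck, treating copies of a card as identical."""
    total = factorial(len(deck))
    for copies in Counter(deck).values():
        total //= factorial(copies)
    return total


def shuffles_to_exceed(deck, limit=ATOMS_ESTIMATE):
    """How many independent shuffles until the count of order sequences exceeds limit."""
    per_shuffle = distinct_orderings(deck)
    if per_shuffle <= 1:
        raise ValueError("deck has only one possible order")
    sequences, shuffles = 1, 0
    while sequences <= limit:
        sequences *= per_shuffle
        shuffles += 1
    return shuffles


# Example: a 10-card deck of 5 Strikes, 4 Defends and 1 Bash.
starter = ["Strike"] * 5 + ["Defend"] * 4 + ["Bash"]
print(distinct_orderings(starter), shuffles_to_exceed(starter))
```

A single shuffle of a small deck with many duplicates is modest; it is the product across shuffles, and the growth of the deck with unique cards, that blows up.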
This also actually modifies strategy - top player Jorbs notoriously builds massive spreadsheets to crack tough decisions on important runs and deck order is huge for that.
To echo the other commenters - yup, there are so many possible moves in the game.
For example, there is a relic that gives you the status "Intangible" every 6 turns, and then it resets. "Intangible" makes it so that every instance of incoming damage is reduced to 1 damage point. This relic has a counter, and its turn count persists across individual fights. So for example, you could be on your 1st turn in a fight, but to the relic it could be your 3rd.
At first, you might think to yourself that you should stall each fight long enough where you can end the fight on the relic counter being 5 - so that any next encounter you start will be your 6th turn to the relic - meaning you'll be intangible at the beginning of the fight.
But - there are situations where you might want the counter to be at 4, so that you will be intangible not on the 1st turn of a fight but the 2nd. For example, there are some enemies that don't do damage on turn 1, or do more damage on turn 2. There are situations where you want the counter to be at 3, 2, etc.
To add onto this - it can be VERY difficult to stall a fight such that you get the relic to the number you want and you kill the enemy you are facing on that turn.
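The counter arithmetic above can be written down in a few lines. This uses only the simplified model from this comment (Intangible every 6 relic turns, counter carried between fights); the function names are made up and the real relic's timing may differ in details.

```python
PERIOD = 6  # relic triggers every 6 turns, per the description above


def intangible_turn(counter_at_fight_start, period=PERIOD):
    """Fight turn on which the relic first triggers, given the carried-over counter."""
    if not 0 <= counter_at_fight_start < period:
        raise ValueError("counter must be in the range 0..period-1")
    return period - counter_at_fight_start


def counter_needed(target_turn, period=PERIOD):
    """Counter to end the previous fight on so the trigger lands on target_turn."""
    if target_turn < 1:
        raise ValueError("turns are numbered from 1")
    return (period - target_turn) % period


# Counter 5 -> Intangible on turn 1 of the next fight; counter 4 -> turn 2.
for counter in range(PERIOD):
    print(counter, intangible_turn(counter))
```

The arithmetic is the easy part; the hard part is, as noted, actually ending the fight on the counter you want.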
There are so many decisions to be made surrounding this one relic. What I've described here doesn't really cover all the theory surrounding it.
You can find similar decision paths and theory around every card, relic, potion, enemy, boss, etc. in the game.
It's one of the many reasons why I love this game so, so much.
I gave up Dota 2 the day my son was born. Probably one of the best decisions of my life! I probably have more hours in Dota 1 than 2, but these thoughts could drive me crazy.
(more for practical reasons at the time, suddenly I just couldn't devote 30-60+ minutes at a time to a game, because babies scream at all the hours and anything I could do to help my wife out was the least I could do!)
> I'm very excited about STS 2.
citybuilder folk are meant to be all about Manor Lords this week! :)
At some point I stopped playing it (after, well, 100s of hours), because somehow I felt that it's only getting harder without me getting any reward outside of the raw challenge itself - which then feels just like working.
Is there anything I didn't really notice that feels rewarding on higher levels?
I like that it requires a small amount of mental energy to perform the 'optimal' play in every situation. The obvious example is doing basic arithmetic to calculate whether you should block more or attack.
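A toy version of that arithmetic, assuming a single enemy with a known attack this turn and ignoring relics, debuffs, and multi-turn planning. The function names are invented for the example.

```python
def damage_taken(incoming, block):
    """Unblocked damage: block absorbs damage first and cannot push it below zero."""
    return max(0, incoming - block)


def compare_lines(enemy_hp, incoming, attack_damage, block_gain):
    """Outcome of spending the turn fully on attacking versus fully on blocking."""
    lethal = attack_damage >= enemy_hp
    return {
        "attack": {
            # A dead enemy never gets to attack, so lethal means no HP lost.
            "hp_lost": 0 if lethal else incoming,
            "enemy_hp_left": max(0, enemy_hp - attack_damage),
        },
        "block": {
            "hp_lost": damage_taken(incoming, block_gain),
            "enemy_hp_left": enemy_hp,
        },
    }


# Enemy at 20 HP attacking for 12; hand offers 18 damage or 10 block.
print(compare_lines(enemy_hp=20, incoming=12, attack_damage=18, block_gain=10))
```

Real turns mix attacks and blocks and look ahead more than one turn, which is where the "small amount of mental energy" grows.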
At first, the game feels like it's deterministic with sprinkles of randomness; however, at high ascension levels it is the opposite - random chance until you get a broken deck that makes winning trivial. It's a slot machine.
As the other commenter said, it actually isn't like a slot machine.
I recommend watching runs from players that are going for win streaks. Many of the runs in win streaks are not slot machine decks. Going for a win streak forces you to take cards that you otherwise wouldn't have.
Watching runs in general will also show you that players win without broken decks. It's just that videos with the highest views tend to have broken decks because those are the most entertaining.
For me, the reward is having a deck that works well enough to tackle anything thrown at you.
This means many different things, but basically amounts to having the right amount of cards or relics that give you scalability for your attacks, a plan to mitigate attacks (with statuses or blocks), and a plan to handle beating the heart.
After a certain point, grabbing cards to beat the heart leads to beating other things easily - including Sword and Shield and the double Act 3 boss.
Funnily enough, I have the opposite problem. I'm often building decks that can beat everything easily except for the heart.
> Is there anything I didn't really notice that feels rewarding on higher levels?
The thought that you put into strategy matters more when the game is harder.
Personally, I don't find it fun past ascension 17 but if I play at a0 it's a bit mindless which is not as fun anymore.
No. The point of the ascension mechanic is that it gets harder and harder, so you need to play more and more optimally to have a chance at winning. There aren’t any benefits past ascension 1.
If you aren’t the sort of person who wants a harder challenge just for the sake of challenging yourself, then there’s no point yeah.
I'm currently trying to beat A20+heart with the four characters, then I will lose interest. I've only beaten it with Watcher so far. Working on Defect now, then Ironclad, then Silent (ugh). Watching Baalord's videos on YouTube really helped.
Watch Baalord's videos on youtube. Dude had a run of like twenty A20+Heart wins in a row with different characters. And he doesn't start over constantly like I do.
An LLM is definitely not the best tool for "solving" a game like Slay the Spire or any game that requires strategic thinking. Sounds like a fun project, though, and it would be crazy impressive if the solution was better than random chance.
Fun. I do like how you can pretty easily create an LLM to do a whole bunch of tasks that it wasn't designed to do, so long as you don't particularly care about how well it does it.
I tried creating a robot driver back when I first started playing around with ChatGPT. I told it the list of commands it could output, like "Turn Left [n] degrees" and "Raise right hand [n] degrees" and "Say [x]." I then gave it instructions and it seemed to work just fine.
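For what it's worth, the receiving end of that command list is easy to sketch. This is a hypothetical parser for the three command shapes mentioned above; the command names in the tuples are invented, and nothing here calls ChatGPT or any robot API.

```python
import re

# One pattern per command format quoted above; the names are made up.
PATTERNS = [
    ("turn_left", re.compile(r"^Turn Left (\d+) degrees$", re.IGNORECASE)),
    ("raise_right_hand", re.compile(r"^Raise right hand (\d+) degrees$", re.IGNORECASE)),
    ("say", re.compile(r"^Say (.+)$", re.IGNORECASE)),
]


def parse_command(line):
    """Turn one line of model output into (command, argument), or None if unrecognized."""
    text = line.strip()
    for name, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            arg = match.group(1)
            return (name, arg if name == "say" else int(arg))
    return None


for raw in ["Turn Left 90 degrees", "Say hello there", "Do a backflip"]:
    print(raw, "->", parse_command(raw))
```

Rejecting anything that doesn't match is the important bit, since the model will occasionally output something outside the list.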
"All" I would need to do then would be to have a basic robot, program an API to drive it, and add voice-to-text to send commands to ChatGPT, and I would have a pretty basic voice-controlled robot, where the "brains" were coded in five minutes. At least good enough for a demo.
As an aside, what is the current state of "LLMs for NPCs" in video games?
I don't really game or follow the industry, but I have to imagine both modders and publishers are working furiously to introduce more natural conversational experiences?
0. What conversation? Current LLMs don't really sound like conversation.
1. Running LLMs in real time for NPC conversation is a no-no because games don't have spare CPU/GPU time for that; they need to have 8k ultra HDR at 240 fps.
2. Even if they did, to have a 'natural' conversation experience they'd have to ask their players to ... type text into the game. The tendency is to remove even text based predefined options and replace them with icons - see Fallout 4.
3. Even if they somehow got past 1 and 2, that would mean user generated content into the game and you'll soon see screenshots online of people playing the latest open world rpg thingy and generating furry porn inside it without any mods. That's not wholesome family fun, only violence is.
4. But don't worry, they will use LLMs to pregenerate bland content and stuff it in every corner of the world. If we're lucky, maybe they'll even human check it before release.
That being said, I'd enjoy a new generation of text adventure games that don't have you guess at what the parser can understand. But just the parser for my input.
There's a PC game called "Suck Up!" where you play as a vampire trying to trick people into inviting you into their home so you can feed off them. All the NPCs are powered by an LLM and the game is all about using actual persuasive speech to appeal to their different personalities to convince them to let you in.
[1] https://ravimohan.blogspot.com/2007/04/learning-from-sudoku-...
The people I used to play with are hitting 10,000+.
I now only play deckbuilders, citybuilders, and roguelikes. I'm very excited about STS 2.
Do you prefer being spoon fed rewards for grinding endlessly, like in your average free game?
/s
AlphaGo Zero isn't an LLM. The Dota bot that beat pro gamers is not an LLM.
If this is an ad for Amazon Q, I'm not sure it's a good one.
>An LLM is definitely not the best tool for "solving" a game like Slay the Spire