alphacats

package module

v0.0.0-...-014f423 Published: Nov 6, 2020 License: GPL-3.0 Imports: 10 Imported by: 0

Repository

github.com/timpalpant/alphacats

README ¶

AlphaCats

AlphaCats was a failed attempt to solve the game of Exploding Kittens using Deep Counterfactual Regret Minimization. AlphaCats is built around the go-cfr package.

Because the game tree is so deep, external sampling is intractable, and other forms of MC-CFR sampling (such as outcome sampling) produced high-variance samples and a model that struggled to converge.

Future areas of investigation could include variance-reduction and improved sampling techniques.

Usage

cmd/alphacats is the main driver binary. CFR iteration can be launched with:

./cmd/alphacats/alphacats -logtostderr \
    -decktype core -cfrtype deep -iter 10 \
    -sampling.num_sampling_threads 5000 \
    -sampling.max_num_actions 2 \
    -sampling.exploration_eps 1.0 \
    -deepcfr.traversals_per_iter 10000 \
    -deepcfr.buffer.size 10000000 \
    -deepcfr.model.num_encoding_workers 4 \
    -deepcfr.model.batch_size 10000 \
    -deepcfr.model.max_inference_batch_size 10000 \
    -output_dir output -v 1 2>&1 | tee run.log

This will run DeepCFR with a reservoir buffer of size 10 million, and sample the game tree using robust sampling with K=2.
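
The reservoir buffer mentioned above keeps a fixed-size uniform sample of everything offered to it, however many samples the traversals generate. The following is a minimal sketch of that idea (the classic "Algorithm R"), not the go-cfr implementation; the `Reservoir` type, its fields, and its methods are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Reservoir is a hypothetical fixed-capacity buffer that holds a uniform
// random sample of every item ever offered to it (Algorithm R).
type Reservoir struct {
	capacity int
	seen     int // total number of items offered so far
	items    []float32
	rng      *rand.Rand
}

// NewReservoir creates an empty buffer with its own seeded random source.
func NewReservoir(capacity int, seed int64) *Reservoir {
	return &Reservoir{
		capacity: capacity,
		items:    make([]float32, 0, capacity),
		rng:      rand.New(rand.NewSource(seed)),
	}
}

// Add offers one item. Until the buffer is full every item is kept; after
// that, the n-th item overwrites a random slot with probability capacity/n.
func (r *Reservoir) Add(x float32) {
	r.seen++
	if len(r.items) < r.capacity {
		r.items = append(r.items, x)
		return
	}
	if j := r.rng.Intn(r.seen); j < r.capacity {
		r.items[j] = x
	}
}

func main() {
	buf := NewReservoir(5, 1)
	for i := 0; i < 1000; i++ {
		buf.Add(float32(i))
	}
	fmt.Println(len(buf.items), buf.seen) // 5 1000
}
```

Memory use stays bounded by the capacity (10 million samples in the command above), while early and late iterations remain equally likely to be represented in the buffer.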

Tabular CFR can be launched instead with -cfrtype tabular. Because it requires a large amount of memory, a smaller test game can be selected with -decktype test. Tabular CFR is not thread-safe and must be run with -sampling.num_sampling_threads 1.

Model

The underlying model used in AlphaCats is an LSTM over the game history that feeds forward into a deep fully connected network.

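# Imports are an assumption: standalone Keras 2.x on a TensorFlow 1.x GPU
# backend, where CuDNNLSTM is available. history_shape, hands_shape, and
# N_OUTPUTS are not defined in this excerpt.
from keras.layers import (BatchNormalization, Bidirectional, CuDNNLSTM,
                          Dense, Input, concatenate)
from keras.models import Model
from keras.optimizers import Adam

# The history (LSTM) arm of the model.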
history_input = Input(name="history", shape=history_shape)
lstm = Bidirectional(CuDNNLSTM(32, return_sequences=False))(history_input)

# The private hand arm of the model.
hands_input = Input(name="hands", shape=hands_shape)

# Concatenate and predict advantages.
merged_inputs = concatenate([lstm, hands_input])
merged_hidden_1 = Dense(128, activation='relu')(merged_inputs)
merged_hidden_2 = Dense(128, activation='relu')(merged_hidden_1)
merged_hidden_3 = Dense(128, activation='relu')(merged_hidden_2)
merged_hidden_4 = Dense(64, activation='relu')(merged_hidden_3)
merged_hidden_5 = Dense(64, activation='relu')(merged_hidden_4)
normalization = BatchNormalization()(merged_hidden_5)
advantages_output = Dense(N_OUTPUTS, activation='linear', name='output')(normalization)

model = Model(
    inputs=[history_input, hands_input],
    outputs=[advantages_output])
model.compile(
    loss='mean_squared_error',
    optimizer=Adam(clipnorm=1.0),
    metrics=['mean_absolute_error'])

See model/train.py for the training script. During training, samples are first generated using a go-cfr sampler and saved to *.npz files; the script then loads them in minibatches. The resulting model is saved in TensorFlow format and loaded from Go for inference (see model/lstm.go).

Documentation ¶

Index ¶

Constants
func CountDistinctShuffles(deck cards.Set) int
func EnumerateDealsWithP0Hand(deck, p0Deal cards.Set, cb func(d Deal))
func EnumerateDealsWithP1Hand(deck, p1Hand cards.Set, cb func(d Deal))
func EnumerateInitialDeals(deck cards.Set, cardsPerPlayer int, cb func(d Deal))
func EnumerateShuffles(deck cards.Set, cb func(shuffle cards.Stack))
type AbstractedInfoSet
type BeliefState
- func NewBeliefState(opponentPolicy func(cfr.GameTreeNode) []float32, infoSet gamestate.InfoSet) *BeliefState
type Deal
- func NewRandomDeal(deck []cards.Card, cardsPerPlayer int) Deal
- func NewRandomDealWithConstraints(drawPile cards.Stack, p1Hand cards.Set) Deal
type GameNode
- func NewGame(drawPile cards.Stack, p0Deal, p1Deal cards.Set) *GameNode
type InfoSetWithAvailableActions
- func (is *InfoSetWithAvailableActions) MarshalBinary() ([]byte, error)
- func (is *InfoSetWithAvailableActions) UnmarshalBinary(buf []byte) error

Constants ¶

const (
	PlayTurn turnType = iota
	GiveCard
	ShuffleDrawPile
	MustDefuse
	InsertKittenRandom
	GameOver
)

Variables ¶

This section is empty.

Functions ¶

func CountDistinctShuffles ¶

func CountDistinctShuffles(deck cards.Set) int
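
CountDistinctShuffles is undocumented. Judging only by its name and signature, it presumably counts the distinguishable orderings of a deck that may contain duplicate cards. The sketch below illustrates that counting argument under this assumption: a deck of n cards with k_i copies of each kind has n! / (k_1! k_2! ...) distinct orderings. The `countDistinctOrderings` function and the map from card name to copy count are hypothetical stand-ins; the real representation of cards.Set is not shown on this page.

```go
package main

import "fmt"

// countDistinctOrderings returns n! / (k1! * k2! * ...), where counts maps
// each card kind to its number of copies and n is the total card count.
// The result is built as a product of binomial coefficients so that every
// intermediate division is exact. It overflows int for large decks.
func countDistinctOrderings(counts map[string]int) int {
	result := 1
	placed := 0
	for _, k := range counts {
		// Choose which k of the next placed+k positions hold this kind:
		// after the inner loop, result has been multiplied by C(placed+k, k).
		for i := 1; i <= k; i++ {
			placed++
			result = result * placed / i
		}
	}
	return result
}

func main() {
	deck := map[string]int{"Skip": 2, "Attack": 1, "Defuse": 1}
	fmt.Println(countDistinctOrderings(deck)) // 12
}
```

Four cards with one duplicated pair give 4!/2! = 12 orderings, which is why enumerating shuffles of a realistic deck quickly becomes expensive.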

func EnumerateDealsWithP0Hand ¶

func EnumerateDealsWithP0Hand(deck, p0Deal cards.Set, cb func(d Deal))

func EnumerateDealsWithP1Hand ¶

func EnumerateDealsWithP1Hand(deck, p1Hand cards.Set, cb func(d Deal))

func EnumerateInitialDeals ¶

func EnumerateInitialDeals(deck cards.Set, cardsPerPlayer int, cb func(d Deal))

func EnumerateShuffles ¶

func EnumerateShuffles(deck cards.Set, cb func(shuffle cards.Stack))

Types ¶

type AbstractedInfoSet ¶

type AbstractedInfoSet struct {
	Player           gamestate.Player
	PublicHistory    gamestate.History
	Hand             cards.Set
	P0PlayedCards    cards.Set
	P1PlayedCards    cards.Set
	DrawPile         cards.Stack
	AvailableActions []gamestate.Action
}

AbstractedInfoSet abstracts away private history. The main difference in this abstraction is that the exact order in which private cards were received is ignored. A second difference is that cards known to be somewhere in the draw pile (but not at a known position) are forgotten. This can happen if a SeeTheFuture card is played and the draw pile is then shuffled.
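
To illustrate the first point, here is a minimal sketch of an order-insensitive key: two histories that deliver the same cards in a different order collapse to the same value. The `orderFreeKey` helper and the use of plain strings for cards are hypothetical; this is not how AbstractedInfoSet.Key is implemented.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// orderFreeKey builds a key from the cards a player has received that does
// not depend on the order in which they arrived.
func orderFreeKey(received []string) string {
	sorted := append([]string(nil), received...) // copy; leave the input untouched
	sort.Strings(sorted)
	return strings.Join(sorted, ",")
}

func main() {
	a := orderFreeKey([]string{"Skip", "Defuse", "Attack"})
	b := orderFreeKey([]string{"Attack", "Skip", "Defuse"})
	fmt.Println(a == b, a) // true Attack,Defuse,Skip
}
```

Merging such histories shrinks the number of distinct info sets at the cost of discarding whatever the ordering might have revealed.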

func (*AbstractedInfoSet) Key ¶

func (is *AbstractedInfoSet) Key() []byte

Key implements cfr.InfoSet.

func (*AbstractedInfoSet) MarshalBinary ¶

func (is *AbstractedInfoSet) MarshalBinary() ([]byte, error)

func (AbstractedInfoSet) String ¶

func (a AbstractedInfoSet) String() string

func (*AbstractedInfoSet) UnmarshalBinary ¶

func (is *AbstractedInfoSet) UnmarshalBinary(buf []byte) error

type BeliefState ¶

type BeliefState struct {
	// contains filtered or unexported fields
}

BeliefState holds the distribution of probabilities over underlying game states as perceived from the point of view of one player.

func NewBeliefState ¶

func NewBeliefState(opponentPolicy func(cfr.GameTreeNode) []float32, infoSet gamestate.InfoSet) *BeliefState

Return all game states consistent with the given initial hand. Note that the passed hand should include the Defuse card.

func (*BeliefState) Clone ¶

func (bs *BeliefState) Clone() *BeliefState

func (*BeliefState) Len ¶

func (bs *BeliefState) Len() int

func (*BeliefState) Less ¶

func (bs *BeliefState) Less(i, j int) bool

func (*BeliefState) SampleDeterminization ¶

func (bs *BeliefState) SampleDeterminization() *GameNode

func (*BeliefState) Swap ¶

func (bs *BeliefState) Swap(i, j int)

func (*BeliefState) Update ¶

func (bs *BeliefState) Update(infoSet gamestate.InfoSet)

Update updates the belief state by propagating all current states forward, expanding determinizations as necessary, and keeping only those that match the given new info set.
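
The filtering step described above can be sketched as a filter-and-renormalize pass over weighted candidate states. This is a simplified illustration only: the `weightedState` type and its opaque observation string are hypothetical stand-ins for real game states and info sets, renormalizing the probabilities is an assumption, and the forward propagation and determinization expansion are omitted.

```go
package main

import "fmt"

// weightedState is a hypothetical stand-in for one determinized game state
// together with the probability the observing player assigns to it.
type weightedState struct {
	observation string // what the player would observe in this state
	prob        float64
}

// filterAndRenormalize keeps only the states whose observation matches what
// was actually observed and rescales their probabilities to sum to 1.
// It returns nil if no state is consistent with the observation.
func filterAndRenormalize(states []weightedState, observed string) []weightedState {
	var kept []weightedState
	total := 0.0
	for _, s := range states {
		if s.observation == observed {
			kept = append(kept, s)
			total += s.prob
		}
	}
	if total == 0 {
		return nil
	}
	for i := range kept {
		kept[i].prob /= total
	}
	return kept
}

func main() {
	states := []weightedState{
		{"opponent drew", 0.5},
		{"opponent played Skip", 0.25},
		{"opponent drew", 0.25},
	}
	for _, s := range filterAndRenormalize(states, "opponent drew") {
		fmt.Printf("%s %.3f\n", s.observation, s.prob)
	}
}
```

Here the two states consistent with the observation survive, and their probabilities keep the original 2:1 ratio after rescaling.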

type Deal ¶

type Deal struct {
	DrawPile cards.Stack
	P0Deal   cards.Set
	P1Deal   cards.Set
}

func NewRandomDeal ¶

func NewRandomDeal(deck []cards.Card, cardsPerPlayer int) Deal

func NewRandomDealWithConstraints ¶

func NewRandomDealWithConstraints(drawPile cards.Stack, p1Hand cards.Set) Deal

type GameNode ¶

type GameNode struct {
	// contains filtered or unexported fields
}

GameNode implements cfr.GameTreeNode for Exploding Kittens. GameNode represents a state of play in the extensive-form game tree.

func NewGame ¶

func NewGame(drawPile cards.Stack, p0Deal, p1Deal cards.Set) *GameNode

NewGame creates a root node for a new game with the given draw pile and hands dealt to each player.

func (*GameNode) Clone ¶

func (gn *GameNode) Clone() *GameNode

func (*GameNode) CloneWithState ¶

func (gn *GameNode) CloneWithState(state gamestate.GameState) *GameNode

func (*GameNode) Close ¶

func (gn *GameNode) Close()

Close implements cfr.GameTreeNode.

func (*GameNode) Depth ¶

func (gn *GameNode) Depth() int

func (*GameNode) GetChild ¶

func (gn *GameNode) GetChild(i int) cfr.GameTreeNode

GetChild implements cfr.GameTreeNode.

func (*GameNode) GetChildProbability ¶

func (gn *GameNode) GetChildProbability(i int) float64

GetChildProbability implements cfr.GameTreeNode.

func (*GameNode) GetDrawPile ¶

func (gn *GameNode) GetDrawPile() cards.Stack

func (*GameNode) GetHistory ¶

func (gn *GameNode) GetHistory() gamestate.History

func (*GameNode) GetInfoSet ¶

func (gn *GameNode) GetInfoSet(player gamestate.Player) gamestate.InfoSet

func (*GameNode) GetState ¶

func (gn *GameNode) GetState() gamestate.GameState

func (*GameNode) InfoSet ¶

func (gn *GameNode) InfoSet(player int) cfr.InfoSet

InfoSet implements cfr.GameTreeNode.

func (*GameNode) InfoSetKey ¶

func (gn *GameNode) InfoSetKey(player int) []byte

func (*GameNode) LastAction ¶

func (gn *GameNode) LastAction() gamestate.Action

func (*GameNode) NumChildren ¶

func (gn *GameNode) NumChildren() int

func (*GameNode) Parent ¶

func (gn *GameNode) Parent() cfr.GameTreeNode

func (*GameNode) Player ¶

func (gn *GameNode) Player() int

Player implements cfr.GameTreeNode.

func (*GameNode) SampleChild ¶

func (gn *GameNode) SampleChild() (cfr.GameTreeNode, float64)

SampleChild implements cfr.GameTreeNode.

func (*GameNode) String ¶

func (gn *GameNode) String() string

String implements fmt.Stringer.

func (*GameNode) Type ¶

func (gn *GameNode) Type() cfr.NodeType

Type implements cfr.GameTreeNode.

func (*GameNode) Utility ¶

func (gn *GameNode) Utility(player int) float64

Utility implements cfr.GameTreeNode.
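
The NumChildren and GetChild methods are enough to walk a game tree recursively. The following sketch counts nodes up to a depth limit. It is written against a hypothetical `treeNode` interface that copies only those two method shapes instead of importing the real cfr.GameTreeNode, whose full contract (for example, when Close must be called) is not covered here; `toyNode` and `countNodes` are likewise illustrative names.

```go
package main

import "fmt"

// treeNode is a hypothetical, cut-down interface with just the two methods
// needed for traversal. It is not the real cfr.GameTreeNode.
type treeNode interface {
	NumChildren() int
	GetChild(i int) treeNode
}

// toyNode is a tiny in-memory tree used only to exercise countNodes.
type toyNode struct {
	children []*toyNode
}

func (n *toyNode) NumChildren() int        { return len(n.children) }
func (n *toyNode) GetChild(i int) treeNode { return n.children[i] }

// countNodes returns the number of nodes reachable from n within maxDepth
// levels below it. A negative maxDepth means no limit.
func countNodes(n treeNode, maxDepth int) int {
	count := 1 // the node itself
	if maxDepth == 0 {
		return count
	}
	for i := 0; i < n.NumChildren(); i++ {
		count += countNodes(n.GetChild(i), maxDepth-1)
	}
	return count
}

func main() {
	root := &toyNode{children: []*toyNode{
		{children: []*toyNode{{}, {}}},
		{},
	}}
	fmt.Println(countNodes(root, -1)) // 5
	fmt.Println(countNodes(root, 1))  // 3
}
```

A depth limit matters for a game like this one, where the README notes that the depth of the tree makes exhaustive traversal impractical.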

type InfoSetWithAvailableActions ¶

type InfoSetWithAvailableActions struct {
	gamestate.InfoSet
	AvailableActions []gamestate.Action
}

func (*InfoSetWithAvailableActions) MarshalBinary ¶

func (is *InfoSetWithAvailableActions) MarshalBinary() ([]byte, error)

func (*InfoSetWithAvailableActions) UnmarshalBinary ¶

func (is *InfoSetWithAvailableActions) UnmarshalBinary(buf []byte) error
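
The MarshalBinary and UnmarshalBinary method pair has the same shape as the standard library's encoding.BinaryMarshaler and encoding.BinaryUnmarshaler interfaces. The sketch below shows a round trip through such a pair for a hypothetical `toyInfoSet` type with a made-up wire layout (one player byte, one length byte, then the action bytes); the actual encoding used by InfoSetWithAvailableActions is not documented on this page.

```go
package main

import (
	"encoding"
	"errors"
	"fmt"
)

// toyInfoSet is a hypothetical type used only to demonstrate the interfaces.
type toyInfoSet struct {
	Player  uint8
	Actions []uint8
}

// Compile-time checks that the method set satisfies both interfaces.
var (
	_ encoding.BinaryMarshaler   = (*toyInfoSet)(nil)
	_ encoding.BinaryUnmarshaler = (*toyInfoSet)(nil)
)

// MarshalBinary writes the player, the number of actions, then the actions.
func (t *toyInfoSet) MarshalBinary() ([]byte, error) {
	if len(t.Actions) > 255 {
		return nil, errors.New("too many actions for a one-byte length")
	}
	buf := make([]byte, 0, 2+len(t.Actions))
	buf = append(buf, t.Player, uint8(len(t.Actions)))
	return append(buf, t.Actions...), nil
}

// UnmarshalBinary validates the length prefix and copies the actions out.
func (t *toyInfoSet) UnmarshalBinary(buf []byte) error {
	if len(buf) < 2 || len(buf) != 2+int(buf[1]) {
		return errors.New("malformed buffer")
	}
	t.Player = buf[0]
	t.Actions = append([]uint8(nil), buf[2:]...)
	return nil
}

func main() {
	in := &toyInfoSet{Player: 1, Actions: []uint8{3, 7}}
	data, err := in.MarshalBinary()
	if err != nil {
		panic(err)
	}
	var out toyInfoSet
	if err := out.UnmarshalBinary(data); err != nil {
		panic(err)
	}
	fmt.Println(out.Player, out.Actions) // 1 [3 7]
}
```

Types that satisfy these interfaces can be written to and read from disk or a key-value store without reflection-based encoders.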

Directories ¶

Path	Synopsis
cards
cmd
alphacats	This version of alphacats uses one-sided IS-MCTS with a NN to guide search, in a PSRO framework.
alphacats_bootstrap	Generate training samples for PSRO network bootstrap by playing games with Smooth UCT search.
alphacats_mcts	This version of alphacats uses Smooth UCT MCTS only.
count_belief_states
count_game_nodes	Script to estimate the number of nodes touched in an external sampling run.
play_model
gamestate
matrixgame
model	Package model implements an LSTM-based network model for use in MCTS.
internal/npyio	Package npyio is a fork of github.com/sbinet/npyio that is hard-coded for []float32s to avoid reflection.
internal/tffloats	Package tffloats constructs *tf.Tensors from []float32 slices, avoiding reflection.