A Toolkit for Reinforcement Learning in Card Games
RLCard is a toolkit for Reinforcement Learning (RL) in card games. It supports multiple card environments with easy-to-use interfaces. The goal of RLCard is to bridge reinforcement learning and imperfect information games. RLCard is developed by DATA Lab at Texas A&M University and community contributors.
Kuhn & Leduc Hold'em: 3-player variants
- Kuhn poker is a poker game invented in 1950. It involves bluffing, inducing bluffs, and value betting.
- A 3-player variant is used for the experiments.
- The deck has 4 cards of the same suit: K, Q, J, T.
- Each player is dealt 1 private card.
- An ante of 1 chip is paid before the cards are dealt.
- There is one betting round with a 1-bet cap.
- Official Website: http://www.rlcard.org
- Tutorial in Jupyter Notebook: https://github.com/datamllab/rlcard-tutorial
- Paper: https://arxiv.org/abs/1910.04376
- GUI: RLCard-Showdown
- Resources: Awesome-Game-AI
News:
- We have released RLCard-Showdown, GUI demo for RLCard. Please check out here!
- Jupyter Notebook tutorial available! We have added some examples in R that call the Python interfaces of RLCard with reticulate. See here
- Thanks to @Clarit7 for adding support for different numbers of players in Blackjack. We welcome contributions that gradually make the games more configurable. See here for more details.
- Thanks for the contribution of @Clarit7 for the Blackjack and Limit Hold'em human interface.
- Now RLCard supports environment local seeding and multiprocessing. Thanks for the testing scripts provided by @weepingwillowben.
- Human interface of NoLimit Holdem available. The action space of NoLimit Holdem has been abstracted. Thanks for the contribution of @AdrianP-.
- New game Gin Rummy and human GUI available. Thanks for the contribution of @billh0420.
- PyTorch implementation available. Thanks for the contribution of @mjudell.
Cite this work
If you find this repo useful, you may cite:
Installation
Make sure that you have Python 3.5+ and pip installed. We recommend installing the latest version of `rlcard` with `pip`:
Alternatively, you can install the latest stable version with:
The default installation only includes the card environments. To use the TensorFlow implementations of the example algorithms, install the supported version of TensorFlow with:
To try PyTorch implementations, please run:
If you meet any problems when installing PyTorch with the command above, you may follow the instructions on PyTorch official website to manually install PyTorch.
We also provide a conda installation method:
The conda installation only provides the card environments. You need to install TensorFlow or PyTorch manually as needed.
Examples
Please refer to examples/. A short example is as below.
We also recommend the following toy examples in Python.
R examples can be found here.
Demo
Run `examples/leduc_holdem_human.py` to play with the pre-trained Leduc Hold'em model. Leduc Hold'em is a simplified version of Texas Hold'em. Rules can be found here.
We also provide a GUI for easy debugging. Please check here. Some demos:
Available Environments
We provide a complexity estimation for the games on several aspects. InfoSet Number: the number of information sets; InfoSet Size: the average number of states in a single information set; Action Size: the size of the action space. Name: the name that should be passed to `rlcard.make` to create the game environment. We also provide links to the documentation and the random example.
Game | InfoSet Number | InfoSet Size | Action Size | Name | Usage |
---|---|---|---|---|---|
Blackjack (wiki, baike) | 10^3 | 10^1 | 10^0 | blackjack | doc, example |
Leduc Hold’em (paper) | 10^2 | 10^2 | 10^0 | leduc-holdem | doc, example |
Limit Texas Hold'em (wiki, baike) | 10^14 | 10^3 | 10^0 | limit-holdem | doc, example |
Dou Dizhu (wiki, baike) | 10^53 ~ 10^83 | 10^23 | 10^4 | doudizhu | doc, example |
Simple Dou Dizhu (wiki, baike) | - | - | - | simple-doudizhu | doc, example |
Mahjong (wiki, baike) | 10^121 | 10^48 | 10^2 | mahjong | doc, example |
No-limit Texas Hold'em (wiki, baike) | 10^162 | 10^3 | 10^4 | no-limit-holdem | doc, example |
UNO (wiki, baike) | 10^163 | 10^10 | 10^1 | uno | doc, example |
Gin Rummy (wiki, baike) | 10^52 | - | - | gin-rummy | doc, example |
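The InfoSet Number and InfoSet Size columns can be illustrated on a much smaller case. The sketch below is not one of the games in the table: it assumes a toy deal in which two players each receive one private card from a three-card deck, and it counts information sets from player 0's point of view only.

```python
from collections import defaultdict
from itertools import permutations

# Toy deal (an assumption for illustration, not a game from the table):
# two players each receive one private card from a three-card deck.
deck = ["J", "Q", "K"]
states = list(permutations(deck, 2))  # (player 0 card, player 1 card)

# Player 0 cannot distinguish states that share the same own card, so each
# information set groups the states by player 0's card.
info_sets = defaultdict(list)
for own_card, opponent_card in states:
    info_sets[own_card].append((own_card, opponent_card))

# InfoSet Number: how many information sets there are.
infoset_number = len(info_sets)
# InfoSet Size: the average number of states in a single information set.
infoset_size = sum(len(group) for group in info_sets.values()) / infoset_number

print(infoset_number, infoset_size)
```

Here the six possible deals collapse into three information sets of two states each. The table entries are order-of-magnitude estimates of the same two quantities for full games.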
API Cheat Sheet
How to create an environment
You can use the following interface to make an environment. You may optionally specify some configurations with a dictionary.
- env = rlcard.make(env_id, config={}): Make an environment. `env_id` is a string naming an environment; `config` is a dictionary that specifies some environment configurations, which are as follows.
  - `seed`: Default `None`. Set an environment-local random seed for reproducing the results.
  - `env_num`: Default `1`. It specifies how many environments run in parallel. If the number is larger than 1, the tasks will be assigned to multiple processes for acceleration.
  - `allow_step_back`: Default `False`. `True` if allowing the `step_back` function to traverse backward in the tree.
  - `allow_raw_data`: Default `False`. `True` if allowing raw data in the `state`.
  - `single_agent_mode`: Default `False`. `True` if using single-agent mode, i.e., a Gym-style interface with other players as pretrained/rule models.
  - `active_player`: Default `0`. If `single_agent_mode` is `True`, `active_player` specifies which player to operate on in single-agent mode.
  - `record_action`: Default `False`. If `True`, a field `action_record` will be in the `state` to record the historical actions. This may be used for human-agent play.
  - Game-specific configurations: These fields start with `game_`. Currently, we only support `game_player_num` in Blackjack.
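As a rough illustration of the configuration fields above, the sketch below collects the documented defaults in one dictionary and merges user overrides into it. `DEFAULT_CONFIG` and `build_config` are hypothetical names introduced here and are not part of RLCard; only the field names and default values come from the list above.

```python
# Defaults as documented in the list above. DEFAULT_CONFIG and build_config
# are hypothetical names for this sketch, not RLCard identifiers.
DEFAULT_CONFIG = {
    "seed": None,
    "env_num": 1,
    "allow_step_back": False,
    "allow_raw_data": False,
    "single_agent_mode": False,
    "active_player": 0,
    "record_action": False,
}


def build_config(**overrides):
    """Merge overrides into the documented defaults.

    A key is accepted if it is a documented field or starts with "game_".
    """
    for key in overrides:
        if key not in DEFAULT_CONFIG and not key.startswith("game_"):
            raise KeyError("unknown config field: {}".format(key))
    config = dict(DEFAULT_CONFIG)
    config.update(overrides)
    return config


# A setup suited to tree traversal: fixed seed and step_back enabled.
cfr_config = build_config(seed=0, allow_step_back=True)
# Blackjack with two players, using the game-specific field.
blackjack_config = build_config(game_player_num=2)

print(cfr_config)
```

The resulting dictionary would then be passed as the `config` argument, for example `rlcard.make('leduc-holdem', config=cfr_config)`. Rejecting unknown keys early is a design choice of this sketch that catches misspelled field names before the environment is created.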
Once the environment is made, we can access some information about the game.
- env.action_num: The number of actions.
- env.player_num: The number of players.
- env.state_space: The state space of the observations.
- env.timestep: The number of timesteps stepped by the environment.
What is state in RLCard
State is a Python dictionary. It will always have the observation `state['obs']` and the legal actions `state['legal_actions']`. If `allow_raw_data` is `True`, the state will also have the raw observation `state['raw_obs']` and the raw legal actions `state['raw_legal_actions']`.
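To make the shape of the state dictionary concrete, here is a minimal sketch. The values are made up for illustration, and it assumes that `state['legal_actions']` is a collection of integer action IDs; the real contents are game-specific. `choose_random_legal_action` and `has_raw_data` are hypothetical helpers.

```python
import random

# Hypothetical state with the four keys described above. The values are
# invented for illustration; real observations depend on the game.
state = {
    "obs": [0, 1, 0, 0, 1, 0],
    "legal_actions": [0, 2, 3],
    "raw_obs": {"hand": "SJ", "public_card": None},
    "raw_legal_actions": ["call", "fold", "check"],
}


def choose_random_legal_action(state, rng):
    """Pick one action ID uniformly from the legal actions in the state."""
    return rng.choice(list(state["legal_actions"]))


def has_raw_data(state):
    """Raw fields are only present when allow_raw_data was enabled."""
    return "raw_obs" in state and "raw_legal_actions" in state


rng = random.Random(0)
action = choose_random_legal_action(state, rng)
print(action, has_raw_data(state))
```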
Basic interfaces
The following interfaces provide basic usage. They are easy to use but make assumptions about the agent: the agent must follow the agent template.
- env.set_agents(agents): `agents` is a list of `Agent` objects. The length of the list should equal the number of players in the game.
- env.run(is_training=False): Run a complete game and return trajectories and payoffs. The function can be used after `set_agents` is called. If `is_training` is `True`, it will use the `step` function in the agent to play the game. If `is_training` is `False`, `eval_step` will be called instead.
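The two agent methods mentioned above can be sketched as follows. `UniformRandomAgent` is a hypothetical class written for this illustration, not an RLCard agent. It assumes that `state['legal_actions']` holds integer action IDs and that returning a single action from both methods is acceptable; the agent template defines the exact return values expected by a given version.

```python
import random


class UniformRandomAgent:
    """Hypothetical agent exposing the two methods that env.run relies on."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)

    def step(self, state):
        """Used when is_training is True: choose an action while training."""
        return self.rng.choice(list(state["legal_actions"]))

    def eval_step(self, state):
        """Used when is_training is False: choose an action for evaluation.

        Returning only the action is an assumption of this sketch.
        """
        return self.step(state)


agent = UniformRandomAgent(seed=0)
action = agent.eval_step({"obs": [0, 1], "legal_actions": [1, 3]})
print(action)
```

A learning agent would typically explore in `step` and act greedily in `eval_step`; the random agent has nothing to learn, so both methods behave the same.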
Advanced interfaces
For advanced usage, the following interfaces allow flexible operations on the game tree. These interfaces do not make any assumptions about the agent.
- env.reset(): Initialize a game. Return the state and the first player ID.
- env.step(action, raw_action=False): Take one step in the environment. `action` can be a raw action or an integer; `raw_action` should be `True` if the action is a raw action (string).
- env.step_back(): Available only when `allow_step_back` is `True`. Take one step backward. This can be used for algorithms that operate on the game tree, such as CFR.
- env.is_over(): Return `True` if the current game is over. Otherwise, return `False`.
- env.get_player_id(): Return the player ID of the current player.
- env.get_state(player_id): Return the state that corresponds to `player_id`.
- env.get_payoffs(): At the end of the game, return a list of payoffs for all the players.
- env.get_perfect_information(): (Currently supported only for some of the games) Obtain the perfect information at the current state.
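The interfaces above can be combined into a game loop and a simple tree traversal. So that the sketch runs on its own, it uses `ToyEnv`, a hypothetical two-player stand-in that mimics the method names listed above; it is not an RLCard environment. The sketch also assumes that `step` returns the next state and the next player ID, mirroring what `reset` is described as returning.

```python
class ToyEnv:
    """Hypothetical stand-in exposing the interfaces listed above.

    Each of two players picks 0 or 1 once. Player 0 wins if the picks match.
    """

    def __init__(self):
        self.history = []

    def reset(self):
        self.history = []
        return self.get_state(0), 0

    def step(self, action):
        self.history.append(action)
        player_id = self.get_player_id()
        return self.get_state(player_id), player_id

    def step_back(self):
        if not self.history:
            return False
        self.history.pop()
        return True

    def is_over(self):
        return len(self.history) == 2

    def get_player_id(self):
        return len(self.history) % 2

    def get_state(self, player_id):
        # Players do not see each other's pick, so the observation is empty.
        return {"obs": [], "legal_actions": [0, 1]}

    def get_payoffs(self):
        first, second = self.history
        return [1, -1] if first == second else [-1, 1]


def play_one_game(env, policy):
    """Drive an env from reset to the end and return the payoffs."""
    state, player_id = env.reset()
    while not env.is_over():
        state, player_id = env.step(policy(state, player_id))
    return env.get_payoffs()


def enumerate_payoffs(env):
    """Visit every terminal node by stepping forward and then back."""
    if env.is_over():
        return [env.get_payoffs()]
    results = []
    for action in env.get_state(env.get_player_id())["legal_actions"]:
        env.step(action)
        results.extend(enumerate_payoffs(env))
        env.step_back()
    return results


env = ToyEnv()
payoffs = play_one_game(env, lambda state, player_id: 0)
env.reset()
all_payoffs = enumerate_payoffs(env)
print(payoffs, all_payoffs)
```

`enumerate_payoffs` shows why `step_back` matters for tree-based algorithms such as CFR: the traversal explores each action from the same node without recreating the environment.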
Running with multiple processes
RLCard now supports acceleration with multiple processes. Simply change `env_num` when making the environment to indicate how many processes should be used. Currently, only the `run()` function is supported with multiple processes. An example is DQN on Blackjack.
Library Structure
The purposes of the main modules are listed as below:
- /examples: Examples of using RLCard.
- /docs: Documentation of RLCard.
- /tests: Testing scripts for RLCard.
- /rlcard/agents: Reinforcement learning algorithms and human agents.
- /rlcard/envs: Environment wrappers (state representation, action encoding etc.)
- /rlcard/games: Various game engines.
- /rlcard/models: Model zoo including pre-trained models and rule models.
Evaluation
The performance is measured by winning rates through tournaments. Example outputs are as follows:
For your information, there is a nice online evaluation platform pokerwars that could be connected with RLCard with some modifications.
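The tournament-style measurement described above can be sketched as a small aggregation over per-game payoffs. `summarize_tournament` is a hypothetical helper, the game results are invented for illustration, and counting a positive payoff as a win is an assumption of this sketch.

```python
def summarize_tournament(payoffs_per_game):
    """Return (average payoffs, win rates), each with one entry per player.

    payoffs_per_game holds one payoff list per game, in the shape returned
    by env.get_payoffs(). A positive payoff counts as a win, which is an
    assumption of this sketch.
    """
    num_games = len(payoffs_per_game)
    num_players = len(payoffs_per_game[0])
    average_payoffs = [
        sum(game[p] for game in payoffs_per_game) / num_games
        for p in range(num_players)
    ]
    win_rates = [
        sum(1 for game in payoffs_per_game if game[p] > 0) / num_games
        for p in range(num_players)
    ]
    return average_payoffs, win_rates


# Hypothetical results of four two-player games.
games = [[1, -1], [-1, 1], [1, -1], [1, -1]]
average_payoffs, win_rates = summarize_tournament(games)
print(average_payoffs, win_rates)
```

In practice the payoff lists would come from running many evaluation games, for example with `env.run(is_training=False)`.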
More Documents
For more documentation, please refer to the Documents for general introductions. API documents are available at our website.
Contributing
Contributions to this project are greatly appreciated! Please create an issue for feedback or bugs. If you want to contribute code, please refer to the Contributing Guide.
Acknowledgements
We would like to thank JJ World Network Technology Co.,LTD for the generous support and all the contributions from the community contributors.
Fall 2020
Public Reports
Public Video Reports
Other Titles
- Tic Tac Toe Solver
- Beating the house in Blackjack
- Effect of Noise on Learning a Planar Pushing Task using SAC
- ResQNet: Finding Optimal Fire Rescue Routes
- COVID Chatbot
- Regularized Follow-the-Leader in Online MDP for Efficient Topographical Mapping
- Learning POMDP model parameters from missing observations
- Reinforcement Learning for No-Limit Texas Hold ‘Em with Bomb Pots
- Identifying Optimal Locations for Satellite Image Capture
- Diet Conscious Meal Planner
- Mitigating Risk of Public Transit during COVID-19
- Predicting the Match: Using Bayesian Networks to Predict Professional Tennis Outcomes
- Efficient Single-Agent Capture of a Moving Target
- Q-Learning Applied to the Taxi Problem
- Settlers of CATAN
- Autonomous Snake
- Q-Learning for Pre-Flop Texas Hold ‘Em
- Deep RL for Atari Games
- Simulating a D&D Encounter with Q-Learning
- Deep RL for Automated Stock Trading
- Dating Under Uncertainty
- Retinal Implant Electrical Stimulation via RL
- Batch Offline RL in the Healthcare Setting
- Computer Caddy – Using RL to advice Golfers’ Club Selection
- RL for Fischer Random Chess
- Timely Decision Making with Probability Path Model
- Satellite-Imagery Based Poverty Level Evaluation System in Mexico with Deep RL Approach
- Pokemon Showdown
- Deep RL for Space Invaders
- Learning to Run
- RL-Based Control of Policy Selection in Near-Accident Scenarios
- Model Predictive Control for an Aircraft Autopilot
- Finding Inharmonic Timbres Locally Consonant with Arbitrary Scales
- Escape Roomba
- Driving in Traffic
- Playing Snake
- The 2020 FLatland Challenge
- Elevator Scheduling with Neural Q-Learning
- Optimizing Immunotherapy Treatment using RL
- Modeling Leduc Hold ‘Em Poker
- Auto Trading System Using Q-Learning
- Energy System Modeling
- Optimizing Fox in the Forest through RL
- Learning Gin Rummy
- Car Racing with Deep RL
- Sequential Decision Making for Mineral Exploration
- Advanced Driver Assistance Systems
- Learning to Play Stargunner with Deep Q-Networks
- A Fourth-and-Goal Football Recommender System
- Algorithms for Motion Planning
- Playing Farkle
- Connect 4: A Survey of Different RL Techniques to Destroy Your Pride
- Decision Making in the word game, Codenames
- Reinforcement Learning Approaches for An Adversarial Snake Agent
- An Attention-Based, Reinforcement-Learned Heuristic Solver for the Double Travelling Salesman Problem With Multiple Stacks
- Uncertainty Aware Model-Based Policy Optimization
- Navigating the Four-Way Stop Autonomously
- Ground Water Remediation Using Sequential Decision Making
- Final Project: Satellite Collision Avoidance
- Q Learning for 4th Down Decision Making in the NFL
- Q-Learning for the Game of Nim: Does The Agent Learn a Combinatorially Optimal Strategy On Its Own?
- Contextual Bandit Algorithms in Recommender Systems
- A Comparison of Reinforcement Learning Methods for Autonomous Navigation
- Reinforcement Learning for Behavior Planning in Intersections for Autonomous Vehicles
- Reinforcement Learning for Pacman Capture the Flag
- Comparing Different Optimization Techniques for Learning Continuous Control with Neural Networks
- Autonomous Exploration in Subterranean Environments
- Improving Image Denoising through Decision Making
- Using MDPs to Optimally Allocate Funds
- Explanations Meet Decision Theory
- Learning Policies for Adaptive LiDAR Scanning with POMDPs
- Cautious Markov Games: A New Framework for Human-Robot Interaction
- Selecting a multibasis community structure for the connectome
- Reinforcement Learning Techniques for Long-Term Trading and Portfolio Management
- Optimal Asset Allocation with Markov Decision Processes
- Symbolic Regression with Bayesian Networks
- Scheduling battery charging using deep reinforcement learning
- Online Knapsack Problem Using Reinforcement Learning
- Policy gradient optimization for
- Resource Allocation for Wildfire Prevention
- Using Reinforcement Learning to Play Omaha
- Fraud Detection for Mobile Payments using Bayesian Network and CNN
- Neuro-Adaptive Artificial Neural Networks for Reinforcement Learning
- AI Agent for Qwirkle
- Learning Optimal Wildfire Suppression Policies With Reinforcement Learning
- Bid Smart with Uncertainty: An Autonomous Bidder
- AA228/CS238 Final Report
- Modeling Identification of Approaching Aircraft as a POMDP
- Short-Term Trading Policies for Bitcoin Cryptocurrency Using Q-learning
- Reinforcement Learning of a Battery Power Schedule for a Short-Haul Hybrid-Electric Aircraft Mission
- Autonomous Helicopter Control for Rocket Recovery
- Reinforcement Learning Strategies Solving Game Gomoku
- A Wildfire Evacuation Recommendation System
- Battleship with Alogrithm
- Developing an Optimal Structure for Breast Cancer Single Cell Classification
- Utilizing Deep Q Networks to Optimally Execute Stock Market Entrance and Exit Strategies
- Online Planning for a Grid World POMDP
- Contingency Manager Agent for Safe UAV Autonomous Operations
- Solving Mastermind as a POMDP
- Simulated Drone Flight with Advantaged Actor Critic Reinforced Learning in 2 and 3 Dimensions
- Solving Queueing Problem Using Monte Carlo Tree Search
- Bayesian Structure Learning on NFL play data
- Multi-Agent Rendezvous Using Reinforcement Learning
- Dynamic Portfolio Optimization
- Fairness and Efficiency in Multi-Portfolio Liquidation: An Multiple-Agent Deep Reinforcement Learning Approach
- Evaluating Poker Hands
- Saving Artificial Intelligence Clinician
- Evaluation of online trajectory planning methods for autonomous vehicles
- Solving Leduc Hold’em Counterfactual Regret Minimization
- From aerospace guidance to COVID-19: Tutorial for the application of the Kalman filter to track COVID-19
- A Reinforcement Learning Algorithm for Recycling Plants
- Monte Carlo Tree Search with Repetitive Self-Play for Tic-Tac-Toe
- Developing a Decision Making Agent to Play RISK
Fall 2019
Public Reports
Other Titles
- Linear Array Target Motion Analysis Using POMDPs
- Speed or Safety?: Calculating Urban Walking Routes Based on Probability of Crime and Foot Traffic
- AlphaGomoku
- Modelling Uncertainty in Dynamic Real-time Multimodal Routing Problems
- Reinforcement Learning for Portfolio Allocation
- Preparation of Papers for AIAA Technical Conferences
- Autonomous Racing
- Deep Learning Enabled Uncorrelated Space Observation Association
- Landing a Lunar Spacecraft with Deep Q-Learning
- POkerMDP: Decision Making for Poker
- 1V1 Leduc Hold’em Bot
- Political Influencers: Using Election Finance Data to Analyze Campaign Success via Bayesian Networks
- Developing AI Policies for Street Fighter via Q-learning
- Impact of Market Technical Indicators On Future Stock Prices Using Reinforcement Learning
- Allocation of Hearts for Transplant as an MDP
- Multi-Agent Reinforcement Learning in a 2D Environment for Transportation Optimization
- Planning under Uncertainty for Discrete Robotic Navigation with Partial Observability
- Deep Reinforcement Learning Applied to Mid-Frequency Trading
- Application of Subspace Identification for Classification of Neural-Activity during Decision-Making
- Using Markov-Decision Processes to Design Betting Strategies for the NFL
- Maneuvering Characteristics Control Systems using Discrete-Time MDPs
- MDP Based Motion Planning In Unsignaled Intersections
- Competitive Blackjack Using Reinforcement Learning
- Modelling Pedestrian Vehicle Interaction at Stop Sign using Markov Decision Process
- Jeopardy! Wagering Under Uncertainty
- Love Letters Under Uncertainty
- Playing The Resistance with a POMDP
- Robotic Simultaneous Localization and Mapping with 2D Laser Scan
- Mars Rover: Navigating an Uncertain World
- Modeling Blood Donations Over Time as a POMDP
- Reinforcement Learning for Control on OpenAI Gym Environments
- Playing Connect 4 using Reinforcement Learning
- Evaluation of Reduced Algorithmic Complexity for Grasping Tasks by Using a Novel Underactuated Curling Grasper with Reinforcement Learning
- Optimizing Strategies for Settlers of Catan
- Exploring Search Algorithms for Klondike Solitaire
- A Sparse Sampling Control Strategy for Risk Minimization during Stretchable Sensor Network Deployment
- Computing Strategies for the 7 Wonders Board Game
- POMDP modeling of stochastic Tetris
- Solving a Maze with Doors and Hidden Tigers
- Playing “Dominion” with Deep Reinforcement Learning
- Delivery Vehicle Navigation in Crowd with Reinforcement Learning
- Capturing Uncertainty in a Multi-Modal Setting With JRMOT: A Real-Time 2D-3D Multi-Object Tracker
- Decentralized Satellite Network Communication
- Seismic Network Planning
- Reinforcement Learning for PaoDeKuai, A Card Game
- Training A Bai Fen Agent with Reinforcement Learning
- Decision Making for Launch Cancellation Based Upon Storm Conditions
- Optimizing for the Competitive Edge: Modeling Sequential Binary Decision Making for Two Competing Firms
- Datacenter Equipment Maintenance Optimization
- To Heat Or Not To Heat: Reinforcement Learning for Optimal Residential Water Heater Control
- Learning to Play Snake Game with Deep Reinforcement Learning
- Optimal Traffic Light Control for Efficient City Transportation
- Modeling NBA Point Spread Betting as an MDP
- Solving a car racing game using Reinforcement Learning
- Is Uncertainty Really Harmful: Solving Partially Observable Lunar Lander Problem with Deep Reinforcement Learning
- Autonomous Navigation of an RC Boat Under a POMDP Framework
- Evaluating the Bayes-Adaptive MDP Framework on Stochastic Gridworld Environments
- Value Iteration with Enhanced Action Space for Path Planning
- The Medical Triage Problem: Improving Hospitals’ Admission Decisions
- Optimal Route Selection for Riders in Toronto
- Model Free Learning for Optimal Power Grid Management
- Wasting Less Time on the Road Using MDPs
- Learning User Preferences to Produce Sequential Movie Recommendations
- A Comparison of Learning Based Control Methods for Optimal Trajectory Tracking with a Quadrotor
- Artificial Pancreas: Q-Learning Based Control for Closed-loop Insulin Delivery Systems
- Navigating in an Uncertain World
- Teaching an Autonomous Car to Drive through an Intersection with POMDPs
- Atomic structure minimization using simulated annealing with a MCTS temperature scheme
- AI Game Player for 2048
- Deep Q-Learning with GARCH for Stock Volatility Trading
- Learning to Become President
- Solving GNSS Integrity-Based Path Planning in Urban Environments via a POMDP Framework
- Reservoir operation under climate uncertainty
- Reinforcement Learning for Maze Solving
- Using Reinforcement Learning to Find Basins of Attraction
- Planning for Asteroid Prospecting Missions with POMDPs
- Human-Aware Robot Motion Planner
- Determining Federal Funds Rate Changes – Hike / Cut / Hold – Under Economic Uncertainty
- Simulating Work-Life Balance with POMDP
- Solving 2048 as a Markov Decision Process
- Accounting for Delay in Dynamic Resource Allocation for Wildfire Suppression – a POMDP Approach
- Daily Allocation of Assets with Distinct Risk Profiles using Reinforcement Learning
- LocoNets for Deep Reinforcement Learning
- Exploring a full joint observability game with Markov decision processes
- Deep Bayesian Active Learning for Multiple Correct Outputs
- Convolutionally Reducing Markov Decision Processes
- Robust Decision Making Agent for Frozen Lake
- Tic-Tac-Toe How Many In A Row?
- Turbomachinery Optimization Under Uncertainty
- Devising a Policy for Liar’s Dice Using Model Free Reinforcement Learning
- Political Compromises: an Iterative Game of Prisoner’s Dilemma
- Optimal Home Energy System Management using Reinforcement Learning
- Drone Tracking in a 2-dimentional Grid using Particle Filter Algorithm
- Deep Reinforcement Learning for Traffic Signal Control
- A Deep Reinforcement Learning Approach to Recommender Systems
- FlyCroTugs – Collaborative Object Manipulation Using Flying Tugs
- Local Approximation Q-Learning for a Simplified Satellite Reconnaissance Mission
- Developing Policies for Blackjack Using Reinforcement Learning
- Applying Q-learning to the Homicidal Chauffeur Problem
- Optimal Satellite Detumbling through Reinforcement Learning
- Active Preference-Based Gaussian Process Regression for Reward Learning and Optimization
- A Comparative Study on Heart Disease Prediction
- Robot Navigation with Human Intent Uncertainty
- Conquering the Queen of Spades: A Hearts Agent
- Using Markov Decision Processes to Predict Soccer Player Market Value
- Effectiveness of Recurrent Network for Partially-Observable MDPs
- Capture The Flag
- Predicting uncertainty
- Optimal Asset Allocation with Markov Decision Processes
- Nets on Nets: Using Bayesian Networks to Predict Supplier Links in Economic Networks
- Playing 2048 With Reinforcement Learning
- Trading strategies using deep reinforcement learning with news and time series stock data
- Modeling Contract Bridge as a POMDP
- Solving Rubik’s Cubes Using Milestones
- Playing 2048 with Deep Reinforcement Learning
- An Approximate Dynamic Programming Minimum-Time Guidance Policy for High Altitude Balloons
- Identifying Bots on Twitter
- Approaches to Model-Free Blackjack
- Jumping Robot Simulator: An Exploration of Methods to Teach a Bio-Inspired Frog Robot to Navigate
- Air Traffic Control Tower Policy for Terminal Environment Operations
- Managing a Prediction Market Portfolio
- Applying Partially Observable Markov Decision Making Processes to a Product Recommendation System
- Self-Driving Under Uncertainty
- Reinforcement Learning for QWOP
- Modeling Macroeconomic Phenomena with Multi-Agent Reinforcement Learning
- Optimal Learning Policy via POMDP planning
- AI Guidance for Thermalling in a Glider
- Decision Making For Profit: Portfolio Management using Deep Reinforcement Learning
- Self-play Reinforcement Learning for Open-face Chinese Poker
- Feature Constrained Graph Generation with a Modified Multi-Kernel Kronecker Model
- Sensor Fusion of IMU and LiDAR Data Using a Multirate Extended Kalman Filter
- Optimizing Empiric Antibiotic Delivery in the Emergency Department
- The Task Completion Game
- Optimizing Modified Mini-Metro (M³)
- Improving Pragmatic Inferences with BERT and Rational Speech Act Framework and Data Augmentation
- Deep Q-Learning for Playing Hanabi as a POMDP
- A Comparative Study on Heart Disease Prediction
Fall 2018
Public Reports
Other Titles (excluding optional final projects)
- Occlusion Handling for Local Path Planning with Stereo Vision
- Pre-Flop Betting Policy in Poker
- Optimal Impulsive Maneuver Times for Simultaneous Imaging and Gravity Recovery of an Asteroid
- Monte-Carlo Planning in Subsurface Resource Development
- Learning to Win at Go-Stop
- Police Officer Distribution
- Optimizing Road Construction to Improve Traffic Conditions Using Reinforcement Learning
- Q-Learning for Casino Hold’em
- Modeling a Connected Highway Merge as a POMDP Using Dynamic GPS Error
- Figure 8 Race Track Optimal and Safe Driving
- Predictive Maintenance of Trucks using POMDPs
- Predictive Models for Maximizing Return on Agriculture given Location and Temperature
- A Policy to Deal With Delay Uncertainty
- Reinforcement Learning Methods for Energy Microgrids
- Boom! Tetris for Bot – Designing a Reinforcement Learning Framework for NES Tetris
- Hidden Markov Models for Economic Cycle Regime Estimation
- Push Me: Optimizing Notification Timing to Promote Physical Activity
- Resource Allocation for Floridian Hurricanes
- Motion Planning in Human-Robot Interaction Using Reinforcement Learning
- Automated Neural Network Architecture Tuning with Reinforcement Learning
- Imitation Learning in OpenAI Gym with Reward Shaping
- Collision Avoidance for Unmanned Rockets using Markov Decision Processes
- MDP Solvers for a Successful Sushi Go! Agent
- Uncovering Personalized Mammography Screening Recommendations through the use of POMDP Methods
- Implementing Particle Filters for Human Tracking
- Decision Making in the Stock Market: Can Irrationality be Mathematically Modelled?
- Single and Multi-Agent Autonomous Driving using Value Iteration and Deep Q-Learning
- Buying and Selling Stock with Q-Learning
- Application and Analysis of Online, Offline, and Deep Reinforcement Learning Algorithms on Real-World Partially-Observable Markov Decision Processes
- Reward Augmentation to Model Emergent Properties of Human Driving Behavior Using Imitation Learning
- Classification and Segmentation of Cancer Under Uncertainty
- Comparison of Learning Methods for Price Setting of Airfare
- QMDP Method Comparisons for POMDP Pathfinding
- Global Value Function Approximation using Matrix Completion
- Artificial Intelligence Techniques for a Game of 2048
- Exploring the Boundaries of Art
- An Iterative Linear Algebra Approach to Dynamic Programming
- Solving Open AI Gym’s Lunar Lander with Deep Reinforcement Learning
- Application of Imitation Learning to Modeling Driver Behavior in Generalized Environments
- Craps Shoot: Beating the House…?
- Movie Recommendations with Reinforcement Learning
- Playing Atari 2600 Games Using Deep Learning
- Traverse Synthesis for Planetary Exploration
- Optimal operation of an islanded microgird under a Markov Decision Process framework
- Implementing Deep Q-learning Extensions in Julia with Flux.jl
- Learning How to Buy Food
- Using Dynamic Programming for Optimal Meal Planning
- Modelling Wildfire Evacuation using MDPs
- Comparing Multimodal Representations for Robotic Reinforcement Learning Tasks
- Applying Reinforcement Learning to Packet Routing in Mesh Networks
- Xs & Os: Creating a Tic-Tac-Toe Foe
- Doggo Does a Backflip: Deep Reinforcement Learning on a Quadruped Platform
- GrocerAI: Using Reinforcement Learning to Optimize Supermarket Purchases
- Reinforcement Learning For The Buying and Selling of Financial Assets
- Towards Designing a Policy on Automotive GPS Integrity
- Generalized Kinetic Component Analysis
- Trading Wheat Futures Contracts
- Using PCR, Neural Networks, and Reinforcement Learning
- Reinforcement Learning for Inverted Pendulums
- Electric Vehicle Charging under Uncertainty
- Automatic Accompaniment Generator: An MDP Approach
- Comparison of Methods in Artificial Life
- Modeling a Better Visual Acuity Test
- Online Methods Applied to the Game of Euchre
- Missile Defense Strategy: Towards Optimal Interceptor Allocation
- Smart Charging of Electric Vehicles under State Uncertainty
- Learning to Play Atari Breakout Using Deep Q-Learning and Variants
- Decision making on fault-code
- Learning FlappyBird with Deep Q-Networks
- A Fresh Start: Using Reinforcement Learning to Minimize Food Waste and Stock-Outs in Supermarkets
- Autonomous orbital maneuvering using reinforcement learning
- Autonomous Decision-Making for Space Debris Avoidance
- Maximizing Monthly Expenditures Under Uncertainty
- Modeling Voter Preferences in US General Elections
- Application of Reinforcement Learning to the Path Planning with Dynamic Obstacles
- A Decision Making framework for Medical Diagnostics
- Learning to Walk Using Deep RL
- Q(λ)-Learning with Boltzmann and ε-greedy Exploration Applied to a Race Car Simulation
- Reinforcement learning for Glassy/Phase Transitions
- Proximal Policy Optimization in Julia
- University Technology Patent and License Decisions: Open- versus Closed-Loop Planning in a Markov Decision Process
- A Policy Gradient Approach for Continuous-Time Control of Spacecraft Manipulator Systems
- Applying Techniques in Reinforcement Learning to Motion Planning in Redundant Robotic Manipulators
- Deep Q-Learning for Atari Pong
- Adversarial Curiosity for Model-Based Reinforcement Learning
- A Markov Decision Process Approach to Home Energy Management with Integrated Storage
- Using Maximum Likelihood Model-Based Reinforcement Learning to Play Skull
- Cryptocurrency Trading Strategy with Deep Reinforcement Learning
- Evaluating Multisense Word Embeddings Final Report
- Near-Earth Object (NEO) Deflection via POMDP
- Reinforcement Learning for Car Driving
- Reinforcement Learning for Automatic Wheel Alignment
- Julia Implementation of Trust Region Policy Optimization
- Deep Reinforcement Learning with Target and Prediction Networks
- Playing Tower Defense with Reinforcement Learning
- Q-Learning agent as a portfolio trader
- Multi-Robot Rendezvous from Indoor Acoustics
- Portfolio Asset Allocation using Reinforcement Learning
- Creating a 2048 AI Solver using Expectimax
- Robustness of Reinforcement Learning Based Communication Networks in Multi-Modal Multi-Step Games to Input Based Adversarial Attacks
- Deep Q-Learning with Nearest Neighbors in Sequential Decision-Making for Sepsis Treatment
- Positioning Archival Radar Data with a Particle Filter
- Reinforcement Learning for Atari Skiing
- Understanding Donations with Reinforcement Learning
- Known and Unknown Discrete Space Exploration Using Deep Q-Learning
- Speeding Up Reinforcement Learning with Imitation
- Learning Bandwidth-Limited Communication in Decentralized Multi-agent Reinforcement Learning
Fall 2017
- 2048 as a MDP
- A Computational Approach to Employee Resource Allocation between Multiple Projects
- Accelerated Training of Deep Q Learning Models for Atari Games
- AlphaOthello: Developing an Othello player through Reinforcement Learning on Deep Neural Networks
- An Online Approach to Energy Storage Management Optimization
- An Optimal Basketball Foul Strategy by Value Iteration
- Annealed Reward Functions in Continuous Control Reinforcement Learning
- Applications of Inverse Reinforcement Learning for Multi-Feature Path Planning
- Attributing Authorship in the Case of Anonymous Supreme Court Opinions Utilizing SVMs and Probabilistic Inference on Score Uncertainty
- Balancing Safety and Performance in Imitation Learning
- Baseball Pitch Calling as a Markov Decision Process
- Batch Reinforcement Learning Technological Investment Strategies Utilizing The Contingent Effectiveness Model In A Markov Decision Process
- Bayesian Learning of Image Transformations from User Preferences for Individualized Automatic Filters
- BetaMiniMax: An Agent for Cheat
- Building a Game Agent to Play Resistance
- Building Trust in Autonomy: Sharing Future Behavior in Reinforcement Learning
- Car racing with low dimensional input features
- Comparison of Classical Control Methods and POMDPs for 3D Motion Control
- Control of a Partially-Observable Linear Actuator
- DDQN Learning for 2048
- Deep Q-learning in OpenAI gym
- Deep Q-Learning with Target Networks for Game Playing
- Design of A Planning Machinery for Choosing an NBA Team’s Play Style Strategy
- Detecting Human from Image with Double DQN
- Determining the Optimal Betting Policy: World Series
- Disrupting Distributed Consensus (or Not) Using Reinforcement Learning
- Dominating Dominoes
- Double A3C: Deep Reinforcement Learning on OpenAI Gym Games
- Emergent Language in Multi-Agent Co-operative Reinforcement Learning
- Explore the Frontier of Safe Imitation Learning for Autonomous Driving
- Fast Operation of Coordinated Distributed Energy Resources without Network Models using Deep Deterministic Policy Gradients
- Faster Algorithms for Contextual Combinatorial Cascading Bandits
- Finding a Scent Source with a Soft Growing Robot Using Monte Carlo Tree Search
- Gaming Bitcoin Leveraging Model-Based Reinforcement Learning
- Get Ready for Demand Response
- GlideAI: Optimizing Soaring Strategy with Reinforcement Learning
- Grid Stability Management and Price Arbitrage for Distributed Energy Storage and Generation via Reinforcement Learning
- Guiding the management of sepsis with deep reinforcement learning
- HMMs for Prediction of High-Cost Failures
- Integrating Mini-Model Evidence into Policy Evaluation
- Investigating Parametric Insurance Models As Multi-Variable Decision Networks
- Learning an Optimal Policy for Police Resource Allocation on Freeways
- Learning Terminal Airspace Models from TSAA Data
- Learning the Education System
- Learning the Policy of the Policy: Deep Reinforcement Learning with Model-Based Feedback Controllers
- Learning to Play a Simplified Version of Monopoly Using Multi-Agent SARSA
- Learning to Play Othello Without Human Knowledge
- Limbed Robot Motion Control through Online Reinforcement Learning
- Linear Approximation Q-Learning to Learn Movement in a 2D Space
- Locally Optimal Risk Aware Path Planning
- Massively Parallel Reinforcement Learning Using Trust Region Policy Optimization
- Model-Free Learning of Casino Blackjack
- Model-Free Reinforcement Learning of a Modified Helicopter Game
- Model-Free Reinforcement Learning on Flappy Bird
- Modeling Disaster Evacuation Paths
- Modeling Flight Delay and Cancelation
- Modeling NBA Matchups
- Modeling Optimally Efficient Earth to Earth Flight Trajectories in Kerbal Space Program with Reinforcement Learning
- Modeling Real Estate Investment with Deep Reinforcement Learning
- Multi-Agent Cooperative Language Learning
- Multi-armed Bandits with Unobserved Confounders
- Multidisciplinary Design Optimization for Approximating Unsteady Aerodynamics of Flexible Aircraft Structures
- Navigating Chaos: Autonomous Driving in a Highly Stochastic Environment
- Optimal Flight Itineraries Under Uncertainty Using a Stochastic Markov Decision Process
- Optimal Strategy for Two-Player Incremental Classification Games Under Non-Traditional Reward Mechanisms
- Optimizing sequential time-lapse seismic data collection using a POMDP
- Personal Portfolio Asset Allocation as An MDP Problem
- Planetary Lander with Limited Sensor Information and Topographical Uncertainty
- Playing Flappy Bird Using Deep Reinforcement Learning
- POMDP and MDP for Underwater Navigation
- POMDP Modeling of a Simulated Automatic Faucet for Cognitive State and Task Recognition
- Portfolio Management
- Power Grid real time optimization using Q-Learning
- Predicting Congressional Voting Behavior and Party Affiliation using Machine Learning
- Predicting Income From OkCupid Profiles
- Predicting NBA Game Outcomes using POMDPs
- Predicting Subjective Sleep Quality
- Preparation of Papers for AIAA Technical Journals
- Pursuit-Evasion Game with an Agent Unaware of its Role
- Rapid Reinforcement Learning by Injecting Stochasticity into Bellman
- Real Time Collision Detection and Identification for Robotic Manipulators
- Reinforcement Learning Applied to Quadcopter Hovering
- Reinforcement Learning Approaches to Pathfinding in Stochastic Environments
- Reinforcement Learning For A Reach-Avoid Game
- Reinforcement Learning for Atari Breakout
- Reinforcement Learning for Crypto-Currency Arbitrage Bot
- Reinforcement Learning for Precision Landing of a Physically Simulated Rocket
- Reinforcement learning in an online multiplayer game
- Reinforcement Learning of Blackjack Variants
- Reinforcement training of nonlinear reduced order models
- Reward Shaping with Dynamic Guidance to Accelerate Learning for Multi-Agent Systems
- Risk – Bayesian World Conquest
- Roboat: Reinforcement of Boat’s Optimal Adaptive Trajectory
- Robotic Arm Motion Planning Based on Reinforcement Learning
- Robotic Decision Making Under Uncertainty
- Sensor Selection for Energy-Efficient Path Planning Under Uncertainty
- ShAIkespeare: Generating Poetry with Reinforcement Learning and Factor Graphs
- Shared Policies in Aircraft Avoidance
- Simulated Autonomous Driving with Deep Reinforcement Learning
- Simulating Coverage Path Planning with Roomba
- SLAMming into Obstacles: Simultaneous Localization and Mapping the Path of a Turtlebot
- Smart Health Coach: Using Markov Decision Processes to Optimize Health Advising Strategies
- Smarter Queues by Reinforcement Learning
- Solving Real-world Oil Drilling Problem with Multi-Armed Bandit and POMDP Models
- Stay in Your Lane: Probabilistic Vehicular Automation for DIY Robocars
- Supervised Learning and Reinforcement Learning for Algorithmic Trading
- Taking Out the GaRLbage
- Terrain Relative Navigation and Path Planning for Planetary Rovers
- Time-Constrained Sample Retrieval in a Martian Gridworld with Unknown Terrain
- Trade-offs in Connect Four Game-Playing Agents
- Training an Intelligent Driver on Highway Using Reinforcement Learning
- UAV Collision Avoidance Using Neural Network-Assisted Q-Learning
- Understanding Limitations of Network Meta-analysis Approaches to Rank Effectiveness of Treatments
- Using Bayesian Networks to Impute Missing Data
- Using Bayesian Networks to Predict Credit Card Default
- Using Bayesian Networks to Understand and Predict Wildfires
- Using Classification Models to Represent and Predict Students’ Restaurant Preferences
- Using Q-Learning to Optimize Lunar Lander Game Play
- Using the QMDP Method to Determine an Open Ocean Fishing Policy
- Utilizing Fundamental Factors in Reinforcement Learning for Active Portfolio Management
Fall 2016
- Model-Free Reinforcement Learning of Blackjack
- Partially Observable Actions in Solving Markov Decision Processes. The Case for Insulin Dosing Optimization in Diabetic Patients.
- Using Monte-Carlo Tree Search to Solve the Board Game Hive
- Blackjack: How to use MDP’s to (nearly) beat the house
- Cancer Metabolism Mapping: Bayesian Networks and Network Learning Techniques to Understand Cancer Metabolic and Regulatory Pathways
- Gibbs Sampling in BayesNets.jl
- UAV Collision Avoidance Policy Optimization with Deep Reinforcement Learning
- Improving Training Efficiency in Deep Q-Learning for Atari Breakout
- Monitoring Machine Workload Distribution with Kalman Filter
- Approximating Transition Functions to Cart Track MDPs via Sub-State Sampling
- Approaching Quantitative Trading with Machine Learning
- Structure and Parameter Learning in Bayesian Networks with Applications to Predicting Breast Cancer Tumor Malignancy in a Lower Dimension Feature Space
- Autonomous Racing by Learning from Professionals
- Bravo Zulu: Optimizing Teammate Selection for Military and Civilian Applications
- Investigating Transfer Learning in Deep Reinforcement Learning Context
- Simultaneous Estimation and Control with MCTS
- Controlling Soft Robots with POMCP
- Automatic Learning of Computer Users’ Habits
- Learning to Play Soccer in the OpenAI Gym
- Playing Ultimate Tic-Tac-Toe with TD Learning and Monte Carlo Tree Search
- A Bayesian Network Model of Pilot Response to TCAS Resolution Advisories
- Improving Head Impact Kinematics Measurement Accuracy using Sensor Fusion
- Drive Decision Making at Intersections
- Deterministic and Bayesian Techniques for Spaceborne Vision-Based Non-Stellar Object Detection
- A Two-Phased Deep Reinforcement Learning Algorithm for High-Frequency Trading
- Implementation and Experimentation of a DQN solver in Julia for POMDPs.jl
- Landing on the Moon
- Deserted Island: Cooperative Behavior in Absence of Explicit Delayed Reward
- DeepGo.py
- Managing Groundwater under Uncertain Seasonal Recharge
- Using Reinforcement Learning to Find Flaws in Collision Avoidance Systems
- Effectiveness of Bayesian Networks in Building a Prediction Model for Movie Success
- Data Driven Agent based on Aircraft Intent
- Deep Q-Learning with Natural Gradients
- A Shot in the Dark: Beating Battleship with POMCP
- Accelerated Asynchronous Deep Reinforcement Learning Variant of Advantage Actor-Critic
- Applying Reinforcement Learning and Online Methods on the Inverted Pendulum Problem
- Predicting Sentiment with Deep Q-Learning
- A Lookahead Strategy for Super-Level Set Estimation using Gaussian Processes
- Modeling Breast Cancer Treatment as a Markov Decision Process
- Learning 31 using Cross-Entropy Methods
- Improving Haptic Guidance using Reinforcement Learning
- NLPLab: Actor-Critic Training in Natural Language Processing
- Deep Reinforcement Learning on Atari Breakout
- Reinforcement Learning for LunarLander
- Reinforcement Learning for AI Machine Playing Hearthstone
- Using Deep Q-Learning to Automate CNN Training
- Automatic Continuous Variable Encoder in Bayesian Network
- Side Channel Analysis using Neural Networks and Random Forests
- A Decision-Making System for Wildfire Management
- Decentralized Game Theoretic Methods for the Distributed Graph Coverage Problem
- Autonomous altitude control for high altitude balloons
- Neural Network Arbitration for Better Time and Accuracy trade-offs
- Deep Deterministic Policy Gradient with Robot Soccer
- Towards a Personal Decision Support System
- Optimal Gerrymandering under Uncertainty
- The Ambulance Dilemma: Crossing an Intersection with Monte Carlo Tree Search
- DeepDominion
- Developmental Policy Design: an MDP approach
- Training of a craps betting strategy with Reinforcement Learning Techniques
- Engineering a Better Monkey
- Decision Making During a Bicycle Race
- Using Discrete Pressure Measurements to Understand Subsonic Bluff-Body Dynamic Damping
- Effective Move Selection in Chess Without Lookahead Search
- Solving Texas Hold’em with Monte-Carlo Planning
- Reinforcement Learning of High-Speed Autonomous Driving through Unknown Map
- Implementation and deployment of particle filter for simulated and real-world localization tasks
- Tree Augmented Naive Bayes and Backward Simulation
- Transfer of Q values across tasks in Reinforcement Learning
- Training Regime Modifications for Deep Q-Network Learning Acceleration
- Reinforce Optimizer
- Approximating Ligand Docking Using a Markov Decision Process
- Breaking Down Social Media Filter Bubbles via Reinforcement Learning
- Performing an N-Sentiment Classification Task on Tinder Profiles Based On Image Feature Extraction
- Play Blackjack With Monte Carlo Simulation And Q-learning with Linear Regression
- Observer-Actor Neural Networks for Self-Play in Imperfect Information Games
- Using Hybrid Bayes Nets to Model Country Prosperity
- Solving a Pandemic! Various Approaches for Tackling the Board Game
- Improved Markov Decision Process Model for Resource Allocation in Disaster Scenarios
- Learning Chess through Reinforcement Learning
- Deep Reinforcement Learning For Continuous Control: An Investigation of Techniques and Tricks
- Computer Vision Through Perception: Semantic Understanding of Novel Scenes through Data Programming
- Path Planning for Insertion of a Bevel-Tip Needle
- Modeling human biases through reinforcement learning
- Bootstrapping Neural Network with Auxiliary Tasks
- Q-Learning Application in Optimizing Pokémon Battle Strategy
- Model-based exploration in natural language generation
- Automated Aircraft Touchdown
- Longitudinal Vehicle Control using a Markov Decision Process and Deep Neural Network
- MOMDP-based Aerial Target Search Optimization
- Greedy Thick-Thinning Structure Learning and Bayesian Network Conditional Independence Implementations in BayesNets.jl
- Multiagent Planning For Aerial Broadband Internet
- Viral Marketing as an MDP
- Neural Soccer – Towards Exploration by the Pursuit of Novelty
- Locally Weighted Value Iteration in Julia
- Fully-Nested Interactive POMDPs for Partially-Observable Turn-Based Games
- Optimal Policy Considerations for Gas Turbine Maintenance
- Learning Optimal Manipulation of Food Webs
- Estimating Resource Prospector’s Probability of Failure Using Importance Sampling and Cross Entropy
- Dynamically Discount Deep Reinforcement Learning
- Deep Reinforcement Learning: Accelerated Learning with Effective Gradient Ascent Optimization Algorithms
- Autonomous Human Tracking in Simulated Environment
- A LQG Library for POMDPs.jl
Fall 2015
- Mars Hab-Bot: Using MDPs to simulate a robot constructing human-livable habitats on Mars
- A Value Iteration Study of BlackJack
- Optimized Store-Stocking via Monte Carlo Tree Search with Stochastic Rewards
- Trajectory Planning for Map Exploration Using Terrain Features
- Instruction Following with Deep Reinforcement Learning
- Using Markov Decision Processes to Minimize Golf Score
- Reinforcement Learning for Scheduling I/O Requests on Multi-Queue Flash Storage
- Finding the Perfect ‘Job’ in resource allocation
- Maximizing Influence in Social Networks
- A Machine Learning Regression Approach to General Game Playing
- Modeling GPS Spoofing as a Partially Observable Markov Decision Process
- Travel Hacking with MDPs
- Optimal Mission Planning for a Satellite-Based Particle Detector via Online Reinforcement Learning
- An MDP Approach to Motion Planning for Hopping Rovers on Small Solar System Bodies
- Sampling Strategies for Deep Reinforcement Learning
- Descriptive Power of Bayesian Structure Learning in Stock Market
- Large-Scale Traffic Grid Signal Control Using Fuzzy Logic and Decentralized Reinforcement Learning
- Simulated Pedestrian-like Navigation with a 1D Kalman Filter with an Accelerometer and the Global Positioning System
- Search and Track Tradeoff for Multifunction Radars
- Play Calling in American Football Using Value Iteration
- Reinforcement learning for commodity trading
- Learning the Stock Market, a Naive Approach
- A POMDP Framework for Modelling Robotic Guidance During a Tissue Palpation Task
- Reinforcement Learning of an Artificially Intelligent Hearts Player
- Toy Helicopter Control via Deep Reinforcement Learning
- Gas Refuelling Optimization Modelled as a Markov Decision Process
- Q-Matrix and Policy Compression via Deep Learning
- Augmenting Self-Learning In Chess Through Expert Imitation
- Monte Carlo Tree Search Applied to a Variant of the RockSample Problem
- Supply Chain Management using POMDPs
- Online Markov Decision Process Framework for Modeling Efficient Home Robot Cleaners
- Reinforcement Learning for Path Planning with Soft Robotic Manipulators
- Exploring POMDPS with Recurrent Neural Networks
- Tic-tac-toe with reinforcement learning: best strategies and influence of parameters
- Vehicle Speed Prediction using Long Short-Term Memory Networks
- Explorations on Learning Bayesian Networks
- Playing unknown game on a visual world
- Reinforcement Learning for Atari Games
- Q-learning in the Game of Mastermind
- Modeling of a Baseball Inning as MDP
- Autonomous Driving on a Multi-lane Highway Using POMDPs
- Solving a Maze Without Location Data
- Markov Decision Processes and Optimal Policy Determination for Street Parking
- Solving an opponent-based match-three mobile game
- Life begins as a POMDP: improving decision making in the IVF clinic
- Path Planning for Target-Tracking Unmanned Aerial Vehicle
- Discrete State Filter Implementation for a Battleships Artificial Intelligence
- POMDP for Search and Rescue with Obstacle Avoidance: Incorporation of Human in the Loop
- Application performance over cellular networks
- Solving Dudo: beating Liar’s Dice with a POMDP
- Reinforcement Learning for Tetris
- Robot Path Planning using Monte Carlo POMDP
- Reinforcement Learning of an Artificially Intelligent Hearts Player
- Enhancing Computational Efficiency of PILCO Model-based Reinforcement Learning Algorithm
- Analysis of UCT Exploration Parameter in Sailing Domain Problems
- Solving a Search and Rescue Planning problem with MOMDPs
- Robot Motion Planning in Unknown Environments using Monte Carlo Tree Search
- Delivery optimization of an on-demand delivery service
- Solving MultiAgent Decision Making using MDPs
- Efficient and Modular Inventory Management Framework for Small Businesses
- Markov Decision Processes in Board Game Playing
- Automated Model Selection via Gaussian Processes
- Predictive Hybrid Vehicle Control Policy
- Optimal Policies for In-Space Satellite Communications
- Spacecraft Navigation in Cluttered, Dynamic Environments Using 3D Lidar
- Playing Chess Endgames using Reinforcement Learning
- Space Debris Removal
- Relation Extraction from Scratch
- Lane Merging as a Markov Decision Process
- Using MDP/POMDP to Help in Search of Survivors of a Plane Crash
- Applying POMCP to Controlling Partially Observable Diffusion Processes
- Credit Risk Classification using Bayesian Network
Fall 2014
- Automating Air Traffic Management for Flight Arrivals
- Policy Learning for Sokoban
- Flight Path Optimization Under Constraints Using a Markov Decision Process Approach
- Visual Localization and POMDP for Autonomous Indoor Navigation
- Monte Carlo Tree Search for Online Learning in Golf Course Management
- Pushing on Leaves
- Beating 2048
- Improved electrical grid balancing with demand response scheduled by an MDP
- Multi-Fidelity Model Management in Engineering Design Optimization Using Partially Observable Markov Decision Processes
- Smarter Generators in Power Markets
- Beach Paddle Ball
- Applying POMDP to RockSample problem
- Targeting Hostile Vehicle Modeled as a Partially Observable Markov Decision Process with State-Dependent Observation Model
- Reinforcement Learning and Linear Gaussian Dynamics Applied to Multifidelity Optimization of a Supersonic Wedge
- Approximate POMDP Solutions for Short-Range UAV Traffic Conflict Resolution
- WorkSmart: The Implementation of a Modified Q-Learning Algorithm for an Intelligent Daily To-Do List Android Application
- Imminent Obstacle Avoidance with Friction Uncertainty
- Dynamic Restrictions during Commercial Space Vehicle Launches
- Autonomous Direct Marketing with Deep Q-Learning
- Efficient Risk Estimation for Chance-Constrained Robotic Motion Planning Under Uncertainty
- Probabilistic Aircraft Arrival Rate Prediction
- Audio Keylogging: Translating Acoustic Signals into Keystrokes
- Collision Avoidance for Small Multi-Rotor Aircraft using SARSA(λ) and Fourier Basis Functions
- Reinforcement Learning with Tetris
- Stock Market Reinforcement Learning
- Obstacle Avoidance for Automated Vehicle using Markov Decision Processes
- Control of Epidemics on a Graph
- Autonomous ATC for non-towered airports
- Path Planning for Terrain Relative Navigation using POMDPs
- Vehicle Braking Controller in a Markov Decision Process Framework
- Multi-Armed Bandit Heuristics for HTTP Denial-of-Service Attacks
- Structure Learning for Probabilistic Driving Models
- Casino Blackjack Modeled as a Markov Decision Process
- Competitive Collision Avoidance
- Efficient Sampling Of Protein Landscapes Via Markov Decision Processes
- Flight Deck Interval Management (An MDP Approach)
- BGT Model for Analysis of Head-On Collisions
- Collision Avoidance System Parameter Optimization
- Dynamic Demand Prediction and Routing for Autonomous Mobility-on-Demand Systems
- Action-Constrained, Multi-Species Task Scheduling: The Kayaker Problem
- Reinforcement Learning with Low-rank Matrix Factorization
- Automated Sequencing and Spacing of Arrival Aircraft in Final Vector Approach Airspace
- Exploring Policy Learning for Blackjack