TY - UNPB
T1 - Latent-variable MDP models for adapting the interaction environment of diverse users
AU - Ramamoorthy, Subramanian
AU - Mahmud, MM Hassan
AU - Rosman, Benjamin
AU - Kohli, Pushmeet
PY - 2014
Y1 - 2014
N2 - Interactive interfaces are a common feature of many systems ranging from field robotics to video games. In most applications, these interfaces must be used by a heterogeneous set of users, with substantial variety in effectiveness with the same interface when configured differently. We address the problem of personalizing such an interface, adapting parameters to present the user with an environment that is optimal with respect to their individual traits - enabling that particular user to achieve their personal optimum. We model the user as a parameterised Markov Decision Process (MDP), wherein the transition dynamics within a task depends on the latent personality traits (e.g., skill or dexterity) of the user. A key innovation is that we adapt at the level of action sets, picking a personalized optimal set of actions that the user should use. Our solution involves a latent variable formulation wherein we maintain beliefs over the latent type of users, which serves as a proxy for the hidden personality traits. This allows us to compute a Bayes optimal action set which when presented to the user allows them to achieve optimal performance. Our experiments, with real and simulated human participants, demonstrate that our personalized adaptive solution outperforms any alternate static solution, and also other adaptive algorithms such as EXP-3. Furthermore, we show that our algorithm is most useful under high diversity in user base, where the benefits of safe initialization and quick adaptation (properties our algorithm provably enjoys) are most pronounced.
AB - Interactive interfaces are a common feature of many systems ranging from field robotics to video games. In most applications, these interfaces must be used by a heterogeneous set of users, with substantial variety in effectiveness with the same interface when configured differently. We address the problem of personalizing such an interface, adapting parameters to present the user with an environment that is optimal with respect to their individual traits - enabling that particular user to achieve their personal optimum. We model the user as a parameterised Markov Decision Process (MDP), wherein the transition dynamics within a task depends on the latent personality traits (e.g., skill or dexterity) of the user. A key innovation is that we adapt at the level of action sets, picking a personalized optimal set of actions that the user should use. Our solution involves a latent variable formulation wherein we maintain beliefs over the latent type of users, which serves as a proxy for the hidden personality traits. This allows us to compute a Bayes optimal action set which when presented to the user allows them to achieve optimal performance. Our experiments, with real and simulated human participants, demonstrate that our personalized adaptive solution outperforms any alternate static solution, and also other adaptive algorithms such as EXP-3. Furthermore, we show that our algorithm is most useful under high diversity in user base, where the benefits of safe initialization and quick adaptation (properties our algorithm provably enjoys) are most pronounced.
M3 - Working paper
BT - Latent-variable MDP models for adapting the interaction environment of diverse users
ER -