Exploring Clustering Techniques for Effective Reinforcement Learning based Personalization for Health and Wellbeing


Personalisation has become omnipresent in society. In the domain of health and wellbeing, such personalisation can contribute to better interventions and improved health states of users. For personalisation to be effective in this domain, it needs to be performed quickly and with minimal impact on the users. Reinforcement learning is one of the techniques that can be used to establish such personalisation, but it is not known to be very fast at learning. Cluster-based reinforcement learning has been proposed to improve the learning speed. Here, users who show similar behaviour are clustered and one policy is learned for each cluster. An important factor in this effort is the method used for clustering, which can influence the benefit of such an approach. In this paper, we propose three distance metrics based on the state of the users (Euclidean distance, Dynamic Time Warping, and high-level features) and apply different clustering techniques with these distance metrics to study their impact on the overall performance. We evaluate the different methods in a simulator with users spawned from very distinct user profiles as well as from overlapping user profiles. The results show that clustering configurations using high-level features significantly outperform regular reinforcement learning without clustering (which learns either one policy for all users or one policy per individual).
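
To make the three distance metrics concrete, the following is a minimal sketch and not the implementation used in the paper. It assumes each user's state history is reduced to a one-dimensional numeric trace. The summary statistics standing in for "high-level features" (mean, standard deviation, minimum, maximum) are a hypothetical choice, and the small k-medoids routine is an illustrative stand-in for the clustering techniques the paper compares. All function names are hypothetical.

```python
import numpy as np


def euclidean_distance(a, b):
    """Point-wise Euclidean distance between two equal-length traces."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))


def dtw_distance(a, b):
    """Classic dynamic-programming Dynamic Time Warping distance."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i, j] = step + min(cost[i - 1, j],      # repeat b[j-1]
                                    cost[i, j - 1],      # repeat a[i-1]
                                    cost[i - 1, j - 1])  # advance both
    return float(cost[n, m])


def feature_distance(a, b):
    """Distance between summary statistics (an assumed feature set)."""
    def features(x):
        x = np.asarray(x, dtype=float)
        return np.array([x.mean(), x.std(), x.min(), x.max()])
    return float(np.linalg.norm(features(a) - features(b)))


def pairwise_distances(traces, metric):
    """Symmetric matrix of metric(traces[i], traces[j])."""
    n = len(traces)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = metric(traces[i], traces[j])
    return dist


def k_medoids(dist, k, n_iter=20):
    """Small k-medoids on a precomputed distance matrix (illustrative)."""
    # Farthest-first initialisation keeps the sketch deterministic.
    medoids = [0]
    while len(medoids) < k:
        nearest = dist[:, medoids].min(axis=1)
        medoids.append(int(np.argmax(nearest)))
    for _ in range(n_iter):
        labels = np.argmin(dist[:, medoids], axis=1)
        updated = []
        for c in range(k):
            members = np.where(labels == c)[0]
            if len(members) == 0:
                updated.append(medoids[c])  # keep the medoid of an empty cluster
                continue
            within = dist[np.ix_(members, members)].sum(axis=1)
            updated.append(int(members[np.argmin(within)]))
        if updated == medoids:
            break
        medoids = updated
    return np.argmin(dist[:, medoids], axis=1)


# Toy traces: two low-activity users and two high-activity users,
# each pair differing only by a shift in time.
traces = [[0, 0, 1, 1, 0], [0, 1, 1, 0, 0], [5, 5, 6, 6, 5], [5, 6, 6, 5, 5]]
for metric in (euclidean_distance, dtw_distance, feature_distance):
    labels = k_medoids(pairwise_distances(traces, metric), k=2)
    print(metric.__name__, labels.tolist())
```

In a cluster-based setup, the resulting labels would decide which users share a policy. The time-shifted pairs in the toy data illustrate why the choice of metric matters: Dynamic Time Warping and the feature-based distance treat a shifted trace as identical, whereas the Euclidean distance does not.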

2018 IEEE Symposium Series on Computational Intelligence (SSCI)