In this paper, we focus on user-centered handover decision making in open-access non-stationary femtocell networks. Traditionally, such handover mechanisms are based on a measured channel/cell quality metric, such as the channel capacity between the user and the target cell. However, the throughput experienced by the user is time-varying because of changing channel conditions, e.g., propagation effects or receiver location. In this context, the user's decision can depend not only on the current state of the network but also on its possible future states (horizon). To this end, we need a learning algorithm that can predict, based on past experience, the cell that will perform best in the future. We present in this paper a reinforcement learning (RL) framework as a generic solution to the cell selection problem in a non-stationary femtocell network: without prior knowledge of the environment, it selects a target cell by exploring past cell behavior and predicting the cells' potential future states using the Q-learning algorithm. Our algorithm aims at balancing the number of handovers against the user capacity while taking into account the dynamic changes of the environment. Simulation results demonstrate that our solution offers opportunistic-like capacity performance with fewer handovers.
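To make the cell-selection idea concrete, the following is a minimal illustrative sketch, not the paper's actual method: a tabular Q-learning agent whose state is the serving cell, whose actions are candidate target cells, and whose reward is the achieved capacity minus a handover penalty. The cell count, the synthetic non-stationary capacity model, and all hyperparameter values are assumptions introduced here for illustration only.

```python
import random

# Illustrative Q-learning sketch for cell selection.
# Assumptions (not from the paper): 3 candidate cells, a synthetic
# slowly drifting capacity per cell, and a fixed handover penalty.

N_CELLS = 3
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # learning rate, discount, exploration
HANDOVER_PENALTY = 0.5                  # cost charged when switching cells

def capacity(cell, t):
    """Synthetic non-stationary capacity of each cell over time."""
    base = [1.0, 1.5, 0.8][cell]
    return base + 0.5 * ((t // 50 + cell) % 2)  # slow periodic drift

# Q[s][a]: expected return of moving from serving cell s to cell a
Q = [[0.0] * N_CELLS for _ in range(N_CELLS)]

random.seed(0)
serving = 0
handovers = 0
for t in range(2000):
    # epsilon-greedy action selection over candidate cells
    if random.random() < EPSILON:
        target = random.randrange(N_CELLS)
    else:
        target = max(range(N_CELLS), key=lambda a: Q[serving][a])
    # reward: achieved capacity, penalized if a handover occurred
    r = capacity(target, t) - (HANDOVER_PENALTY if target != serving else 0.0)
    # standard Q-learning update
    Q[serving][target] += ALPHA * (r + GAMMA * max(Q[target]) - Q[serving][target])
    if target != serving:
        handovers += 1
    serving = target

print(handovers)
```

The handover penalty in the reward is what lets the agent trade capacity against the number of handovers: raising it makes the learned policy stickier to the serving cell, mirroring the balance the paper targets.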