Cell range expansion (CRE) is a technique that virtually expands the range of a pico cell by adding a bias value to the pico received power during cell selection, instead of increasing the transmit power of the pico base station (PBS). Cell selection with CRE can improve coverage, cell-edge throughput, and overall network throughput. Many studies of CRE have used a common bias value for all user equipments (UEs), although the optimal bias value that minimizes the number of UE outages varies from one UE to another. This optimal value depends on several factors, such as the dividing ratio of radio resources between macro base stations (MBSs) and PBSs, and can be found only by trial and error. In this paper, we propose a cell selection scheme based on the Q-learning algorithm, in which each UE independently learns from its past experience which cell to select so as to minimize the number of UE outages. Simulation results show that, compared with a practical common bias value setting, the proposed scheme reduces the number of UE outages and improves network throughput in most cases. Moreover, in exchange for some performance degradation, it also solves the storage problem of our previous work.
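
As an illustration of the two ideas summarized above, the following minimal sketch shows biased cell selection under CRE and a generic tabular Q-learning update that a single UE could run on its own. It is not the scheme evaluated in this paper: the state labels, the reward values, the learning rate `alpha`, the discount factor `gamma`, the exploration rate `epsilon`, and all function names are assumptions introduced only for illustration.

```python
import random

ACTIONS = ("macro", "pico")  # the two cell types a UE can choose between


def select_cell(macro_rx_dbm, pico_rx_dbm, bias_db):
    """CRE rule: compare the macro power with the pico power plus a bias."""
    return "pico" if pico_rx_dbm + bias_db >= macro_rx_dbm else "macro"


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update; unseen entries default to 0."""
    best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def choose_action(q, state, epsilon=0.1, rng=random):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))


if __name__ == "__main__":
    # With no bias the UE attaches to the stronger macro cell;
    # an 8 dB bias (hypothetical value) moves it to the pico cell.
    print(select_cell(-80.0, -85.0, 0.0))
    print(select_cell(-80.0, -85.0, 8.0))

    # One UE keeps its own Q-table, with no coordination with other UEs.
    # Hypothetical reward: +1 if the UE was served, -1 if it was in outage.
    q_table = {}
    q_update(q_table, "cell_edge", "pico", 1.0, "cell_edge")
    print(choose_action(q_table, "cell_edge", epsilon=0.0))
```

In this sketch a fixed bias applies the same rule to every UE, whereas the Q-table lets each UE adapt its choice to the outcomes it has observed, which is the motivation stated above for a learning-based selection.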