Toward wellbeing-aware computing, research on estimating users' emotions from smartphone sensor data has been actively conducted as smartphones become increasingly ubiquitous. Most studies build machine-learning-based emotion estimation models from contextual data collected on the smartphone, together with self-reported ground-truth labels that are often gathered via the Experience Sampling Method (ESM). However, since emotions change frequently in daily life, collecting ground truth for such volatile emotions leads to a flood of ESM prompts, which can be a burden on users. To find better ESM methods, we propose and compare three of them: Randomized ESM, which prompts at random times; Trigger ESM, which prompts when the user's behavior changes; and Unlocking ESM, which presents the ESM on the unlock screen. We constructed emotion estimation models at four time granularities (1 day, 1/3 day, 3 hours, and 1 hour) in a four-week experiment with eight participants. Unlocking ESM achieved the highest response rate. In addition, Unlocking ESM yielded the highest estimation accuracy in most cases.