Recent years have seen the introduction of service robots as waiters or waitresses in restaurants and cafes. In such venues, customers commonly visit in groups and converse while eating and drinking. To wait on tables at appropriate times, robot serving staff need to recognize whether customers are eating or drinking. In this paper, we present a method by which robots can recognize eating and drinking actions performed by individuals in a group. Our approach uses the positions of joints in the human body as features and a long short-term memory (LSTM) network to perform recognition on time-series data. We also use head direction, on the assumption that it is effective for recognition within a group. The information obtained from head directions and joint positions is integrated via logistic regression and used for recognition. The results show that this integrated approach yielded the highest accuracy and was effective for the robots' tasks.
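
The integration step can be illustrated with a minimal sketch, which is not the implementation used in this paper. It assumes that each information source (joint positions, head direction) has already been reduced to one score per time window, and that logistic regression is fitted on those two scores. The function names `fit_fusion` and `fuse`, the learning rate, the number of epochs, and the toy scores are all hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real value to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_fusion(scores, labels, lr=0.5, epochs=2000):
    """Fit logistic-regression weights over per-source scores.

    scores: list of (joint_score, head_score) tuples (hypothetical inputs).
    labels: list of 0/1 values, 1 meaning "eating or drinking".
    Uses plain batch gradient descent on the mean log loss.
    """
    n_features = len(scores[0])
    n = len(scores)
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(scores, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # derivative of log loss w.r.t. the logit
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def fuse(w, b, x):
    """Return the fused probability for one (joint_score, head_score) pair."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Toy, made-up scores for illustration only (not data from the paper).
train_scores = [(0.9, 0.8), (0.8, 0.6), (0.7, 0.9),
                (0.2, 0.3), (0.3, 0.1), (0.1, 0.4)]
train_labels = [1, 1, 1, 0, 0, 0]

w, b = fit_fusion(train_scores, train_labels)
high = fuse(w, b, (0.85, 0.7))  # both sources suggest eating or drinking
low = fuse(w, b, (0.15, 0.2))   # both sources suggest neither
print(high > 0.5, low > 0.5)
```

Fusing at the score level keeps the two sources independent, so either one could be replaced without retraining the other. Whether the paper fuses scores or intermediate features is not stated in this abstract, so this choice is an assumption of the sketch.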