Smart devices, such as smartphones and wearable cameras, have become widely used, and lifelogging with such gadgets is now a common activity. Since this trend produces a large number of individual lifelog records, it is important to support users in efficiently accessing their personal lifelog archives. The NTCIR Lifelog task series has studied this retrieval setting as the Lifelog Semantic Access sub-task (LSAT): given a topic describing a user's daily activity or event as a query, e.g., "Find the moments when a user was eating any food at his/her desk at work", a system retrieves images of the relevant moments from the user's lifelog records. Although, at the NTCIR conferences, interactive systems, which can utilize searchers' feedback during retrieval, have shown higher performance than automatic systems that operate without such feedback, interactive systems depend on the quality of the initial results, which can be regarded as the output of automatic systems. We therefore envision automatic retrieval methods that can later be incorporated into interactive systems. In this paper, following the principle that such a system should be easy to implement for later applicability, we propose a method that scores lifelog moments using only the metadata generated by publicly available pretrained detectors, together with word embeddings. Experimental results show that the proposed method outperforms the automatic retrieval systems presented at the NTCIR-14 Lifelog-3 task. We also show that retrieval can be further improved by about 0.3 in MAP through query formulation that considers the relevant/irrelevant descriptions of multimodal information in the query topics.
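To make the scoring idea concrete, the sketch below illustrates one plausible form of embedding-based matching between a query and detector metadata: each moment is represented by the concept labels emitted by pretrained detectors, and each query term is matched to its most similar label via word-embedding cosine similarity. The toy embedding table, the `score_moment` helper, and the max-mean aggregation are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
import numpy as np

# Toy word-embedding table; in practice these vectors would come from a
# pretrained embedding model (e.g., word2vec or GloVe). Values are placeholders.
EMB = {
    "eating":   np.array([0.9, 0.1, 0.0]),
    "food":     np.array([0.8, 0.2, 0.1]),
    "desk":     np.array([0.1, 0.9, 0.2]),
    "work":     np.array([0.2, 0.8, 0.3]),
    "office":   np.array([0.1, 0.7, 0.4]),
    "sandwich": np.array([0.7, 0.3, 0.1]),
}

def cos(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def score_moment(query_terms, detected_labels):
    """Score one lifelog moment against a query.

    For each query term, take its best cosine similarity to any concept
    label detected for the moment, then average over query terms
    (a max-mean aggregation, assumed here for illustration).
    """
    sims = []
    for q in query_terms:
        if q not in EMB:
            continue  # skip out-of-vocabulary query terms
        best = max(
            (cos(EMB[q], EMB[l]) for l in detected_labels if l in EMB),
            default=0.0,
        )
        sims.append(best)
    return sum(sims) / len(sims) if sims else 0.0

# Concept labels a pretrained visual detector might emit for one moment.
moment_labels = ["sandwich", "desk", "office"]
print(score_moment(["eating", "food", "desk", "work"], moment_labels))
```

Ranking all moments by this score would yield the automatic retrieval run; an interactive system could then refine that initial ranking with searcher feedback, as the abstract envisions.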