With technical progress and a growing tourism market, demand for guidance systems is constantly increasing. Current systems are not personalized: they usually provide only general information about a sightseeing spot and do not take the tourist's perception of it into account. To design a more adaptable and context-aware system, we focus on collecting and estimating the emotions and satisfaction level that tourists experience during a sightseeing tour. We track changes in their behaviour by collecting two types of information continuously during the whole tour: conscious (short videos with impressions) and unconscious (behavioural patterns recorded with wearable devices). We have conducted experiments and collected initial data to build a prototype system. For each sight of the tour, participants provided emotion and satisfaction labels. We use these labels to train unimodal neural-network-based models, then fuse them to obtain a final prediction for each recording. As the tourist is the only source of labels for such a system, we introduce an approach to post-experimental label correction based on paired comparison. The resulting system allows us to use different modalities, or their combination, to perform real-time tourist emotion recognition and satisfaction estimation in the wild, bringing tourist guidance systems to a new level.