- Title
Multi-modal emotion prediction system using convergence media and active contents.
- Authors
Chung, Kyungyong; Kim, Jin-Su
- Abstract
Multimedia provides users with a wealth of information through various forms of content and information processing, and it is now utilized and converged in diverse fields. In particular, in the convergence of movie or TV drama media with information technology, how to visualize or predict emotional changes from multimedia information has been a steady topic of research. Based on emotional changes, the genre of a movie or TV drama can be analyzed. Viewers tend to select a movie according to their preferred storyline; that is, users select and personalize genres that fit their sentiment within the emotional flow of a video, and want customized media reflecting that emotional flow to be recommended. A typical emotion prediction method achieves high accuracy when text-based lines are used. Nevertheless, when subtle emotions in the text are supplemented with voice and video information, prediction accuracy can be increased further. This study therefore proposes a system that predicts emotional context primarily from text, and then refines the prediction by analyzing the emotions in characters' dialog, their voices in a scene, or images of the characters at that point in time. To predict emotions efficiently, the proposed system extracts from the text data the time information of a character's emotion words, extracts the voice signal of that time section and converts it into a spectrogram, and saves the face image at the same point in time. The imaged spectrogram and the face image for facial-expression analysis are used as input data to a convolutional neural network (CNN) and trained. The emotion in a particular paragraph is predicted, from which the emotional flow and the emotion of a particular scene are discerned.
The proposed multi-modal emotion prediction system collects the emotions estimated from convergence media and active contents such as text, voice, and image, and finally predicts the overall emotion.
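The time-aligned extraction step described in the abstract (slice the voice signal for the time span of an emotion word, then convert it into a spectrogram for CNN input) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation; the function names, frame length, and hop size are hypothetical choices.

```python
import numpy as np

def extract_segment(audio, sr, t_start, t_end):
    """Slice the voice signal covering the time span of an emotion word."""
    return audio[int(t_start * sr):int(t_end * sr)]

def spectrogram(segment, frame_len=256, hop=128):
    """Magnitude spectrogram (frames x frequency bins) via a short-time FFT.

    Each frame is Hann-windowed before the FFT; the result can be saved
    as an image and fed to a CNN alongside the face image.
    """
    if len(segment) < frame_len:
        segment = np.pad(segment, (0, frame_len - len(segment)))
    window = np.hanning(frame_len)
    n_frames = 1 + (len(segment) - frame_len) // hop
    frames = np.stack([segment[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

# Toy example: 1 s of synthetic audio at 8 kHz, with the emotion word
# assumed to be spoken between 0.2 s and 0.6 s.
sr = 8000
audio = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
seg = extract_segment(audio, sr, 0.2, 0.6)
spec = spectrogram(seg)
print(spec.shape)  # (24, 129): 24 frames, 256 // 2 + 1 frequency bins
```

In the full system, this spectrogram would form one CNN input branch and the saved face image the other, with their predictions combined into a final emotion estimate.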
- Subjects
AFFECTIVE forecasting (Psychology); CONVOLUTIONAL neural networks; ACTIVE medium; HUMAN facial recognition software; TELEVISION dramas; FACE
- Publication
Personal & Ubiquitous Computing, 2023, Vol 27, Issue 3, p1245
- ISSN
1617-4909
- Publication type
Article
- DOI
10.1007/s00779-021-01602-8