Results 1 - 10 of 10
Multimodal human computer interaction: A survey, 2005
"... In this paper we review the major approaches to Multimodal Human Computer Interaction, giving an overview of the field from a computer vision perspective. In particular, we focus on body, gesture, gaze, and affective interaction (facial expression recognition and emotion in audio). We discuss user ..."
Abstract
-
Cited by 119 (3 self)
- Add to MetaCart
(Show Context)
In this paper we review the major approaches to Multimodal Human Computer Interaction, giving an overview of the field from a computer vision perspective. In particular, we focus on body, gesture, gaze, and affective interaction (facial expression recognition and emotion in audio). We discuss user and task modeling, and multimodal fusion, highlighting challenges, open issues, and emerging applications for Multimodal Human Computer Interaction (MMHCI) research.
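The multimodal fusion this survey discusses can be illustrated with a small example. Below is a minimal sketch of decision-level (late) fusion, not taken from the paper: each recognizer reports per-label confidences, and a reliability-weighted sum picks the winning interpretation. The modality names, labels, and weights are illustrative assumptions.

# Minimal sketch of decision-level (late) multimodal fusion.
# Not from the paper: modalities, labels, and weights are illustrative.
def fuse_late(hypotheses, weights):
    """Combine per-modality label confidences with a weighted sum.

    hypotheses: dict modality -> dict label -> confidence in [0, 1]
    weights:    dict modality -> relative reliability weight
    """
    scores = {}
    for modality, label_conf in hypotheses.items():
        w = weights.get(modality, 1.0)
        for label, conf in label_conf.items():
            scores[label] = scores.get(label, 0.0) + w * conf
    return max(scores, key=scores.get)

# Example: speech and gesture disagree; the weighted sum decides.
hypotheses = {
    "speech":  {"select": 0.6, "delete": 0.4},
    "gesture": {"select": 0.2, "delete": 0.7},
}
weights = {"speech": 0.6, "gesture": 0.4}
print(fuse_late(hypotheses, weights))  # -> "delete" (0.52 vs. 0.44)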
Multimodal interfaces: Challenges and perspectives
Journal of Ambient Intelligence and Smart Environments, 2009
"... Abstract. The development of interfaces has been a technology-driven process. However, the newly developed multimodal interfaces are using recognition-based technologies that must interpret human-speech, gesture, gaze, movement patterns, and other behavioral cues. As a result, the interface design ..."
Abstract
-
Cited by 11 (0 self)
- Add to MetaCart
The development of interfaces has been a technology-driven process. However, newly developed multimodal interfaces use recognition-based technologies that must interpret human speech, gesture, gaze, movement patterns, and other behavioral cues. As a result, interface design requires a human-centered approach. In this paper we review the major approaches to multimodal Human Computer Interaction, giving an overview of user and task modeling and of multimodal fusion. We highlight the challenges, open issues, and future trends in multimodal interfaces research.
Speech Pen: Predictive handwriting based on ambient multimodal recognition
Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI '05), 2005
"... It is tedious to handwrite long passages of text by hand. To make this process more efficient, we propose predictive handwriting that provides input predictions when the user writes by hand. A predictive handwriting system presents possible next words as a list and allows the user to select one to s ..."
Abstract
-
Cited by 8 (0 self)
- Add to MetaCart
(Show Context)
It is tedious to handwrite long passages of text. To make this process more efficient, we propose predictive handwriting that provides input predictions as the user writes by hand. A predictive handwriting system presents possible next words as a list and allows the user to select one to skip manual writing. Since it was not clear whether people would be willing to use prediction, we first ran a user study to compare handwriting with selecting from the list. The results show that, in Japanese, people prefer to select, especially when the expected performance gain from using selection is large. Based on these observations, we designed a multimodal input system, called speech-pen, that assists digital writing during lectures or presentations with background speech and handwriting recognition. The system recognizes speech and handwriting in the background and provides the instructor with predictions for further writing. The speech-pen system also allows the sharing of context information for predictions between the instructor and the audience; the result of the instructor's speech recognition is sent to the audience to support their own note-taking. Our preliminary study shows the effectiveness of this system and the implications for further improvements.
Author Keywords: predictive handwriting, speech recognition, handwriting
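The mechanism described, offering next-word candidates informed by ambient speech recognition, can be sketched as follows. This is not the speech-pen implementation; the prefix matching and frequency-boost scoring are assumptions made for illustration.

from collections import Counter

# Sketch of prefix-based word prediction biased by ambient speech, in the
# spirit of (not copied from) the speech-pen system described above.
def predict(prefix, speech_transcript, lexicon, k=3):
    """Rank lexicon words matching the handwritten prefix; words heard in
    the recognized ambient speech rank higher."""
    heard = Counter(speech_transcript.lower().split())
    candidates = [w for w in lexicon if w.startswith(prefix.lower())]
    candidates.sort(key=lambda w: heard[w], reverse=True)
    return candidates[:k]

lexicon = ["recognition", "recognizer", "recurrent", "record"]
speech = "handwriting recognition runs in the background recognition"
print(predict("rec", speech, lexicon))
# -> ['recognition', 'recognizer', 'recurrent']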
Multi-modal text entry and selection on a mobile device
In Proc. GI 2010, ACM, 2010
"... Rich text tasks are increasingly common on mobile devices, requiring the user to interleave typing and selection to produce the text and formatting she desires. However, mobile devices are a rich input space where input does not need to be limited to a keyboard and touch. In this paper, we present t ..."
Abstract
-
Cited by 6 (1 self)
- Add to MetaCart
(Show Context)
Rich text tasks are increasingly common on mobile devices, requiring the user to interleave typing and selection to produce the text and formatting she desires. However, mobile devices are a rich input space where input does not need to be limited to a keyboard and touch. In this paper, we present two complementary studies evaluating four different input modalities for performing selection in support of text entry on a mobile device. The modalities are: screen touch (Touch), device tilt (Tilt), voice recognition (Speech), and foot tap (Foot). The results show that Tilt is the fastest method for making a selection, but that Touch allows the highest overall text throughput. The Tilt and Foot methods, although fast, resulted in users making and subsequently correcting a high number of text entry errors, whereas the number of errors for Touch was significantly lower. Users experienced significant difficulty coordinating format selections with parallel text entry when using Tilt and Foot; this difficulty produced more errors and therefore lower text throughput. Touching the screen to perform a selection is slower than tilting the device or tapping the foot, but moving the fingers off the keyboard to make a selection ensured high precision when interleaving selection and text entry. Additionally, mobile devices offer a breadth of promising rich input methods that need to be carefully studied in situ when deciding whether each is appropriate for a given task; it is not sufficient to study the modalities independent of a natural task.
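The throughput and error figures in this study rest on standard text-entry metrics; a sketch of the conventional definitions (words per minute with a five-character "word", and an error rate based on edit distance) follows. These are the usual formulas from the text-entry literature, not necessarily the exact ones used in the paper.

# Conventional text-entry metrics (standard definitions; the paper may
# compute its measures differently).
def wpm(transcribed, seconds):
    """Words per minute, using the conventional 5-character 'word'."""
    return (len(transcribed) / 5.0) / (seconds / 60.0)

def error_rate(presented, transcribed):
    """Minimum string distance between presented and transcribed text,
    normalized by the length of the presented text."""
    m, n = len(presented), len(transcribed)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1,
                          d[i - 1][j - 1] + (presented[i - 1] != transcribed[j - 1]))
    return d[m][n] / max(m, 1)

print(wpm("the quick brown fox", 12.0))        # 19.0 WPM
print(error_rate("hello world", "helo world")) # ~0.09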
Analysis and semantic modeling of modality preferences in industrial human-robot interaction
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2015
"... Abstract — Intuitive programming of industrial robots is es-pecially important for small and medium-sized enterprises. We evaluated four different input modalities (touch, gesture, speech, 3D tracking device) regarding their preference, usability, and intuitiveness for robot programming. A Wizard-of ..."
Abstract
-
Cited by 3 (2 self)
- Add to MetaCart
(Show Context)
Intuitive programming of industrial robots is especially important for small and medium-sized enterprises. We evaluated four different input modalities (touch, gesture, speech, 3D tracking device) regarding their preference, usability, and intuitiveness for robot programming. A Wizard-of-Oz experiment was conducted with 30 participants; its results show that most users prefer touch and gesture input over 3D tracking device input, whereas speech input was the least preferred input modality. The results also indicate that there are gender-specific differences in preferred input modalities. We show how the results of the user study can be formalized in a semantic description language in such a way that a cognitive robotic workcell can benefit from the additional knowledge of input and output modalities, task parameter types, and preferred combinations of the two.
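The semantic formalization the authors describe maps task parameter types to preferred input modalities so the workcell can choose how to request each parameter. A minimal sketch of such a mapping follows; the parameter types, preference orderings, and plain-dictionary form are invented for illustration and do not reproduce the paper's description language.

# Hypothetical modality-preference description for a robotic workcell;
# parameter types and orderings are invented for illustration.
MODALITY_PREFERENCES = {
    # task parameter type -> modalities ordered by user preference
    "target_position":  ["touch", "gesture", "3d_tracker", "speech"],
    "object_selection": ["touch", "gesture", "speech", "3d_tracker"],
    "numeric_value":    ["touch", "speech", "gesture", "3d_tracker"],
}

def pick_modality(param_type, available):
    """Choose the most preferred modality currently available."""
    for modality in MODALITY_PREFERENCES.get(param_type, []):
        if modality in available:
            return modality
    return None

print(pick_modality("target_position", {"speech", "gesture"}))  # gesture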
Towards Simulating Humans in Augmented Multi-party Interaction
"... Human-computer interaction requires modeling of the user. A user profile typically contains preferences, interests, characteristics, and interaction behavior. However, in its multimodal interaction with a smart environment the user displays characteristics that show how the user, not necessarily con ..."
Abstract
- Add to MetaCart
(Show Context)
Human-computer interaction requires modeling of the user. A user profile typically contains preferences, interests, characteristics, and interaction behavior. However, in multimodal interaction with a smart environment the user displays characteristics that show how the user, not necessarily consciously, verbally and nonverbally provides the smart environment with useful input and feedback. Especially in ambient intelligence environments we encounter situations where the environment supports interaction between the environment, smart objects (e.g., mobile robots, smart furniture), and human participants. It is therefore useful for the profile to contain a physical representation of the user obtained by multimodal capturing techniques. We discuss the modeling and simulation of interacting participants in the European AMI (Augmented Multi-party Interaction) project.
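One way to picture such a profile, extended with a captured physical representation, is the data structure below; every field name here is invented for the sketch and is not taken from the paper.

from dataclasses import dataclass, field
from typing import Optional

# Illustrative user-profile sketch; field names are invented, not the
# paper's actual model.
@dataclass
class UserProfile:
    preferences: dict = field(default_factory=dict)      # e.g. preferred modality
    interests: list = field(default_factory=list)
    interaction_log: list = field(default_factory=list)  # observed (non)verbal behavior
    # Physical representation from multimodal capture, so the environment
    # can reason about the user's bodily state.
    body_pose: list = field(default_factory=list)        # joint positions
    gaze_target: Optional[str] = None

profile = UserProfile(preferences={"input": "speech"})
profile.interaction_log.append(("nod", 12.4))  # nonverbal feedback event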
Getting rid of “OK Google”: Individual Multimodal Input Adaption in Real World Applications
"... Abstract. Multimodal Interaction has the potential to significantly increase the ease of use of human computer interaction (HCI). At the same time, due to error-prone recognition based inputs, it is merely used in real-world applications. While literature on multimodal input fusion describes modelin ..."
Abstract
- Add to MetaCart
(Show Context)
Multimodal interaction has the potential to significantly increase the ease of use of human computer interaction (HCI). At the same time, due to error-prone recognition-based inputs, it is rarely used in real-world applications. While the literature on multimodal input fusion describes modeling of different user behaviors as a key to increased robustness, it has yet to prove its practical use. This article presents the design of a user study that applies previous theoretical work on individual user adaption in a smartwatch scenario. We describe the practical implementation of a process for error recognition and recovery based on the history of multimodal inputs, and a concrete scenario suitable for evaluating its practical impact on ease of use. This could prove the real-world value of individual multimodal input adaption and finally lead to multimodal systems less cumbersome than today's.
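The error recognition and recovery process described here, driven by the history of multimodal inputs, might look roughly like the following; the repeated-input heuristic, the window size, and the recovery action are assumptions for illustration, not the article's actual algorithm.

# Rough sketch of history-based error recognition for multimodal input;
# heuristic and threshold are illustrative assumptions.
def detect_likely_error(history, window=3):
    """Flag a probable recognition error when the user rapidly repeats
    near-identical inputs, a common correction pattern."""
    recent = history[-window:]
    if len(recent) < window:
        return False
    commands = {cmd for cmd, _modality in recent}
    return len(commands) == 1  # the same command retried repeatedly

history = [("open mail", "speech"), ("open mail", "speech"),
           ("open mail", "touch")]
if detect_likely_error(history):
    # Recovery: switch to the modality that finally succeeded, or ask the
    # user to confirm instead of silently re-executing.
    print("probable recognition error; adapting modality choice")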
Multimodal binding of parameters for task-based robot programming based on semantic descriptions of modalities and parameter types
MMWA-ae: boosting knowledge from Multimodal Interface Design, Reuse and Usability Evaluation
"... The technological progress designing new de-vices and the scientific growth in the field of Human-Computer Interaction are enabling new in-teraction modalities to move from research to com-mercial products. However, developing multimodal interfaces is still a difficult task due to the lack of tools ..."
Abstract
- Add to MetaCart
(Show Context)
Technological progress in designing new devices and scientific growth in the field of Human-Computer Interaction are enabling new interaction modalities to move from research to commercial products. However, developing multimodal interfaces is still a difficult task due to the lack of tools that consider not only code generation but also the usability of such interfaces. In this paper, we present the MultiModal Web Approach and its authoring environment, whose main goal is boosting the dissemination of project knowledge for future developments that benefit from solutions to recurring problems in the multimodal context.