Publications
Journal Papers
Developing a Sign Language Writing System: Focus on Necessity and Sign Language-Specific Features
Authors: Nobuko Kato, Yuto Nameta, Megumi Shimomori, Akihisa Shitara, Sumihiro Kawano, Yuhki Shiraishi
International Journal on Advances in Systems and Measurements • December 2024
Achieving universal access in professional settings necessitates the development of a computer-assisted sign language writing support system that considers the perceptual characteristics of deaf and hard of hearing individuals. This study explores sign language-specific features to elucidate the requirements for such a system. Analysis of news sentences expressed in sign language reveals the prevalence of distinct expressions such as topicalized and wh-cleft sentences. We explore a writing system that incorporates these features and conduct experiments involving transcribing sign language movies. We first examine whether writing sign language is necessary when learning in specialized contexts, then identify the key features of sign language sentences that must be written effectively, and clarify the functions the system requires based on actual writing experiments.
Technical Understanding from Interactive Machine Learning Experience: a Study Through a Public Event for Science Museum Visitors
Authors: Wataru Kawabe, Yuri Nakao, Akihisa Shitara, Yusuke Sugano
Interacting with Computers • March 2024
While AI technology is becoming increasingly prevalent in our daily lives, the comprehension of machine learning (ML) among non-experts remains limited. Interactive machine learning (IML) has the potential to serve as a tool for end users, but many existing IML systems are designed for users with a certain level of expertise. Consequently, it remains unclear whether IML experiences can enhance the comprehension of ordinary users. In this study, we conducted a public event using an IML system to assess whether participants could gain technical comprehension through hands-on IML experiences. We implemented an interactive sound classification system featuring visualization of internal feature representation and invited visitors at a science museum to freely interact with it. By analyzing user behavior and questionnaire responses, we discuss the potential and limitations of IML systems as a tool for promoting technical comprehension among non-experts.
HaptStarter: Designing haptic stimulus start system for deaf and hard of hearing sprinters
Authors: Akihisa Shitara, Miki Namatame, Sayan Sarcar, Yoichi Ochiai, Yuhki Shiraishi
International Journal of Human-Computer Studies • February 2024
In this study, we design and develop HaptStarter, a haptic stimulus start system, to improve the starting performance of deaf and hard of hearing (DHH) sprinters. A DHH person has physical abilities nearly equivalent to those of a hearing person; however, difficulties in perceiving audio information lead to differences in their performance in sports. Furthermore, the visual reaction time is slower than the auditory reaction time, while the haptic reaction time is equivalent to it. Nevertheless, light stimulus start systems are increasingly used in sprint races to aid DHH sprinters. In this study, we design a brand-new haptic stimulus start system for DHH sprinters, and we determine and leverage an optimum haptic stimulus interface. The proposed method has the potential to contribute toward the development of prototypes based on the universal design principle for everyone (DHH, blind and low-vision, and other disabled sprinters with wheelchairs or artificial arms or legs, etc.) by focusing on the overlapping area of sports and disability with human–computer interaction.
Development of a Shoulder-Mounted Tactile Notification System for the Deaf and Hard of Hearing
Authors: Yuta Murayama, Rito Emura, Shunya Tanaka, Yukiya Nakai, Akihisa Shitara, Fumio Yoneyama, Yuhki Shiraishi
Journal on Technology & Persons with Disabilities • 2023
In this study, a shoulder-mounted tactile notification system is proposed and developed to help deaf and hard-of-hearing (DHH) people go out safely and securely. The vibration detection rate, correct response rate, and reaction time are investigated for 24 DHH people using four types of vibrations with input voltages of 1.0, 3.0, 5.0, and 7.0 V. Additionally, the actuator locations and vibration durations that satisfy the two conditions of "being perceivable as a single moving point" and "allowing the direction to be recognized quickly without confusion" are investigated for six DHH people. From the experimental results, parameters suitable for tactile presentation are extracted based on objective and subjective surveys.
Sensor Glove Approach for Continuous Recognition of Japanese Fingerspelling in Daily Life
Authors: Yuhki Shiraishi, Akihisa Shitara, Fumio Yoneyama, Nobuko Kato
International Journal on Advances in Life Sciences • December 2022
To achieve smooth communication between the deaf and hard of hearing and hearing people, we developed a Japanese fingerspelling (JF) recognition system based on sensor gloves. A light and inexpensive sensor glove was adopted for daily use of the system. We conducted evaluation experiments using a convolutional neural network (CNN) to recognize 76 characters in JF. The target JF alphabet included 35 dynamic fingerspelling characters, which require both finger and wrist movement. The experimental results show that the average recognition rate of the developed system was approximately 70.0%. Additionally, we conducted a continuous fingerspelling recognition experiment using CNNs and long short-term memory (LSTM) networks, aiming to recognize consecutive fingerspelling. We proposed a dataset exploiting the characteristics of JF and selected 64 words according to differences in finger flexion, direction, and movement among signers. Using the collected data, we then conducted evaluation experiments with seven types of neural networks. The overlapping characteristics present in JF were exploited because finger flexion, finger extension, hand direction, and hand movements vary significantly among people currently learning sign language, people communicating in Japanese Sign Language (JSL), and people using JSL in their daily lives. Consequently, the average recognition rate (micro F-measure) for the 76 JF characters was approximately 92.1%. Based on the results of the single and continuous fingerspelling recognition experiments, we discuss issues concerning the recognition of JF characters and the development of sign language recognition systems.
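Continuous recognition from a glove's multi-channel sensor stream is commonly set up by slicing the time series into fixed-length windows before a CNN/LSTM classifies each one. The following is a minimal sketch of that preprocessing step in plain Python; the window and stride values are illustrative, not taken from the paper:

```python
def sliding_windows(stream, window, stride):
    """Split a multi-channel sensor stream (a list of per-timestep readings)
    into overlapping fixed-length windows for a sequence classifier."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    return [stream[i:i + window]
            for i in range(0, len(stream) - window + 1, stride)]

# 10 timesteps of 2-channel glove readings; windows of 4 with stride 2
stream = [[t, t + 0.5] for t in range(10)]
windows = sliding_windows(stream, window=4, stride=2)  # 4 windows
```

Each window would then be fed to the recognizer as one training or inference example; overlapping strides help capture transitions between characters.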
Notification, Wake-Up, and Feedback of Conversational Natural User Interface for the Deaf and Hard of Hearing
Authors: Takashi Kato, Akihisa Shitara, Nobuko Kato, Yuhki Shiraishi
International Journal on Advances in Software • June 2022
Most voice-based conversational natural user interfaces (NUIs), such as Amazon Alexa and Google Assistant, rely on speech input and output, posing an accessibility barrier for the deaf and hard of hearing (DHH). For example, DHH users may not be aware of notifications from the system, may not receive response information, and the system may have difficulty recognizing their wake-words. In designing a conversational NUI for DHH users, we consider that simply replacing speech information with sign language information does not suffice to create an accessible, comfortable user experience. In this study, we conducted an experiment with 12 DHH users to determine whether luminous notifications and text display methods showing sign language in place of the standard text output were effective, as well as whether gazing was effective as a wake-up method. The second experiment was conducted with 24 DHH users to identify better wake-up and feedback presentation methods. We propose conversational NUI guidelines for DHH users based on the results of these experiments. We examined accessibility options for DHH users at each step of the conversation with the voice user interface (VUI), and expect this work to serve as a basis for future conversational NUI design.
Proposal of a Vibration Stimulus Start System for the Deaf and Hard of Hearing
Authors: Akihisa Shitara, Miki Namatame, Yuhki Shiraishi
Journal on Technology & Persons with Disabilities • 2018
In this research, we proposed and developed a vibration stimulus start system suitable for deaf and hard of hearing (DHH) sprinters adapted to a crouch start, and developed a reaction time measurement system for light and vibration stimuli. The experimental results suggest that the vibration stimulus start system is suitable for DHH sprinters, although issues remain to be solved: the limitation on thumb and finger placement and the weakness of the vibration strength.
International Conferences (Full Paper)
A Shoulder-Mounted Tactile Notification System for d/Deaf and Hard of Hearing Individuals: Toward Practical Use in Multi-Person Meetings
Authors: Yuhki Shiraishi, Akihisa Shitara, Fumio Yoneyama, Yukiya Nakai, Nobuko Kato
2025 IEEE International Conference on Systems, Man, and Cybernetics (SMC) • October 2025
An Exploratory Study on Spatial Personalization in AR Caption Design for Deaf and Hard-Of-Hearing
Authors: Kosuke Funayama, Akihisa Shitara, Fumio Yoneyama, Nobuko Kato, Yuhki Shiraishi
2025 IEEE International Conference on Systems, Man, and Cybernetics (SMC) • October 2025
Look to Wake: A Gaze-Based Alternative to Spoken Wake Words for Deaf and Hearing Users
Authors: Yuhki Shiraishi, Akihisa Shitara, Fumio Yoneyama, Nobuko Kato
Eyes4Access Workshop in The 2025 ACM Symposium on Eye Tracking Research & Applications • May 2025
Conventional voice-activated wake words (e.g., “Hey Siri”) inherently exclude Deaf and Hard of Hearing (DHH) users, emphasizing the need for an inclusive method to initiate interactions with AI assistants. We propose Look to Wake, a gaze-based activation strategy that benefits both DHH and hearing individuals by aligning with Deaf communication norms, where eye contact—rather than explicit signing—commonly initiates conversations. To ensure safety and usability in dynamic settings like driving or walking, we introduce brief repeated glances (“flicker gaze”) augmented by subtle peripheral visual feedback. Drawing on recent HCI research, we identify key gaps—longitudinal evaluation, cross-modal comparisons, and technical robustness—and call for inclusive, participatory design to accommodate the diverse needs of the DHH community. Ultimately, our approach reimagines wake-word interactions by shifting from spoken commands to more intuitive, visually oriented methods, paving the way for accessible, multimodal assistant technologies.
Improving Continuous Japanese Fingerspelling Recognition with Transformers: A Comparative Study against CNN-LSTM Hybrids
Authors: Akihisa Shitara, Yuhki Shiraishi
The Eighteenth International Conference on Advances in Computer-Human Interactions (ACHI 2025) • May 2025
To achieve smooth communication between d/Deaf and hard of hearing (DHH) and hearing people, we have developed a continuous Japanese fingerspelling (JF) recognition system using sensor gloves and deep learning. We selected a light and inexpensive sensor glove suited to the system's daily use. In our prior system, using a machine learning model that combines a convolutional neural network (CNN) and long short-term memory (LSTM), the average micro F-measure over 76 JF characters reached 92.1%, yet the average macro F-measure was only 64.7%. Two problems cause this gap: the difficulty of distinguishing between static and dynamic fingerspellings, and the decreased recognition rate due to the large number of "ϕ" instances (the transition movements between characters). We therefore conducted a quantitative evaluation using the CNN-LSTM model as a baseline to verify whether a Transformer encoder could improve JF recognition rates. Consequently, for the 76 JF characters, the average micro and macro F-measures were 93.8% (0.2) and 77.4% (1.0), respectively.
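The micro/macro gap discussed here follows directly from how the two averages weight classes: micro-averaging is dominated by frequent labels such as the "ϕ" transition class, while macro-averaging weights all 76 characters equally. Below is a self-contained sketch of the two averages using only the standard library; the class names and counts are invented for illustration:

```python
from collections import Counter

def f1_scores(y_true, y_pred):
    """Compute per-class, micro-averaged, and macro-averaged F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    per_class = {}
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        per_class[c] = 2 * tp[c] / denom if denom else 0.0
    micro_denom = 2 * sum(tp.values()) + sum(fp.values()) + sum(fn.values())
    micro = 2 * sum(tp.values()) / micro_denom if micro_denom else 0.0
    macro = sum(per_class.values()) / len(classes)
    return per_class, micro, macro

# A frequent "phi"-like class recognized well masks a poorly recognized rare class:
y_true = ["phi"] * 8 + ["a", "ki"]
y_pred = ["phi"] * 8 + ["phi", "ki"]
per_class, micro, macro = f1_scores(y_true, y_pred)  # micro 0.90, macro ~0.65
```

Even with only one rare class misrecognized, the macro average drops sharply while the micro average barely moves, which mirrors why the paper tracks both.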
Sign Language Writing System: Focus on the Representation of Sign Language-Specific Features
Authors: Nobuko Kato, Yuito Nameta, Akihisa Shitara, Yuhki Shiraishi
The Seventeenth International Conference on Advances in Computer-Human Interactions (ACHI 2024) • May 2024
Achieving universal access in professional settings necessitates the development of computer-assisted input/output systems tailored to sign language, considering the perceptual characteristics of deaf and hard of hearing individuals. This study examines sign language-specific features to elucidate the requirements for a sign language writing support system. Analysis of news sentences expressed in sign language reveals the prevalence of distinct expressions such as topicalized and wh-cleft sentences. We explore a writing system that incorporates these features and conduct experiments involving transcribing sign language movies. The paper delineates the crucial features of sign language sentences for effective writing and outlines the requisite functions of the system based on actual writing experiments.
One-Handed Signs: Standardization for Vehicle Interfaces and Groundwork for Automated Sign Language Recognition
Authors: Akihisa Shitara, Taiga Kasama, Fumio Yoneyama, Yuhki Shiraishi
The Seventeenth International Conference on Advances in Computer-Human Interactions (ACHI 2024) • May 2024
When d/Deaf and hard of hearing (d/DHH) individuals drive vehicles, they may face issues arising not only from environmental sounds but also from auditory information in communication. We therefore investigated needs in driving scenarios and proposed an in-car sign recognition standard using one-handed signs to improve the communication issues d/DHH drivers face. Specifically, we focused on one-handed signs in sign language conversations among d/DHH individuals and selected sign language expressions based on one-handed signs, assuming sign language recognition. We selected one-handed signs for assumed driving scenarios in which such signs were likely to occur. Additionally, we conducted surveys with the d/DHH individuals involved to assess whether they found these signs natural and acceptable. We also discuss the annotation rules for labels in datasets intended for sign language recognition.
Visually-Structured Written Notation Based on Sign Language for the Deaf and Hard-of-Hearing
Authors: Nobuko Kato, Yuhki Hotta, Akihisa Shitara, Yuhki Shiraishi
The 15th International Conference on Computer Supported Education (CSEDU 2023) • April 2023
Deaf and hard-of-hearing (DHH) students often face challenges in comprehending highly specialized texts, needing a long time to understand their content. This may be due to factors such as the complexity of Japanese syntax, which differs from that of Japanese sign language. This study describes the results of a questionnaire on a notation method that we proposed based on sign language for DHH individuals. The results revealed that DHH individuals who use sign language correctly answered more questions on sentence structure when using the proposed notation method than when using Japanese sentences.
Designing Gestures for Digital Musical Instruments: Gesture Elicitation Study with Deaf and Hard of Hearing People
Authors: Ryo Iijima, Akihisa Shitara, Yoichi Ochiai
In the 24th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’22) • October 2022
When playing musical instruments, deaf and hard-of-hearing (DHH) people typically sense their music from the vibrations transmitted by the instruments or the movements of their bodies while performing. Sensory substitution devices now exist that convert sounds into light and vibrations to support DHH people’s musical activities. However, these devices require specialized hardware, and the marketing profiles assume that standard musical instruments are available. Hence, a significant gap remains between DHH people and their musical performance enjoyment. To address this issue, this study identifies end users’ preferred gestures when using smartphones to emulate the musical experience based on the instrument selected. This gesture elicitation study applies 10 instrument types. Herein, we present the results and a new taxonomy of musical instrument gestures. The findings will support the design of gesture-based instrument interfaces to enable DHH people to more directly enjoy their musical performances.
See-Through Captions in a Museum Guided Tour: Exploring Museum Guided Tour for Deaf and Hard-of-Hearing People with Real-Time Captioning on Transparent Display
Authors: Ippei Suzuki, Kenta Yamamoto, Akihisa Shitara, Ryosuke Hyakuta, Ryo Iijima, Yoichi Ochiai
Computers Helping People with Special Needs (ICCHP-AAATE 2022) • July 2022
Access to audible information for deaf and hard-of-hearing (DHH) people is an essential component as we move towards a diverse society. Real-time captioning is a technology with great potential to help the lives of DHH people, and various applications utilizing mobile devices have been developed. These technologies can improve the daily lives of DHH people and can considerably change the value of audio content provided in public facilities such as museums. We developed a real-time captioning system called See-Through Captions that displays subtitles on a transparent display and conducted a demonstration experiment to apply this system to a guided tour in a museum. Eleven DHH people participated in this demonstration experiment, and through questionnaires and interviews, we explored the possibility of utilizing the transparent subtitle system in a guided tour at the museum.
Sign Language Conversational User Interfaces Using Luminous Notification and Eye Gaze for the Deaf and Hard of Hearing
Authors: Takashi Kato, Akihisa Shitara, Nobuko Kato, Yuhki Shiraishi
The Fourteenth International Conference on Advances in Computer-Human Interactions (ACHI 2021) • July 2021
We investigate the design of a user-friendly natural user interface for the deaf and hard of hearing (DHH). Voice-based conversational user interfaces (CUIs), such as Amazon Alexa and Google Assistant, are becoming increasingly popular among consumers. DHH users may not be aware of notifications from CUIs, may not be able to obtain response information, and may have difficulty waking up the CUIs. In this study, we designed a system that adds luminous notifications and sign language to the CUI and conducted Wizard of Oz experiments to investigate whether the system can provide an optimal user experience for DHH users. The results suggest that luminous notifications improve usability and make notifications easier to notice. After assessing the necessity of sign language/text display, we found that people with longer sign language histories tend to use sign language, and all people require the use of a text display. The percentage of DHH users who gazed at the system before entering commands (93.4%) also suggests that gazing can be an effective way to wake up the system. Our findings provide guidance for future CUI designs to improve accessibility for DHH users.
Personalized Navigation that Links Speaker’s Ambiguous Descriptions to Indoor Objects for Low Vision People
Authors: Jun-Li Lu, Hiroyuki Osone, Akihisa Shitara, Ryo Iijima, Bektur Ryskeldiev, Sayan Sarcar, Yoichi Ochiai
Universal Access in Human-Computer Interaction. Access to Media, Learning and Assistive Environments. HCII 2021 • July 2021
Indoor navigation systems guide a user to a specified destination. However, current navigation systems face challenges when a user provides ambiguous descriptions of the destination. This commonly happens with visually impaired people or those unfamiliar with new environments. For example, in an office, a low-vision person may ask the navigator, "Take me to where I can take a rest." The navigator may recognize each object (e.g., a desk) in the office but may not recognize where the user can take a rest. To bridge this gap in understanding of the surroundings between low-vision people and a navigator, we propose a personalized interactive navigation system that links the user's ambiguous descriptions to indoor objects. We build a navigation system that automatically detects and describes objects in the environment using neural-network models. Further, we personalize the navigation by re-training the recognition models on previous interactive dialogues, which may contain the correspondence between the user's understanding and the visual images or shapes of objects. In addition, we utilize a GPU cloud to support the computational cost and smooth the navigation by locating the user's position using Visual SLAM. We discuss further research on customizable navigation with multi-aspect perceptions of disabilities and the limitations of AI-assisted recognition.
Sensor Glove Approach for Japanese Fingerspelling Recognition System Using Convolutional Neural Networks
Authors: Tomohiko Tsuchiya, Akihisa Shitara, Fumio Yoneyama, Nobuko Kato, Yuhki Shiraishi
The Thirteenth International Conference on Advances in Computer-Human Interactions (ACHI 2020) • November 2020
We have developed a Japanese fingerspelling recognition system based on a sensor glove using deep learning to achieve smooth communication between deaf and hard-of-hearing people and hearing people. In this research, we conducted evaluation experiments using a convolutional neural network for 76 characters of Japanese fingerspelling. In the developed system, we adopt a sensor glove that is light and inexpensive. In addition, the target Japanese fingerspelling includes 35 dynamic fingerspelling characters, which require both finger and wrist movement to express. Experimental results show that the average recognition rate is approximately 70.0%. Based on the results, we discuss the peculiarities of Japanese fingerspelling and improvements to sensor gloves and algorithms.
Alarm Sound Classification System in Smartphones for the Deaf and Hard-of-Hearing Using Deep Neural Networks
Authors: Yuhki Shiraishi, Takuma Takeda, Akihisa Shitara
The Thirteenth International Conference on Advances in Computer-Human Interactions (ACHI 2020) • November 2020
For deaf and hard-of-hearing people to go out safely, it is important to recognize alarm sounds (horns, bicycle bells, ambulance sirens, etc.) among various environmental sounds and to convey the kind of sound to them, even in noisy environments. In this paper, we propose and develop an alarm sound classification system using deep neural networks on smartphones, which can always be carried when going out. In addition, evaluation experiments are performed to verify the effectiveness of the system using 5-fold cross-validation. Furthermore, we evaluate the classification rate for unlearned data and for training data augmented with data downloaded from the web, and discuss the limitations of the system and how to make it more useful.
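The 5-fold cross-validation mentioned above partitions the labeled clips into five folds, trains on four, tests on the held-out one, and rotates. A minimal sketch of such a splitter using only the standard library follows; the fold count and dataset size are illustrative:

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.
    Every sample appears in exactly one test fold."""
    indices = list(range(n_samples))
    # Distribute any remainder across the first n_samples % k folds.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

# 10 alarm-sound clips, 5 folds: each test fold holds 2 clips
splits = list(k_fold_splits(10, 5))
```

Averaging the classification rate over the five held-out folds gives a less optimistic estimate than a single train/test split, which is why the paper uses it.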
Tactile Stimulus Start System Proposal for Deaf and Hard of Hearing
Authors: Akihisa Shitara, Miki Namatame, Yuhki Shiraishi
The 7th International Conference for Universal Design • March 2019
Proposal of a Vibration Stimulus Start System for Deaf and Hard of Hearing
Authors: Akihisa Shitara, Miki Namatame, Yuhki Shiraishi
33rd CSUN Assistive Technology Conference • March 2018
International Conferences (Demo / Poster)
Smartphone Drum: Gesture-based Digital Musical Instruments Application for Deaf and Hard of Hearing People
Authors: Ryo Iijima, Akihisa Shitara, Sayan Sarcar, Yoichi Ochiai
In Symposium on Spatial User Interaction (SUI ’21) • November 2021
Smartphone applications that allow users to enjoy playing musical instruments have emerged, opening up numerous related opportunities. However, it is difficult for deaf and hard of hearing (DHH) people to use these apps because of limited access to auditory information. When using real instruments, DHH people typically feel the music from the vibrations transmitted by the instruments or the movements of the body, which is not possible when playing with these apps. We introduce "Smartphone Drum," a smartphone application that presents a drum-like vibrotactile sensation when the user makes a drumming motion in the air, wielding the smartphone like a drumstick. We implemented an early prototype and received feedback from six DHH participants. We discuss the technical implementation and the future of new vibration-based instruments.
Word Cloud for Meeting: A Visualization System for DHH People in Online Meetings
Authors: Ryo Iijima, Akihisa Shitara, Sayan Sarcar, Yoichi Ochiai
In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’21) • October 2021
Deaf and hard of hearing (DHH) people have limited access to auditory input, so they mainly receive visual information during online meetings. In recent years, the usability of systems that visualize the ongoing topic in a meeting has been confirmed, but such systems have not been verified in remote meetings that include DHH people. One possible reason is that visual dispersion occurs when there are multiple sources of visual information. In this study, we introduce "Word Cloud for Meeting," a system that generates a separate word cloud for each participant and displays it in the background of each participant's video to visualize who is saying what. We conducted an experiment with seven DHH participants and obtained positive qualitative feedback on the ease of recognizing topic changes. However, when topics changed in rapid succession, the display was found to be distracting. Additionally, we discuss design implications for visualizing topics for DHH people in online meetings.
See-Through Captions: Real-Time Captioning on Transparent Display for Deaf and Hard-of-Hearing People
Authors: Kenta Yamamoto, Ippei Suzuki, Akihisa Shitara, Yoichi Ochiai
In The 23rd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’21) • October 2021
Real-time captioning is a useful technique for deaf and hard-of-hearing (DHH) people to talk to hearing people. With the improvement in device performance and the accuracy of automatic speech recognition (ASR), real-time captioning is becoming an important tool for helping DHH people in their daily lives. To realize higher-quality communication and overcome the limitations of mobile and augmented-reality devices, real-time captioning that can be used comfortably while maintaining nonverbal communication and preventing incorrect recognition is required. Therefore, we propose a real-time captioning system that uses a transparent display. In this system, the captions are presented on both sides of the display to address the problem of incorrect ASR results, and the highly transparent display makes it possible to see both the body language and the captions.
A Preliminary Study on Understanding Voice-only Online Meetings Using Emoji-based Captioning for Deaf or Hard of Hearing Users
Authors: Kotaro Oomori, Akihisa Shitara, Tatsuya Minagawa, Sayan Sarcar, Yoichi Ochiai
In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’20) • October 2020
In the midst of the coronavirus disease 2019 pandemic, online meetings are rapidly increasing. Deaf or hard of hearing (DHH) people participating in an online meeting often face difficulties in capturing the affective states of other speakers. Recent studies have shown the effectiveness of emoji-based representation of spoken text to capture such affective states. Nevertheless, in voice-only online meetings, it is still not clear how emoji-based spoken texts can assist DHH people in understanding the feelings of speakers without perceiving their facial expressions. We therefore conducted a preliminary experiment to understand the effect of emoji-based text representation during voice-only online meetings by leveraging an emoji-based captioning system. Our preliminary results demonstrate the necessity of designing an advanced system to help DHH people understand voice-only online meetings more meaningfully.
Domestic Conferences
Reclassification of Japanese Sign Language Videos in YouTube-SL-25 and the Need for Staged Annotation
Authors: Kosuke Funayama, Akihisa Shitara, Nobuko Kato, Yuhki Shiraishi
24th Forum on Information Technology (FIT2025) • September 2025
This study carried out staged annotation of the 1,073 videos labeled "JSL" in YouTube-SL-25, using four classification axes that are visually identifiable without specialized knowledge of sign language: (i) presence of sign language songs, (ii) number of simultaneous performers, (iii) signer attributes, and (iv) concurrent use of spoken Japanese, and examined the gap between the label and the forms of expression that actually appear. The results show a mixture of videos whose forms differ greatly; for example, sign language songs accounted for 163 videos (15.2%). Signer attributes were centered on Deaf signers but also included CODAs and hearing signers, and several cases were judged intermediate based on word order and use of non-manual grammar. The method is effective as a practical framework for reclassifying and organizing the forms of expression under the JSL label ahead of specialized annotation. Future issues include introducing grammatical criteria, composite annotation that accounts for variation in voice use, and analysis based on signers' self-identification.
Evaluation of Communication Support between Deaf/Hard-of-Hearing and Hearing People Using a Speech Recognition System with Microphone-Array Sound Source Localization
Authors: 石濱日菜,船山滉介,設楽明寿,加藤伸子,川田夏希,羽原恭寛,白石優旗
IPSJ SIG Accessibility, 27th Meeting • March 2025
It is difficult for deaf and hard-of-hearing (DHH) people to participate in conversations among multiple hearing people. One reported example is "dinner table syndrome," in which DHH people feel excluded because they cannot fully take part in conversations among their hearing family members. What is needed is therefore not merely grasping the content of a conversation, but conversations that feel spontaneous and enjoyable. With real-time captioning that does not distinguish speakers (TL), even accurate transcription cannot identify who is speaking in real time, so speakers had to be identified afterward. In this study, for a group of one DHH person and three hearing people, we used "VUEVO," a real-time speaker identification system based on sound source localization with a microphone array, and defined metrics for realizing smooth communication. We then evaluated whether a display method using this system satisfied these metrics, and whether the metrics themselves were sufficiently useful, in situations such as conversations in "Word Wolf," a word-based strategy game, and casual chat.
Improving the Accuracy of Continuous Fingerspelling Recognition Using Transformers: A Comparative Analysis with a CNN-LSTM Hybrid Model
Authors: Akihisa Shitara, Yuhki Shiraishi
IPSJ SIG Accessibility, 26th Meeting • December 2024
To realize smooth communication between deaf and hard-of-hearing people and hearing people, we have developed a continuous fingerspelling recognition system using sensor gloves and deep learning. With a learning model combining a CNN and an LSTM, we reported an average micro F-measure of 92.1% but an average macro F-measure of only 64.7%. The reasons include the difficulty of distinguishing static from dynamic fingerspelling and a decreased recognition rate caused by the large amount of "transition" data between one fingerspelled character and the next. We therefore conducted a quantitative evaluation, with the CNN-LSTM model as a baseline, of whether using a Transformer contributes to improving fingerspelling recognition rates. We also evaluated performance by cross-validation, selecting the data used for training, to examine whether the influence of individual differences was reduced.
Verifying the Effect of Small Devices in a Hands-On Machine Learning Development Event for General Users
Authors: 川辺航,福田大翔,設楽明寿,中尾悠里,菅野裕介
WISS 2024 (The 32nd Workshop on Interactive Systems and Software), JSSST SIG Technical Report Series (Web) • 2024
In this study, we propose a system that allows users without a technical background in machine learning to train and evaluate machine learning models, and verified its effect through hands-on model development events. Systems centered on screen operation, such as PCs and tablets, have been noted to feel unapproachable to users with little technical experience and to degrade the quality of the experience. We therefore adopted a small device aimed at improving user engagement. Beyond screen operation, the device lets users design a sound classification model through physical interactions such as buttons, vibration, and wheel movement. In a user experiment, we investigated the device's effect on the development experience through users' subjective evaluations, their ideas for real-world machine learning applications, and the training data they created. The results show that the physical UI promoted active engagement in development and idea generation, while also highlighting challenges in users' technical understanding.
An Exploratory Study of Caption Presentation for Deaf and Hard-of-Hearing People Using AR Glasses: Focusing on the Effect of Caption Position
Authors: Kosuke Funayama, Akihisa Shitara, Fumio Yoneyama, Nobuko Kato, Yuhki Shiraishi
IPSJ SIG Accessibility, 25th Meeting • July 2024
AR glasses are attracting attention as a means of information accessibility for deaf and hard-of-hearing people. Conventional HMD technology simply mirrors captions, which follow the target as the head moves. AR glasses, beyond conventional HMD techniques, also allow the user to flexibly set where captions are presented in AR space. Focusing on the effect of caption position, the purpose of this study is to clarify, through experiments with three presentation methods, 1) conventional HMD mirroring, 2) a fixed position in AR space, and 3) a user-chosen position in AR space, how each affects comprehension of video content and interpersonal communication. We also analyzed the issues and advantages of each method in detail based on the experimental results and feedback, aiming to explore the potential and improvements of caption presentation with AR glasses. In the experiment, participants tried each presentation method while watching an e-sports match video and communicating with a signer seated next to them. Afterward, we assessed comprehension under each method and collected feedback on its issues and advantages.
An Information Transmission Method for Deaf and Hard-of-Hearing Basketball Players: Proposing One-Handed Signs and Introducing Information Relayers
Authors: 新林玖妃,設楽明寿,米山文雄,加藤伸子,白石優旗
IPSJ SIG Accessibility, 24th Meeting • March 2024
Basketball is a sport with rapid switches between offense and defense, requiring accurate information to be conveyed quickly during games. However, deaf and hard-of-hearing players, with or without hearing aids or cochlear implants, have particular difficulty receiving instructions from behind. This study therefore proposes "one-handed signs," which have little impact on play during games, and newly introduces "information relayers" who make those instructions perceivable at all times. Multiple relayers are placed around the court; they read the one-handed signs expressed by players and express the same signs themselves, allowing players to understand instructions coming from behind. In this paper, we conduct evaluation experiments on the validity of the proposed one-handed signs and the optimal placement of relayers, and discuss issues toward practical use.
Examining the Effectiveness of a Shoulder-Mounted Tactile Notification System for Deaf and Hard-of-Hearing People: Evaluation in a VR Environment Simulating a Multi-Person Meeting
Authors: 村山悠太,設楽明寿,米山文雄,加藤伸子,鮫島健一郎,白石優旗
IPSJ SIG Accessibility, 24th Meeting • March 2024
Because deaf and hard-of-hearing people have difficulty obtaining sound information, tactile presentation is needed as a substitute. Existing systems that convey sounds through touch, however, share the problem that directional information is hard to present. Toward practical use of the shoulder-mounted notification system we have been developing, we therefore conducted an evaluation experiment in a multi-person meeting setting. To ensure reproducibility and ease of data collection, the experimental environment was built with VR equipment and captioned meeting video. During the meeting, either a "name" or an "alarm" is displayed at random, and at the end of each utterance participants answer who was speaking. This was carried out with and without the developed device. Statistical hypothesis tests on the number of times the "name" or "alarm" display was noticed and the time from display to noticing confirmed, at the 5% significance level, that participants noticed more often and more quickly when wearing the device. No significant difference was found in the number of correct answers to "who was speaking." A questionnaire, however, confirmed at the 5% significance level that significantly many participants wanted the direction of a sound's source conveyed by vibration.
A Case Report on Live-Streaming Sign Language Conversations among Multiple Deaf People: the xDiversity Spin-Off Symposium Series xTalk
Authors: 設楽明寿,鈴木一平,田中沙紀子,菅野裕介,落合陽一
IPSJ SIG Accessibility, 24th Meeting • March 2024
The xDiversity project, which co-creates with people with various disabilities, has continued to run "xTalk," a spin-off symposium series for casually discussing various viewpoints and perspectives with guests, in order to broaden its activities. Specifically, the talks are live-streamed on YouTube from the xDiversity project's laboratory at Miraikan (the National Museum of Emerging Science and Innovation), and archives remain public after streaming. In episodes where several of the performers were Deaf, the challenge was how to balance casual on-site sign language conversation with the visibility of signing in the stream. This paper reports, as a case study, what we perceived as challenges and how we attempted to solve them in the two past episodes in which all performers were Deaf.
A Study of a Notation Method That Visually Structures Sentences for Deaf and Hard-of-Hearing People
Authors: Nobuko Kato, Yuhki Hotta, Akihisa Shitara, Yuhki Shiraishi
197th Meeting of the Human Interface Society SIG on "Assistive Technology for Older Adults and People with Disabilities Based on Individual Needs, and Related Topics" (SIG-ACI-30) • December 2022
Development of a Shoulder-Mounted Tactile Notification System Suited to Deaf and Hard-of-Hearing People: Using Phantom Sensation with Vibration and the "Shoulder Tap"
Authors: 江村里都,村山悠太,米山文雄,設楽明寿,田中俊也,中居志紀也,金村果林,白石優旗
IPSJ SIG Accessibility, 19th Meeting • July 2022
Deaf and hard-of-hearing people have difficulty obtaining sound information, and tactile presentation by vibration, for example via smartphones and smartwatches, is needed as a substitute. However, smartphone vibrations can go unnoticed in some cases, and smartwatches have difficulty presenting directional information. In this study, we therefore develop a shoulder-mounted notification system that uses moving phantom sensation. As an initial investigation, we searched for actuator placements and vibration durations satisfying two conditions: "perceivable as a single moving point" and "direction recognizable quickly without confusion." Specifically, two of three actuator positions were used, the kensei point (top of the shoulder), inward of it, and outward of it, with vibration durations from 0.1 s to 0.5 s. The results show that vibrating from the kensei point outward for 0.4 s to 0.5 s was optimal. An experiment on a vibration pattern modeled on the "shoulder tap," a customary way of getting attention in Deaf culture, confirmed its usefulness. A questionnaire further confirmed the effectiveness of, and issues with, the developed system.
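A moving phantom sensation between two adjacent actuators is commonly rendered by crossfading their drive levels so that a single vibration point appears to travel from one to the other. The following is a minimal sketch of a linear crossfade; the actual drive profile used in the system described above is not specified here:

```python
def phantom_sensation_levels(progress):
    """Linear amplitude crossfade between two vibrators (drive levels 0.0-1.0)
    that makes a single vibration point appear to move from actuator A to B."""
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be in [0, 1]")
    return (1.0 - progress, progress)

# Sweep over a 0.4 s movement sampled every 0.1 s: A fades out as B fades in
steps = [phantom_sensation_levels(t / 4) for t in range(5)]
```

In practice the crossfade would be mapped onto PWM duty cycles for the two actuators over the chosen vibration duration.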
Identifying the Optimal Tactile Stimulus Interface for Start Signals for Deaf Athletes: Toward a Universal Design of Sprint Start Systems in Track and Field
Authors: Akihisa Shitara, Miki Namatame, Yuhki Shiraishi
The 81st National Convention of IPSJ • March 2019
In current deaf athletics, a light stimulus start system is used in place of the starting signal sound. However, reaction times to visual stimuli are generally slower than those to auditory stimuli. We therefore focused on touch, whose reaction time is nearly the same as that for hearing, and proposed and developed a tactile stimulus start system. In this study, we conducted reaction time measurement experiments to identify the optimal tactile stimulus interface for deaf track and field athletes. Specifically, examining combinations of three push-type tactile stimuli and two contact locations (six in total), we identified the fastest and most stable combination as the optimal tactile stimulus interface. Furthermore, reaction times to the push-type tactile stimulus were shown to be consistently faster than those to the existing LED-type visual stimulus.
Identifying Tactile Stimulus Start Signals Suited to Deaf Athletes in the Crouch Start
Authors: Akihisa Shitara, Miki Namatame, Yuhki Shiraishi
IPSJ SIG Accessibility, 7th Meeting • August 2018
Currently, deaf sprint events use a crouch start signaled by light stimuli (LED-type visual stimuli). In general, however, tactile perception time is reported to be significantly shorter than visual perception time (by about 40 ms) and nearly the same as auditory perception time (a difference of about 5 ms). We have therefore proposed a start system using tactile stimuli for deaf athletes. In this paper, to identify tactile stimuli suited to deaf athletes, we conducted reaction time comparison experiments with two types of tactile stimuli: push type and vibration type. Data analysis and a questionnaire showed that the push type is better suited to start signals than the vibration type. Furthermore, comparing the push-type tactile stimulus with the LED-type visual stimulus, no significant difference in reaction time was found, but the potential and issues of the push-type tactile stimulus were clarified.
Development of a Reaction Time Measurement System for the Crouch Start Using Vibration Stimuli
Authors: Akihisa Shitara, Yuhki Shiraishi
IPSJ SIG Accessibility, 3rd Meeting • March 2017
In deaf athletics, start systems using light stimuli have been introduced, but prior work shows that tactile perception time is shorter than visual perception time. We have therefore proposed a start system using vibration stimuli as a new substitute sense. However, to our knowledge, no comparison experiment against visual and auditory stimuli has been conducted for whole-body reaction times such as the crouch start. In this paper, we therefore develop a reaction time measurement system for the crouch start and examine its effectiveness.
Proposal of a Vibration Stimulus Start System Suited to Deaf Athletics
Authors: Akihisa Shitara, Yuhki Shiraishi
IPSJ SIG Accessibility, 2nd Meeting • December 2016
In recent years, start systems using light stimuli have been introduced in deaf athletics to replace the conventional pistol-sound start signal. However, previous experiments report that visual perception time lags auditory perception time by about 30 ms, whereas tactile perception time lags it by only about 5 ms. With light stimuli, a start can also be delayed by blinking. In this study, we propose a start system using vibration stimuli as a new substitute sense for deaf athletics.