“Can we hear lost voices again?” Restoring ‘my voice’ by reading light and reviving with AI
Pohang University of Science & Technology (POSTECH)
Image: Schematic diagram illustrating the difference between communication using conventional voice-based methods and the developed silent speech interface. (Credit: POSTECH)
Hearing words even when they are spoken in silence: a new technology has been developed that reads the subtle movements of the neck muscles using light and employs AI to reconstruct them into an actual voice.
A research team led by Professor Sung-Min Park (Department of IT Convergence Engineering, Mechanical Engineering, Electrical Engineering, and the Graduate School of Convergence) and Dr. Sunguk Hong (Department of Mechanical Engineering) at POSTECH (Pohang University of Science and Technology) conducted this study. The findings were published in the online edition of Cyborg and Bionic Systems, a Science Partner Journal in the field of biomedical engineering.
The research began with tiny changes that occur around the neck when a person speaks. It is not just the vocal cords that create sound. Whenever we speak, the muscles and skin around the neck move together, drawing an invisible "movement map" on the skin. The research team focused on the fact that these microscopic movements contain information about what the person intends to say.
To capture this information, the research team developed a ‘Multiaxial Strain Mapping Sensor.’ This sensor, which combines a miniature camera with small reference markers on a soft silicone material, can be conveniently worn on the neck and detects even the most minute skin movements. The wearing position and tightness can be adjusted for the individual, and an algorithm automatically corrects errors that may occur when the device is reattached, allowing it to operate stably in daily environments.
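The article does not publish the team's algorithm, but the principle it describes (a camera tracking reference markers on a soft patch and inferring skin deformation from their displacements) can be sketched as follows. This is a minimal illustration, not the POSTECH implementation: it assumes hypothetical 2D marker coordinates captured at rest and during articulation, fits an affine deformation by least squares, and extracts an in-plane small-strain tensor.

```python
import numpy as np

def estimate_strain(rest_pts, deformed_pts):
    """Estimate the in-plane strain tensor from tracked marker positions.

    rest_pts, deformed_pts: (N, 2) arrays of marker coordinates before
    and during articulation (hypothetical data, not from the paper).
    Fits deformed ~= rest @ F.T + t, then returns the small-strain
    tensor eps = (F + F.T)/2 - I.
    """
    # Augment rest coordinates with a constant column to absorb translation.
    A = np.hstack([rest_pts, np.ones((len(rest_pts), 1))])  # (N, 3)
    X, *_ = np.linalg.lstsq(A, deformed_pts, rcond=None)    # (3, 2)
    F = X[:2].T                                             # 2x2 deformation gradient
    eps = 0.5 * (F + F.T) - np.eye(2)
    return eps  # eps[0,0], eps[1,1]: axial strains; eps[0,1]: shear

# Example: four markers undergoing a 10% stretch along the x-axis.
rest = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
deformed = rest * np.array([1.1, 1.0])
eps = estimate_strain(rest, deformed)  # eps[0,0] is approximately 0.1
```

Because the fit also recovers the translation term, a rigid shift of the whole patch (for example, after reattaching the device) leaves the estimated strain unchanged, which hints at why marker-based tracking can tolerate repositioning errors of the kind the article mentions.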
An AI model analyzes the strain patterns collected by the sensor. It estimates the words or sentences the user intends to say and combines the result with voice synthesis technology trained on the individual's vocal characteristics to reproduce that person's actual voice. Even without producing any sound, the system "reads" the speech and converts it into a voice.
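The actual decoder is reportedly an AI model trained on the user's data; as a toy stand-in, the mapping from a strain-pattern feature vector to an intended word can be illustrated with a nearest-template classifier. The word templates and feature values below are hypothetical placeholders, not measurements from the study.

```python
import numpy as np

# Hypothetical per-word strain feature vectors (stand-ins for learned
# representations in the real AI decoder).
TEMPLATES = {
    "hello": np.array([0.02, 0.05, 0.01, -0.03]),
    "yes":   np.array([0.04, -0.01, 0.03, 0.00]),
    "no":    np.array([-0.02, 0.01, -0.04, 0.02]),
}

def decode_word(strain_features):
    """Return the vocabulary word whose template is closest in L2 distance."""
    return min(TEMPLATES, key=lambda w: np.linalg.norm(TEMPLATES[w] - strain_features))

# A feature vector close to the "hello" template decodes to "hello";
# the decoded text would then be passed to a personalized TTS engine.
word = decode_word(np.array([0.03, 0.04, 0.02, -0.02]))
```

In the reported system this classification step is replaced by a trained neural network, and the decoded text is fed to a synthesizer conditioned on the user's own vocal characteristics.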
Existing voice restoration technologies relied on biological signals such as electromyography (EMG) or electroencephalography (EEG), but complex equipment and uncomfortable wearables limited their use in daily life. The research team addressed this problem with a wearable sensor and confirmed through experiments that speech could be reconstructed with high accuracy even in noisy environments such as factories.
The scope of application is also broad. It is expected to be used in various fields, such as communication assistance for patients who have lost their voices due to vocal cord diseases or laryngeal surgery, communication technology for industrial sites without microphones or radios, and even "silent communication" in libraries or conference rooms.
Professor Sung-Min Park, who led the study, said, "We hope this technology will accelerate the day when patients with speech disorders can reclaim their voices," adding, "It is a noteworthy technology because it has a wide range of potential applications, including assisting laryngectomized patients, communicating in noisy industrial environments, and even supporting silent conversations."
Meanwhile, this research was supported by the Doctoral Course Research Grant Program and the Mid-career Researcher Program of the Ministry of Education, and by the Bio & Medical Technology Development Program and the Pioneering Convergence Science and Technology Development Program of the Ministry of Science and ICT.
Journal
Cyborg and Bionic Systems
Article Title
Soft Multiaxial Strain Mapping Interface with AI-Driven Decoding for Silent Speech in Noise