Over the course of two design iterations, I created a live captioning solution for Deaf and hard of hearing (DHH) people and English as a second language (ESL) speakers that makes following conversations and identifying context clues easier.
Sole contributor: Product Designer, UX Designer, UX Researcher, Interaction Designer, Visual Designer, Back-End Developer, Information Architect
The goal was to create a low-fidelity prototype that takes in audio, converts it to text, and displays the text as an AR object in a HoloLens app. This MVP serves as a proof of concept for future iterations on more advanced devices.
Microsoft HoloLens, Visual Studio Code, Unity
DHH and ESL people face difficulties when engaging with hearing people. As technology has progressed, we have found progressively better ways to treat medical conditions, with deafness and hearing loss as a notable exception: hearing aids are long established and progress has been made on cochlear implants, but modern technology hasn't yet been leveraged to address these issues.
I performed research in the form of interviews, personas, and competitor analysis to better understand the user base and identify users' expectations. Through these methods I developed a touchstone that I used to periodically re-center myself and the project around the user. The touchstone is metaphorical: it represents an iteration of the project that addresses only the core issues identified by the users, and it can be updated as new information is gathered.
I created four personas based on the four major user groups: Deaf, hard of hearing, ESL, and neurotypical hearing. Quotes are paraphrased from interviews with the respective stakeholders.
Instead of sending the audio out to Google, the app processes it locally to increase speed and reduce dependency on internet connectivity. While the onboard mic isn't the best, that limitation may work in my favor: it reduces the effective pickup distance and filters out conversations that don't happen in the immediate vicinity.
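The recognition code itself isn't reproduced here, but as a rough sketch, the capture-to-text step could be wired up in Unity with the engine's built-in DictationRecognizer (an assumption on my part; it delegates to the Windows speech stack, so whether recognition stays fully on-device depends on the platform's speech settings):

```csharp
using UnityEngine;
using UnityEngine.Windows.Speech;

// Sketch of the capture-to-text step: continuously dictate from the
// default (onboard) mic and hand finished phrases to the caption UI.
public class LiveCaptioner : MonoBehaviour
{
    private DictationRecognizer recognizer;

    void Start()
    {
        recognizer = new DictationRecognizer();

        // A phrase the engine has committed to; send it to the display.
        recognizer.DictationResult += (text, confidence) =>
        {
            Debug.Log($"Caption: {text}");
        };

        // Dictation stops after a silence timeout; restart it so the
        // app keeps listening for the next phrase.
        recognizer.DictationComplete += cause =>
        {
            if (cause != DictationCompletionCause.Complete)
                Debug.LogWarning($"Dictation stopped: {cause}");
            recognizer.Start();
        };

        recognizer.Start();
    }

    void OnDestroy()
    {
        recognizer?.Dispose();
    }
}
```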
These decisions resulted in a lower-fidelity product than first planned, but they also removed the need to incorporate auxiliary devices.
I focused on processing the audio locally and making use of the onboard mic; the main concern was building up the base product before the interface.
The prototyping phase spanned both iterations. During the first iteration I focused on sketches, while the second iteration primarily focused on a low-fidelity concept built in Unity and Visual Studio Code. This process took the majority of the second iteration, nearly three months. The issues encountered included dealing with a deprecated API for the HoloLens 1, an incorrectly functioning HoloLens 1 emulator, computer issues, losing access to the HoloLens 1, and the time needed to learn C#. The issues and delays experienced during this phase significantly affected the remaining phases.
The issues experienced during prototyping restricted the time and resources available for testing, and the restrictions imposed by COVID-19 made testing a wearable headset with participants incredibly difficult, further limiting opportunities. The only remaining option was the HoloLens 1 emulator. Unfortunately, that option was also not viable due to software restrictions on the test subjects' machines: the emulator requires Hyper-V to render virtual environments, which is only available on Windows 10 Pro (a costly upgrade from Windows 10 Home). The lack of screenshots of the application in use restricted testing further. Given these constraints, the best testing I could do was a survey about DHH people's experiences with captions and AR, and their preferences and opinions on the concept.
The final product is a low-fidelity prototype. It listens to audio in the immediate vicinity via the onboard mic and transcribes what is said. The text is printed at the bottom of a text box, with the history accessible by scrolling upward. The prototype works on both the HoloLens and the HoloLens emulator. The application's scripts were written in Visual Studio and then imported into Unity, where the visual assets were created and integrated with the scripts. The finished application was built in Unity and run via Visual Studio. When using the emulator, an auxiliary microphone is needed.
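As an illustration, the scrolling caption box described above could be implemented along these lines in Unity (the class and field names here are hypothetical, not the prototype's actual code):

```csharp
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.UI;

// Hypothetical sketch of the caption box: new phrases are appended at
// the bottom, and the history stays reachable by scrolling upward.
public class CaptionBox : MonoBehaviour
{
    public Text captionText;      // Text element inside the scroll view
    public ScrollRect scrollRect; // Scroll view holding the history

    private readonly List<string> lines = new List<string>();

    public void Append(string phrase)
    {
        lines.Add(phrase);
        captionText.text = string.Join("\n", lines);

        // Rebuild the layout, then snap to the newest line; the user
        // can still scroll up through the history.
        Canvas.ForceUpdateCanvases();
        scrollRect.verticalNormalizedPosition = 0f;
    }
}
```

Wiring the recognizer's result event to a method like Append is what produces the live, bottom-anchored caption stream.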
A mockup of the proposed UI is shown below.
It took seven to eight months to complete both iterations of the project. During this time I learned just how difficult working with a deprecated API can be and how important it is to maintain a project's documentation.
One challenge was taking on all the roles by myself. The project was time consuming from start to finish, and doing it alone was demanding. However, through perseverance I gained several valuable skills: creating AR apps, using Unity, prepping machines for virtualization, programming in C#, creating virtual assets, and adapting to challenging research situations.
If I could do anything differently, I would have opted for the HoloLens 2 and made do with its emulator. Its documentation is far more complete, its hardware is more advanced, and its online communities are more active.