Priyanjali Gupta, an engineering student, has developed an AI model that translates American Sign Language (ASL) into English in real time.
Her mother was her source of inspiration: like many Indian mothers, she asked her daughter to “do something now that she’s studying engineering.” Gupta is a third-year computer science student specializing in data science at the Vellore Institute of Technology, Tamil Nadu.
“She taunted me. But it made me contemplate what I could do with my knowledge and skillset. One fine day, amid conversations with Alexa, the idea of inclusive technology struck me. That triggered a set of plans,” Gupta, from Delhi, told Interesting Engineering (IE).
About a year after her mother’s remark, in February 2022, Gupta built an AI model using the TensorFlow Object Detection API, applying transfer learning to a pre-trained model called ssd_mobilenet. Her idea, which bridges a gap and creates a ripple in inclusive technology, went viral after she posted it on LinkedIn, receiving more than 58,000 reactions and 1,000 positive endorsements.
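Before training, the TensorFlow Object Detection API needs a label map that assigns each sign class an integer id. A minimal sketch of generating one for the six signs in Gupta's dataset (the helper name is illustrative, not from her code):

```python
# Generate label_map.pbtxt text in the format the TensorFlow Object
# Detection API expects: one `item` block per class, ids starting at 1.
SIGNS = ["Hello", "I Love You", "Thank you", "Please", "Yes", "No"]

def make_label_map(labels):
    """Return the text of a label_map.pbtxt for the given class names."""
    blocks = []
    for idx, name in enumerate(labels, start=1):
        blocks.append(f"item {{\n  id: {idx}\n  name: '{name}'\n}}\n")
    return "\n".join(blocks)

print(make_label_map(SIGNS))
```

The generated file is referenced from the detection pipeline config, so the same ids must be used consistently in the annotations.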
“The dataset is made manually by running the Image Collection Python file that collects images from your webcam for all the mentioned below signs in the American Sign Language: Hello, I Love You, Thank you, Please, Yes and No,” says her GitHub post.
Gupta says the video by data scientist Nicholas Renotte on Real-Time Sign Language Detection was the inspiration for her model.
“The dataset is manually made with a computer webcam and given annotations. The model, for now, is trained on single frames. To detect videos, the model has to be trained on multiple frames for which I’m likely to use LSTM. I’m currently researching on it,” Gupta says. In data science, Long Short-Term Memory networks (LSTMs) have long been a standard tool for sequence prediction problems.
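Moving from single frames to video means the model consumes short, fixed-length sequences of frames rather than one image at a time. A minimal sketch of that windowing step, which produces the (num_windows, seq_len, features) shape an LSTM expects; the sequence length and function name are assumptions for illustration, not Gupta's code:

```python
def frame_windows(frames, seq_len=30, step=1):
    """Slice a list of per-frame feature vectors into overlapping
    fixed-length windows suitable for LSTM training."""
    return [frames[i:i + seq_len]
            for i in range(0, len(frames) - seq_len + 1, step)]
```

For example, a 40-frame clip with `seq_len=30` and `step=5` yields windows starting at frames 0, 5, and 10; each window is then paired with the sign label of the clip it came from.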
Developing a deep learning model from scratch for sign detection is not trivial, as Gupta admits. “Making a deep neural network solely for sign detection is rather complex,” she told IE. She responds to one of the comments in the same vein, “I’m just an amateur student but I’m learning. And I believe, sooner or later, our open source community, which is much more experienced than me, will find a solution.”
Even though American Sign Language (ASL) is reportedly the third most widely used language in the United States (behind English and Spanish), translation applications and technologies have not caught up. However, the pandemic-driven surge in video calling, the so-called Zoom Boom, has brought extensive attention to the sign language community. Researchers at Google AI have demonstrated a real-time sign language detection model that can identify people who are signing with up to 91% accuracy.
“Researchers and developers, in my opinion, are working hard to find a workable solution. In any case, I believe that the first step would be to standardize sign languages and other modes of communication with the specially-abled and work on bridging the communication gap,” Gupta says.