Pic2Speech uses artificial intelligence to describe the content of pictures, helping visually impaired people understand the world around them.
The model is built with Keras and is largely based on "Show and Tell: A Neural Image Caption Generator" by Vinyals et al. It was trained on the full MS COCO dataset for around 500k steps.
The model is deployed as an Azure ML service using the Azure ML Python API.
The mobile app is built with Google Flutter and lets people take a picture with their smartphone and get a spoken description of it.
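
A Show and Tell style model produces a caption one word at a time: given the image and the words generated so far, the network scores every possible next word. The sketch below illustrates that decoding loop in plain Python. It is not the project's actual code. The names `greedy_caption`, `next_word_probs`, and `toy_model`, as well as the `<start>` and `<end>` tokens, are hypothetical, and the toy lookup table stands in for the trained Keras network (which would also condition on the image features). It uses greedy decoding, the simplest variant; the paper describes beam search, which keeps several candidate captions instead of one.

```python
from typing import Callable, Dict, List

# Hypothetical special tokens marking the start and end of a caption.
START, END = "<start>", "<end>"


def greedy_caption(
    next_word_probs: Callable[[List[str]], Dict[str, float]],
    max_len: int = 20,
) -> str:
    """Build a caption word by word, always taking the most likely next word."""
    words = [START]
    for _ in range(max_len):
        probs = next_word_probs(words)      # scores for every candidate word
        best = max(probs, key=probs.get)    # greedy choice
        if best == END:
            break
        words.append(best)
    return " ".join(words[1:])              # drop the start token


def toy_model(prefix: List[str]) -> Dict[str, float]:
    """Toy stand-in for the trained network: looks only at the last word."""
    table = {
        "<start>": {"a": 0.9, "dog": 0.1},
        "a": {"dog": 0.8, "<end>": 0.2},
        "dog": {"<end>": 0.7, "a": 0.3},
    }
    return table[prefix[-1]]


print(greedy_caption(toy_model))  # prints "a dog"
```

In a real deployment, `next_word_probs` would wrap a call to the trained model for a specific picture, and the returned string would be passed to a text-to-speech engine on the phone.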