Satwik Murarka
4
An image caption generator combines computer vision and natural language processing (NLP) techniques to automatically generate a textual description of an image, making it useful for tasks such as image indexing, retrieval, and accessibility for visually impaired individuals. Generating accurate and descriptive captions requires an understanding of both the visual content of the image and the language used to describe it. This typically involves building a convolutional neural network (CNN) for image feature extraction and a recurrent neural network (RNN) for language generation. The aim of the project is to introduce mentees to deep learning and then build on that knowledge to implement the caption generator.
Prerequisites: Basic proficiency in Python and the basics of ML
Week | Work |
---|---|
Week 1 | Study deep neural networks and learn PyTorch |
Week 2 | Understand and implement CNNs for a classification task |
Week 3 | Study various sequence models like RNNs |
Week 4 | Buffer week |
Week 5 | Learn about NLP fundamentals and Word Embeddings |
Weeks 6-9 | Study different image captioning architectures and implement the image captioning model |
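
The CNN-plus-RNN design mentioned in the description can be sketched in PyTorch as follows. This is a minimal illustration and not the project's actual model: the class names (`EncoderCNN`, `DecoderRNN`, `CaptionModel`), the layer sizes, the tiny two-layer CNN, and the choice of an LSTM as the RNN variant are all assumptions made for the example. In practice the encoder would more likely be a pretrained network.

```python
import torch
import torch.nn as nn


class EncoderCNN(nn.Module):
    """Hypothetical tiny CNN that maps an image to one feature vector."""

    def __init__(self, embed_size: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global average pool -> (B, 32, 1, 1)
        )
        self.fc = nn.Linear(32, embed_size)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        x = self.features(images).flatten(1)  # (B, 32)
        return self.fc(x)                     # (B, embed_size)


class DecoderRNN(nn.Module):
    """Hypothetical LSTM decoder that predicts the next word at each step."""

    def __init__(self, embed_size: int, hidden_size: int, vocab_size: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, features: torch.Tensor, captions: torch.Tensor) -> torch.Tensor:
        # Teacher forcing: feed the image feature first, then every caption
        # token except the last, so the output has one step per caption token.
        tokens = self.embed(captions[:, :-1])                       # (B, T-1, E)
        inputs = torch.cat([features.unsqueeze(1), tokens], dim=1)  # (B, T, E)
        hidden, _ = self.lstm(inputs)                               # (B, T, H)
        return self.fc(hidden)                                      # (B, T, vocab)


class CaptionModel(nn.Module):
    """Encoder and decoder chained together (assumed structure)."""

    def __init__(self, embed_size: int, hidden_size: int, vocab_size: int):
        super().__init__()
        self.encoder = EncoderCNN(embed_size)
        self.decoder = DecoderRNN(embed_size, hidden_size, vocab_size)

    def forward(self, images: torch.Tensor, captions: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(images), captions)


if __name__ == "__main__":
    # Random stand-in data: 2 RGB images of 64x64 and 2 captions of 12 token ids.
    model = CaptionModel(embed_size=64, hidden_size=128, vocab_size=1000)
    images = torch.randn(2, 3, 64, 64)
    captions = torch.randint(0, 1000, (2, 12))
    logits = model(images, captions)
    print(logits.shape)  # one score per vocabulary word at each caption position
```

During training, the logits would typically be compared against the caption tokens with a cross-entropy loss. At inference time the decoder would instead be run one step at a time, feeding each predicted word back in as the next input.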