30 Jun 2025

IMAGE STORY: ENHANCED COGNITIVE VISUAL NARRATIVE SYSTEM

  • Dept. of Manufacturing Engineering and Industrial Management, COEP Technological University Pune, India.
  • Dept. of Computer Science & IT, COEP Technological University Pune, India.

This paper presents the Enhanced Cognitive Visual Narrative System (ECVNS), a multimodal artificial intelligence framework for automated visual storytelling. The system integrates several state-of-the-art deep learning models: OWLv2 for object detection, BLIP for image captioning and visual question answering, CLIP for emotional analysis, and ViLT for scene understanding. From this visual analysis, the framework generates coherent, contextually relevant narratives in six languages. Our approach combines computer vision techniques with natural language generation in a unified system that understands visual content at multiple semantic levels and translates that understanding into creative storytelling. The system achieves high accuracy in object detection, scene understanding, and emotional inference, producing narratives that show both technical precision and creative quality. This work contributes to the field of multimodal AI and has applications in content creation, accessibility, education, and entertainment.
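To make the described architecture concrete, the following is a minimal sketch of how the outputs of the four analysis stages might be combined into a single prompt for story generation. It is not the authors' implementation: the names `VisualAnalysis`, `analyze_image`, and `build_story_prompt`, the prompt wording, and the stub stages that stand in for OWLv2, BLIP, CLIP, and ViLT are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class VisualAnalysis:
    """Combined output of the four analysis stages (hypothetical structure)."""
    objects: List[str]   # e.g. labels from an object detector such as OWLv2
    caption: str         # e.g. text from a captioning model such as BLIP
    emotion: str         # e.g. top label from CLIP-based emotion scoring
    scene: str           # e.g. answer from a ViLT-style scene query


def analyze_image(
    image_path: str,
    detect_objects: Callable[[str], List[str]],
    caption_image: Callable[[str], str],
    infer_emotion: Callable[[str], str],
    describe_scene: Callable[[str], str],
) -> VisualAnalysis:
    """Run each stage on the same image and collect the results."""
    return VisualAnalysis(
        objects=detect_objects(image_path),
        caption=caption_image(image_path),
        emotion=infer_emotion(image_path),
        scene=describe_scene(image_path),
    )


def build_story_prompt(analysis: VisualAnalysis, language: str) -> str:
    """Turn the structured analysis into a text prompt for a story generator."""
    objects = ", ".join(analysis.objects) if analysis.objects else "none detected"
    return (
        f"Write a short story in {language}.\n"
        f"Scene: {analysis.scene}\n"
        f"Caption: {analysis.caption}\n"
        f"Objects: {objects}\n"
        f"Mood: {analysis.emotion}"
    )


# Stub stages stand in for the real models so the sketch runs without downloads.
demo = analyze_image(
    "photo.jpg",  # hypothetical path; the stubs below never open it
    detect_objects=lambda path: ["dog", "ball"],
    caption_image=lambda path: "a dog chasing a ball in a park",
    infer_emotion=lambda path: "joyful",
    describe_scene=lambda path: "outdoor park",
)
prompt = build_story_prompt(demo, language="English")
print(prompt)
```

Passing the stages in as callables keeps the orchestration independent of any particular model library, so each stub could be replaced by a real model wrapper without changing the rest of the sketch.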


[Ashutosh Kumar Singh, Anish Khobragade and Vikas Kanake (2025); IMAGE STORY: ENHANCED COGNITIVE VISUAL NARRATIVE SYSTEM. Int. J. of Adv. Res. (Jun). 1218-1231] (ISSN 2320-5407). www.journalijar.com


Ashutosh Kumar Singh
Dept. of Manufacturing Engineering and Industrial Management, COEP Technological University Pune, India