Abhinav Jain

Machine Learning Engineer

VideoKen

Biography

Hi, my name is Abhinav Jain. I work as a Machine Learning/Research Engineer. I am broadly interested in multi-modal analytics, where deep-learning-based algorithms analyse text, images, and videos to support reasoning and decision-making.

I worked for two years on the IBM Watson Compare & Comply service for structured data extraction from business documents. I have also been working on smart data preparation for downstream processing in AI-based systems.

Interests

  • Deep Learning
  • Applied Artificial Intelligence
  • Computer Vision
  • Deep Reinforcement Learning

Education

  • BTech in Electrical Engineering, 2017

    Indian Institute of Technology, Kanpur

Experience

Machine Learning Engineer

VideoKen

Nov 2020 – Present Bangalore

Researcher

IBM Research, India

Jul 2019 – Jul 2020 New Delhi

Research Engineer

IBM Research, India

Jul 2017 – Jul 2019 New Delhi

Projects

Deep Metric Learning

Video Representation Learning for Fine-Grained Scene Recognition and Retrieval.

Evolving AI

Model Learning with limited training data.

Scanned PDF-to-HTML Conversion

Extract structured information from unstructured documents.

Text Enrichment

Enrichment of educational texts with supplementary information.

Data Transformation

Problem Statement: Given a small subset of input samples, collect the corresponding output samples from the user and learn a transformation routine that converts the provided data into the user-intended format. Motivation: To provide end users, mostly non-experts, with a programming-by-example (PbE) system that captures the user's intent and applies it to all provided samples with minimal intervention.
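
To make the programming-by-example idea concrete, here is a minimal sketch, not the project's actual method. It assumes a tiny, hand-written library of candidate string transformations (`CANDIDATES`, with hypothetical names such as `swap_comma_parts`) and simply picks the first candidate that reproduces every user-provided input/output pair. A real PbE system would search a much richer program space.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical candidate library; these names are illustrative assumptions,
# not part of the original project.
CANDIDATES: Dict[str, Callable[[str], str]] = {
    "upper": str.upper,
    "lower": str.lower,
    "title": str.title,
    "strip": str.strip,
    # "Last, First" -> "First Last"
    "swap_comma_parts": lambda s: " ".join(
        part.strip() for part in reversed(s.split(","))
    ),
}

Example = Tuple[str, str]  # (input sample, user-provided output)


def learn_transformation(
    examples: List[Example],
) -> Optional[Tuple[str, Callable[[str], str]]]:
    """Return the first candidate consistent with all examples, or None."""
    for name, fn in CANDIDATES.items():
        if all(fn(inp) == out for inp, out in examples):
            return name, fn
    return None


def apply_to_all(samples: List[str], examples: List[Example]) -> List[str]:
    """Learn from the examples, then transform every remaining sample."""
    learned = learn_transformation(examples)
    if learned is None:
        # No candidate fits: the user would need to supply more examples.
        raise ValueError("no candidate transformation matches the examples")
    _, fn = learned
    return [fn(s) for s in samples]


if __name__ == "__main__":
    examples = [("Lovelace, Ada", "Ada Lovelace")]
    print(apply_to_all(["Turing, Alan", "Hopper, Grace"], examples))
```

Here the user supplies a single example, and the learned routine is then applied to the remaining samples without further intervention.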

Visual Cues for Text

Provide visual aid for a sequence of text based instructions.

Recent Publications

Simultaneous Optimisation of Image Quality Improvement and Text Content Extraction from Scanned Documents

In this paper, we propose to combine the OCR performance into the loss function during training of single image super resolution (SISR) networks for document images.

Learning Convolutional Neural Networks with Deep Part Embeddings

In this paper, we propose a novel way of training CNNs with a small subset of training samples using Deep Part Embeddings.

Radial Loss for Learning Fine-grained Video Similarity Metric

In this paper, we propose the Radial Loss, which utilizes category and sub-category labels to learn an order-preserving fine-grained video similarity metric.

Pentuplet Loss for Simultaneous Shots and Critical Points Detection in a Video

In this paper, we propose a novel pentuplet loss to learn the frame image similarity metric through a pentuplet-based deep learning framework.

Content Driven Enrichment of Formal Text using Concept Definitions and Applications

We propose a text enrichment framework that enriches concepts from input text with their definitions, applications, and a prerequisite concept graph that shows the inter-dependencies among the extracted concepts.

Coherent Visual Description of Textual Instructions

In this paper, we present a novel multistage framework to convert textual instructions into coherent visual descriptions (text instructions annotated with images).

Contact