Hi!
    I'm Keshav

    Deep Learning Enthusiast and Researcher

About

  I am a recent Ph.D. graduate from Texas State University. During my Ph.D., I worked primarily with my advisor, Dr. Yan, and collaborated on other projects with Dr. Ngu. My primary research area is motion and activity understanding in 360-degree videos.
  Currently, I am working as a Sr. Software Engineer (ML/Data Scientist) at Tesla in Palo Alto, CA.

RESEARCH

PUBLICATIONS

Learning Omnidirectional Flow in 360-degree Video via Siamese Representation

Optical flow estimation in omnidirectional videos faces two significant issues: the lack of benchmark datasets and the challenge of adapting perspective video-based methods to accommodate the omnidirectional nature. This paper proposes the first perceptually natural-synthetic omnidirectional benchmark dataset with a 360-degree field of view, FLOW360, with 40 different videos and 4,000 video frames. We conduct comprehensive characteristic analysis and comparisons between our dataset and existing optical flow datasets, which manifest perceptual realism, uniqueness, and diversity. To accommodate the omnidirectional nature, we present a novel Siamese representation learning framework for Omnidirectional Flow (SLOF). We train our network in a contrastive manner with a hybrid loss function that combines contrastive loss and optical flow loss. Extensive experiments verify the proposed framework's effectiveness and show up to a 40% performance improvement over state-of-the-art approaches. Our FLOW360 dataset and code are available at this https URL.
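The hybrid objective described above can be sketched as follows. This is a minimal illustration only: the InfoNCE-style contrastive term, the end-point-error flow term, and the weighting factor `lam` are assumptions for exposition, not the paper's exact formulation.

```python
import numpy as np

def epe_loss(pred_flow, gt_flow):
    # pred_flow, gt_flow: (H, W, 2) per-pixel (u, v) displacement fields;
    # average end-point error between prediction and ground truth
    return np.mean(np.linalg.norm(pred_flow - gt_flow, axis=-1))

def contrastive_loss(z_a, z_b, temperature=0.1):
    # z_a, z_b: (N, D) L2-normalized embeddings from the two Siamese
    # branches; row i of z_a is the positive pair of row i of z_b
    logits = z_a @ z_b.T / temperature           # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))          # positives on the diagonal

def hybrid_loss(pred_flow, gt_flow, z_a, z_b, lam=0.1):
    # weighted sum of flow supervision and contrastive regularization
    return epe_loss(pred_flow, gt_flow) + lam * contrastive_loss(z_a, z_b)
```

In practice the embeddings would come from the two branches of the Siamese encoder viewing rotated versions of the same frame pair.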

K. Bhandari, B. Duan, G. Liu, H. Latapie, Z. Zong, and Y. Yan, Learning Omnidirectional Flow in 360-degree Video via Siamese Representation. ECCV 2022

VIEW PAPER

Revisiting Optical Flow Estimation in 360 Videos

360 video analysis has become a significant research topic since the appearance of high-quality, low-cost 360 wearable devices. In this paper, we propose a novel LiteFlowNet360 architecture for optical flow estimation in 360 videos. We design LiteFlowNet360 as a domain adaptation framework from the perspective video domain to the 360 video domain. We adopt simple kernel transformation techniques, inspired by the Kernel Transformer Network (KTN), to cope with the inherent distortion in 360 videos caused by sphere-to-plane projection. First, we apply an incremental transformation of the convolution layers in the feature pyramid network and show that further transformation of the inference and regularization layers is not important, thereby reducing network growth in terms of size and computation cost. Second, we refine the network by training with augmented data in a supervised manner. We perform data augmentation by projecting the images onto a sphere and re-projecting them to a plane. Third, we train LiteFlowNet360 in a self-supervised manner using target-domain 360 videos. Experimental results show promising 360 video optical flow estimation using the proposed novel architecture.
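The sphere-rotation data augmentation described above can be sketched as follows. This is a minimal nearest-neighbor illustration of rotating an equirectangular panorama on the sphere, not the implementation used in the paper:

```python
import numpy as np

def rotate_equirectangular(img, R):
    # R: 3x3 rotation matrix mapping each output viewing direction to the
    # corresponding source direction (nearest-neighbor resampling)
    h, w = img.shape[:2]
    lon = (np.arange(w) + 0.5) / w * 2 * np.pi - np.pi   # longitude per column
    lat = np.pi / 2 - (np.arange(h) + 0.5) / h * np.pi   # latitude per row
    lon, lat = np.meshgrid(lon, lat)
    # spherical angles -> unit vectors on the sphere
    v = np.stack([np.cos(lat) * np.cos(lon),
                  np.cos(lat) * np.sin(lon),
                  np.sin(lat)], axis=-1)
    v = v @ R.T                                          # rotate the sphere
    # unit vectors -> source pixel coordinates
    lon_s = np.arctan2(v[..., 1], v[..., 0])
    lat_s = np.arcsin(np.clip(v[..., 2], -1.0, 1.0))
    col = ((lon_s + np.pi) / (2 * np.pi) * w).astype(int) % w
    row = np.clip(((np.pi / 2 - lat_s) / np.pi * h).astype(int), 0, h - 1)
    return img[row, col]
```

Sampling random rotation matrices and applying the same `R` to both frames (and the flow field) would give one plausible form of the sphere-to-plane augmentation.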

K. Bhandari, Z. Zong, Y. Yan, Revisiting Optical Flow Estimation in 360 Videos. International Conference on Pattern Recognition, ICPR 2020, Italy

VIEW PAPER

EGOK360: A 360 Egocentric Kinetic Human Activity Video Dataset

Recently, there has been a growing interest in wearable sensors, which provides new research perspectives for 360° video analysis. However, the lack of 360° datasets in the literature hinders research in this field. To bridge this gap, in this paper we propose a novel Egocentric (first-person) 360° Kinetic human activity video dataset (EgoK360). The EgoK360 dataset contains annotations of human activity with different sub-actions, e.g., the activity Ping-Pong with four sub-actions: pickup-ball, hit, bounce-ball, and serve. To the best of our knowledge, EgoK360 is the first dataset in the domain of first-person activity recognition with a 360° environmental setup, which will facilitate egocentric 360° video understanding. We provide experimental results and a comprehensive analysis of variants of the two-stream network for 360° egocentric activity recognition. The EgoK360 dataset can be downloaded from https://egok360.github.io/.
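Late fusion in a two-stream network can be sketched as follows. The equal-weight averaging of class probabilities is an illustrative assumption, not necessarily the variant evaluated in the paper:

```python
import numpy as np

def softmax(x):
    # stable softmax over a 1-D logit vector
    e = np.exp(x - np.max(x))
    return e / e.sum()

def two_stream_predict(rgb_logits, flow_logits, alpha=0.5):
    # late fusion: blend the class probabilities of the spatial (RGB)
    # stream and the temporal (optical-flow) stream, then take the argmax
    fused = alpha * softmax(rgb_logits) + (1 - alpha) * softmax(flow_logits)
    return int(np.argmax(fused))
```

Here each stream would be a separate CNN: one fed RGB frames, the other stacked optical flow, with their per-class scores combined only at the end.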

K. Bhandari, M. A. DeLaGarza, Z. Zong, H. Latapie, Y. Yan, EgoK360: A 360 Egocentric Kinetic Human Activity Video Dataset. International Conference on Image Processing, ICIP 2020, UAE

VIEW PAPER

PROJECT

Experimental Summary: Anticipating Micro-Chaos in Human Postural Balance. Insights from Stick Balancing

This report presents a synopsis of an experiment conducted to anticipate micro-chaos in human postural balance using deep learning. We base the foundation of our experiment on [4, 3]. Preliminary results demonstrate that our method achieves nearly perfect accuracy. However, this assessment does not by itself justify the goodness of our model. Since anticipating a failure is highly time-dependent, we redesign accuracy and other metrics, such as precision and recall, to be time-dependent.
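One way such time-dependent metrics could be defined is sketched below. The horizon parameter `tau` and the exact alarm-to-event matching rule are illustrative assumptions, not the report's definitions:

```python
def time_dependent_precision(alarm_times, event_times, tau):
    # an alarm counts as a true positive only if some event follows it
    # within the horizon tau
    if not alarm_times:
        return 0.0
    hits = sum(any(0 <= e - a <= tau for e in event_times) for a in alarm_times)
    return hits / len(alarm_times)

def time_dependent_recall(alarm_times, event_times, tau):
    # an event counts as anticipated if some alarm precedes it within tau
    if not event_times:
        return 0.0
    hits = sum(any(0 <= e - a <= tau for a in alarm_times) for e in event_times)
    return hits / len(event_times)
```

The point of making the horizon explicit is that a prediction arriving long before (or after) the instability carries no practical value, so plain frame-level accuracy overstates performance.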

[WORK IN PROGRESS (Summer 2020 - ): PRELIMINARY REPORT]

VIEW PAPER

Semantic Segmentation for LiTS (Liver Tumor Segmentation)

In recent years, the popularity of semantic segmentation in computer vision has increased massively. Proposed deep learning architectures have their own pros and cons. Some architectures require a huge amount of training data, while others rely on heavy use of data augmentation to address this. These models are, however, trained and experimented on baseline datasets, and there is ample room to discuss their generalizability to rare datasets. In this work, we try to show how naïve architectures in this domain fail and require modification. We also discuss the intuition behind a modification and experiment with our own custom architecture based on the work [7]. The major objective of this work is to diagnose the weak spots in the generalizability of previous work [2, 7] and define a new research goal for future work.
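For context, segmentation quality on LiTS-style benchmarks is commonly reported with the Dice coefficient; a minimal sketch, assuming binary masks:

```python
import numpy as np

def dice_score(pred_mask, gt_mask, eps=1e-7):
    # pred_mask, gt_mask: boolean arrays of the same shape;
    # Dice = 2|A ∩ B| / (|A| + |B|), with eps guarding empty masks
    pred, gt = pred_mask.astype(bool), gt_mask.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    return (2.0 * intersection + eps) / (pred.sum() + gt.sum() + eps)
```

A model can reach high pixel accuracy on LiTS simply by predicting background everywhere, which is why overlap metrics like Dice are the standard way to expose the generalization failures discussed above.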

[NOT PUBLISHED]

VIEW PAPER

PRESENTATION

Coming Soon

Project

Old Project

Deep Neural News Recommender

NLP and Recommender System

github pypi

SERVICES AND AFFILIATION
Education

Texas State University

San Marcos, Texas

2018 - 2022

Tribhuvan University

Kathmandu, Nepal

2012 - 2016

My Specialty

My Skills

Programming

95%

Deep Learning

85%

Computer Vision

90%

Data Science

75%
Things I love to do

Hobbies

Photography

Travelling

Tennis

Cooking

Read

Recent Blog

New blog is coming soon

New Blog Coming Soon

Working on a new blog...

Get in Touch

Contact

k underscore b459 at txstate dot edu

ksvadari at gmail dot com

https://www.linkedin.com/in/keshav-bhandari

Fremont, CA