Research Interests


With the rise of Google's AlphaGo, building algorithms that let machines mimic the human learning process has become one of the hottest topics in technology. The use of neural networks with many deep layers in this area gave rise to a branch of machine learning called deep learning.

Deep learning is an approach to machine learning that has drawn heavily on our knowledge of the human brain, statistics, and applied mathematics as it developed over the past several decades. In recent years, it has seen tremendous growth in popularity and usefulness, due in large part to more powerful computers, larger datasets, and techniques for training deeper networks. The years ahead are full of challenges and opportunities to improve deep learning further and bring it to new frontiers.

To summarize, deep learning tries to extract features that make difficult classification tasks tractable for machines. Our laboratory has several on-going projects applying deep learning algorithms to real-life situations.

Object recognition

Object recognition is one of the fundamental problems in computer vision and has been widely studied for decades. Recent visual recognition systems based on deep architectures, which learn useful features without hand-crafted engineering, have produced outstanding breakthroughs not only in object recognition but also in other fields such as speech recognition, large-scale object classification, object detection, and language modeling. In 2012, there was a breakthrough in the ImageNet Large Scale Visual Recognition Challenge: a deep convolutional neural network surpassed the previous state of the art on the ImageNet 1000-class dataset and won the competition by a margin of more than 10% in accuracy over the second-place entry. This improvement was possible thanks to a large amount of labeled data and a newly proposed regularization method called dropout, which successfully prevented the deep network from overfitting. Since 2012, deep convolutional networks have continued to break records in many tasks in computer vision and other fields of study.
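The dropout regularization mentioned above can be illustrated with a minimal NumPy sketch of "inverted" dropout (the function name and shapes here are illustrative, not from a specific implementation): during training, each unit is zeroed with some probability and the survivors are rescaled so the expected activation stays the same, while at test time the layer is the identity.

```python
import numpy as np

def dropout(activations, drop_prob=0.5, rng=None, train=True):
    """Inverted dropout: randomly zero units during training and
    rescale the survivors so the expected activation is unchanged."""
    if not train or drop_prob == 0.0:
        return activations
    rng = rng or np.random.default_rng(0)
    mask = rng.random(activations.shape) >= drop_prob
    return activations * mask / (1.0 - drop_prob)

x = np.ones((4, 8))
y = dropout(x, drop_prob=0.5)          # survivors are scaled to 2.0
assert np.allclose(dropout(x, train=False), x)   # identity at test time
```

Randomly dropping units prevents co-adaptation of features, which is why it helped the 2012 ImageNet network avoid overfitting despite its size.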

Feature extraction, CBLL Tech Report


Human activity recognition

How do humans recognize and imitate others' motor actions? Explaining the internal mechanism of the human action learning process and building an intelligent robot that mimics this ability have been important issues in cognitive developmental robotics research. Hence, there have been computational studies using neural network models to recognize and imitate human motor behaviors.

Deep neural networks (DNNs) built from multiple convolutional layers, i.e., convolutional neural networks (CNNs), have recently succeeded in hierarchically extracting the features most relevant for recognizing visual objects. Moreover, CNNs learn position-invariant features on their own from a dataset through shared connection weights. They have also been used to extract spatiotemporal features from time-series data, for example in audio classification.
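The weight-sharing property above can be made concrete with a small NumPy sketch (the loop-based convolution is for illustration only): because the same kernel weights are applied at every spatial position, shifting the input shifts the feature map by the same amount, which is the basis of the position-invariant behavior.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """2-D 'valid' convolution with a single shared kernel:
    the same weights are applied at every spatial position."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r+kh, c:c+kw] * kernel)
    return out

# A bright spot in the image, and the same spot shifted by 2 pixels.
img = np.zeros((8, 8)); img[2, 2] = 1.0
shifted = np.roll(img, 2, axis=1)
k = np.ones((3, 3))
# Shifting the input shifts the feature map by the same amount:
assert np.allclose(np.roll(conv2d_valid(img, k), 2, axis=1),
                   conv2d_valid(shifted, k))
```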

In engineering, recognizing human actions is a significant task, as it can be applied to areas such as video surveillance, content-based video search, and human-robot interaction. Microsoft Kinect is a widely used low-cost 3D camera that provides 3D human pose data along with RGB and depth video streams. In this project, both the RGB video stream and the pose data are utilized to improve human action recognition based on deep neural networks (DNNs).

Human face verification

It would be very useful if robots could identify users by their faces as humans do. A face verification system can be used widely in tasks such as user verification for semi-personal devices, crime investigation, and more.

Convolutional neural networks are the most promising technique for extracting features from images in classification and detection tasks. Classification is used to distinguish the target person from others; detection is used to find faces within the entire image. The most important issue in face verification is measuring the similarity between input images and enrolled images. The key idea is to enlarge inter-personal variation while reducing intra-personal variation.
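A minimal sketch of the verification step, assuming face images have already been mapped to embedding vectors by some feature extractor (the embeddings and threshold below are illustrative values, not from a trained model): two faces are accepted as the same identity when their embeddings are close enough under cosine similarity.

```python
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(embedding_a, embedding_b, threshold=0.8):
    """Accept the pair as the same identity when the embeddings
    are close enough in cosine similarity."""
    return cosine_similarity(embedding_a, embedding_b) >= threshold

same_person = (np.array([1.0, 0.9, 0.1]), np.array([0.9, 1.0, 0.2]))
different   = (np.array([1.0, 0.9, 0.1]), np.array([-0.2, 0.1, 1.0]))
assert verify(*same_person)
assert not verify(*different)
```

Training the embedding so that same-person pairs pass and different-person pairs fail this test is exactly the "enlarge inter-personal variation, reduce intra-personal variation" objective described above.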

Visual Tracking

Visual object tracking is one of the classical problems of computer vision, and it has become a core issue with wide-ranging applications including self-driving cars and robot-vision interaction.
Deep neural networks perform strongly on visual recognition and classification tasks and can extract highly discriminative features from an image. Recently, many approaches have studied visual tracking in two ways using these characteristics. First, the tracking problem can be treated as classifying each video frame by learning over the whole dataset. Second, a deep neural network can be used as a feature generator, with a separate classifier such as a Support Vector Machine (SVM) operating on its features. In the second approach, the features can be used to learn discriminative target appearance models, for example with an online SVM.
Our adaptive visual tracking framework combines deep neural network features with various tracking methods. The neural network used as a feature generator is trained offline and then operated in an online process, where it supplies learning data to the online learning algorithm and provides diverse features.
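The online-SVM idea in the second approach can be sketched as follows, assuming precomputed feature vectors stand in for the deep network's output (the toy Gaussian "features", learning rate, and regularization value are all illustrative): each incoming sample triggers one stochastic hinge-loss update, so the target appearance model adapts frame by frame.

```python
import numpy as np

def sgd_svm_update(w, b, x, y, lr=0.1, lam=0.01):
    """One online linear-SVM (hinge loss) update.
    x: feature vector from the offline-trained network;
    y: +1 for the target, -1 for background."""
    margin = y * (np.dot(w, x) + b)
    w *= (1.0 - lr * lam)            # weight decay (regularization)
    if margin < 1.0:                 # hinge loss is active
        w += lr * y * x
        b += lr * y
    return w, b

rng = np.random.default_rng(0)
# Toy "deep features": target patches cluster around +1, background around -1.
targets = rng.normal(+1.0, 0.3, size=(50, 16))
backgrounds = rng.normal(-1.0, 0.3, size=(50, 16))
w, b = np.zeros(16), 0.0
for _ in range(5):
    for x in targets:
        w, b = sgd_svm_update(w, b, x, +1)
    for x in backgrounds:
        w, b = sgd_svm_update(w, b, x, -1)
assert all(np.dot(w, x) + b > 0 for x in targets)
assert all(np.dot(w, x) + b < 0 for x in backgrounds)
```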

Caption generation

Automatic caption generation for images is one of the most important problems in artificial intelligence. Caption generation connects image understanding with natural language processing, and progress here improves our understanding of artificial intelligence. Image captioning can also be utilized in many applications, such as helping visually impaired people and searching images on the web. Automatic image captioning is among the most challenging problems because it involves not only understanding the important information in an image through feature extraction, but also generating natural language that summarizes the image's key content. Many successful research results in image caption generation have been published since the 2014 MS COCO Image Captioning Challenge. Most previous studies have used deep learning algorithms that were successful elsewhere in artificial intelligence, and these approaches have significantly improved image captioning results.

In deep learning, convolutional neural networks (CNNs) are the representative model for image feature extraction, and recurrent neural networks (RNNs) are widely used for processing time-series data such as speech or text. Through this research, we hope to better understand the relationship between images and natural language.
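How the two models are combined can be sketched minimally in NumPy (everything here is a toy: the vocabulary, dimensions, and random weights stand in for a trained model): a CNN image feature initializes the RNN hidden state, and the RNN then generates words one at a time, feeding each predicted word back in.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["<start>", "a", "dog", "runs", "<end>"]
V, H, F = len(vocab), 8, 6      # vocab, hidden, image-feature sizes

# Randomly initialized weights stand in for a trained model.
W_img = rng.normal(0, 0.1, (H, F))   # image feature -> initial hidden state
W_xh  = rng.normal(0, 0.1, (H, V))   # previous word  -> hidden
W_hh  = rng.normal(0, 0.1, (H, H))   # hidden recurrence
W_hy  = rng.normal(0, 0.1, (V, H))   # hidden -> next-word scores

def generate(image_feature, max_len=5):
    """Greedy decoding: the CNN feature initializes the RNN state,
    then each step feeds back the most likely word."""
    h = np.tanh(W_img @ image_feature)
    word, caption = 0, []             # index 0 is the <start> token
    for _ in range(max_len):
        x = np.eye(V)[word]                   # one-hot previous word
        h = np.tanh(W_xh @ x + W_hh @ h)      # recurrent update
        word = int(np.argmax(W_hy @ h))       # greedy next word
        if vocab[word] == "<end>":
            break
        caption.append(vocab[word])
    return caption

caption = generate(rng.normal(0, 1, F))
assert len(caption) <= 5 and all(w in vocab for w in caption)
```

With trained weights (and typically an LSTM or GRU instead of this vanilla recurrence), the same loop produces fluent captions conditioned on the image.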


Brain Decoding

Can we decode a human's brain activity to know what sensory, cognitive, or motor information is represented in the brain? Over the past decade, fMRI studies have developed increasingly sensitive techniques for analyzing the information represented in BOLD fMRI signals. The most popular technique for decoding information about a person's perceptual states is linear classification using spatially distributed patterns of activity across an array of voxels.
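A minimal sketch of such a linear decoder, using simulated voxel data (the two-condition setup, noise levels, and projection-onto-mean-difference classifier are illustrative assumptions, not a specific published pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)
n_voxels = 100
# Simulated BOLD patterns for two perceptual states: each state evokes
# a characteristic spatial pattern across voxels, plus trial noise.
pattern_a = rng.normal(0, 1, n_voxels)
pattern_b = rng.normal(0, 1, n_voxels)
trials_a = pattern_a + rng.normal(0, 0.5, (40, n_voxels))
trials_b = pattern_b + rng.normal(0, 0.5, (40, n_voxels))

# Linear decoder: project onto the difference of the class means.
w = trials_a.mean(axis=0) - trials_b.mean(axis=0)
threshold = 0.5 * (trials_a.mean(axis=0) + trials_b.mean(axis=0)) @ w

def decode(voxel_pattern):
    return "A" if voxel_pattern @ w > threshold else "B"

test_a = pattern_a + rng.normal(0, 0.5, n_voxels)
test_b = pattern_b + rng.normal(0, 0.5, n_voxels)
assert decode(test_a) == "A" and decode(test_b) == "B"
```

The point is that although no single voxel discriminates reliably, the spatially distributed pattern across many voxels does.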

Classification techniques are limited, however, because they can only distinguish among a handful of pre-specified states. Recent fMRI studies have advanced beyond classification by using encoding models, which represent the information carried by the activity of single voxels. The fitted encoding models provide quantitative predictions of brain activity for novel mental states that were not used to fit the models. We aim to decode brain activity measured by fMRI to understand how the brain represents the world.
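The encoding-model idea can be sketched with simulated data (a linear feature-to-voxel mapping fit by ordinary least squares; the dimensions and noise level are illustrative): once fitted per voxel, the model predicts responses to stimuli it never saw.

```python
import numpy as np

rng = np.random.default_rng(1)
n_features, n_voxels, n_train = 5, 20, 200
# Ground-truth linear mapping from stimulus features to voxel responses.
true_weights = rng.normal(0, 1, (n_features, n_voxels))

X_train = rng.normal(0, 1, (n_train, n_features))       # stimulus features
Y_train = X_train @ true_weights + rng.normal(0, 0.1, (n_train, n_voxels))

# Fit one linear encoding model per voxel (ordinary least squares).
W_hat, *_ = np.linalg.lstsq(X_train, Y_train, rcond=None)

# The fitted model predicts activity for a novel stimulus not used in fitting.
x_new = rng.normal(0, 1, n_features)
prediction = x_new @ W_hat
assert np.allclose(prediction, x_new @ true_weights, atol=0.5)
```

Because the model is defined over stimulus features rather than over a fixed set of classes, it generalizes to arbitrary novel stimuli, which is what lifts the handful-of-states limitation of classifiers.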

Decoding using the linear classifier

Encoding and decoding model

Human Emotion Decoding using Eye Tracking and fMRI

We propose a human emotion and intention tagging system based on a combined functional Magnetic Resonance Imaging (fMRI) and eye-tracking study. While subjects view affective pictures, several eye-behavior measurements are recorded in the MRI scanner. Pupil dilation and constriction responses, gaze trajectory, fixation time, and brain regions and activity patterns including the amygdala are all considered as features to determine human emotion and intention. The final goal is to develop an emotion and intention color-coding system based on a wearable device including eye-tracking glasses; the expected results are shown in the figure below.
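As a sketch of how such combined features could feed an emotion tagger, here is a hypothetical nearest-centroid classifier (the feature set, centroid values, and class labels are all invented for illustration and are not results from the study):

```python
import numpy as np

# Hypothetical per-trial features: [pupil dilation change,
# mean fixation time (s), amygdala activation (z)].
# Centroid values below are illustrative, not measured data.
centroids = {
    "positive": np.array([0.3, 0.8, 0.5]),
    "negative": np.array([0.6, 0.4, 1.2]),
    "neutral":  np.array([0.1, 0.6, 0.1]),
}

def tag_emotion(feature_vector):
    """Nearest-centroid tagging over the combined eye/fMRI features."""
    return min(centroids,
               key=lambda k: np.linalg.norm(feature_vector - centroids[k]))

assert tag_emotion(np.array([0.55, 0.45, 1.1])) == "negative"
```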

Brain connectivity analysis in neurological disorders

Structural and functional systems of the human brain have the characteristics of complex networks. Numerous brain functions are associated with specific brain regions but are also often generated by interactions among multiple regions. Even when brain structure is damaged by a focal neurological disease, the damage influences the functions of distant brain regions, so the physiological effects of neurological disorders are better assessed over an entire network rather than at the local site of damage. In recent neurological research, brain connectivity analysis has been widely used to emphasize the fundamental role of distributed neural networks and to investigate changes in network structure and functional recovery.

Brain connectivity analysis begins by constructing brain networks from fMRI and DTI data. A network is defined by a set of nodes and the edges between pairs of nodes. Regions of interest (ROIs) are predefined as the nodes of the network, and the edges are obtained by calculating statistical dependencies and tracts between regions. Recently, graph-theoretical analysis has been applied to characterize brain networks, and the global and local properties of the networks are investigated using graph-theoretical measures. This approach is a powerful tool for better understanding neurological disorders and the reorganization that occurs during recovery. We are investigating network reorganization and searching for indicators of motor function during recovery through brain connectivity analysis.
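The construction pipeline described above can be sketched in a few lines of NumPy (the simulated ROI time series, the correlation threshold of 0.5, and node degree as the example measure are all illustrative choices): correlate ROI time series, threshold the correlations into edges, then compute graph measures on the resulting network.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rois, n_timepoints = 6, 150
# Simulated ROI time series; ROIs 0 and 1 share a common driving signal.
common = rng.normal(0, 1, n_timepoints)
series = rng.normal(0, 1, (n_rois, n_timepoints))
series[0] += 2 * common
series[1] += 2 * common

# Edges: thresholded correlations between ROI time series.
corr = np.corrcoef(series)
adjacency = (np.abs(corr) > 0.5) & ~np.eye(n_rois, dtype=bool)

# A simple graph-theoretical measure: node degree.
degree = adjacency.sum(axis=1)
assert adjacency[0, 1]               # the coupled ROIs are connected
assert degree[0] >= 1
```

In practice the nodes come from an anatomical parcellation, DTI tractography supplies structural edges, and richer measures (clustering coefficient, path length, modularity) replace plain degree, but the node/edge/measure structure is the same.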

Brain Reverse Engineering & Imaging Lab (IT Convergence Center N1 #521)
KAIST, 291 Daehak-ro(373-1 Guseong-dong), Yuseong-gu, Daejeon 305-701, Republic of Korea
Tel: +82-42-350-8172~4, Fax: +82-42-350-8170