PhD position - Unsupervised learning of representations for video analysis

SL-DRT-17-1104

RESEARCH FIELD

Computer science and software

ABSTRACT

Videos are increasingly present in our everyday lives. Tools that help us organize them and search for useful information in large video data sets are becoming critical. Specifically, we are interested in activity and action recognition in video clips: for each video clip, predict the class of activity or action. Besides recognition, we are also interested in activity detection. In this case a video clip may contain several activities (or actions), and the system has to localize them in time and space. Finally, we are interested in dense description of events in videos: instead of only giving the class of an event, we want to generate a high-level textual description of the events detected in the video. Common use cases of these tasks are surveillance, assistance and human-machine interaction.

Videos are difficult to analyze because of their structure. Object motion, scene dynamics and action duration make the video signal highly structured, with different levels of information mixed into one signal. To build algorithms that are robust to these variations, one needs to extract representations that efficiently encode the most important features required by the analysis task and that take the spatio-temporal structure of the video into account. Most successful video analysis algorithms use a supervised approach: large training data sets that map the video signal to the desired task output are needed to make them work. Even though a large amount of video is available, the number of annotated videos is not enough to bring current supervised approaches to the next level of performance.

In this work we propose to learn video representations with an unsupervised approach, in order to exploit all the available data without being limited by the annotation process. The main contribution will be the exploration of deep generative models, such as the Generative Adversarial Network (GAN) or the Variational Auto-Encoder (VAE), for video representation learning. The second contribution will focus on the use of these representations to solve activity recognition and dense description jointly, using a multi-task approach.
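To give a rough idea of the kind of model the thesis will explore, the sketch below shows a minimal variational auto-encoder over short video clips. It is purely illustrative and not the project's architecture: it assumes PyTorch, and the VideoVAE class, the vae_loss helper and all layer sizes are assumptions chosen for brevity.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class VideoVAE(nn.Module):
        """Illustrative VAE over clips shaped (batch, channels, time, height, width)."""
        def __init__(self, latent_dim=128):
            super().__init__()
            # 3D convolutions capture the spatio-temporal structure of the clip.
            self.encoder = nn.Sequential(
                nn.Conv3d(3, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),
                nn.Conv3d(32, 64, kernel_size=4, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            )
            self.fc_mu = nn.Linear(64, latent_dim)
            self.fc_logvar = nn.Linear(64, latent_dim)
            # Decoder reconstructs a fixed-size clip (8 frames of 32x32 here).
            self.fc_dec = nn.Linear(latent_dim, 64 * 2 * 8 * 8)
            self.decoder = nn.Sequential(
                nn.ConvTranspose3d(64, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose3d(32, 3, kernel_size=4, stride=2, padding=1), nn.Sigmoid(),
            )

        def forward(self, x):
            h = self.encoder(x)
            mu, logvar = self.fc_mu(h), self.fc_logvar(h)
            # Reparameterisation trick: sample z while keeping gradients.
            z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
            recon = self.decoder(self.fc_dec(z).view(-1, 64, 2, 8, 8))
            return recon, mu, logvar

    def vae_loss(recon, x, mu, logvar):
        # Reconstruction term plus KL divergence to a standard normal prior.
        rec = F.mse_loss(recon, x, reduction="sum")
        kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return rec + kld

    # Usage on a dummy batch of 8-frame, 32x32 RGB clips (the only size this
    # toy decoder reconstructs exactly).
    clips = torch.rand(4, 3, 8, 32, 32)
    recon, mu, logvar = VideoVAE()(clips)
    loss = vae_loss(recon, clips, mu, logvar)

In such an approach, the encoder output (mu) would serve as the learned video representation and could later be fed to downstream activity-recognition or description heads in a multi-task setting.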

LOCATION

Département Intelligence Ambiante et Systèmes Interactifs (LIST)

Vision & Ingénierie des Contenus (SAC)

Saclay

CONTACT PERSON

RABARISOA Jaonary

CEA

DRT/DIASI/LVIC

CEA Saclay - Nano-INNOV, Bât. 861 - PC 173 - F-91191 Gif-sur-Yvette Cedex, France

Phone number: 00169080129

Email: jaonary.rabarisoa@cea.fr

UNIVERSITY / GRADUATE SCHOOL

Normandie Université

Mathématiques - Information - Ingénierie des Systèmes (MIIS)

START DATE

1 October 2017

THESIS SUPERVISOR

CANU Stéphane

INSA ROUEN

Département ASI/LITIS (Laboratoire d'Informatique, du Traitement de l'Information et des Systèmes)

LITIS - INSA Rouen - Dép. ASI, Avenue de l'Université - BP 8, 76801 Saint-Étienne-du-Rouvray Cedex

Phone number: +33 2 32 95 98 44

Email: stephane.canu@insa-rouen.fr

« The age limit is 26 years for PhD offers and 30 years for post-doc offers. »


MORE ABOUT THE EMPLOYER

Company: CEA Tech

Application deadline: 2017-10-01

