Oct 30 @ Trento – Invited talks

Is identifying people using their voice a good idea?

Simon King, University of Edinburgh

Machines can now identify individuals from their speech more reliably
than human listeners. That makes voice authentication systems a very
attractive and customer-friendly option: “your voice is your
password”. But, at exactly the same time as this increasing deployment
of voice authentication, advances in computer-generated speech have made
it rather easy to attack these systems. In fact, in the earliest
experiments, computer-generated speech was identified by voice
authentication systems even more accurately than human speech!
Countermeasures against some attacks are now available, but it’s not
yet clear who will win this arms race: the good guys or the bad guys?
Would you use your voice as your password?


Simon King is Professor of Speech Processing in the department of Linguistics at UEDIN, and director of the Centre for Speech Technology Research (CSTR). He has research interests in speech synthesis, speech recognition and signal processing, with around 200 refereed publications in these areas. A common theme in his research is the application of linguistic expertise to technological applications. He co-authored the Festival speech synthesis toolkit. He is a Fellow of the IEEE and has served on the IEEE Spoken Language Technical Committee, is an associate editor of Computer Speech and Language and a former associate editor of IEEE Transactions on Audio Speech and Signal Processing, former secretary and treasurer of the ISCA Speech Synthesis Special Interest Group, and current coordinator of the annual Blizzard speech synthesis evaluation programme. He has extensive project management experience, including as Director of the FP7 STREPs EMIME and Simple4All (both rated ‘Excellent’ at final review) and many UK projects.

Digital Humans - How they are created and what defines their level of realism

Uroš Sikimić, 3Lateral

Non-verbal communication conveyed via facial gesticulation presents a universal language across the whole human population. While each face is unique, it carries an enormous amount of information that our brains process and understand. Throughout the talk we will discuss how non-verbal communication and convincing facial gesticulation are captured in the 3D and 4D scanning process, and then processed in order to produce digital replicas of actual people – their facial biokinetic models. We will introduce the different components of this process that define the fidelity and level of realism digital humans convey.

Uroš Sikimić, PhD, leads business development activities and special projects at 3Lateral. He has Bachelor's and Master's degrees in Electrical and Computer Engineering from the Faculty of Technical Sciences, University of Novi Sad, and a joint PhD degree in Innovation and Technology Management awarded by three technical universities in Italy (Politecnico di Milano, Politecnico di Torino and Politecnico di Bari). He was a Visiting Researcher at Stockholm School of Economics (Sweden) and New York University (USA), where he conducted research in the field of Big Data analysis, exploring markets for technologies, patent trade and the licensing activities of companies. Uroš Sikimić completed his post-doctoral studies at Singapore Management University (Singapore), where he was responsible for studying, consulting and supporting technology start-ups. His expertise lies at the intersection of technology and business, and he has a proven track record of helping companies grow by maximizing the potential embedded in their technologies.


Nov 9 @ Trento – Guest lecture

Steganography and Steganalysis

Rainer Böhme, University of Innsbruck

Steganography is the art and science of hiding information in inconspicuous cover data such that the mere existence of a secret message is kept confidential. Steganalysis refers to the task of detecting secret messages in covers. This lecture introduces security notions and design principles of steganographic systems. It opens with a discussion of the relation between subfields of multimedia security. The lecture then presents several iterations of the cat-and-mouse game between defenders and attackers for the case of (grayscale) images as covers. Special emphasis is placed on the problem of selecting embedding positions with the help of coding theory and game theory. The lecture will close with an outlook pointing to recent instances of steganography problems in general systems security.


Rainer Böhme is Professor for Security and Privacy at the Department of Computer Science, University of Innsbruck, Austria. Prior to that he was Assistant Professor of Information Systems and IT Security at the University of Münster in Germany and Postdoctoral Fellow at the International Computer Science Institute in Berkeley, California. His research interests include multimedia security, digital forensics, privacy-enhancing technologies, as well as economics of information security and privacy and virtual currencies. He holds a Master’s degree in Communication Science and Economics and a Doctorate in Computer Science, both from TU Dresden in Germany. Rainer Böhme served this community as Program Co-chair of the Information Hiding Conference 2012, General Co-chair of the ACM Workshop on Information Hiding & Multimedia Security 2018, and Associate Editor of IEEE T-IFS since 2017.


April 3 @ Innsbruck – Invited talks

Face Perception by the Human Visual System

Valérie Goffaux, Université Catholique de Louvain

Date: April 3, 2019.

Time: 11:00.

Location: Seminarraum 1 (ICT building, ground floor).

Visual perception results from a complex chain of processes, starting with the spatial frequency- and orientation-selective encoding of luminance in the primary visual cortex (V1). More anterior high-level regions have increasingly large receptive fields and specialize for increasingly complex shape properties. Some of these high-level regions specialize for given visual categories such as faces or natural scenes. We know very little about how such high-level specialization builds upon primary encoding stages in V1. Our work combines multiple investigation techniques: psychophysics, electrophysiology (scalp EEG, ERP and steady-state), as well as neuroimaging (of V1 and high-level visual regions). It suggests that the specialization of face processing, though emerging at high-level stages of visual processing, is rooted in the processing of selective ranges of the primary orientation and spatial frequency (SF) information.

Valérie Goffaux has held a permanent, full-time research position at UCL since October 2013. She leads an independent research lab (Goffauxlab) investigating human face perception using a variety of state-of-the-art neuroscientific methods (psychophysics, electroencephalography, and functional magnetic resonance imaging). Her research is influential in the field of face perception and beyond (e.g., vision research, psychology, and neuroscience). By means of psychophysics, neuroimaging and electrophysiological methods, Goffauxlab addresses two main questions: (1) how is low-level input transformed as it enters high-level stages of visual processing, and when and where in the brain is it encoded? and (2) how does the brain adapt its sampling and encoding of low-level visual information while navigating in the natural environment?


Face Tracking and its Applications

Justus Thies, Technische Universität München

Date: April 3, 2019.

Time: 13:00.

Location: Seminarraum 1 (ICT building, ground floor).

In recent years, there has been a lot of progress in the field of human digitization and, especially, in reconstructing and tracking faces. Even smartphones can reliably track a user's face, enabling a range of new applications.
This talk is about facial performance capture using commodity hardware (such as a standard webcam). It not only allows us to capture new performances but also to analyze existing videos of a person, especially videos downloaded from the internet. At the core of the presented methods, a 3D model is used to represent the face. Based on this 3D reconstruction, we are able to perform facial reenactment, i.e., transferring the expressions from one video to another video. There are many use cases for this system, ranging from video dubbing to teleconferencing in VR. On the other hand, it also enables the manipulation of video content. Therefore, we are working on techniques to identify such manipulations, which will also be covered in this talk.

Justus Thies is a postdoctoral researcher at the Technical University of Munich. In September 2017 he joined the Visual Computing Lab of Prof. Dr. Matthias Nießner. Previously, he was a PhD student at the University of Erlangen-Nuremberg under the supervision of Günther Greiner (2014-2017). During his time as a PhD student he collaborated with other institutes and did internships at Stanford University and the Max-Planck Institut Informatik. His research focuses on 3D scanning and motion capture using commodity hardware. Thus, he is interested in Computer Vision and Computer Graphics, as well as in efficient implementations of optimization techniques, especially on graphics hardware.



TBD @ Innsbruck – Guest lecture

Discrimination of real and artificial human faces in digital media

Giulia Boato, University of Trento

Modern computer graphics technologies have brought realism to computer-generated characters, giving them a truly natural appearance. Besides traditional virtual reality applications such as avatars, games, or cinema, these synthetic characters may be used to generate realistic fakes, which may lead to improper use of the technology. This raises the demand for advanced tools able to discriminate between real and artificial human faces in digital media. In this seminar, the state of the art in discriminating between computer-generated and natural faces will be reviewed, along with recent approaches that propose solutions based on advanced computer vision technologies allowing the extraction of physiological measurements from video sequences. The analysis of such physiological signals and/or facial dynamic information can also be exploited to tackle the closely related problem of face anti-spoofing.

Giulia Boato is Associate Professor at the University of Trento. She holds an M.Sc. in Mathematics (2002) and a Ph.D. in Information and Communication Technologies (2005). Her research interests are focused on image and signal processing, with particular attention to multimedia data protection, data hiding and multimedia forensics, but also intelligent multidimensional data management and analysis. Over the last five years, her research has focused in particular on the discrimination of CG and real human faces and the development of effective detection of image/audio/video modifications. She was chair of the International Workshop Living Web: making diversity a true asset (2009) and of the workshop on Event-based Media Integration (2013). She was Technical Program Chair of the ACM Information Hiding and Multimedia Security Workshop 2018. She has participated in many projects (e.g., among EC projects, SAFESPOT, Living Knowledge, GLOCAL, Eternals, 3D-ConTourNet).