1. Joaquin Vanschoren, Eindhoven University of Technology

Joaquin Vanschoren is an assistant professor of machine learning at TU/e. His research focuses on machine learning systems that learn how to learn, all by themselves. Specifically, he works on meta-learning and automated machine learning (AutoML). He hopes that one day, computers will learn automatically and continuously from one task to the next, with minimal supervision and maximal curiosity, much like humans do. He is the founder of OpenML.org, an online platform for sharing data, code, models, and experiments.

Title: Learning to reinforcement learn for Neural Architecture Search
Reinforcement learning (RL) is a goal-oriented learning approach that has proven successful for Neural Architecture Search (NAS) on the CIFAR and ImageNet datasets. However, a limitation of this approach is its high computational cost, which makes it infeasible to rerun on other datasets. Through meta-learning, we could bring this cost down by adapting previously learned policies instead of learning them from scratch. In this talk, I will discuss a deep meta-RL algorithm that learns an adaptive policy over a set of environments, making it possible to transfer that policy to previously unseen tasks. I will also discuss how the agent effectively adapts its strategy during training to design better architectures than those designed by a standard RL algorithm, and can design good architectures when evaluated on previously unseen environments.
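The abstract describes the method only at a high level; as a loose illustration of the core idea (adapting a previously learned policy rather than training from scratch), here is a minimal first-order meta-RL sketch (Reptile-style outer loop around REINFORCE) over toy NAS-like tasks. The environment, reward, and all names are invented stand-ins, not the algorithm from the talk.

```python
import numpy as np

# Toy stand-in for a NAS environment: an "architecture" is a sequence of
# discrete choices, and the reward mimics validation accuracy by counting
# how many choices match a hidden task-specific optimum.
class ToyNASEnv:
    SHARED = np.array([1, 3, 0])  # design choices that are good for every task

    def __init__(self, n_choices=4, seq_len=6, rng=None):
        self.rng = rng or np.random.default_rng()
        task_part = self.rng.integers(0, n_choices, size=seq_len - len(self.SHARED))
        self.optimum = np.concatenate([self.SHARED, task_part])

    def reward(self, actions):
        return float(np.mean(actions == self.optimum))

def sample_actions(logits, rng):
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    actions = np.array([rng.choice(logits.shape[1], p=p) for p in probs])
    return actions, probs

def inner_adapt(logits, env, rng, steps=20, lr=2.0):
    # REINFORCE on a single task, starting from the meta-learned policy.
    logits = logits.copy()
    baseline = 0.0
    for _ in range(steps):
        actions, probs = sample_actions(logits, rng)
        r = env.reward(actions)
        baseline = 0.9 * baseline + 0.1 * r
        grad = -probs
        grad[np.arange(len(actions)), actions] += 1.0  # d log pi / d logits
        logits += lr * (r - baseline) * grad
    return logits

# Outer (meta) loop, Reptile-style: move the initialization toward the
# adapted policies so that a few inner steps suffice on a new task.
rng = np.random.default_rng(0)
meta_logits = np.zeros((6, 4))
for _ in range(200):
    adapted = inner_adapt(meta_logits, ToyNASEnv(rng=rng), rng)
    meta_logits += 0.1 * (adapted - meta_logits)

# Transfer: adapt briefly on a previously unseen task.
test_env = ToyNASEnv(rng=rng)
final = inner_adapt(meta_logits, test_env, rng, steps=10)
print("reward on unseen task:", test_env.reward(sample_actions(final, rng)[0]))
```

Because the tasks share some structure (the fixed choices), the meta-learned initialization already encodes the common part and only the task-specific choices must be learned at transfer time.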

2. Timothy Hospedales, University of Edinburgh

Timothy Hospedales is a Professor within IPAB in the School of Informatics at the University of Edinburgh, where he heads the Machine Intelligence Research group. He is also the Principal Scientist at Samsung AI Research Centre, Cambridge and a Turing Fellow of the Alan Turing Institute. His research focuses on data-efficient and robust machine learning using techniques such as meta-learning and lifelong transfer-learning, in both probabilistic and deep learning contexts. He works in a variety of application areas including computer vision, vision and language, reinforcement learning for robot control, finance and beyond.

Title: Meta-Learning: Representations and Objectives
In this talk I will first give an overview perspective and taxonomy of major work in the field, motivated by our recent survey paper on meta-learning in neural networks. I hope that this will be informative for newcomers, as well as reveal some interesting connections and contrasts that will be thought-provoking for experts. I will then give a brief overview of recent meta-learning work from my group, covering several broad issues in computer vision where meta-learning can be applied, including dealing with domain shift, data augmentation, learning with label noise, and accelerating RL. Finally, I will describe some recent results that may settle the recent debate in few-shot learning about whether or not meta-learning is made redundant by strong feature-extractor training.

3. Ramesh Raskar, MIT Media Lab

Ramesh Raskar is an Associate Professor at the MIT Media Lab, where he directs the Camera Culture research group. His focus is on AI and imaging for health and sustainability, spanning research in physical (e.g., sensors, health-tech), digital (e.g., automated and privacy-aware machine learning), and global (e.g., geomaps, autonomous mobility) domains. He received the Lemelson Award (2016), the ACM SIGGRAPH Achievement Award (2017), the DARPA Young Faculty Award (2009), an Alfred P. Sloan Research Fellowship (2009), the TR100 Award from MIT Technology Review (2004), and the Global Indus Technovator Award (2003). He has worked on special research projects at Google [X] and Facebook and has co-founded or advised several companies.

Title: AutoML and Crowdsourced Epidemiology for COVID-19
The pandemic response is in disarray. Orchestrating it requires computational surveillance of the social graph, as innocent people infect other innocent people. Can we convert every smartphone into a node for sensing and for nudging the user? There are three subproblems: (i) automating machine learning to harness data on the phones, (ii) privacy-preserving distributed machine learning to learn models and provide inference, and (iii) data markets to incentivize data sharing. The talk will provide a view of the past and the future. Our group presented MetaQNN, the earliest AutoML work using Q-learning [Baker et al. 2016]. Our group is leading a new form of distributed private learning, Split Learning [Gupta et al. 2017], as well as data markets [Vepakomma et al. 2018]. The PathCheck Foundation, a non-profit spin-off from our group at MIT, is building smartphone-based solutions for the pandemic, with exposure notification and vaccination coordination apps, server-side components, and predictive dashboards. We believe that automated ML, private ML, and data exchanges will play a critical role in pandemic response and pandemic preparedness.
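Of the three subproblems, Split Learning is the most concrete to sketch. Below is a minimal, hypothetical PyTorch illustration of the core mechanism: the network is cut at a chosen layer so raw data never leaves the client, and only cut-layer activations and their gradients cross the boundary. The layer sizes and networks are invented for illustration, not the actual system from [Gupta et al. 2017].

```python
import torch
import torch.nn as nn

# Hypothetical split of a small classifier: the client (e.g., a phone) keeps
# the first layers; only cut-layer activations and gradients are exchanged.
client_net = nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU())
server_net = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

client_opt = torch.optim.SGD(client_net.parameters(), lr=0.1)
server_opt = torch.optim.SGD(server_net.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

def split_training_step(x, y):
    # --- client side ---
    smashed = client_net(x)                   # cut-layer activations
    sent = smashed.detach().requires_grad_()  # the only tensor sent over the wire
    # --- server side ---
    server_opt.zero_grad()
    loss = loss_fn(server_net(sent), y)
    loss.backward()                           # fills server grads and sent.grad
    server_opt.step()
    # --- cut-layer gradient returns to the client ---
    client_opt.zero_grad()
    smashed.backward(sent.grad)
    client_opt.step()
    return loss.item()

# Toy batch standing in for on-device data (MNIST-sized images).
print(split_training_step(torch.randn(32, 1, 28, 28), torch.randint(0, 10, (32,))))
```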

4. Frank Hutter, University of Freiburg, Germany

Frank Hutter is a Professor and Head of the Machine Learning Lab at the University of Freiburg, Germany. In addition to his full-time role at the University of Freiburg, he consults for the Bosch Center for AI (BCAI) as Chief Expert for AutoML. He is interested in all facets of intelligence and how we can replicate it in artificial systems. In particular, he works on statistical machine learning (especially deep learning) to learn effective representations for large amounts of very noisy data (including uncertainty quantification); on automated problem solving, including knowledge representation; and on autonomously-learning software systems, which can improve their performance over time without the need for a human in the loop.

Title: Towards Robust and Efficient Meta-Learning for AutoDL
Much like deep learning (DL) replaces manually-engineered domain-specific features with automatically-learned representations, automated deep learning (AutoDL) promises to replace manually-engineered architectures and hyperparameters with automatically-learned choices, thereby enabling true end-to-end representation learning for new datasets. In this talk, I will discuss ways towards making AutoDL more robust and efficient, drawing on methods from meta-learning, neural architecture search, multi-fidelity hyperparameter optimization, and ensemble building.
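As one example of the multi-fidelity ingredient mentioned above, here is a minimal sketch of successive halving, the core subroutine behind methods such as Hyperband: evaluate many configurations cheaply, then repeatedly keep the best fraction and give the survivors more budget. The toy objective and all names are assumptions for illustration.

```python
import random

def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Keep the best 1/eta of configs at each rung, multiplying the budget by eta."""
    budget = min_budget
    while len(configs) > 1:
        scores = [(evaluate(c, budget), c) for c in configs]
        scores.sort(key=lambda s: s[0], reverse=True)  # higher is better
        configs = [c for _, c in scores[: max(1, len(configs) // eta)]]
        budget *= eta
    return configs[0]

# Toy objective standing in for "validation accuracy after `budget` epochs":
# configurations closer to lr=0.1 score higher; more budget means less noise.
def evaluate(config, budget):
    return -abs(config["lr"] - 0.1) + random.gauss(0, 1.0 / budget)

candidates = [{"lr": 10 ** random.uniform(-4, 0)} for _ in range(27)]
print("selected lr:", successive_halving(candidates, evaluate)["lr"])
```

The design intuition is that most bad configurations can be rejected on a small budget, so the full training budget is spent only on the few promising ones.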

5. Peter Tu, Chief Scientist, GE Global Research

Peter Tu is GE's Chief Scientist for Artificial Intelligence. He is a Principal Investigator for the DARPA GAILA and CREATE programs, which focus on language acquisition via visual grounding and inter-agent cooperation. He was also the Principal Investigator for the DARPA-sponsored Sherlock effort, focused on group-level social behavior understanding at a distance based on visual cues; the National Institute of Justice's 3D Face Enhancer program, focused on improving face recognition from poor-quality surveillance video; and the FBI ReFace project, tasked with developing an automatic system for face reconstruction from skeletal remains. His topics of interest include developing new forms of agency and awareness, the grounding problem, associative memory, and artificial intentionality. Dr. Tu has over 75 peer-reviewed publications and has filed more than 50 U.S. patents.

Title: Grounded Language Acquisition
This talk will describe investigations into the fundamental question of grounding: how can an agent acquire and represent the meaning of concepts? We take the view that an agent, prior to acquiring natural language, has experienced the world in a largely private manner. The agent has visual experiences from which it has learned that objects and object categories exist, that objects have attributes and persist over time, that events take place in a physical world, and that agents exist and have purpose. It may be the case that, having been exposed to the physical world, children encapsulate such knowledge in the form of a private language. In this talk we show that Emergent Languages, which can be constructed in an unsupervised manner, can be thought of as such a private language. A natural language such as English can be viewed as a repository of concepts. By applying unsupervised clustering to images described via an Emergent Language, we show that an agent can analyze such clusters and, with the aid of a natural language expert, map them to their associated natural language concepts. Another form of experience is exposure to spoken speech, from which syntax and statistical frequencies can be observed. This allows for a form of inductive learning in which an agent can expand its conceptual knowledge. Having discovered the existence of novel concepts and, through machine translation, mapped the associated Emergent Language descriptions to these concepts, our next task is to construct relationships between these concepts. The hypothesis is that the meaning of a concept emerges from its connections with other concepts. Relationships investigated include physical and causal, metaphorical, and pragmatic.
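To make the clustering-and-naming step concrete, here is a hypothetical sketch: images are described by discrete emergent-language messages, the messages are embedded and clustered without supervision, and a handful of expert-provided natural language labels name each cluster. The message generator, embedding, and labels are fabricated stand-ins, not the system from the talk.

```python
import numpy as np
from collections import Counter
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Stand-in for an emergent-language speaker: each image is described by a
# discrete message; images of the same hidden concept share a symbol prefix.
def describe(concept_id, vocab=20, length=8):
    msg = rng.integers(0, vocab, size=length)
    msg[: length // 2] = concept_id
    return msg

true_concepts = rng.integers(0, 3, size=300)   # hidden ground truth per image
messages = np.stack([describe(c) for c in true_concepts])

# Bag-of-symbols embedding of each message, then unsupervised clustering.
embeddings = np.stack([np.bincount(m, minlength=20) for m in messages])
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(embeddings)

# A natural language "expert" names a few examples per cluster; the majority
# name is then attached to the whole cluster.
names = {0: "dog", 1: "car", 2: "tree"}  # hypothetical expert labels
for k in range(3):
    sample = true_concepts[clusters == k][:5]
    print(f"cluster {k} ->", Counter(names[c] for c in sample).most_common(1)[0][0])
```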

6. Rama Chellappa, Professor, Johns Hopkins University

Rama Chellappa is a Bloomberg Distinguished Professor at Johns Hopkins University. Before joining Johns Hopkins, he was a Distinguished University Professor, a Minta Martin Professor in Engineering, and Chair of the Department of Electrical and Computer Engineering at the University of Maryland (UMD), College Park, Maryland. At UMD, he was also an affiliate professor in the Computer Science Department and the Applied Mathematics and Scientific Computing Program, a member of the Center for Automation Research, and a permanent member of the Institute for Advanced Computer Studies. His current research interests span many areas of image processing, computer vision, and machine learning. He recently received the inaugural Leadership Award from the IEEE Biometrics Council. He serves on the editorial boards of the SIAM Journal on Imaging Sciences and Image and Vision Computing, and has served as General and Technical Program Chair or Co-Chair for several IEEE international and national conferences and workshops. He is a Golden Core member of the IEEE Computer Society, has served as a Distinguished Lecturer of the IEEE Signal Processing Society and as President of the IEEE Biometrics Council, and is a Fellow of the IEEE, IAPR, OSA, AAAS, ACM, and AAAI. He holds six patents.

Title: Meta-learning: Something Old and Something New
Meta-learning, or learning to learn, is a theme that has been around for several decades. In the current data-driven paradigm, meta-learning has reemerged as a promising methodology for designing intelligent systems. In this talk, I will take a more general view of meta-learning in the context of traditional model-based approaches and recent data-driven methods for solving computer vision tasks. I will discuss meta-learning methods for designing supervisors for image understanding systems and self-healing trackers, as well as more recent methods for domain generalization, zero-shot detection, and bias mitigation. Two challenging computer vision problems that need meta-learning methods will then be presented. I will conclude by pointing out some of the shortcomings of meta-learning techniques as they are practiced today.

7. Walter Scheirer, University of Notre Dame

Walter J. Scheirer received the M.S. degree in computer science from Lehigh University, USA, in 2006 and the Ph.D. degree in engineering from the University of Colorado, Boulder, Colorado, USA, in 2009. He is currently an Associate Professor with the Department of Computer Science and Engineering, University of Notre Dame, USA. Prior to that, he was a Post-Doctoral Fellow with Harvard University, USA, and a Research Assistant Professor with the University of Colorado, USA. His research interests include computer vision, machine learning, biometrics, and digital humanities.

Title: Representational Dissimilarity Analysis as a Tool for Neural Network Model Search
Neuroscience theory posits that the brain's visual system coarsely identifies broad object categories via neural activation patterns, with similar objects producing similar neural responses. Artificial neural networks also exhibit internal activation behavior in response to stimuli. It can be hypothesized that networks exhibiting brain-like activation behavior will demonstrate brain-like characteristics, e.g., stronger generalization capabilities. This talk introduces a human-model similarity (HMS) metric, which quantifies the similarity between human fMRI and network activation behavior. To calculate HMS, representational dissimilarity matrices (RDMs) are created as abstractions of activation behavior, measured by the correlations of activations to stimulus pairs. HMS is then the correlation between the fMRI RDM and the neural network RDM across all stimulus pairs. The metric is tested on unsupervised predictive coding networks, which specifically model visual perception, and assessed for statistical significance over a large range of hyperparameters. The experiments show that increased human-model similarity correlates with better performance on two computer vision tasks: next-frame prediction and object matching. Further, HMS identifies networks with high performance on both tasks. An unexpected secondary finding is that the metric can be employed during training as an early-stopping mechanism.
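A minimal sketch of the HMS computation as described above, assuming the common representational-similarity-analysis conventions of correlation-distance RDMs and a rank correlation between their upper triangles; the data here are random stand-ins for real fMRI and network activations.

```python
import numpy as np
from scipy.stats import spearmanr

def rdm(activations):
    """Representational dissimilarity matrix: 1 - Pearson correlation between
    the activation patterns evoked by each pair of stimuli.
    `activations` has shape (n_stimuli, n_features)."""
    return 1.0 - np.corrcoef(activations)

def hms(fmri_acts, net_acts):
    """Human-model similarity: rank correlation between the upper triangles
    of the human fMRI RDM and the network RDM."""
    iu = np.triu_indices(len(fmri_acts), k=1)
    rho, _ = spearmanr(rdm(fmri_acts)[iu], rdm(net_acts)[iu])
    return rho

# Random stand-ins: 10 stimuli; feature dimensions need not match.
rng = np.random.default_rng(0)
fmri = rng.standard_normal((10, 500))                       # e.g., voxel responses
net = fmri[:, :200] + 0.5 * rng.standard_normal((10, 200))  # correlated model
print("HMS:", hms(fmri, net))
```

Because both RDMs live in stimulus-by-stimulus space, the comparison works even though the fMRI voxels and network units have completely different dimensionalities.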

8. Hien Van Nguyen, University of Houston

Hien Van Nguyen has been a tenure-track Assistant Professor at the University of Houston since 2016. He received his Ph.D. from the University of Maryland in 2013. He has co-authored 40+ journal, conference, and workshop papers and book chapters, and has filed 15+ patents. He has worked at Uber's self-driving car division, Siemens Corporate Research, the Army Research Lab, and Mitsubishi Electric Research Lab. He is also among the top three authors in non-linear dictionary learning according to ISI Web of Knowledge and Google Scholar.

Title: Meta-Learning for Medical Image Analysis
Accenture has estimated that artificial intelligence (AI) could save the healthcare industry $150 billion annually by 2026. Despite this great promise, the development and deployment of AI models face a significant challenge: data heterogeneity. Specifically, AI models trained on data from one hospital will likely suffer accuracy drops (biases) when applied to data from another hospital. For example, an algorithm developed at a hospital with a mostly Caucasian patient population will not achieve the same accuracy on minority patients from a local clinic. Meta-learning is a promising framework for mitigating this problem. In this talk, I will give a brief introduction to contemporary meta-learning algorithms and show their applications to biomedical image analysis.
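As a toy illustration of how meta-learning can address site-to-site heterogeneity, here is a minimal first-order (Reptile-style) sketch in which each "hospital" is a task with a site-specific distribution shift. Everything below is an invented stand-in, not a medical-imaging pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each "hospital" is a task: the same underlying relationship, but with a
# site-specific input shift (a stand-in for scanner, protocol, or
# population differences).
def make_hospital():
    shift = rng.normal(0, 1.0, size=3)
    def batch(n=32):
        x = rng.normal(0, 1, size=(n, 3)) + shift
        y = x @ np.array([1.0, -2.0, 0.5]) + rng.normal(0, 0.1, size=n)
        return x, y
    return batch

def sgd_steps(w, batch, steps=5, lr=0.05):
    # A few gradient steps of linear-regression MSE on one site's data.
    for _ in range(steps):
        x, y = batch()
        w = w - lr * 2 * x.T @ (x @ w - y) / len(y)
    return w

# Meta-training: nudge the shared initialization toward the weights that a
# few local steps produce at each hospital, so that adapting to a new
# hospital needs only a handful of local examples.
w_meta = np.zeros(3)
for _ in range(500):
    w_adapted = sgd_steps(w_meta, make_hospital())
    w_meta += 0.1 * (w_adapted - w_meta)

# Deployment at an unseen hospital: adapt briefly from the meta-initialization.
new_site = make_hospital()
w_site = sgd_steps(w_meta, new_site, steps=5)
x_test, y_test = new_site(256)
print("test MSE after 5 local steps:", np.mean((x_test @ w_site - y_test) ** 2))
```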
