😊 Bio

I am an AI Research Leader with 15+ years of experience advancing AI, from classical computer vision and machine learning to today's foundation and multimodal generative models.

I specialize in adaptive and collaborative multimodal learning and generation, with a forward-looking emphasis on: 1) specialization of large multimodal and diffusion-based models, 2) controllable multimodal generation and editing for data synthesis, 3) interaction-native modeling and learning, and 4) impact on real-world industrial applications.

My background integrates deep research roots with industrial execution:

  • During nearly a decade at Amazon, I served as a Principal Scientist, leading high-impact core research and product efforts across Prime Video, Alexa, and mobile/.com shopping. I co-developed novel models and architectures for video understanding, vision-language representation, Large Multimodal Models, and diffusion models. My work powered AI-driven features such as live sports highlights, virtual try-on, interactive product recommendations, and shopping assistants, used by millions of users worldwide and generating O(XXM) USD in business impact.
  • I spent the first part of my career in academia, obtaining my Ph.D. in Computer Science from the University of Verona (Italy) in 2012, supervised by Prof. Vittorio Murino and Prof. Marco Cristani. I was a visiting student at the University of British Columbia with Prof. Nando de Freitas, and later a postdoctoral fellow at the Italian Institute of Technology with Prof. Vittorio Murino and at Dartmouth College with Prof. Lorenzo Torresani.

Creativity fuels my work in both science and sound. A long-time keyboardist and former band member, I am currently composing and producing original music in my home studio. You can explore my latest tracks on SoundCloud.

📢 News

  • Now: I am looking for my next role, in an environment where I can lead high-stakes innovation and drive the next generation of AI for industry.
  • Mar 2, 2026: Invited guest lecture on Multimodal Intelligence at the University of Utah. Thanks Ziad!
  • Feb 21, 2026: 1/2 papers accepted at CVPR 2026!
  • Dec 5, 2025: Invited speaker at the University of Trento and FBK. Thanks Yiming!
  • Nov 28, 2025: Invited speaker at the Turin AI Fall School 2025. Thanks Tatiana!
  • Oct 27, 2025: Invited speaker at the IIT. Thanks Vittorio!

📝 Research, Publications and Patents [Google Scholar]

Interactive Episodic Memory with User Feedback

N. Subedi, L. Bazzani, Z. Al-Halah

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2026

PDF

Multi-Level Conditioning by Pairing Localized Text and Sketch for Fashion Image Generation

Z. Liu, D. Talon, F. Girella, Z. Ruan, M. Mondo, L. Bazzani, Y. Wang, M. Cristani

arXiv, 2026

Project PDF

Med-MMFL: A Multimodal Federated Learning Benchmark in Healthcare

A. Chhetri, B. Niroula, P. Shrestha, Y. R. Shrestha, L. A. Anderson, P. K. Gyawali, L. Bazzani, B. Bhattarai

arXiv, 2026

PDF Code

Learning Visual Hierarchies in Hyperbolic Space for Image Retrieval

Z. Wang, S. Ramasinghe, C. Xu, J. Monteil, L. Bazzani, T. Ajanthan

In International Conference on Computer Vision (ICCV), 2025

PDF

LatteCLIP: Unsupervised CLIP Fine-Tuning via LMM-Synthetic Texts

A. Cao, M. Jaritz, M. Guillaumin, R. de Charette, L. Bazzani

In IEEE Winter Conference on Applications of Computer Vision (WACV), 2025

PDF Code

UniCoRN: Unified Commented Retrieval Network with LMMs

M. Jaritz, M. Guillaumin, S. Sternig, L. Bazzani

arXiv, 2025

PDF

ViewFusion: Towards Multi-View Consistency via Interpolated Denoising

X. Yang, Y. Zuo, S. Ramasinghe, L. Bazzani, G. Avraham, A. van den Hengel

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024

Project PDF Code

iEdit: Localised Text-guided Image Editing with Weak Supervision

R. Bodur, E. Gundogdu, B. Bhattarai, T-K Kim, M. Donoser, L. Bazzani

In Computer Vision and Pattern Recognition (CVPR) Workshops, 2024

PDF

[Patent] Interactive Retrieval Using Visual Semantic Matching

US-11720942, 2023

PDF

[Patent] Localized Visual Similarity

US-11809520, 2023

PDF

[Patent] Attribute-based Interactive Product Recommendations

US-11829445, 2023

PDF

[Patent] Visual Blending of Content

US-11416910, 2022

PDF

[Patent] Machine Learning System to Score Alt-text in Image Data

US-11361212, 2022

PDF

Contrastive Language-Action Pre-training for Temporal Localization

M. Xu, E. Gundogdu, M. Lapin, B. Ghanem, M. Donoser, L. Bazzani

arXiv, 2022

PDF

Learning Attribute-driven Disentangled Representations for Interactive Fashion Retrieval

Y. Hou, E. Vig, M. Donoser, L. Bazzani

In International Conference on Computer Vision (ICCV), 2021

PDF Code

Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning

A. Salvador, E. Gundogdu, L. Bazzani, M. Donoser

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021

PDF Code

Localized Triplet Loss for Fine-Grained Fashion Image Retrieval

A. D’Innocente, N. Garg, Y. Zhang, L. Bazzani, M. Donoser

In Computer Vision and Pattern Recognition (CVPR) Workshops, 2021

PDF

Learning Joint Visual Semantic Matching Embeddings for Language-guided Retrieval

Y. Chen, L. Bazzani

In European Conference on Computer Vision (ECCV), 2020

PDF

Image Search with Text Feedback by Visiolinguistic Attention Learning

Y. Chen, S. Gong, L. Bazzani

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020

Project PDF Code

[Patent] Automated Video Ratings

US-10643074, 2020

PDF

Image Captioning as Neural Machine Translation Task in SOCKEYE

L. Bazzani, T. Domhan, F. Hieber

arXiv, 2018

PDF Code

Recurrent Mixture Density Network for Spatiotemporal Visual Attention

L. Bazzani, H. Larochelle, L. Torresani

International Conference on Learning Representations (ICLR), 2017

Project PDF Video

Group Detection and Tracking using Sociological Features

S. Vascon, L. Bazzani

Group and Crowd Behavior for Computer Vision, 2017

PDF

Approximate Log-Hilbert-Schmidt distances between covariance operators for image classification

H. Q. Minh, M. San Biagio, L. Bazzani, V. Murino

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016

PDF

Self-taught object localization with deep networks

L. Bazzani, A. Bergamo, D. Anguelov, L. Torresani

In IEEE Winter Conference on Applications of Computer Vision (WACV), 2016

Project PDF Code

A Unifying Framework in Vector-valued Reproducing Kernel Hilbert Spaces for Manifold Regularization and Co-Regularized Multi-view Learning

H. Q. Minh, L. Bazzani, V. Murino

Journal of Machine Learning Research (JMLR), 2016

Project PDF Code

Kernel Methods on Approximate Infinite-Dimensional Covariance Operators for Image Classification

H. Q. Minh, M. San Biagio, L. Bazzani, V. Murino

arXiv, 2016

PDF

Joint Individual-Group Modeling for Tracking

L. Bazzani*, M. Zanotto*, M. Cristani, V. Murino

IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2015

PDF Video Dataset

SDALF: modeling human appearance with symmetry-driven accumulation of local features

L. Bazzani, M. Cristani, V. Murino

Person Re-identification, 2014

Project PDF Code Video

Weighted bag of visual words for object recognition

L. Bazzani*, M. San Biagio*, M. Cristani, V. Murino

In IEEE International Conference on Image Processing (ICIP), 2014

PDF

A unifying framework for vector-valued manifold regularization and multi-view learning

H. Q. Minh, L. Bazzani, V. Murino

The 30th International Conference on Machine Learning (ICML), 2013

Project PDF Code

Semi-supervised multi-feature learning for person re-identification

D. Figueira, L. Bazzani, H.Q. Minh, M. Cristani, A. Bernardino, V. Murino

In International Conference on Advanced Video and Signal-based Surveillance (AVSS), 2013

PDF

Person re-identification with a PTZ camera: an introductory study

P. Salvagnini, L. Bazzani, M. Cristani, V. Murino

In International Conference on Image Processing (ICIP), 2013

PDF

Symmetry-driven accumulation of local features for human characterization and re-identification

L. Bazzani, M. Cristani, V. Murino

Computer Vision and Image Understanding (CVIU), 2013

Project PDF Code Video

Social interactions by visual focus of attention in a three-dimensional environment

L. Bazzani, D. Tosato, M. Cristani, M. Farenzena, G. Paggetti, G. Menegaz, and V. Murino

Expert Systems, 2013

Project PDF Code Video

Decentralized particle filter for joint individual-group tracking

L. Bazzani, M. Cristani, V. Murino

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012

PDF Video Dataset

Learning where to attend with deep architectures for image tracking

M. Denil, L. Bazzani, H. Larochelle, and N. de Freitas

Neural Computation, 2012

Project PDF Code Video Dataset

Re-identification with RGB-D sensors

B. I. Barbosa, M. Cristani, A. Del Bue, L. Bazzani, V. Murino

In 1st International Workshop on Re-Identification, 2012

PDF Dataset

Online Bayesian non-parametrics for social group detection

M. Zanotto, L. Bazzani, M. Cristani, V. Murino

In British Machine Vision Conference (BMVC), 2012

PDF

Analyzing groups: a social signaling perspective

L. Bazzani, M. Cristani, G. Paggetti, D. Tosato, G. Menegaz, and V. Murino

Video Analytics for Business Intelligence, 2012

Project PDF Code Video

Multiple-shot person re-identification by chromatic and epitomic analyses

L. Bazzani, M. Cristani, A. Perina, and V. Murino

Pattern Recognition Letters (PRL), 2012

PDF

Learning attentional policies for object tracking and recognition in video with deep networks

L. Bazzani, N. de Freitas, H. Larochelle, V. Murino, J-A Ting

The 28th International Conference on Machine Learning (ICML), 2011

Project PDF Code Video Dataset

Custom pictorial structures for re-identification

D. S. Cheng, M. Cristani, M. Stoppa, L. Bazzani, V. Murino

In British Machine Vision Conference (BMVC), 2011

Project PDF Video Dataset

Social interaction discovery by statistical analysis of F-formations

M. Cristani, L. Bazzani, G. Paggetti, A. Fossati, D. Tosato, A. Del Bue, G. Menegaz, V. Murino

In British Machine Vision Conference (BMVC), 2011

Project PDF Dataset

Towards computational proxemics: Inferring social relations from interpersonal distances

M. Cristani, G. Paggetti, A. Vinciarelli, L. Bazzani, G. Menegaz, V. Murino

In International Conference on Social Computing (SocialCom), 2011

PDF

Multiple-shot person re-identification by HPE signature

L. Bazzani, M. Cristani, A. Perina, M. Farenzena, V. Murino

In International Conference on Pattern Recognition (ICPR), 2010

PDF

Person re-identification by symmetry-driven accumulation of local features

M. Farenzena, L. Bazzani, A. Perina, M. Cristani, V. Murino

In Conference on Computer Vision and Pattern Recognition (CVPR), 2010

Project PDF Code Video

Collaborative particle filters for group tracking

L. Bazzani, M. Cristani, V. Murino

In International Conference on Image Processing (ICIP), 2010

PDF

💼 My Experience

2026 - now

Independent Researcher & Innovation Leader

Driving independent research and developing early-stage prototypes on adaptive and cooperative multimodal intelligence, focusing on the intersection of LMMs and interactive systems.
2025 - now

Adjunct Professor, University of Verona

Teaching Data Visualization as part of the Master’s degree in Data Science.
2016 - 2025

Principal Scientist, Amazon

Led high-impact core research and product efforts across Prime Video, Alexa, and mobile/.com shopping, and co-developed novel models for video understanding, vision-language representation, LMMs, and diffusion models.
2014 - 2015

Postdoc, Dartmouth College

Research on video understanding, video saliency, and object localization and detection. Collaborated with Prof. Lorenzo Torresani and Prof. Hugo Larochelle.
2011 - 2013

Postdoc, Italian Institute of Technology

Research on video understanding, object recognition, Bayesian networks, and kernel-based methods. Collaborated with Prof. Vittorio Murino.
2009 - 2012

PhD in Computer Vision, University of Verona

Research on person re-identification, video understanding, tracking and attentional models. Supervised by Prof. Vittorio Murino and Prof. Marco Cristani.
2010

Visiting Student, University of British Columbia

Collaborated with Prof. Nando de Freitas.