About me
I work as a Principal Researcher at the Samsung AI Center in Cambridge, UK, where I lead a group focusing on video analysis, with special attention to human action analysis. This sits within the center's overall theme of human-centric AI. Our task at SAIC-Cambridge is to solve the challenges stemming from Samsung's product family through technically novel approaches, and to produce technical advancements that can act as enablers for future products. We regularly publish our work at the top venues in computer vision and machine learning.
Previously I worked for about three years at Amazon in Seattle, where I had the pleasure of being part of the Amazon Go and AWS Rekognition teams.
I am interested in a wide variety of topics in machine learning and computer vision; in fact, it is most often the process, the team, or the prospect of impact that I find most appealing, rather than the topic itself. While most of my work during my years in academia was on face analysis, in recent years I have worked on topics as diverse as human action recognition, binary CNNs, knowledge distillation and lipreading.
This is my Google Scholar profile.
News
- 4 papers at ICCV'23
ReGen: A good Generative zero-shot video classifier should be Rewarded
Black Box Few-Shot Adaptation for Vision-Language Models
Paper: https://arxiv.org/abs/2304.01752
FSD-Prompt: Few-Shot Detection Prompting without retraining
Paper: https://arxiv.org/abs/2210.04845
Bayesian Prompt Learning for Image-Language Model Generalization
Paper: https://arxiv.org/abs/2210.02390
- 1 paper at ICLR'23
Efficient Self-supervised Pre-training on Low-compute Networks without Distillation
Paper: https://arxiv.org/abs/2210.02808 Code: https://github.com/saic-fi/SSLight
- 3 papers at ECCV'22
EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers
Efficient transformers for Mobile devices
Paper: https://arxiv.org/abs/2205.03436 Code: https://github.com/saic-fi/edgevit
Learning hand-held object appearance for compositional action recognition:
SOS! Self-supervised Learning Over Sets Of Handled Objects In Egocentric Action Recognition
I was also lucky to be (a small) part of the work led by SAIC-Toronto on instructional videos, accepted as an oral:
Flow Graph to Video Grounding for Multi-Step Localization
- 2 papers at BMVC'21
Preprints of the two BMVC'21 papers are available on arXiv:
Few-shot Action Recognition with Prototype-centered Attentive Learning
Knowing What, Where and When to Look: Efficient Video Action Modeling with Attention
- 2 papers at NeurIPS'21
Check out the pre-print versions of our NeurIPS'21 papers:
Space-time Mixing Attention for Video Transformer
Low-Fidelity End-to-End Video Encoder Pre-training for Temporal Action Localization
- 1 new paper at ICCV'21
One ICCV'21 paper on temporal action localization: Boundary-sensitive Pre-training for Temporal Localization in Videos
- 1 paper at ICASSP'21
Our work on lipreading has been accepted for publication at ICASSP'21. See the arXiv version.
- Two new ICLR'21 papers
One ICLR'21 paper on knowledge distillation: Knowledge Distillation via Softmax Regression Representation Learning
Code is publicly available here.
One ICLR'21 paper on binary neural networks: High-Capacity Expert Binary Networks
- AC@ICCV'21
I've been selected to act as Area Chair for the upcoming ICCV'21.
- Organizing CVPR'21 Workshop
I'm co-organizing the 1st workshop on Binary Networks, to be held in conjunction with CVPR'21.
- ECCV'20 paper accepted
BATS: Binary ArchitecTure Search has been accepted for ECCV'20.
- ICLR'20 paper accepted
Our paper on Binary CNNs has been accepted at ICLR'20. It sets a new state of the art for binary networks: 65.4% top-1 accuracy on ImageNet using a binary ResNet18 (an improvement of over 5%!).
- ICASSP'20 paper accepted
Our paper on lipreading has been accepted at ICASSP'20 for an oral presentation. It raises the state of the art on LRW and LRW1000 by 1.2% and 3.2% top-1 accuracy, respectively. You can check out the arXiv version here.
- ICCV'19 paper accepted
Our paper on action recognition has been accepted at ICCV'19. We achieve 78.8 on Kinetics400 and 53.4 on Something-Something V1, without even using two-stream or non-local networks. The paper can be accessed here.
- Moving to Samsung
From April 2019 I'll be part of the Samsung AI Research Center in Cambridge, UK, in a new role as Senior Researcher.
- TPAMI accepted!
You can check it through the IEEE Xplore page. Alternatively, there is an arXiv version.
- Paper at ECCV'16
You can check it on arXiv here: https://arxiv.org/abs/1608.01137
- Amazon move
From June 2016 I'll be part of Amazon in a new role as Research Scientist. I'll thus be leaving my position at the University of Nottingham.
- Co-organizing ChaLearn
I'm a co-organizer of the ChaLearn LAP and FotW challenge and workshop @ CVPR 2016.
The challenge page: http://gesture.chalearn.org/
- Organising BMVA Technical Meeting
The Computational Face - Automatic Face Analysis and Synthesis
One Day BMVA symposium in London, UK on 14th October, 2015
Chairs: Brais Martinez, Yorgos Tzimiropoulos and Michel Valstar
Keynote speakers: Tim Cootes (University of Manchester), Darren Cosker (University of Bath), Maja Pantic (Imperial College London), Richard Bowden (University of Surrey)
Webpage and Registration: http://www.bmva.org/meetings