Arif Ahmed

Bengaluru, India · arif.ahmed.5.10.1995@gmail.com

Hi, I'm Arif.

I have worked in early-stage startups and research organizations, mainly in Deep Learning and Software Development, and am currently focusing on R&D Deep Learning roles in Image/Video. My work so far has applied state-of-the-art deep learning architectures to computer vision problems, from detection and segmentation to image and video synthesis, with a focus on lipsync and high-fidelity restoration.

I'm keen to expand beyond these applications into areas like Text-to-Image and Image-to-Video generation. I'm also eager to deepen my experience with Vision Language Models (VLMs) and VoiceAI.

I like cooking, classical paintings and 70s classic rock. Oh, cats and dogs too :)

Experience

Founding Engineer

Stealth Startup

Partnered with a couple of guys to build an end-to-end Agentic AI solution for upcoming/mid D2C brands in the performance marketing space.

We had interest from a couple of Tier-1 VCs, but we ourselves were not sure whether our GTM strategy made sense given the recent heavy democratization of Image/Video GenAI. It was very unlikely that we would raise significant capital, so we decided to let it go. It was an interesting deviation from a normal career path and I learned a lot from the experience.

December 2024 - April 2025

Deep Learning Scientist

Cynapto Technologies (SpectralStudios.ai)

Was the primary research engineer for the visual GenAI offerings of the content localization platform (Spectral Studios). My day-to-day work was productizing Image and Video Generation algorithms (lipsync, talking heads, identity style-transfer): modifying architectures to work for production-level use cases, improving training strategy and inference code, and training/fine-tuning on custom data. I also built an internal tool to crawl YouTube and curate custom datasets.

Reference: Shashank Bhalotia (Manager)

August 2023 - November 2024

Machine Learning Engineer (MLE-I)

NeuralGarage (VisualDub.ai)

Researched image/video generation algorithms for improving lipsync, metrics for visual quality and lipsync, and computer vision techniques for face blending. Worked with UNet-based GAN and transformer architectures.

Reference: Subhashish Saha (Co-Founder & COO)

November 2022 - August 2023

Freelancer

Various Clients

The majority of tasks were for Osoyoo and NutrienAg. I worked on Raspberry Pi I2C drivers, fine-tuning audio DNNs, and minor data cleaning. NutrienAg has a large range of smart farm solutions that involve analysing satellite visual data and ground audio data. I worked on pest identification and crop disease analysis.

February 2021 - August 2022

Product Engineer

Thinkerbell Labs

Was part of a small but effective three-member software team. Responsible for many major improvements to the existing codebase as well as new implementations/products. Worked on on-board software for RasPi, web full-stack, I2C drivers, graph traversal algorithms, etc. Had a chance to explore multiple domains of software and some aspects of HW-SW communication.

My work was mostly on Annie, an all-in-one Braille dicta reader (their flagship device), and Chakravyuh, a COVID monitoring tool (winner of the CAWACH grant).

September 2019 - December 2020

Lead Software Engineer

BioP India

A pre-seed idea to develop an affordable desktop 3D bio printer providing a universal extruder and a simple GUI. The printer was designed for biologists to control and analyze the printing process.

We had a basic prototype and had won some minor prize money as well. I was the sole developer of the application used to interact with the 3D bio printer. Unfortunately, the venture did not work out and the team eventually disbanded. The now-defunct software is put up here.

July 2018 - November 2019

Projects

E-Commerce Inpainting SDXL

A ComfyUI inpainting-based workflow using base SDXL + LoRA checkpoints to generate ControlNet-guided images of a model holding a physical product.

Fashion Try On and Animate

A ComfyUI image2video workflow to generate a fashion-forward animated clip.

Modeling non-linear Audio Effects

A complete implementation of a DNN model to learn the non-linear distortion effect applied to an acoustic guitar sound.

Spectral-Spatial Classification of Hyperspectral Images

An image processing algorithm to classify regions based on hyperspectral data.

Music Generation using a Differentiable Neural Computer

An attempt to generate sequential data (specifically piano music) using a Differentiable Neural Computer.

Study Project: Multi-Agent Diverse GANs

This was a study project to understand the workings of multi-agent diverse generative adversarial networks.

Open Source Contributions

Google Summer of Code 2017 with SymPy

Project in Computational Mathematics. This involved implementing the algorithm from a 2017 paper on a numerical method to compute the integral of an arbitrary polynomial over 2- and 3-polytopes. Detailed information is present in the report (the link in the Read More section).

Google Summer of Code 2018 with CERN-HSF

Project in Parallelization. The bulk of the work was to parallelize the various Math and Fitting functions present in ROOT (a library for particle physics data analysis). Other minor work was to improve VecCore, which in turn uses various backend libraries to parallelize basic math functions.

Skills

Languages, Operating Systems & Tools

Python
C/C++
Java
Git
Linux
Bash
JavaScript

Libraries

PyTorch
Keras
TensorFlow
scikit-learn
NumPy
SymPy
OpenCV
GPIO

DBMS

MySQL

Education

Birla Institute of Technology and Science Pilani, Goa Campus

Computer Science and Mathematics

2015 - 2019