Arif Ahmed

Bengaluru, India · arif.ahmed.5.10.1995@gmail.com

Hi, I'm Arif.

I have worked in early-stage startups and research organizations, mainly in deep learning and software development, and I am currently focusing on R&D deep learning roles in image/video. My work so far has involved applying state-of-the-art deep learning architectures to computer vision problems, from detection and segmentation to image and video synthesis, with a focus on lipsync and high-fidelity restoration.

I'm keen to expand beyond these applications into areas like Text-to-Image and Image-to-Video generation, and to deepen my experience with Vision Language Models (VLMs) and VoiceAI.

I like cooking, classical paintings and 70s classic rock. Oh, cats and dogs too :)

Experience

Founding Engineer

Stealth Startup

Partnered with a couple of guys to build an end-to-end Agentic AI solution for upcoming/mid D2C brands in the performance marketing space.

We had interest from a couple of Tier-1 VCs, but we ourselves were not sure if our GTM strategy made sense given the recent heavy democratization of Image/Video GenAI. It was very unlikely that we would raise significant capital, so we decided to let it go. It was an interesting deviation from a normal career path, and I learned a lot from the experience.

December 2024 - April 2025

Deep Learning Scientist

Was the primary research engineer for the visual GenAI offerings of the content localization platform (Spectral Studios). My day-to-day work was productizing image and video generation algorithms (lipsync, talking heads, identity style transfer): modifying architectures to work for production-level use cases, improving training strategy and inference code, and training/fine-tuning on custom data. I also built an internal tool to crawl YouTube and curate custom datasets.

Reference: Shashank Bhalotia (Manager)

August 2023 - November 2024

Machine Learning Engineer (MLE-I)

Researched image/video generation algorithms for improving lipsync, metrics for visual quality and lipsync, and computer vision techniques for face blending. Worked with UNet-based GAN and transformer architectures.

Reference: Subhashish Saha (Co-Founder & COO)

November 2022 - August 2023

Freelancer

Various Clients

The majority of tasks were for Osoyoo and NutrienAg. I worked on Raspberry Pi I2C drivers, fine-tuning audio DNNs, and minor data cleaning. NutrienAg has a wide range of smart farm solutions that involve analysing satellite imagery and ground audio data. There I worked on pest identification and crop disease analysis.

February 2021 - August 2022

Product Engineer

Was part of a small but effective three-member software team. Responsible for many major improvements to the existing codebase as well as new implementations/products. Worked on on-board software for RasPi, full-stack web development, I2C drivers, graph traversal algorithms, etc. Had a chance to explore multiple domains of software and some aspects of HW-SW communication.

My work was mostly on Annie, an all-in-one Braille dicta reader (their flagship device), and Chakravyuh, a COVID monitoring tool (winner of the CAWACH grant).

September 2019 - December 2020

Lead Software Engineer

A pre-seed idea to develop an affordable desktop 3D bio printer with a universal extruder and a simple GUI. The printer was designed for biologists to control and analyze the printing process.

We had a basic prototype and had won some minor prize money as well. I was the sole developer of the application used to interact with the 3D bio printer. Unfortunately, the venture did not work out and the team later disbanded. The now defunct software is put up here.

July 2018 - November 2019

Projects

E-Commerce Inpainting SDXL

A ComfyUI inpainting-based workflow using base SDXL + LoRA checkpoints to generate ControlNet-guided images of a model holding a physical product.

Read more...

Fashion Try On and Animate

A ComfyUI image2video workflow to generate a fashion-forward animated clip.

Read more...

Modeling non-linear Audio Effects

A complete implementation of a DNN model to learn the non-linear distortion effect applied to an acoustic guitar sound.

Read more...

Spectral-Spatial Classification of Hyperspectral Images

An image processing algorithm to classify regions based on hyperspectral data.

Read more...

Music Generation using a Differentiable Neural Computer

An attempt to generate sequential data (specifically piano music) using a Differentiable Neural Computer.

Read more...

Study Project: Multi-Agent Diverse GANs

A study project to understand the workings of Multi-Agent Diverse Generative Adversarial Networks (MAD-GANs).

Read more...

Google Summer of Code 2017 with SymPy

Project in Computational Mathematics. This involved implementing the algorithm from a 2017 paper on a numerical method to compute the integral of an arbitrary polynomial over 2- and 3-polytopes. Detailed information is available in the report (the link in the Read More section).

Read more..

Google Summer of Code 2018 with CERN-HSF

Project in Parallelization. The bulk of the work was to parallelize the various Math and Fitting functions in ROOT (a library for particle physics data analysis). Other minor work was to improve VecCore, which in turn uses various backend libraries to parallelize basic math functions.

Read more..

Skills

Languages, Operating Systems & Tools
  • Python
  • C/C++
  • Java
  • Git
  • Linux
  • Bash
  • JavaScript
Libraries
  • PyTorch
  • Keras
  • TensorFlow
  • scikit-learn
  • NumPy
  • SymPy
  • OpenCV
  • GPIO
DBMS
  • MySQL

Education

2015 - 2019