Hi, I'm Arif.
Worked in early-stage startups and research organizations, mainly in Deep Learning and Software Development. Currently focusing on R&D Deep Learning roles in Image/Video. My work so far has involved applying state-of-the-art deep learning architectures to computer vision problems, from detection and segmentation to image and video synthesis, with a focus on lipsync and high-fidelity restoration.
I'm keen to expand my expertise beyond current applications, exploring new frontiers in areas like Text-to-Image and Image-to-Video generation. I'm also eager to deepen my experience with Vision Language Models (VLMs) and VoiceAI.
I like cooking, classical paintings and 70s classic rock. Oh, cats and dogs too :)
Partnered with a couple of guys to build an end-to-end Agentic AI solution for upcoming/mid D2C brands in the performance marketing space.
Had interest from a couple of Tier-1 VCs, but we ourselves were not sure if our GTM strategy made sense given the recent heavy democratization of Image/Video GenAI. It was very unlikely that we would raise significant capital, so we decided to let it go. It was an interesting deviation from a normal career path and I learned a lot from this experience.
Was the primary research engineer for the visual GenAI offerings of the content localization platform (Spectral Studios). My day-to-day work was productizing Image and Video Generation algorithms (lipsync, talking heads, identity style-transfer): modifying architectures to work for production-level use cases, improving training strategy and inference code, and training/fine-tuning on custom data. I also built an internal tool to crawl YouTube and curate custom datasets.
Reference: Shashank Bhalotia (Manager)
Researched image/video generation algorithms for improving lipsync, metrics for visual quality and lipsync accuracy, and computer vision techniques for face blending. Worked with UNet-based GAN and transformer architectures.
Reference: Subhashish Saha (Co-Founder & COO)
The majority of tasks were for Osoyoo and NutrienAg. I worked on Raspberry Pi I2C drivers, fine-tuning audio DNNs, and minor data cleaning. NutrienAg has a large range of smart farm solutions, involving analysing satellite visual and ground audio data. I worked on pest identification and crop disease analysis.
Was part of a small but effective three-member software team. Responsible for many major improvements to the existing codebase as well as new implementations/products. Worked on on-board software for RasPi, web full-stack, I2C drivers, graph traversal algorithms, etc. Had a chance to explore multiple domains of software and some aspects of HW-SW communication.
My work was mostly on Annie, an all-in-one Braille dicta reader (their flagship device), and Chakravyuh, a COVID monitoring tool (winner of the CAWACH grant).
A pre-seed idea to develop an affordable desktop 3D bio printer providing a universal extruder and a simple GUI. The printer was designed for biologists to control and analyze the printing process.
We had a basic prototype and had won some minor prize money as well. I was the sole developer of the application used to interact with the Bio 3D printer. Unfortunately, the venture did not work out and the team eventually disbanded. The now-defunct software is put up here.
A ComfyUI inpainting-based workflow using base SDXL + LoRA checkpoints to generate ControlNet-guided images of a model holding a physical product.
A ComfyUI image2video workflow to generate a fashion-forward animated clip.
A complete implementation of a DNN model to learn the non-linear distortion effect applied to an acoustic guitar sound.
An image processing algorithm to classify regions based on hyperspectral data.
An attempt to generate sequential data (specifically piano music) using a Differentiable Neural Computer.
This was a study project to understand the workings of multi-agent diverse generative adversarial networks.
Project in Computational Mathematics. This involved implementing the algorithm from a 2017 paper on a numerical method to compute the integral of an arbitrary polynomial over 2/3-Polytopes. Detailed information is available in the report (linked in the Read More section).
Project in Parallelization. The bulk of the work was to parallelize the various Math and Fitting functions present in ROOT (a library for particle physics data analysis). Other minor work was to improve VecCore, which in turn used various backend libraries to parallelize basic math functions.