Devin Shah

On a mission to apply ML to impactful domains. Every place I've worked has challenged me more than the last.

Octane Security

Co-founder and Head of ML - 2022-2023

In 2022, I co-founded a blockchain security startup that uses ML to find vulnerabilities in smart contracts. We raised $1.7 million from Alchemy, OrangeDAO, Hyperithm, Duke Capital Partners, Druid Ventures, Symbolic Capital, and Builder Capital, with angels and advisors from Ledger, Apple, Quantstamp, Blowfish, and Meta.

Developed a system for traversing and classifying control-flow graphs to detect control-flow vulnerabilities in smart contracts, such as reentrancy and arbitrary external calls.
Created and tested a Transformer-based fuzzer that filtered transactions based on the likelihood of breaking a predefined invariant or bug oracle. Used this method to augment an existing greybox concolic tester and a symbolic execution engine.
Used SFT and RLHF to fine-tune and steer language models toward identifying vulnerable code that evaded all traditional static analyzers.
Set up hosting and infrastructure for all ML-based workflows for easy integration into the Octane platform.
Left the company after helping build the core tech; the company continues to thrive.

Tech Stack

Ethereum, WorkOS, OpenAI, PyTorch, Terraform, HuggingFace, Qdrant, Mistral, AWS API Gateway, AWS EC2, AWS ECS, AWS Lambda, Linear

Advising

Machine Learning Advisor - 2023-2024

I am working as an advisor for a few companies looking to integrate AI into their existing business workflows.

Sagiliti: Built a PoV AI-OCR system to automate extraction of data from bill copies. Using OpenAI models and clever type-checking, I am able to automate 90% of their manual extraction process.
Project Imagine: Helping create AI agents that intelligently extract financial data to help executives make better decisions and get analysis on unfamiliar domains for clients.

Tech Stack

OpenAI, Mistral, DSPy, Kubernetes, Airflow

Duke Neurotoolbox Lab

Machine Learning Researcher - 2021-2024

I am working in the Neurotoolbox Lab on improving segmentation of neurons in 2-photon calcium imaging videos. I am advised by Dr. Yiyang Gong and Dr. Yijun Bao.

Developed an active learning architecture that reduced the number of labeled neurons needed to reach SOTA accuracy by 50x and to reach baseline accuracy by 10x.
Created a technique that uses the data from active learning to identify neurons that impact convergence; these turned out to be low-SNR, low-frequency neurons. Isolating them led to faster convergence to the baseline F1.
Ran experiments efficiently across multiple GPUs, cutting a process that would have taken days down to hours.

Paper coming soon!

Tech Stack

MATLAB, TensorFlow, CUDA and cuDNN

SuperbAI

Machine Learning Engineer - 2022

I worked as a machine learning engineer for SuperbAI, a company that automates data labeling for enterprise ML applications.

Implemented an approach to estimate training data influence by tracing gradient descent.
Extended the approach to object detection to find false negatives in human-labeled datasets. Treated mislabeled instances as their own class, based on mixed confidences from their originally labeled classes.
Detected over 50% of false negatives and reported them to human labelers as a form of feedback.
Worked mostly with self-driving datasets with varying environments (weather, cities, time of day, etc.).

Tech Stack

PyTorch, HuggingFace, AWS EC2, AWS ECS

AiFi

Computer Vision Team, Summer Intern - 2021

I worked at AiFi, a company that uses a camera-based computer vision system to automate retail stores. They have clients such as Zabka, Microsoft, Verizon, and various NFL, NBA, and Premier League stadiums.

Spearheaded an effort to use domain-randomized synthetic data to train product recognition algorithms.
Developed a product auto-labeling pipeline based on instance segmentations from simulation data.
Generated photorealistic data from unpaired real and simulation images using cycle-consistent GANs.
Pretrained YOLOv5 on synthetic data and used a small set of real examples for SFT; this reduced store deployment time from 2 weeks to 2 days and increased new SKU detection accuracy by 80%.

Tech Stack

PyTorch, Azure, Unity, Ultralytics

Stanford University School of Medicine

Research and Development Intern - 2019-2021

I interned at the Bacchetta Lab in the Stanford School of Medicine Department of Pediatrics. I worked with PhD student Esmond Lee.

Designed experiments to increase the NGFR+ percentage (editing rate) for FOXP3 gene editing in hematopoietic stem progenitor cells (HSPCs).
Used a design of experiments (DoE) approach to optimize CRISPR-Cas9 editing of HSPCs. Analyzed flow cytometry data using FlowJo.
Created a cost analysis to show a reduction in cost of reagents for applications in clinical settings.
Published a manuscript in the journal Cytotherapy.

Tech Stack

FlowJo, Origin Lab, CRISPR-Cas9, MODDE DoE
