Portfolio | Tai Vu

Projects

Artificial Intelligence

Nano-Reasoner: Unified Post-Training Framework for Math Reasoning Models

An end-to-end, research-grade, memory-efficient post-training pipeline for reasoning models with SFT and RL

Nano-Transformer: An End-to-End LLM Pretraining Stack from Scratch

A compact, fully inspectable transformer language model and training stack built from scratch in PyTorch

PRM-Math: Inference-Time Compute Scaling via Dense Process Supervision

Inference-time compute scaling strategies with a fine-tuned Process Reward Model

Nano-Video-Gen: Spacetime Diffusion Transformer with Rectified Flow Matching

A Diffusion Transformer video generator from scratch with 3D spacetime patch embeddings and Rectified Flow Matching

Agentic Deep Research: Recursive Reasoning and Self-Correction Engine

A multi-agent deep research engine with dynamic DAGs, self-correction loops, and LLM-as-a-Judge evaluation

Video-DPO: Temporal Alignment for Video Diffusion via Direct Preference Optimization

DPO for video diffusion models, aligning generation for temporal stability and motion smoothness

Visual-CoT: Pixel-Grounded Reasoning with Multi-Modal Chain-of-Thought

A fine-tuned VLM that generates interleaved reasoning traces with bounding box coordinates, reducing object hallucination

Tool-Use DPO: Schema-Constrained Alignment via Identity Preference Optimization

LLM alignment for rigid API contract adherence using IPO with hard negatives

Web-Browser-Agent: Multimodal Autonomous Web Navigation via Visual Grounding

A multimodal autonomous web agent with a VLM, a hybrid Set-of-Mark pipeline, and a Verify-Act-Verify loop

Data-Scientist-Agent: Multimodal Code-Actuated Agent via Visual Verification

An autonomous data science agent for data analytics and visualization, with a VLM critic for visual verification

Efficient-Reasoner: Adaptive Compute Allocation via Reinforcement Learning

An RL-trained LLM routing between System 1 and System 2 reasoning

Vision-R1: Visual System 2 Reasoning via GRPO

A VLM trained with SFT, expert iteration, and GRPO to generate grounded chain-of-thought reasoning over visual math problems

Tiny-Reason: Distilling Reasoning into 1.5B Parameter Models

A QLoRA fine-tuning pipeline distilling chain-of-thought reasoning into a 1.5B model

Global Deforestation: Classifying Forest Loss Drivers from Satellite Imagery

A deep learning framework for classifying deforestation drivers from multi-temporal Landsat imagery

GANime: Generating Anime Characters from Sketches with Deep Learning

Generative models for automating the colorization of anime sketches

FlapAI Bird: Deep Reinforcement Learning for Flappy Bird

A Flappy Bird agent that achieved superhuman performance using reinforcement learning algorithms

DeepAniGNet: Privacy-Preserving Recommendation via Graph Neural Networks

A novel graph-based recommender system using BERT-powered embeddings and graph neural networks to deliver personalized, privacy-preserving recommendations

How Not to Give a FLOP: Combining Regularization and Structured Pruning

A systematic study of regularization and network pruning techniques on ResNets

MangaNet: Object Detection for Manga with Deep Neural Networks

Advanced object detection model architectures for the manga domain

Web & App Development

ConnAIsseur: Cross-Domain Recipe Recommendation via BERT Embedding Transfer

A full-stack AI recipe platform using domain-adaptive BERT pre-training, cross-domain semantic alignment from restaurant reviews to recipes, and KNN retrieval

PhotoShare: Full-Stack SPA with Per-Photo Visibility Control

A photo-sharing web application with user authentication, user profiles, user listing, photo sharing, favorite lists, commenting, and activity feeds

StockViz: Real-Time Pairs Trading Dashboard with JPMorgan Perspective

A full-stack streaming analytics dashboard computing dual-stock mid-price ratios with static threshold bands and trigger alerts

Shiptivitas: Kanban Board with Dragula-React Reconciliation

A full-stack to-do list web application based on a kanban board

Game 2048: Algorithmic Puzzle with Array Reversal Conjugation

A minimal Java implementation of the game 2048

GapBuffer: Header-Only C++17 Text Editor Data Structure

A simple C++ text editor that efficiently moves its cursor, accesses different positions, adds and removes characters, and edits text

GraphViz: Force-Directed Graph Layout with Real-Time Qt Animation

A C++ implementation of the Fruchterman-Reingold algorithm for visualizing nodes and edges in a graph

Publications

FlapAI Bird: Training an Agent to Play Flappy Bird Using Reinforcement Learning Techniques

Tai Vu and Leon Tran

PDF Code
How Not to Give a FLOP: Combining Regularization and Pruning for Efficient Inference

Tai Vu, Emily Wen, and Roy Nehoran

PDF Code
Privacy Preserving Inference of Personalized Content for Out of Matrix Users

Michael Sun, Tai Vu, and Andrew Wang

PDF Code
GANime: Generating Anime and Manga Character Drawings from Sketches with Deep Learning

Tai Vu and Robert Yang

PDF Code
BERT-VQA: Visual Question Answering on Plots

Tai Vu and Robert Yang

PDF Code
Pixel-Perfect Piloting: Superhuman Control of Pixelcopter via Reinforcement Learning

Tai Vu, Brad Nikkel, and Jenny Yang

PDF Code
Beyond the Panels: A Deep Neural Network Approach for Manga Object Detection

Tai Vu and Robert Yang

PDF Code
Amplifying Emotional Signals: Data-Efficient Deep Learning for Robust Speech Emotion Recognition

Tai Vu

PDF Code
From Bayes to BERT: A Comprehensive Benchmark for State-of-the-Art Intent Detection

Tai Vu and Robert Yang

PDF Code
The Optimal Route: A Rigorous Survey of Foundational Shortest-Path Algorithms

Tai Vu

PDF