Projects

Artificial Intelligence

Nano-Reasoner: Unified Post-Training Framework for Math Reasoning Models
An end-to-end, research-grade, memory-efficient post-training pipeline for reasoning models with SFT and RL

PRM-Math: Inference-Time Compute Scaling via Dense Process Supervision
Cutting-edge inference-time compute scaling strategies with a fine-tuned Process Reward Model

Nano-Video-Gen: Spacetime Diffusion Transformer with Rectified Flow Matching
A Diffusion Transformer video generator built from scratch with 3D spacetime patch embeddings and Rectified Flow Matching

Agentic Deep Research: Recursive Reasoning and Self-Correction Engine
A multi-agent deep research engine with dynamic DAGs, self-correction loops, and LLM-as-a-Judge evaluation

Video-DPO: Temporal Alignment for Video Diffusion via Direct Preference Optimization
DPO for video diffusion models, aligning generation for temporal stability and motion smoothness

Visual-CoT: Pixel-Grounded Reasoning with Multi-Modal Chain-of-Thought
A fine-tuned VLM that generates interleaved reasoning traces with bounding box coordinates, reducing object hallucination

Tool-Use DPO: Schema-Constrained Alignment via Identity Preference Optimization
LLM alignment for rigid API contract adherence using IPO with hard negatives

Web-Browser-Agent: Multimodal Autonomous Web Navigation via Visual Grounding
A multimodal autonomous web agent with a VLM, a hybrid Set-of-Mark pipeline, and a Verify-Act-Verify loop

Data-Scientist-Agent: Multimodal Code-Actuated Agent via Visual Verification
An autonomous data science agent for data analytics and visualization, with a VLM critic for visual verification

Efficient-Reasoner: Adaptive Compute Allocation via Reinforcement Learning
An RL-trained LLM that routes between System 1 and System 2 reasoning

Vision-R1: Visual System 2 Reasoning via GRPO
A VLM trained with SFT, expert iteration, and GRPO to generate grounded chain-of-thought reasoning over visual math problems

Tiny-Reason: Distilling Reasoning into 1.5B Parameter Models
A QLoRA fine-tuning pipeline that distills chain-of-thought reasoning into a 1.5B model

Global Deforestation: Classifying Forest Loss Drivers from Satellite Imagery
A deep learning framework for classifying deforestation drivers from multi-temporal Landsat imagery

GANime: Generating Anime Characters from Sketches with Deep Learning
Generative models for automating the colorization of anime sketches

FlapAI Bird: Deep Reinforcement Learning for Flappy Bird
A Flappy Bird agent that achieved superhuman performance using reinforcement learning algorithms

DeepAniGNet: Privacy-Preserving Recommendation via Graph Neural Networks
A novel graph-based recommender system using BERT-powered embeddings and graph neural networks to deliver personalized, privacy-preserving recommendations

How Not to Give a FLOP: Combining Regularization and Structured Pruning
A systematic study of regularization and network pruning techniques on ResNets

MangaNet: Object Detection for Manga with Deep Neural Networks
Advanced object detection model architectures for the manga domain

Web & App Development

ConnAIsseur: Cross-Domain Recipe Recommendation via BERT Embedding Transfer
A full-stack AI recipe platform using domain-adaptive BERT pre-training, cross-domain semantic alignment from restaurant reviews to recipes, and KNN retrieval

PhotoShare: Full-Stack SPA with Per-Photo Visibility Control
A photo-sharing web application with user authentication, user profiles, user listing, photo sharing, favorite lists, commenting, and activity feeds

StockViz: Real-Time Pairs Trading Dashboard with JPMorgan Perspective
A full-stack streaming analytics dashboard computing dual-stock mid-price ratios with static threshold bands and trigger alerts

Shiptivitas: Kanban Board with Dragula-React Reconciliation
A full-stack to-do list web application based on a kanban board

Game 2048: Algorithmic Puzzle with Array Reversal Conjugation
A minimal Java implementation of the game 2048

GapBuffer: Header-Only C++17 Text Editor Data Structure
A simple C++ text editor that efficiently moves its cursor, accesses different positions, adds and removes characters, and edits text

GraphViz: Force-Directed Graph Layout with Real-Time Qt Animation
A C++ implementation of the Fruchterman-Reingold algorithm for visualizing nodes and edges in a graph

Publications

FlapAI Bird: Training an Agent to Play Flappy Bird Using Reinforcement Learning Techniques
Tai Vu and Leon Tran
PDF | Code

How Not to Give a FLOP: Combining Regularization and Pruning for Efficient Inference
Tai Vu, Emily Wen, and Roy Nehoran
PDF | Code

Privacy Preserving Inference of Personalized Content for Out of Matrix Users
Michael Sun, Tai Vu, and Andrew Wang
PDF | Code

GANime: Generating Anime and Manga Character Drawings from Sketches with Deep Learning
Tai Vu and Robert Yang
PDF | Code

BERT-VQA: Visual Question Answering on Plots
Tai Vu and Robert Yang
PDF | Code

Pixel-Perfect Piloting: Superhuman Control of Pixelcopter via Reinforcement Learning
Tai Vu, Brad Nikkel, and Jenny Yang
PDF | Code

Beyond the Panels: A Deep Neural Network Approach for Manga Object Detection
Tai Vu and Robert Yang
PDF | Code

Amplifying Emotional Signals: Data-Efficient Deep Learning for Robust Speech Emotion Recognition
Tai Vu
PDF | Code

From Bayes to BERT: A Comprehensive Benchmark for State-of-the-Art Intent Detection
Tai Vu and Robert Yang
PDF | Code

The Optimal Route: A Rigorous Survey of Foundational Shortest-Path Algorithms
Tai Vu
PDF