Deep Learning: At LG Electronics, I am developing an AI coding assistant using large language models (LLMs). I have successfully trained LLMs in distributed settings and deployed them to hundreds of users. Recently, I have been conducting research on fast and accurate LLM inference.
Algorithm Engineering: My primary research efforts have been devoted to developing fast algorithms. During my Ph.D. studies, I developed fast algorithms for graph isomorphism, graph isomorphism query processing, and multiple-pattern Cartesian tree matching.
LG Electronics - Artificial Intelligence Lab (Senior Researcher)
- Jan. 2024 – Present: Development of an AI Coding Assistant Using Large Language Models
- Conducting research on domain-adaptive continual pretraining of code LLMs.
- Maintaining a custom benchmark dataset for offline evaluation.
- Analyzing user data and feedback for online evaluation.
- Constructing an instruction dataset and conducting instruction tuning.
- Aug. 2022 – Dec. 2023: Development of an AI Coding Assistant Using Large Language Models
- Conducted distributed training of decoder-only transformer LLMs.
- Filtered and deduplicated terabytes of source code data.
- Developed an LLM inference server optimized for low latency and high throughput.
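For illustration, the deduplication step can be sketched as a minimal exact-dedup pass over (path, content) pairs; the file names and the whitespace normalization rule below are invented for the example, and a production pipeline would typically add near-duplicate detection on top:

```python
import hashlib

def normalize(code: str) -> str:
    # Strip trailing whitespace per line so formatting-only
    # differences do not defeat exact deduplication.
    return "\n".join(line.rstrip() for line in code.splitlines()).strip()

def dedup_exact(files):
    """Keep the first occurrence of each distinct (normalized) file."""
    seen, kept = set(), []
    for path, code in files:
        digest = hashlib.sha256(normalize(code).encode()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append((path, code))
    return kept

files = [
    ("a.py", "print('hi')\n"),
    ("b.py", "print('hi')   \n"),  # duplicate of a.py up to trailing spaces
    ("c.py", "print('bye')\n"),
]
print([p for p, _ in dedup_exact(files)])  # ['a.py', 'c.py']
```

Hashing normalized content keeps memory proportional to the number of distinct files rather than their total size, which matters at terabyte scale.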
- Apr. 2022 – Dec. 2022: Development of a Coding Education Program Utilizing AI
- Constructed training data for generating Python code from natural language instructions.
- Trained an encoder-decoder transformer from scratch.
- Developed a web client that accepts a prompt, displays the AI-generated code, and executes the resulting Python code.
- Created an inference server that runs on multiple GPUs, loads multiple copies of the model, and offers dynamic batching for increased throughput.
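The dynamic batching mentioned above can be sketched as a single-worker loop that drains the request queue up to a batch-size or time limit before handing the batch to the model; the parameters and sentinel convention here are invented for illustration, and the real server dispatched each batch to a GPU:

```python
import queue
import threading
import time

def batcher(requests, handle_batch, max_batch=8, max_wait=0.05):
    """Collect up to max_batch requests, waiting at most max_wait
    seconds after the first one arrives, then hand them off as a batch."""
    while True:
        first = requests.get()
        if first is None:                 # sentinel: shut down
            return
        batch = [first]
        deadline = time.monotonic() + max_wait
        while len(batch) < max_batch:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            try:
                item = requests.get(timeout=remaining)
            except queue.Empty:
                break
            if item is None:              # flush pending batch, then stop
                handle_batch(batch)
                return
            batch.append(item)
        handle_batch(batch)

batches = []
q = queue.Queue()
worker = threading.Thread(target=batcher, args=(q, batches.append, 4))
worker.start()
for i in range(6):
    q.put(i)
q.put(None)
worker.join()
print(batches)
```

Batching amortizes per-call overhead across requests, trading a small bounded latency (`max_wait`) for higher throughput.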
Seoul National University – Institute of Computer Technology (Post-Doctoral Assistant)
- Jan. 2022 – Mar. 2022: Algorithm Development for Graph Isomorphism Query Processing
- Developed a fast graph isomorphism query processing algorithm that runs orders of magnitude faster than state-of-the-art algorithms.
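For context, graph isomorphism can be decided by the factorial-time brute force below; this toy baseline is what practical algorithms improve on by pruning the search, and it is not the published algorithm:

```python
from itertools import permutations

def isomorphic(adj_a, adj_b):
    """Brute-force graph isomorphism test over adjacency-set dicts
    keyed 0..n-1. Tries every vertex permutation: exponential, but
    a correct specification of the problem."""
    n = len(adj_a)
    if n != len(adj_b):
        return False
    nodes = range(n)
    for perm in permutations(nodes):
        # perm maps vertex u of graph A to vertex perm[u] of graph B;
        # it is an isomorphism iff it preserves edges and non-edges.
        if all((perm[v] in adj_b[perm[u]]) == (v in adj_a[u])
               for u in nodes for v in nodes):
            return True
    return False

tri  = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}  # triangle
path = {0: {1}, 1: {0, 2}, 2: {1}}        # path on 3 nodes
print(isomorphic(tri, tri))   # True
print(isomorphic(tri, path))  # False
```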
NAVER – AI Dev2 (Internship)
- Oct. 2021: Analysis of Conversion Tracking Data
- Conducted exploratory data analysis on GLAD for Advertisement data to find meaningful trends.
- Handled hundreds of gigabytes of raw conversion tracking data.
- Solved the optimization problem of maximizing conversion rate using linear programming.
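The flavor of that optimization can be illustrated with a toy LP: maximize total conversions over per-channel ad spend, subject to a budget and per-channel caps. The rates, caps, and budget below are invented; for this special structure (one budget constraint plus box bounds) the LP optimum coincides with a greedy fill by descending rate, so a full LP solver is not needed for the sketch:

```python
def maximize_conversions(rates, caps, budget):
    """Solve  max sum_i rates[i] * x[i]
         s.t. sum_i x[i] <= budget,  0 <= x[i] <= caps[i].
    With a single budget constraint plus box bounds, filling
    channels in decreasing-rate order is LP-optimal."""
    x = [0.0] * len(rates)
    remaining = budget
    for i in sorted(range(len(rates)), key=lambda i: -rates[i]):
        x[i] = min(caps[i], remaining)
        remaining -= x[i]
        if remaining <= 0:
            break
    return x

# Invented example: three ad channels.
rates = [0.05, 0.02, 0.08]        # conversions per unit spend
caps  = [100.0, 300.0, 50.0]      # per-channel spend caps
print(maximize_conversions(rates, caps, budget=120.0))
# [70.0, 0.0, 50.0]: fill channel 2 to its cap first, then channel 0
```

A general-purpose LP solver handles the same formulation when additional constraints (e.g., cross-channel dependencies) break the greedy structure.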
Competitive Programming
Programming Languages
Libraries
- PyTorch, TensorFlow, Triton (OpenAI), Seaborn, Pandas, PySpark, Hugging Face Transformers, DeepSpeed, NVIDIA Triton, NVIDIA FasterTransformer, FastAPI, GoogleTest
Others
- AWS (SageMaker, EC2, FSx for Lustre, S3)