HYOUNG-KYU SONG

I am an Applied Scientist at Captions (NY, US), specializing in AI-driven facial video generation. My current research focuses on media generation, particularly talking videos, but I am also interested in model optimization for LLMs and image/video generation models.

Previously, I worked for about four years on Stable Diffusion model optimization at Nota AI, South Korea, and on multilingual talking face generation at MAUM.AI, also in South Korea. At MAUM.AI (formerly MINDsLab), I was fortunate to co-lead a team of around 18 people. I joined the company after its Series C funding round, and it later went public on the KOSDAQ market in South Korea. At Nota AI, I initially worked as a student intern when the team consisted only of the current CEO and CTO, and I rejoined the company later. In my later years there, I contributed to securing Series C funding through Gen AI optimization efforts.

Now, I am returning to an individual contributor (IC) role on an amazing team, where I will focus on contributing through my research and engineering expertise.

Publications

  • Bo-Kyeong Kim*, Hyoung-Kyu Song, Thibault Castells, Shinkook Choi, “BK-SDM: A Lightweight, Fast, and Cheap Version of Stable Diffusion,” ECCV 2024.
    [Paper][Code][Demo]
  • Thibault Castells*, Hyoung-Kyu Song, Bo-Kyeong Kim, Shinkook Choi, “LD-Pruner: Efficient Pruning of Latent Diffusion Models using Task-Agnostic Insights,” CVPR Workshop (EDGE) 2024.
    [Paper]
  • Thibault Castells*, Hyoung-Kyu Song, Tairen Piao, Shinkook Choi, Bo-Kyeong Kim, Hanyoung Yim, Chang-gwun Lee, Jae Gon Kim, Tae-Ho Kim†, “EdgeFusion: On-Device Text-to-Image Generation,” CVPR Workshop (EDGE) 2024.
    [Paper]
  • Bo-Kyeong Kim*, Geonmin Kim, Tae-Ho Kim, Thibault Castells, Shinkook Choi, Junho Shin, Hyoung-Kyu Song, “Shortened LLaMA: A Simple Depth Pruning for Large Language Models,” arXiv 2024.
    [Paper]
  • Bo-Kyeong Kim*, Jaemin Kang, Daeun Seo, Hancheol Park, Shinkook Choi, Hyoung-Kyu Song, Hyungshin Kim†, Sungsu Lim†, “A Unified Compression Framework for Efficient Speech-Driven Talking-Face Generation,” MLSys Workshop (ODIW) 2023.
    [Paper][Demo]
  • Hyoung-Kyu Song*, Sang Hoon Woo*, Junhyeok Lee, Seungmin Yang, Hyunjae Cho, Youseong Lee, Dongho Choi, Kang-wook Kim, “Talking Face Generation with Multilingual TTS,” CVPR Demo 2022.
    [Paper][Demo]
  • Hyoung-Kyu Song*, Ebrahim AlAlkeem, Jaewoong Yun, Tae-Ho Kim, Hyerin Yoo, Dasom Heo, Myungsu Chae, Chan Yeob Yeun†, “Deep User Identification Model with Multiple Biometric Data,” BMC Bioinformatics 21, 315, 2020.
    [Paper]