Hui Lu (卢辉)

Hi! I am Hui Lu (卢辉), a researcher and engineer working on speech and language technologies. My current research focuses on speech-based language modeling and full-duplex spoken dialogue. My goal is to build seamless voice interaction interfaces for AI agents, making human-agent communication more natural and efficient. I also work on disentangled speech representation learning, text-to-speech synthesis, and voice conversion.

I recently defended my Ph.D. thesis at the Chinese University of Hong Kong (CUHK), where I was advised by Prof. Helen Meng. Before joining CUHK, I received my M.E. from Tsinghua University, advised by Prof. Zhiyong Wu, and my B.E. from Tongji University.

Education

Ph.D. in Information Systems, The Chinese University of Hong Kong
M.E. in Computer Science, Tsinghua University
B.E. in Communication Engineering, Tongji University

Work Experience

SenseTime Research, Research Intern
End-to-end full-duplex spoken dialogue modeling.
Speechify Inc., Senior Applied Scientist
Controllable TTS and multilingual voice conversion.
Meta AI (FAIR), Research Scientist Intern
LLM-based speech-to-speech translation.
Tencent AI Lab, Research Intern
Non-autoregressive TTS with VAEs.
Microsoft, Software Engineer Intern

Challenge Award

3rd Place, Track 2: Full-Duplex Interaction
I led the team "SenseDialog" and developed a cascaded spoken dialogue system that ranked 3rd in the full-duplex interaction track.
Certificate for 3rd place in Track 2 of the ICASSP 2026 Human-like Spoken Dialogue Systems Challenge

Selected Publications

Speech-based Language Modeling & Full-Duplex Spoken Dialogue

  1. How Should LLMs Listen While Speaking? A Study of User-Stream Routing in Full-Duplex Spoken Dialogue. Hui Lu, Xueyuan Chen, Huimeng Wang, Shuhai Peng, Shiyin Kang, Xixin Wu, Zhiyong Wu. arXiv preprint, 2026. [paper] [demo]
  2. Towards Streaming Target Speaker Extraction via Chunk-wise Interleaved Splicing of Autoregressive Language Model. Shuhai Peng*, Hui Lu*, Jinjiang Liu, Liyang Chen, Guiping Zhong, Jiakui Li, Huimeng Wang, Haiyun Li, Liang Cao, Shiyin Kang, Zhiyong Wu. arXiv preprint, 2026. [paper]

Text-to-Speech Synthesis

  1. SemaVoice: Semantic-Aware Continuous Autoregressive Speech Synthesis. Huimeng Wang, Hui Lu, Jiajun Deng, Haoning Xu, Youjun Chen, Xueyuan Chen, Zhaoqing Li, Shuhai Peng, Shiyin Kang, Xunying Liu. arXiv preprint, 2026. [paper]
  2. VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis. Hui Lu, Zhiyong Wu, Xixin Wu, Xu Li, Shiyin Kang, Xunying Liu, Helen Meng. Interspeech, 2021. [paper] [demo] [code]

Speech Disentanglement & Voice Conversion

  1. Unifying One-Shot Voice Conversion and Cloning with Disentangled Speech Representations. Hui Lu, Xixin Wu, Haohan Guo, Songxiang Liu, Zhiyong Wu, Helen Meng. ICASSP, 2024. [paper] [demo]
  2. SpeechTripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody. Hui Lu, Xixin Wu, Zhiyong Wu, Helen Meng. ACM MM, 2023. [paper] [demo] [code]
  3. Disentangled Speech Representation Learning for One-Shot Cross-lingual Voice Conversion Using β-VAE. Hui Lu, Disong Wang, Xixin Wu, Zhiyong Wu, Xunying Liu, Helen Meng. SLT, 2022. [paper] [demo] [code]
  4. One-shot Voice Conversion with Global Speaker Embeddings. Hui Lu, Zhiyong Wu, Dongyang Dai, Runnan Li, Shiyin Kang, Jia Jia, Helen Meng. Interspeech, 2019. [paper] [demo]
  5. A Compact Framework for Voice Conversion Using WaveNet Conditioned on Phonetic Posteriorgrams. Hui Lu, Zhiyong Wu, Runnan Li, Shiyin Kang, Jia Jia, Helen Meng. ICASSP, 2019. [paper] [demo]

Academic Services

  • Reviewing: NeurIPS, ACM Multimedia, ICASSP, Interspeech, COLING, ICME, LREC
  • Teaching: ENGG2760 — Probability for Engineers; ENGG1120 — Linear Algebra for Engineers