I am a senior engineer in the High Performance Computing Research Lab at Samsung SDS Research. My research goal is to make AI systems faster, more efficient, and more scalable. I’m currently working on memory-efficient LLM serving systems and LLM quantization.

News

[2025.03] I joined Samsung SDS Research.
[2024.12] The paper on timely DNN processing for mobile platforms, “NeuroBalancer: Balancing System Frequencies with Punctual Laziness for Timely and Energy-efficient DNN Inferences” has been accepted to IEEE Transactions on Mobile Computing (TMC) (IF: 7.7).
[2024.03] The paper on coactive neural network inference offloading, “CoActo: CoActive Neural Network Inference Offloading with Fine-grained and Concurrent Execution” has been accepted to ACM MobiSys 2024 (Acceptance Rate: 16.3% = 43/263).
[2023.09] The paper on opportunistic parallelism for dynamic computation of neural networks, “ASPEN: Breaking Operator Barriers for Efficient Parallelization of Deep Neural Networks” has been accepted to NeurIPS 2023 (poster).
[2023.07] The paper on offloaded computing, “ENTRO: Tackling the Encoding and Networking Trade-off in Offloaded Video Analytics” has been accepted to ACM Multimedia (MM) 23.
[2022.06] The paper in collaboration with CU Boulder, “R-FeC: RL-based FEC Adjustment for better QoE in WebRTC” has been accepted to ACM Multimedia (MM) 22.
[2022.03] The paper on the optimization of GEMM for mobile platforms, “mGEMM: Low-latency Convolution with Minimal Memory Overhead Optimized for Mobile Devices” has been accepted to ACM MobiSys 2022.
[2021.06] The paper “zTT: Learning-based DVFS with Zero Thermal Throttling for Mobile Devices” won the BEST PAPER AWARD at ACM MobiSys 2021! (MobiSys Opening Session on YouTube)
[2021.05] The paper on ML-based processor control, “zTT: Learning-based DVFS with Zero Thermal Throttling for Mobile Devices” has been accepted to ACM MobiSys 2021 with one of the highest-level ACM artifact badges (ACM Results Reproduced Badge).

Kyungmin Bin
