“Optimizing On-Device Generative AI for advanced applications”
Generative AI leverages knowledge learned from large-scale datasets to create novel data and content. Beyond its well-known application in Text-to-Image generation, this field extends its impact across diverse domains such as music, video, 3D reconstruction, and system configuration, driving significant technological advancements.
Our research focuses on the development of state-of-the-art generative AI models, including Large Language Models (LLMs) and Diffusion models, as well as the exploration of foundational theoretical methods. We apply these advanced techniques to emerging fields (e.g., system optimization and performance evaluation), aiming to transcend the limitations of traditional approaches and to propose innovative designs for contemporary systems.
Additionally, our approach explores the potential of LLMs to address a variety of tasks. Specifically, we target the development of acceleration techniques, such as knowledge compression and learning strategies, that compress LLM structures into lightweight forms for more efficient deployment.
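One widely used compression technique in this space is post-training quantization. The sketch below is illustrative only, not our group's actual method: it maps a small float weight matrix to 8-bit integers with a single symmetric scale, trading a bounded amount of precision for a roughly 4x smaller memory footprint.

```python
# A minimal sketch of symmetric int8 post-training quantization.
# The weight values and function names here are illustrative placeholders.

def quantize_int8(weights):
    """Map float weights to int8 values using one symmetric scale factor."""
    max_abs = max(abs(w) for row in weights for w in row)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [[round(w / scale) for w in row] for row in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 representation."""
    return [[v * scale for v in row] for row in q]

weights = [[0.12, -0.53], [0.98, -0.27]]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# The per-weight reconstruction error is bounded by half a step (scale / 2).
```

The same idea extends to per-channel scales and lower bit widths, which is where most of the engineering effort in practical LLM deployment goes.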
“Design cognitive machine learning based on a human memory model”
Hyperdimensional (HD) computing is an alternative computing method, grounded in theoretical neuroscience, that processes cognitive tasks in a lightweight and error-tolerant way. The mathematical properties of high-dimensional spaces show remarkable agreement with observed brain behavior. HD computing therefore emulates human cognition by computing with high-dimensional vectors, called hypervectors, in place of traditional numeric representations such as integers and Booleans. With concrete hypervector arithmetic, we enable various pattern-based computations such as memorizing and reasoning, similar to how humans think. Our group works on developing diverse learning and cognitive computing techniques based on HD computing, focusing on typical ML tasks, neuro-symbolic AI, and acceleration in next-generation computing environments.
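The core hypervector arithmetic described above can be sketched in a few lines. This is a minimal illustration with bipolar (+1/-1) vectors; the dimension, seed, and concept names are arbitrary choices for the example, not a specific system of ours.

```python
# Sketch of the three core HD operations: random hypervector generation,
# binding (association), and bundling (memorization), plus similarity.
import random

D = 10_000  # hypervectors are high-dimensional, e.g. 10,000 components
rng = random.Random(42)

def random_hv():
    """A random bipolar hypervector; any two are nearly orthogonal."""
    return [rng.choice((-1, 1)) for _ in range(D)]

def bind(a, b):
    """Elementwise multiply: associates two concepts (result is dissimilar to both)."""
    return [x * y for x, y in zip(a, b)]

def bundle(*hvs):
    """Majority vote per component: memorizes a set (result is similar to each member)."""
    return [1 if sum(c) >= 0 else -1 for c in zip(*hvs)]

def similarity(a, b):
    """Normalized dot product in [-1, 1]; ~0 for unrelated hypervectors."""
    return sum(x * y for x, y in zip(a, b)) / D

# Memorize a record of attribute-value pairs, then recall by unbinding.
color, shape = random_hv(), random_hv()
red, circle = random_hv(), random_hv()
record = bundle(bind(color, red), bind(shape, circle))
recovered = bind(record, color)  # noisy but clearly closer to `red` than to chance
```

Because binding is its own inverse for bipolar vectors, unbinding the record with `color` recovers a vector measurably similar to `red`, which is the pattern-based "memorize and recall" behavior the text refers to.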
“Systems for ML and ML for Systems”
Machine Learning (ML) is increasingly recognized as a pivotal technology for autonomous data analysis and pattern recognition. Our research group is at the forefront of redefining the role of ML in system design, focusing on innovative solutions such as Near-Data Processing (NDP) and Processing-In-Memory (PIM) architectures. These technologies are integrated at both the main memory and cache levels, utilizing DRAM and SRAM to address the critical bottleneck of data movement. By performing computations directly within memory, NDP and PIM architectures substantially reduce redundant data transfers and enhance computational efficiency.
We are developing software-level frameworks to orchestrate NDP/PIM architectures with various next-generation technologies, e.g., CXL and 6G. Our research also extends to ML-driven system software that optimizes processing environments, facilitating robust cross-platform power and performance predictions as well as resource usage characterizations for edge devices. This approach not only augments computational efficiency but also exploits the transformative potential of ML in evolving computing architectures.
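A performance prediction model of the kind mentioned above can, in its simplest form, be a regression fitted to profiled measurements. The sketch below is a deliberately minimal illustration, not our actual modeling pipeline: the workload feature and timing numbers are synthetic placeholders.

```python
# Minimal sketch of an ML-driven performance model: fit runtime as a
# linear function of one workload feature via closed-form least squares.

def fit_linear(xs, ys):
    """Least-squares fit of y ~ a*x + b; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Synthetic profile: input size (MB) vs. measured runtime (ms) on one device.
sizes = [1.0, 2.0, 4.0, 8.0]
times = [12.1, 21.8, 41.9, 82.0]
a, b = fit_linear(sizes, times)
predicted = a * 16.0 + b  # extrapolated runtime for a 16 MB input
```

Real cross-platform predictors use richer feature sets (hardware counters, memory bandwidth, model shape) and nonlinear models, but the workflow is the same: profile, fit, then predict on unseen configurations or devices.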