Profile of AI Researcher Cao Ting
Structured Biography of Cao Ting: Academic Background, Research Impact, and Recent Work at Tsinghua University AIR
Introduction
Cao Ting (often cited as Ting Cao) is widely recognized as a leading figure in artificial intelligence (AI) and computer systems research in China. Over a career spanning academia and industry, Dr. Cao has held posts at global technology companies and premier academic institutions, producing pioneering research in Edge AI, large foundation models, AI inference systems, and state-of-the-art accelerator design. Her influence extends across software-hardware co-design, the democratization of large AI models onto consumer devices, and the intersection of academia, innovation, and industry through her latest role at the Institute for AI Industry Research (AIR) of Tsinghua University. This report presents a comprehensive, structured biography of Cao Ting, with analysis of her academic formation, career history, research breakthroughs, areas of expertise, and her evolving work at Tsinghua University under Zhang Yaqin’s leadership.
Academic Background
Early Education and Doctoral Training
Dr. Cao Ting’s academic foundation is rooted firmly in an elite tradition of systems research and energy-efficient computational design. She earned her PhD from the School of Computing at the Australian National University (ANU), where she was jointly supervised by Prof. Steve Blackburn and Prof. Kathryn McKinley, acclaimed scholars in computer architecture and managed runtimes. Her doctoral research centered on software and hardware co-design strategies to optimize energy efficiency, tackling critical challenges in quantitative power analysis and the practical deployment of low-power, high-performance systems.
Cao’s dissertation produced several well-cited publications, including seminal work on the interplay between managed software environments and emerging multi-core architectures. Her paper "The Yin and Yang of Power and Performance for Asymmetric Hardware and Managed Software" (ISCA 2012) was recognized as an ACM Research Highlight and selected as an IEEE Micro Top Pick in 2012, signaling her immediate impact on the field.
A particularly notable validation of her research was its selection for inclusion in the canonical textbook “Computer Architecture: A Quantitative Approach” by John L. Hennessy and David A. Patterson. Patterson, a Turing Award laureate, wrote a perspective article specifically highlighting Cao’s contributions, emphasizing how her research reshaped benchmarks and evaluation methodologies in energy-efficient computing for both academia and industry.
Early Career, Fellowships, and Recognitions
Dr. Cao’s pre-doctoral and early postdoctoral recognitions include a finalist award for the Google Sydney Anita Borg Memorial Scholarship and China’s State Scholarship for Study Abroad. These competitive accolades underscored her promise as a future research leader. She completed stints as an intern at AMD Research and as a postdoctoral fellow at the Institute of Computing Technology, Chinese Academy of Sciences, early experiences that strengthened her expertise in chip-level systems and real-world high-performance computing.
Career History
Career at Huawei Technologies (2016-2018)
Following her doctoral and early postdoctoral phase, Cao Ting served as a Senior Software Engineer at Huawei Technologies between 2016 and 2018, working in the Compiler and Programming Language Lab as well as the State Key Laboratory of Computer Architecture. During this period, her work focused on systems research and R&D that underpinned consumer-facing innovations in Huawei’s HarmonyOS and related smart device platforms. She participated in the design and optimization of computing infrastructures that enabled next-generation mobile and embedded device intelligence, laying a critical foundation for her subsequent Edge AI research and deployment work. Although less publicized than her later industry roles, this experience gave her direct exposure to the high-stakes challenges of computation on resource-constrained devices, which would become a defining theme of her career.
Career at Microsoft Research Asia (MSRA), 2018-2025
Role and Responsibilities
In 2018, Cao Ting joined Microsoft Research Asia (MSRA) in Beijing—a lab regarded as one of the world’s premier research institutions for AI and systems research. She quickly rose through the ranks, serving as a Principal Researcher and eventually as a Research Manager, and became a pivotal figure spearheading Microsoft’s research in Edge AI, inference systems, model quantization, and hardware-software co-design. At MSRA, Dr. Cao led projects bridging foundational algorithmic innovation and real-world industrial deployment. Her team’s inventions were routinely integrated into flagship products, including Microsoft Office, Windows, and Bing, and contributed to core components of Huawei HarmonyOS. This level of integration validated the practical, disruptive value of her scientific outcomes in global-scale technology ecosystems.
Environment and Institutional Context
Her tenure at MSRA coincided with both a golden age and a sharp inflection point for Sino-American AI research collaboration. As US-China tech rivalry intensified, and following mounting US government restrictions on cross-border AI research, the environment at MSRA shifted dramatically. Many prominent researchers, including Dr. Cao, departed as Microsoft moved to relocate or wind down Beijing-based labs, reflecting the broader realignment of global talent and collaboration in AI.
Transition to Tsinghua University
AIR and the Zhang Yaqin Connection
In July 2025, Dr. Cao transitioned to academia, accepting a role as Research Professor at the Institute for AI Industry Research (AIR), Tsinghua University. AIR, founded and helmed by Dr. Zhang Yaqin—a former managing director of MSRA, Baidu President, and a globally renowned innovator—serves as a unique bridge between top-tier academic research and rapid translation to industry application. This move, coming amid shifting global technology policy, symbolized not only a personal career evolution but also the strengthening of China’s own research capacity in AI. Her appointment under Zhang Yaqin directly leverages their deep, shared MSRA experience and signals Tsinghua AIR’s ambition to become a world leader in industrial-scale AI research and deployment.
Areas of Expertise
Dr. Cao’s expertise spans several highly impactful domains within artificial intelligence and computer systems:
- Edge AI: Developing foundational systems and algorithmic frameworks to deploy deep learning models and large language models (LLMs) efficiently on resource-constrained entities like mobile phones and PCs, reducing cloud dependency and operational costs.
- Large Foundation Model Algorithms: Innovations in efficient training, quantization, compression, and deployment of large transformer-based models, including LLMs, vision-language models, and multi-modal perception systems.
- AI Inference Systems: Co-design of software and hardware to accelerate inference, with special attention to table-lookup techniques, low-bit quantization, and efficient mixed-precision computation.
- Novel AI Accelerators: Systematic research and development of next-generation accelerator designs—especially for real-time, low-energy, and high-throughput computation necessary for AI deployment at scale.
- Software-Hardware Co-Design: Longstanding focus on aligning software architecture with emerging hardware capabilities, demonstrated from her doctoral phase and continuously iterated in subsequent research and product integration.
Notably, Dr. Cao’s innovations stretch from academic theory to production-level code and architecture, directly impacting products used by hundreds of millions globally.
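To make the low-bit quantization mentioned above concrete, the sketch below shows a minimal asymmetric uniform quantizer in NumPy. It is a generic illustration of the technique, not code from Dr. Cao’s projects; the function names and the 4-bit setting are assumptions for the example.

```python
import numpy as np

def quantize_asym(w: np.ndarray, bits: int = 4):
    """Asymmetric uniform quantization: map floats onto integer codes in [0, 2**bits - 1]."""
    qmax = 2 ** bits - 1
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / qmax if hi > lo else 1.0
    q = np.clip(np.round((w - lo) / scale), 0, qmax).astype(np.uint8)
    return q, scale, lo  # integer codes plus (scale, zero-point) for dequantization

def dequantize(q: np.ndarray, scale: float, zero: float) -> np.ndarray:
    return q.astype(np.float32) * scale + zero

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)
q, scale, zero = quantize_asym(w, bits=4)
w_hat = dequantize(q, scale, zero)
# Round-trip error is bounded by half a quantization step
assert np.abs(w - w_hat).max() <= scale / 2 + 1e-6
```

The storage saving is direct: each 32-bit float weight is replaced by a 4-bit code plus a shared per-tensor scale and zero-point, at the cost of a bounded reconstruction error.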
Research Contributions
Overview and Influence
Cao Ting’s research career is characterized by a rare dual achievement: producing both influential scientific publications recognized at top venues, and high-impact, “real-world” integration into large-scale commercial products. Her work has been published or presented at the most prestigious conferences in systems and artificial intelligence, including ISCA, ASPLOS, MobiSys, OSDI, USENIX ATC, SC, EuroSys, NSDI, PPoPP, PLDI, ICCV, ACL, EMNLP, SenSys, KDD, and many more.
Innovation in Edge AI
Edge AI formed a central axis of her research, defining new paradigms for making deep learning and LLMs feasible on-device. Her contributions include pioneering deployment of large transformer and vision-language models on edge hardware, which traditionally struggled with memory, energy, and compute bottlenecks. She is widely cited for championing software-hardware co-design—designing algorithms and acceleration strategies that are specifically aware of hardware resource constraints on end-user devices. Such work greatly improved both efficiency and accessibility, enabling Microsoft Office’s cloud-free features, Bing mobile inference, and the AI stack for HarmonyOS smartphones.
Mixed Precision and Low-Bit Inference
Dr. Cao is an established expert in low-bit quantization, mixed-precision inference, and table-lookup computation for deep networks—techniques essential to shrinking model size and boosting processing speed without sacrificing critical accuracy. Her LUT-based accelerator research (see LUTensor, LUT-DLA) and BitDistiller project pushed the boundaries of deploying LLMs at 1-4 bit quantization on commodity hardware.
Notable technical innovations include:
- LUTensor and LUT-DLA: First hardware-software co-design frameworks to optimize mixed-precision low-bit inference using lookup tables, substantially improving matrix multiplication efficiency for LLMs on AI accelerators and edge hardware. This research is now referenced in standard system architecture texts.
- BitDistiller and AFPQ: Advanced methods for quantization-aware training and asymmetric floating point quantization, leading to sub-4-bit inference and broader model/device compatibility.
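To illustrate the lookup-table principle behind this line of work, here is a toy NumPy sketch of a LUT-based matrix-vector product for binary (+1/-1) weights: per-group multiplications are replaced by a single indexed lookup into a small table of precomputed partial sums. This shows the general idea only, not the actual LUTensor or LUT-DLA designs; the function name and the 4-element grouping are illustrative assumptions.

```python
import numpy as np
from itertools import product

def lut_matvec_1bit(w_bits: np.ndarray, x: np.ndarray, group: int = 4) -> np.ndarray:
    """Toy LUT-based matrix-vector product for binary (+1/-1) weights.

    For each group of `group` activations, all 2**group possible signed sums
    are precomputed once into a small table; every output row then replaces
    its multiplications with one table lookup per group.
    """
    rows, cols = w_bits.shape
    assert cols % group == 0
    signs = np.array(list(product([-1.0, 1.0], repeat=group)))  # (2**group, group)
    place = 1 << np.arange(group - 1, -1, -1)                   # bit weights, MSB first
    out = np.zeros(rows)
    for g in range(0, cols, group):
        table = signs @ x[g:g + group]          # 2**group precomputed partial sums
        idx = w_bits[:, g:g + group] @ place    # pack each row's bits into a table index
        out += table[idx]
    return out

# Sanity check against the dense equivalent (weight = 2*bit - 1)
rng = np.random.default_rng(0)
w_bits = rng.integers(0, 2, size=(8, 16))
x = rng.standard_normal(16)
assert np.allclose(lut_matvec_1bit(w_bits, x), (2 * w_bits - 1) @ x)
```

The payoff on real hardware is that the table is built once per activation group and reused across all output rows, turning most multiply-accumulates into cheap indexed reads.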
Model Compression and Efficient Fine-Tuning
Dr. Cao’s recent JENGA project, published at USENIX ATC 2025, introduces a breakthrough in “Contextual Token Sparsity” for LLM fine-tuning. By dynamically identifying and eliminating redundant tokens during long-context fine-tuning, JENGA significantly reduces memory and compute needs while maintaining task accuracy—critical for tasks in legal, technical, or creative fields involving lengthy context windows. Her collaborative work in LoRAStencil and Long Exposure continues this emphasis on parameter-efficient fine-tuning, exploring new methods to adapt massive models for specialized tasks with minimal resource use.
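The token-sparsity idea can be sketched as a score-and-prune step over the sequence dimension. The toy example below keeps only the highest-scoring tokens, using hidden-state norm as a stand-in importance score; JENGA’s actual contextual criterion differs, and the function name and keep ratio here are assumptions for illustration.

```python
import numpy as np

def prune_tokens(hidden: np.ndarray, keep_ratio: float = 0.5):
    """Toy token pruning: keep the top fraction of tokens by importance.

    Scores each token by the L2 norm of its hidden state and retains the
    highest-scoring ones in their original order. A contextual method like
    JENGA would use a learned, context-dependent criterion instead.
    """
    seq_len = hidden.shape[0]
    k = max(1, int(seq_len * keep_ratio))
    scores = np.linalg.norm(hidden, axis=-1)     # one importance score per token
    keep = np.sort(np.argsort(scores)[-k:])      # top-k indices, original order
    return hidden[keep], keep

rng = np.random.default_rng(0)
h = rng.standard_normal((128, 64))               # 128 tokens, 64-dim hidden states
pruned, idx = prune_tokens(h, keep_ratio=0.25)
assert pruned.shape == (32, 64)                  # 75% of tokens dropped
```

Because attention cost grows with sequence length, shrinking the active token set this way directly reduces both activation memory and compute during long-context fine-tuning.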
AI Systems for Video and Multi-Modal Understanding
Dr. Cao also pioneered agentic and real-time video analytics using large video-language models. Her AVA project, published at NSDI 2026, introduced novel event-knowledge graph construction and agentic retrieval mechanisms, demonstrating state-of-the-art video analytics—even for ultra-long, open-ended scenarios. StreamMind, another flagship project (ICCV 2025), addressed the high-throughput challenge in streaming video dialogue through event-gated cognition, achieving ultra-high frame rates for real-time applications like AI copilots and gaming assistants.
Impactful Integration into Products
Dr. Cao’s research is exceptional not only for its academic merit but also its direct adoption in the tech industry. Her innovations underpin central cloud-saving features in Microsoft Office, speed and robustness upgrades in Bing, enhancements in Windows AI features, and core functionalities in Huawei’s HarmonyOS, highlighting her rare ability to bridge deep research with tangible, at-scale products.
Selected Awards and Recognitions
Throughout her career, Dr. Cao has accrued an array of top awards:
- ACM Research Highlights (2012, 2021): For best-in-class contributions shaping the direction of systems and energy-efficient computing.
- IEEE Micro Top Picks (2012): Recognized for outstanding papers in computer architecture and systems.
- Best Paper Awards: PPoPP 2024, MobiSys 2021, NAS 2014, ICCD 2010, among others, spanning a diversity of topics from low-bit inference to system-level video AI optimization.
Selected Publications Overview
Dr. Cao’s publication record is both prolific and landmark-setting. Below are select, recent works representative of her current focus at Tsinghua AIR.
AVA: Towards Agentic Video Analytics Systems with Video Language Models
Venue: USENIX NSDI 2026
Summary: Proposes AVAS, a pioneering system for open-ended video analytics using video-language models (VLMs), introducing real-time event knowledge graphs and agentic retrieval mechanisms. Achieves top accuracy on public and new ultra-long video benchmarks, significantly outperforming rival VLM and retrieval-based systems.
Collaborators: Yuxuan Yan, Shiqi Jiang, Ting Cao, Yifan Yang, et al.
StreamMind: Unlocking Full Frame Rate Streaming Video Dialogue through Event-Gated Cognition
Venue: ICCV 2025
Summary: Presents a novel approach to streaming video dialogue, introducing a ‘perception-cognition interleaving’ paradigm with event-gated LLM invocation and constant-cost, event-preserving feature extraction. Achieves real-time, 100fps processing and state-of-the-art model capability—an enabler for always-on AI dialogue in gaming and multimedia.
Collaborators: Xin Ding, Hao Wu, Yifan Yang, Shiqi Jiang, Ting Cao, et al.
Neuralink: Fast on-Device LLM Inference with Neuron Co-Activation Linking
Venue: ASPLOS 2026
Summary: Pioneers neuron co-activation linkage for efficient memory mapping of LLMs on flash storage, reducing I/O bottlenecks on mobile devices. Cuts inference latency by nearly 6×, making full-scale LLMs practical on smartphones.
Collaborators: Tuowei Wang, Ruwen Fan, Minxing Huang, Zixu Hao, Kun Li, Ting Cao, et al.
JENGA: Enhancing Long-Context Fine-tuning of LLMs with Contextual Token Sparsity
Venue: USENIX ATC 2025
Summary: Introduces a new token-level sparsity mechanism for LLMs, dynamically removing redundant tokens during fine-tuning. Achieves nearly 2× memory reduction and 1.36× speedups, outperforming state-of-the-art solutions like LongLoRA.
LUTensor: A Software-Hardware Co-Design for LUT-Based Low-Bit LLM Inference
Venue: ISCA 2025
Summary: A hardware-software co-design for fast, energy-efficient low-bit inference using lookup tables, with custom tensor cores. Demonstrates substantial compute density and energy efficiency gains.
Other recent highlights include VPTQ (EMNLP 2024), PruneAug (IEEE TC 2024), BitDistiller (ACL 2024), AFPQ (ACL 2024), Long Exposure (SC 2024), and extensive publications on in-browser inference, diffusion model acceleration, and hardware-aware tensor transformation.
Key Career Milestones
| Period | Role / Institution | Key Contributions |
|---|---|---|
| 2010-2015 | PhD, Australian National University | Dissertation on software-hardware co-design for energy efficiency; seminal paper "The Yin and Yang of Power and Performance" (ISCA 2012) recognized as ACM Research Highlight and IEEE Micro Top Pick. |
| 2016-2018 | Senior Software Engineer, Huawei Technologies | R&D for HarmonyOS and mobile platforms; systems optimization for Edge AI on consumer devices. |
| 2018-2025 | Principal Researcher, Microsoft Research Asia (MSRA) | Led research in Edge AI, LLM inference systems, and quantization; research directly integrated into Microsoft Office, Windows, and Bing. |
| 2025-Present | Research Professor, Tsinghua University AIR | Leads high-impact research groups on Edge AI, foundation model systems, and next-generation accelerators; collaborates with Dr. Zhang Yaqin to bridge academic research with industrial application. |
Integration into Industrial Products
Cao Ting’s research productivity stands out for its concrete, measurable impact across widely adopted global technology platforms. Her research is incorporated in:
- Microsoft Office: Powering grammar correction, AI completion, and on-device intelligence—connected to her leadership in Edge AI and sequence-to-sequence models.
- Windows: Driving efficient AI services and background features with mixed-precision and LUT-based computation, enabling features that run without cloud dependence.
- Bing: Supporting search, recommendation, and response models with novel inference systems.
- Huawei HarmonyOS: Her contributions ensured deep learning inference was possible across a diverse fleet of consumer devices, with system-level efficiency that set new industry standards.
Product teams leveraged her work in ONNX Runtime, quantization strategies, and even the Aggressive Decoding paradigm for NLP services, underscoring her research’s translation from publication to global user experience.
Academic Service and Leadership
Beyond her technical research, Dr. Cao is a dedicated community leader and academic mentor:
- Associate Editor: IEEE Transactions on Computers—a premier journal for computer system innovation.
- Program Committees: Serves on the PC or as reviewer for many flagship conferences: MobiSys, PLDI, OOPSLA, VEE, ChinaSys, ISMM, and others.
- Collaboration and Mentorship: Maintains robust research partnerships with peers at Tsinghua, Microsoft, Huawei, and international institutions; regularly advises PhD and postdoctoral fellows and co-leads large-scale consortia and grant projects.
Recent and Current Work at Tsinghua University AIR
Tsinghua AIR Context
The Institute for AI Industry Research (AIR) at Tsinghua University aspires to be a world leader in bridging fundamental AI research with industrial and societal needs. Under the stewardship of Dr. Zhang Yaqin—a globally recognized engineer, innovator, and former MSRA director—AIR is the epicenter for the next generation of industrial AI advances. Its mission synergizes with national priorities to localize critical AI expertise as global research environments fragment.
Cao Ting’s Role and Research Portfolio
As Research Professor at AIR, Dr. Cao is charged with leading high-impact groups focusing on Edge AI, low-bit inference, foundation model systems, and next-generation accelerator co-design. Her group demonstrates an aggressive agenda to build solutions with real-time, on-device intelligence, pushing toward open-ended and adaptive AI across scales and modalities.
Current projects under her direction span:
- Agentic Video Analytics (AVA): Creating AI systems capable of real-time, event-driven, open-domain video understanding, blending knowledge graphs with advanced retrieval strategies.
- Event-Driven Streaming Dialogue (StreamMind): Enabling interactive, high-throughput video dialogue with event-gated LLMs and state-space driven perception.
- Fast On-Device LLM Inference (Neuralink): Architectural and algorithmic breakthroughs for full-scale LLMs on mobile devices, optimizing neuron placement and data access for low-latency performance.
- Efficient Long-Context LLM Fine-Tuning (JENGA): Advancing memory and computational efficiency for specialized LLM deployment in data-heavy contexts.
These projects are published at the most competitive academic venues (NSDI, ICCV, ASPLOS, USENIX ATC, ISCA, EuroSys, HPCA) and increasingly benchmarked for practical standards in Edge AI deployment.
Collaboration with Zhang Yaqin
Dr. Cao’s collaboration with Zhang Yaqin blends her technical rigor and translational prowess with Zhang’s strategic vision and deep industrial-academic networks. This partnership accelerates AIR’s pursuit of industrializing AI innovations at scale, fostering startup ventures and spinouts, and preparing next-generation AI leaders for China and the world.
Conclusion
Dr. Cao Ting’s biography demonstrates uncommon breadth and impact—from advanced academic theory and benchmark-defining systems research to widespread industrial adoption and now, leadership at Tsinghua AIR. Her career traces a trajectory through the heart of global AI innovation, negotiating shifting international landscapes, institutional realignments, and evolving research frontiers.
Her interdisciplinary expertise in Edge AI, low-bit model optimization, software-hardware co-design, and on-device intelligence—combined with her record of industrial translation—cements her status as a central architect for the next wave of human-centered, efficient, and accessible artificial intelligence.
Her recent projects at AIR encapsulate the promise of deploying AI not only as an abstract scientific pursuit, but as a tangible force for transforming products, industries, and societies.
For the most current publication list and professional updates, visit Dr. Cao Ting’s academic website: https://tingcao952.github.io/