AI/ML Systems
ML Inference Optimization
Designing efficient scheduling strategies for (distributed) inference of large AI models across heterogeneous compute clusters.
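Purely as an illustrative toy for the kind of problem this topic covers (the function name, request costs, and device speeds below are invented, not part of any specific project): a greedy earliest-finish-time scheduler that assigns inference requests to heterogeneous devices, placing each request on the device projected to finish it soonest.

```python
import heapq

def schedule_requests(request_costs, device_speeds):
    """Greedy earliest-finish-time scheduling sketch: assign each
    inference request (cost in FLOPs) to the device that would finish
    it soonest, accounting for work already queued on that device.
    device_speeds are in FLOPs/sec."""
    # Min-heap of (projected finish time, device index).
    heap = [(0.0, i) for i in range(len(device_speeds))]
    heapq.heapify(heap)
    assignment = []
    # Longest-processing-time-first ordering tightens the greedy bound.
    for req_id, cost in sorted(enumerate(request_costs),
                               key=lambda x: -x[1]):
        finish, dev = heapq.heappop(heap)
        finish += cost / device_speeds[dev]
        assignment.append((req_id, dev))
        heapq.heappush(heap, (finish, dev))
    makespan = max(t for t, _ in heap)
    return dict(assignment), makespan
```

For example, four requests of cost [4, 4, 2, 2] on devices with speeds [2, 1] yield a makespan of 4.0, with the slower device taking one large request and the faster device the rest.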
AI/ML Systems
Developing optimal partitioning strategies for edge inference of transformer models across heterogeneous compute devices.
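A minimal sketch of one formulation of this problem (all names and numbers invented): partition a transformer's layers into contiguous pipeline stages, one per device in a fixed order, and use dynamic programming to minimize the slowest stage's compute time, i.e. the pipeline bottleneck. Communication costs are omitted for brevity.

```python
import math

def partition_layers(layer_costs, device_speeds):
    """DP sketch over contiguous layer ranges: assign the first i layers
    to the first j devices, minimizing the bottleneck stage time.
    Assumes len(layer_costs) >= len(device_speeds); every device gets
    at least one layer. Returns (bottleneck_time, cut_points)."""
    n, m = len(layer_costs), len(device_speeds)
    prefix = [0.0]
    for c in layer_costs:
        prefix.append(prefix[-1] + c)
    best = [[math.inf] * (n + 1) for _ in range(m + 1)]
    best[0][0] = 0.0
    cut = [[-1] * (n + 1) for _ in range(m + 1)]
    for j in range(1, m + 1):
        for i in range(j, n + 1):
            for k in range(j - 1, i):
                # Layers k..i-1 run on device j-1.
                stage = (prefix[i] - prefix[k]) / device_speeds[j - 1]
                cand = max(best[j - 1][k], stage)
                if cand < best[j][i]:
                    best[j][i], cut[j][i] = cand, k
    # Recover the cut points by walking the table backwards.
    cuts, i = [], n
    for j in range(m, 0, -1):
        cuts.append(cut[j][i])
        i = cut[j][i]
    return best[m][n], cuts[::-1]
```

With layer costs [1, 1, 2, 2] and device speeds [1, 2], the optimal split puts the first two layers on the slow device and the last two on the fast one, for a bottleneck of 2.0.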
Networking
Workload-aware, low-latency transport protocols for distributed compute.
Data Centers
Configuration and resource management strategies for achieving optimal energy-performance tradeoffs in large-scale data center networks.
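One toy instance of an energy-performance tradeoff (a sketch under simplifying assumptions, not a method from any project above): with DVFS, dynamic power scales roughly with f^3, so energy per unit of work scales with f^2, and the lowest frequency that still meets the deadline minimizes energy.

```python
def pick_frequency(freqs, work, deadline):
    """DVFS sketch: since energy per operation grows with frequency
    (roughly f^2 for dynamic power ~ f^3), pick the slowest frequency
    that still finishes `work` ops within `deadline` seconds.
    freqs are in ops/sec; returns None if no setting is feasible."""
    feasible = [f for f in freqs if work / f <= deadline]
    return min(feasible) if feasible else None
```

For example, with available frequencies [1, 2, 4], 4 ops of work, and a 2.5 s deadline, frequency 2 is the energy-minimal feasible choice.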
Cloud Computing
Revenue-maximizing pricing schemes for networked compute nodes serving large-scale AI training and inference jobs in public and private clouds.
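As a stylized single-resource example of the pricing question (linear demand is an assumption made here for illustration only): with demand d(p) = a - b*p and marginal cost c, profit (a - b*p)(p - c) is concave in p, so the profit-maximizing price has a closed form, p* = (a + b*c) / (2b).

```python
def optimal_price(a, b, marginal_cost):
    """Toy pricing sketch: with linear demand d(p) = a - b*p, profit
    (a - b*p) * (p - marginal_cost) is concave in p; setting its
    derivative a - 2*b*p + b*marginal_cost to zero gives the optimum."""
    return (a + b * marginal_cost) / (2 * b)
```

For a = 10, b = 1, c = 2 this gives p* = 6.0: demand is 4 units and profit is 16, which beats any neighboring price (e.g. 15 at p = 5 or p = 7).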
Edge Computing
Optimal job scheduling and resource allocation for hierarchical edge-cloud architectures to minimize latency under energy constraints.
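A minimal sketch of one two-tier placement formulation (job parameters and names invented): each job runs either on the edge node, spending edge energy, or in the cloud with a higher latency; choosing the edge subset that minimizes total latency within the edge energy budget is a 0/1 knapsack over latency savings.

```python
def place_jobs(jobs, energy_budget):
    """0/1 knapsack sketch for edge-cloud placement. Each job is a tuple
    (edge_latency, edge_energy, cloud_latency); cloud execution is
    assumed to cost no edge energy. Picks the edge subset maximizing
    latency saved subject to the integer energy budget, and returns
    the resulting total latency."""
    base = sum(c_lat for _, _, c_lat in jobs)   # everything in the cloud
    # best[b] = max total latency saved using at most b units of energy.
    best = [0] * (energy_budget + 1)
    for e_lat, e_cost, c_lat in jobs:
        saving = c_lat - e_lat
        if saving <= 0:
            continue  # cloud is already at least as fast for this job
        for b in range(energy_budget, e_cost - 1, -1):
            best[b] = max(best[b], best[b - e_cost] + saving)
    return base - best[energy_budget]
```

For jobs [(1, 2, 5), (2, 3, 4), (3, 1, 3)] and an energy budget of 3, only the first job is worth placing on the edge, reducing total latency from 12 to 8.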