1. System Optimizations for Enabling Training of Extreme Long .. In: Proceedings of the 43rd ACM Symposium on Principles of Distributed Computing
2. A Hybrid Tensor-Expert-Data Parallelism Approach to Optimiz.. In: Proceedings of the 37th ACM International Conference on Supercomputing
3. DeepSpeed-Inference: Enabling Efficient Inference of Trans.. In: SC22: International Conference for High Performance Computing, Networking, Storage and Analysis
4. 1-bit LAMB: Communication Efficient Large-Scale Large-Batch.. In: 2022 IEEE 29th International Conference on High Performance Computing, Data, and Analytics (HiPC)
5. ZeRO-infinity: breaking the GPU memory wall for extreme.. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
6. ZeRO-Infinity: Breaking the GPU Memory Wall for Extreme Sca.. In: SC21: International Conference for High Performance Computing, Networking, Storage and Analysis
7. DeepSpeed: System Optimizations Enable Training Deep Le.. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
8. ZeRO: memory optimizations toward training trillion par.. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
10. Fast LSTM Inference by Dynamic Decomposition on Cloud Syste.. In: 2019 IEEE International Conference on Data Mining (ICDM)
14. Optimizing CNNs on Multicores for Scalability, Performance .. In: Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems
15. In: Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming