Shafi, Aamir
121 results:
1

Exploiting Inter-Layer Expert Affinity for Accelerating Mix..:

In: 2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS),
Yao, Jinghan ; Anthony, Quentin ; Shafi, Aamir.. - p. 915-925, 2024
 
2

HINT: Designing Cache-Efficient MPI_Alltoall using Hybrid M..:

In: 2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS),
Ramesh, Bharath ; Contini, Nick ; Alnaasan, Nawras... - p. 802-813, 2024
 
4

Accelerating communication with multi‐HCA aware collectives..:

Tran, Tu ; Ramesh, Bharath ; Michalowicz, Benjamin...
Concurrency and Computation: Practice and Experience. 36 (2023) 1 - p. , 2023
 
5

ScaMP: Scalable Meta-Parallelism for Deep Learning Search:

In: 2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing (CCGrid),
Anthony, Quentin ; Xu, Lang ; Shafi, Aamir.. - p. 391-402, 2023
 
6

MPI4Spark Meets YARN: Enhancing MPI4Spark through YARN supp..:

In: 2023 IEEE International Conference on Big Data (BigData),
Al-Attar, Kinan ; Shafi, Aamir ; Subramoni, Hari. - p. 2265-2274, 2023
 
7

Implementing and Optimizing a GPU-aware MPI Library for Int..:

In: 2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing (CCGrid)
 
8

Flover: A Temporal Fusion Framework for Efficient Autoregre..:

In: 2023 IEEE 30th International Conference on High Performance Computing, Data, and Analytics (HiPC),
Yao, Jinghan ; Alnaasan, Nawras ; Chen, Tian... - p. 107-116, 2023
 
11

In-Depth Evaluation of a Lower-Level Direct-Verbs API on In..:

In: 2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
 
12

Performance Characterization of Using Quantization for DNN ..:

In: 2023 IEEE 7th International Conference on Fog and Edge Computing (ICFEC),
Ahn, Hyunho ; Chen, Tian ; Alnaasan, Nawras... - p. 1-6, 2023
 
13

A Novel Framework for Efficient Offloading of Communication..:

In: 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
 
14

Accelerating Distributed Deep Learning Training with Compre..:

In: 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS),
Zhou, Qinghua ; Anthony, Quentin ; Xu, Lang... - p. 134-144, 2023
 
15

MCR-DL: Mix-and-Match Communication Runtime for Deep Learni..:

In: 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS),
Anthony, Quentin ; Awan, Ammar Ahmad ; Rasley, Jeff... - p. 996-1006, 2023
 