Preprints

2023

  1. Input-sensitive dense-sparse primitive compositions for GNN acceleration Lenadora, Damitha, Sathia, Vimarsh, Gerogiannis, Gerasimos, Yesil, Serif, Torrellas, Josep, and Mendis, Charith 2023 [arXiv]
  2. FLuRKA: Fast fused Low-Rank & Kernel Attention Gupta, Ahan, Yuan, Yueming, Zhou, Yanqi, and Mendis, Charith 2023 [arXiv]
  3. Dias: Dynamic Rewriting of Pandas Code Baziotis, Stefanos, Kang, Daniel, and Mendis, Charith 2023 [arXiv]
  4. COMET: X86 Cost Model Explanation Framework Chaudhary, Isha, Renda, Alex, Mendis, Charith, and Singh, Gagandeep 2023 [arXiv]

Conference and Journal Papers

2023

  1. Challenges in Metaverse Research: An Internet of Things Perspective Abdelzaher, Tarek, Caesar, Matthew, Mendis, Charith, Nahrstedt, Klara, Srivastava, Mani, and Yu, Minlan In 2023 IEEE International Conference on Metaverse Computing, Networking and Applications (MetaCom) 2023 [PDF]
  2. Learning Large Graph Property Prediction via Graph Segment Training Cao, Kaidi, Phothilimthana, Phitchaya Mangpo, Abu-El-Haija, Sami, Zelle, Dustin, Zhou, Yanqi, Mendis, Charith, Leskovec, Jure, and Perozzi, Bryan In Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems (NeurIPS) 2023 [arXiv]
  3. TpuGraphs: A Performance Prediction Dataset on Large Tensor Computational Graphs Phothilimthana, Phitchaya Mangpo, Abu-El-Haija, Sami, Cao, Kaidi, Fatemi, Bahare, Mendis, Charith, and Perozzi, Bryan In Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems (NeurIPS) 2023 [arXiv]
    Real-world performance prediction dataset for TPUs
  4. Unified Convolution Framework: A compiler-based approach to support sparse convolutions Won, Jaeyeon, Hong, Changwan, Mendis, Charith, Emer, Joel, and Amarasinghe, Saman In Proceedings of Machine Learning and Systems (MLSys) 2023 [PDF]
  5. SPADE: A Flexible and Scalable Accelerator for SpMM and SDDMM Gerogiannis, Gerasimos, Yesil, Serif, Lenadora, Damitha, Cao, Dingyuan, Mendis, Charith, and Torrellas, Josep In Proceedings of the 50th Annual International Symposium on Computer Architecture 2023 [PDF] [Bibtex]
  6. TGOpt: Redundancy-Aware Optimizations for Temporal Graph Attention Networks Wang, Yufeng, and Mendis, Charith In Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, PPoPP 2023, Montreal, QC, Canada, 25 February 2023 - 1 March 2023 2023 [PDF] [Code] [Bibtex]
  7. WACO: Learning Workload-Aware Co-optimization of the Format and Schedule of a Sparse Tensor Program Won, Jaeyeon, Mendis, Charith, Emer, Joel S., and Amarasinghe, Saman P. In Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, ASPLOS 2023, Vancouver, BC, Canada, March 25-29, 2023 2023 [PDF] [Code] [Bibtex]

2022

  1. GRANITE: A Graph Neural Network Model for Basic Block Throughput Estimation Sýkora, Ondrej, Phothilimthana, Phitchaya Mangpo, Mendis, Charith, and Yazdanbakhsh, Amir In IEEE International Symposium on Workload Characterization, IISWC 2022, Austin, TX, USA, November 6-8, 2022 2022 [PDF] [Bibtex]
  2. All You Need is Superword-Level Parallelism: Systematic Control-Flow Vectorization with SLP Chen, Yishen, Mendis, Charith, and Amarasinghe, Saman In Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation (PLDI) 2022 [PDF] [Bibtex]

2021

  1. VeGen: A Vectorizer Generator for SIMD and Beyond Chen, Yishen, Mendis, Charith, Carbin, Michael, and Amarasinghe, Saman In Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems 2021 [PDF] [Bibtex]
  2. A Learned Performance Model for Tensor Processing Units Kaufman, Sam, Phothilimthana, Phitchaya, Zhou, Yanqi, Mendis, Charith, Roy, Sudip, Sabne, Amit, and Burrows, Mike In Proceedings of Machine Learning and Systems 2021 [PDF] [Bibtex]
    Used in production at Google in the XLA TPU compiler

2020

  1. DiffTune: Optimizing CPU Simulator Parameters with Learned Differentiable Surrogates Renda, Alex, Chen, Yishen, Mendis, Charith, and Carbin, Michael In 53rd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2020, Athens, Greece, October 17-21, 2020 2020 [PDF] [Bibtex]

2019

  1. Revec: program rejuvenation through revectorization Mendis, Charith, Jain, Ajay, Jain, Paras, and Amarasinghe, Saman P. In Proceedings of the 28th International Conference on Compiler Construction, CC 2019, Washington, DC, USA, February 16-17, 2019 2019 [PDF] [Bibtex]
  2. Ithemal: Accurate, Portable and Fast Basic Block Throughput Estimation using Deep Neural Networks Mendis, Charith, Renda, Alex, Amarasinghe, Saman P., and Carbin, Michael In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA 2019 [PDF] [Code] [Bibtex]
    Best Paper award at the ML for Systems workshop co-located with ISCA’19
  3. BHive: A Benchmark Suite and Measurement Framework for Validating x86-64 Basic Block Performance Models Chen, Yishen, Brahmakshatriya, Ajay, Mendis, Charith, Renda, Alex, Atkinson, Eric, Sýkora, Ondrej, Amarasinghe, Saman P., and Carbin, Michael In IEEE International Symposium on Workload Characterization, IISWC 2019, Orlando, FL, USA, November 3-5, 2019 2019 [PDF] [Code] [Bibtex]
  4. Compiler Auto-Vectorization with Imitation Learning Mendis, Charith, Yang, Cambridge, Pu, Yewen, Amarasinghe, Saman P., and Carbin, Michael In Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vancouver, BC, Canada 2019 [PDF] [Bibtex]

2018

  1. goSLP: globally optimized superword level parallelism framework Mendis, Charith, and Amarasinghe, Saman P. Proc. ACM Program. Lang. 2018 [PDF] [Bibtex]

2017

  1. Making caches work for graph analytics Zhang, Yunming, Kiriansky, Vladimir, Mendis, Charith, Amarasinghe, Saman P., and Zaharia, Matei In 2017 IEEE International Conference on Big Data (IEEE BigData 2017), Boston, MA, USA, December 11-14, 2017 2017 [PDF] [Bibtex]
    Best Student Paper award

2016

  1. Parallelizing WFST speech decoders Mendis, Charith, Droppo, Jasha, Maleki, Saeed, Musuvathi, Madanlal, Mytkowicz, Todd, and Zweig, Geoffrey In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016, Shanghai, China, March 20-25, 2016 2016 [PDF] [Bibtex]

2015

  1. Helium: lifting high-performance stencil kernels from stripped x86 binaries to halide DSL code Mendis, Charith, Bosboom, Jeffrey, Wu, Kevin, Kamil, Shoaib, Ragan-Kelley, Jonathan, Paris, Sylvain, Zhao, Qin, and Amarasinghe, Saman P. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, Portland, OR, USA, June 15-17, 2015 2015 [PDF] [Website] [Bibtex]