Making Language Models Go BRRR...

Author: Ibrahim El Kaddouri

Repository: The repository is private

Coming Soon


Graph Surgeon is a small compiler project built on Apache TVM. I take a model exported to ONNX, import it into TVM, and compile it to get a clean baseline with proper benchmarks. From there, I implement my own graph-level optimization pass, the kind of transformation a real compiler would perform, then rebuild the model through the new pipeline and measure the impact. The final result is a before/after comparison, backed by benchmarks, that shows whether the pass actually makes the model faster or slower, and why.
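To make the "pass, rebuild, measure" loop concrete, here is a minimal sketch in plain Python. It is an illustration only: it does not use TVM or ONNX, and it is not the pass this project implements. The `Node` type, the toy interpreter `run`, the `fold_constants` pass, and the `bench` helper are all hypothetical names invented for this example. The pass shown is constant folding, one common graph-level transformation, applied to a tiny hand-written graph.

```python
import timeit
from dataclasses import dataclass


# Hypothetical toy IR: one node per operation, stored in topological order.
@dataclass(frozen=True)
class Node:
    name: str
    op: str              # "input", "const", "add", or "mul"
    inputs: tuple = ()   # names of the nodes this one reads from
    value: float = 0.0   # only meaningful for "const" nodes


OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}


def run(graph, feeds):
    """Interpret the graph and return the value of its last node."""
    env = {}
    for n in graph:
        if n.op == "input":
            env[n.name] = feeds[n.name]
        elif n.op == "const":
            env[n.name] = n.value
        else:
            env[n.name] = OPS[n.op](*(env[i] for i in n.inputs))
    return env[graph[-1].name]


def fold_constants(graph):
    """Constant folding: precompute any add/mul whose inputs are all constants."""
    consts = {}
    folded = []
    for n in graph:
        if n.op == "const":
            consts[n.name] = n.value
        elif n.op in OPS and all(i in consts for i in n.inputs):
            value = OPS[n.op](*(consts[i] for i in n.inputs))
            n = Node(n.name, "const", value=value)
            consts[n.name] = value
        folded.append(n)
    # Single cleanup sweep: keep inputs, the output, and anything still referenced.
    used = {i for n in folded for i in n.inputs}
    output = folded[-1].name
    return [n for n in folded if n.op == "input" or n.name in used or n.name == output]


def bench(graph, feeds, number=2000, repeat=5):
    """Best-of-`repeat` wall time in seconds for `number` runs of the graph."""
    return min(timeit.repeat(lambda: run(graph, feeds), number=number, repeat=repeat))


# Baseline graph: y = x * ((2 * 3) + 2); everything except x is constant.
baseline = [
    Node("x", "input"),
    Node("a", "const", value=2.0),
    Node("b", "const", value=3.0),
    Node("c", "mul", ("a", "b")),
    Node("d", "add", ("c", "a")),
    Node("y", "mul", ("x", "d")),
]
optimized = fold_constants(baseline)
feeds = {"x": 1.5}

# A pass must preserve results before its speed is worth measuring.
assert run(baseline, feeds) == run(optimized, feeds)

print(f"nodes:  {len(baseline)} -> {len(optimized)}")
print(f"before: {bench(baseline, feeds):.6f} s")
print(f"after:  {bench(optimized, feeds):.6f} s")
```

The sketch checks that both graphs produce the same result before timing them, and it reports the minimum over several repeats to reduce noise from the rest of the system. Whether the "after" number is actually lower depends on the machine and the graph, which is exactly the question the before/after comparison is meant to answer.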

References

  1. Apache TVM
  2. arXiv TVM
  3. interesting