#StackBounty: #machine-learning #parallel-computing #benchmarking How does the TEPS benchmark work and why is it relevant to real world…

Bounty: 400


The main benchmarks used to measure computers are FLOPS, MIPS, and related metrics, which count how many basic operations of some kind a processor can perform per second. It is clear to me how these benchmarks relate to a processor’s ability to execute real-world algorithms. For example, most scientific and graphics algorithms require a well-defined number of floating-point operations, and this is their primary computational cost, so a GPU’s FLOPS rating says a lot about how fast it will run those algorithms (though not everything, since there are other bottlenecks, such as communication bandwidth limits and scheduling efficiency).

The TEPS benchmark

Graph500 is a competition for supercomputers that uses a different benchmark, "Traversed Edges Per Second" (TEPS), which is supposed to measure some notion of the computer’s communication bandwidth.

I understand the intuitive justification for such a benchmark, since data-communication is a key bottleneck in many applications.
However, I don’t understand how this benchmark is computed, since there doesn’t seem to be a basic explanation on the site. And I don’t understand how exactly this specific benchmark is supposed to relate to real-world computational problems like machine learning tasks. At a basic level, how does TEPS work, and why?

  • How is TEPS computed? Is there a clearer explanation somewhere than the one on the Graph500 website? (I couldn’t find one after searching Google Scholar.)

  • TEPS somehow measures the number of edges traversed in a graph, but what is this graph supposed to be analogous to? For example, if we compare it to a machine learning task, what would a node be? A single data sample? A single memory location?

  • What insight does TEPS give us beyond simply reading the communication bandwidth off the computer’s specification sheet, if we want to predict how well the supercomputer will do on some ML/big-data task (or some other task)?
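To make the first question concrete, here is a minimal single-machine sketch of how a TEPS-style number could be computed. It assumes the metric is essentially "input edges in the component reached by a breadth-first search, divided by the time the search took", which is how the Graph500 BFS kernel is usually described. The function names (`build_adjacency`, `bfs_parents`, `measure_teps`) and the tiny hand-written edge list are hypothetical and only for illustration. This is not the Graph500 reference code, which generates a very large synthetic graph and averages over many search roots.

```python
import time
from collections import deque


def build_adjacency(num_vertices, edges):
    """Build an undirected adjacency list (graph construction is not timed here)."""
    adj = [[] for _ in range(num_vertices)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def bfs_parents(adj, root):
    """Breadth-first search from root; returns a parent array (-1 = not reached)."""
    parent = [-1] * len(adj)
    parent[root] = root
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if parent[v] == -1:
                parent[v] = u
                queue.append(v)
    return parent


def count_traversed_edges(edges, parent):
    """Count input edges that lie in the component reached by the search.

    In an undirected graph, if one endpoint was reached, so was the other.
    """
    return sum(1 for u, _ in edges if parent[u] != -1)


def measure_teps(num_vertices, edges, root):
    """Return (traversed_edges, traversed_edges / bfs_seconds) for one search."""
    adj = build_adjacency(num_vertices, edges)
    start = time.perf_counter()
    parent = bfs_parents(adj, root)
    elapsed = time.perf_counter() - start
    m = count_traversed_edges(edges, parent)
    teps = m / elapsed if elapsed > 0 else float("inf")
    return m, teps


if __name__ == "__main__":
    # Toy graph: a triangle (0, 1, 2) plus a separate edge (3, 4).
    toy_edges = [(0, 1), (1, 2), (2, 0), (3, 4)]
    m, teps = measure_teps(5, toy_edges, root=0)
    print(f"traversed edges: {m}, TEPS: {teps:.3e}")
```

On a graph this small the timing is meaningless; the point is only which quantities go into the ratio. On a real machine the graph is far too large for cache and is spread across many nodes, so most of the time is spent chasing irregular memory references and exchanging frontier vertices between nodes rather than doing arithmetic.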
