The Problem Description:
I have this custom “checksum” function:
```python
NORMALIZER = 0x10000

def get_checksum(part1, part2, salt="trailing"):
    """Returns a checksum of two strings."""
    combined_string = part1 + part2 + " " + salt if part2 != "***" else part1
    ords = [ord(x) for x in combined_string]

    checksum = ords[0]  # initial value

    # TODO: document the logic behind the checksum calculations
    iterator = zip(ords[1:], ords)
    checksum += sum(x + 2 * y if counter % 2 else x * y
                    for counter, (x, y) in enumerate(iterator))
    checksum %= NORMALIZER

    return checksum
```
I want to benchmark it on both Python 3.6 and PyPy. I suspect the function would perform better on PyPy, but I'm not sure what the most reliable and clean way to measure this is.
What I’ve tried and the Question:
Currently, I'm using `timeit` for both:
```shell
$ python3.6 -mtimeit -s "from test import get_checksum" "get_checksum('test1' * 100000, 'test2' * 100000)"
10 loops, best of 3: 329 msec per loop
$ pypy -mtimeit -s "from test import get_checksum" "get_checksum('test1' * 100000, 'test2' * 100000)"
10 loops, best of 3: 104 msec per loop
```
My concern is that I'm not absolutely sure `timeit` is the right tool for the job on PyPy, because of the potential JIT warmup overhead.
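To see whether warmup actually matters on a given interpreter, one option is to print each `timeit.repeat` measurement separately: under a JIT the first repeats are typically slower, shrinking as hot code gets compiled. A minimal sketch, using a hypothetical stand-in workload in place of the real function:

```python
import timeit

def workload():
    # Stand-in CPU-bound body; in practice, substitute the real call,
    # e.g. get_checksum('test1' * 100000, 'test2' * 100000).
    return sum(i * i for i in range(100_000)) % 0x10000

# Each entry is the total time for `number` calls of the workload.
# Under a JIT, the earliest entries are usually the slowest.
for i, t in enumerate(timeit.repeat(workload, repeat=5, number=10)):
    print("repeat %d: %.4f s" % (i, t))
```

If the first one or two repeats are noticeably slower than the rest, the warmup effect is real for that workload and should be excluded from the measurement.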
Plus, PyPy itself prints the following warning before reporting the results:
```
WARNING: timeit is a very unreliable tool. use perf or something else for real measurements
pypy -m pip install perf
pypy -m perf timeit -s 'from test import get_checksum' "get_checksum('test1' * 1000000, 'test2' * 1000000)"
```
What would be the best and most accurate approach to benchmarking the exact same function across these (and potentially other) Python implementations?
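For comparison, one stdlib-only approach is a single script that discards a few warmup repeats and then reports the best per-call time, so the same file can be run unchanged under `python3.6` and `pypy`. This is a sketch, not the definitive answer: the function is inlined for self-containment, `checksum = ords[0]` is assumed as the initial value (a bare `ords` would be a list and break the `+=`), and the payload size is smaller than the question's for a quick run.

```python
import timeit

NORMALIZER = 0x10000

def get_checksum(part1, part2, salt="trailing"):
    """Returns a checksum of two strings."""
    combined_string = part1 + part2 + " " + salt if part2 != "***" else part1
    ords = [ord(x) for x in combined_string]
    checksum = ords[0]  # initial value
    iterator = zip(ords[1:], ords)
    checksum += sum(x + 2 * y if counter % 2 else x * y
                    for counter, (x, y) in enumerate(iterator))
    checksum %= NORMALIZER
    return checksum

def bench(number=3, repeat=5, warmup=2):
    # Smaller payload than the question's 100000 repetitions, so the
    # script stays quick; scale it up for real measurements.
    part1, part2 = "test1" * 10000, "test2" * 10000
    stmt = lambda: get_checksum(part1, part2)
    timeit.repeat(stmt, repeat=warmup, number=number)  # warmup, discarded
    times = timeit.repeat(stmt, repeat=repeat, number=number)
    return min(times) / number  # best per-call time, in seconds

if __name__ == "__main__":
    print("best per call: %.4f s" % bench())
```

Taking the minimum of several repeats (rather than an average) follows the `timeit` documentation's advice: the minimum is the least contaminated by other processes on the machine.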