
Discussion (18 Comments)
Right this second I'm writing something in Python with critical performance requirements. It needs to process an average of 25k things per second. That won't be particularly hard, but it's close enough to the edge of what the language is capable of that I do need to be at least a tiny bit careful with the implementation. I'm highly unlikely to need a profiler for this project in particular, but earlier in my career I probably would have needed one.
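A quick way to sanity-check whether a pure-Python loop can sustain a rate like 25k items per second is a throughput measurement before reaching for a profiler. This is a minimal sketch; `process` is a hypothetical stand-in for the real per-item work:

```python
import time

def process(item):
    # hypothetical stand-in for the real per-item work
    return item * 2

def measure_throughput(n=100_000):
    """Time a tight loop over n items and return items per second."""
    start = time.perf_counter()
    results = [process(i) for i in range(n)]
    elapsed = time.perf_counter() - start
    return len(results) / elapsed

if __name__ == "__main__":
    print(f"{measure_throughput():,.0f} items/sec")
```

If the measured rate is comfortably above the target, the straightforward implementation is fine; if it's within a small factor, that's the signal to be careful.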
Python is fairly commonly used as a glue engine around faster code too, and it's not always obvious when the wrapper code is inducing nontrivial overhead (hidden copies and that sort of thing). Profilers are great for teasing out those sorts of problems. They shine a spotlight on the section of code that should take 0 µs and is instead dominating your runtime.
The fact that the base language is an order of magnitude (or two!) slower has almost never mattered. If my work gets to the point where it does, and I have an excuse to go mess around with a rust extension or some cool optimized library, things are going very well.
I've been a professional developer for over 20 years now, and I've read this forum obsessively for much of that time. I've seen people write things like, "Most engineers would kill for a 5% speedup" and I think, on what planet? Most engineers have much larger problems that cannot be so easily quantified. Come to think of it, I think there is an allure to performance optimization precisely because it can be so easily quantified.
Sometimes Python is just the language used in the domain. Lots of sciences live on Python because it is easy to teach to grad students and the package ecosystem is strong.
A profiler like this can be used to identify which parts to rewrite in a faster language. Sometimes it's easier to write everything in Python first and then measure than to guess at the start which parts need to be fast.
You can also get gains by switching algorithms, both in pure Python and when using a compiled library like `numpy`. And there are also some operations, like string manipulation or the `sqlite3` module, where the Python runtime's implementation has already been optimized in a compiled language.
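As an illustration of the algorithmic point (the function names are illustrative, not from the thread): replacing repeated list membership tests with a set turns an O(n) scan per query into an O(1) average lookup, which is often a bigger win than reaching for a faster language:

```python
def count_hits_list(items, queries):
    # O(len(items)) per query: scans the whole list every time
    return sum(1 for q in queries if q in items)

def count_hits_set(items, queries):
    # O(1) average per query after a one-time set build
    lookup = set(items)
    return sum(1 for q in queries if q in lookup)

if __name__ == "__main__":
    items = list(range(10_000))
    queries = list(range(0, 20_000, 2))
    # same answer, very different running time at scale
    assert count_hits_list(items, queries) == count_hits_set(items, queries)
```

A profiler pointing at the membership test is what motivates the switch; the fix itself stays in pure Python.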
One thing to note is that there are some differences in the blocking behaviour of the target process: py-spy blocks by default and profiling.sampling doesn't. I wrote a bit about why py-spy blocks by default here: https://www.benfrederickson.com/why-python-needs-paused-duri... The first version of py-spy also didn't block, and since that gave us incredibly misleading results at times, blocking was one of the first changes I made to py-spy.
Imagine if I have a single request calling asyncio.gather() on 5 different coroutines. Only 1 is on CPU, the other 4 are on IO. Is Tachyon able to sample all 5 coroutine tasks?
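A minimal sketch of the workload this comment describes, so the question is concrete. Whether a given sampler attributes time to all five tasks is exactly the open question; this only sets up the scenario, and the coroutine names are illustrative:

```python
import asyncio
import time

async def io_bound(i):
    # simulates waiting on a socket or disk: off-CPU the whole time
    await asyncio.sleep(0.05)
    return f"io-{i}"

async def cpu_bound():
    # busy-loops on the event loop thread (the one on-CPU frame)
    deadline = time.perf_counter() + 0.05
    total = 0
    while time.perf_counter() < deadline:
        total += 1
    return total

async def handle_request():
    # one CPU-bound coroutine plus four IO-bound ones, gathered together
    return await asyncio.gather(
        cpu_bound(), *(io_bound(i) for i in range(4))
    )

if __name__ == "__main__":
    print(asyncio.run(handle_request()))
```

While `cpu_bound` spins, the four `io_bound` tasks are suspended at `await asyncio.sleep`, so a sampler that only walks the on-CPU stack will see one task; surfacing the other four requires inspecting the suspended coroutine tasks as well.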