Discussion forum for David Beazley

Curio Benchmarks with PyPy


#1

The PyPy project just posted some HTTP benchmarks that included Curio. https://morepypy.blogspot.com.br/2017/03/async-http-benchmarks-on-pypy3.html

It’s great to see Curio included in there and hanging with the other frameworks. If anything, I think it’s a real positive sign for both Curio and the sans-I/O work with libraries such as h11.

One subtle aspect of the performance graphs is the X-axis which shows the run duration of the benchmark. All of the graphs show slow speedup at first because the PyPy tracing JIT is analyzing the code. That in mind, I think it’s interesting how fast Curio ramps up to more-or-less full speed. That’s very likely a proxy for underlying code complexity. In the case of Curio, there’s just not a lot of moving parts beyond the scheduling of coroutines. So, I’m wondering if we’re seeing that reflected in the results.

I’m looking forward to playing with this more once a more official PyPy release is made.


#2

That’s my suspicion too, though when comparing (say) asyncio+aiohttp vs curio+h11 it’s not totally clear which parts are responsible. Though really I suspect that it’s both that curio is simpler than asyncio and h11 is simpler than aiohttp… (aiohttp has very complicated internals IMO. Though I should be fair and note that it also has a lot more functionality than the toy curio+h11 server that Squeaky used.)

The slightly dismaying part is that on CPython, cleaner design pays off in speed – curio+h11 is substantially faster than asyncio+aiohttp. But on PyPy, apparently the JIT is not only clever enough to strip away clean abstractions, it’s also clever enough to strip away unnecessary complexity – asyncio+aiohttp eventually manages to catch up with curio+h11. Oh well :-). It’s possible simplicity will still pay off at runtime once one moves to more complex (realistic) systems where the JIT’s heuristics struggle more.


#3

I wouldn’t get too heartbroken if I were you. I think aiohttp’s real advantage here is that it is ultimately running a very tight loop, always executing exactly the same code. This means the fact that it has many possible branches is not really relevant: in practice on this test it has very linear control flow, which makes it easy for the tracing JIT to spot what’s going on.

In a real application, the measure of difficulty is not how many theoretical branches there are, but how many are actually taken. I suspect that a real application built on top of curio and h11 would have very few branches actually taken compared to an aiohttp application, mostly because their internal implementations are much simpler.

Put another way: the easiest way to be fast is to do less, and that applies even when there is a JIT in the way. :grin:


#4

One thing that has been sitting in the back of my mind is just how much of a performance hit would be taken by the Sans-IO approach (as opposed to direct read/write on the wire). That in mind, I think the Curio-h11 result is really promising. Maybe this whole crazy stunt is actually going to work.
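
For anyone reading along who hasn’t seen the Sans-IO idea in code, here is a minimal sketch. It is a toy newline-delimited protocol, not h11’s actual API: the `LineProtocol` class and the `feed` helper are hypothetical names made up for illustration. The point is only that the protocol object never touches a socket; the caller hands it bytes and pulls parsed events back out.

```python
class LineProtocol:
    """Toy sans-I/O protocol for newline-delimited messages (hypothetical)."""

    def __init__(self):
        self._buffer = bytearray()

    def receive_data(self, data: bytes) -> None:
        # The caller decides where the bytes come from: a socket, a file, a test.
        self._buffer += data

    def next_event(self):
        # Return one complete line, or None if more data is needed.
        idx = self._buffer.find(b"\n")
        if idx < 0:
            return None
        line = bytes(self._buffer[:idx])
        del self._buffer[:idx + 1]
        return line


def feed(proto, chunks):
    """Push chunks into the protocol and collect every event it produces."""
    events = []
    for chunk in chunks:
        proto.receive_data(chunk)
        while (event := proto.next_event()) is not None:
            events.append(event)
    return events


# Chunk boundaries don't have to line up with message boundaries.
events = feed(LineProtocol(), [b"hel", b"lo\nwor", b"ld\n"])
```

The extra cost compared to reading directly off the wire is roughly the buffering and the event objects, which is the overhead the question above is about.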


#5

I think it is really important for Curio to have a fully fledged http server as soon as possible. It will really sell some of the people still stuck with asyncio. The quicker the ecosystem fleshes out the faster adoption will be.


#6

I agree that it’d be great to have some kind of HTTP server, although not necessarily implemented as a part of the Curio core (I really see Curio as more of a small library and not a framework that’s aiming to do everything). Honestly, I’ve been kind of sitting back to see how things progress and to see if someone else would step up to take on such a project. At some point, I’ll probably get around to doing it myself if no one else has done it, but all things considered, I’d much rather see someone else take it on.


#7

I actually started but it’s not going so well.


#8

Not going well due to some Curio issue or other factors? Curious.


#9

I’m not a great programmer, and I’m struggling with performance and completeness. For example, when I throw wrk at it I get loads of read errors. :disappointed_relieved: I’m sorta cheating and using h11 + curio to do most of the heavy lifting, which isn’t anything smart. But I still seem to be getting it wrong.


#10

Hmmm. Read errors aside, what is your ultimate goal? Is it building a full-fledged website (e.g., Django-like) or is it more for some sort of API (e.g., serving JSON over HTTP)? I have my own thoughts about how I might implement HTTP with Curio, but they’re mostly focused on the latter. I don’t really want to build a whole web programming framework.


#11

I was mainly targeting JSON / etc. over HTTP, but really it doesn’t matter so long as you can deliver a response.

I want to build a fully fledged HTTP framework. It’s just gonna take time.


#12

Would an HTTP server built with curio look more like gunicorn or more like Django REST framework?
Could you guys share some thoughts?


#13

It was kinda looking like flask.


#14

I really liked the numbers from that benchmark. Are Curio + h11 too immature to use for building simple REST APIs?


Curio http server, how?
#15

Since curio is single-threaded, if a request handler does not yield, curio cannot handle many requests concurrently, just one request at a time.

And a request handler only yields when it hits a database or a cache. So do we have to write asynchronous versions of the database/cache tools?

Or just put all IO operations in other threads?
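
To make the thread option concrete, here is a rough sketch of handing a blocking call to a worker thread so the event loop can keep serving other handlers. It uses the standard library’s asyncio purely so it runs without extra installs; Curio provides `run_in_thread` for the same purpose. The `blocking_query` and `handler` names are hypothetical stand-ins for a real database call and a real request handler.

```python
import asyncio
import time


def blocking_query(n):
    # Stand-in for a blocking database or cache call (hypothetical).
    time.sleep(0.2)
    return n * 2


async def handler(n):
    loop = asyncio.get_running_loop()
    # Run the blocking call in the default thread pool; awaiting the result
    # yields to the event loop, so other handlers can make progress.
    return await loop.run_in_executor(None, blocking_query, n)


async def main():
    start = time.monotonic()
    # Three "requests" in flight at once.
    results = await asyncio.gather(*(handler(n) for n in range(3)))
    return results, time.monotonic() - start


results, elapsed = asyncio.run(main())
```

If `handler` called `blocking_query` directly instead, the three requests would run one after another, which is the one-request-at-a-time behaviour described above. Threads avoid rewriting the database client, at the cost of a thread per outstanding blocking call.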