Coroutine resumption in event loop agnostic libraries


#1

I’m currently rewriting my little asynchronous webserver project, growler, to support curio in addition to asyncio. I assume that if this is successful, it won’t be hard to support anything else that comes around. I’d also like it to be a model for writing non-trivial code that is flexible enough to support multiple ‘backends’.

I think I’ve hit a roadblock while trying to replace an asyncio.Future. It was used as a placeholder for the incoming HTTP request’s body data, allowing the app to run while the (potentially large) body loads in the background. When the body is needed, awaiting this future blocks until the full body has been read in, then resumes.
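
For reference, the asyncio-only pattern I’m replacing looks roughly like this (an illustrative sketch, not the actual growler code):

    import asyncio

    async def start_request(reader):
        # placeholder the application awaits to get the body
        body_future = asyncio.get_event_loop().create_future()

        async def load_body():
            data = await reader.read()        # potentially large read
            body_future.set_result(data)      # resumes anyone awaiting the future

        asyncio.ensure_future(load_body())    # body loads in the background
        return body_future

    # in the application, later:
    #     body = await request.body_future    # blocks only until the body is fully read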

I’m looking for a mechanism that can resume the execution of a waiting coroutine in a generic way. I’ll understand if this cannot be done. I’ve been trying various permutations of await-ing and send-ing data between coroutines and generators, to no avail.

One option, though I hesitate to ask, is to add a special case for handling asyncio.Future objects in the curio Kernel. This is of course a bad idea for many reasons.


#2

Loading the whole body into memory sounds like a problematic design to me – what are you going to do when someone uploads a 100 MB file? I would expect a streaming API to be preferred for this, like in WSGI…

That said, you should be able to build a Future-like object pretty straightforwardly on top of the curio Event class. (Your class would have an Event object + a slot for the actual data value; the Event keeps track of whether the future has completed or not, and gives you a way to block until it has; and then once it has you read the data.) Queues also solve a somewhat similar problem, and could be useful depending on how you decide to factor the code.
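
A minimal sketch of what I mean, assuming curio’s Event API (where both set() and wait() are coroutines):

    from curio import Event

    class SimpleFuture:
        """Future-like object: one value, set once, awaitable until it arrives."""

        def __init__(self):
            self._event = Event()
            self._value = None

        async def set_result(self, value):
            self._value = value
            await self._event.set()      # wakes anyone blocked in get()

        async def get(self):
            await self._event.wait()     # blocks until set_result() has run
            return self._value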


#3

To answer your first question: this is the default “body upload” behavior and can be changed depending on the request path; there is also a 2 MB limit on the size of post data by default.

If I remember right, asyncio queues use futures internally, so for my purposes (one item) they would have been the same - although now that I think about it, queues could be a solution to this streaming body problem… In your opinion, do you think this should be the default?
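
Something like this is what I have in mind for the streaming case (a rough sketch with hypothetical names; it assumes a stream with an awaitable read(), and asyncio.Queue has the same await put()/get() interface):

    from curio import Queue

    BODY_DONE = object()   # sentinel marking end-of-body

    async def feed_body(stream, queue, chunk_size=8192):
        # producer: read the body in chunks and hand them to the application
        while True:
            data = await stream.read(chunk_size)
            if not data:
                await queue.put(BODY_DONE)
                return
            await queue.put(data)

    async def read_full_body(queue):
        # consumer: collect chunks until the sentinel arrives
        chunks = []
        while True:
            chunk = await queue.get()
            if chunk is BODY_DONE:
                return b"".join(chunks)
            chunks.append(chunk)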

But it sounds like you need an asyncio.Task or curio.Event to implement this, which is fine; I’ll just have to move this future-like object creation into backend-specific code. This is kind of what I was trying to avoid, but I think it’s unavoidable.


#4

Yeah – but curio provides Queue as a primitive, and it doesn’t provide Future, so at least potentially a queue-based solution might turn out to be easier to make work on both. Or possibly not, I’m just throwing it out there. I think we really, really don’t understand yet how to write code that’s portable across event loops, so you’re on the cutting edge that we’ll all learn from :-).

(To be clear, though, there’s nothing intrinsic that stops curio from supporting Future objects, as evidenced by the fact that you can easily build your own on top of the primitives that it does provide, and that if you look at how stuff like curio.run_in_thread works, it actually uses concurrent.futures.Future internally. I do think, though, that using Futures encourages a style of coding that curio is trying to discourage – that might be why @dabeaz has left it out so far. Or maybe not, I’m just guessing.)

I’ve actually been thinking a lot recently about async WSGI-like APIs, and starting to prototype my own (possibly event-loop agnostic) HTTP server based on h11. But unfortunately these ideas are still a little too incoherent for me to know how to articulate them here :-/.


#5

For those interested, my solution in the end was to move the “future” into the event-loop-dependent code. I think all general-purpose code will need a little bootstrap code that uses the constructs provided by the async event loop.

I essentially have a function that returns a reader/writer pair:

  • The reader is the asyncio.Future or curio.Queue object: something that may be awaited and returns the value set by the writer.
  • The writer is an initialized generator object that will activate the reader once some code, somewhere, calls writer.send(result). This generator essentially calls queue.put((yield)) or future.set_result((yield)); a sketch follows below.
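
The asyncio flavor of that pair looks roughly like this (an illustrative sketch with hypothetical names, not the published code; the curio flavor swaps in a Queue, with the wrinkle that queue.put() is a coroutine and so has to be awaited by whatever code drives the writer):

    import asyncio

    def make_body_channel(loop=None):
        """Return (reader, writer): await the reader, send() the body to the writer."""
        loop = loop or asyncio.get_event_loop()
        future = loop.create_future()      # the "reader": awaitable, resolves to the body

        def writer():
            # primed generator: writer.send(body) resolves the future
            future.set_result((yield))
            yield                          # park so send() does not raise StopIteration

        w = writer()
        next(w)                            # advance to the first yield
        return future, w

The backend code then does something like: reader, writer = make_body_channel(), attaches the reader to the request, and calls writer.send(body) once the last chunk has arrived.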

I haven’t published this yet, but I’ll link here when I do. I’d also love to hear any alternative solutions anyone comes up with.