Discussion forum for David Beazley

My Curio HTTP Server


#1

Hi!

First of all, I am in love with curio! It is exactly how it should be done!

Second, I am really sorry for the self-promotion. It may look rude, but honestly I am mostly looking for help and advice.

So, here is the curio-based HTTP server I wrote. It uses httptools, like Sanic does.


You can examine the usage examples to get a better idea of what it is now.

It easily handles 20 Mbps / 4000 rps of traffic at 15% CPU usage on my home PC with a Jinja2 template response, which I think is good enough for most use cases. I’ll be happy to improve performance, but it is not my current priority.

I am going to improve it the following ways (no timeline so far):

  1. Add support for HTML forms. Parsing, validation. More or less what Django does. Every part of Form rendering should be overridable Jinja2 template. I always hated that Django default form rendering does not go with what Bootstrap expects and you have to render forms manually.
  2. Connect Forms with peewee models. More or less what ModelForm is in Django. Peewee because I love Peewee. If you like SQLAlchemy, welcome to join me. Very optional, nothing will be assumed in the rest of the code.
  3. Based on #2 create generic admin module. Again inspired by awesome Django Admin, but taking into account my personal negative experience with it.
  4. Create some authentication/authorization infrastructure. Very optional, nothing will be assumed in the rest of the code.
  5. Provide some OpenAPI/SwaggerUI support. Have no idea what is required.

#2

Did you have a look at https://github.com/trollfot/granite ?
Want to join efforts somehow ?


#3

Yeah, why not. I’ll have to read your code first to reply in constructive manner.


#4

First of all, I have reviewed the dependencies from setup.py

  1. I’ll skip curio and pytest as obvious ones.

  2. I use httptools too so we are on the same wave here. Very good.

  3. I do not use autoroutes, but I use custom regex based approach. So, I have installed 0.2.0 by pip.
    3.a) Performance.
    3.a.1) My naive regex based implementation is about ten times slower than autoroutes.
    3.a.2) Difference by order of magnitude was surprising, so I looked into what autoroutes does.
    3.a.3) I quickly realized that autoroutes splits route pattern into three parts: before first variable (static prefix), between first and last variable and after last variable (static suffix) and first match prefixes and suffixes. Quick way to verify if this approach provides performance boost was to move a variable in front of pattern, so that static prefix would be empty. It did not help at all.
    3.a.4) The real answer is that autoroutes tries hard to not use regex at all whenever possible. Bad thing is that if I need regex, then using regex is impossible or so unintuitive I have failed to find how. For instance “{id:digit}” matches “123”, but neither “{id:\d+}”, nor “{id:\d\d\d}” do. I don’t know why and honestly do not like such kind of bugs.
    3.b.1) I do not like ‘{}’ syntax. Django, Sanic, Flask all use ‘<>’.
    3.b.2) I like that in my implementation I can specify another router instead of a handler, so I can compose routing:
    router1 = Router()
    router1.add('/x', handler1)
    router1.add('/y', handler2)
    router2 = Router()
    router2.add('/a', router1)
    and it will correctly handle ‘/a/x’ and call handler1.
    3.b.3) I like that I can add useful predefined types for long regex patterns, like UUID, so I can write /user/<pk:uuid>/ instead of /user/<pk:(?:[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})|(?:[0-9a-fA-F]{32})>/

On the one hand, reusing an existing library is always good. On the other hand, I want to keep my more powerful implementation (3.b.2) with its more popular syntax (3.b.1) and more predictable behavior (3.a.4). In absolute numbers, 0.008 milliseconds for finding a route does not look like a problem at all to me, and the other arguments outweigh better performance.
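The composition from 3.b.2 and the named types from 3.b.3 can be sketched roughly as follows. This is only an illustration under assumptions, not the code of the server discussed here: the `Router` class, the `TYPES` table, the `compile_pattern` helper and the `<name:type>` syntax are all hypothetical, and a type name that is not in the table is simply treated as a raw regex.

```python
import re

# Hypothetical table of predefined types; the UUID pattern is the one quoted above.
TYPES = {
    "int": r"\d+",
    "str": r"[^/]+",
    "uuid": (r"(?:[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
             r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})|(?:[0-9a-fA-F]{32})"),
}

PARAM = re.compile(r"<(\w+)(?::([^>]+))?>")


def compile_pattern(pattern):
    """Turn '/user/<pk:uuid>' into a regex that matches a path prefix."""
    parts, pos = [], 0
    for m in PARAM.finditer(pattern):
        parts.append(re.escape(pattern[pos:m.start()]))
        name, kind = m.group(1), m.group(2) or "str"
        # Unknown type names fall through and are used as raw regex.
        parts.append(f"(?P<{name}>{TYPES.get(kind, kind)})")
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts))


class Router:
    def __init__(self):
        self.routes = []  # list of (compiled regex, handler or nested Router)

    def add(self, pattern, target):
        self.routes.append((compile_pattern(pattern), target))

    def resolve(self, path, params=None):
        """Return (handler, params) or None if nothing matches."""
        params = dict(params or {})
        for regex, target in self.routes:
            m = regex.match(path)
            if m is None:
                continue
            rest = path[m.end():]
            found = {**params, **m.groupdict()}
            if isinstance(target, Router):
                # Delegate the unmatched remainder to the nested router.
                result = target.resolve(rest, found)
                if result is not None:
                    return result
            elif rest == "":
                return target, found
        return None


def handler1(): return "x"
def handler2(): return "y"
def show_user(): return "user"

router1 = Router()
router1.add("/x", handler1)
router1.add("/y", handler2)
router2 = Router()
router2.add("/a", router1)
router2.add("/user/<pk:uuid>", show_user)
router2.add(r"/item/<id:\d\d\d>", handler2)
```

With this sketch, `router2.resolve("/a/x")` reaches `handler1` through the nested router. The linear scan over compiled regexes is the naive approach whose cost is discussed above; it trades speed for predictable regex behavior.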

  1. Looking at the public interface, I conclude that multifruits supports streaming parsing, so if anyone uploads a file you can write it incrementally instead of loading it entirely into memory first. Very good. I’ll have to review the implementation at some point to confirm my guess.

  2. biscuits looks good to me. I did not implement fancy cookie management API yet, and using biscuits seems reasonable.

  3. wsproto. This implementation does not look async friendly to me, so it is probably a bad choice. When I look at RFC 6455, the protocol does not seem like a hard one. Not something you’ll write in one day, but not one month either. I’d rather postpone this task than use an implementation that is not async friendly, because people use websockets to gain performance and would actually get the opposite.

Also, I have reviewed code of granite

  1. I have found custom implementation of Multidict, I use multidict. To me CIMultiDict is very useful.

  2. I do not like Response.streamer API, to my taste it requires generator callback and thus makes code harder to read. Also I do not like that header management is very limited for streaming responses.

  3. I like the way I implemented headers more than what I found in granite. Not only cookies may require special handling. I find using specialized attributes more promising. Of course there is always a question what should be the result of
    response.headers.content_type.type = 'application'
    response.headers.content_type.subtype = 'octet-stream'
    response.headers['Content-Type'] = 'text/html'

    response.headers['Content-Type'] = 'text/html'
    response.headers.content_type.type = 'application'
    response.headers.content_type.subtype = 'octet-stream'

and I hope to find really good answer to this question, but I think that even if not, specialized attributes provide benefits like code completion and should stay.
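One possible answer to the question above is "last write wins": the specialized attribute keeps no state of its own and is only a typed view over the raw header value, so both write paths update the same storage. The sketch below is an assumption about how that could look, not the server's actual code; the `Headers` and `ContentType` classes are hypothetical, header names are simply lowercased, and parameters such as `charset` are ignored.

```python
class ContentType:
    """Typed view over the raw Content-Type entry; it stores nothing itself."""

    def __init__(self, raw):
        self._raw = raw  # dict shared with the owning Headers object

    def _split(self):
        value = self._raw.get("content-type", "/")
        main, _, sub = value.partition("/")
        return main, sub

    @property
    def type(self):
        return self._split()[0]

    @type.setter
    def type(self, value):
        # Rewrite the raw value, keeping the current subtype.
        self._raw["content-type"] = f"{value}/{self._split()[1]}"

    @property
    def subtype(self):
        return self._split()[1]

    @subtype.setter
    def subtype(self, value):
        # Rewrite the raw value, keeping the current main type.
        self._raw["content-type"] = f"{self._split()[0]}/{value}"


class Headers:
    def __init__(self):
        self._raw = {}  # single source of truth for all header values
        self.content_type = ContentType(self._raw)

    def __getitem__(self, name):
        return self._raw[name.lower()]

    def __setitem__(self, name, value):
        self._raw[name.lower()] = value
```

Under this design the first sequence ends with `text/html` and the second with `application/octet-stream`, because each assignment overwrites the shared raw value, while code completion on `content_type` still works.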


#5

It was designed to be async-friendly and it’s in use by multiple async packages (granite, hypercorn, trio-websocket, …). If you have specific concerns you should tell the maintainers :slight_smile:


#6

When I look at this

I see TaskGroup, intermediate buffers and all things you need to glue blocking api and async. It can be used from async, but is not async friendly to my taste. This is my personal impression.


#7

okay, thank you for your time and review.


#8

None of those things help you glue together blocking sync code and async. The thing you need for that is threads.

wsproto is a sans I/O library. It’s designed to be ready to hook up to whatever I/O strategy you like.
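To make the sans-I/O idea concrete, here is a toy sketch. It is deliberately not wsproto's API: `LineProtocol` is a hypothetical newline-delimited protocol that only turns bytes into events and events into bytes. The driver below it uses asyncio streams purely to stay within the standard library; the same loop shape would apply with any other I/O strategy.

```python
import asyncio


class LineProtocol:
    """Toy sans-I/O protocol: newline-delimited text messages.

    Hypothetical illustration only; this is not wsproto's API.
    """

    def __init__(self):
        self._buffer = bytearray()

    def receive_data(self, data):
        """Feed bytes obtained by any means (socket, file, test fixture)."""
        self._buffer += data

    def events(self):
        """Yield every complete message currently buffered."""
        while True:
            index = self._buffer.find(b"\n")
            if index < 0:
                return
            line = bytes(self._buffer[:index])
            del self._buffer[:index + 1]
            yield line.decode("utf-8")

    def send(self, message):
        """Return the bytes to write; the caller performs the write."""
        return message.encode("utf-8") + b"\n"


async def echo_upper(reader, writer):
    """Async driver: all awaiting happens here, never inside the protocol."""
    proto = LineProtocol()
    while True:
        data = await reader.read(1024)
        if not data:
            break
        proto.receive_data(data)
        for message in proto.events():
            writer.write(proto.send(message.upper()))
        await writer.drain()
    writer.close()
```

The protocol object never blocks and never awaits, so there is no blocking API to glue to async code. A handler like `echo_upper` could be passed to `asyncio.start_server`, and the protocol class can be unit-tested with plain byte strings.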


#9

FYI: I have found a bug in multifruits. If the first chunk of data does not include the complete boundary, parsing fails.
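The general technique for this kind of problem is to hold back the last few bytes of each chunk, because they may be the beginning of a boundary that completes in the next chunk. The sketch below is not multifruits code and says nothing about where its bug actually lives; `BoundarySplitter` is a hypothetical helper that treats the boundary as a plain byte string and ignores the CRLF and header handling of real multipart bodies.

```python
class BoundarySplitter:
    """Split a byte stream on a delimiter that may straddle chunk edges."""

    def __init__(self, boundary):
        self.boundary = boundary
        self._tail = b""  # held-back bytes that might start a boundary

    def feed(self, chunk):
        """Return a list of ('data', bytes) and ('boundary', None) items."""
        data = self._tail + chunk
        out = []
        while True:
            index = data.find(self.boundary)
            if index < 0:
                break
            if index:
                out.append(("data", data[:index]))
            out.append(("boundary", None))
            data = data[index + len(self.boundary):]
        # Hold back anything that could still grow into a boundary.
        keep = min(len(data), len(self.boundary) - 1)
        safe = len(data) - keep
        if safe:
            out.append(("data", data[:safe]))
        self._tail = data[safe:]
        return out

    def finish(self):
        """Flush the held-back bytes once the stream has ended."""
        tail, self._tail = self._tail, b""
        return [("data", tail)] if tail else []


splitter = BoundarySplitter(b"--XYZ")
events = splitter.feed(b"abc--X") + splitter.feed(b"YZdef") + splitter.finish()
```

Here the boundary is cut in half between the two chunks, yet the splitter still reports it once the second chunk arrives, and the data pieces can be written out incrementally.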


#10

Really great to see you playing around with Curio! Just a quick note to encourage you (or anyone else) to submit a pull request on the Curio readme with a link to your project if you’re so inclined.


#11

OMG, that would be an honor to me!


#12

My progress so far:

  • Added binders (things to generate and process HTML forms), so processing of HTML forms is much much easier now.
  • Added a bit of wiki pages

My plans:

  • Reorganize code a bit. I don’t like files with 1000+ lines, even though my editor supports folding, etc.
  • Add support for peewee model based binders so CRUD should be very easy.

Need help:

  • With web design of examples. They look bad and not attractive.