In my last post, Boosting Hyper with MAY, you can see that the coroutine version is slower than the future version when running with a single worker thread. I was curious about why it is slow, and a few questions came to mind. Maybe the context switching cost is too high? Or maybe the logic of the thread version is not as optimized as that of the future version? After all, the thread version of hyper is not as actively developed as the master branch.
So I decided to profile the server to see what actually happens.
Install the profiling tools
I use cargo-profile. It is no longer actively developed, but it is still usable.
I used the following command to install it on my Ubuntu VM:
$ sudo apt-get install valgrind
Modify the echo server
We need to let the server exit normally to make the profiling tool happy. Just insert this code in the echo server example:
fn main() {
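One way to get a clean exit can be sketched with only the standard library: spawn a helper thread that sleeps for a fixed time and then ends the process. This is a minimal sketch, not the code from the hyper example. The `run_after` helper name and the 15 second timeout are assumptions chosen for illustration, and the `loop` at the end merely stands in for the blocking server call.

```rust
use std::process;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: run `on_expire` on a background thread after `delay`.
fn run_after<F>(delay: Duration, on_expire: F) -> thread::JoinHandle<()>
where
    F: FnOnce() + Send + 'static,
{
    thread::spawn(move || {
        thread::sleep(delay);
        on_expire();
    })
}

fn main() {
    // Assumed timeout: a little longer than the 10 second wrk run,
    // so the profiler sees the whole benchmark before the process exits.
    run_after(Duration::from_secs(15), || process::exit(0));

    // Stand-in for the blocking server loop; the real echo server
    // would be started here instead.
    loop {
        thread::park();
    }
}
```

Taking the action as a closure keeps the timer logic separate from `process::exit`, so the helper can be exercised without terminating the process.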
Running the profile
We use the release build to see what happens inside hyper.
$ cd hyper
And run the client in another terminal at the same time.
$ wrk http://127.0.0.1:3000 -d 10 -t 2 -c 20
After 10 seconds you will see the result printed out by cargo-profile.
The may version result
70,276,542 (8.9%) ???:_..std..io..Write..write_fmt..Adaptor....a$C$..T....as..core..fmt..Write..::write_str
The future version result
76,417,403 (12.0%) memcpy-sse2-unaligned.S:__memcpy_sse2_unaligned
Conclusion
Apparently the hyper thread version is not fully optimized. It spends most of its time formatting strings, while the future version spends most of its time copying memory, which makes sense since the server just echoes strings back.
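The `write_fmt` symbol at the top of the may profile is the path taken by the `write!` macro on an `std::io::Write` value. To illustrate the difference, here is a minimal sketch that emits the same header bytes once through `write!` and once with plain `write_all` calls that bypass `core::fmt`. The function names and the header used are hypothetical, and this is not hyper's actual code.

```rust
use std::io::{self, Write};

/// Writes a header line through the formatting machinery (`write!` calls `write_fmt`).
fn header_formatted<W: Write>(w: &mut W, name: &str, value: &str) -> io::Result<()> {
    write!(w, "{}: {}\r\n", name, value)
}

/// Writes the same bytes directly, without going through `core::fmt`.
fn header_raw<W: Write>(w: &mut W, name: &str, value: &str) -> io::Result<()> {
    w.write_all(name.as_bytes())?;
    w.write_all(b": ")?;
    w.write_all(value.as_bytes())?;
    w.write_all(b"\r\n")
}

fn main() -> io::Result<()> {
    let mut formatted = Vec::new();
    let mut raw = Vec::new();
    header_formatted(&mut formatted, "Content-Length", "5")?;
    header_raw(&mut raw, "Content-Length", "5")?;
    // Both paths produce identical output; only the work done differs.
    assert_eq!(formatted, raw);
    Ok(())
}
```

Whether the raw path is measurably faster in a real server depends on the workload, so it would need to be profiled in the same way as above.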
We also notice that the future-based version spends a noticeable amount of time in the framework's poll method:
(3.5%) ???:tokio_core::reactor::Core::poll
The context switch overhead of the may runtime is not that heavy. I found this in the profile:
(1.8%) ???:_..F..as..generator..gen_impl..FnBox..::call_box
P.S.
For a fair comparison of simple HTTP echo servers, I think the most suitable candidates are tokio_minihttp and the http example in the may project. They don't involve much framework code and just do the echo.
Below are the results on my Ubuntu VM.
tokio_minihttp (1 thread)
$ wrk http://127.0.0.1:8080 -d 10 -t 2 -c 20
may http example (1 thread)
$ wrk http://127.0.0.1:8080 -d 10 -t 2 -c 20
may http example (2 threads)
$ wrk http://127.0.0.1:8080 -d 10 -t 2 -c 20