Log in to h4cker, then connect Hacker News to publish comments.
DZdzaima33 分前
The crappiness of shrink-wrapping in gcc and clang (but especially clang) annoys me a lot. It feels like there should be a quite decent amount of general performance to be gained from properly pushing more into slow paths (or, not necessarily even slow paths, but generally paths with high register pressure / uninlined function calls), never mind calling conventions in general.
On the push impl in the article - for non-x86 (and perhaps even on x86 for performance, though not size/instruction count) it'd be better to allow the size increment to reuse the size read done by the capacity check; with C++'s lack of aliasing information, the interleaved memcpy/store prevents the compiler from deciding this itself.
Comments
2 preview comments · loading full threadLog in to h4cker, then connect Hacker News to publish comments.
The crappiness of shrink-wrapping in gcc and clang (but especially clang) annoys me a lot. It feels like there should be a quite decent amount of general performance to be gained from properly pushing more into slow paths (or, not necessarily even slow paths, but generally paths with high register pressure / uninlined function calls), never mind calling conventions in general. On the push impl in the article - for non-x86 (and perhaps even on x86 for performance, though not size/instruction count) it'd be better to allow the size increment to reuse the size read done by the capacity check; with C++'s lack of aliasing information, the interleaved memcpy/store prevents the compiler from deciding this itself.
[deleted]