The crappiness of shrink-wrapping in gcc and clang (but especially clang) annoys me a lot. It feels like there should be a decent amount of general performance to be gained from properly pushing more of the register save/spill work into slow paths (or not necessarily even slow paths, but any paths with high register pressure or uninlined function calls), never mind calling conventions in general.
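To make the complaint concrete, here is a minimal sketch of the usual manual workaround: move the expensive branch into a separate out-of-line function, so the hot function has nothing that needs a frame or callee-saved registers. `Counters`, `record`, and `on_overflow` are hypothetical names invented for illustration, and `[[gnu::noinline]]` / `[[gnu::cold]]` are GCC/Clang-specific attributes. Whether this actually produces a frame-free fast path depends on the compiler version and flags, so treat it as something to check in the disassembly rather than a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical saturating-counter table; all names are illustrative.
struct Counters {
    std::uint32_t hits[64] = {};
    std::uint64_t overflow_events = 0;
};

// Slow path, kept out of line so that whatever registers it needs are
// saved inside this function instead of in every caller's prologue.
[[gnu::noinline]] [[gnu::cold]]
void on_overflow(Counters& c, std::size_t slot) {
    c.overflow_events++;
    c.hits[slot] = 0;  // reset the saturated counter
}

// Fast path: one load, one compare, one store. The slow path is reached
// through a call in tail position, so the common case has no work that
// must survive across a call.
inline void record(Counters& c, std::size_t slot) {
    assert(slot < 64);  // caller is assumed to pass a valid slot
    std::uint32_t h = c.hits[slot];
    if (h == UINT32_MAX) {
        on_overflow(c, slot);
        return;
    }
    c.hits[slot] = h + 1;
}
```

The point of the complaint is that a better shrink-wrapping pass would get this effect without having to split the function by hand.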
On the push impl in the article: for non-x86 targets (and perhaps even on x86 for performance, though not for size or instruction count) it'd be better to let the size increment reuse the size read already done by the capacity check. Because C++ lacks suitable aliasing information, the interleaved memcpy/store prevents the compiler from making that decision itself.
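A rough sketch of what that means, using a made-up `IntVec` rather than the article's actual type (`IntVec`, `grow`, `push_reload`, and `push_reuse` are all hypothetical names). In the first version the `memcpy` sits between the capacity check and the increment; since the compiler generally has to assume the copied bytes might overlap `v.size`, it may reload `size` before incrementing it. The second version reads `size` once into a local and stores `n + 1`. Actual codegen varies by compiler and target, so this is only an illustration of the source-level change.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical growable int buffer; not the article's type.
struct IntVec {
    int* data = nullptr;
    std::size_t size = 0;
    std::size_t cap = 0;
};

// Doubles the capacity (starting at 4). Leaves size untouched.
void grow(IntVec& v) {
    std::size_t new_cap = v.cap ? v.cap * 2 : 4;
    assert(new_cap > v.cap);
    void* p = std::realloc(v.data, new_cap * sizeof(int));
    if (!p) std::abort();
    v.data = static_cast<int*>(p);
    v.cap = new_cap;
}

// Baseline: the memcpy may alias v.size as far as the compiler knows,
// so the increment can end up as a fresh load + add + store.
void push_reload(IntVec& v, int x) {
    if (v.size == v.cap) grow(v);
    std::memcpy(&v.data[v.size], &x, sizeof x);
    v.size++;
}

// Variant: read size once, reuse it for the check, the index, and the
// final store. Relies on grow() not changing size.
void push_reuse(IntVec& v, int x) {
    std::size_t n = v.size;
    if (n == v.cap) grow(v);
    std::memcpy(v.data + n, &x, sizeof x);
    v.size = n + 1;
}
```

On x86 the baseline's increment can be a single read-modify-write instruction, which is why the difference shows up mainly on load/store architectures.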