Interesting fork of FastMM4 for which I now think I understand why it is not merged into the regular FastMM4 repository: [WayBack] GitHub – maximmasiutin/FastMM4-AVX: FastMM4 fork with AVX support and multi-threaded enhancements (faster locking).
The fork does two things:
- it has multi-threading enhancements (faster locking)
- AVX support which seems tough for floating point heavy applications as Delphi generates SSE instructions for them
Reminder to self: how big is that impact and could the locking be separately merged into the base repository?
Eric Grange:
Looking at https://github.com/pleriche/FastMM4/issues/36 and given than the compile generates SSE2 code for floating point, I guess using AVX may be problematic when your code also does a lot of floating point using Delphi code (rather than AVX asm)
It could be that this repository is using only AVX-128, which might not have a penalty as per Advanced Vector Extensions – Wikipedia:
The AVX instructions support both 128-bit and 256-bit SIMD. The 128-bit versions can be useful to improve old code without needing to widen the vectorization, and avoid the penalty of going from SSE to AVX, they are also faster on some early AMD implementations of AVX. This mode is sometimes known as AVX-128.
Via: [WayBack] Do you use Embarcadero version of FastMM or the “official” bleeding edge version of FastMM from the gitHub repository? Any idea what are the difference… – Tommi Prami – Google+
–jeoren