
SIMD version of the transposition

Edwin Carlinet requested to merge development/simd-transposition into next
  • The dilation canvas now uses the SIMD-optimized version of the transposition
  • New buffer primitives in mln::bp namespace:
    • Allocations: mln::bp::aligned_alloc_2d, mln::bp::aligned_free_2d
    • Copy: mln::bp::copy
    • Swap: mln::bp::swap
    • Transpose: mln::bp::transpose
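
This description does not show the signature or implementation of mln::bp::transpose. As a rough illustration of the loop structure that an optimized transpose usually builds on, here is a minimal scalar sketch of a tiled in-place transpose of a square buffer. The function name, the tile-size parameter `B`, and the stride expressed in elements are assumptions made for this example and are not part of the mln::bp API; a SIMD version would typically replace the inner tile loops with vector shuffles.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical helper (not the mln::bp API): in-place transpose of an
// n x n buffer whose rows are `stride` elements apart, done in B x B tiles.
template <class T, std::size_t B = 8>
void transpose_inplace_blocked(T* buf, std::size_t n, std::size_t stride)
{
  for (std::size_t bi = 0; bi < n; bi += B)
  {
    const std::size_t i_end = std::min(bi + B, n);
    // Only tiles on or above the diagonal are visited; each swap also
    // updates the mirrored tile below the diagonal.
    for (std::size_t bj = bi; bj < n; bj += B)
    {
      const std::size_t j_end = std::min(bj + B, n);
      for (std::size_t i = bi; i < i_end; ++i)
      {
        // On diagonal tiles, visit only the strict upper triangle so that
        // each pair (i, j), (j, i) is swapped exactly once.
        const std::size_t j_begin = (bi == bj) ? i + 1 : bj;
        for (std::size_t j = j_begin; j < j_end; ++j)
          std::swap(buf[i * stride + j], buf[j * stride + i]);
      }
    }
  }
}
```

Working tile by tile keeps both the source rows and the mirrored destination rows in cache, which is also why sizes that are not a multiple of the tile (such as 131 in the benchmarks) need the `std::min` clamping on tile bounds.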

Performance

| Benchmark | Time | CPU | Time Old | Time New |
|---|---|---|---|---|
| `BMPrimitives<uint8_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint8_t/64` | -0.4923 | -0.4923 | 1403 | 713 |
| `BMPrimitives<uint8_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint8_t/128` | -0.4592 | -0.4592 | 5698 | 3082 |
| `BMPrimitives<uint8_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint8_t/131` | -0.4474 | -0.4474 | 6091 | 3366 |
| `BMPrimitives<uint8_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint8_t/256` | -0.7137 | -0.7137 | 46747 | 13383 |
| `BMPrimitives<uint16_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint16_t/64` | -0.2347 | -0.2347 | 1653 | 1265 |
| `BMPrimitives<uint16_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint16_t/128` | -0.5331 | -0.5331 | 11546 | 5390 |
| `BMPrimitives<uint16_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint16_t/131` | -0.5235 | -0.5236 | 12246 | 5836 |
| `BMPrimitives<uint16_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint16_t/256` | -0.6077 | -0.6078 | 55438 | 21747 |
| `BMPrimitives<uint32_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint32_t/64` | -0.3227 | -0.3227 | 2822 | 1911 |
| `BMPrimitives<uint32_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint32_t/128` | -0.4731 | -0.4732 | 14469 | 7624 |
| `BMPrimitives<uint32_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint32_t/131` | -0.4767 | -0.4767 | 15168 | 7937 |
| `BMPrimitives<uint32_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint32_t/256` | -0.5011 | -0.5011 | 63214 | 31536 |
| `BMPrimitives<uint64_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint64_t/64` | -0.7059 | -0.7059 | 11035 | 3246 |
| `BMPrimitives<uint64_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint64_t/128` | -0.7222 | -0.7222 | 48095 | 13361 |
| `BMPrimitives<uint64_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint64_t/131` | -0.7245 | -0.7245 | 50621 | 13949 |
| `BMPrimitives<uint64_t>/transpose_[inplace_baseline vs. inplace_optimized]_uint64_t/256` | -0.7105 | -0.7105 | 202610 | 58661 |
| `BMPrimitives<float>/transpose_[inplace_baseline vs. inplace_optimized]_float/64` | -0.3464 | -0.3466 | 2945 | 1925 |
| `BMPrimitives<float>/transpose_[inplace_baseline vs. inplace_optimized]_float/128` | -0.4881 | -0.4881 | 14801 | 7577 |
| `BMPrimitives<float>/transpose_[inplace_baseline vs. inplace_optimized]_float/131` | -0.4934 | -0.4933 | 15702 | 7955 |
| `BMPrimitives<float>/transpose_[inplace_baseline vs. inplace_optimized]_float/256` | -0.5099 | -0.5100 | 65050 | 31882 |
| `BMPrimitives<double>/transpose_[inplace_baseline vs. inplace_optimized]_double/64` | -0.6916 | -0.6916 | 10478 | 3232 |
| `BMPrimitives<double>/transpose_[inplace_baseline vs. inplace_optimized]_double/128` | -0.7084 | -0.7085 | 46015 | 13417 |
| `BMPrimitives<double>/transpose_[inplace_baseline vs. inplace_optimized]_double/131` | -0.7111 | -0.7111 | 48427 | 13993 |
| `BMPrimitives<double>/transpose_[inplace_baseline vs. inplace_optimized]_double/256` | -0.6985 | -0.6986 | 196441 | 59229 |
| `BMPrimitives<rgb8>/transpose_[inplace_baseline vs. inplace_optimized]_rgb8/64` | -0.0167 | -0.0168 | 3115 | 3063 |
| `BMPrimitives<rgb8>/transpose_[inplace_baseline vs. inplace_optimized]_rgb8/128` | -0.0416 | -0.0411 | 12636 | 12111 |
| `BMPrimitives<rgb8>/transpose_[inplace_baseline vs. inplace_optimized]_rgb8/131` | -0.0455 | -0.0450 | 13461 | 12849 |
| `BMPrimitives<rgb8>/transpose_[inplace_baseline vs. inplace_optimized]_rgb8/256` | -0.0200 | -0.0196 | 62205 | 60963 |
| `BMPrimitives<rgba8>/transpose_[inplace_baseline vs. inplace_optimized]_rgba8/64` | -0.5446 | -0.5447 | 4217 | 1920 |
| `BMPrimitives<rgba8>/transpose_[inplace_baseline vs. inplace_optimized]_rgba8/128` | -0.5853 | -0.5852 | 18559 | 7697 |
| `BMPrimitives<rgba8>/transpose_[inplace_baseline vs. inplace_optimized]_rgba8/131` | -0.5833 | -0.5833 | 19507 | 8128 |
| `BMPrimitives<rgba8>/transpose_[inplace_baseline vs. inplace_optimized]_rgba8/256` | -0.5981 | -0.5981 | 79421 | 31917 |
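
The Time and CPU columns above look like the relative differences printed by a benchmark comparison tool, that is (new - old) / old, so a negative value means the optimized version is faster. This reading is an assumption, but it matches the reported numbers: for the uint8_t/256 row, (13383 - 46747) / 46747 is about -0.7137. The helpers below are hypothetical names introduced only to make that arithmetic explicit.

```cpp
// Relative change between two timings: negative means the new run is faster.
constexpr double relative_change(double old_time, double new_time)
{
  return (new_time - old_time) / old_time;
}

// Speedup factor implied by the same pair of timings.
constexpr double speedup(double old_time, double new_time)
{
  return old_time / new_time;
}
```

With the uint8_t/256 timings, `speedup(46747, 13383)` is roughly 3.5, whereas the rgb8 rows, with relative changes between about -0.02 and -0.05, are essentially unchanged.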

| Benchmark | Time | CPU | Time Old | Time New |
|---|---|---|---|---|
| `BMMorpho/Dilation_ApproximatedDisc/2_mean` | -22.23% | -22.23% | 73731949 | 57343963 |
| `BMMorpho/Dilation_ApproximatedDisc/4_mean` | -05.48% | -05.47% | 280135607 | 264795373 |
| `BMMorpho/Dilation_ApproximatedDisc/8_mean` | -02.21% | -02.20% | 531229808 | 519475166 |
| `BMMorpho/Dilation_ApproximatedDisc/16_mean` | -05.33% | -05.33% | 564639932 | 534519001 |
| `BMMorpho/Dilation_ApproximatedDisc/32_mean` | -04.92% | -04.90% | 944361277 | 897853724 |
| `BMMorpho/Dilation_ApproximatedDisc/64_mean` | -07.48% | -07.47% | 1300030125 | 1202781142 |
| `BMMorpho/Dilation_ApproximatedDisc/128_mean` | -04.80% | -04.82% | 2292119667 | 2182019690 |
| `BMMorpho/Dilation_ApproximatedDisc_parallel/2_mean` | -06.39% | -08.97% | 23494457 | 21992599 |
| `BMMorpho/Dilation_ApproximatedDisc_parallel/4_mean` | -02.51% | -03.86% | 70858876 | 69081980 |
| `BMMorpho/Dilation_ApproximatedDisc_parallel/8_mean` | -03.06% | -03.18% | 139158983 | 134895752 |
| `BMMorpho/Dilation_ApproximatedDisc_parallel/16_mean` | -03.62% | -03.52% | 143996170 | 138783845 |
| `BMMorpho/Dilation_ApproximatedDisc_parallel/32_mean` | -03.44% | -02.44% | 243122499 | 234752964 |
| `BMMorpho/Dilation_ApproximatedDisc_parallel/64_mean` | -07.70% | -05.48% | 362112873 | 334222909 |
| `BMMorpho/Dilation_ApproximatedDisc_parallel/128_mean` | -07.42% | -06.82% | 716831490 | 663630938 |
| `BMMorpho/Dilation_EuclideanDisc_naive/4_mean` | -08.89% | -08.92% | 3024343370 | 2755607022 |
| `BMMorpho/Dilation_EuclideanDisc_naive/16_mean` | -04.07% | -04.04% | 19220983258 | 18439328510 |
| `BMMorpho/Dilation_EuclideanDisc_incremental/4_mean` | +04.22% | +04.13% | 2112364117 | 2201433522 |
| `BMMorpho/Dilation_EuclideanDisc_incremental/8_mean` | -13.32% | -13.32% | 3172277202 | 2749743068 |
| `BMMorpho/Dilation_EuclideanDisc_incremental/16_mean` | -10.04% | -10.04% | 4373858593 | 3934788860 |
| `BMMorpho/Dilation_EuclideanDisc_incremental/32_mean` | -14.28% | -14.25% | 9020567816 | 7732003076 |
| `BMMorpho/Dilation_EuclideanDisc_incremental/128_mean` | -01.64% | -01.64% | 48347915089 | 47556850789 |
| `BMMorpho/Dilation_Square/2_mean` | -21.65% | -21.64% | 79123420 | 61993627 |
| `BMMorpho/Dilation_Square/4_mean` | -22.38% | -22.36% | 74385806 | 57740013 |
| `BMMorpho/Dilation_Square/8_mean` | -24.04% | -24.05% | 68613793 | 52121601 |
| `BMMorpho/Dilation_Square/16_mean` | -20.15% | -20.17% | 73740573 | 58878521 |
| `BMMorpho/Dilation_Square/32_mean` | -31.60% | -31.62% | 92353130 | 63166265 |
| `BMMorpho/Dilation_Square/64_mean` | -33.38% | -33.40% | 143492419 | 95588052 |
| `BMMorpho/Dilation_Square/128_mean` | -42.82% | -42.84% | 319995133 | 182963168 |
| `BMMorpho/Dilation_Square_parallel/2_mean` | -08.82% | -08.91% | 23400252 | 21336825 |
| `BMMorpho/Dilation_Square_parallel/4_mean` | -10.28% | -09.97% | 22411414 | 20108000 |
| `BMMorpho/Dilation_Square_parallel/8_mean` | -10.06% | -09.53% | 23230937 | 20894365 |
| `BMMorpho/Dilation_Square_parallel/16_mean` | -10.44% | -10.79% | 25446166 | 22789672 |
| `BMMorpho/Dilation_Square_parallel/32_mean` | -09.54% | -10.44% | 32497370 | 29397984 |
| `BMMorpho/Dilation_Square_parallel/64_mean` | -08.81% | -09.18% | 62154230 | 56675758 |
| `BMMorpho/Dilation_Square_parallel/128_mean` | -10.03% | -07.36% | 211265735 | 190080132 |
| `BMMorpho/Opening_Disc_mean` | -02.92% | -02.93% | 1904755125 | 1849192892 |
Edited by Edwin Carlinet
