Intel MPI Benchmarks
Intel MPI Benchmarks, formerly known as the Pallas MPI Benchmarks (PMB), is a concise, easy-to-use set of MPI benchmarks for comparing the performance of computing platforms and MPI implementations. It exercises many MPI communication patterns, automatically detects clustering, and reports intra-cluster and inter-cluster performance. The benchmarks measure important MPI functions, such as:
- Point-to-point message passing
- Global data movement and computation routines
- One-sided communications
- File I/O
Intel MPI Benchmarks can be downloaded at http://www.intel.com/cd/software/products/asmo-na/eng/219848.htm

    ##### sweetie ##### rantanplan
    #---------------------------------------------------
    #    Intel (R) MPI Benchmark Suite V3.0, MPI-1 part
    #---------------------------------------------------
    # Date                  : Tue Apr 24 16:14:00 2007
    # Machine               : x86_64
    # System                : Linux
    # Release               : 2.6.17-2-amd64
    # Version               : #1 SMP Wed Sep 13 17:49:33 CEST 2006
    # MPI Version           : 1.0
    # MPI Thread Environment: MPI_THREAD_SINGLE
    #
    # Minimum message length in bytes:   0
    # Maximum message length in bytes:   4194304
    #
    # MPI_Datatype                   :   MPI_BYTE
    # MPI_Datatype for reductions    :   MPI_FLOAT
    # MPI_Op                         :   MPI_SUM
    #
    #
    # List of Benchmarks to run:
    # PingPong
    # PingPing
    # Sendrecv
    # Exchange
    # Allreduce
    # Reduce
    # Reduce_scatter
    # Allgather
    # Allgatherv
    # Alltoall
    # Alltoallv
    # Bcast
    # Barrier
    #---------------------------------------------------
    # Benchmarking PingPong
    # #processes = 2
    #---------------------------------------------------
      #bytes #repetitions     t[usec]  Mbytes/sec
           0         1000        2.87        0.00
           1         1000        2.86        0.33
           2         1000        2.86        0.67
           4         1000        2.92        1.31
           8         1000        2.93        2.60
          16         1000        2.99        5.11
          32         1000        3.77        8.09
          64         1000        3.93       15.54
         128         1000        5.37       22.73
         256         1000        5.64       43.26
         512         1000        6.31       77.33
        1024         1000        8.24      118.56
        2048         1000        9.93      196.66
        4096         1000       14.62      267.15
        8192         1000       19.69      396.73
       16384         1000       29.52      529.38
       32768         1000       44.16      707.58
       65536          640       80.51      776.28
      131072          320      133.66      935.20
      262144          160      240.73     1038.50
      524288           80      460.54     1085.68
     1048576           40      884.54     1130.53
     2097152           20     1753.16     1140.79
     4194304           10     3521.06     1136.02
    #---------------------------------------------------
    # Benchmarking PingPing
    # #processes = 2
    #---------------------------------------------------
      #bytes #repetitions     t[usec]  Mbytes/sec
           0         1000        3.25        0.00
           1         1000        3.39        0.28
           2         1000        3.26        0.58
           4         1000        3.23        1.18
           8         1000        3.19        2.39
          16         1000        3.24        4.71
          32         1000        4.17        7.32
          64         1000        4.24       14.40
         128         1000        6.15       19.84
         256         1000        6.50       37.55
         512         1000        6.90       70.72
        1024         1000        9.55      102.31
        2048         1000       12.70      153.84
        4096         1000       20.20      193.40
        8192         1000       24.75      315.70
       16384         1000       40.36      387.13
       32768         1000       62.41      500.74
       65536          640       86.39      723.49
      131072          320      142.30      878.46
      262144          160      251.60      993.66
      524288           80      473.08     1056.90
     1048576           40      906.45     1103.21
     2097152           20     1783.21     1121.57
     4194304           10     3719.95     1075.28
    #-----------------------------------------------------------------------------
    # Benchmarking Sendrecv
    # #processes = 2
    #-----------------------------------------------------------------------------
      #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]  Mbytes/sec
           0         1000        3.19        3.19        3.19        0.00
           1         1000        3.50        3.50        3.50        0.55
           2         1000        3.22        3.22        3.22        1.18
           4         1000        3.25        3.26        3.25        2.34
           8         1000        3.29        3.29        3.29        4.64
          16         1000        3.41        3.41        3.41        8.95
          32         1000        3.95        3.95        3.95       15.44
          64         1000        4.18        4.18        4.18       29.18
         128         1000        6.11        6.11        6.11       39.94
         256         1000        6.42        6.42        6.42       76.04
         512         1000        7.05        7.06        7.06      138.33
        1024         1000        9.61        9.62        9.62      203.03
        2048         1000       12.70       12.70       12.70      307.49
        4096         1000       20.23       20.24       20.24      385.96
        8192         1000       24.83       24.84       24.83      629.00
       16384         1000       40.94       40.94       40.94      763.32
       32768         1000       62.65       62.65       62.65      997.55
       65536          640       84.92       84.93       84.92     1471.83
      131072          320      138.24      138.26      138.25     1808.24
      262144          160      247.68      247.71      247.70     2018.46
      524288           80      472.36      472.42      472.39     2116.78
     1048576           40      896.94      897.10      897.02     2229.40
     2097152           20     1816.89     1817.02     1816.95     2201.41
     4194304           10     3623.87     3624.47     3624.17     2207.22
    #-----------------------------------------------------------------------------
    # Benchmarking Exchange
    # #processes = 2
    #-----------------------------------------------------------------------------
      #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]  Mbytes/sec
           0         1000        5.42        5.42        5.42        0.00
           1         1000        5.60        5.61        5.61        0.68
           2         1000        5.60        5.60        5.60        1.36
           4         1000        5.68        5.68        5.68        2.69
           8         1000        5.99        5.99        5.99        5.09
          16         1000        6.12        6.12        6.12        9.97
          32         1000        6.87        6.87        6.87       17.76
          64         1000        7.19        7.19        7.19       33.95
         128         1000       10.41       10.41       10.41       46.89
         256         1000       10.69       10.70       10.70       91.26
         512         1000       11.71       11.72       11.72      166.64
        1024         1000       17.41       17.42       17.41      224.28
        2048         1000       22.94       22.95       22.95      340.37
        4096         1000       36.38       36.40       36.39      429.31
        8192         1000       44.18       44.20       44.19      707.08
       16384         1000       74.18       74.20       74.19      842.33
       32768         1000      120.43      120.44      120.44     1037.82
       65536          640      167.05      167.06      167.06     1496.43
      131072          320      274.43      274.45      274.44     1821.82
      262144          160      492.14      492.18      492.16     2031.77
      524288           80      932.26      932.35      932.30     2145.13
     1048576           40     1785.98     1786.15     1786.07     2239.45
     2097152           20     3549.29     3549.54     3549.41     2253.81
     4194304           10     7116.37     7116.86     7116.62     2248.18
    #----------------------------------------------------------------
    # Benchmarking Allreduce
    # #processes = 2
    #----------------------------------------------------------------
      #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
           0         1000        6.14        6.14        6.14
           4         1000        6.27        6.27        6.27
           8         1000        6.40        6.40        6.40
          16         1000        6.49        6.49        6.49
          32         1000        7.93        7.93        7.93
          64         1000        8.26        8.26        8.26
         128         1000       11.13       11.13       11.13
         256         1000       11.81       11.81       11.81
         512         1000       13.27       13.27       13.27
        1024         1000       17.37       17.38       17.37
        2048         1000       21.07       21.08       21.08
        4096         1000       31.62       31.63       31.63
        8192         1000       45.51       45.52       45.51
       16384         1000       71.46       71.47       71.47
       32768         1000      117.28      117.30      117.29
       65536          640      223.33      223.35      223.34
      131072          320      398.66      398.69      398.68
      262144          160      927.19      927.26      927.23
      524288           80     2113.77     2113.92     2113.85
     1048576           40     4825.84     4826.19     4826.01
     2097152           20     9830.98     9831.67     9831.32
     4194304           10    19839.45    19840.84    19840.15
    #----------------------------------------------------------------
    # Benchmarking Reduce
    # #processes = 2
    #----------------------------------------------------------------
      #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
           0         1000        3.26        3.26        3.26
           4         1000        3.22        3.22        3.22
           8         1000        3.24        3.24        3.24
          16         1000        3.23        3.23        3.23
          32         1000        4.05        4.05        4.05
          64         1000        4.13        4.13        4.13
         128         1000        5.63        5.63        5.63
         256         1000        6.02        6.03        6.02
         512         1000        6.81        6.81        6.81
        1024         1000        8.96        8.96        8.96
        2048         1000       11.14       11.15       11.14
        4096         1000       16.92       16.93       16.92
        8192         1000       25.51       25.53       25.52
       16384         1000       42.67       42.70       42.69
       32768         1000       71.35       71.40       71.37
       65536          640      142.00      142.08      142.04
      131072          320      256.21      256.56      256.38
      262144          160      719.45      721.31      720.38
      524288           80     1739.19     1749.91     1744.55
     1048576           40     3961.00     4015.83     3988.42
     2097152           20     7935.42     8150.40     8042.91
     4194304           10    15676.97    16552.24    16114.60
    #----------------------------------------------------------------
    # Benchmarking Reduce_scatter
    # #processes = 2
    #----------------------------------------------------------------
      #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
           0         1000        6.33        6.33        6.33
           4         1000        6.42        6.42        6.42
           8         1000        6.51        6.51        6.51
          16         1000        6.54        6.54        6.54
          32         1000        7.26        7.26        7.26
          64         1000        8.16        8.16        8.16
         128         1000        9.86        9.86        9.86
         256         1000       11.73       11.74       11.73
         512         1000       12.91       12.91       12.91
        1024         1000       15.75       15.75       15.75
        2048         1000       20.09       20.09       20.09
        4096         1000       28.51       28.52       28.51
        8192         1000       46.88       46.89       46.88
       16384         1000       77.12       77.12       77.12
       32768         1000      129.00      129.00      129.00
       65536          640      265.95      265.95      265.95
      131072          320      529.43      529.52      529.48
      262144          160     1256.84     1257.31     1257.07
      524288           80     2702.63     2705.38     2704.01
     1048576           40     5651.73     5664.87     5658.30
     2097152           20    11320.39    11372.71    11346.55
     4194304           10    22465.48    22661.20    22563.34
    #----------------------------------------------------------------
    # Benchmarking Allgather
    # #processes = 2
    #----------------------------------------------------------------
      #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
           0         1000        6.52        6.52        6.52
           1         1000        6.78        6.78        6.78
           2         1000        6.74        6.74        6.74
           4         1000        6.63        6.63        6.63
           8         1000        6.78        6.78        6.78
          16         1000        6.94        6.94        6.94
          32         1000        8.24        8.24        8.24
          64         1000        8.51        8.51        8.51
         128         1000       11.35       11.35       11.35
         256         1000       11.85       11.85       11.85
         512         1000       13.20       13.20       13.20
        1024         1000       17.23       17.23       17.23
        2048         1000       20.42       20.43       20.43
        4096         1000       30.21       30.22       30.21
        8192         1000       40.58       40.59       40.59
       16384         1000       63.64       63.66       63.65
       32768         1000      100.10      100.12      100.11
       65536          640      181.76      181.77      181.77
      131072          320      305.95      305.97      305.96
      262144          160      555.82      555.85      555.83
      524288           80     1237.10     1237.19     1237.15
     1048576           40     2799.83     2800.03     2799.93
     2097152           20     5532.60     5533.03     5532.82
     4194304           10    10964.68    10965.52    10965.10
    #----------------------------------------------------------------
    # Benchmarking Allgatherv
    # #processes = 2
    #----------------------------------------------------------------
      #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
           0         1000        6.39        6.39        6.39
           1         1000        6.51        6.51        6.51
           2         1000        6.52        6.52        6.52
           4         1000        6.52        6.52        6.52
           8         1000        6.63        6.63        6.63
          16         1000        7.34        7.34        7.34
          32         1000        8.25        8.25        8.25
          64         1000        9.78        9.79        9.79
         128         1000       11.66       11.66       11.66
         256         1000       12.79       12.79       12.79
         512         1000       15.15       15.15       15.15
        1024         1000       18.76       18.76       18.76
        2048         1000       25.09       25.10       25.10
        4096         1000       34.93       34.94       34.94
        8192         1000       50.28       50.29       50.29
       16384         1000       81.21       81.23       81.22
       32768         1000      135.99      136.00      135.99
       65536          640      234.22      234.24      234.23
      131072          320      412.98      413.01      412.99
      262144          160      772.88      772.93      772.91
      524288           80     1668.40     1668.52     1668.46
     1048576           40     3654.93     3655.20     3655.06
     2097152           20     7239.78     7240.32     7240.05
     4194304           10    14467.96    14469.02    14468.49
    #----------------------------------------------------------------
    # Benchmarking Alltoall
    # #processes = 2
    #----------------------------------------------------------------
      #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
           0         1000        3.65        3.66        3.66
           1         1000        3.73        3.73        3.73
           2         1000        3.90        3.91        3.90
           4         1000        3.91        3.91        3.91
           8         1000        3.62        3.62        3.62
          16         1000        4.00        4.00        4.00
          32         1000        4.68        4.69        4.69
          64         1000        4.75        4.75        4.75
         128         1000        6.27        6.27        6.27
         256         1000        6.52        6.52        6.52
         512         1000        7.22        7.23        7.22
        1024         1000       10.02       10.02       10.02
        2048         1000       13.18       13.19       13.19
        4096         1000       21.32       21.33       21.32
        8192         1000       26.64       26.65       26.64
       16384         1000       47.82       47.83       47.82
       32768         1000       71.15       71.16       71.16
       65536          640      105.91      105.92      105.92
      131072          320      179.49      179.51      179.50
      262144          160      327.98      328.02      328.00
      524288           80      800.14      800.25      800.20
     1048576           40     1952.86     1953.08     1952.97
     2097152           20     3889.12     3889.64     3889.38
     4194304           10     7741.64     7742.57     7742.10
    #----------------------------------------------------------------
    # Benchmarking Alltoallv
    # #processes = 2
    #----------------------------------------------------------------
      #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
           0         1000        3.64        3.64        3.64
           1         1000        3.78        3.78        3.78
           2         1000        3.64        3.64        3.64
           4         1000        3.59        3.59        3.59
           8         1000        4.07        4.07        4.07
          16         1000        3.66        3.66        3.66
          32         1000        4.56        4.56        4.56
          64         1000        4.82        4.83        4.82
         128         1000        6.25        6.25        6.25
         256         1000        6.72        6.73        6.73
         512         1000        7.41        7.41        7.41
        1024         1000        9.85        9.85        9.85
        2048         1000       13.05       13.05       13.05
        4096         1000       21.07       21.08       21.07
        8192         1000       27.10       27.11       27.11
       16384         1000       47.79       47.80       47.79
       32768         1000       71.22       71.22       71.22
       65536          640      103.44      103.45      103.44
      131072          320      175.67      175.69      175.68
      262144          160      323.91      323.96      323.94
      524288           80      805.34      805.39      805.36
     1048576           40     1984.66     1984.78     1984.72
     2097152           20     3948.71     3948.93     3948.82
     4194304           10     7956.05     7956.52     7956.28
    #----------------------------------------------------------------
    # Benchmarking Bcast
    # #processes = 2
    #----------------------------------------------------------------
      #bytes #repetitions t_min[usec] t_max[usec] t_avg[usec]
           0         1000        3.07        3.07        3.07
           1         1000        3.17        3.17        3.17
           2         1000        3.22        3.22        3.22
           4         1000        3.15        3.16        3.15
           8         1000        3.22        3.23        3.23
          16         1000        3.24        3.25        3.25
          32         1000        4.00        4.00        4.00
          64         1000        4.10        4.10        4.10
         128         1000        5.63        5.64        5.63
         256         1000        5.90        5.90        5.90
         512         1000        6.44        6.45        6.45
        1024         1000        8.43        8.44        8.43
        2048         1000       10.03       10.05       10.04
        4096         1000       14.68       14.69       14.68
        8192         1000       20.02       20.04       20.03
       16384         1000       30.12       30.15       30.14
       32768         1000       45.36       45.39       45.38
       65536          640       82.28       82.29       82.28
      131072          320      135.47      135.49      135.48
      262144          160      245.56      245.60      245.58
      524288           80      468.75      468.84      468.79
     1048576           40      890.68      890.85      890.77
     2097152           20     1775.95     1776.29     1776.12
     4194304           10     3610.93     3611.58     3611.25
    #---------------------------------------------------
    # Benchmarking Barrier
    # #processes = 2
    #---------------------------------------------------
    #repetitions t_min[usec] t_max[usec] t_avg[usec]
            1000       11.54       11.54       11.54
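
The Mbytes/sec and #repetitions columns in the output above can be cross-checked with a few lines of Python. This is a minimal sketch, not the benchmark's own code. It rests on assumptions inferred from the listing: 1 Mbyte is taken as 2**20 bytes, PingPong and PingPing count one message per sample, Sendrecv counts two, Exchange counts four, and the Sendrecv and Exchange figures use t_max. The repetition rule (at most 1000 repetitions, capped so that roughly 40 Mbytes move per message size) is likewise only a pattern that matches the numbers shown. The function names are hypothetical.

```python
MIB = 2 ** 20  # assumption: the "Mbytes" in the listing are 2**20 bytes

# Assumed number of messages counted per sample, inferred from the listing.
MESSAGES_PER_SAMPLE = {"PingPong": 1, "PingPing": 1, "Sendrecv": 2, "Exchange": 4}


def throughput_mbytes_per_sec(benchmark, nbytes, t_usec):
    """Recompute the Mbytes/sec column from message size and time in usec."""
    if t_usec <= 0:
        raise ValueError("t_usec must be positive")
    seconds = t_usec * 1e-6
    return MESSAGES_PER_SAMPLE[benchmark] * nbytes / MIB / seconds


def repetitions(nbytes, max_reps=1000, volume_cap=40 * MIB):
    """Repetition count consistent with the #repetitions column (assumed rule)."""
    if nbytes == 0:
        return max_reps
    return max(1, min(max_reps, volume_cap // nbytes))


if __name__ == "__main__":
    # Largest message size of each point-to-point benchmark in the listing.
    rows = [
        ("PingPong", 4194304, 3521.06),
        ("PingPing", 4194304, 3719.95),
        ("Sendrecv", 4194304, 3624.47),  # t_max column
        ("Exchange", 4194304, 7116.86),  # t_max column
    ]
    for name, nbytes, t_usec in rows:
        rate = throughput_mbytes_per_sec(name, nbytes, t_usec)
        print(f"{name:9s} {nbytes:8d} bytes  {rate:8.2f} Mbytes/sec  "
              f"{repetitions(nbytes):4d} repetitions")
```

Run against other rows of the listing, the recomputed values agree with the reported ones to within the rounding of the printed times.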