FuzzBench: 2020-08-12 report

warning
Please treat this as a preliminary report that demonstrates the capabilities of FuzzBench. While we have tried our best, we have not confirmed that everything is configured correctly. We hope to work with the community to validate the results and to improve the set of fuzzers, benchmarks, and their configurations in the future. See the FAQ for more details.

experiment summary

We show two different aggregate (cross-benchmark) rankings of fuzzers. The first is based on the average of per-benchmark scores, where a score is the fuzzer's median reached coverage on a benchmark expressed as a percentage of the highest median coverage any fuzzer reached on that benchmark (higher is better). The second shows the average rank of each fuzzer, after ranking the fuzzers on each benchmark by their median reached coverage (lower is better).
By avg. score
fuzzer average normalized score
aflplusplus_same1 99.91
aflplusplus_same3 99.76
aflplusplus_same2 99.62
By avg. rank
fuzzer average rank
aflplusplus_same1 1.92
aflplusplus_same2 1.95
aflplusplus_same3 2.12
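
Both aggregate rankings can be recomputed from the per-benchmark median coverages listed in this report. The following is a minimal sketch and not FuzzBench's own implementation: the helper names `normalized_scores` and `average_ranks` are hypothetical, and tied medians are assumed to share the mean of the ranks they occupy.

```python
# Median reached coverage per benchmark, copied from this report.
# Column order: aflplusplus_same1, aflplusplus_same2, aflplusplus_same3.
FUZZERS = ["aflplusplus_same1", "aflplusplus_same2", "aflplusplus_same3"]
MEDIANS = {
    "bloaty_fuzz_target": [6880.0, 6870.5, 6887.0],
    "curl_curl_fuzzer_http": [16743.5, 16834.0, 16735.0],
    "freetype2-2017": [17459.0, 17584.0, 17385.5],
    "harfbuzz-1.3.2": [8184.0, 8188.0, 8189.0],
    "jsoncpp_jsoncpp_fuzzer": [639.0, 639.0, 639.0],
    "lcms-2017-03-21": [1319.0, 1311.0, 1317.0],
    "libjpeg-turbo-07-2017": [3324.0, 3325.0, 3324.0],
    "libpcap_fuzz_both": [83.0, 83.0, 83.0],
    "libpng-1.2.56": [1508.0, 1509.0, 1509.0],
    "libxml2-v2.9.2": [6060.0, 6059.5, 6055.5],
    "mbedtls_fuzz_dtlsclient": [7868.0, 7846.5, 7837.0],
    "openssl_x509": [13720.0, 13725.0, 13720.0],
    "openthread-2019-12-23": [5171.0, 5179.0, 5179.0],
    "php_php-fuzz-parser": [42109.0, 42104.5, 42087.5],
    "proj4-2017-08-14": [3968.5, 3767.0, 3916.5],
    "re2-2014-12-09": [3444.5, 3445.0, 3448.5],
    "sqlite3_ossfuzz": [21505.0, 21308.0, 21382.0],
    "vorbis-2017-12-11": [2106.0, 2098.0, 2099.0],
    "woff2-2016-05-06": [1700.0, 1701.0, 1700.0],
    "zlib_zlib_uncompress_fuzzer": [942.0, 942.0, 942.0],
}


def normalized_scores(medians):
    """Average over benchmarks of 100 * median / best median on that benchmark."""
    totals = [0.0] * len(FUZZERS)
    for row in medians.values():
        best = max(row)
        for i, value in enumerate(row):
            totals[i] += 100.0 * value / best
    return {f: t / len(medians) for f, t in zip(FUZZERS, totals)}


def average_ranks(medians):
    """Average rank per fuzzer; rank 1 is the highest median on a benchmark."""
    totals = [0.0] * len(FUZZERS)
    for row in medians.values():
        for i, value in enumerate(row):
            higher = sum(1 for other in row if other > value)
            equal = sum(1 for other in row if other == value)
            # Tied values occupy ranks higher+1 .. higher+equal; use their mean.
            totals[i] += higher + (equal + 1) / 2.0
    return {f: t / len(medians) for f, t in zip(FUZZERS, totals)}


scores = normalized_scores(MEDIANS)
ranks = average_ranks(MEDIANS)
for fuzzer in FUZZERS:
    print(f"{fuzzer}: score={scores[fuzzer]:.2f} avg_rank={ranks[fuzzer]:.3f}")
```

Under these assumptions the sketch gives scores of 99.91, 99.62, and 99.76 and average ranks of 1.925, 1.95, and 2.125 for aflplusplus_same1, aflplusplus_same2, and aflplusplus_same3, which agree with the rounded values in the two tables above.
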
  • Critical difference diagram
    The diagram visualizes the average rank of fuzzers (second ranking above) while showing the significance of the differences as well. What is considered a "critical difference" (CD) is based on the Friedman/Nemenyi post-hoc test. See more in the documentation.
    Note: If a fuzzer does not support all benchmarks, its rank in this diagram can be lower than it should be, so please check the list of benchmarks supported by the fuzzers you are interested in. The list can be specified in the fuzzer's README.md.
  • Median coverages on each benchmark
    benchmark aflplusplus_same1 aflplusplus_same2 aflplusplus_same3
    bloaty_fuzz_target 6880.0 6870.5 6887.0
    curl_curl_fuzzer_http 16743.5 16834.0 16735.0
    freetype2-2017 17459.0 17584.0 17385.5
    harfbuzz-1.3.2 8184.0 8188.0 8189.0
    jsoncpp_jsoncpp_fuzzer 639.0 639.0 639.0
    lcms-2017-03-21 1319.0 1311.0 1317.0
    libjpeg-turbo-07-2017 3324.0 3325.0 3324.0
    libpcap_fuzz_both 83.0 83.0 83.0
    libpng-1.2.56 1508.0 1509.0 1509.0
    libxml2-v2.9.2 6060.0 6059.5 6055.5
    mbedtls_fuzz_dtlsclient 7868.0 7846.5 7837.0
    openssl_x509 13720.0 13725.0 13720.0
    openthread-2019-12-23 5171.0 5179.0 5179.0
    php_php-fuzz-parser 42109.0 42104.5 42087.5
    proj4-2017-08-14 3968.5 3767.0 3916.5
    re2-2014-12-09 3444.5 3445.0 3448.5
    sqlite3_ossfuzz 21505.0 21308.0 21382.0
    vorbis-2017-12-11 2106.0 2098.0 2099.0
    woff2-2016-05-06 1700.0 1701.0 1700.0
    zlib_zlib_uncompress_fuzzer 942.0 942.0 942.0
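
As a rough illustration of the critical difference mentioned above: in the Nemenyi post-hoc test, two of k fuzzers compared on N benchmarks are considered significantly different when their average ranks differ by at least CD = q_alpha * sqrt(k(k+1) / (6N)). The sketch below assumes a significance level of 0.05, which this report does not state, and uses the commonly tabulated critical value 2.343 for k = 3. The function name is hypothetical and this is not the code that produced the diagram.

```python
import math


def nemenyi_critical_difference(num_fuzzers, num_benchmarks, q_alpha):
    """CD = q_alpha * sqrt(k * (k + 1) / (6 * N)) for k fuzzers on N benchmarks."""
    k, n = num_fuzzers, num_benchmarks
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n))


# Assumption: alpha = 0.05; 2.343 is the tabulated Nemenyi critical value for k = 3.
Q_ALPHA_005_K3 = 2.343

# Average ranks as printed in this report.
AVERAGE_RANKS = {
    "aflplusplus_same1": 1.92,
    "aflplusplus_same2": 1.95,
    "aflplusplus_same3": 2.12,
}

cd = nemenyi_critical_difference(len(AVERAGE_RANKS), 20, Q_ALPHA_005_K3)
largest_gap = max(AVERAGE_RANKS.values()) - min(AVERAGE_RANKS.values())
print(f"CD = {cd:.3f}, largest average-rank gap = {largest_gap:.2f}")
print("significant difference" if largest_gap >= cd else "no significant difference")
```

Under these assumptions the critical difference is about 0.74, well above the largest gap between average ranks (0.20), so none of the three fuzzers would be separated by the test.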

bloaty_fuzz_target summary

Ranking by median reached coverage
Reached coverage distribution
Mean coverage growth over time
* The error bands show the 95% confidence interval around the mean coverage.
  • Sample statistics and statistical significance
    Coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus_same3 900 19.0 6891.052632 71.515712 6767.0 6859.50 6887.0 6909.50 7096.0
    aflplusplus_same1 900 20.0 6864.500000 116.620166 6585.0 6836.25 6880.0 6901.75 7033.0
    aflplusplus_same2 900 20.0 6893.650000 81.797745 6796.0 6837.00 6870.5 6915.75 7109.0

    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.
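
To make the pairwise test concrete, here is a minimal pure-Python sketch of a two-sided Mann-Whitney U test using the normal approximation with tie and continuity corrections. It is not the code used to generate this report; the function name `mann_whitney_p` is hypothetical, and the per-trial coverage samples are made up for illustration because the raw per-trial values are not reproduced in this page. For real analyses, a library routine such as `scipy.stats.mannwhitneyu` is the safer choice.

```python
import math
from itertools import combinations


def mann_whitney_p(x, y):
    """Two-sided p-value of the Mann-Whitney U test (normal approximation)."""
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Positions i..j-1 hold equal values; they share the mean of ranks i+1..j.
        mean_rank = (i + 1 + j) / 2.0
        rank_sum_x += mean_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        t = j - i
        tie_term += t ** 3 - t
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        # Every observation is identical, so there is no evidence of a difference.
        return 1.0
    z = max(abs(u - mean_u) - 0.5, 0.0) / math.sqrt(var_u)
    return min(math.erfc(z / math.sqrt(2.0)), 1.0)


# Hypothetical per-trial coverage samples (NOT taken from the experiment data).
samples = {
    "fuzzer_a": [6880, 6901, 6850, 6912, 6875, 6899, 6860, 6930],
    "fuzzer_b": [6870, 6895, 6842, 6920, 6881, 6905, 6855, 6915],
    "fuzzer_c": [7100, 7150, 7125, 7180, 7110, 7160, 7135, 7190],
}

for first, second in combinations(samples, 2):
    p = mann_whitney_p(samples[first], samples[second])
    marker = "significant" if p < 0.05 else "not significant"
    print(f"{first} vs {second}: p = {p:.4f} ({marker} at 0.05)")
```

When every trial of both fuzzers reaches the same coverage, as on jsoncpp_jsoncpp_fuzzer in this report, the variance term is zero and the sketch returns a p-value of 1.0 rather than dividing by zero.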

curl_curl_fuzzer_http summary

Ranking by median reached coverage
Reached coverage distribution
Mean coverage growth over time
* The error bands show the 95% confidence interval around the mean coverage.
  • Sample statistics and statistical significance
    Coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus_same2 900 20.0 16800.05 127.286033 16600.0 16671.5 16834.0 16912.00 16958.0
    aflplusplus_same1 900 20.0 16778.40 115.242262 16612.0 16709.0 16743.5 16850.75 17045.0
    aflplusplus_same3 900 20.0 16746.95 123.138551 16566.0 16665.0 16735.0 16821.75 16947.0

    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

freetype2-2017 summary

Ranking by median reached coverage
Reached coverage distribution
Mean coverage growth over time
* The error bands show the 95% confidence interval around the mean coverage.
  • Sample statistics and statistical significance
    Coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus_same2 900 19.0 17559.842105 398.703630 16952.0 17324.50 17584.0 17687.00 18840.0
    aflplusplus_same1 900 20.0 17490.650000 420.577332 17073.0 17161.75 17459.0 17674.50 18610.0
    aflplusplus_same3 900 20.0 17455.200000 336.952847 16966.0 17260.50 17385.5 17629.75 18533.0

    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

harfbuzz-1.3.2 summary

Ranking by median reached coverage
Reached coverage distribution
Mean coverage growth over time
* The error bands show the 95% confidence interval around the mean coverage.
  • Sample statistics and statistical significance
    Coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus_same3 900 19.0 8194.473684 15.207339 8167.0 8186.0 8189.0 8206.5 8225.0
    aflplusplus_same2 900 19.0 8189.578947 20.605597 8143.0 8180.5 8188.0 8202.0 8226.0
    aflplusplus_same1 900 19.0 8189.578947 15.735013 8172.0 8180.0 8184.0 8193.5 8234.0

    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

jsoncpp_jsoncpp_fuzzer summary

Ranking by median reached coverage
Reached coverage distribution
Mean coverage growth over time
* The error bands show the 95% confidence interval around the mean coverage.
  • Sample statistics and statistical significance
    Coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus_same1 900 20.0 639.0 0.0 639.0 639.0 639.0 639.0 639.0
    aflplusplus_same2 900 20.0 639.0 0.0 639.0 639.0 639.0 639.0 639.0
    aflplusplus_same3 900 18.0 639.0 0.0 639.0 639.0 639.0 639.0 639.0

    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

lcms-2017-03-21 summary

Ranking by median reached coverage
Reached coverage distribution
Mean coverage growth over time
* The error bands show the 95% confidence interval around the mean coverage.
  • Sample statistics and statistical significance
    Coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus_same1 900 20.0 1443.250000 415.324334 1175.0 1256.0 1319.0 1328.0 2408.0
    aflplusplus_same3 900 18.0 1542.944444 459.428606 1171.0 1311.0 1317.0 1335.5 2402.0
    aflplusplus_same2 900 19.0 1443.526316 419.344643 1171.0 1222.0 1311.0 1323.5 2396.0

    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

libjpeg-turbo-07-2017 summary

Ranking by median reached coverage
Reached coverage distribution
Mean coverage growth over time
* The error bands show the 95% confidence interval around the mean coverage.
  • Sample statistics and statistical significance
    Coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus_same2 900 19.0 3325.578947 2.142579 3323.0 3324.0 3325.0 3327.0 3329.0
    aflplusplus_same1 900 20.0 3324.750000 2.048748 3323.0 3323.0 3324.0 3325.0 3330.0
    aflplusplus_same3 900 17.0 3324.823529 2.480809 3323.0 3323.0 3324.0 3325.0 3331.0

    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

libpcap_fuzz_both summary

Ranking by median reached coverage
Reached coverage distribution
Mean coverage growth over time
* The error bands show the 95% confidence interval around the mean coverage.
  • Sample statistics and statistical significance
    Coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus_same1 900 18.0 103.111111 84.826389 83.0 83.0 83.0 83.0 443.0
    aflplusplus_same2 900 20.0 83.000000 0.000000 83.0 83.0 83.0 83.0 83.0
    aflplusplus_same3 900 20.0 83.300000 1.341641 83.0 83.0 83.0 83.0 89.0

    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

libpng-1.2.56 summary

Ranking by median reached coverage
Reached coverage distribution
Mean coverage growth over time
* The error bands show the 95% confidence interval around the mean coverage.
  • Sample statistics and statistical significance
    Coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus_same2 900 20.0 1508.600000 0.598243 1507.0 1508.0 1509.0 1509.00 1509.0
    aflplusplus_same3 900 19.0 1508.473684 0.611775 1507.0 1508.0 1509.0 1509.00 1509.0
    aflplusplus_same1 900 18.0 1508.222222 0.548319 1507.0 1508.0 1508.0 1508.75 1509.0

    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

libxml2-v2.9.2 summary

Ranking by median reached coverage
Reached coverage distribution
Mean coverage growth over time
* The error bands show the 95% confidence interval around the mean coverage.
  • Sample statistics and statistical significance
    Coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus_same1 900 19.0 6075.578947 45.401072 6013.0 6047.00 6060.0 6086.50 6212.0
    aflplusplus_same2 900 20.0 6073.200000 117.949409 5874.0 6030.25 6059.5 6097.25 6307.0
    aflplusplus_same3 900 20.0 6031.450000 121.133017 5725.0 6011.00 6055.5 6084.50 6251.0

    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

mbedtls_fuzz_dtlsclient summary

Ranking by median reached coverage
Reached coverage distribution
Mean coverage growth over time
* The error bands show the 95% confidence interval around the mean coverage.
  • Sample statistics and statistical significance
    Coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus_same1 900 20.0 7870.250000 53.908182 7787.0 7815.00 7868.0 7910.50 7975.0
    aflplusplus_same2 900 20.0 7846.550000 43.582318 7793.0 7813.00 7846.5 7869.25 7967.0
    aflplusplus_same3 900 18.0 7844.555556 42.835145 7785.0 7808.75 7837.0 7878.00 7932.0

    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

openssl_x509 summary

Ranking by median reached coverage
Reached coverage distribution
Mean coverage growth over time
* The error bands show the 95% confidence interval around the mean coverage.
  • Sample statistics and statistical significance
    Coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus_same2 900 20.0 13727.400000 8.923063 13714.0 13720.0 13725.0 13733.5 13743.0
    aflplusplus_same1 900 20.0 13724.350000 11.917590 13698.0 13720.0 13720.0 13733.0 13740.0
    aflplusplus_same3 900 19.0 13720.789474 12.340444 13693.0 13720.0 13720.0 13720.0 13751.0

    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

openthread-2019-12-23 summary

Ranking by median reached coverage
Reached coverage distribution
Mean coverage growth over time
* The error bands show the 95% confidence interval around the mean coverage.
  • Sample statistics and statistical significance
    Coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus_same2 900 19.0 5173.526316 10.420964 5152.0 5177.0 5179.0 5179.0 5180.0
    aflplusplus_same3 900 18.0 5178.444444 15.305602 5154.0 5177.0 5179.0 5180.5 5213.0
    aflplusplus_same1 900 19.0 5166.526316 25.009706 5086.0 5154.5 5171.0 5179.0 5213.0

    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

php_php-fuzz-parser summary

Ranking by median reached coverage
Reached coverage distribution
Mean coverage growth over time
* The error bands show the 95% confidence interval around the mean coverage.
  • Sample statistics and statistical significance
    Coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus_same1 900 20.0 42172.10 190.202081 42078.0 42092.75 42109.0 42136.00 42727.0
    aflplusplus_same2 900 20.0 42195.90 248.188786 42049.0 42079.25 42104.5 42135.00 42864.0
    aflplusplus_same3 900 20.0 42124.45 91.061850 42063.0 42078.25 42087.5 42116.25 42442.0

    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

proj4-2017-08-14 summary

Ranking by median reached coverage
Reached coverage distribution
Mean coverage growth over time
* The error bands show the 95% confidence interval around the mean coverage.
  • Sample statistics and statistical significance
    Coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus_same1 900 18.0 3985.388889 314.729041 3567.0 3672.75 3968.5 4309.00 4456.0
    aflplusplus_same3 900 20.0 3997.400000 330.625468 3551.0 3670.50 3916.5 4315.25 4488.0
    aflplusplus_same2 900 20.0 3884.300000 328.391791 3539.0 3600.00 3767.0 4145.00 4448.0

    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

re2-2014-12-09 summary

Ranking by median reached coverage
Reached coverage distribution
Mean coverage growth over time
* The error bands show the 95% confidence interval around the mean coverage.
  • Sample statistics and statistical significance
    Coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus_same3 900 20.0 3458.80 20.096085 3435.0 3443.0 3448.5 3483.25 3488.0
    aflplusplus_same2 900 20.0 3451.15 17.196312 3432.0 3442.0 3445.0 3447.75 3492.0
    aflplusplus_same1 900 20.0 3456.00 22.016740 3425.0 3442.0 3444.5 3483.00 3487.0

    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

sqlite3_ossfuzz summary

Ranking by median reached coverage
Reached coverage distribution
Mean coverage growth over time
* The error bands show the 95% confidence interval around the mean coverage.
  • Sample statistics and statistical significance
    Coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus_same1 900 20.0 21477.550000 314.773181 20890.0 21247.25 21505.0 21719.00 22014.0
    aflplusplus_same3 900 20.0 21479.100000 366.576035 20907.0 21301.50 21382.0 21686.25 22316.0
    aflplusplus_same2 900 19.0 21318.842105 247.348217 20831.0 21119.50 21308.0 21525.50 21767.0

    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

vorbis-2017-12-11 summary

Ranking by median reached coverage
Reached coverage distribution
Mean coverage growth over time
* The error bands show the 95% confidence interval around the mean coverage.
  • Sample statistics and statistical significance
    Coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus_same1 900 19.0 2106.947368 19.831159 2068.0 2090.50 2106.0 2118.50 2140.0
    aflplusplus_same3 900 20.0 2099.550000 18.588267 2076.0 2085.25 2099.0 2107.25 2142.0
    aflplusplus_same2 900 19.0 2101.315789 25.560500 2033.0 2087.00 2098.0 2122.50 2141.0

    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

woff2-2016-05-06 summary

Ranking by median reached coverage
Reached coverage distribution
Mean coverage growth over time
* The error bands show the 95% confidence interval around the mean coverage.
  • Sample statistics and statistical significance
    Coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus_same2 900 20.0 1704.150000 19.677064 1671.0 1696.0 1701.0 1708.5 1745.0
    aflplusplus_same1 900 20.0 1617.950000 381.527743 0.0 1682.0 1700.0 1708.0 1770.0
    aflplusplus_same3 900 19.0 1704.263158 25.352716 1673.0 1696.0 1700.0 1705.5 1772.0

    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

zlib_zlib_uncompress_fuzzer summary

Ranking by median reached coverage
Reached coverage distribution
Mean coverage growth over time
* The error bands show the 95% confidence interval around the mean coverage.
  • Sample statistics and statistical significance
    Coverage sample statistics
    fuzzer time count mean std min 25% median 75% max
    aflplusplus_same1 900 20.0 942.0 0.0 942.0 942.0 942.0 942.0 942.0
    aflplusplus_same2 900 20.0 942.0 0.0 942.0 942.0 942.0 942.0 942.0
    aflplusplus_same3 900 20.0 942.0 0.0 942.0 942.0 942.0 942.0 942.0

    Mann-Whitney U test
    The table summarizes the p values of pairwise Mann-Whitney U tests. Green cells indicate that the reached coverage distribution of a given fuzzer pair is significantly different.

experiment data

You can download the raw data for this report here.

Check out the documentation on how to create customized reports using this data. Also see some example Colab notebooks for doing custom analysis on the data here.

The experiment was conducted using this FuzzBench commit: 0dc59a26ba262ddcdac58d70ce92384780197104