/* Take the current entry from the queue, fuzz it for a while. This
   function is a tad too long... returns 0 if fuzzed successfully, 1 if
   skipped or bailed out. */

static u8 fuzz_one(char** argv) {

  s32 len, fd, temp_len, i, j;
  u8  *in_buf, *out_buf, *orig_in, *ex_tmp, *eff_map = 0;
  u64 havoc_queued, orig_hit_cnt, new_hit_cnt;
  u32 splice_cycle = 0, perf_score = 100, orig_perf, prev_cksum, eff_cnt = 1;

  u8  ret_val = 1, doing_det = 0;

  u8  a_collect[MAX_AUTO_EXTRA];
  u32 a_len = 0;
/* Uncomment this to use instrumentation data to record newly discovered paths,
   but do not use them as seeds for fuzzing. This is useful for conveniently
   measuring coverage that could be attained by a "dumb" fuzzing algorithm: */

// #define IGNORE_FINDS
#else

  if (pending_favored) {

    /* If we have any favored, non-fuzzed new arrivals in the queue,
       possibly skip to them at the expense of already-fuzzed or non-favored
       cases. */

    if ((queue_cur->was_fuzzed || !queue_cur->favored) &&
        UR(100) < SKIP_TO_NEW_PROB) return 1;

  }
If the pending-favored check above does not skip the entry, execution falls through to the next branch. That branch is entered when:
there are no pending favored seeds left in the queue;
we are not in dumb_mode (i.e., the no-instrumentation mode);
the current queue entry is not favored;
the queue holds more than 10 test cases in total.
In that case, if:
more than one full queue cycle has already been completed, and
the current queue entry has never been fuzzed,
then the entry is skipped with 75% probability (SKIP_NFAV_NEW_PROB); otherwise it is skipped with 95% probability (SKIP_NFAV_OLD_PROB).
else if (!dumb_mode && !queue_cur->favored && queued_paths > 10) {

    /* Otherwise, still possibly skip non-favored cases, albeit less often.
       The odds of skipping stuff are higher for already-fuzzed inputs and
       lower for never-fuzzed entries. */

    if (queue_cycle > 1 && !queue_cur->was_fuzzed) {

      if (UR(100) < SKIP_NFAV_NEW_PROB) return 1;

    } else {

      if (UR(100) < SKIP_NFAV_OLD_PROB) return 1;

    }

  }

#endif /* ^IGNORE_FINDS */

  if (not_on_tty) {
    ACTF("Fuzzing test case #%u (%u total, %llu uniq crashes found)...",
         current_entry, queued_paths, unique_crashes);
    fflush(stdout);
  }
Coverage-based Greybox Fuzzing as Markov Chain (AFLFast)
Fetching the input
/* Map the test case into memory. */

  fd = open(queue_cur->fname, O_RDONLY);

  if (fd < 0) PFATAL("Unable to open '%s'", queue_cur->fname);

  len = queue_cur->len;

  orig_in = in_buf = mmap(0, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);

  if (orig_in == MAP_FAILED) PFATAL("Unable to mmap '%s'", queue_cur->fname);

  close(fd);

  /* We could mmap() out_buf as MAP_PRIVATE, but we end up clobbering every
     single byte anyway, so it wouldn't give us any performance or memory usage
     benefits. */

  out_buf = ck_alloc_nozero(len);

  subseq_tmouts = 0;

  cur_depth = queue_cur->depth;
/*******************************************
   * CALIBRATION (only if failed earlier on) *
   *******************************************/

  if (queue_cur->cal_failed) {

    u8 res = FAULT_TMOUT;

    if (queue_cur->cal_failed < CAL_CHANCES) {

      /* Reset exec_cksum to tell calibrate_case to re-execute the testcase
         avoiding the usage of an invalid trace_bits.
         For more info: https://github.com/AFLplusplus/AFLplusplus/pull/425 */

      queue_cur->exec_cksum = 0;

      res = calibrate_case(argv, queue_cur, in_buf, queue_cycle - 1, 0);

      if (res == FAULT_ERROR)
        FATAL("Unable to execute target application");

    }
/* Skip right away if -d is given, if we have done deterministic fuzzing on
     this entry ourselves (was_fuzzed), or if it has gone through deterministic
     testing in earlier, resumed runs (passed_det). */

  if (skip_deterministic || queue_cur->was_fuzzed || queue_cur->passed_det)
    goto havoc_stage;
/* Skip deterministic fuzzing if exec path checksum puts this out of scope
     for this master instance. */

  if (master_max && (queue_cur->exec_cksum % master_max) != master_id - 1)
    goto havoc_stage;

  doing_det = 1;
/* While flipping the least significant bit in every byte, pull off an extra
     trick to detect possible syntax tokens. In essence, the idea is that if
     you have a binary blob like this:

     xxxxxxxxIHDRxxxxxxxx

     ...and changing the leading and trailing bytes causes variable or no
     changes in program flow, but touching any character in the "IHDR" string
     always produces the same, distinctive path, it's highly likely that
     "IHDR" is an atomically-checked magic value of special significance to
     the fuzzed format.

     We do this here, rather than as a separate stage, because it's a nice
     way to keep the operation approximately "free" (i.e., no extra execs).

     Empirically, performing the check when flipping the least significant bit
     is advantageous, compared to doing it at the time of more disruptive
     changes, where the program flow may be affected in more violent ways.

     The caveat is that we won't generate dictionaries in the -d mode or -S
     mode - but that's probably a fair trade-off.

     This won't work particularly well with paths that exhibit variable
     behavior, but fails gracefully, so we'll carry out the checks anyway. */
if (!dumb_mode && (stage_cur & 7) == 7) {

      u32 cksum = hash32(trace_bits, MAP_SIZE, HASH_CONST);

      if (stage_cur == stage_max - 1 && cksum == prev_cksum) {

        /* If at end of file and we are still collecting a string, grab the
           final character and force output. */

        if (a_len < MAX_AUTO_EXTRA) a_collect[a_len] = out_buf[stage_cur >> 3];
        a_len++;

        if (a_len >= MIN_AUTO_EXTRA && a_len <= MAX_AUTO_EXTRA)
          maybe_add_auto(a_collect, a_len);

      } else if (cksum != prev_cksum) {

        /* Otherwise, if the checksum has changed, see if we have something
           worthwhile queued up, and collect that if the answer is yes. */

        if (a_len >= MIN_AUTO_EXTRA && a_len <= MAX_AUTO_EXTRA)
          maybe_add_auto(a_collect, a_len);

        a_len = 0;
        prev_cksum = cksum;

      }

      /* Continue collecting string, but only if the bit flip actually made
         any difference - we don't want no-op tokens. */

      if (cksum != queue_cur->exec_cksum) {

        if (a_len < MAX_AUTO_EXTRA) a_collect[a_len] = out_buf[stage_cur >> 3];
        a_len++;

      }

    }

  }

  new_hit_cnt = queued_paths + unique_crashes;

  stage_finds[STAGE_FLIP1]  += new_hit_cnt - orig_hit_cnt;
  stage_cycles[STAGE_FLIP1] += stage_max;
/* Effector map setup. These macros calculate:

     EFF_APOS      - position of a particular file offset in the map.
     EFF_ALEN      - length of a map with a particular number of bytes.
     EFF_SPAN_ALEN - map span for a sequence of bytes. */

#define EFF_APOS(_p)          ((_p) >> EFF_MAP_SCALE2)
#define EFF_REM(_x)           ((_x) & ((1 << EFF_MAP_SCALE2) - 1))
#define EFF_ALEN(_l)          (EFF_APOS(_l) + !!EFF_REM(_l))
#define EFF_SPAN_ALEN(_p, _l) (EFF_APOS((_p) + (_l) - 1) - EFF_APOS(_p) + 1)

  /* Initialize effector map for the next step (see comments below). Always
     flag first and last byte as doing something. */

  eff_map    = ck_alloc(EFF_ALEN(len));
  eff_map[0] = 1;

  if (EFF_APOS(len - 1) != 0) {
    eff_map[EFF_APOS(len - 1)] = 1;
    eff_cnt++;
  }
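To make the arithmetic concrete, here is a minimal standalone sketch. It assumes the stock config.h value EFF_MAP_SCALE2 = 3 (one effector bucket per 8 input bytes) and a hypothetical 20-byte test case; it only evaluates the macros, nothing else:

#include <stdio.h>
#include <stdint.h>

/* Assumed stock config.h value: each bucket covers 2^3 = 8 input bytes. */
#define EFF_MAP_SCALE2 3

#define EFF_APOS(_p)          ((_p) >> EFF_MAP_SCALE2)
#define EFF_REM(_x)           ((_x) & ((1 << EFF_MAP_SCALE2) - 1))
#define EFF_ALEN(_l)          (EFF_APOS(_l) + !!EFF_REM(_l))
#define EFF_SPAN_ALEN(_p, _l) (EFF_APOS((_p) + (_l) - 1) - EFF_APOS(_p) + 1)

int main(void) {
  uint32_t len = 20;  /* hypothetical 20-byte test case */
  printf("EFF_APOS(19)        = %u\n", EFF_APOS(19));         /* 2: last byte lives in bucket 2 */
  printf("EFF_ALEN(20)        = %u\n", EFF_ALEN(len));        /* 3: two full buckets + one partial */
  printf("EFF_SPAN_ALEN(6, 4) = %u\n", EFF_SPAN_ALEN(6, 4));  /* 2: bytes 6..9 straddle buckets 0 and 1 */
  return 0;
}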
/* We also use this stage to pull off a simple trick: we identify bytes that
       seem to have no effect on the current execution path even when fully
       flipped - and we skip them during more expensive deterministic stages,
       such as arithmetics or known ints. */

    if (!eff_map[EFF_APOS(stage_cur)]) {

      u32 cksum;

      /* If in dumb mode or if the file is very short, just flag everything
         without wasting time on checksums. */

      if (!dumb_mode && len >= EFF_MIN_LEN)
        cksum = hash32(trace_bits, MAP_SIZE, HASH_CONST);
      else
        cksum = ~queue_cur->exec_cksum;

      if (cksum != queue_cur->exec_cksum) {
        eff_map[EFF_APOS(stage_cur)] = 1;
        eff_cnt++;
      }

    }

    out_buf[stage_cur] ^= 0xFF;

  }

  /* If the effector map is more than EFF_MAX_PERC dense, just flag the whole
     thing as worth fuzzing, since we wouldn't be saving much time anyway. */

  if (eff_cnt != EFF_ALEN(len) &&
      eff_cnt * 100 / EFF_ALEN(len) > EFF_MAX_PERC) {

    memset(eff_map, 1, EFF_ALEN(len));

    blocks_eff_select += EFF_ALEN(len);

  } else {

    blocks_eff_select += eff_cnt;

  }

  blocks_eff_total += EFF_ALEN(len);

  new_hit_cnt = queued_paths + unique_crashes;

  stage_finds[STAGE_FLIP8]  += new_hit_cnt - orig_hit_cnt;
  stage_cycles[STAGE_FLIP8] += stage_max;
bitflip 16/8
With a step of 1 byte, flip 2 adjacent bytes (16 bits) at a time:
/* Two walking bytes. */

  if (len < 2) goto skip_bitflip;

  stage_name  = "bitflip 16/8";
  stage_short = "flip16";
  stage_cur   = 0;
  stage_max   = len - 1;

  orig_hit_cnt = new_hit_cnt;

  for (i = 0; i < len - 1; i++) {
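The loop body is elided in this excerpt; in stock AFL it looks roughly as sketched below - consult the effector map for both affected bytes, XOR the 16-bit word in place, run the target, and then restore the word:

    /* Sketch of the omitted loop body (abridged). */

    if (!eff_map[EFF_APOS(i)] && !eff_map[EFF_APOS(i + 1)]) {
      stage_max--;
      continue;
    }

    stage_cur_byte = i;

    *(u16*)(out_buf + i) ^= 0xFFFF;   /* flip 16 bits in place      */

    if (common_fuzz_stuff(argv, out_buf, len)) goto abandon_entry;
    stage_cur++;

    *(u16*)(out_buf + i) ^= 0xFFFF;   /* restore the original bytes */

  }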
/* Setting 8-bit integers. */

  for (i = 0; i < len; i++) {

    u8 orig = out_buf[i];
First, consult the effector map and skip bytes that have no observed effect:
/* Let's consult the effector map... */

    if (!eff_map[EFF_APOS(i)]) {
      stage_max -= sizeof(interesting_8);
      continue;
    }

    stage_cur_byte = i;
Then, for each byte, iterate over every candidate value in interesting_8, which is why stage_max = len * sizeof(interesting_8):
for (j = 0; j < sizeof(interesting_8); j++) {
Skip values that could already have been produced by the bitflip or arithmetic stages:
/* Skip if the value could be a product of bitflips or arithmetics. */

      if (could_be_bitflip(orig ^ (u8)interesting_8[j]) ||
          could_be_arith(orig, (u8)interesting_8[j], 1)) {
        stage_max--;
        continue;
      }
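The deduplication helpers are defined elsewhere in afl-fuzz.c. As a rough sketch of the idea, could_be_bitflip() checks whether the XOR difference between the old and new value is a contiguous run of 1, 2 or 4 bits, or a byte-aligned run of 8/16/32 bits - exactly the patterns the earlier deterministic bitflip stages already exercised:

/* Rough sketch of could_be_bitflip() as it appears in stock AFL. */

static u8 could_be_bitflip(u32 xor_val) {

  u32 sh = 0;

  if (!xor_val) return 1;               /* no change at all */

  /* Shift right until the first set bit. */

  while (!(xor_val & 1)) { sh++; xor_val >>= 1; }

  /* 1-, 2-, and 4-bit patterns are OK anywhere. */

  if (xor_val == 1 || xor_val == 3 || xor_val == 15) return 1;

  /* 8-, 16-, and 32-bit patterns are OK only if the shift factor is
     divisible by 8, since that's the stepover for these ops. */

  if (sh & 7) return 0;

  if (xor_val == 0xff || xor_val == 0xffff || xor_val == 0xffffffff)
    return 1;

  return 0;

}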
/* Extras are sorted by size, from smallest to largest. This means that we
       don't have to worry about restoring the buffer in between writes at a
       particular offset determined by the outer loop. */

    for (j = 0; j < extras_cnt; j++) {
/* Skip extras probabilistically if extras_cnt > MAX_DET_EXTRAS. Also skip
         them if there's no room to insert the payload, if the token is
         redundant, or if its entire span has no bytes set in the effector
         map. */

      if ((extras_cnt > MAX_DET_EXTRAS && UR(extras_cnt) >= MAX_DET_EXTRAS) ||
          extras[j].len > len - i ||
          !memcmp(extras[j].data, out_buf + i, extras[j].len) ||
          !memchr(eff_map + EFF_APOS(i), 1, EFF_SPAN_ALEN(i, extras[j].len))) {

        stage_max--;
        continue;

      }

      last_len = extras[j].len;
      memcpy(out_buf + i, extras[j].data, last_len);

      if (common_fuzz_stuff(argv, out_buf, len)) goto abandon_entry;

      stage_cur++;

    }

    /* Restore all the clobbered memory. */
    memcpy(out_buf + i, in_buf + i, last_len);

  }

  new_hit_cnt = queued_paths + unique_crashes;

  stage_finds[STAGE_EXTRAS_UO]  += new_hit_cnt - orig_hit_cnt;
  stage_cycles[STAGE_EXTRAS_UO] += stage_max;
user extras (insert)
Starting from offset 0, insert each user-supplied token into the original file in turn:
/* Insertion of user-supplied extras. */

  stage_name  = "user extras (insert)";
  stage_short = "ext_UI";
  stage_cur   = 0;
  stage_max   = extras_cnt * (len + 1);

  orig_hit_cnt = new_hit_cnt;

  ex_tmp = ck_alloc(len + MAX_DICT_FILE);

  for (i = 0; i <= len; i++) {

    stage_cur_byte = i;

    for (j = 0; j < extras_cnt; j++) {

      if (len + extras[j].len > MAX_FILE) {
        stage_max--;
        continue;
      }

      /* Insert token */
      memcpy(ex_tmp + i, extras[j].data, extras[j].len);

      /* Copy tail */
      memcpy(ex_tmp + i + extras[j].len, out_buf + i, len - i);

      if (common_fuzz_stuff(argv, ex_tmp, len + extras[j].len)) {
        ck_free(ex_tmp);
        goto abandon_entry;
      }

      stage_cur++;

    }

    /* Copy head */
    ex_tmp[i] = out_buf[i];

  }

  ck_free(ex_tmp);

  new_hit_cnt = queued_paths + unique_crashes;

  stage_finds[STAGE_EXTRAS_UI]  += new_hit_cnt - orig_hit_cnt;
  stage_cycles[STAGE_EXTRAS_UI] += stage_max;
skip_extras:

  /* If we made this to here without jumping to havoc_stage or abandon_entry,
     we're properly done with deterministic steps and can mark it as such
     in the .state/ directory. */

  if (!queue_cur->passed_det) mark_as_det_done(queue_cur);

  /****************
   * RANDOM HAVOC *
   ****************/

havoc_stage:

  stage_cur_byte = -1;

  /* The havoc stage mutation code is also invoked when splicing files; if the
     splice_cycle variable is set, generate different descriptions and such. */

  if (!splice_cycle) {

    stage_name  = "havoc";
    stage_short = "havoc";
    stage_max   = (doing_det ? HAVOC_CYCLES_INIT : HAVOC_CYCLES) *
                  perf_score / havoc_div / 100;

  } else {

    static u8 tmp[32];

    perf_score = orig_perf;

    sprintf(tmp, "splice %u", splice_cycle);
    stage_name  = tmp;
    stage_short = "splice";
    stage_max   = SPLICE_HAVOC * perf_score / havoc_div / 100;

  }

  if (stage_max < HAVOC_MIN) stage_max = HAVOC_MIN;

  temp_len = len;

  orig_hit_cnt = queued_paths + unique_crashes;

  havoc_queued = queued_paths;

  /* We essentially just do several thousand runs (depending on perf_score)
     where we take the input file and make random stacked tweaks. */

  for (stage_cur = 0; stage_cur < stage_max; stage_cur++) {

    u32 use_stacking = 1 << (1 + UR(HAVOC_STACK_POW2));

    stage_cur_val = use_stacking;

    for (i = 0; i < use_stacking; i++) {
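As a rough sanity check of the stage_max formula, the sketch below plugs in what I understand to be the stock config.h constants (HAVOC_CYCLES_INIT = 1024, HAVOC_CYCLES = 256, SPLICE_HAVOC = 32, HAVOC_MIN = 16, HAVOC_STACK_POW2 = 7) together with a hypothetical perf_score of 100 and havoc_div of 1:

#include <stdio.h>
#include <stdint.h>

/* Assumed stock config.h values (vanilla AFL). */
#define HAVOC_CYCLES_INIT 1024
#define HAVOC_CYCLES      256
#define SPLICE_HAVOC      32
#define HAVOC_MIN         16
#define HAVOC_STACK_POW2  7

int main(void) {
  uint32_t perf_score = 100, havoc_div = 1;   /* hypothetical, typical values */

  /* Deterministic stages were just run (doing_det == 1). */
  uint32_t det_max    = HAVOC_CYCLES_INIT * perf_score / havoc_div / 100;  /* 1024 */
  /* Entry re-fuzzed later without deterministic stages. */
  uint32_t nondet_max = HAVOC_CYCLES * perf_score / havoc_div / 100;       /* 256  */
  /* Splicing pass. */
  uint32_t splice_max = SPLICE_HAVOC * perf_score / havoc_div / 100;       /* 32   */

  printf("havoc (after det): %u\n", det_max    < HAVOC_MIN ? HAVOC_MIN : det_max);
  printf("havoc (no det):    %u\n", nondet_max < HAVOC_MIN ? HAVOC_MIN : nondet_max);
  printf("splice havoc:      %u\n", splice_max < HAVOC_MIN ? HAVOC_MIN : splice_max);

  /* use_stacking = 1 << (1 + UR(HAVOC_STACK_POW2)) picks a power of two
     between 2 and 128 stacked mutations per run. */
  printf("stacked tweaks:    %u..%u\n", 1u << 1, 1u << HAVOC_STACK_POW2);
  return 0;
}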
#ifndef IGNORE_FINDS

  /************
   * SPLICING *
   ************/

  /* This is a last-resort strategy triggered by a full round with no findings.
     It takes the current input file, randomly selects another input, and
     splices them together at some offset, then relies on the havoc
     code to mutate that blob. */

retry_splicing:

  if (use_splicing && splice_cycle++ < SPLICE_CYCLES &&
      queued_paths > 1 && queue_cur->len > 1) {

    struct queue_entry* target;
    u32 tid, split_at;
    u8* new_buf;
    s32 f_diff, l_diff;

    /* First of all, if we've modified in_buf for havoc, let's clean that
       up... */

    if (in_buf != orig_in) {
      ck_free(in_buf);
      in_buf = orig_in;
      len = queue_cur->len;
    }

    /* Pick a random queue entry and seek to it. Don't splice with yourself. */

    do { tid = UR(queued_paths); } while (tid == current_entry);

    splicing_with = tid;
    target = queue;

    while (tid >= 100) { target = target->next_100; tid -= 100; }
    while (tid--) target = target->next;

    /* Make sure that the target has a reasonable length. */

    while (target && (target->len < 2 || target == queue_cur)) {
      target = target->next;
      splicing_with++;
    }

    if (!target) goto retry_splicing;

    /* Read the testcase into a new buffer. */

    fd = open(target->fname, O_RDONLY);

    if (fd < 0) PFATAL("Unable to open '%s'", target->fname);

    new_buf = ck_alloc_nozero(target->len);

    ck_read(fd, new_buf, target->len, target->fname);

    close(fd);

    /* Find a suitable splicing location, somewhere between the first and
       the last differing byte. Bail out if the difference is just a single
       byte or so. */

    locate_diffs(in_buf, new_buf, MIN(len, target->len), &f_diff, &l_diff);

    if (f_diff < 0 || l_diff < 2 || f_diff == l_diff) {
      ck_free(new_buf);
      goto retry_splicing;
    }

    /* Split somewhere between the first and last differing byte. */

    split_at = f_diff + UR(l_diff - f_diff);

    /* Do the thing. */

    len = target->len;
    memcpy(new_buf, in_buf, split_at);
    in_buf = new_buf;

    ck_free(out_buf);
    out_buf = ck_alloc_nozero(len);
    memcpy(out_buf, in_buf, len);

    goto havoc_stage;

  }

#endif /* !IGNORE_FINDS */
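locate_diffs() is a small helper defined elsewhere in afl-fuzz.c. Roughly, it scans both buffers once and reports the first and last byte positions where they disagree (or -1 if the buffers are identical over the compared length):

/* Rough sketch of the locate_diffs() helper from stock AFL. */

static void locate_diffs(u8* ptr1, u8* ptr2, u32 len, s32* first, s32* last) {

  s32 f_loc = -1;
  s32 l_loc = -1;
  u32 pos;

  for (pos = 0; pos < len; pos++) {

    if (*(ptr1++) != *(ptr2++)) {

      if (f_loc == -1) f_loc = pos;   /* remember the first mismatch */
      l_loc = pos;                    /* keep updating the last one  */

    }

  }

  *first = f_loc;
  *last  = l_loc;

}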
Cleanup
ret_val = 0;

abandon_entry:

  splicing_with = -1;

  /* Update pending_not_fuzzed count if we made it through the calibration
     cycle and have not seen this entry before. */

  if (!stop_soon && !queue_cur->cal_failed && !queue_cur->was_fuzzed) {
    queue_cur->was_fuzzed = 1;
    pending_not_fuzzed--;
    if (queue_cur->favored) pending_favored--;
  }

  munmap(orig_in, queue_cur->len);

  if (in_buf != orig_in) ck_free(in_buf);
  ck_free(out_buf);
  ck_free(eff_map);

  return ret_val;

#undef FLIP_BIT

}
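For completeness, the FLIP_BIT macro being undefined here is declared near the start of the bitflip stages (not shown in this excerpt); in stock AFL it is roughly:

/* Flip bit number _b in buffer _ar (bit 0 is the most significant bit of byte 0). */

#define FLIP_BIT(_ar, _b) do { \
    u8* _arf = (u8*)(_ar); \
    u32 _bf  = (_b); \
    _arf[(_bf) >> 3] ^= (128 >> ((_bf) & 7)); \
  } while (0)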