An Empirical Study on ARM Disassembly Tools [pdf]

(yajin.org)

42 points | by matt_d 1422 days ago

4 comments

  • nsajko 1422 days ago
    The Ghidra and R2 versions used for the article are now a year old. Is that just because of the journal delay?

    Here are the radare2 and Ghidra reports:

    https://github.com/NationalSecurityAgency/ghidra/issues/657

    https://github.com/radareorg/radare2/issues/14223

    • jcranmer 1421 days ago
      Paper submission deadlines are usually about 9 months before the conference, although ISSTA 2020 apparently had its submission deadline on January 27 (~6 months before the conference).
  • nsajko 1422 days ago
    Any idea why the git repo does not contain the data?
  • JoachimS 1422 days ago
    The server seems to have a rough time at the moment.
    • mdaniel 1422 days ago
      Another fine reason why it'll be bad news if the IA lawsuit blows them away: https://web.archive.org/web/20200602191543/https://yajin.org...
    • nsajko 1422 days ago
      Mirrored on another author's site: https://www.muhui.site/publication/issta_2020
    • WaitWaitWha 1422 days ago
      Still struggling, but eventually it did download.

      For those who cannot download it:

      #ABSTRACT

      ===

      With the increasing popularity of embedded devices, ARM is becoming the dominant architecture for them. In the meanwhile, there is a pressing need to perform security assessments for these devices. Due to different types of peripherals, it is challenging to dynamically run the firmware of these devices in an emulated environment. Therefore, the static analysis is still commonly used. Existing work usually leverages off-the-shelf tools to disassemble stripped ARM binaries and (implicitly) assume that reliable disassembling binaries and function recognition are solved problems. However, whether this assumption really holds is unknown.

      In this paper, we conduct the first comprehensive study on ARM disassembly tools. Specifically, we build 1,896 ARM binaries (including 248 obfuscated ones) with different compilers, compiling options, and obfuscation methods. We then evaluate them using eight state-of-the-art ARM disassembly tools (including both commercial and noncommercial ones) on their capabilities to locate instructions and function boundaries. These two are fundamental ones, which are leveraged to build other primitives. Our work reveals some observations that have not been systematically summarized and/or confirmed. For instance, we find that the existence of both ARM and Thumb instruction sets, and the reuse of the BL instruction for both function calls and branches bring serious challenges to disassembly tools. Our evaluation sheds light on the limitations of state-of-the-art disassembly tools and points out potential directions for improvement. To engage the community, we release the data set, and the related scripts at https://github.com/valour01/arm_disasssembler_study.

  • rcgorton 1421 days ago
    I have multiple disagreements: "ARM is becoming the dominant architecture". No - while a valid statement in 2012-ish, the truth (since about 2016) is "ARM is the dominant architecture". If disassembling binaries without symbols is a problem, then your skills/methods need improvement. Been there, dealt with that: binary translation of VAX/VMS, MIPS/Ultrix, SPARC/Solaris to Alpha (various OS's).