Friday, January 11, 2013

Address-sanitizer: a new fast memory error detector for Android


Introduction

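Android added support for Address-sanitizer in Jelly Bean.

So what is this thing? In plain words, it is a small tool that detects certain kinds of memory misuse.

It is a bit like memcheck in valgrind, but faster and simpler (at the price of not being able to detect some kinds of errors).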

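Types of errors detected

Enough talk. Let's go straight to the sample programs and their output.

Out-of-bounds access

Heap out-of-bounds access
#include <stdlib.h>
#include <string.h>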
int main(int argc, char **argv) {
  char *x = (char*)malloc(10 * sizeof(char));
  memset(x, 0, 10);
  int res = x[argc * 10];  // BOOOM
  free(x);
  return res;
}
Sample output:
=================================================================
==799== ERROR: AddressSanitizer heap-buffer-overflow on address 0x41255cca at pc 0x2a00055b bp 0xbeff6b0c sp 0xbeff6b08
READ of size 1 at 0x41255cca thread T0
    #0 0x40022a4b (/system/lib/libasan_preload.so+0x8a4b)
    #1 0x40023e77 (/system/lib/libasan_preload.so+0x9e77)
    #2 0x4001c947 (/system/lib/libasan_preload.so+0x2947)
    #3 0x2a000559 (/system/bin/heap-out-of-bounds+0x559)
    #4 0x4114371d (/system/lib/libc.so+0x1271d)
0x41255cca is located 0 bytes to the right of 10-byte region [0x41255cc0,0x41255cca)
allocated by thread T0 here:
    #0 0x40022a4b (/system/lib/libasan_preload.so+0x8a4b)
    #1 0x40022aff (/system/lib/libasan_preload.so+0x8aff)
    #2 0x2a0004e9 (/system/bin/heap-out-of-bounds+0x4e9)
    #3 0x4114371d (/system/lib/libc.so+0x1271d)
Shadow byte and word:
  0x0824ab99: 2
  0x0824ab98: 00 02 fb fb
More shadow bytes:
  0x0824ab88: 00 00 00 00
  0x0824ab8c: 04 fb fb fb
  0x0824ab90: fa fa fa fa
  0x0824ab94: fa fa fa fa
=>0x0824ab98: 00 02 fb fb
  0x0824ab9c: fb fb fb fb
  0x0824aba0: fa fa fa fa
  0x0824aba4: fa fa fa fa
  0x0824aba8: fa fa fa fa
Stats: 0M malloced (0M for red zones) by 36 calls
Stats: 0M realloced by 0 calls
Stats: 0M freed by 0 calls
Stats: 0M really freed by 0 calls
Stats: 2M (642 full pages) mmaped in 5 calls
  mmaps   by size class: 7:4095; 8:2047; 11:255; 12:128; 13:64; 
  mallocs by size class: 7:27; 8:4; 11:2; 12:2; 13:1; 
  frees   by size class: 
  rfrees  by size class: 
Stats: malloc large: 0 small slow: 5
==799== ABORTING
Global variable out-of-bounds access
int global_array[100] = {-1};
int main(int argc, char **argv) {
  return global_array[argc + 100];  // BOOM
}
Sample output:
=================================================================
==7161== ERROR: AddressSanitizer global-buffer-overflow on address 0x2a002194 at pc 0x2a00051b bp 0xbeeafb0c sp 0xbeeafb08
READ of size 4 at 0x2a002194 thread T0
    #0 0x40022a4b (/system/lib/libasan_preload.so+0x8a4b)
    #1 0x40023e77 (/system/lib/libasan_preload.so+0x9e77)
    #2 0x4001c98f (/system/lib/libasan_preload.so+0x298f)
    #3 0x2a000519 (/system/bin/global-out-of-bounds+0x519)
    #4 0x4114371d (/system/lib/libc.so+0x1271d)
0x2a002194 is located 4 bytes to the right of global variable 'global_array (external/test/global-out-of-bounds.cpp)' (0x2a002000) of size 400
Shadow byte and word:
  0x05400432: f9
  0x05400430: 00 00 f9 f9
More shadow bytes:
  0x05400420: 00 00 00 00
  0x05400424: 00 00 00 00
  0x05400428: 00 00 00 00
  0x0540042c: 00 00 00 00
=>0x05400430: 00 00 f9 f9
  0x05400434: f9 f9 f9 f9
  0x05400438: 00 00 00 00
  0x0540043c: 00 00 00 00
  0x05400440: 00 00 00 00
Stats: 0M malloced (0M for red zones) by 35 calls
Stats: 0M realloced by 0 calls
Stats: 0M freed by 0 calls
Stats: 0M really freed by 0 calls
Stats: 2M (642 full pages) mmaped in 5 calls
  mmaps   by size class: 7:4095; 8:2047; 11:255; 12:128; 13:64; 
  mallocs by size class: 7:26; 8:4; 11:2; 12:2; 13:1; 
  frees   by size class: 
  rfrees  by size class: 
Stats: malloc large: 0 small slow: 5
==7161== ABORTING
Local variable (stack) out-of-bounds access
int main(int argc, char **argv) {
  int stack_array[100];
  stack_array[1] = 0;
  return stack_array[argc + 100];  // BOOM
}
Sample output:
=================================================================
==7165== ERROR: AddressSanitizer stack-buffer-overflow on address 0xbed0bad4 at pc 0x2a000557 bp 0xbed0b914 sp 0xbed0b910
READ of size 4 at 0xbed0bad4 thread T0
    #0 0x40022a4b (/system/lib/libasan_preload.so+0x8a4b)
    #1 0x40023e77 (/system/lib/libasan_preload.so+0x9e77)
    #2 0x4001c98f (/system/lib/libasan_preload.so+0x298f)
    #3 0x2a000555 (/system/bin/stack-out-of-bounds+0x555)
    #4 0x4114371d (/system/lib/libc.so+0x1271d)
Address 0xbed0bad4 is located at offset 436 in frame <main> of T0's stack:
  This frame has 1 object(s):
    [32, 432) 'stack_array'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism
      (longjmp and C++ exceptions *are* supported)
Shadow byte and word:
  0x17da175a: f4
  0x17da1758: 00 00 f4 f4
More shadow bytes:
  0x17da1748: 00 00 00 00
  0x17da174c: 00 00 00 00
  0x17da1750: 00 00 00 00
  0x17da1754: 00 00 00 00
=>0x17da1758: 00 00 f4 f4
  0x17da175c: f3 f3 f3 f3
  0x17da1760: 00 00 00 00
  0x17da1764: 00 00 00 00
  0x17da1768: 00 00 00 00
Stats: 0M malloced (0M for red zones) by 35 calls
Stats: 0M realloced by 0 calls
Stats: 0M freed by 0 calls
Stats: 0M really freed by 0 calls
Stats: 2M (642 full pages) mmaped in 5 calls
  mmaps   by size class: 7:4095; 8:2047; 11:255; 12:128; 13:64; 
  mallocs by size class: 7:26; 8:4; 11:2; 12:2; 13:1; 
  frees   by size class: 
  rfrees  by size class: 
Stats: malloc large: 0 small slow: 5
==7165== ABORTING

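Use after free

#include <stdlib.h>
int main() {
  // Note sizeof(char*), not sizeof(char): that is 40 bytes on a 32-bit
  // target, which is why the report below talks about a 40-byte region.
  char *x = (char*)malloc(10 * sizeof(char*));
  free(x);
  return x[5];  // BOOM: read from memory that was already freed
}
Sample output: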
=================================================================
==7169== ERROR: AddressSanitizer heap-use-after-free on address 0x41255cc5 at pc 0x2a0004cd bp 0xbece1b1c sp 0xbece1b18
READ of size 1 at 0x41255cc5 thread T0
    #0 0x40022a4b (/system/lib/libasan_preload.so+0x8a4b)
    #1 0x40023e77 (/system/lib/libasan_preload.so+0x9e77)
    #2 0x4001c947 (/system/lib/libasan_preload.so+0x2947)
    #3 0x2a0004cb (/system/bin/use-after-free+0x4cb)
    #4 0x4114371d (/system/lib/libc.so+0x1271d)
0x41255cc5 is located 5 bytes inside of 40-byte region [0x41255cc0,0x41255ce8)
freed by thread T0 here:
    #0 0x40022a4b (/system/lib/libasan_preload.so+0x8a4b)
    #1 0x40022a97 (/system/lib/libasan_preload.so+0x8a97)
    #2 0x2a0004b1 (/system/bin/use-after-free+0x4b1)
    #3 0x4114371d (/system/lib/libc.so+0x1271d)
previously allocated by thread T0 here:
    #0 0x40022a4b (/system/lib/libasan_preload.so+0x8a4b)
    #1 0x40022aff (/system/lib/libasan_preload.so+0x8aff)
    #2 0x2a0004ab (/system/bin/use-after-free+0x4ab)
    #3 0x4114371d (/system/lib/libc.so+0x1271d)
Shadow byte and word:
  0x0824ab98: fd
  0x0824ab98: fd fd fd fd
More shadow bytes:
  0x0824ab88: 00 00 00 00
  0x0824ab8c: 04 fb fb fb
  0x0824ab90: fa fa fa fa
  0x0824ab94: fa fa fa fa
=>0x0824ab98: fd fd fd fd
  0x0824ab9c: fd fd fd fd
  0x0824aba0: fa fa fa fa
  0x0824aba4: fa fa fa fa
  0x0824aba8: fa fa fa fa
Stats: 0M malloced (0M for red zones) by 36 calls
Stats: 0M realloced by 0 calls
Stats: 0M freed by 1 calls
Stats: 0M really freed by 0 calls
Stats: 2M (642 full pages) mmaped in 5 calls
  mmaps   by size class: 7:4095; 8:2047; 11:255; 12:128; 13:64; 
  mallocs by size class: 7:27; 8:4; 11:2; 12:2; 13:1; 
  frees   by size class: 7:1; 
  rfrees  by size class: 
Stats: malloc large: 0 small slow: 5
==7169== ABORTING


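Usage

Using it is surprisingly simple.
Just add one line to your Android.mk:
LOCAL_ADDRESS_SANITIZER := true
It is this easy mainly because Address-sanitizer is also a Google project, so the integration is pretty good.

One side effect of enabling Address-sanitizer is that it forces clang mode.

What does that mean? Your code is compiled with clang/llvm instead of gcc.

If your code relies on gcc-specific features, it may well blow up :P

Also remember to push libasan_preload.so to /system/lib. This is the Address-sanitizer run-time library; you can fish it out of out/target/product/*/system/lib/.

If you have set LOCAL_ADDRESS_SANITIZER := true and the build system complains that it cannot find libasan_preload.so or libasan.a,
switch to external/compiler-rt/lib/asan/ and build it yourself, or use mmm and let the build system pull in the dependency for you.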


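Reading the error report

The most important part is how to read the blob of error report it spits out.
Let's go straight to the most complicated one, the use-after-free report.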

=================================================================
==7169== ERROR: AddressSanitizer heap-use-after-free on address 0x41255cc5 at pc 0x2a0004cd bp 0xbece1b1c sp 0xbece1b18
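The first part tells you what type of error this is, here heap-use-after-free. The other types are stack-buffer-overflow, global-buffer-overflow and heap-buffer-overflow, which correspond to out-of-bounds accesses on the stack, on globals and on the heap respectively.

The "on address 0x41255cc5" part is the memory address you tried to access.
The "at pc 0x2a0004cd bp 0xbece1b1c sp 0xbece1b18" part gives the program counter at the point of the error, plus the contents of the frame pointer and the stack pointer.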
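If you want to pick these fields apart in a script or a small tool, the header line is regular enough for sscanf. The following is only a sketch: struct asan_header, parse_asan_header and region_of are names made up for this example, not part of Address-sanitizer, and the format string is simply copied from the report shown above, so other versions of the tool may print the line differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields of the first line of a report. Hypothetical struct for this sketch. */
struct asan_header {
    int pid;                   /* the number between the == marks */
    char kind[64];             /* e.g. "heap-use-after-free" */
    unsigned long addr;        /* address that was being accessed */
    unsigned long pc, bp, sp;  /* registers at the faulting access */
};

/* Returns 0 on success, -1 if the line does not look like a report header. */
int parse_asan_header(const char *line, struct asan_header *out)
{
    assert(line != NULL && out != NULL);
    int n = sscanf(line,
                   "==%d== ERROR: AddressSanitizer %63s on address %lx"
                   " at pc %lx bp %lx sp %lx",
                   &out->pid, out->kind, &out->addr,
                   &out->pc, &out->bp, &out->sp);
    return n == 6 ? 0 : -1;
}

/* Maps the error type to the memory region it is about. */
const char *region_of(const char *kind)
{
    assert(kind != NULL);
    if (strncmp(kind, "heap-", 5) == 0)   return "heap";
    if (strncmp(kind, "stack-", 6) == 0)  return "stack";
    if (strncmp(kind, "global-", 7) == 0) return "global";
    return "unknown";
}
```

Feeding it the header line of the use-after-free report gives kind "heap-use-after-free", addr 0x41255cc5 and region "heap".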

READ of size 1 at 0x41255cc5 thread T0
    #0 0x40022a4b (/system/lib/libasan_preload.so+0x8a4b)
    #1 0x40023e77 (/system/lib/libasan_preload.so+0x9e77)
    #2 0x4001c947 (/system/lib/libasan_preload.so+0x2947)
    #3 0x2a0004cb (/system/bin/use-after-free+0x4cb)
    #4 0x4114371d (/system/lib/libc.so+0x1271d)
0x41255cc5 is located 5 bytes inside of 40-byte region [0x41255cc0,0x41255ce8)
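Next, it thoughtfully prints the call stack at the point where you blew up.
If all you have is a pc value and no idea where it is, call addr2line for help.
For example, to find the source line for /system/bin/use-after-free+0x4cb, run:
addr2line -e out/target/product/*/symbols/system/bin/use-after-free -a 0x4cb
* Substitute the device name of your current build for the *. -e takes the binary with symbols, and the address is whatever follows the plus sign in the call stack (-a only tells addr2line to echo the address in its output).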
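With more than a couple of frames this gets tedious by hand, so here is a rough sketch of how a frame line could be turned into the matching addr2line command. parse_frame and format_addr2line are hypothetical helpers written for this post, and symbols_root stands for out/target/product/<device>/symbols, which you have to fill in yourself.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Splits a frame line such as
 *     "#3 0x2a0004cb (/system/bin/use-after-free+0x4cb)"
 * into the module path and the offset after the plus sign.
 * Returns 0 on success, -1 on a malformed line. */
int parse_frame(const char *line, char *module, size_t module_size,
                unsigned long *offset)
{
    assert(line != NULL && module != NULL && offset != NULL);
    const char *open = strchr(line, '(');
    const char *plus = strrchr(line, '+');  /* last '+': a path may contain "++" */
    if (open == NULL || plus == NULL || plus < open)
        return -1;

    size_t len = (size_t)(plus - (open + 1));
    if (len == 0 || len >= module_size)
        return -1;
    memcpy(module, open + 1, len);
    module[len] = '\0';

    char *end;
    *offset = strtoul(plus + 1, &end, 16);  /* base 16 accepts the "0x" prefix */
    if (end == plus + 1 || *end != ')')
        return -1;
    return 0;
}

/* Writes the addr2line command for one frame into cmd.
 * Returns 0 on success, -1 if cmd is too small. */
int format_addr2line(char *cmd, size_t cmd_size, const char *symbols_root,
                     const char *module, unsigned long offset)
{
    assert(cmd != NULL && symbols_root != NULL && module != NULL);
    int n = snprintf(cmd, cmd_size, "addr2line -e %s%s -a 0x%lx",
                     symbols_root, module, offset);
    return (n > 0 && (size_t)n < cmd_size) ? 0 : -1;
}
```

The libasan_preload.so frames at the top of each stack are the run-time library itself, so the first frame worth resolving is usually the one that points into your own binary.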

freed by thread T0 here:
    #0 0x40022a4b (/system/lib/libasan_preload.so+0x8a4b)
    #1 0x40022a97 (/system/lib/libasan_preload.so+0x8a97)
    #2 0x2a0004b1 (/system/bin/use-after-free+0x4b1)
    #3 0x4114371d (/system/lib/libc.so+0x1271d)
It also prints the call stack of the free; read it the same way.
previously allocated by thread T0 here:
    #0 0x40022a4b (/system/lib/libasan_preload.so+0x8a4b)
    #1 0x40022aff (/system/lib/libasan_preload.so+0x8aff)
    #2 0x2a0004ab (/system/bin/use-after-free+0x4ab)
    #3 0x4114371d (/system/lib/libc.so+0x1271d)
The place where malloc was called gets its own call stack as well.
Shadow byte and word:
  0x0824ab98: fd
  0x0824ab98: fd fd fd fd
More shadow bytes:
  0x0824ab88: 00 00 00 00
  0x0824ab8c: 04 fb fb fb
  0x0824ab90: fa fa fa fa
  0x0824ab94: fa fa fa fa
=>0x0824ab98: fd fd fd fd
  0x0824ab9c: fd fd fd fd
  0x0824aba0: fa fa fa fa
  0x0824aba4: fa fa fa fa
  0x0824aba8: fa fa fa fa
Stats: 0M malloced (0M for red zones) by 36 calls
Stats: 0M realloced by 0 calls
Stats: 0M freed by 1 calls
Stats: 0M really freed by 0 calls
Stats: 2M (642 full pages) mmaped in 5 calls
  mmaps   by size class: 7:4095; 8:2047; 11:255; 12:128; 13:64; 
  mallocs by size class: 7:27; 8:4; 11:2; 12:2; 13:1; 
  frees   by size class: 7:1; 
  rfrees  by size class: 
Stats: malloc large: 0 small slow: 5
==7169== ABORTING
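If you can't make sense of this last blob, don't worry about it. It is Address-sanitizer's internal state; if you are curious, dig up their slides, which have more detailed examples and explanations.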
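For the curious, the shadow lines are less mysterious than they look. Address-sanitizer keeps one shadow byte for every 8 bytes of application memory, and the check before each access boils down to the sketch below. Two things here are my own assumptions rather than something the report states: the function names are invented, and the shadow offset of 0 is read off the addresses in the reports above (0x41255cc5 >> 3 is exactly the reported 0x0824ab98).

```c
#include <assert.h>
#include <stdint.h>

/* One shadow byte describes 8 application bytes. An offset of 0 is an
 * assumption inferred from the reports in this post. */
#define SHADOW_SCALE  3
#define SHADOW_OFFSET 0u

/* Address of the shadow byte that covers application address addr. */
uint32_t shadow_address(uint32_t addr)
{
    return (addr >> SHADOW_SCALE) + SHADOW_OFFSET;
}

/* Decides whether an access of `size` bytes (1..8, not crossing an 8-byte
 * boundary) at addr is bad, given the shadow byte that covers it:
 *   0x00        all 8 bytes are addressable
 *   0x01..0x07  only the first k bytes are addressable
 *   0x80..0xff  poisoned (red zone, freed memory, ...) */
int access_is_bad(uint8_t shadow_byte, uint32_t addr, unsigned size)
{
    assert(size >= 1 && size <= 8);
    /* Interpret the shadow byte as a signed value: poisoned bytes are negative. */
    int k = shadow_byte >= 0x80 ? (int)shadow_byte - 256 : (int)shadow_byte;
    if (k == 0)
        return 0;
    int last = (int)(addr & 7u) + (int)size - 1;  /* last byte touched, 0..7 */
    return last >= k;
}
```

This matches what the reports show: in the heap example the 10-byte block ends in a shadow byte of 02 (only 2 of those 8 bytes are valid), so reading the byte at offset 2 trips the check, while the freed block in the use-after-free example is filled with fd and any access to it fails.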

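Side effects and limitations

Address-sanitizer comes with some extra costs:
  1. Every new/malloc takes an extra 128-255 bytes as a red zone, i.e. the tripwire that goes off on an out-of-bounds access :P
  2. On the stack, every local variable takes 32-63 bytes as a red zone
  3. Every global variable likewise takes 32-63 bytes as a red zone
  4. Memory usage typically grows 2-4x, and in the worst case can reach 20x
  5. The stack grows to roughly three times its size or more
  6. One eighth of the address space is taken outright: 0.5 G on 32-bit and 16T on 64-bit. This only consumes address space, not actual memory (the region is mapped as neither readable nor writable, so touching it crashes immediately)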
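The numbers in items 1 and 6 are easy to check by hand. Here is the arithmetic as a small sketch; the function names are made up, and the 47 usable address bits for the 64-bit case is my assumption about where the 16T figure comes from (2^47 / 8 = 16T), not something the slides spell out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Address space reserved when every 8 bytes of a 2^address_bits address
 * space are described by one shadow byte, i.e. one eighth of it. */
uint64_t shadow_reservation_bytes(unsigned address_bits)
{
    assert(address_bits >= 3 && address_bits < 64);
    return (UINT64_C(1) << address_bits) >> 3;
}

/* Smallest and largest heap footprint of one allocation, using the
 * 128-255 byte red zone range from item 1. */
size_t malloc_footprint_min(size_t size) { return size + 128; }
size_t malloc_footprint_max(size_t size) { return size + 255; }
```

So shadow_reservation_bytes(32) is 512 MB and shadow_reservation_bytes(47) is 16 TB, and the 10-byte malloc from the heap example really costs somewhere between 138 and 265 bytes.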
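The biggest limitation is that Address-sanitizer stops the program as soon as it detects any memory error.

This is normal, and it is exactly this design that keeps the tool simple, fast and convenient.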

Comparison with Valgrind

                                   Valgrind      Address-sanitizer
Heap out-of-bounds                 Yes           Yes
Stack out-of-bounds                No            Yes
Global out-of-bounds               No            Yes
Use-after-free                     Yes           Yes
  (use after free/delete)
Use-after-return                   No            Sometimes/Yes
  (e.g. returning a pointer
  to a local variable)
Uninitialized reads                Yes           No (note 1)
Overhead                           10x-30x       1.5x-3x
  (how many times slower)
Platforms                          Linux, Mac    Same as GCC/LLVM (note 2)

Note 1: Memory-sanitizer, Address-sanitizer's sibling, handles this, but it has not been brought into Android yet.
Note 2: GCC officially merged Address-sanitizer on 2012/11/01 [4]. I haven't played with it, so I can't say how usable it is; on Android, enabling Address-sanitizer means using clang anyway :P

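Summary

Address-sanitizer is a handy lightweight tool. According to the slides published by its developers, Google uses it internally and has caught more than a thousand bugs with it, including memory bugs in large open source projects such as llvm, gcc, vim and firefox. Address-sanitizer also has siblings, Memory-sanitizer and Thread-sanitizer, which detect uninitialized reads and race conditions respectively. They have not been brought into Android yet, but since they all come from Google, we can expect them to land in Android eventually, so that common memory errors get caught early :)

References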


[1] http://code.google.com/p/address-sanitizer
[2] http://llvm.org/devmtg/2011-11/Serebryany_FindingRacesMemoryErrors.pdf
[3] http://address-sanitizer.googlecode.com/files/address_sanity_checker.pdf
[4] http://gcc.gnu.org/ml/gcc/2012-11/msg00016.html
