MySQL · 2016-12-24 0

Percona-Toolkit系列之pt-pmp获取堆栈信息利器

1、简介


pt-pmp 是一个非常简单的工具,可以用来获取MySQL的堆栈信息。工具首先获取运行过程中的mysqld堆栈信息,然后将相似的线程进行汇总排序,根据调用频繁程度从高到低打印出来。

查看pt-pmp的源代码,我们可以发现pt-pmp本身是一个shell脚本,底层调用的是gdb,所以我们需要事先安装gdp。

      for x in $(_seq $OPT_ITERATIONS); do
         gdb -ex "set pagination 0"    \
             -ex "thread apply all bt" \
             -batch                    \
             -p $OPT_PID               \
             >> "$output_file"
         date +'TS %N.%s %F %T' >> "$output_file"
         sleep $OPT_INTERVAL
      done

在pt-pmp的运行过程过程中,他会短暂的冻结mysqld的主程序,导致MySQL无法响应,同时当gdb detach的时候,MySQL存在crash的风险。因此如果在一个很繁忙的生产环境中需要谨慎使用。但是如果你的MySQL数据库现在已经停止响应了,那么可以使用pt-pmp去跟踪MySQL的堆栈信息,来分析他为什么hang住。

如下是我用sysbench压测的时候发现开启pt-pmp之后,MySQL在第203秒和204秒的tps为0,所以生产环境使用一定要谨慎!

root@rht6-mysql ~/scripts # pt-pmp -p 2898 -l 3 -i 3 -s 1 
Sat Dec 24 13:37:30 CST 2016

    195 pthread_cond_wait,native_cond_wait(thr_cond.h:140),my_cond_wait(thr_cond.h:140)
     30 libaio::??(libaio.so.1),LinuxAIOHandler::collect(os0file.cc:2489),LinuxAIOHandler::poll(os0file.cc:2635)
     22 pthread_cond_wait,wait(os0event.cc:165),os_event::wait_low(os0event.cc:165)
     17 pthread_cond_timedwait,os_event::timed_wait(os0event.cc:285),os_event::wait_time_low(os0event.cc:412)
      3 sigwait(libpthread.so.0),signal_hand(mysqld.cc:2104),pfs_spawn_thread(pfs.cc:2188)
      3 sigwaitinfo(libc.so.6),timer_notify_thread_func(posix_timers.c:77),pfs_spawn_thread(pfs.cc:2188)
      3 poll(libc.so.6),Mysqld_socket_listener::listen_for_connection_event(socket_connection.cc:848),connection_event_loop(connection_acceptor.h:66)
      2 send(libpthread.so.0),inline_mysql_socket_send(mysql_socket.h:766),vio_write(mysql_socket.h:766)
      2 nanosleep(libpthread.so.0),os_thread_sleep(os0thread.cc:254),srv_master_sleep(srv0srv.cc:2316)
      2 __lll_lock_wait(libpthread.so.0),_L_lock_854(libpthread.so.0),pthread_mutex_lock(libpthread.so.0)
      2 fsync(libpthread.so.0),os_file_fsync_posix(os0file.cc:3042),os_file_flush_func(os0file.cc:3042)
      1 sched_yield(libc.so.6),trx_purge_wait_for_workers_to_complete(trx0purge.cc:1762),trx_purge(trx0purge.cc:1762)
      1 pread64(libpthread.so.0),os_file_io(os0file.cc:5423),os_file_pread(os0file.cc:5601)
      1 __lll_lock_wait(libpthread.so.0),_L_lock_995(libpthread.so.0),pthread_mutex_lock(libpthread.so.0)
      1 fdatasync(libc.so.6),my_sync(my_sync.c:76),inline_mysql_file_sync(mysql_file.h:1420)
[ 198s] threads: 50, tps: 262.37, reads: 3696.15, writes: 1087.51, response time: 292.90ms (95%), errors: 0.00, reconnects:  0.00
[ 199s] threads: 50, tps: 266.11, reads: 3664.45, writes: 1012.40, response time: 289.07ms (95%), errors: 0.00, reconnects:  0.00
[ 200s] threads: 50, tps: 278.94, reads: 3699.17, writes: 1149.74, response time: 289.59ms (95%), errors: 0.00, reconnects:  0.00
[ 201s] threads: 50, tps: 269.98, reads: 4047.66, writes: 1011.92, response time: 257.30ms (95%), errors: 0.00, reconnects:  0.00
[ 202s] threads: 50, tps: 134.04, reads: 1926.62, writes: 728.23, response time: 246.07ms (95%), errors: 0.00, reconnects:  0.00
[ 203s] threads: 50, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 0.00, reconnects:  0.00
[ 204s] threads: 50, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 0.00, reconnects:  0.00
[ 205s] threads: 50, tps: 61.02, reads: 834.26, writes: 130.04, response time: 3541.98ms (95%), errors: 0.00, reconnects:  0.00
[ 206s] threads: 50, tps: 246.95, reads: 3369.31, writes: 909.81, response time: 392.64ms (95%), errors: 0.00, reconnects:  0.00
[ 207s] threads: 50, tps: 261.96, reads: 3721.40, writes: 1130.82, response time: 282.32ms (95%), errors: 0.00, reconnects:  0.00
[ 208s] threads: 50, tps: 303.95, reads: 4078.34, writes: 1132.82, response time: 259.93ms (95%), errors: 0.00, reconnects:  0.00

2、参数说明


–binary

short form: -b; type: string; default: mysqld
Which binary to trace.

指定哪个进行进行跟踪,如果不指定,他默认去寻找mysqld进程。因此pt-pmp也可以跟踪其他进程。

–help

Show help and exit.

打印帮助信息

–interval

short form: -s; type: int; default: 0
Number of seconds to sleep between --iterations.

每次跟踪的间隔时间,默认为不休眠

–iterations

short form: -i; type: int; default: 1
How many traces to gather and aggregate.

总共的跟踪次数,默认为跟踪1次

–lines

short form: -l; type: int; default: 0
Aggregate only first specified number of many functions; 0=infinity.

pt-pmp在第二步聚合的时候只打印相应分类的几个方法,默认会全部打印出来,你可以让他只打印前面2个

如pt-pmp -l 2

root@rht6-mysql ~/scripts # pt-pmp -l 2
Sat Dec 24 13:11:04 CST 2016

     10 libaio::??(libaio.so.1),LinuxAIOHandler::collect(os0file.cc:2489)
      6 pthread_cond_wait,wait(os0event.cc:165)
      6 pthread_cond_timedwait,os_event::timed_wait(os0event.cc:285)
      1 sigwait(libpthread.so.0),signal_hand(mysqld.cc:2104)
      1 sigwaitinfo(libc.so.6),timer_notify_thread_func(posix_timers.c:77)
      1 pthread_cond_wait,native_cond_wait(thr_cond.h:140)
      1 poll(libc.so.6),Mysqld_socket_listener::listen_for_connection_event(socket_connection.cc:848)
      1 nanosleep(libpthread.so.0),os_thread_sleep(os0thread.cc:254)

–pid

short form: -p; type: int
Process ID of the process to trace; overrides --binary.

指定哪个进程号进行跟踪,这个参数会覆盖–binary指定的参数值,因此指定pid和binary其中一个即可

–save-samples

short form: -k; type: string
Keep the raw traces in this file after aggregation.

将跟踪结果保存到文本文件中

–version

Show version and exit.

输出版本信息

3、使用示例


  • 存在多个mysqld,跟踪指定pid的MySQL进行
root@rht6-mysql ~/scripts # pidof mysqld
2898  2900 2952
root@rht6-mysql ~/scripts # pt-pmp -p 2898
Sat Dec 24 13:14:50 CST 2016

warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffc184d3000
     10 libaio::??(libaio.so.1),LinuxAIOHandler::collect(os0file.cc:2489),LinuxAIOHandler::poll(os0file.cc:2635),os_aio_linux_handler(os0file.cc:2691),os_aio_handler(os0file.cc:2691),fil_aio_wait(fil0fil.cc:5815),io_handler_thread(srv0start.cc:308),start_thread(libpthread.so.0),clone(libc.so.6)
      3 pthread_cond_wait,wait(os0event.cc:165),os_event::wait_low(os0event.cc:165),srv_worker_thread(srv0srv.cc:2506),start_thread(libpthread.so.0),clone(libc.so.6)
      1 sigwait(libpthread.so.0),signal_hand(mysqld.cc:2104),pfs_spawn_thread(pfs.cc:2188),start_thread(libpthread.so.0),clone(libc.so.6)
      1 sigwaitinfo(libc.so.6),timer_notify_thread_func(posix_timers.c:77),pfs_spawn_thread(pfs.cc:2188),start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_wait,wait(os0event.cc:165),os_event::wait_low(os0event.cc:165),srv_purge_coordinator_suspend(srv0srv.cc:2662),srv_purge_coordinator_thread(srv0srv.cc:2662),start_thread(libpthread.so.0),clone(libc.so.6)
      1 pthread_cond_wait,wait(os0event.cc:165),os_event::wait_low(os0event.cc:165),buf_resize_thread(buf0buf.cc:3009),start_thread(libpthread.so.0),clone(libc.so.6)

  • 进行2次跟踪,每次间隔1秒,然后只需要输出每个分类的前三个函数
root@rht6-mysql ~/scripts # pt-pmp -p 2898 -l 3 -i 2 -s 1
Sat Dec 24 13:16:50 CST 2016

     20 libaio::??(libaio.so.1),LinuxAIOHandler::collect(os0file.cc:2489),LinuxAIOHandler::poll(os0file.cc:2635)
     12 pthread_cond_wait,wait(os0event.cc:165),os_event::wait_low(os0event.cc:165)
     12 pthread_cond_timedwait,os_event::timed_wait(os0event.cc:285),os_event::wait_time_low(os0event.cc:412)
      2 sigwait(libpthread.so.0),signal_hand(mysqld.cc:2104),pfs_spawn_thread(pfs.cc:2188)
      2 sigwaitinfo(libc.so.6),timer_notify_thread_func(posix_timers.c:77),pfs_spawn_thread(pfs.cc:2188)
      2 pthread_cond_wait,native_cond_wait(thr_cond.h:140),my_cond_wait(thr_cond.h:140)
      2 poll(libc.so.6),Mysqld_socket_listener::listen_for_connection_event(socket_connection.cc:848),connection_event_loop(connection_acceptor.h:66)
      2 nanosleep(libpthread.so.0),os_thread_sleep(os0thread.cc:254),srv_master_sleep(srv0srv.cc:2316)