Useful SystemTap Scripts

Date: 2021-02-16

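Note: the original of this article is Chapter 5. Useful SystemTap Scripts

Note: this translation is not finished yet; it was posted early purely to test the new table-of-contents feature. This note will be removed once the article is complete.

This chapter lists several SystemTap scripts that you can use to monitor and investigate different subsystems. Once you install the systemtap-testsuite RPM, all of these scripts are available in the /usr/share/systemtap/testsuite/systemtap.examples/ directory.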

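5.1 Network

The following sections showcase scripts that trace network-related functions and build a profile of network activity.

5.1.1 Network Profiling

This section describes how to profile network activity. nettop.stp provides a glimpse into how much network traffic each process is generating on a machine.

nettop.stp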

#! /usr/bin/env stap

global ifxmit, ifrecv
global ifmerged

probe netdev.transmit
{
  ifxmit[pid(), dev_name, execname(), uid()] <<< length
}

probe netdev.receive
{
  ifrecv[pid(), dev_name, execname(), uid()] <<< length
}

function print_activity()
{
  printf("%5s %5s %-7s %7s %7s %7s %7s %-15s\n",
         "PID", "UID", "DEV", "XMIT_PK", "RECV_PK",
         "XMIT_KB", "RECV_KB", "COMMAND")

  foreach ([pid, dev, exec, uid] in ifrecv) {
      ifmerged[pid, dev, exec, uid] += @count(ifrecv[pid,dev,exec,uid]);
  }
  foreach ([pid, dev, exec, uid] in ifxmit) {
      ifmerged[pid, dev, exec, uid] += @count(ifxmit[pid,dev,exec,uid]);
  }
  foreach ([pid, dev, exec, uid] in ifmerged-) {
    n_xmit = @count(ifxmit[pid, dev, exec, uid])
    n_recv = @count(ifrecv[pid, dev, exec, uid])
    printf("%5d %5d %-7s %7d %7d %7d %7d %-15s\n",
           pid, uid, dev, n_xmit, n_recv,
           n_xmit ? @sum(ifxmit[pid, dev, exec, uid])/1024 : 0,
           n_recv ? @sum(ifrecv[pid, dev, exec, uid])/1024 : 0,
           exec)
  }

  print("\n")

  delete ifxmit
  delete ifrecv
  delete ifmerged
}

probe timer.ms(5000), end, error
{
  print_activity()
}

Note that function print_activity() uses the following expressions:

n_xmit ? @sum(ifxmit[pid, dev, exec, uid])/1024 : 0
n_recv ? @sum(ifrecv[pid, dev, exec, uid])/1024 : 0

These expressions are if/else conditionals. The second one is a more concise way of writing the following pseudocode:

if n_recv != 0 then
  @sum(ifrecv[pid, dev, exec, uid])/1024
else
  0

nettop.stp tracks which processes are generating network traffic on the system, and provides the following information about each process:

PID — the ID of the listed process.
UID — user ID. A user ID of 0 refers to the root user.
DEV — which ethernet device the process used to send / receive data (for example, eth0, eth1)
XMIT_PK — number of packets transmitted by the process
RECV_PK — number of packets received by the process
XMIT_KB — amount of data sent by the process, in kilobytes
RECV_KB — amount of data received by the service, in kilobytes
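
As a rough illustration of how print_activity() derives the columns listed above, the following Python sketch models the same bookkeeping: packet lengths are collected per (pid, dev, command, uid) key, rows are ordered by total packet count, and the kilobyte totals are only computed when the packet count is non-zero. The event tuples and the helper names aggregate() and report() are hypothetical; this is a model of the logic, not a replacement for the script.

```python
from collections import defaultdict

# Hypothetical samples: (pid, dev, command, uid, packet_length_in_bytes).
xmit_events = [
    (2886, "eth0", "cups-polld", 4, 1500),
    (2886, "eth0", "cups-polld", 4, 600),
]
recv_events = [
    (11362, "eth0", "firefox", 0, 3000),
]


def aggregate(events):
    """Collect packet lengths per key, similar to the `<<<` operator."""
    stats = defaultdict(list)
    for pid, dev, cmd, uid, length in events:
        stats[(pid, dev, cmd, uid)].append(length)
    return stats


def report(xmit, recv):
    """Build one row per key, busiest keys (most packets) first."""
    ifxmit, ifrecv = aggregate(xmit), aggregate(recv)

    # Equivalent of ifmerged: total packet count per key.
    merged = defaultdict(int)
    for stats in (ifrecv, ifxmit):
        for key, lengths in stats.items():
            merged[key] += len(lengths)

    rows = []
    for key in sorted(merged, key=merged.get, reverse=True):
        n_xmit = len(ifxmit.get(key, []))
        n_recv = len(ifrecv.get(key, []))
        # Same guard as `n_xmit ? @sum(...)/1024 : 0` in the script.
        xmit_kb = sum(ifxmit[key]) // 1024 if n_xmit else 0
        recv_kb = sum(ifrecv[key]) // 1024 if n_recv else 0
        pid, dev, cmd, uid = key
        rows.append((pid, uid, dev, n_xmit, n_recv, xmit_kb, recv_kb, cmd))
    return rows


for row in report(xmit_events, recv_events):
    print("%5d %5d %-7s %7d %7d %7d %7d %-15s" % row)
```

The guard matters in the SystemTap script because a key may exist in only one of the two arrays; the sketch mirrors that by checking the packet count before summing.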

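nettop.stp provides a network profile sample every 5 seconds. You can change this interval by editing probe timer.ms(5000) accordingly. Example 5.1, "nettop.stp Sample Output" contains an excerpt of the output from nettop.stp over a 20-second period.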

Example 5.1. nettop.stp Sample Output

[...]
  PID   UID DEV     XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND
    0     0 eth0          0       5       0       0 swapper
11178     0 eth0          2       0       0       0 synergyc

  PID   UID DEV     XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND
 2886     4 eth0         79       0       5       0 cups-polld
11362     0 eth0          0      61       0       5 firefox
    0     0 eth0          3      32       0       3 swapper
 2886     4 lo            4       4       0       0 cups-polld
11178     0 eth0          3       0       0       0 synergyc

  PID   UID DEV     XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND
    0     0 eth0          0       6       0       0 swapper
 2886     4 lo            2       2       0       0 cups-polld
11178     0 eth0          3       0       0       0 synergyc
 3611     0 eth0          0       1       0       0 Xorg

  PID   UID DEV     XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND
    0     0 eth0          3      42       0       2 swapper
11178     0 eth0         43       1       3       0 synergyc
11362     0 eth0          0       7       0       0 firefox
 3897     0 eth0          0       1       0       0 multiload-apple
[...]

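5.1.2 Tracing Functions Called in Network Socket Code

This section describes how to trace functions called from the kernel's net/socket.c file. This task helps you identify, in finer detail, how each process interacts with the network at the kernel level.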

socket-trace.stp

#! /usr/bin/env stap

probe kernel.function("*@net/socket.c").call {
  printf ("%s -> %s\n", thread_indent(1), ppfunc())
}
probe kernel.function("*@net/socket.c").return {
  printf ("%s <- %s\n", thread_indent(-1), ppfunc())
}

socket-trace.stp is identical to Example 3.6, "thread_indent.stp", which was used earlier in SystemTap Functions to illustrate how thread_indent() works.

Example 5.2. socket-trace.stp Sample Output

[...]
0 Xorg(3611): -> sock_poll
3 Xorg(3611): <- sock_poll
0 Xorg(3611): -> sock_poll
3 Xorg(3611): <- sock_poll
0 gnome-terminal(11106): -> sock_poll
5 gnome-terminal(11106): <- sock_poll
0 scim-bridge(3883): -> sock_poll
3 scim-bridge(3883): <- sock_poll
0 scim-bridge(3883): -> sys_socketcall
4 scim-bridge(3883):  -> sys_recv
8 scim-bridge(3883):   -> sys_recvfrom
12 scim-bridge(3883):-> sock_from_file
16 scim-bridge(3883):<- sock_from_file
20 scim-bridge(3883):-> sock_recvmsg
24 scim-bridge(3883):<- sock_recvmsg
28 scim-bridge(3883):   <- sys_recvfrom
31 scim-bridge(3883):  <- sys_recv
35 scim-bridge(3883): <- sys_socketcall
[...]

Example 5.2, "socket-trace.stp Sample Output" contains a 3-second excerpt of the output from socket-trace.stp. For more information about the output that thread_indent() provides in this script, refer to SystemTap Functions, Example 3.6, "thread_indent.stp".
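
To make the indentation in the sample output easier to read, here is a simplified Python model of the behaviour: each thread keeps its own nesting depth, a call (delta of 1) deepens it, and a return (delta of -1) is printed at the same depth as its matching call. The helper indent_prefix() is hypothetical and deliberately omits the leading timestamp column that the real thread_indent() also emits; it is a sketch of the idea, not the tapset implementation.

```python
from collections import defaultdict

_depth = defaultdict(int)  # current nesting depth per thread id


def indent_prefix(name, tid, delta):
    """Return 'name(tid):' plus indentation for a call (+1) or return (-1)."""
    before = _depth[tid]
    _depth[tid] = max(before + delta, 0)
    # A call is printed at the new depth, a return at the depth it leaves.
    level = before + delta if delta > 0 else before
    return f"{name}({tid}):" + " " * max(level - 1, 0)


events = [
    ("call", "sys_socketcall"),
    ("call", "sys_recv"),
    ("return", "sys_recv"),
    ("return", "sys_socketcall"),
]

lines = []
for kind, func in events:
    if kind == "call":
        lines.append(f"{indent_prefix('scim-bridge', 3883, 1)} -> {func}")
    else:
        lines.append(f"{indent_prefix('scim-bridge', 3883, -1)} <- {func}")

print("\n".join(lines))
```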

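5.1.3 Monitoring Incoming TCP Connections

This section illustrates how to monitor incoming TCP connections. This task is useful for identifying unauthorized, suspicious, or otherwise unwanted network access requests in real time.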

tcp_connections.stp

#! /usr/bin/env stap

probe begin {
  printf("%6s %16s %6s %6s %16s\n",
         "UID", "CMD", "PID", "PORT", "IP_SOURCE")
}

probe kernel.function("tcp_accept").return?,
      kernel.function("inet_csk_accept").return? {
  sock = $return
  if (sock != 0)
    printf("%6d %16s %6d %6d %16s\n", uid(), execname(), pid(),
           inet_get_local_port(sock), inet_get_ip_source(sock))
}

While tcp_connections.stp is running, it prints the following information about every incoming TCP connection accepted by the system, in real time:

Current UID
CMD - the command accepting the connection
PID of the command
Port used by the connection
IP address from which the TCP connection originated

Example 5.3. tcp_connections.stp Sample Output

UID            CMD    PID   PORT        IP_SOURCE
0             sshd   3165     22      10.64.0.227
0             sshd   3165     22      10.64.0.227

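5.1.4 Monitoring TCP Packets

This section illustrates how to monitor TCP packets received by the system. This is useful for analyzing the network traffic generated by applications running on the system.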

tcpdumplike.stp

#! /usr/bin/env stap

// A TCP dump like example

probe begin, timer.s(1) {
  printf("-----------------------------------------------------------------\n")
  printf("       Source IP         Dest IP  SPort  DPort  U  A  P  R  S  F \n")
  printf("-----------------------------------------------------------------\n")
}

probe udp.recvmsg /* ,udp.sendmsg */ {
  printf(" %15s %15s  %5d  %5d  UDP\n",
         saddr, daddr, sport, dport)
}

probe tcp.receive {
  printf(" %15s %15s  %5d  %5d  %d  %d  %d  %d  %d  %d\n",
         saddr, daddr, sport, dport, urg, ack, psh, rst, syn, fin)
}

While tcpdumplike.stp is running, it prints the following information about every received TCP packet, in real time:

Source and destination IP address (saddr, daddr, respectively)
Source and destination ports (sport, dport, respectively)
Packet flags

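To determine the flags used by a packet, tcpdumplike.stp uses the following values, which the script reads inside the tcp.receive probe:

urg - urgent
ack - acknowledgement
psh - push
rst - reset
syn - synchronize
fin - finished

Each of these values is 1 or 0, indicating whether the packet has the corresponding flag set.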
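
For readers who want to relate the U A P R S F columns to the TCP header itself, the following Python sketch decodes the flags byte of a TCP header into the same six 0/1 values. The bit positions are those of the standard TCP header; the function name decode_flags() and the sample byte values are illustrative assumptions, not part of tcpdumplike.stp.

```python
# Bit masks of the six classic TCP flags within the header's flags byte.
TCP_FLAG_BITS = {
    "urg": 0x20,
    "ack": 0x10,
    "psh": 0x08,
    "rst": 0x04,
    "syn": 0x02,
    "fin": 0x01,
}


def decode_flags(flags_byte):
    """Return a dict mapping each flag name to 1 (set) or 0 (clear)."""
    return {name: 1 if flags_byte & bit else 0
            for name, bit in TCP_FLAG_BITS.items()}


def format_row(flags_byte):
    """Render the flags in the U A P R S F column order used by the script."""
    flags = decode_flags(flags_byte)
    order = ("urg", "ack", "psh", "rst", "syn", "fin")
    return "  ".join(str(flags[name]) for name in order)


print(format_row(0x12))  # SYN+ACK
print(format_row(0x18))  # PSH+ACK
```

A row reading 0 1 0 0 1 0 in the script's output therefore corresponds to a SYN+ACK segment, and 0 1 1 0 0 0 to a PSH+ACK segment.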

Example 5.4. tcpdumplike.stp Sample Output

-----------------------------------------------------------------
       Source IP         Dest IP  SPort  DPort  U  A  P  R  S  F
-----------------------------------------------------------------
  209.85.229.147       10.0.2.15     80  20373  0  1  1  0  0  0
  92.122.126.240       10.0.2.15     80  53214  0  1  0  0  1  0
  92.122.126.240       10.0.2.15     80  53214  0  1  0  0  0  0
  209.85.229.118       10.0.2.15     80  63433  0  1  0  0  1  0
  209.85.229.118       10.0.2.15     80  63433  0  1  0  0  0  0
  209.85.229.147       10.0.2.15     80  21141  0  1  1  0  0  0
  209.85.229.147       10.0.2.15     80  21141  0  1  1  0  0  0
  209.85.229.147       10.0.2.15     80  21141  0  1  1  0  0  0
  209.85.229.147       10.0.2.15     80  21141  0  1  1  0  0  0
  209.85.229.147       10.0.2.15     80  21141  0  1  1  0  0  0
  209.85.229.118       10.0.2.15     80  63433  0  1  1  0  0  0
[...]

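5.1.5 Monitoring Network Packet Drops in the Kernel

The Linux network stack can discard packets for various reasons. Some Linux kernels include a tracepoint, kernel.trace("kfree_skb"), which makes it easy to track where packets are discarded. dropwatch.stp uses kernel.trace("kfree_skb") to trace packet drops; the script summarizes the locations at which packets were dropped in each 5-second interval.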

dropwatch.stp

#! /usr/bin/env stap

############################################################
# Dropwatch.stp
# Author: Neil Horman <nhorman@redhat.com>
# An example script to mimic the behavior of the dropwatch utility
# http://fedorahosted.org/dropwatch
############################################################

# Array to hold the list of drop points we find
global locations

# Note when we turn the monitor on and off
probe begin { printf("Monitoring for dropped packets\n") }
probe end { printf("Stopping dropped packet monitor\n") }

# increment a drop counter for every location we drop at
probe kernel.trace("kfree_skb") { locations[$location] <<< 1 }

# Every 5 seconds report our drop locations
probe timer.sec(5)
{
  printf("\n")
  foreach (l in locations-) {
    printf("%d packets dropped at %s\n",
           @count(locations[l]), symname(l))
  }
  delete locations
}

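kernel.trace("kfree_skb") traces the places in the kernel where network packets are dropped. It has two arguments: a pointer to the buffer being freed ($skb) and the location in kernel code where the buffer is being freed ($location). The dropwatch.stp script reports the function containing $location. The information needed to map $location back to a function is not included in the instrumentation by default. In SystemTap 1.4, the --all-modules option includes the required mapping information, and the following command can be used to run the script: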

stap --all-modules dropwatch.stp

In older versions of SystemTap, you can use the following command to emulate the --all-modules option:

stap -dkernel \
`cat /proc/modules | awk 'BEGIN { ORS = " " } {print "-d"$1}'` \
dropwatch.stp

Running the dropwatch.stp script for 15 seconds produces output similar to Example 5.5, "dropwatch.stp Sample Output".

Example 5.5. dropwatch.stp Sample Output

Monitoring for dropped packets

1762 packets dropped at unix_stream_recvmsg
4 packets dropped at tun_do_read
2 packets dropped at nf_hook_slow

467 packets dropped at unix_stream_recvmsg
20 packets dropped at nf_hook_slow
6 packets dropped at tun_do_read

446 packets dropped at unix_stream_recvmsg
4 packets dropped at tun_do_read
4 packets dropped at nf_hook_slow
Stopping dropped packet monitor

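When the script is compiled on one machine and run on another, neither the --all-modules option nor the /proc/modules listing can be used, and the symname function prints raw addresses instead. To make the raw addresses of packet drops more meaningful, refer to the /boot/System.map-`uname -r` file. This file lists the starting address of each function, which lets you map the addresses in the output of Example 5.5, "dropwatch.stp Sample Output" to specific function names. Given the following snippet of the /boot/System.map-`uname -r` file, the address 0xffffffff8149a8ed maps to the function unix_stream_recvmsg: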

[...]
ffffffff8149a420 t unix_dgram_poll
ffffffff8149a5e0 t unix_stream_recvmsg
ffffffff8149ad00 t unix_find_other
[...]
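
The lookup described above amounts to finding the last symbol whose starting address is not greater than the raw address. The Python sketch below does this with a binary search over the three System.map lines quoted in the snippet; parse_system_map() and lookup() are hypothetical helper names, and a real System.map would simply be read from the file instead of the embedded string.

```python
import bisect

# The three lines quoted from /boot/System.map-`uname -r` above.
SYSTEM_MAP_SNIPPET = """\
ffffffff8149a420 t unix_dgram_poll
ffffffff8149a5e0 t unix_stream_recvmsg
ffffffff8149ad00 t unix_find_other
"""


def parse_system_map(text):
    """Return a sorted list of (start_address, symbol_name) tuples."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def lookup(symbols, address):
    """Return the symbol whose range contains address, or None."""
    addresses = [addr for addr, _ in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    return symbols[index][1] if index >= 0 else None


symbols = parse_system_map(SYSTEM_MAP_SNIPPET)
print(lookup(symbols, 0xFFFFFFFF8149A8ED))
```

Because only start addresses are known, an address past the last listed symbol is attributed to that last symbol; with a complete System.map this is rarely a concern.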

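5.2 Disk

The following sections showcase scripts that monitor disk and I/O activity.

5.2.1 Summarizing Disk Read/Write Traffic

This section describes how to identify which processes are performing the heaviest disk reads and writes.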

disktop.stp

#!/usr/bin/env stap 
#
# Copyright (C) 2007 Oracle Corp.
#
# Get the status of reading/writing disk every 5 seconds,
# output top ten entries 
#
# This is free software,GNU General Public License (GPL);
# either version 2, or (at your option) any later version.
#
# Usage:
#  ./disktop.stp
#

global io_stat,device
global read_bytes,write_bytes

probe vfs.read.return {
  if ($return>0) {
    if (devname!="N/A") {/*skip read from cache*/
      io_stat[pid(),execname(),uid(),ppid(),"R"] += $return
      device[pid(),execname(),uid(),ppid(),"R"] = devname
      read_bytes += $return
    }
  }
}

probe vfs.write.return {
  if ($return>0) {
    if (devname!="N/A") { /*skip update cache*/
      io_stat[pid(),execname(),uid(),ppid(),"W"] += $return
      device[pid(),execname(),uid(),ppid(),"W"] = devname
      write_bytes += $return
    }
  }
}

probe timer.ms(5000) {
  /* skip non-read/write disk */
  if (read_bytes+write_bytes) {

    printf("\n%-25s, %-8s%4dKb/sec, %-7s%6dKb, %-7s%6dKb\n\n",
           ctime(gettimeofday_s()),
           "Average:", ((read_bytes+write_bytes)/1024)/5,
           "Read:",read_bytes/1024,
           "Write:",write_bytes/1024)

    /* print header */
    printf("%8s %8s %8s %25s %8s %4s %12s\n",
           "UID","PID","PPID","CMD","DEVICE","T","BYTES")
  }
  /* print top ten I/O */
  foreach ([process,cmd,userid,parent,action] in io_stat- limit 10)
    printf("%8d %8d %8d %25s %8s %4s %12d\n",
           userid,process,parent,cmd,
           device[process,cmd,userid,parent,action],
           action,io_stat[process,cmd,userid,parent,action])

  /* clear data */
  delete io_stat
  delete device
  read_bytes = 0
  write_bytes = 0  
}

probe end{
  delete io_stat
  delete device
  delete read_bytes
  delete write_bytes
}

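disktop.stp outputs the top ten processes responsible for the heaviest reads from and writes to disk. Example 5.6, "disktop.stp Sample Output" shows sample output from this script, which includes the following data for each listed process: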

UID — user ID. A user ID of 0 refers to the root user.
PID — the ID of the listed process.
PPID — the process ID of the listed process's parent process.
CMD — the name of the listed process.
DEVICE — which storage device the listed process is reading from or writing to.
T — the type of action performed by the listed process; W refers to write, while R refers to read.
BYTES — the amount of data read from or written to disk.

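The time and date in the output of disktop.stp are produced by the functions ctime() and gettimeofday_s(). gettimeofday_s() returns the number of seconds elapsed since the UNIX epoch (January 1, 1970), and ctime() converts that count into calendar time, which gives a fairly accurate human-readable timestamp for the output.

In this script, $return is a local variable that stores the actual number of bytes each process reads from or writes to the virtual file system. $return can only be used in return probes (for example, vfs.read.return).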

Example 5.6. disktop.stp Sample Output

[...]
Mon Sep 29 03:38:28 2008 , Average:  19Kb/sec, Read: 7Kb, Write: 89Kb

UID      PID     PPID                       CMD   DEVICE    T    BYTES
0    26319    26294                   firefox     sda5    W        90229
0     2758     2757           pam_timestamp_c     sda5    R         8064
0     2885        1                     cupsd     sda5    W         1678

Mon Sep 29 03:38:38 2008 , Average:   1Kb/sec, Read: 7Kb, Write: 1Kb

UID      PID     PPID                       CMD   DEVICE    T    BYTES
0     2758     2757           pam_timestamp_c     sda5    R         8064
0     2885        1                     cupsd     sda5    W         1678
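
The summary line in Example 5.6 can be checked by hand against the per-process BYTES column. The Python sketch below repeats the script's integer arithmetic for the first interval, using the byte counts shown in the sample output; the function name summary() is hypothetical, and the 5-second interval is taken from probe timer.ms(5000).

```python
def summary(read_bytes, write_bytes, interval_s=5):
    """Return (average KB/s, read KB, written KB) using integer division."""
    average = ((read_bytes + write_bytes) // 1024) // interval_s
    return average, read_bytes // 1024, write_bytes // 1024


# First interval of Example 5.6: one reader, two writers.
read_total = 8064             # pam_timestamp_c
write_total = 90229 + 1678    # firefox + cupsd

avg_kb_s, read_kb, write_kb = summary(read_total, write_total)
print("Average: %dKb/sec, Read: %dKb, Write: %dKb" % (avg_kb_s, read_kb, write_kb))
```

Note that every division truncates, which is why the read and write figures do not add up exactly to five times the average.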

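5.2.2 Tracking I/O Time for Each File Read or Write

This section describes how to monitor the amount of time each process takes to read from or write to any file. This is useful for determining which files are slow to load on a system.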

iotime.stp

#! /usr/bin/env stap

/*
 * Copyright (C) 2006-2007 Red Hat Inc.
 * 
 * This copyrighted material is made available to anyone wishing to use,
 * modify, copy, or redistribute it subject to the terms and conditions
 * of the GNU General Public License v.2.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 *
 * Print out the amount of time spent in the read and write systemcall
 * when each file opened by the process is closed. Note that the systemtap 
 * script needs to be running before the open operations occur for
 * the script to record data.
 *
 * This script could be used to find out which files are slow to load
 * on a machine. e.g.
 *
 * stap iotime.stp -c 'firefox'
 *
 * Output format is:
 * timestamp pid (executable) info_type path ...
 *
 * 200283135 2573 (cupsd) access /etc/printcap read: 0 write: 7063
 * 200283143 2573 (cupsd) iotime /etc/printcap time: 69
 *
 */

global start
global time_io

function timestamp:long() { return gettimeofday_us() - start }

function proc:string() { return sprintf("%d (%s)", pid(), execname()) }

probe begin { start = gettimeofday_us() }

global filehandles, fileread, filewrite

probe syscall.open.return {
  filename = user_string($filename)
  if ($return != -1) {
    filehandles[pid(), $return] = filename
  } else {
    printf("%d %s access %s fail\n", timestamp(), proc(), filename)
  }
}

probe syscall.read.return {
  p = pid()
  fd = $fd
  bytes = $return
  time = gettimeofday_us() - @entry(gettimeofday_us())
  if (bytes > 0)
    fileread[p, fd] += bytes
  time_io[p, fd] <<< time
}

probe syscall.write.return {
  p = pid()
  fd = $fd
  bytes = $return
  time = gettimeofday_us() - @entry(gettimeofday_us())
  if (bytes > 0)
    filewrite[p, fd] += bytes
  time_io[p, fd] <<< time
}

probe syscall.close {
  if ([pid(), $fd] in filehandles) {
    printf("%d %s access %s read: %d write: %d\n",
           timestamp(), proc(), filehandles[pid(), $fd],
           fileread[pid(), $fd], filewrite[pid(), $fd])
    if (@count(time_io[pid(), $fd]))
      printf("%d %s iotime %s time: %d\n",  timestamp(), proc(),
             filehandles[pid(), $fd], @sum(time_io[pid(), $fd]))
   }
  delete fileread[pid(), $fd]
  delete filewrite[pid(), $fd]
  delete filehandles[pid(), $fd]
  delete time_io[pid(),$fd]
}

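iotime.stp tracks each time a system call opens, closes, reads from, or writes to a file. For each file that a system call accesses, iotime.stp counts the number of microseconds the reads or writes take and tracks the amount of data (in bytes) read from or written to the file.

The original text notes that iotime.stp also uses the local variable $count to track the amount of data that a system call attempts to read or write; the listing above records $return instead, which (as in disktop.stp from Section 5.2.1, "Summarizing Disk Read/Write Traffic") stores the amount of data actually read or written. $count can only be used in probes that track data reads or writes (that is, syscall.read and syscall.write).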

Example 5.7. iotime.stp Sample Output

[...]
825946 3364 (NetworkManager) access /sys/class/net/eth0/carrier read: 8190 write: 0
825955 3364 (NetworkManager) iotime /sys/class/net/eth0/carrier time: 9
[...]
117061 2460 (pcscd) access /dev/bus/usb/003/001 read: 43 write: 0
117065 2460 (pcscd) iotime /dev/bus/usb/003/001 time: 7
[...]
3973737 2886 (sendmail) access /proc/loadavg read: 4096 write: 0
3973744 2886 (sendmail) iotime /proc/loadavg time: 11
[...]

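Example 5.7, "iotime.stp Sample Output" prints the following data:

A timestamp, in microseconds.
The process ID and process name.
An access or iotime flag.
The file accessed.

If a process was able to read or write any data, a pair of access and iotime lines appear together. The timestamp of the access line refers to the time at which the process accessed the file; the end of the line shows the number of bytes read and written. The iotime line shows the amount of time, in microseconds, that the process spent performing the reads or writes.

If an access line is not followed by an iotime line, the process did not read or write any data.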
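
When post-processing this output, each line can be split into the fields described above. The following Python sketch parses the sample lines from Example 5.7 with a regular expression; parse_line() and its dictionary keys are hypothetical, and the pattern assumes that file paths contain no spaces.

```python
import re

# timestamp pid (command) access|iotime path key: value ...
LINE_RE = re.compile(r"^(\d+) (\d+) \((.*?)\) (access|iotime) (\S+) (.*)$")


def parse_line(line):
    """Parse one iotime.stp output line into a dict, or return None."""
    match = LINE_RE.match(line.strip())
    if not match:
        return None
    timestamp, pid, command, kind, path, rest = match.groups()
    # Turn "read: 8190 write: 0" or "time: 9" into {"read": 8190, ...}.
    tokens = rest.replace(":", "").split()
    fields = {tokens[i]: int(tokens[i + 1])
              for i in range(0, len(tokens) - 1, 2)}
    return {
        "timestamp_us": int(timestamp),
        "pid": int(pid),
        "command": command,
        "kind": kind,
        "path": path,
        **fields,
    }


sample = [
    "825946 3364 (NetworkManager) access /sys/class/net/eth0/carrier read: 8190 write: 0",
    "825955 3364 (NetworkManager) iotime /sys/class/net/eth0/carrier time: 9",
]
records = [parse_line(line) for line in sample]
print(records)
```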

5.2.3 Tracking Cumulative I/O

This section describes how to track the cumulative amount of I/O on the system.

traceio.stp

#! /usr/bin/env stap
# traceio.stp
# Copyright (C) 2007 Red Hat, Inc., Eugene Teo <eteo@redhat.com>
# Copyright (C) 2009 Kai Meyer <kai@unixlords.com>
#   Fixed a bug that allows this to run longer
#   And added the humanreadable function
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License version 2 as
# published by the Free Software Foundation.
#

global reads, writes, total_io

probe vfs.read.return {
  if ($return > 0) {
    reads[pid(),execname()] += $return
    total_io[pid(),execname()] += $return
  }
}

probe vfs.write.return {
  if ($return > 0) {
    writes[pid(),execname()] += $return
    total_io[pid(),execname()] += $return
  }
}

function humanreadable(bytes) {
  if (bytes > 1024*1024*1024) {
    return sprintf("%d GiB", bytes/1024/1024/1024)
  } else if (bytes > 1024*1024) {
    return sprintf("%d MiB", bytes/1024/1024)
  } else if (bytes > 1024) {
    return sprintf("%d KiB", bytes/1024)
  } else {
    return sprintf("%d   B", bytes)
  }
}

probe timer.s(1) {
  foreach([p,e] in total_io- limit 10)
    printf("%8d %15s r: %12s w: %12s\n",
           p, e, humanreadable(reads[p,e]),
           humanreadable(writes[p,e]))
  printf("\n")
  # Note we don't zero out reads, writes and total_io,
  # so the values are cumulative since the script started.
}

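traceio.stp prints the top ten executables generating I/O traffic over time. It also tracks the cumulative amount of I/O reads and writes performed by those ten executables. This information is tracked and printed at 1-second intervals, in descending order.

Note that traceio.stp also uses the local variable $return, which is also used by disktop.stp in Section 5.2.1, "Summarizing Disk Read/Write Traffic".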
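
The unit selection in the script's humanreadable() function is easy to misread at the boundaries, so here is a direct Python port for experimentation. It is a line-for-line translation of the function in the listing above and keeps its strict greater-than comparisons, which means that exactly 1024 bytes is still reported in bytes.

```python
def humanreadable(num_bytes):
    """Format a byte count the way traceio.stp's humanreadable() does."""
    if num_bytes > 1024 * 1024 * 1024:
        return "%d GiB" % (num_bytes // 1024 // 1024 // 1024)
    elif num_bytes > 1024 * 1024:
        return "%d MiB" % (num_bytes // 1024 // 1024)
    elif num_bytes > 1024:
        return "%d KiB" % (num_bytes // 1024)
    else:
        return "%d   B" % num_bytes


for value in (512, 1024, 2048, 5 * 1024 * 1024, 3 * 1024 ** 3 + 1):
    print(value, "->", humanreadable(value))
```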

Example 5.8. traceio.stp Sample Output

[...]
           Xorg r:   583401 KiB w:        0 KiB
       floaters r:       96 KiB w:     7130 KiB
multiload-apple r:      538 KiB w:      537 KiB
           sshd r:       71 KiB w:       72 KiB
pam_timestamp_c r:      138 KiB w:        0 KiB
        staprun r:       51 KiB w:       51 KiB
          snmpd r:       46 KiB w:        0 KiB
          pcscd r:       28 KiB w:        0 KiB
     irqbalance r:       27 KiB w:        4 KiB
          cupsd r:        4 KiB w:       18 KiB

           Xorg r:   588140 KiB w:        0 KiB
       floaters r:       97 KiB w:     7143 KiB
multiload-apple r:      543 KiB w:      542 KiB
           sshd r:       72 KiB w:       72 KiB
pam_timestamp_c r:      138 KiB w:        0 KiB
        staprun r:       51 KiB w:       51 KiB
          snmpd r:       46 KiB w:        0 KiB
          pcscd r:       28 KiB w:        0 KiB
     irqbalance r:       27 KiB w:        4 KiB
          cupsd r:        4 KiB w:       18 KiB

5.2.4 I/O Monitoring (By Device)

This section describes how to monitor I/O activity on a specific device.

traceio2.stp

#! /usr/bin/env stap

global device_of_interest

probe begin {
  /* The following is not the most efficient way to do this.
      One could directly put the result of usrdev2kerndev()
      into device_of_interest.  However, want to test out
      the other device functions */
  dev = usrdev2kerndev($1)
  device_of_interest = MKDEV(MAJOR(dev), MINOR(dev))
}

probe vfs.write, vfs.read
{
  if (dev == device_of_interest)
    printf ("%s(%d) %s 0x%x\n",
            execname(), pid(), ppfunc(), dev)
}

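traceio2.stp takes one argument: the whole device number. To get this number, use stat -c "0x%D" directory, where directory is located on the device you want to monitor.

The usrdev2kerndev() function converts the whole device number into the format understood by the kernel. The output of usrdev2kerndev() is used in conjunction with the MKDEV(), MINOR(), and MAJOR() functions to determine the major and minor numbers of a specific device.

The output of traceio2.stp includes the name and ID of every process performing a read or write, the function it is performing (vfs_read or vfs_write), and the kernel device number.

The following example is an excerpt from the full output of stap traceio2.stp 0x805, where 0x805 is the whole device number of /home. /home resides on /dev/sda5, which is the device we want to monitor.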

Example 5.9. traceio2.stp Sample Output

[...]
synergyc(3722) vfs_read 0x800005
synergyc(3722) vfs_read 0x800005
cupsd(2889) vfs_write 0x800005
cupsd(2889) vfs_write 0x800005
cupsd(2889) vfs_write 0x800005
[...]
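
The sample output shows 0x800005 rather than the 0x805 passed on the command line because the kernel stores device numbers in a different layout from the one reported by stat. The Python sketch below reproduces that conversion under the assumption that the kernel uses a 20-bit minor field and the usual user-space encoding with the major number in bits 8 to 19; the function names mirror the kernel helpers but are reimplemented here purely for illustration.

```python
MINORBITS = 20  # assumed width of the minor field in the kernel's dev_t


def decode_user_dev(dev):
    """Split a user-space device number (as printed by stat) into major, minor."""
    major = (dev & 0xFFF00) >> 8
    minor = (dev & 0xFF) | ((dev >> 12) & 0xFFF00)
    return major, minor


def mkdev(major, minor):
    """Combine major and minor into the kernel's internal device number."""
    return (major << MINORBITS) | minor


def usrdev_to_kerndev(dev):
    """Sketch of the conversion traceio2.stp performs in its begin probe."""
    major, minor = decode_user_dev(dev)
    return mkdev(major, minor)


print(decode_user_dev(0x805))
print(hex(usrdev_to_kerndev(0x805)))
```

Here 0x805 decodes to major 8 and minor 5, which is consistent with /dev/sda5 and with the 0x800005 value in the sample output.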

5.2.5 Monitoring Reads and Writes to a File

This section describes how to monitor reads from and writes to a file in real time.

inodewatch.stp

#! /usr/bin/env stap

probe vfs.write, vfs.read
{
  # dev and ino are defined by vfs.write and vfs.read
  if (dev == MKDEV($1,$2) # major/minor device
      && ino == $3)
    printf ("%s(%d) %s 0x%x/%u\n",
      execname(), pid(), ppfunc(), dev, ino)
}