Coredump: configuration, generation, analysis, and a worked example

Keywords: coredump, core_pattern, coredump_filter, and so on.

 

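An application may exit while running because of various exceptions or bugs. When certain conditions are met, the kernel then produces a core file.

A core file typically contains the program's memory at that moment, the register state, the stack pointer, memory management information, and the function call stack.

In other words, a core file is the program's working state at the time of the crash, saved to a file. By analyzing this file with tools you can recover the call stack and other details at the point where the program exited abnormally, then locate and fix the problem.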

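1. Configuring coredump

Coredump has to be enabled through ulimit. Run ulimit -c to see the current core file size limit; a value of 0 means coredump is disabled.

Run ulimit -c unlimited to enable coredump.

With the kernel's default core_pattern, the core file is written to the process's current working directory and is named core.

The name and location can be changed through /proc/sys/kernel/core_pattern.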

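%p  PID of the dumping process
%u  UID of the dumping process
%s  Number of the signal that caused the dump
%t  Time of the dump, in seconds since 1970-01-01 00:00:00
%e  Name of the executable of the dumping process

For example (as root): echo "core-%e-%p-%s-%t" > /proc/sys/kernel/core_pattern

Every process also has a coredump_filter node at /proc/<pid>/coredump_filter.

coredump_filter selects which kinds of memory are written to the core file when a coredump happens.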

  - (bit 0) anonymous private memory
  - (bit 1) anonymous shared memory
  - (bit 2) file-backed private memory
  - (bit 3) file-backed shared memory
  - (bit 4) ELF header pages in file-backed private memory areas (it is effective only if the bit 2 is cleared)
  - (bit 5) hugetlb private memory
  - (bit 6) hugetlb shared memory
  - (bit 7) DAX private memory
  - (bit 8) DAX shared memory 

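The default value of coredump_filter is 0x33 (bits 0, 1, 4, and 5), so a coredump saves all anonymous memory, the ELF header pages, and hugetlb private memory.

coredump_filter is inherited by child processes. The current process's value can be set with echo 0xXX > /proc/self/coredump_filter.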

static ssize_t proc_coredump_filter_write(struct file *file,
                      const char __user *buf,
                      size_t count,
                      loff_t *ppos)
{
...
    ret = kstrtouint_from_user(buf, count, 0, &val);    /* Convert the user buffer to the integer val. */
    if (ret < 0)
        return ret;
...
    for (i = 0, mask = 1; i < MMF_DUMP_FILTER_BITS; i++, mask <<= 1) {
        if (val & mask)
            /* Map the coredump_filter value onto mm->flags for use during a later coredump. */
            set_bit(i + MMF_DUMP_FILTER_SHIFT, &mm->flags);
        else
            clear_bit(i + MMF_DUMP_FILTER_SHIFT, &mm->flags);
    }
...
}

MMF_DUMP_FILTER_SHIFT is 2, so the bits of coredump_filter map onto mm->flags as follows.

#define MMF_DUMP_ANON_PRIVATE    2
#define MMF_DUMP_ANON_SHARED    3
#define MMF_DUMP_MAPPED_PRIVATE    4
#define MMF_DUMP_MAPPED_SHARED    5
#define MMF_DUMP_ELF_HEADERS    6
#define MMF_DUMP_HUGETLB_PRIVATE 7
#define MMF_DUMP_HUGETLB_SHARED  8
#define MMF_DUMP_DAX_PRIVATE    9
#define MMF_DUMP_DAX_SHARED    10

 

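2. How coredump works

On the do_signal() path the kernel decides, based on the signal, whether to trigger a coredump. The decision also depends on the core size limit, mm->flags, and so on.

Once the conditions are met, do_coredump() generates the core file; the real work is done by binfmt->core_dump().

2.1 What triggers a coredump?

When the kernel returns to user space, it calls do_signal() to handle pending signals.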

static void do_signal(struct pt_regs *regs, int syscall)
{
    unsigned int retval = 0, continue_addr = 0, restart_addr = 0;
    struct ksignal ksig;
...
    if (get_signal(&ksig)) {
...
    }
...
}

int get_signal(struct ksignal *ksig)
{
...
    for (;;) {
        struct k_sigaction *ka;
...
        signr = dequeue_signal(current, &current->blocked, &ksig->info);
...
        /* Trace actually delivered signals. */
        trace_signal_deliver(signr, &ksig->info, ka);
...
        if (sig_kernel_coredump(signr)) {
            /* Enabled with kernel.print-fatal-signals = 1, i.e. /proc/sys/kernel/print-fatal-signals. */
            if (print_fatal_signals)
                /* Print the signal and the stack information of the current context. */
                print_fatal_signal(ksig->info.si_signo);
            proc_coredump_connector(current);
            do_coredump(&ksig->info);
        }
...
    }
    spin_unlock_irq(&sighand->siglock);

    ksig->sig = signr;
    return ksig->sig > 0;
}

#define sig_kernel_coredump(sig)    siginmask(sig, SIG_KERNEL_COREDUMP_MASK)

  #define SIG_KERNEL_COREDUMP_MASK (\
    rt_sigmask(SIGQUIT) | rt_sigmask(SIGILL) | \
    rt_sigmask(SIGTRAP) | rt_sigmask(SIGABRT) | \
    rt_sigmask(SIGFPE) | rt_sigmask(SIGSEGV) | \
    rt_sigmask(SIGBUS) | rt_sigmask(SIGSYS) | \
    rt_sigmask(SIGXCPU) | rt_sigmask(SIGXFSZ) | \
    SIGEMT_MASK )

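get_signal() checks whether the signal is one that causes a coredump. These signals are SIGQUIT, SIGILL, SIGTRAP, SIGABRT, SIGFPE, SIGSEGV, SIGBUS, SIGSYS, SIGXCPU, and SIGXFSZ, plus SIGEMT on architectures that define it.

"Terminate w/core" means that a memory image of the process is left in a file named core in the process's current working directory. (The file name core shows how long this feature has been part of UNIX.)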

void proc_coredump_connector(struct task_struct *task)
{
    struct cn_msg *msg;
    struct proc_event *ev;
    __u8 buffer[CN_PROC_MSG_SIZE] __aligned(8);

    if (atomic_read(&proc_event_num_listeners) < 1)
        return;

    msg = buffer_to_cn_msg(buffer);
    ev = (struct proc_event *)msg->data;
    memset(&ev->event_data, 0, sizeof(ev->event_data));
    ev->timestamp_ns = ktime_get_ns();
    ev->what = PROC_EVENT_COREDUMP;
    ev->event_data.coredump.process_pid = task->pid;
    ev->event_data.coredump.process_tgid = task->tgid;

    memcpy(&msg->id, &cn_proc_event_id, sizeof(msg->id));
    msg->ack = 0; /* not used */
    msg->len = sizeof(*ev);
    msg->flags = 0; /* not used */
    send_msg(msg);
} 

2.2 How is the coredump generated?

void do_coredump(const siginfo_t *siginfo)
{
    struct core_state core_state;
    struct core_name cn;
    struct mm_struct *mm = current->mm;
    struct linux_binfmt * binfmt;
    const struct cred *old_cred;
    struct cred *cred;
    int retval = 0;
    int ispipe;
    struct files_struct *displaced;
    /* require nonrelative corefile path and be extra careful */
    bool need_suid_safe = false;
    bool core_dumped = false;
    static atomic_t core_dump_count = ATOMIC_INIT(0);
    struct coredump_params cprm = {
        .siginfo = siginfo,
        .regs = signal_pt_regs(),
        .limit = rlimit(RLIMIT_CORE),
        /*
         * We must use the same mm->flags while dumping core to avoid
         * inconsistency of bit flags, since this flag is not protected
         * by any locks.
         */
        .mm_flags = mm->flags,
    };

    audit_core_dumps(siginfo->si_signo);

    /* Get the binary format handler (program loader) used by the current process. */
    binfmt = mm->binfmt;
    if (!binfmt || !binfmt->core_dump)
        goto fail;
    /*
     * The low two bits of mm->flags decide whether a dump is allowed:
     * SUID_DUMP_DISABLE (0) forbids it, every other value allows it.
     */
    if (!__get_dumpable(cprm.mm_flags))
        goto fail;

    cred = prepare_creds();
    if (!cred)
        goto fail;
    /*
     * We cannot trust fsuid as being the "true" uid of the process
     * nor do we know its entire history. We only know it was tainted
     * so we dump it as root in mode 2, and only into a controlled
     * environment (pipe handler or fully qualified path).
     */
    /* Distinguish SUID_DUMP_USER from SUID_DUMP_ROOT. */
    if (__get_dumpable(cprm.mm_flags) == SUID_DUMP_ROOT) {
        /* Setuid core dump mode */
        cred->fsuid = GLOBAL_ROOT_UID;    /* Dump root private */
        need_suid_safe = true;
    }

    retval = coredump_wait(siginfo->si_signo, &core_state);
    if (retval < 0)
        goto fail_creds;

    old_cred = override_creds(cred);

    ispipe = format_corename(&cn, &cprm);

    if (ispipe) {
        int dump_count;
        char **helper_argv;
        struct subprocess_info *sub_info;

        if (ispipe < 0) {
            printk(KERN_WARNING "format_corename failed\n");
            printk(KERN_WARNING "Aborting core\n");
            goto fail_unlock;
        }

        if (cprm.limit == 1) {
            printk(KERN_WARNING
                "Process %d(%s) has RLIMIT_CORE set to 1\n",
                task_tgid_vnr(current), current->comm);
            printk(KERN_WARNING "Aborting core\n");
            goto fail_unlock;
        }
        cprm.limit = RLIM_INFINITY;

        dump_count = atomic_inc_return(&core_dump_count);
        if (core_pipe_limit && (core_pipe_limit < dump_count)) {
            printk(KERN_WARNING "Pid %d(%s) over core_pipe_limit\n",
                   task_tgid_vnr(current), current->comm);
            printk(KERN_WARNING "Skipping core dump\n");
            goto fail_dropcount;
        }

        helper_argv = argv_split(GFP_KERNEL, cn.corename, NULL);
        if (!helper_argv) {
            printk(KERN_WARNING "%s failed to allocate memory\n",
                   __func__);
            goto fail_dropcount;
        }

        retval = -ENOMEM;
        sub_info = call_usermodehelper_setup(helper_argv[0],
                        helper_argv, NULL, GFP_KERNEL,
                        umh_pipe_setup, NULL, &cprm);
        if (sub_info)
            retval = call_usermodehelper_exec(sub_info,
                              UMH_WAIT_EXEC);

        argv_free(helper_argv);
        if (retval) {
            printk(KERN_INFO "Core dump to |%s pipe failed\n",
                   cn.corename);
            goto close_fail;
        }
    } else {
        struct inode *inode;
        int open_flags = O_CREAT | O_RDWR | O_NOFOLLOW |
                 O_LARGEFILE | O_EXCL;

        if (cprm.limit < binfmt->min_coredump)
            goto fail_unlock;

        if (need_suid_safe && cn.corename[0] != '/') {
            printk(KERN_WARNING "Pid %d(%s) can only dump core "\
                "to fully qualified path!\n",
                task_tgid_vnr(current), current->comm);
            printk(KERN_WARNING "Skipping core dump\n");
            goto fail_unlock;
        }

        if (!need_suid_safe) {
            mm_segment_t old_fs;

            old_fs = get_fs();
            set_fs(KERNEL_DS);
            /*
             * If it doesn't exist, that's fine. If there's some
             * other problem, we'll catch it at the filp_open().
             */
            (void) sys_unlink((const char __user *)cn.corename);
            set_fs(old_fs);
        }

        /* Create the core file. */
        if (need_suid_safe) {
            struct path root;

            task_lock(&init_task);
            get_fs_root(init_task.fs, &root);
            task_unlock(&init_task);
            cprm.file = file_open_root(root.dentry, root.mnt,
                cn.corename, open_flags, 0600);
            path_put(&root);
        } else {
            cprm.file = filp_open(cn.corename, open_flags, 0600);
        }
        if (IS_ERR(cprm.file))
            goto fail_unlock;

        inode = file_inode(cprm.file);
        /* The core file must not have more than one hard link. */
        if (inode->i_nlink > 1)
            goto close_fail;
        if (d_unhashed(cprm.file->f_path.dentry))
            goto close_fail;

        /* The core file must be a regular file. */
        if (!S_ISREG(inode->i_mode))
            goto close_fail;

        if (!uid_eq(inode->i_uid, current_fsuid()))
            goto close_fail;
        if ((inode->i_mode & 0677) != 0600)
            goto close_fail;
        /* The core file must be writable. */
        if (!(cprm.file->f_mode & FMODE_CAN_WRITE))
            goto close_fail;
        if (do_truncate(cprm.file->f_path.dentry, 0, 0, cprm.file))
            goto close_fail;
    }

    /* get us an unshared descriptor table; almost always a no-op */
    retval = unshare_files(&displaced);
    if (retval)
        goto close_fail;
    if (displaced)
        put_files_struct(displaced);
    if (!dump_interrupted()) {
        file_start_write(cprm.file);
        /* Call the core_dump handler of the process's binary format. */
        core_dumped = binfmt->core_dump(&cprm);
        file_end_write(cprm.file);
    }
    if (ispipe && core_pipe_limit)
        wait_for_dump_helpers(cprm.file);
close_fail:
    if (cprm.file)
        filp_close(cprm.file, NULL);
fail_dropcount:
    if (ispipe)
        atomic_dec(&core_dump_count);
fail_unlock:
    kfree(cn.corename);
    coredump_finish(mm, core_dumped);
    revert_creds(old_cred);
fail_creds:
    put_cred(cred);
fail:
    return;
}

The Linux kernel supports several linux_binfmt handlers; the most common one is ELF.

For an ELF process, the binfmt in do_coredump() is therefore elf_format, and binfmt->core_dump() is elf_core_dump().

static struct linux_binfmt elf_format = {
    .module        = THIS_MODULE,
    .load_binary    = load_elf_binary,
    .load_shlib    = load_elf_library,
    .core_dump    = elf_core_dump,
    .min_coredump    = ELF_EXEC_PAGESIZE,
};

static int elf_core_dump(struct coredump_params *cprm)
{
    int has_dumped = 0;
    mm_segment_t fs;
    int segs, i;
    size_t vma_data_size = 0;
    struct vm_area_struct *vma, *gate_vma;
    struct elfhdr *elf = NULL;
    loff_t offset = 0, dataoff;
    struct elf_note_info info = { };
    struct elf_phdr *phdr4note = NULL;
    struct elf_shdr *shdr4extnum = NULL;
    Elf_Half e_phnum;
    elf_addr_t e_shoff;
    elf_addr_t *vma_filesz = NULL;

    /* Allocate space for the ELF header. */
    elf = kmalloc(sizeof(*elf), GFP_KERNEL);
    if (!elf)
        goto out;

    /* current->mm->map_count is the number of memory regions mapped by the process. */
    segs = current->mm->map_count;
    /* Add the number of extra program headers. */
    segs += elf_core_extra_phdrs();

    /* Add one segment for the gate vma, if there is one. */
    gate_vma = get_gate_vma(current->mm);
    if (gate_vma != NULL)
        segs++;

    /* for notes section */
    segs++;    /* Reserve one segment for PT_NOTE. */

    /* If segs > PN_XNUM(0xffff), then e_phnum overflows. To avoid
     * this, kernel supports extended numbering. Have a look at
     * include/linux/elf.h for further information. */
    e_phnum = segs > PN_XNUM ? PN_XNUM : segs;

    /*
     * Collect all the non-memory information about the process for the
     * notes.  This also sets up the file header.
     */
    /* fill_note_info() fills in the note information held in info. */
    if (!fill_note_info(elf, e_phnum, &info, cprm->siginfo, cprm->regs))
        goto cleanup;

    has_dumped = 1;

    fs = get_fs();
    /*
     * Operating on a user-space file from kernel context requires widening
     * the address limit. See the article "Accessing user-space files from the
     * Linux kernel: using get_fs()/set_fs()".
     */
    set_fs(KERNEL_DS);

    offset += sizeof(*elf);                /* Elf header */
    offset += segs * sizeof(struct elf_phdr);    /* Program headers */

    /* Write notes phdr entry */
    {
        size_t sz = get_note_info_size(&info);

        sz += elf_coredump_extra_notes_size();

        phdr4note = kmalloc(sizeof(*phdr4note), GFP_KERNEL);
        if (!phdr4note)
            goto end_coredump;

        fill_elf_note_phdr(phdr4note, sz, offset);
        offset += sz;
    }

    dataoff = offset = roundup(offset, ELF_EXEC_PAGESIZE);

    vma_filesz = kmalloc_array(segs - 1, sizeof(*vma_filesz), GFP_KERNEL);
    if (!vma_filesz)
        goto end_coredump;

    for (i = 0, vma = first_vma(current, gate_vma); vma != NULL;
            vma = next_vma(vma, gate_vma)) {
        unsigned long dump_size;

        dump_size = vma_dump_size(vma, cprm->mm_flags);
        vma_filesz[i++] = dump_size;
        vma_data_size += dump_size;
    }

    offset += vma_data_size;
    offset += elf_core_extra_data_size();
    e_shoff = offset;

    if (e_phnum == PN_XNUM) {
        shdr4extnum = kmalloc(sizeof(*shdr4extnum), GFP_KERNEL);
        if (!shdr4extnum)
            goto end_coredump;
        fill_extnum_info(elf, shdr4extnum, e_shoff, segs);
    }

    offset = dataoff;

    if (!dump_emit(cprm, elf, sizeof(*elf)))
        goto end_coredump;

    if (!dump_emit(cprm, phdr4note, sizeof(*phdr4note)))
        goto end_coredump;

    /* Write program headers for segments dump */
    for (i = 0, vma = first_vma(current, gate_vma); vma != NULL;
            vma = next_vma(vma, gate_vma)) {
        struct elf_phdr phdr;

        phdr.p_type = PT_LOAD;
        phdr.p_offset = offset;
        phdr.p_vaddr = vma->vm_start;
        phdr.p_paddr = 0;
        phdr.p_filesz = vma_filesz[i++];
        phdr.p_memsz = vma->vm_end - vma->vm_start;
        offset += phdr.p_filesz;
        phdr.p_flags = vma->vm_flags & VM_READ ? PF_R : 0;
        if (vma->vm_flags & VM_WRITE)
            phdr.p_flags |= PF_W;
        if (vma->vm_flags & VM_EXEC)
            phdr.p_flags |= PF_X;
        phdr.p_align = ELF_EXEC_PAGESIZE;

        if (!dump_emit(cprm, &phdr, sizeof(phdr)))
            goto end_coredump;
    }

    if (!elf_core_write_extra_phdrs(cprm, offset))
        goto end_coredump;

     /* write out the notes section */
    if (!write_note_info(&info, cprm))
        goto end_coredump;

    if (elf_coredump_extra_notes_write(cprm))
        goto end_coredump;

    /* Align to page */
    if (!dump_skip(cprm, dataoff - cprm->pos))
        goto end_coredump;

    for (i = 0, vma = first_vma(current, gate_vma); vma != NULL;
            vma = next_vma(vma, gate_vma)) {
        unsigned long addr;
        unsigned long end;

        end = vma->vm_start + vma_filesz[i++];

        for (addr = vma->vm_start; addr < end; addr += PAGE_SIZE) {
            struct page *page;
            int stop;

            page = get_dump_page(addr);
            if (page) {
                void *kaddr = kmap(page);
                stop = !dump_emit(cprm, kaddr, PAGE_SIZE);
                kunmap(page);
                put_page(page);
            } else
                stop = !dump_skip(cprm, PAGE_SIZE);
            if (stop)
                goto end_coredump;
        }
    }
    dump_truncate(cprm);

    if (!elf_core_write_extra_data(cprm))
        goto end_coredump;

    if (e_phnum == PN_XNUM) {
        if (!dump_emit(cprm, shdr4extnum, sizeof(*shdr4extnum)))
            goto end_coredump;
    }

end_coredump:
    set_fs(fs);

cleanup:
    free_note_info(&info);
    kfree(shdr4extnum);
    kfree(vma_filesz);
    kfree(phdr4note);
    kfree(elf);
out:
    return has_dumped;
}

int dump_emit(struct coredump_params *cprm, const void *addr, int nr)
{
    struct file *file = cprm->file;
    loff_t pos = file->f_pos;
    ssize_t n;
    if (cprm->written + nr > cprm->limit)
        return 0;
    while (nr) {
        if (dump_interrupted())
            return 0;
        n = __kernel_write(file, addr, nr, &pos);
        if (n <= 0)
            return 0;
        file->f_pos = pos;
        cprm->written += n;
        cprm->pos += n;
        nr -= n;
    }
    return 1;
}

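To check whether a file is a core file, run readelf -h on it: for a core file the Type field is CORE (Core file).

The file command can be used for the same check.

References: "Core file format (structure of Linux coredump files)"; for how GDB parses a core file, see "How GDB recovers shared library information from a coredump file".

3. A coredump example

Below we build a simple program that produces a coredump, then analyze it with gdb.

3.1 Example program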

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int myfunc(int i) {
    *(int*)(NULL) = i; /* line 7 */
    return i - 1;
}

int main(int argc, char **argv) {
    /* Setup some memory. */
    char data_ptr[] = "string in data segment";
    char *mmap_ptr;
    char *text_ptr = "string in text segment";
    (void)argv;
    mmap_ptr = (char *)malloc(sizeof(data_ptr) + 1);
    strcpy(mmap_ptr, data_ptr);
    mmap_ptr[10] = 'm';
    mmap_ptr[11] = 'm';
    mmap_ptr[12] = 'a';
    mmap_ptr[13] = 'p';
    printf("text addr: %p\n", text_ptr);
    printf("data addr: %p\n", data_ptr);
    printf("mmap addr: %p\n", mmap_ptr);

    /* Call a function to prepare a stack trace. */
    return myfunc(argc);
}

Compile with the following command. -ggdb3 asks for debugging information tailored to GDB, and 3 is the highest level.

gcc -ggdb3 -std=c99 -Wall -Wextra -pedantic -o main.out main.c 

3.2 Analysis with coredump and gdb

Enable coredump with ulimit -c unlimited, then run ./main.out to produce the core file.

text addr: 0x4007d4
data addr: 0x7ffff28fdc30
mmap addr: 0x10bb010
Segmentation fault (core dumped)

Running gdb ./main.out core shows which signal caused the coredump (SIGSEGV), in which file (main.c), in which function (myfunc()), and at which line of code.

GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1...
Reading symbols from ./main.out...done.
[New LWP 8651]
Core was generated by `./main.out'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000400635 in myfunc (i=1) at main.c:7
7        *(int*)(NULL) = i; /* line 7 */

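For more detailed core + gdb analysis methods, see "Offline analysis with core + gdb". If shared libraries need to be loaded during analysis, see "GDB shared library search path".

4. Summary

This article has covered: how to configure coredump (ulimit, core_pattern, coredump_filter); which conditions trigger a coredump (SIG_KERNEL_COREDUMP_MASK); how the core file is generated (do_coredump()); how gdb recognizes a core file (see "How GDB recovers shared library information from a coredump file"); and how to find the problem by analyzing the core file in gdb (gdb, then backtrace).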
