The Linux Operating System: Processes/Threads (2)

Date: 2020-09-14

Tags: linux, system, process, thread. Category: Linux


Preface

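In the previous post of this series I introduced the basic meaning of processes and threads and some of the related data structures. Now let's look at how Linux manages processes.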

The process list

The Linux kernel defines a list_head structure, declared as follows:

struct list_head {
	struct list_head *next;
	struct list_head *prev;
};

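The next and prev fields point to the next and the previous element of a generic doubly linked list. Note that a list_head pointer stores the address of another list_head field, not the address of the data structure that contains it (see the figure).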
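Because the links point at embedded list_head fields, code has to step back from a list_head to the structure that contains it. The following is a minimal user-space sketch of that idea. The container_of macro mirrors the kernel's technique, while struct demo_task and the helper functions are hypothetical names made up for this illustration; they are not the kernel's own definitions.

```c
#include <stddef.h>

struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/* Recover the enclosing structure from a pointer to its embedded member. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical payload type, used only for this illustration. */
struct demo_task {
    int pid;
    struct list_head tasks;
};

/* Make a node point at itself: an empty circular list. */
static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert new_node right before head, i.e. at the tail of the ring. */
static void list_add_tail(struct list_head *new_node, struct list_head *head)
{
    new_node->prev = head->prev;
    new_node->next = head;
    head->prev->next = new_node;
    head->prev = new_node;
}

/* Follow the embedded link, then step back to the owning demo_task. */
static struct demo_task *next_demo_task(struct demo_task *t)
{
    return container_of(t->tasks.next, struct demo_task, tasks);
}
```

The payoff of this design is that one list implementation serves every structure that embeds a list_head, at the cost of one pointer subtraction per access.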

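The process descriptor (task_struct) introduced in the previous post also embeds this structure, and the list it forms is called the process list. The process list is a circular doubly linked list that links together the descriptors of all processes. Every task_struct contains a list_head field named tasks, whose prev and next point to the tasks field of the previous and the next task_struct.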

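At first the list contains only init_task. It is the kernel's first process, and it is set up "by hand" with statically allocated memory (every other process is initialized from dynamically allocated memory). Each time a process is created, the SET_LINKS macro adds its task_struct to this list. Note, however, that when a process creates an additional thread (anything other than its main thread), that is, a lightweight process, the new thread is not added to this list. The for_each_process macro walks all processes starting from init_task; the snippet below shows its older spelling, for_each_task.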

/* 2.4-era form; 2.6 kernels spell it for_each_process and step through p->tasks */
#define for_each_task(p) \
	for (p = &init_task ; (p = p->next_task) != &init_task ; )
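To make the traversal concrete, here is a small user-space sketch of the same loop shape written against an embedded tasks link, the way 2.6 kernels do it. Everything prefixed with mock_ is a hypothetical stand-in invented for this example, and the linking helper only approximates what SET_LINKS does.

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for task_struct; the real one has far more fields. */
struct mock_task {
    int pid;
    struct list_head tasks;
};

/* Statically initialized list head, playing the role of init_task. */
static struct mock_task mock_init_task = {
    .pid = 0,
    .tasks = { &mock_init_task.tasks, &mock_init_task.tasks },
};

#define mock_next_task(p) \
    list_entry((p)->tasks.next, struct mock_task, tasks)

/* Same shape as the kernel macro: start at the head, stop on wrap-around. */
#define mock_for_each_process(p) \
    for (p = &mock_init_task; (p = mock_next_task(p)) != &mock_init_task; )

/* Link a new task in just before mock_init_task (the tail of the ring). */
static void mock_link_task(struct mock_task *t)
{
    struct list_head *head = &mock_init_task.tasks;

    t->tasks.prev = head->prev;
    t->tasks.next = head;
    head->prev->next = &t->tasks;
    head->prev = &t->tasks;
}

/* Count every task the loop visits; the list head itself is skipped. */
static int mock_count_tasks(void)
{
    struct mock_task *p;
    int n = 0;

    mock_for_each_process(p)
        n++;
    return n;
}
```

Note that the loop body never runs for the head element itself, because the macro advances before it tests for wrap-around.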

The run queue (runqueue)

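When the kernel looks for a new process to run on a CPU, it must consider only runnable processes, that is, processes in the TASK_RUNNING state. These processes are linked into a circular doubly linked list called the run queue (runqueue).
For this purpose task_struct defines two pointers: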

struct task_struct *next_run, *prev_run;

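The run_queue (ready queue) is therefore a circular doubly linked list of the processes that are running or ready to run, all of them in the TASK_RUNNING state. Its forward and backward links are next_run and prev_run, and both the head and the tail of the list are init_task (process 0). This is the layout used by early kernels.
To pick the "best" runnable process in constant time, however, the 2.6 kernel divides runnable processes into priorities 0-139 and keeps 140 lists of runnable processes, one per priority, to organize the processes in the TASK_RUNNING state.
The kernel defines a structure of type prio_array_t to manage these 140 lists. Every runnable process sits on exactly one of them, linked in through the run_list field of its process descriptor, which is also a list_head. enqueue_task inserts a process descriptor into one of the runnable lists, and dequeue_task removes it again. The prio_array_t structures themselves live in the arrays member of the runqueue structure, which has two entries: one for active processes and one for expired ones.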
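The 140-list arrangement is easier to follow in code. Below is a rough user-space sketch of a priority array with a bitmap that records which lists are non-empty. The mock_ names are hypothetical, the field layout is only assumed to resemble prio_array_t, and the lookup here is a plain loop over the bitmap rather than the kernel's optimized bit search.

```c
#include <string.h>

#define MAX_PRIO      140
#define BITS_PER_WORD ((int)(8 * sizeof(unsigned long)))
#define BITMAP_WORDS  ((MAX_PRIO + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct list_head { struct list_head *next, *prev; };

/* Hypothetical stand-in for the scheduling-related part of task_struct. */
struct mock_task {
    int prio;                     /* 0 (highest) .. 139 (lowest) */
    struct list_head run_list;
};

struct mock_prio_array {
    int nr_active;                        /* tasks queued in total */
    unsigned long bitmap[BITMAP_WORDS];   /* bit p set => queue[p] non-empty */
    struct list_head queue[MAX_PRIO];     /* one circular list per priority */
};

static void mock_prio_array_init(struct mock_prio_array *a)
{
    memset(a, 0, sizeof(*a));
    for (int i = 0; i < MAX_PRIO; i++)
        a->queue[i].next = a->queue[i].prev = &a->queue[i];
}

/* Append the task to the list for its priority and mark the list as used. */
static void mock_enqueue_task(struct mock_task *t, struct mock_prio_array *a)
{
    struct list_head *head = &a->queue[t->prio];

    t->run_list.prev = head->prev;
    t->run_list.next = head;
    head->prev->next = &t->run_list;
    head->prev = &t->run_list;
    a->bitmap[t->prio / BITS_PER_WORD] |= 1UL << (t->prio % BITS_PER_WORD);
    a->nr_active++;
}

/* Unlink the task; clear the bitmap bit if its list became empty. */
static void mock_dequeue_task(struct mock_task *t, struct mock_prio_array *a)
{
    struct list_head *head = &a->queue[t->prio];

    t->run_list.prev->next = t->run_list.next;
    t->run_list.next->prev = t->run_list.prev;
    if (head->next == head)
        a->bitmap[t->prio / BITS_PER_WORD] &= ~(1UL << (t->prio % BITS_PER_WORD));
    a->nr_active--;
}

/* Lowest set bit is the best priority; returns -1 when nothing is queued. */
static int mock_find_first_prio(const struct mock_prio_array *a)
{
    for (int p = 0; p < MAX_PRIO; p++)
        if (a->bitmap[p / BITS_PER_WORD] & (1UL << (p % BITS_PER_WORD)))
            return p;
    return -1;
}
```

The cost of choosing the next task depends only on the fixed number of priorities, not on how many processes are runnable, which is the point of keeping one list per priority.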

Data structure

The pidhash tables

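To find a process descriptor from a pid, walking the list that links all processes until the matching descriptor turns up is clearly inefficient. For faster lookups the Linux kernel uses four hash tables. There are four because a descriptor can be looked up by different pid types: the pid of the process, the pid of the thread group leader, the pid of the process group leader, and the pid of the session leader. The hash value for each table is computed by the pid_hashfn(x) macro. A process may sit in all four hash tables at once, so the process descriptor has a pids member of type struct pid, through which the process is added to the tables. The pid structure contains pid_chain, an hlist_node that resolves hash collisions, and pid_list, a list_head that chains together entries with the same pid.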

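struct pid {
    int nr;                       /* numeric value of the pid */
    struct hlist_node pid_chain;  /* collision chain within a hash bucket */
    struct list_head pid_list;    /* entries that share the same pid value */
};

struct task_struct {
    /* ... */
    struct pid pids[4];           /* one entry per pid type */
    /* ... */
};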
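The lookup path can be sketched in user space as a single chained hash table. This models only one of the four tables, the mock_ names are hypothetical, and the hash function is an assumption chosen for the example (a multiplicative hash), not a copy of the kernel's pid_hashfn.

```c
#include <stddef.h>
#include <stdint.h>

#define PIDHASH_BITS 4
#define PIDHASH_SIZE (1u << PIDHASH_BITS)

/* Hypothetical stand-in for a task with a single pid type. */
struct mock_task {
    int pid;
    struct mock_task *hash_next;  /* chain of tasks whose pids collide */
};

static struct mock_task *mock_pid_hash[PIDHASH_SIZE];

/* Multiplicative hash; the multiplier is an arbitrary choice for this sketch. */
static unsigned int mock_pid_hashfn(int nr)
{
    return ((uint32_t)nr * UINT32_C(0x9e370001)) >> (32 - PIDHASH_BITS);
}

/* Push the task onto the front of its bucket's collision chain. */
static void mock_attach_pid(struct mock_task *t)
{
    unsigned int h = mock_pid_hashfn(t->pid);

    t->hash_next = mock_pid_hash[h];
    mock_pid_hash[h] = t;
}

/* Hash to a bucket, then scan only that bucket's chain. */
static struct mock_task *mock_find_task_by_pid(int nr)
{
    struct mock_task *t;

    for (t = mock_pid_hash[mock_pid_hashfn(nr)]; t != NULL; t = t->hash_next)
        if (t->pid == nr)
            return t;
    return NULL;
}
```

A lookup touches one bucket and its short collision chain instead of every process in the system.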

The Linux process security context: struct cred

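In the 2.6 kernel, a new struct task_security_struct was defined and hooked onto the void *security pointer of task_struct. In 3.x kernels, however, task_struct no longer has a security member. The security-related information had been split out into a structure called cred (a change that, as far as I can tell, dates back to 2.6.29), and cred is now responsible for holding the process's security context.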

/*
 * The security context of a task
 *
 * The parts of the context break down into two categories:
 *
 *  (1) The objective context of a task.  These parts are used when some other
 *      task is attempting to affect this one.
 *
 *  (2) The subjective context.  These details are used when the task is acting
 *      upon another object, be that a file, a task, a key or whatever.
 *
 * Note that some members of this structure belong to both categories - the
 * LSM security pointer for instance.
 *
 * A task has two security pointers.  task->real_cred points to the objective
 * context that defines that task's actual details.  The objective part of this
 * context is used whenever that task is acted upon.
 *
 * task->cred points to the subjective context that defines the details of how
 * that task is going to act upon another object.  This may be overridden
 * temporarily to point to another security context, but normally points to the
 * same context as task->real_cred.
 */
struct cred {
        atomic_t        usage;
#ifdef CONFIG_DEBUG_CREDENTIALS
        atomic_t        subscribers;    /* number of processes subscribed */
        void            *put_addr;
        unsigned        magic;
#define CRED_MAGIC      0x43736564
#define CRED_MAGIC_DEAD 0x44656144
#endif
        uid_t           uid;            /* real UID of the task */
        gid_t           gid;            /* real GID of the task */
        uid_t           suid;           /* saved UID of the task */
        gid_t           sgid;           /* saved GID of the task */
        uid_t           euid;           /* effective UID of the task */
        gid_t           egid;           /* effective GID of the task */
        uid_t           fsuid;          /* UID for VFS ops */
        gid_t           fsgid;          /* GID for VFS ops */
        unsigned        securebits;     /* SUID-less security management */
        kernel_cap_t    cap_inheritable; /* caps our children can inherit */
        kernel_cap_t    cap_permitted;  /* caps we're permitted */
        kernel_cap_t    cap_effective;  /* caps we can actually use */
        kernel_cap_t    cap_bset;       /* capability bounding set */
#ifdef CONFIG_KEYS
        unsigned char   jit_keyring;    /* default keyring to attach requested
                                         * keys to */
        struct key      *thread_keyring; /* keyring private to this thread */
        struct key      *request_key_auth; /* assumed request_key authority */
        struct thread_group_cred *tgcred; /* thread-group shared credentials */
#endif
#ifdef CONFIG_SECURITY
        void            *security;      /* subjective LSM security */
#endif
        struct user_struct *user;       /* real user ID subscription */
        struct user_namespace *user_ns; /* cached user->user_ns */
        struct group_info *group_info;  /* supplementary groups for euid/fsgid */
        struct rcu_head rcu;            /* RCU deletion hook */
};

Just as a task has both a uid and an euid, task_struct carries two cred identities:

struct task_struct {
	...
	/* process credentials */
	const struct cred __rcu *real_cred; /* objective and real subjective task credentials (COW) */
	const struct cred __rcu *cred;      /* effective (overridable) subjective task credentials (COW) */
	...
};

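Here is a more detailed look at what this security context is for.
In Linux, when one object acts on another, a security check is usually required. When a process operates on a file, for example, the kernel checks whether the process has permission to do so.
The credential mechanism in the Linux kernel is an abstraction of the permissions needed for such access between objects. The subject presents the credentials describing its own permissions, the object presents the credentials required to access it, and the security check is made from both sets of credentials plus the requested operation.
Credential terminology:
Object: a system object that user-space programs can operate on directly, such as a process, file, message queue, semaphore, or shared memory segment. Every object has a set of credentials, and each kind of object has a different credential set.
Object ownership: part of an object's credential set identifies its owner. For a file, the uid indicates who owns the file.
Subject: the object that acts on another object. Apart from processes, most system objects are not subjects, but some become subjects in special circumstances. A file with F_SETOWN set can send SIGIO to a process, for instance; in that case the file is the subject and the process is the object.
Action: how the subject operates on the object, such as reading, writing, or executing a file.
Objective context: the set of credentials required when the object is accessed.
Subjective context: the subject's own set of permission credentials.
Rules: what the security check consults when a subject acts on an object.
When a subject acts on an object, the security calculation takes the subjective context, the objective context, and the action, and looks up the rules to decide whether the subject is allowed to act on the object.
In the process descriptor, the cred and real_cred fields point to the subjective and the objective credentials respectively.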
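As a rough illustration of how subjective context, objective context, and action combine, here is a user-space sketch of a classic owner/group/other permission check. The mock_ types and the function are hypothetical, and the check is deliberately simplified: it ignores capabilities, supplementary groups, ACLs, and LSM hooks, all of which take part in the real decision.

```c
#include <stdbool.h>

/* Requested actions, using the familiar r/w/x bit values. */
enum { MAY_EXEC = 1, MAY_WRITE = 2, MAY_READ = 4 };

/* Subjective context: who is acting (hypothetical subset of struct cred). */
struct mock_cred {
    unsigned int fsuid;
    unsigned int fsgid;
};

/* Objective context: what the acted-upon file requires (hypothetical). */
struct mock_inode {
    unsigned int uid;
    unsigned int gid;
    unsigned int mode;   /* permission bits, e.g. 0640 */
};

/* Pick the owner, group, or other bits, then test them against the action. */
static bool mock_may_access(const struct mock_cred *subj,
                            const struct mock_inode *obj,
                            unsigned int want)
{
    unsigned int bits;

    if (subj->fsuid == obj->uid)
        bits = (obj->mode >> 6) & 7;
    else if (subj->fsgid == obj->gid)
        bits = (obj->mode >> 3) & 7;
    else
        bits = obj->mode & 7;

    return (bits & want) == want;
}
```

For a file with mode 0640, the owner may read and write, a process whose fsgid matches the file's group may only read, and everyone else is refused.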

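usage: reference count used to manage the credentials
uid: real user ID (real UID of the task, i.e. the uid of the user who created the process)
gid: real group ID
suid: saved user ID (saved UID of the task. For example, when a privileged process needs to lower its privileges temporarily, it changes its euid to an unprivileged UID and keeps the original EUID in the SUID. To regain its privileges it simply sets the EUID back to the UID stored in the SUID.)
sgid: saved group ID
euid: effective user ID (effective UID of the task, used for access checks when the process accesses resources. In most cases the EUID equals the UID, but the two can differ; the EUID is, so to speak, the identity acquired dynamically.)
egid: effective group ID
securebits: security management flags that control how credentials are manipulated and inherited
cap_inheritable: capabilities that can be inherited across execve
cap_permitted: capabilities that may be granted to cap_effective (through capset)
cap_effective: capabilities the process actually uses
cap_bset: mainly relevant when uid=0 or euid=0, where it bounds what execve can grant: cap_permitted = cap_inheritable | cap_bset and cap_effective = cap_permitted. Capabilities in cap_bset can be added to cap_inheritable by calling capset.
user: user information such as the user's process count and open file count
rcu: used for RCU deletion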
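The uid, euid, suid, and fsuid fields can be observed from user space. The sketch below reads them back for the current process from the "Uid:" line of /proc/self/status, which lists the real, effective, saved, and filesystem UIDs in that order. The struct and function names are hypothetical, and the code assumes a Linux system with /proc mounted.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* The four user IDs the kernel keeps in a task's credentials. */
struct uid_set {
    unsigned int real;
    unsigned int effective;
    unsigned int saved;
    unsigned int fs;
};

/* Returns 0 on success, -1 if /proc or the "Uid:" line is unavailable. */
static int read_uid_set(struct uid_set *out)
{
    char line[256];
    int rc = -1;
    FILE *f = fopen("/proc/self/status", "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof(line), f) != NULL) {
        if (strncmp(line, "Uid:", 4) == 0) {
            if (sscanf(line + 4, "%u %u %u %u", &out->real, &out->effective,
                       &out->saved, &out->fs) == 4)
                rc = 0;
            break;
        }
    }
    fclose(f);
    return rc;
}

/* Cross-check the parsed values against the getuid()/geteuid() syscalls. */
static int uid_set_matches_syscalls(const struct uid_set *s)
{
    return s->real == (unsigned int)getuid() &&
           s->effective == (unsigned int)geteuid();
}
```

For an ordinary process all four values are usually identical; they diverge for set-user-ID programs or after calls that change only some of the IDs.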

struct cred in kernel pwn

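Note: I have not studied kernel pwn yet, so this section only briefly describes the role the cred structure plays in privilege escalation and does not walk through a concrete example.
Code running in kernel context can obtain root privileges by executing commit_creds(prepare_kernel_cred(0)) (root's uid and gid are both 0).
The source is as follows: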

/* /kernel/cred.c */
/**
 * prepare_kernel_cred - Prepare a set of credentials for a kernel service
 * @daemon: A userspace daemon to be used as a reference
 *
 * Prepare a set of credentials for a kernel service.  This can then be used to
 * override a task's own credentials so that work can be done on behalf of that
 * task that requires a different subjective context.
 *
 * @daemon is used to provide a base for the security record, but can be NULL.
 * If @daemon is supplied, then the security data will be derived from that;
 * otherwise they'll be set to 0 and no groups, full capabilities and no keys.
 *
 * The caller may change these controls afterwards if desired.
 *
 * Returns the new credentials or NULL if out of memory.
 *
 * Does not take, and does not return holding current->cred_replace_mutex.
 */
struct cred *prepare_kernel_cred(struct task_struct *daemon)
{
	const struct cred *old;
	struct cred *new;

	new = kmem_cache_alloc(cred_jar, GFP_KERNEL);
	if (!new)
		return NULL;

	kdebug("prepare_kernel_cred() alloc %p", new);

	if (daemon)
		old = get_task_cred(daemon);
	else
		old = get_cred(&init_cred);

	validate_creds(old);

	*new = *old;
	new->non_rcu = 0;
	atomic_set(&new->usage, 1);
	set_cred_subscribers(new, 0);
	get_uid(new->user);
	get_user_ns(new->user_ns);
	get_group_info(new->group_info);

#ifdef CONFIG_KEYS
	new->session_keyring = NULL;
	new->process_keyring = NULL;
	new->thread_keyring = NULL;
	new->request_key_auth = NULL;
	new->jit_keyring = KEY_REQKEY_DEFL_THREAD_KEYRING;
#endif

#ifdef CONFIG_SECURITY
	new->security = NULL;
#endif
	if (security_prepare_creds(new, old, GFP_KERNEL) < 0)
		goto error;

	put_cred(old);
	validate_creds(new);
	return new;

error:
	put_cred(new);
	put_cred(old);
	return NULL;
}
EXPORT_SYMBOL(prepare_kernel_cred);

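prepare_kernel_cred()
According to the comment in the source, this function returns a cred structure that can replace a process's original cred so that work requiring a different subjective context can be carried out. If the @daemon argument is supplied, the security data is derived from it. The argument may also be NULL, in which case the function copies init_cred (the get_cred(&init_cred) branch above), whose uid and gid are 0 and which has full capabilities. In other words, the result is a set of root credentials.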

/* /kernel/cred.c */
/**
 * commit_creds - Install new credentials upon the current task
 * @new: The credentials to be assigned
 *
 * Install a new set of credentials to the current task, using RCU to replace
 * the old set.  Both the objective and the subjective credentials pointers are
 * updated.  This function may not be called if the subjective credentials are
 * in an overridden state.
 *
 * This function eats the caller's reference to the new credentials.
 *
 * Always returns 0 thus allowing this function to be tail-called at the end
 * of, say, sys_setgid().
 */
int commit_creds(struct cred *new)
{
	struct task_struct *task = current;
	const struct cred *old = task->real_cred;

	kdebug("commit_creds(%p{%d,%d})", new,
	       atomic_read(&new->usage),
	       read_cred_subscribers(new));

	BUG_ON(task->cred != old);
#ifdef CONFIG_DEBUG_CREDENTIALS
	BUG_ON(read_cred_subscribers(old) < 2);
	validate_creds(old);
	validate_creds(new);
#endif
	BUG_ON(atomic_read(&new->usage) < 1);

	get_cred(new); /* we will require a ref for the subj creds too */

	/* dumpability changes */
	if (!uid_eq(old->euid, new->euid) ||
	    !gid_eq(old->egid, new->egid) ||
	    !uid_eq(old->fsuid, new->fsuid) ||
	    !gid_eq(old->fsgid, new->fsgid) ||
	    !cred_cap_issubset(old, new)) {
		if (task->mm)
			set_dumpable(task->mm, suid_dumpable);
		task->pdeath_signal = 0;
		/*
		 * If a task drops privileges and becomes nondumpable,
		 * the dumpability change must become visible before
		 * the credential change; otherwise, a __ptrace_may_access()
		 * racing with this change may be able to attach to a task it
		 * shouldn't be able to attach to (as if the task had dropped
		 * privileges without becoming nondumpable).
		 * Pairs with a read barrier in __ptrace_may_access().
		 */
		smp_wmb();
	}

	/* alter the thread keyring */
	if (!uid_eq(new->fsuid, old->fsuid))
		key_fsuid_changed(task);
	if (!gid_eq(new->fsgid, old->fsgid))
		key_fsgid_changed(task);

	/* do it
	 * RLIMIT_NPROC limits on user->processes have already been checked
	 * in set_user().
	 */
	alter_cred_subscribers(new, 2);
	if (new->user != old->user)
		atomic_inc(&new->user->processes);
	rcu_assign_pointer(task->real_cred, new);
	rcu_assign_pointer(task->cred, new);
	if (new->user != old->user)
		atomic_dec(&old->user->processes);
	alter_cred_subscribers(old, -2);

	/* send notifications */
	if (!uid_eq(new->uid,   old->uid)  ||
	    !uid_eq(new->euid,  old->euid) ||
	    !uid_eq(new->suid,  old->suid) ||
	    !uid_eq(new->fsuid, old->fsuid))
		proc_id_connector(task, PROC_EVENT_UID);

	if (!gid_eq(new->gid,   old->gid)  ||
	    !gid_eq(new->egid,  old->egid) ||
	    !gid_eq(new->sgid,  old->sgid) ||
	    !gid_eq(new->fsgid, old->fsgid))
		proc_id_connector(task, PROC_EVENT_GID);

	/* release the old obj and subj refs both */
	put_cred(old);
	put_cred(old);
	return 0;
}
EXPORT_SYMBOL(commit_creds);

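According to the comment in the source, this function sets both real_cred and cred of the current process to the new set of credentials. Putting the two together: prepare_kernel_cred(0) produces a root cred, and commit_creds() installs it on the current process. That is why commit_creds(prepare_kernel_cred(0)) escalates privileges.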
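The pointer and reference-count bookkeeping of the two functions can be modeled in user space. The sketch below is a toy simulation only: the mock_ types and functions are hypothetical, plain integers replace atomic counters, and RCU, keyrings, LSM hooks, and every other detail of the real kernel code are left out.

```c
#include <stdlib.h>

/* Hypothetical, heavily reduced model of struct cred. */
struct mock_cred {
    int usage;                          /* reference count */
    unsigned int uid, gid, euid, egid;
};

/* Hypothetical model of the two credential pointers in task_struct. */
struct mock_task {
    struct mock_cred *real_cred;        /* objective context */
    struct mock_cred *cred;             /* subjective context */
};

/* Plays the role of init_cred: every ID is 0, i.e. root. */
static struct mock_cred mock_init_cred = { .usage = 1 };

/* Drop one reference and free the credentials when none remain. */
static void mock_put_cred(struct mock_cred *c)
{
    if (--c->usage == 0)
        free(c);
}

/* Copy the daemon's credentials, or the root template when daemon is NULL. */
static struct mock_cred *mock_prepare_kernel_cred(const struct mock_task *daemon)
{
    const struct mock_cred *old = daemon ? daemon->cred : &mock_init_cred;
    struct mock_cred *new_cred = malloc(sizeof(*new_cred));

    if (new_cred == NULL)
        return NULL;
    *new_cred = *old;
    new_cred->usage = 1;                /* the caller owns one reference */
    return new_cred;
}

/* Install new_cred as both the objective and the subjective credentials. */
static int mock_commit_creds(struct mock_task *task, struct mock_cred *new_cred)
{
    struct mock_cred *old = task->real_cred;

    new_cred->usage++;                  /* second reference, for task->cred */
    task->real_cred = new_cred;
    task->cred = new_cred;
    mock_put_cred(old);                 /* release the old objective ref */
    mock_put_cred(old);                 /* release the old subjective ref */
    return 0;
}
```

In this model, calling mock_commit_creds(&task, mock_prepare_kernel_cred(NULL)) leaves both pointers referring to one credential set whose IDs are all 0, which mirrors why the real call sequence yields root.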