Unix System Overview
Introduction
All operating systems provide services for programs they run. Typical services include executing a new program, opening a file, reading a file, allocating a region of memory, getting the current time of day, and so on. The focus of this text is to describe the services provided by various versions of the UNIX operating system.
Describing the UNIX System in a strictly linear fashion, without any forward references to terms that haven’t been described yet, is nearly impossible (and would probably be boring). This chapter provides a whirlwind tour of the UNIX System from a programmer’s perspective. We’ll give some brief descriptions and examples of terms and concepts that appear throughout the text. We describe these features in much more detail in later chapters. This chapter also provides an introduction to and overview of the services provided by the UNIX System for programmers new to this environment.
UNIX Architecture
In a strict sense, an operating system can be defined as the software that controls the hardware resources of the computer and provides an environment under which programs can run. Generally, we call this software the kernel, since it is relatively small and resides at the core of the environment. Figure 1.1 shows a diagram of the UNIX System architecture.
Figure 1.1 Architecture of the UNIX operating system
The interface to the kernel is a layer of software called the system calls (the shaded portion in Figure 1.1). Libraries of common functions are built on top of the system call interface, but applications are free to use both. (We talk more about system calls and library functions in Section 1.11.) The shell is a special application that provides an interface for running other applications.
In a broad sense, an operating system consists of the kernel and all the other software that makes a computer useful and gives the computer its personality. This other software includes system utilities, applications, shells, libraries of common functions, and so on.
For example, Linux is the kernel used by the GNU operating system. Some people refer to this combination as the GNU/Linux operating system, but it is more commonly referred to as simply Linux. Although this usage may not be correct in a strict sense, it is understandable, given the dual meaning of the phrase operating system. (It also has the advantage of being more succinct.)
Logging In
Login Name
When we log in to a UNIX system, we enter our login name, followed by our password. The system then looks up our login name in its password file, usually the file /etc/passwd. If we look at our entry in the password file, we see that it’s composed of seven colon-separated fields: the login name, encrypted password, numeric user ID (1000), numeric group ID (1000), a comment field, home directory (/home/fireway), and shell program (/bin/bash).
fireway:x:1000:1000:fireway,,,:/home/fireway:/bin/bash
All contemporary systems have moved the encrypted password to a different file. In Chapter 6, we’ll look at these files and some functions to access them.
Shells
Once we log in, some system information messages are typically displayed, and then we can type commands to the shell program. (Some systems start a window management program when you log in, but you generally end up with a shell running in one of the windows.) A shell is a command-line interpreter that reads user input and executes commands. The user input to a shell is normally from the terminal (an interactive shell) or sometimes from a file (called a shell script). The common shells in use are summarized in Figure 1.2.
Name | Path | FreeBSD 8.0 | Linux 3.2.0 | Mac OS X 10.6.8 | Solaris 10 |
Bourne shell | /bin/sh | • | • | copy of bash | • |
Bourne-again shell | /bin/bash | optional | • | • | • |
C shell | /bin/csh | link to tcsh | optional | link to tcsh | • |
Korn shell | /bin/ksh | optional | optional | • | • |
TENEX C shell | /bin/tcsh | • | optional | • | • |
Figure 1.2 Common shells used on UNIX systems
The system knows which shell to execute for us based on the final field in our entry in the password file.
The Bourne shell, developed by Steve Bourne at Bell Labs, has been in use since Version 7 and is provided with almost every UNIX system in existence. The control-flow constructs of the Bourne shell are reminiscent of Algol 68.
ALGOL 68 (short for ALGOrithmic Language 1968) is an imperative computer programming language that was conceived as a successor to the ALGOL 60 programming language, designed with the goal of a much wider scope of application and more rigorously defined syntax and semantics.
The C shell, developed by Bill Joy at Berkeley, is provided with all the BSD releases. Additionally, the C shell was provided by AT&T with System V/386 Release 3.2 and was also included in System V Release 4 (SVR4). (We’ll have more to say about these different versions of the UNIX System in the next chapter.) The C shell was built on the 6th Edition shell, not the Bourne shell. Its control flow looks more like the C language, and it supports additional features that weren’t provided by the Bourne shell: job control, a history mechanism, and command-line editing.
The Korn shell is considered a successor to the Bourne shell and was first provided with SVR4. The Korn shell, developed by David Korn at Bell Labs, runs on most UNIX systems, but before SVR4 was usually an extra-cost add-on, so it is not as widespread as the other two shells. It is upward compatible with the Bourne shell and includes those features that made the C shell popular: job control, command-line editing, and so on.
The Bourne-again shell is the GNU shell provided with all Linux systems. It was designed to be POSIX conformant, while still remaining compatible with the Bourne shell. It supports features from both the C shell and the Korn shell.
The TENEX C shell is an enhanced version of the C shell. It borrows several features, such as command completion, from the TENEX operating system (developed in 1972 at Bolt Beranek and Newman). The TENEX C shell adds many features to the C shell and is often used as a replacement for the C shell.
The shell was standardized in the POSIX 1003.2 standard. The specification was based on features from the Korn shell and Bourne shell.
The default shell used by different Linux distributions varies. Some distributions use the Bourne-again shell. Others use the BSD replacement for the Bourne shell, called dash (Debian Almquist shell, originally written by Kenneth Almquist and later ported to Linux). The default user shell in FreeBSD is derived from the Almquist shell. The default shell in Mac OS X is the Bourne-again shell. Solaris, having its heritage in both BSD and System V, provides all the shells shown in Figure 1.2. Free ports of the shells are available on the Internet.
Throughout the text, we will use parenthetical notes such as this to describe historical notes and to compare different implementations of the UNIX System. Often the reason for a particular implementation technique becomes clear when the historical reasons are described.
Throughout this text, we’ll show interactive shell examples to execute a program that we’ve developed. These examples use features common to the Bourne shell, the Korn shell, and the Bourne-again shell.
Files and Directories
File System
The UNIX file system is a hierarchical arrangement of directories and files. Everything starts in the directory called root, whose name is the single character /.
A directory is a file that contains directory entries. Logically, we can think of each directory entry as containing a filename along with a structure of information describing the attributes of the file. The attributes of a file are such things as the type of file (regular file, directory), the size of the file, the owner of the file, permissions for the file (whether other users may access this file), and when the file was last modified. The stat and fstat functions return a structure of information containing all the attributes of a file. In Chapter 4, we’ll examine all the attributes of a file in great detail.
We make a distinction between the logical view of a directory entry and the way it is actually stored on disk. Most implementations of UNIX file systems don’t store attributes in the directory entries themselves, because of the difficulty of keeping them in synch when a file has multiple hard links. This will become clear when we discuss hard links in Chapter 4.
Filename
The names in a directory are called filenames. The only two characters that cannot appear in a filename are the slash character (/) and the null character. The slash separates the filenames that form a pathname (described next) and the null character terminates a pathname.
Nevertheless, it’s good practice to restrict the characters in a filename to a subset of the normal printing characters. (If we use some of the shell’s special characters in the filename, we have to use the shell’s quoting mechanism to reference the filename, and this can get complicated.) Indeed, for portability, POSIX.1 recommends restricting filenames to consist of the following characters: letters (a-z, A-Z), numbers (0-9), period (.), dash (-), and underscore (_).
Two filenames are automatically created whenever a new directory is created: . (called dot) and .. (called dot-dot). Dot refers to the current directory, and dot-dot refers to the parent directory. In the root directory, dot-dot is the same as dot.
The Research UNIX System and some older UNIX System V file systems restricted a filename to 14 characters. BSD versions extended this limit to 255 characters. Today, almost all commercial UNIX file systems support at least 255-character filenames.
Pathname
A sequence of one or more filenames, separated by slashes and optionally starting with a slash, forms a pathname. A pathname that begins with a slash is called an absolute pathname; otherwise, it’s called a relative pathname. Relative pathnames refer to files relative to the current directory. The name for the root of the file system (/) is a special-case absolute pathname that has no filename component.
Example
Listing the names of all the files in a directory is not difficult. Figure 1.3 shows a bare-bones implementation of the ls(1) command.
/**
 * File:        intro/ls1.c
 * Description: List all the files in a directory
 * Date:        Sun Sep 25 12:30:51 CST 2016
 * Author:      firewaywei
 */
#include "apue.h"
#include <dirent.h>

int
main(int argc, char *argv[])
{
    DIR *dp;
    struct dirent *dirp;

    if (argc != 2)
    {
        err_quit("usage: ls directory_name");
    }
    if ((dp = opendir(argv[1])) == NULL)
    {
        err_sys("can't open %s", argv[1]);
    }
    while ((dirp = readdir(dp)) != NULL)
    {
        printf("%s\n", dirp->d_name);
    }
    closedir(dp);
    exit(0);
}
Figure 1.3 List all the files in a directory
The notation ls(1) is the normal way to reference a particular entry in the UNIX system manuals. It refers to the entry for ls in Section 1. The sections are normally numbered 1 through 8, and all the entries within each section are arranged alphabetically. Throughout this text, we assume that you have a copy of the manuals for your UNIX system.
Historically, UNIX systems lumped all eight sections together into what was called the UNIX Programmer's Manual. As the page count increased, the trend changed to distributing the sections among separate manuals: one for users, one for programmers, and one for system administrators, for example.
Some UNIX systems further divide the manual pages within a given section, using an uppercase letter. For example, all the standard input/output (I/O) functions in AT&T [1990e] are indicated as being in Section 3S, as in fopen(3S). Other systems have replaced the numeric sections with alphabetic ones, such as C for commands.
Today, most manuals are distributed in electronic form. If your manuals are online, the way to see the manual pages for the ls command would be something like
man 1 ls
or
man -s1 ls
Figure 1.3 is a program that just prints the name of every file in a directory, and nothing else. If the source file is named myls.c, we compile it into the default a.out executable file by running
cc myls.c
Historically, cc(1) is the C compiler. On systems with the GNU C compilation system, the C compiler is gcc(1). Here, cc is usually linked to gcc.
Some sample output is
$ ./ls1 /dev
.
..
log
fb0
vcsa6
vcs6
vcsa5
vcs5
many more lines that aren’t shown
vga_arbiter
$ ./ls1 /etc/ssl/private
can't open /etc/ssl/private: Permission denied
$ ./ls1 /dev/tty
can't open /dev/tty: Not a directory
Throughout this text, we’ll show commands that we run and the resulting output in this fashion: Characters that we type are shown in this font, whereas output from programs is shown like this. If we need to add comments to this output, we’ll show the comments in italics. The dollar sign that precedes our input is the prompt that is printed by the shell. We’ll always show the shell prompt as a dollar sign.
Note that the directory listing is not in alphabetical order. The ls command sorts the names before printing them.
There are many details to consider in this short program.
- First, we include a header of our own: apue.h. We include this header in almost every program in this text. This header includes some standard system headers and defines numerous constants and function prototypes that we use throughout the examples in the text. A listing of this header is in Appendix B.
- Next, we include a system header, dirent.h, to pick up the function prototypes for opendir and readdir, in addition to the definition of the dirent structure. On some systems, the definitions are split into multiple header files. For example, in the Ubuntu 12.04 Linux distribution, /usr/include/dirent.h declares the function prototypes and includes bits/dirent.h, which defines the dirent structure (and is actually stored in /usr/include/x86_64-linux-gnu/bits).
- The declaration of the main function uses the style supported by the ISO C standard. (We’ll have more to say about the ISO C standard in the next chapter.)
- We take an argument from the command line, argv[1], as the name of the directory to list. In Chapter 7, we’ll look at how the main function is called and how the command-line arguments and environment variables are accessible to the program.
- Because the actual format of directory entries varies from one UNIX system to another, we use the functions opendir, readdir, and closedir to manipulate the directory.
- The opendir function returns a pointer to a DIR structure, and we pass this pointer to the readdir function. We don’t care what’s in the DIR structure. We then call readdir in a loop, to read each directory entry. The readdir function returns a pointer to a dirent structure or, when it’s finished with the directory, a null pointer. All we examine in the dirent structure is the name of each directory entry (d_name). Using this name, we could then call the stat function (Section 4.2) to determine all the attributes of the file.
- We call two functions of our own to handle the errors: err_sys and err_quit. We can see from the preceding output that the err_sys function prints an informative message describing what type of error was encountered (“Permission denied” or “Not a directory”). These two error functions are shown and described in Appendix B. We also talk more about error handling in Section 1.7.
- When the program is done, it calls the function exit with an argument of 0. The function exit terminates a program. By convention, an argument of 0 means OK, and an argument between 1 and 255 means that an error occurred. In Section 8.5, we show how any program, such as a shell or a program that we write, can obtain the exit status of a program that it executes.
Working Directory
Every process has a working directory, sometimes called the current working directory. This is the directory from which all relative pathnames are interpreted. A process can change its working directory with the chdir function.
For example, the relative pathname doc/memo/joe refers to the file or directory joe, in the directory memo, in the directory doc, which must be a directory within the working directory. From looking just at this pathname, we know that both doc and memo have to be directories, but we can’t tell whether joe is a file or a directory. The pathname /usr/lib/lint is an absolute pathname that refers to the file or directory lint in the directory lib, in the directory usr, which is in the root directory.
Home Directory
When we log in, the working directory is set to our home directory. Our home directory is obtained from our entry in the password file (Section 1.3).
Input and Output
File Descriptors
File descriptors are normally small non-negative integers that the kernel uses to identify the files accessed by a process. Whenever it opens an existing file or creates a new file, the kernel returns a file descriptor that we use when we want to read or write the file.
Standard Input, Standard Output, and Standard Error
By convention, all shells open three descriptors whenever a new program is run: standard input, standard output, and standard error. If nothing special is done, as in the simple command
ls
then all three are connected to the terminal. Most shells provide a way to redirect any or all of these three descriptors to any file. For example,
ls > file.list
executes the ls command with its standard output redirected to the file named file.list.
Unbuffered I/O
Unbuffered I/O is provided by the functions open, read, write, lseek, and close. These functions all work with file descriptors.
Example
If we’re willing to read from the standard input and write to the standard output, then the program in Figure 1.4 copies any regular file on a UNIX system.
/**
 * File:        intro/mycat.c
 * Description: Copy standard input to standard output
 * Date:        Sun Sep 25 15:59:10 CST 2016
 * Author:      firewaywei
 */
#include "apue.h"

#define BUFFSIZE 4096

int
main(void)
{
    int n;
    char buf[BUFFSIZE];

    while ((n = read(STDIN_FILENO, buf, BUFFSIZE)) > 0)
    {
        if (write(STDOUT_FILENO, buf, n) != n)
        {
            err_sys("write error");
        }
    }
    if (n < 0)
    {
        err_sys("read error");
    }
    exit(0);
}
Figure 1.4 Copy standard input to standard output
The <unistd.h> header, included by apue.h, and the two constants STDIN_FILENO and STDOUT_FILENO are part of the POSIX standard (about which we’ll have a lot more to say in the next chapter). This header contains function prototypes for many of the UNIX system services, such as the read and write functions that we call.
The constants STDIN_FILENO and STDOUT_FILENO are defined in <unistd.h> and specify the file descriptors for standard input and standard output. These values are 0 and 1, respectively, as required by POSIX.1, but we’ll use the names for readability.
In Section 3.9, we’ll examine the BUFFSIZE constant in detail, seeing how various values affect the efficiency of the program. Regardless of the value of this constant, however, this program still copies any regular file.
The read function returns the number of bytes that are read, and this value is used as the number of bytes to write. When the end of the input file is encountered, read returns 0 and the program stops. If a read error occurs, read returns -1. Most of the system functions return -1 when an error occurs.
If we compile the program into the standard name (a.out) and execute it as
./a.out > data
standard input is the terminal, standard output is redirected to the file data, and standard error is also the terminal. If this output file doesn’t exist, the shell creates it by default. The program copies lines that we type to the standard output until we type the end-of-file character (usually Control-D).
If we run
./a.out < infile > outfile
then the file named infile will be copied to the file named outfile.
In Chapter 3, we describe the unbuffered I/O functions in more detail.
Standard I/O
The standard I/O functions provide a buffered interface to the unbuffered I/O functions. Using standard I/O relieves us from having to choose optimal buffer sizes, such as the BUFFSIZE constant in Figure 1.4. The standard I/O functions also simplify dealing with lines of input (a common occurrence in UNIX applications). The fgets function, for example, reads an entire line. The read function, in contrast, reads a specified number of bytes. As we shall see in Section 5.4, the standard I/O library provides functions that let us control the style of buffering used by the library.
The most common standard I/O function is printf. In programs that call printf, we’ll always include <stdio.h> (normally by including apue.h), as this header contains the function prototypes for all the standard I/O functions.
Example
The program in Figure 1.5, which we’ll examine in more detail in Section 5.8, is like the previous program that called read and write. This program copies standard input to standard output and can copy any regular file.
/**
 * File:        intro/getcputc.c
 * Description: Copy standard input to standard output, using standard I/O
 * Date:        Sun Sep 25 16:40:34 CST 2016
 * Author:      firewaywei
 */
#include "apue.h"

int
main(void)
{
    int c;

    while ((c = getc(stdin)) != EOF)
    {
        if (putc(c, stdout) == EOF)
        {
            err_sys("output error");
        }
    }
    if (ferror(stdin))
    {
        err_sys("input error");
    }
    exit(0);
}
Figure 1.5 Copy standard input to standard output, using standard I/O
The function getc reads one character at a time, and this character is written by putc. After the last byte of input has been read, getc returns the constant EOF (defined in <stdio.h>). The standard I/O constants stdin and stdout are also defined in the <stdio.h> header and refer to the standard input and standard output.
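If we run only the preprocessor (for example, with gcc -E) and look at the resulting file getcputc.i, we find the following declarations: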
extern struct _IO_FILE *stdin;
extern struct _IO_FILE *stdout;
extern struct _IO_FILE *stderr;
The constant EOF is defined as -1 in the C library headers:
/* End of file character.
Some things throughout the library rely on this being -1. */
#ifndef EOF
# define EOF (-1)
#endif
Programs and Processes
Program
A program is an executable file residing on disk in a directory. A program is read into memory and is executed by the kernel as a result of one of the seven exec functions. We’ll cover these functions in Section 8.10.
Processes and Process ID
An executing instance of a program is called a process, a term used on almost every page of this text. Some operating systems use the term task to refer to a program that is being executed.
The UNIX System guarantees that every process has a unique numeric identifier called the process ID. The process ID is always a non-negative integer.
Example
The program in Figure 1.6 prints its process ID.
/**
 * File:        intro/hello.c
 * Description: Print the process ID
 * Date:        Sun Sep 25 17:21:59 CST 2016
 * Author:      firewaywei
 */
#include "apue.h"

int
main(void)
{
    printf("hello world from process ID %ld\n", (long)getpid());
    exit(0);
}
Figure 1.6 Print the process ID
If we compile this program into the file a.out and execute it, we have
$ ./hello
hello world from process ID 6767
$ ./hello
hello world from process ID 6768
$ ./hello
hello world from process ID 6769
When this program runs, it calls the function getpid to obtain its process ID. As we shall see later, getpid returns a pid_t data type. We don’t know its size; all we know is that the standards guarantee that it will fit in a long integer. Because we have to tell printf the size of each argument to be printed, we have to cast the value to the largest data type that it might use (in this case, a long integer). Although most process IDs will fit in an int, using a long promotes portability.
Process Control
There are three primary functions for process control: fork, exec, and waitpid. (The exec function has seven variants, but we often refer to them collectively as simply the exec function.)
Example
The process control features of the UNIX System are demonstrated using a simple program (Figure 1.7) that reads commands from standard input and executes the commands. This is a bare-bones implementation of a shell-like program.
/**
 * File:        intro/shell1.c
 * Description: Read commands from standard input and execute them.
 * Date:        Sun Sep 25 17:35:24 CST 2016
 * Author:      firewaywei
 */
#include "apue.h"
#include <sys/wait.h>

int
main(void)
{
    char buf[MAXLINE];  /* from apue.h */
    pid_t pid;
    int status;

    printf("%% ");  /* print prompt (printf requires %% to print %) */
    while (fgets(buf, MAXLINE, stdin) != NULL)
    {
        if (buf[strlen(buf) - 1] == '\n')
        {
            buf[strlen(buf) - 1] = 0;   /* replace newline with null */
        }
        if ((pid = fork()) < 0)
        {
            err_sys("fork error");
        }
        else if (pid == 0)
        {
            /* child */
            execlp(buf, buf, (char *)0);
            err_ret("couldn't execute: %s", buf);
            exit(127);
        }
        /* parent */
        if ((pid = waitpid(pid, &status, 0)) < 0)
        {
            err_sys("waitpid error");
        }
        printf("%% ");
    }
    exit(0);
}
Figure 1.7 Read commands from standard input and execute them
There are several features to consider in this short program.
- We use the standard I/O function fgets to read one line at a time from the standard input. When we type the end-of-file character (which is often Control-D) as the first character of a line, fgets returns a null pointer, the loop stops, and the process terminates. In Chapter 18, we describe all the special terminal characters—end of file, backspace one character, erase entire line, and so on—and how to change them.
- Because each line returned by fgets is terminated with a newline character, followed by a null byte, we use the standard C function strlen to calculate the length of the string, and then replace the newline with a null byte. We do this because the execlp function wants a null-terminated argument, not a newline-terminated argument.
- We call fork to create a new process, which is a copy of the caller. We say that the caller is the parent and that the newly created process is the child. Then fork returns the non-negative process ID of the new child process to the parent, and returns 0 to the child. Because fork creates a new process, we say that it is called once—by the parent — but returns twice—in the parent and in the child.
- In the child, we call execlp to execute the command that was read from the standard input. This replaces the child process with the new program file. The combination of fork followed by exec is called spawning a new process on some operating systems. In the UNIX System, the two parts are separated into individual functions. We’ll say a lot more about these functions in Chapter 8.
- Because the child calls execlp to execute the new program file, the parent wants to wait for the child to terminate. This is done by calling waitpid, specifying which process to wait for: the pid argument, which is the process ID of the child. The waitpid function also returns the termination status of the child — the status variable — but in this simple program, we don’t do anything with this value. We could examine it to determine how the child terminated.
- The most fundamental limitation of this program is that we can’t pass arguments to the command we execute. We can’t, for example, specify the name of a directory to list. We can execute ls only on the working directory. To allow arguments would require that we parse the input line, separating the arguments by some convention, probably spaces or tabs, and then pass each argument as a separate parameter to the execlp function. Nevertheless, this program is still a useful demonstration of the UNIX System’s process control functions.
If we run this program, we get the following results. Note that our program has a different prompt (the percent sign) to distinguish it from the shell’s prompt.
$ ./shell1
% date
Sun Sep 25 19:53:48 CST 2016
% who
fireway pts/8 2016-09-24 10:59 (192.168.0.100)
% pwd
/home/fireway/study/apue.3e/intro
% ls
data getcputc.c hello ls1 Makefile mycat.c shell1 shell1_debug shell2.c testerror uidgid
getcputc getcputc.i hello.c ls1.c mycat [options] [file(s)] shell1.c shell2 tags testerror.c uidgid.c
% ^D
$
The notation ^D is used to indicate a control character. Control characters are special characters formed by holding down the control key (often labeled Control or Ctrl) on your keyboard and then pressing another key at the same time. Control-D, or ^D, is the default end-of-file character. We’ll see many more control characters when we discuss terminal I/O in Chapter 18.
Threads and Thread IDs
Usually, a process has only one thread of control: one set of machine instructions executing at a time. Some problems are easier to solve when more than one thread of control can operate on different parts of the problem. Additionally, multiple threads of control can exploit the parallelism possible on multiprocessor systems.
All threads within a process share the same address space, file descriptors, stacks, and process-related attributes. Each thread executes on its own stack, although any thread can access the stacks of other threads in the same process. Because they can access the same memory, the threads need to synchronize access to shared data among themselves to avoid inconsistencies.
Like processes, threads are identified by IDs. Thread IDs, however, are local to a process. A thread ID from one process has no meaning in another process. We use thread IDs to refer to specific threads as we manipulate the threads within a process.
Functions to control threads parallel those used to control processes. Because threads were added to the UNIX System long after the process model was established, however, the thread model and the process model have some complicated interactions, as we shall see in Chapter 12.
Error Handling
User Identification
User ID
The user ID from our entry in the password file is a numeric value that identifies us to the system. This user ID is assigned by the system administrator when our login name is assigned, and we cannot change it. The user ID is normally assigned to be unique for every user. We’ll see how the kernel uses the user ID to check whether we have the appropriate permissions to perform certain operations.
We call the user whose user ID is 0 either root or the superuser. The entry in the password file normally has a login name of root, and we refer to the special privileges of this user as superuser privileges. As we’ll see in Chapter 4, if a process has superuser privileges, most file permission checks are bypassed. Some operating system functions are restricted to the superuser. The superuser has free rein over the system.
Client versions of Mac OS X ship with the superuser account disabled; server versions ship with the account already enabled. Instructions are available on Apple’s Web site describing how to enable it. See http://support.apple.com/kb/HT1528.
Group ID
Our entry in the password file also specifies our numeric group ID. This, too, is assigned by the system administrator when our login name is assigned. Typically, the password file contains multiple entries that specify the same group ID. Groups are normally used to collect users together into projects or departments. This allows the sharing of resources, such as files, among members of the same group. We’ll see in Section 4.5 that we can set the permissions on a file so that all members of a group can access the file, whereas others outside the group cannot.
There is also a group file that maps group names into numeric group IDs. The group file is usually /etc/group.
The use of numeric user IDs and numeric group IDs for permissions is historical. With every file on disk, the file system stores both the user ID and the group ID of a file’s owner. Storing both of these values requires only four bytes, assuming that each is stored as a two-byte integer. If the full ASCII login name and group name were used instead, additional disk space would be required. In addition, comparing strings during permission checks is more expensive than comparing integers.
Users, however, work better with names than with numbers, so the password file maintains the mapping between login names and user IDs, and the group file provides the mapping between group names and group IDs. The ls -l command, for example, prints the login name of the owner of a file, using the password file to map the numeric user ID into the corresponding login name.
Early UNIX systems used 16-bit integers to represent user and group IDs. Contemporary UNIX systems use 32-bit integers.
Example
The program in Figure 1.9 prints the user ID and the group ID.
/**
 * File:        intro/uidgid.c
 * Description: Print the user ID and the group ID
 * Date:        Fri Sep 16 22:40:34 CST 2016
 * Author:      firewaywei
 */
#include "apue.h"

int
main(void)
{
    printf("uid = %d, gid = %d\n", getuid(), getgid());
    exit(0);
}
Figure 1.9 Print user ID and group ID
We call the functions getuid and getgid to return the user ID and the group ID. Running the program yields
$ ./uidgid
uid = 1000, gid = 1000
$ su
Password:
# ./uidgid
uid = 0, gid = 0
Supplementary Group IDs
In addition to the group ID specified in the password file for a login name, most versions of the UNIX System allow a user to belong to other groups. This practice started with 4.2BSD, which allowed a user to belong to up to 16 additional groups. These supplementary group IDs are obtained at login time by reading the file /etc/group and finding the first 16 entries that list the user as a member. As we shall see in the next chapter, POSIX requires that a system support at least 8 supplementary groups per process, but most systems support at least 16.
Signals
Time Values
System Calls and Library Functions
Summary
Reference
Advanced Programming in the UNIX Environment (2013), Chapter 1: UNIX System Overview