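As we all know, when a Hadoop 2.7.0 system is brought up for the first time, one command has to be run:
hadoop namenode -format
Below we go through the script behind this command line by line to see what it really does.
---
bin=`which $0`
bin=`dirname ${bin}`
bin=`cd "$bin"; pwd`
Printing bin gives: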
/root/hadoop-2.7.0-bin/bin
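The three assignments above turn whatever path the script was invoked through into an absolute directory. Here is a minimal sketch of the same dirname / cd / pwd idiom, applied to an arbitrary path instead of $0. The helper name resolve_script_dir and the temporary directory are assumptions made for the demo; they are not part of Hadoop.

```shell
# Hypothetical helper (not in Hadoop): the dirname / cd / pwd idiom from the
# top of bin/hadoop, applied to an arbitrary path instead of $0.
resolve_script_dir() {
  dir=$(dirname -- "$1")       # strip the file name
  (cd "$dir" && pwd)           # cd in a subshell, print the absolute path
}

demo_root=$(mktemp -d)
mkdir -p "$demo_root/bin"
touch "$demo_root/bin/hadoop"

# A path with a redundant ".." still resolves to the bin directory.
resolved=$(resolve_script_dir "$demo_root/bin/../bin/hadoop")
echo "$resolved"
```

The cd runs in a subshell, so the caller's working directory is left untouched.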
---
DEFAULT_LIBEXEC_DIR="$bin"/../libexec
This makes the value of DEFAULT_LIBEXEC_DIR:
/root/hadoop-2.7.0-bin/bin/../libexec
---
cygwin=false
case "$(uname)" in
CYGWIN*) cygwin=true;;
esac
This is obviously Cygwin-related: the flag only becomes true when the script is run under Cygwin on Windows. I am on Linux, so cygwin stays false and this block can be ignored.
---
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
Its printed value is:
/root/hadoop-2.7.0-bin/bin/../libexec
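The ${VAR:-default} form used for HADOOP_LIBEXEC_DIR shows up again and again in these scripts, so it is worth a small sketch. The variable names and paths below are made up for the demo; only the expansion form itself is taken from the script.

```shell
# Demo names are illustrative; only the ${VAR:-default} form comes from
# the Hadoop script.
DEFAULT_DIR=/opt/demo/libexec

unset DEMO_LIBEXEC_DIR
unset_case=${DEMO_LIBEXEC_DIR:-$DEFAULT_DIR}    # unset -> default is used

DEMO_LIBEXEC_DIR=
empty_case=${DEMO_LIBEXEC_DIR:-$DEFAULT_DIR}    # empty -> default (because of the colon)

DEMO_LIBEXEC_DIR=/custom/libexec
set_case=${DEMO_LIBEXEC_DIR:-$DEFAULT_DIR}      # set -> the caller's value is kept

echo "$unset_case $empty_case $set_case"
```

In other words, a value exported by the user before calling the script always wins over the computed default.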
---
Next, another script is sourced:
. $HADOOP_LIBEXEC_DIR/hadoop-config.sh
The path being sourced prints as /root/hadoop-2.7.0-bin/bin/../libexec/hadoop-config.sh,
which is in fact
/root/hadoop-2.7.0-bin/libexec/hadoop-config.sh
So let's turn to that script first.
|
|
|
---Analyzing /root/hadoop-2.7.0-bin/libexec/hadoop-config.sh
It starts with:
this="${BASH_SOURCE-$0}"
common_bin=$(cd -P -- "$(dirname -- "$this")" && pwd -P)
script="$(basename -- "$this")"
this="$common_bin/$script"
All the values, printed:
this=/root/hadoop-2.7.0-bin/libexec/hadoop-config.sh
common_bin=/root/hadoop-2.7.0-bin/libexec
script=hadoop-config.sh
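Because hadoop-config.sh is sourced rather than executed, $0 would still name the calling script; in bash, BASH_SOURCE is what lets a sourced file find its own location. A minimal sketch of that, assuming bash and using an invented file name (child-config.sh) in a temporary directory:

```shell
#!/usr/bin/env bash
# Sketch: a sourced file locating itself. The file name and the child_*
# variable names are made up for the demo.
demo_dir=$(mktemp -d)
cat > "$demo_dir/child-config.sh" <<'EOF'
# Same idiom as hadoop-config.sh, with demo variable names.
child_this="${BASH_SOURCE-$0}"
child_bin=$(cd -P -- "$(dirname -- "$child_this")" && pwd -P)
child_script="$(basename -- "$child_this")"
EOF

# Source it, as bin/hadoop does with hadoop-config.sh.
. "$demo_dir/child-config.sh"
echo "$child_bin/$child_script"
```

The -P flags make cd and pwd resolve symbolic links, so child_bin is the physical directory that holds the sourced file.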
---
Then an optional script is sourced:
[ -f "$common_bin/hadoop-layout.sh" ] && . "$common_bin/hadoop-layout.sh"
Thankfully, this script does not exist here, so nothing is sourced.
---
Next comes a large batch of assignments:
HADOOP_COMMON_DIR=${HADOOP_COMMON_DIR:-"share/hadoop/common"}
HADOOP_COMMON_LIB_JARS_DIR=${HADOOP_COMMON_LIB_JARS_DIR:-"share/hadoop/common/lib"}
HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_COMMON_LIB_NATIVE_DIR:-"lib/native"}
HDFS_DIR=${HDFS_DIR:-"share/hadoop/hdfs"}
HDFS_LIB_JARS_DIR=${HDFS_LIB_JARS_DIR:-"share/hadoop/hdfs/lib"}
YARN_DIR=${YARN_DIR:-"share/hadoop/yarn"}
YARN_LIB_JARS_DIR=${YARN_LIB_JARS_DIR:-"share/hadoop/yarn/lib"}
MAPRED_DIR=${MAPRED_DIR:-"share/hadoop/mapreduce"}
MAPRED_LIB_JARS_DIR=${MAPRED_LIB_JARS_DIR:-"share/hadoop/mapreduce/lib"}
The printed values are:
HADOOP_COMMON_DIR=share/hadoop/common
HADOOP_COMMON_LIB_JARS_DIR=share/hadoop/common/lib
HADOOP_COMMON_LIB_NATIVE_DIR=lib/native
HDFS_DIR=share/hadoop/hdfs
HDFS_LIB_JARS_DIR=share/hadoop/hdfs/lib
YARN_DIR=share/hadoop/yarn
YARN_LIB_JARS_DIR=share/hadoop/yarn/lib
MAPRED_DIR=share/hadoop/mapreduce
MAPRED_LIB_JARS_DIR=share/hadoop/mapreduce/lib
---
Next comes a small batch of assignments:
HADOOP_DEFAULT_PREFIX=$(cd -P -- "$common_bin"/.. && pwd -P)
HADOOP_PREFIX=${HADOOP_PREFIX:-$HADOOP_DEFAULT_PREFIX}
export HADOOP_PREFIX
The printed values are:
HADOOP_DEFAULT_PREFIX=/root/hadoop-2.7.0-bin
HADOOP_PREFIX=/root/hadoop-2.7.0-bin
---
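Next comes a conditional block:
#check to see if the conf dir is given as an optional argument
if [ $# -gt 1 ]
then
    if [ "--config" = "$1" ]
    then
        shift
        confdir=$1
        if [ ! -d "$confdir" ]; then
            echo "Error: Cannot find configuration directory: $confdir"
            exit 1
        fi
        shift
        HADOOP_CONF_DIR=$confdir
    fi
fi
Result:
Nothing is set for this command. There are two arguments (namenode and -format), so the outer test passes, but $1 is namenode rather than --config, so the inner branch is skipped.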
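To see both outcomes of the --config check side by side, here is a rough sketch that wraps the same logic in a function so it can be called with different argument lists. The function demo_parse_config and the demo_ variables are hypothetical, and the sketch returns an error code where the real script exits.

```shell
# Hypothetical function wrapping the --config handling; names prefixed
# demo_ are not from Hadoop.
demo_parse_config() {
  demo_conf_dir=""
  if [ $# -gt 1 ] && [ "--config" = "$1" ]; then
    shift                       # drop "--config"
    if [ ! -d "$1" ]; then
      echo "Error: Cannot find configuration directory: $1" >&2
      return 1                  # the real script calls exit 1 here
    fi
    demo_conf_dir=$1
    shift                       # drop the directory argument
  fi
  demo_remaining="$*"           # what is left for the rest of the script
}

# Case 1: the command from this article, no --config given.
demo_parse_config namenode -format
echo "conf='$demo_conf_dir' rest='$demo_remaining'"

# Case 2: an explicit configuration directory (a temp dir for the demo).
conf_tmp=$(mktemp -d)
demo_parse_config --config "$conf_tmp" namenode -format
echo "conf='$demo_conf_dir' rest='$demo_remaining'"
```

In both cases the remaining arguments are namenode -format, which is why the later command dispatch never has to know whether --config was given.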
---
if [ $# -gt 1 ]
then
    if [ "--loglevel" = "$1" ]
    then
        shift
        HADOOP_LOGLEVEL=$1
        shift
    fi
fi
HADOOP_LOGLEVEL="${HADOOP_LOGLEVEL:-INFO}"
The result is:
HADOOP_LOGLEVEL=INFO
---
# Allow alternate conf dir location.
if [ -e "${HADOOP_PREFIX}/conf/hadoop-env.sh" ]; then
  DEFAULT_CONF_DIR="conf"
else
  DEFAULT_CONF_DIR="etc/hadoop"
fi
export HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-$HADOOP_PREFIX/$DEFAULT_CONF_DIR}"
Printed:
HADOOP_CONF_DIR=/root/hadoop-2.7.0-bin/etc/hadoop
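The choice between conf and etc/hadoop depends only on whether conf/hadoop-env.sh exists under the prefix. A small sketch with two throwaway directory trees; the helper demo_default_conf_dir and both layouts are invented for illustration.

```shell
# Hypothetical helper: picks the default conf dir relative to a prefix,
# using the same -e test as hadoop-config.sh.
demo_default_conf_dir() {
  if [ -e "$1/conf/hadoop-env.sh" ]; then
    echo "conf"
  else
    echo "etc/hadoop"
  fi
}

with_conf=$(mktemp -d)       # layout that has conf/hadoop-env.sh
without_conf=$(mktemp -d)    # layout that only has etc/hadoop
mkdir -p "$with_conf/conf" "$without_conf/etc/hadoop"
touch "$with_conf/conf/hadoop-env.sh" "$without_conf/etc/hadoop/hadoop-env.sh"

demo_default_conf_dir "$with_conf"
demo_default_conf_dir "$without_conf"
```

The install in this article has no conf directory, which matches the etc/hadoop value printed above.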
---
Next comes a sanity check:
# User can specify hostnames or a file where the hostnames are (not both)
if [[ ( "$HADOOP_SLAVES" != '' ) && ( "$HADOOP_SLAVE_NAMES" != '' ) ]] ; then
  echo \
    "Error: Please specify one variable HADOOP_SLAVES or " \
    "HADOOP_SLAVE_NAME and not both."
  exit 1
fi
Result:
The check passes (the two variables are not both set).
---
# Process command line options that specify hosts or file with host
# list
if [ $# -gt 1 ]
then
    if [ "--hosts" = "$1" ]
    then
        shift
        export HADOOP_SLAVES="${HADOOP_CONF_DIR}/$1"
        shift
    elif [ "--hostnames" = "$1" ]
    then
        shift
        export HADOOP_SLAVE_NAMES=$1
        shift
    fi
fi
The result is:
Sorry, neither branch is taken.
---
# User can specify hostnames or a file where the hostnames are (not both)
# (same check as above but now we know it's command line options that cause
# the problem)
if [[ ( "$HADOOP_SLAVES" != '' ) && ( "$HADOOP_SLAVE_NAMES" != '' ) ]] ; then
  echo \
    "Error: Please specify one of --hosts or --hostnames options and not both."
  exit 1
fi
Evidently the error branch is not taken here either.
---
Next, yet another script is sourced:
if [ -f "${HADOOP_CONF_DIR}/hadoop-env.sh" ]; then
  . "${HADOOP_CONF_DIR}/hadoop-env.sh"
fi
Its path is:
/root/hadoop-2.7.0-bin/etc/hadoop/hadoop-env.sh
Let's follow it into that script.
|
|
|
---/root/hadoop-2.7.0-bin/etc/hadoop/hadoop-env.sh
This script consists entirely of export statements, so I won't go through it in detail.
Let's go back to hadoop-config.sh.
---
Next is:
# check if net.ipv6.bindv6only is set to 1
bindv6only=$(/sbin/sysctl -n net.ipv6.bindv6only 2> /dev/null)
if [ -n "$bindv6only" ] && [ "$bindv6only" -eq "1" ] && [ "$HADOOP_ALLOW_IPV6" != "yes" ]
then
  echo "Error: \"net.ipv6.bindv6only\" is set to 1 - Java networking could be broken"
  echo "For more info: http://wiki.apache.org/hadoop/HadoopIPv6"
  exit 1
fi
Evidently the error branch is not taken.
---
Next is:
# Newer versions of glibc use an arena memory allocator that causes virtual
# memory usage to explode. This interacts badly with the many threads that
# we use in Hadoop. Tune the variable down to prevent vmem explosion.
export MALLOC_ARENA_MAX=${MALLOC_ARENA_MAX:-4}
Printed:
MALLOC_ARENA_MAX=4
---
Next comes JAVA_HOME:
# Attempt to set JAVA_HOME if it is not set
if [[ -z $JAVA_HOME ]]; then
  # On OSX use java_home (or /Library for older versions)
  if [ "Darwin" == "$(uname -s)" ]; then
    if [ -x /usr/libexec/java_home ]; then
      export JAVA_HOME=($(/usr/libexec/java_home))
    else
      export JAVA_HOME=(/Library/Java/Home)
    fi
  fi
  # Bail if we did not detect it
  if [[ -z $JAVA_HOME ]]; then
    echo "Error: JAVA_HOME is not set and could not be found." 1>&2
    exit 1
  fi
fi
Printed (JAVA_HOME was already set, so the whole block is skipped):
JAVA_HOME=/usr/java/jdk1.8.0_45
---
JAVA=$JAVA_HOME/bin/java
# some Java parameters
JAVA_HEAP_MAX=-Xmx1000m
# check envvars which might override default args
if [ "$HADOOP_HEAPSIZE" != "" ]; then
  #echo "run with heapsize $HADOOP_HEAPSIZE"
  JAVA_HEAP_MAX="-Xmx""$HADOOP_HEAPSIZE""m"
  #echo $JAVA_HEAP_MAX
fi
The printed value is:
JAVA_HEAP_MAX=-Xmx1000m
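The heap flag is the 1000 MB default unless HADOOP_HEAPSIZE is set, in which case the number is treated as megabytes. A minimal sketch of that rule; the function name demo_heap_flag is an assumption for the demo, and its argument stands in for HADOOP_HEAPSIZE.

```shell
# Hypothetical wrapper around the heap-size logic; $1 stands in for
# HADOOP_HEAPSIZE.
demo_heap_flag() {
  heap_max=-Xmx1000m                 # built-in default
  if [ "$1" != "" ]; then
    heap_max="-Xmx""$1""m"           # the value is interpreted as megabytes
  fi
  echo "$heap_max"
}

demo_heap_flag ""        # no override
demo_heap_flag 2048      # as if HADOOP_HEAPSIZE=2048
```

Since HADOOP_HEAPSIZE is empty in this run, the default -Xmx1000m is what survives.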
---
# CLASSPATH initially contains $HADOOP_CONF_DIR
CLASSPATH="${HADOOP_CONF_DIR}"
Printed:
CLASSPATH=/root/hadoop-2.7.0-bin/etc/hadoop
---
# so that filenames w/ spaces are handled correctly in loops below
IFS=
if [ "$HADOOP_COMMON_HOME" = "" ]; then
  if [ -d "${HADOOP_PREFIX}/$HADOOP_COMMON_DIR" ]; then
    export HADOOP_COMMON_HOME=$HADOOP_PREFIX
  fi
fi
Printed:
HADOOP_COMMON_HOME=/root/hadoop-2.7.0-bin
---
# for releases, add core hadoop jar & webapps to CLASSPATH
if [ -d "$HADOOP_COMMON_HOME/$HADOOP_COMMON_DIR/webapps" ]; then
  CLASSPATH=${CLASSPATH}:$HADOOP_COMMON_HOME/$HADOOP_COMMON_DIR
fi
if [ -d "$HADOOP_COMMON_HOME/$HADOOP_COMMON_LIB_JARS_DIR" ]; then
  CLASSPATH=${CLASSPATH}:$HADOOP_COMMON_HOME/$HADOOP_COMMON_LIB_JARS_DIR'/*'
fi
CLASSPATH=${CLASSPATH}:$HADOOP_COMMON_HOME/$HADOOP_COMMON_DIR'/*'
The printed value is:
CLASSPATH=/root/hadoop-2.7.0-bin/etc/hadoop:/root/hadoop-2.7.0-bin/share/hadoop/common/lib/*:/root/hadoop-2.7.0-bin/share/hadoop/common/*
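Notice that the printed CLASSPATH still contains literal /* entries. The shell is not supposed to expand them; the JVM understands a trailing /* in a class path entry and picks up the jars in that directory itself. A small sketch showing that the quoted '/*' stays literal even when matching files exist. The directory and jar names are made up for the demo.

```shell
# Sketch: the quoted '/*' is stored as a literal, not expanded to file
# names. Directory and jar names are invented for the demo.
lib_dir=$(mktemp -d)
touch "$lib_dir/a.jar" "$lib_dir/b.jar"

demo_cp=/etc/demo-conf
demo_cp=${demo_cp}:$lib_dir'/*'      # same quoting as hadoop-config.sh

echo "$demo_cp"                      # quoted again here, so still a literal /*
```

The variable has to stay quoted whenever it is used later; an unquoted expansion in a command line would let the shell try to expand the pattern.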
---
# default log directory & file
if [ "$HADOOP_LOG_DIR" = "" ]; then
  HADOOP_LOG_DIR="$HADOOP_PREFIX/logs"
fi
if [ "$HADOOP_LOGFILE" = "" ]; then
  HADOOP_LOGFILE='hadoop.log'
fi
# default policy file for service-level authorization
if [ "$HADOOP_POLICYFILE" = "" ]; then
  HADOOP_POLICYFILE="hadoop-policy.xml"
fi
# restore ordinary behaviour
unset IFS
Printed:
HADOOP_LOG_DIR=/root/hadoop-2.7.0-bin/logs
HADOOP_LOGFILE=hadoop.log
HADOOP_POLICYFILE=hadoop-policy.xml
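The script sets IFS to the empty string before this stretch of code and unsets it again afterwards. The effect is easiest to see by counting how many words an unquoted expansion produces in each mode. This is only an illustration; count_words and the sample path are invented.

```shell
# Sketch: how an empty IFS changes word splitting of unquoted expansions.
count_words() { echo $#; }

demo_path="/opt/my hadoop/share dir"          # a path containing spaces

IFS=
with_empty_ifs=$(count_words $demo_path)      # unquoted on purpose: stays one word
unset IFS
with_default_ifs=$(count_words $demo_path)    # unquoted on purpose: split on spaces

echo "$with_empty_ifs $with_default_ifs"
```

With IFS empty the path survives as a single argument; once IFS is unset the default splitting on whitespace is back.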
---
# setup 'java.library.path' for native-hadoop code if necessary
if [ -d "${HADOOP_PREFIX}/build/native" -o -d "${HADOOP_PREFIX}/$HADOOP_COMMON_LIB_NATIVE_DIR" ]; then
  if [ -d "${HADOOP_PREFIX}/$HADOOP_COMMON_LIB_NATIVE_DIR" ]; then
    if [ "x$JAVA_LIBRARY_PATH" != "x" ]; then
      JAVA_LIBRARY_PATH=${JAVA_LIBRARY_PATH}:${HADOOP_PREFIX}/$HADOOP_COMMON_LIB_NATIVE_DIR
    else
      JAVA_LIBRARY_PATH=${HADOOP_PREFIX}/$HADOOP_COMMON_LIB_NATIVE_DIR
    fi
  fi
fi
Printed:
JAVA_LIBRARY_PATH=/root/hadoop-2.7.0-bin/lib/native
---
# setup a default TOOL_PATH
TOOL_PATH="${TOOL_PATH:-$HADOOP_PREFIX/share/hadoop/tools/lib/*}"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.dir=$HADOOP_LOG_DIR"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.log.file=$HADOOP_LOGFILE"
HADOOP_HOME=$HADOOP_PREFIX
if $cygwin; then
  HADOOP_HOME=$(cygpath -w "$HADOOP_HOME" 2>/dev/null)
fi
export HADOOP_HOME
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.home.dir=$HADOOP_HOME"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.id.str=$HADOOP_IDENT_STRING"
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-${HADOOP_LOGLEVEL},console}"
if [ "x$JAVA_LIBRARY_PATH" != "x" ]; then
  if $cygwin; then
    JAVA_LIBRARY_PATH=$(cygpath -w "$JAVA_LIBRARY_PATH" 2>/dev/null)
  fi
  HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$JAVA_LIBRARY_PATH"
  export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$JAVA_LIBRARY_PATH
fi
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.policy.file=$HADOOP_POLICYFILE"
# Disable ipv6 as it can cause issues
HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"
This is just a long run of assignments that append JVM options to HADOOP_OPTS, so I won't dwell on it.
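One line in that block is less obvious than the rest: the root logger option nests one default inside another. A rough sketch of how it resolves; demo_root_logger is a made-up function whose two arguments stand in for HADOOP_ROOT_LOGGER and HADOOP_LOGLEVEL.

```shell
# Sketch of the nested default used for -Dhadoop.root.logger.
# $1 stands in for HADOOP_ROOT_LOGGER, $2 for HADOOP_LOGLEVEL.
demo_root_logger() {
  echo "-Dhadoop.root.logger=${1:-${2},console}"
}

demo_root_logger "" INFO              # no explicit logger: level plus console appender
demo_root_logger "DEBUG,RFA" INFO     # explicit logger: used as is, level ignored
```

So the HADOOP_LOGLEVEL=INFO computed earlier only matters when HADOOP_ROOT_LOGGER is empty.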
---
# put hdfs in classpath if present
if [ "$HADOOP_HDFS_HOME" = "" ]; then
  if [ -d "${HADOOP_PREFIX}/$HDFS_DIR" ]; then
    export HADOOP_HDFS_HOME=$HADOOP_PREFIX
  fi
fi
Printed:
HADOOP_HDFS_HOME=/root/hadoop-2.7.0-bin
---
if [ -d "$HADOOP_HDFS_HOME/$HDFS_DIR/webapps" ]; then
  CLASSPATH=${CLASSPATH}:$HADOOP_HDFS_HOME/$HDFS_DIR
fi
if [ -d "$HADOOP_HDFS_HOME/$HDFS_LIB_JARS_DIR" ]; then
  CLASSPATH=${CLASSPATH}:$HADOOP_HDFS_HOME/$HDFS_LIB_JARS_DIR'/*'
fi
CLASSPATH=${CLASSPATH}:$HADOOP_HDFS_HOME/$HDFS_DIR'/*'
Printed:
CLASSPATH=/root/hadoop-2.7.0-bin/etc/hadoop:/root/hadoop-2.7.0-bin/share/hadoop/common/lib/*:/root/hadoop-2.7.0-bin/share/hadoop/common/*:/root/hadoop-2.7.0-bin/share/hadoop/hdfs:/root/hadoop-2.7.0-bin/share/hadoop/hdfs/lib/*:/root/hadoop-2.7.0-bin/share/hadoop/hdfs/*
---
# put yarn in classpath if present
if [ "$HADOOP_YARN_HOME" = "" ]; then
  if [ -d "${HADOOP_PREFIX}/$YARN_DIR" ]; then
    export HADOOP_YARN_HOME=$HADOOP_PREFIX
  fi
fi
Printed:
HADOOP_YARN_HOME=/root/hadoop-2.7.0-bin
---
if [ -d "$HADOOP_YARN_HOME/$YARN_DIR/webapps" ]; then
  CLASSPATH=${CLASSPATH}:$HADOOP_YARN_HOME/$YARN_DIR
fi
if [ -d "$HADOOP_YARN_HOME/$YARN_LIB_JARS_DIR" ]; then
  CLASSPATH=${CLASSPATH}:$HADOOP_YARN_HOME/$YARN_LIB_JARS_DIR'/*'
fi
CLASSPATH=${CLASSPATH}:$HADOOP_YARN_HOME/$YARN_DIR'/*'
# put mapred in classpath if present AND different from YARN
if [ "$HADOOP_MAPRED_HOME" = "" ]; then
  if [ -d "${HADOOP_PREFIX}/$MAPRED_DIR" ]; then
    export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
  fi
fi
if [ "$HADOOP_MAPRED_HOME/$MAPRED_DIR" != "$HADOOP_YARN_HOME/$YARN_DIR" ] ; then
  if [ -d "$HADOOP_MAPRED_HOME/$MAPRED_DIR/webapps" ]; then
    CLASSPATH=${CLASSPATH}:$HADOOP_MAPRED_HOME/$MAPRED_DIR
  fi
  if [ -d "$HADOOP_MAPRED_HOME/$MAPRED_LIB_JARS_DIR" ]; then
    CLASSPATH=${CLASSPATH}:$HADOOP_MAPRED_HOME/$MAPRED_LIB_JARS_DIR'/*'
  fi
  CLASSPATH=${CLASSPATH}:$HADOOP_MAPRED_HOME/$MAPRED_DIR'/*'
fi
Printed:
CLASSPATH=/root/hadoop-2.7.0-bin/etc/hadoop:/root/hadoop-2.7.0-bin/share/hadoop/common/lib/*:/root/hadoop-2.7.0-bin/share/hadoop/common/*:/root/hadoop-2.7.0-bin/share/hadoop/hdfs:/root/hadoop-2.7.0-bin/share/hadoop/hdfs/lib/*:/root/hadoop-2.7.0-bin/share/hadoop/hdfs/*:/root/hadoop-2.7.0-bin/share/hadoop/yarn/lib/*:/root/hadoop-2.7.0-bin/share/hadoop/yarn/*:/root/hadoop-2.7.0-bin/share/hadoop/mapreduce/lib/*:/root/hadoop-2.7.0-bin/share/hadoop/mapreduce/*
---
# Add the user-specified CLASSPATH via HADOOP_CLASSPATH
# Add it first or last depending on if user has
# set env-var HADOOP_USER_CLASSPATH_FIRST
# if the user set HADOOP_USE_CLIENT_CLASSLOADER, HADOOP_CLASSPATH is not added
# to the classpath
if [[ ( "$HADOOP_CLASSPATH" != "" ) && ( "$HADOOP_USE_CLIENT_CLASSLOADER" = "" ) ]]; then
  # Prefix it if its to be preceded
  if [ "$HADOOP_USER_CLASSPATH_FIRST" != "" ]; then
    CLASSPATH=${HADOOP_CLASSPATH}:${CLASSPATH}
  else
    CLASSPATH=${CLASSPATH}:${HADOOP_CLASSPATH}
  fi
fi
Printed (the trailing /contrib/capacity-scheduler/*.jar entry is the HADOOP_CLASSPATH value, presumably set when hadoop-env.sh was sourced):
CLASSPATH=/root/hadoop-2.7.0-bin/etc/hadoop:/root/hadoop-2.7.0-bin/share/hadoop/common/lib/*:/root/hadoop-2.7.0-bin/share/hadoop/common/*:/root/hadoop-2.7.0-bin/share/hadoop/hdfs:/root/hadoop-2.7.0-bin/share/hadoop/hdfs/lib/*:/root/hadoop-2.7.0-bin/share/hadoop/hdfs/*:/root/hadoop-2.7.0-bin/share/hadoop/yarn/lib/*:/root/hadoop-2.7.0-bin/share/hadoop/yarn/*:/root/hadoop-2.7.0-bin/share/hadoop/mapreduce/lib/*:/root/hadoop-2.7.0-bin/share/hadoop/mapreduce/*:/contrib/capacity-scheduler/*.jar
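The prepend-or-append decision can be sketched in isolation. This is a simplified illustration that leaves out the HADOOP_USE_CLIENT_CLASSLOADER condition; demo_merge_classpath and the sample paths are made up, and the three arguments stand in for CLASSPATH, HADOOP_CLASSPATH and HADOOP_USER_CLASSPATH_FIRST.

```shell
# Hypothetical helper: $1 = base classpath, $2 = user classpath,
# $3 = non-empty to put the user entries first.
demo_merge_classpath() {
  base=$1
  user=$2
  user_first=$3
  if [ "$user" = "" ]; then
    echo "$base"                 # nothing to add
  elif [ "$user_first" != "" ]; then
    echo "${user}:${base}"       # user entries take precedence
  else
    echo "${base}:${user}"       # default: user entries go last
  fi
}

demo_merge_classpath "/conf:/lib/*" "/extra/*" ""
demo_merge_classpath "/conf:/lib/*" "/extra/*" "true"
```

In this run HADOOP_USER_CLASSPATH_FIRST is empty, so the user entry lands at the end, exactly as in the printed CLASSPATH.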
---Then we return to the hadoop script we started from
|
|
|
---Execution continues
function print_usage(){
  echo "Usage: hadoop [--config confdir] [COMMAND | CLASSNAME]"
  echo " CLASSNAME run the class named CLASSNAME"
  echo " or"
  echo " where COMMAND is one of:"
  echo " fs run a generic filesystem user client"
  echo " version print the version"
  echo " jar <jar> run a jar file"
  echo " note: please use \"yarn jar\" to launch"
  echo " YARN applications, not this command."
  echo " checknative [-a|-h] check native hadoop and compression libraries availability"
  echo " distcp <srcurl> <desturl> copy file or directories recursively"
  echo " archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive"
  echo " classpath prints the class path needed to get the"
  echo " credential interact with credential providers"
  echo " Hadoop jar and the required libraries"
  echo " daemonlog get/set the log level for each daemon"
  echo " trace view and modify Hadoop tracing settings"
  echo ""
  echo "Most commands print help when invoked w/o parameters."
}
Not much to say here: this only defines the function that prints the usage message.
---
if [ $# = 0 ]; then
  print_usage
  exit
fi
No explanation needed: with no arguments, the script prints the usage message and exits.
---Next the command itself is dispatched
The branch taken is:
#hdfs commands
namenode|secondarynamenode|datanode|dfs|dfsadmin|fsck|balancer|fetchdt|oiv|dfsgroups|portmap|nfs3)
  echo "DEPRECATED: Use of this script to execute hdfs command is deprecated." 1>&2
  echo "Instead use the hdfs command for it." 1>&2
  echo "" 1>&2
  #try to locate hdfs and if present, delegate to it.
  shift
  if [ -f "${HADOOP_HDFS_HOME}"/bin/hdfs ]; then
    exec "${HADOOP_HDFS_HOME}"/bin/hdfs ${COMMAND/dfsgroups/groups} "$@"
  elif [ -f "${HADOOP_PREFIX}"/bin/hdfs ]; then
    exec "${HADOOP_PREFIX}"/bin/hdfs ${COMMAND/dfsgroups/groups} "$@"
  else
    echo "HADOOP_HDFS_HOME not found!"
    exit 1
  fi
  ;;
What finally gets executed is:
/root/hadoop-2.7.0-bin/bin/hdfs namenode -format
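To make the delegation step concrete, here is a rough bash sketch that prints the command line it would hand over instead of calling exec. The function demo_delegate is hypothetical and the pattern list is shortened; only the case dispatch, the shift, and the ${COMMAND/dfsgroups/groups} substitution mirror the branch above.

```shell
#!/usr/bin/env bash
# Sketch of the delegation step; it prints the command instead of exec-ing
# it. demo_delegate is a made-up name and the pattern list is shortened.
demo_delegate() {
  local COMMAND=$1
  shift                                   # "$@" now holds only the command's own arguments
  case $COMMAND in
    namenode|datanode|dfs|dfsadmin|fsck|dfsgroups)
      # ${COMMAND/dfsgroups/groups} renames dfsgroups and leaves other names alone
      echo "hdfs ${COMMAND/dfsgroups/groups} $*"
      ;;
    *)
      echo "not an hdfs command: $COMMAND"
      ;;
  esac
}

demo_delegate namenode -format
demo_delegate dfsgroups alice
```

For hadoop namenode -format the substitution changes nothing, so the arguments pass through to hdfs unchanged.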
The next post will walk through this command.