Upgrading a service from spark2.3.0-hadoop2.8 to spark2.4.0-hadoop3.0
Soon after the upgrade, Spark Streaming's Kafka consumption began to fall behind and data backed up.
The service is not deployed on YARN in the traditional way, but runs on Kubernetes (1.13.2): https://spark.apache.org/docs/latest/running-on-kubernetes.html
Since we had recently done major work on the cluster, I assumed the backlog was caused by a cluster I/O bottleneck, and made several I/O-focused optimizations, but they had little effect.
I kept watching the service logs and the servers' load.
Then I noticed something odd: CPU usage of the Spark-related services hovered between 100% and 200%, staying at 100% for long stretches.
The cluster machines have 32 cores, so 100% CPU usage means essentially a single core in use; something was clearly wrong here.
My guess: the backlog was most likely not an I/O bottleneck but a compute bottleneck (the service internally performs compute-intensive work such as word segmentation, classification, and clustering).
The program tunes itself according to the number of CPU cores.
The method for getting the core count of the environment:
def GetCpuCoreNum(): Int = {
  Runtime.getRuntime.availableProcessors
}
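For context, a minimal sketch of how that core-based tuning typically looks (the thread pool here is hypothetical, not the service's actual code). The point is that any parallelism derived from availableProcessors collapses to single-threaded execution when the JVM reports 1 core:

import java.util.concurrent.Executors

// Hypothetical illustration: size a worker pool from the detected core count.
// If the JVM reports 1 core inside the container, this pool -- and anything
// else derived from the same value -- runs effectively single-threaded.
val pool = Executors.newFixedThreadPool(GetCpuCoreNum())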
Printing the core count.
spark 2.4.0:
root@consume-topic-qk-nwd-7d84585f5-kh7z5:/usr/spark-2.4.0# java -version
java version "1.8.0_202"
Java(TM) SE Runtime Environment (build 1.8.0_202-b08)
Java HotSpot(TM) 64-Bit Server VM (build 25.202-b08, mixed mode)

[cuidapeng@wx-k8s-4 ~]$ kb logs consume-topic-qk-nwd-7d84585f5-kh7z5 |more
2019-03-04 15:21:59 WARN NativeCodeLoader:60 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Cpu core Num 1
2019-03-04 15:22:00 INFO SparkContext:54 - Running Spark version 2.4.0
2019-03-04 15:22:00 INFO SparkContext:54 - Submitted application: topic-quick
2019-03-04 15:22:00 INFO SecurityManager:54 - Changing view acls to: root
2019-03-04 15:22:00 INFO SecurityManager:54 - Changing modify acls to: root
2019-03-04 15:22:00 INFO SecurityManager:54 - Changing view acls groups to:
2019-03-04 15:22:00 INFO SecurityManager:54 - Changing modify acls groups to:
2019-03-04 15:22:00 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
2019-03-04 15:22:00 INFO Utils:54 - Successfully started service 'sparkDriver' on port 33016.
2019-03-04 15:22:00 INFO SparkEnv:54 - Registering MapOutputTracker
2019-03-04 15:22:01 INFO SparkEnv:54 - Registering BlockManagerMaster
2019-03-04 15:22:01 INFO BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2019-03-04 15:22:01 INFO BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2019-03-04 15:22:01 INFO DiskBlockManager:54 - Created local directory at /tmp/blockmgr-dc0c496e-e5ab-4d07-a518-440f2336f65c
2019-03-04 15:22:01 INFO MemoryStore:54 - MemoryStore started with capacity 4.5 GB
2019-03-04 15:22:01 INFO SparkEnv:54 - Registering OutputCommitCoordinator
2019-03-04 15:22:01 INFO log:192 - Logging initialized @2888ms
Cpu core Num 1: the service had fallen back to single-core computation, and this was the cause of the backlog.
The guess was confirmed, so I rolled the version back to 2.3.0.
Rolled back to spark 2.3.0:
root@consume-topic-dt-nwd-67b7fd6dd5-jztpb:/usr/spark-2.3.0# java -version
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)

[cuidapeng@wx-k8s-4 ~]$ kb logs consume-topic-dt-nwd-67b7fd6dd5-jztpb | more
2019-03-04 15:16:22 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Cpu core Num 32
2019-03-04 15:16:23 INFO SparkContext:54 - Running Spark version 2.3.0
2019-03-04 15:16:23 INFO SparkContext:54 - Submitted application: topic-dt
2019-03-04 15:16:23 INFO SecurityManager:54 - Changing view acls to: root
2019-03-04 15:16:23 INFO SecurityManager:54 - Changing modify acls to: root
2019-03-04 15:16:23 INFO SecurityManager:54 - Changing view acls groups to:
2019-03-04 15:16:23 INFO SecurityManager:54 - Changing modify acls groups to:
2019-03-04 15:16:23 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
2019-03-04 15:16:23 INFO Utils:54 - Successfully started service 'sparkDriver' on port 40616.
2019-03-04 15:16:23 INFO SparkEnv:54 - Registering MapOutputTracker
2019-03-04 15:16:23 INFO SparkEnv:54 - Registering BlockManagerMaster
2019-03-04 15:16:23 INFO BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2019-03-04 15:16:23 INFO BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2019-03-04 15:16:23 INFO DiskBlockManager:54 - Created local directory at /tmp/blockmgr-5dbf1194-477a-4001-8738-3da01b5a3f01
2019-03-04 15:16:23 INFO MemoryStore:54 - MemoryStore started with capacity 6.2 GB
2019-03-04 15:16:23 INFO SparkEnv:54 - Registering OutputCommitCoordinator
2019-03-04 15:16:24 INFO log:192 - Logging initialized @2867ms
Cpu core Num 32: 32 is the physical machine's core count.
So the backlog was not caused by I/O but by the runtime seeing fewer usable cores: after the upgrade to Spark 2.4.0, the service went from 32-core parallel execution to single-core execution.
This is really not a Spark problem but a JDK problem.
A long time ago we had a requirement to limit CPU cores inside Docker, which needed the JDK to report the core count Docker allows rather than the host's. My recollection is that this had been requested of the JDK and was slated for JDK 9 or 10, but could not be done on JDK 8, so we dropped the plan of limiting cores through Docker and constrained compute resources by spreading service scheduling instead.
I did not expect this behavior to change within JDK 8, and that is exactly the pit we fell into here.
Docker's CPU-control options:
Usage: docker run [OPTIONS] IMAGE [COMMAND] [ARG...]

Run a command in a new container

Options:
      --cpu-period int          Limit CPU CFS (Completely Fair Scheduler) period
      --cpu-quota int           Limit CPU CFS (Completely Fair Scheduler) quota
      --cpu-rt-period int       Limit CPU real-time period in microseconds
      --cpu-rt-runtime int      Limit CPU real-time runtime in microseconds
  -c, --cpu-shares int          CPU shares (relative weight)
      --cpus decimal            Number of CPUs
      --cpuset-cpus string      CPUs in which to allow execution (0-3, 0,1)
      --cpuset-mems string      MEMs in which to allow execution (0-3, 0,1)
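To check the limit a container is actually under, independent of what the JDK reports, you can read the cgroup values that --cpu-quota and --cpu-period map to. A minimal sketch, assuming Docker's cgroup v1 file layout (cgroup v2 and other runtimes use different paths):

import scala.io.Source

// Docker translates --cpu-quota/--cpu-period into these cgroup v1 files;
// the paths are an assumption about the runtime's cgroup layout.
def cgroupLong(path: String): Long =
  Source.fromFile(path).getLines().next().trim.toLong

val quota  = cgroupLong("/sys/fs/cgroup/cpu/cpu.cfs_quota_us")  // -1 means unlimited
val period = cgroupLong("/sys/fs/cgroup/cpu/cpu.cfs_period_us")
val effectiveCores =
  if (quota < 0) Runtime.getRuntime.availableProcessors
  else math.max(1, math.ceil(quota.toDouble / period).toInt)
println(s"cgroup CPU limit: $effectiveCores core(s)")

This quota/period ratio is essentially what container-aware JDK builds compute when availableProcessors is called.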
Another wrinkle: the service is scheduled by Kubernetes, which adds its own layer of resource management on top of Docker.
Kubernetes offers two approaches to CPU control:
one based on dedicated cores: https://kubernetes.io/blog/2018/07/24/feature-highlight-cpu-manager/
one based on shares/percentages: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/
Manually allocating CPU resources:
resources:
  requests:
    cpu: 12
    memory: "24Gi"
  limits:
    cpu: 12
    memory: "24Gi"
Updating the service:
[cuidapeng@wx-k8s-4 ~]$ kb logs consume-topic-dt-nwd-99cf6d789-6hkcg |more
2019-03-04 16:24:57 WARN NativeCodeLoader:60 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Cpu core Num 12
2019-03-04 16:24:57 INFO SparkContext:54 - Running Spark version 2.4.0
2019-03-04 16:24:58 INFO SparkContext:54 - Submitted application: topic-dt
2019-03-04 16:24:58 INFO SecurityManager:54 - Changing view acls to: root
2019-03-04 16:24:58 INFO SecurityManager:54 - Changing modify acls to: root
2019-03-04 16:24:58 INFO SecurityManager:54 - Changing view acls groups to:
2019-03-04 16:24:58 INFO SecurityManager:54 - Changing modify acls groups to:
2019-03-04 16:24:58 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
2019-03-04 16:24:58 INFO Utils:54 - Successfully started service 'sparkDriver' on port 36429.
2019-03-04 16:24:58 INFO SparkEnv:54 - Registering MapOutputTracker
2019-03-04 16:24:58 INFO SparkEnv:54 - Registering BlockManagerMaster
2019-03-04 16:24:58 INFO BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2019-03-04 16:24:58 INFO BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2019-03-04 16:24:58 INFO DiskBlockManager:54 - Created local directory at /tmp/blockmgr-764f35a8-ea7f-4057-8123-22cbbe2d9a39
2019-03-04 16:24:58 INFO MemoryStore:54 - MemoryStore started with capacity 6.2 GB
2019-03-04 16:24:58 INFO SparkEnv:54 - Registering OutputCommitCoordinator
2019-03-04 16:24:58 INFO log:192 - Logging initialized @2855ms
Cpu core Num 12: the setting took effect.
There is a core-count compatibility issue between Kubernetes (Docker) and Spark (the JDK):
jdk 1.8.0_131 inside Docker reports the host machine's core count;
jdk 1.8.0_202 inside Docker reports the core count the container is limited to, and when Kubernetes resources are not specified the default limit is 1.
So when upgrading to spark2.4.0-hadoop3.0 (jdk 1.8.0_202), specify the core count in Kubernetes at the same time. Alternatively, switch the JDK back to an older version, though that requires rebuilding the Docker image.
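If rebuilding the image is undesirable, there is also a JVM-level escape hatch: container-aware JDK 8 builds (8u191 and later, which includes 1.8.0_202) accept -XX:ActiveProcessorCount to override the value availableProcessors returns. A sketch of wiring it through Spark's standard JVM-option settings; the count of 12 here just mirrors the resource request above:

import org.apache.spark.SparkConf

// Force the reported core count instead of relying on container detection.
// Note: the driver option only takes effect if supplied at launch time
// (e.g. via --conf on spark-submit), since the driver JVM is already
// running by the time this SparkConf is constructed.
val conf = new SparkConf()
  .set("spark.driver.extraJavaOptions", "-XX:ActiveProcessorCount=12")
  .set("spark.executor.extraJavaOptions", "-XX:ActiveProcessorCount=12")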
With the core counts specified:
[cuidapeng@wx-k8s-4 ~]$ kb describe node wx-k8s-8
Name:               wx-k8s-8
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/hostname=wx-k8s-8
Annotations:        flannel.alpha.coreos.com/backend-data: {"VtepMAC":"26:21:23:bb:3d:62"}
                    flannel.alpha.coreos.com/backend-type: vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager: true
                    flannel.alpha.coreos.com/public-ip: 10.10.3.126
                    kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Thu, 24 Jan 2019 14:11:15 +0800
Taints:             <none>
Unschedulable:      false
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Mon, 04 Mar 2019 17:27:16 +0800   Thu, 24 Jan 2019 14:11:15 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Mon, 04 Mar 2019 17:27:16 +0800   Thu, 24 Jan 2019 14:11:15 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Mon, 04 Mar 2019 17:27:16 +0800   Thu, 24 Jan 2019 14:11:15 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Mon, 04 Mar 2019 17:27:16 +0800   Thu, 24 Jan 2019 14:24:48 +0800   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  10.10.3.126
  Hostname:    wx-k8s-8
Capacity:
 cpu:                32
 ephemeral-storage:  1951511544Ki
 hugepages-1Gi:      0
 hugepages-2Mi:      0
 memory:             65758072Ki
 pods:               110
Allocatable:
 cpu:                32
 ephemeral-storage:  1798513035973
 hugepages-1Gi:      0
 hugepages-2Mi:      0
 memory:             65655672Ki
 pods:               110
System Info:
 Machine ID:                 c4ef335760624cd8940eddc0cd568982
 System UUID:                4C4C4544-0056-3310-8036-B4C04F393632
 Boot ID:                    02925b6a-8fc8-4399-a12e-54a77f72b4f3
 Kernel Version:             3.10.0-693.el7.x86_64
 OS Image:                   CentOS Linux 7 (Core)
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://17.3.2
 Kubelet Version:            v1.13.2
 Kube-Proxy Version:         v1.13.2
PodCIDR:                     10.244.7.0/24
Non-terminated Pods:         (15 in total)
  Namespace    Name                                             CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------    ----                                             ------------  ----------  ---------------  -------------  ---
  default      consume-buzz-8c5cd6f97-z542c                     0 (0%)        0 (0%)      4Gi (6%)         5Gi (7%)       6h18m
  default      consume-daoen-7c946bdf76-kmjp6                   0 (0%)        0 (0%)      2Gi (3%)         3Gi (4%)       5h19m
  default      consume-mengqiu-autohome-koubei-cf5d4cb87-cnp2g  0 (0%)        0 (0%)      2Gi (3%)         3Gi (4%)       5h30m
  default      consume-mengqiu-car-7c6575f5fc-zskhw             0 (0%)        0 (0%)      2Gi (3%)         3Gi (4%)       5h19m
  default      consume-mengqiu-dt-5c6d7f8c5c-jf4hr              8 (25%)       0 (0%)      8Gi (12%)        9Gi (14%)      36s
  default      consume-mengqiu-ec-768c647d7b-5wkss              0 (0%)        0 (0%)      2Gi (3%)         3Gi (4%)       5h30m
  default      consume-mengqiu-qk-7c6d96c85-24kwp               8 (25%)       0 (0%)      8Gi (12%)        13Gi (20%)     36s
  default      consume-mengqiu-yp-848c89dd97-6mqsb              0 (0%)        0 (0%)      2Gi (3%)         3Gi (4%)       5h19m
  default      consume-qb-799c98f996-njczw                      2 (6%)        0 (0%)      4Gi (6%)         5Gi (7%)       36s
  default      consume-xiaohongshu-6cfcd554f6-gdc9g             0 (0%)        0 (0%)      0 (0%)           0 (0%)         5h30m
  default      consume-yunjiao-article-5bcb58ddcf-lsqj8         0 (0%)        0 (0%)      2Gi (3%)         3Gi (4%)       5h30m
  default      consume-zhihu-6764ff956-zgt79                    0 (0%)        0 (0%)      4Gi (6%)         5Gi (7%)       6h18m
  default      consume-zjx-6cf67885c-g5h2s                      2 (6%)        0 (0%)      4Gi (6%)         5Gi (7%)       36s
  kube-system  kube-flannel-ds-l594f                            100m (0%)     100m (0%)   50Mi (0%)        50Mi (0%)      11d
  kube-system  kube-proxy-vckxf                                 0 (0%)        0 (0%)      0 (0%)           0 (0%)         39d
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests       Limits
  --------           --------       ------
  cpu                20100m (62%)   100m (0%)
  memory             45106Mi (70%)  61490Mi (95%)
  ephemeral-storage  0 (0%)         0 (0%)
Events:              <none>