Concurrency != Parallelism

A while ago I gave a talk at work on some features of the Go language and eventually got to the concept of concurrency, which left everyone confused. During the talk I pulled up Rob Pike's slides, "Concurrency is not Parallelism", which only made things worse; clearly a lot of basic knowledge had been lost along the way. So afterwards I organized Pike's slides together with some sharp observations on concurrency and parallelism from around the web, and the result is this article.

In "Concurrency is not Parallelism" (http://talks.golang.org/2012/waza.slide), Rob Pike observes that the world we live in is parallel: networks, for instance, or huge numbers of independent individuals. Yet those parts need to coordinate, and that is where concurrency comes in. Many people think concurrency is cool and take it to mean parallelism, but in Pike's view that is a mistake. For example, someone once wrote a prime-number sieve, ran it on a 4-core machine, and found it slow: that programmer had wrongly assumed that the concurrency Go provides is parallel computation.
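The program in question was a version of the classic concurrent prime sieve. The sketch below is adapted from the well-known example in the Go documentation, not the exact code from the talk. It is intensely concurrent, since every filter stage is its own goroutine, yet the stages form a chain and spend most of their time waiting on each other, so extra cores buy almost nothing:

```go
package main

import "fmt"

// generate sends 2, 3, 4, ... into ch.
func generate(ch chan<- int) {
	for i := 2; ; i++ {
		ch <- i
	}
}

// filter copies values from in to out, dropping multiples of prime.
func filter(in <-chan int, out chan<- int, prime int) {
	for {
		if i := <-in; i%prime != 0 {
			out <- i
		}
	}
}

func main() {
	ch := make(chan int)
	go generate(ch)
	for i := 0; i < 10; i++ {
		prime := <-ch
		fmt.Println(prime)
		next := make(chan int)
		go filter(ch, next, prime)
		ch = next
	}
}
```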

In the slides Rob uses an example of gophers burning books, which my colleagues and I didn't quite get on first viewing, so I did some homework to understand it.

To make sense of Rob's claim that concurrency is not parallelism, we first have to pin down what actually distinguishes the two.

To Rob, concurrency is a way of designing a program as a composition of independently executing processes.

Parallelism, by contrast, is the simultaneous execution of computations that may be related (related in their results, not coupled by dependencies) or entirely independent.

Rob sums it up:

Concurrency is about dealing with lots of things at once.

Parallelism is about doing lots of things at once. Not the same, but related.

Concurrency is about structure, parallelism is about execution.

Concurrency provides a way to structure a solution to solve a problem that may (but not necessarily) be parallelizable.

Rob also says:

Concurrency is a way to structure a program by breaking it into pieces that can be executed independently.

Communication is the means to coordinate the independent executions.

This is the Go model and it's based on CSP.
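Before the gopher pictures, here is a minimal sketch of that model: two independently executing pieces (goroutines) coordinated purely by communication over a channel. The message text is illustrative:

```go
package main

import "fmt"

func main() {
	messages := make(chan string)

	// One independent execution...
	go func() {
		messages <- "manuals delivered" // ...coordinates by communicating,
	}()

	fmt.Println(<-messages) // ...not by sharing memory under a lock.
}
```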

Image(1)

Above, a single gopher pushes a cart, hauling a pile of language manuals to the incinerator to be burned. If there are many manuals and the trip is long, the whole job takes a long time.

Image(2)

So we add a second gopher to help. But with two workers one cart is not enough, so we need more carts.

Image(3)

Even with more carts, the pile of manuals may run short or the incinerator may be tied up, and the two gophers now have to negotiate: who loads books into which cart, who gets the incinerator first. Efficiency is still not great.

Image(4)

So let's make them truly independent. Now they need not negotiate at all, and each has plenty of manuals of its own to burn.

Although the two gophers now run separately and independently, there is concurrent composition between them.

Suppose that in any unit of time only one gopher is actually working: then they are not parallel, yet they are still concurrent.

The design was not made with parallelism in mind, yet it translates naturally into a parallel one (the design is not automatically parallel; however, it's automatically parallelizable).
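A hedged sketch of that point: the program below is a valid concurrent design no matter what GOMAXPROCS is set to; the setting only decides whether the gophers may also run in parallel. The gopher loop is my own stand-in for real work:

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

func main() {
	// With a single execution context the two gophers take turns:
	// the design stays concurrent but runs serially. Raising the
	// value lets the very same code run in parallel, with no rewrite.
	runtime.GOMAXPROCS(1) // try runtime.NumCPU() here instead

	var wg sync.WaitGroup
	for g := 0; g < 2; g++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for trip := 0; trip < 3; trip++ {
				fmt.Printf("gopher %d: trip %d\n", id, trip)
			}
		}(g)
	}
	wg.Wait()
}
```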

Moreover, this concurrent composition suggests still other models.

Image(5)

Here three gophers are active, possibly with some delays. Each gopher is an independent execution, and they coordinate by communicating.

Image(6)

Here a fourth gopher has been added just to return the empty carts to the pile. Each gopher now does exactly one job, so the granularity of the concurrency is finer than before.

If we arrange everything correctly (hard to believe, perhaps, but not impossible), this design can be four times faster than the original single gopher.

We improved performance by adding a concurrent execution process to the existing design.
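One way to picture the staged gophers in code is a channel pipeline: each stage is a gopher doing one small job, and the channels are the hand-offs between them. This is a sketch; the stage names are mine, not from the talk:

```go
package main

import "fmt"

// load puts each book from the pile onto its output channel.
func load(pile []string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for _, book := range pile {
			out <- book
		}
	}()
	return out
}

// cart ferries books from the pile toward the incinerator.
func cart(in <-chan string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for book := range in {
			out <- book
		}
	}()
	return out
}

func main() {
	pile := []string{"manual-1", "manual-2", "manual-3"}
	for book := range cart(load(pile)) {
		fmt.Println("burned:", book)
	}
}
```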

Different concurrent designs enable different ways to parallelize.

This concurrent design makes it easy to run the whole flow in parallel, for instance like this, with eight gophers hard at work:

Image(7)

Keep in mind: even if only one gopher is active at any moment, so that nothing runs in parallel, this design is still a correct concurrent solution.
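The eight-gopher picture corresponds to fanning the same work out across several workers reading from one shared pile. A sketch, with the counts chosen arbitrarily:

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	pile := make(chan int)
	var wg sync.WaitGroup

	// Fan out: eight gophers pull from the same pile concurrently.
	// Whether they also run in parallel is up to cores and scheduler.
	for g := 0; g < 8; g++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for book := range pile {
				fmt.Printf("gopher %d burned book %d\n", id, book)
			}
		}(g)
	}

	for book := 1; book <= 32; book++ {
		pile <- book
	}
	close(pile)
	wg.Wait()
}
```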

There is yet another way to structure the concurrency of two gophers: put an extra pile of manuals between them.

Image(8)
And then we can, again, easily run it in parallel:

Image(9)

And yet another arrangement:

Image(10)

And then parallelize it:

Image(11)

Now there are sixteen gophers at work!

There are many ways to decompose a process, and that decomposition is the concurrent design. Once the decomposition is done, parallelization and correctness come easily. As Rob puts it:
There are many ways to break the processing down. That's concurrent design. Once we have the breakdown, parallelization can fall out and correctness is easy.

A complex problem can be broken into easily understood pieces and then composed concurrently. The result is a design that is understandable, efficient, scalable, and correct, and perhaps even parallel.

Concurrency is very powerful. It is not parallelism, but it enables parallelism, makes parallelism easy to reach, and buys scalability and much else besides.

Mapping the gophers back onto computing: the pile of books is web content, the gophers are CPUs, the cart is the serialization, rendering, or networking step, and the incinerator is the final consumer, such as a browser.

That is, a browser issues a request, the gophers get to work rendering the needed page, and the result goes back over the network to the browser. The gopher story has become a concurrent design for a scalable web service.
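In Go this shape comes almost for free, since net/http runs each request handler in its own goroutine. A minimal sketch; the port and handler body are my own placeholders:

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

// render stands in for the gophers' work of producing a page.
func render(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintln(w, "rendered page for", r.URL.Path)
}

func main() {
	// net/http serves every request in its own goroutine, so the
	// design is concurrent by construction and parallel on multicore.
	http.HandleFunc("/", render)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```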

In Go, a goroutine is a function executing independently in the same address space as other goroutines. A goroutine is not a thread; it resembles one, but it is far lighter.

Goroutines are multiplexed onto as many OS threads as needed. When a goroutine blocks, the thread it is running on blocks with it, but no other goroutine is blocked. (Note: this kind of blocking means certain system calls. For things like network I/O and channel operations, the goroutine is simply parked on a wait queue; once it can proceed again, say the network operation completes or the channel send or receive is ready, it is moved back to the run queue and scheduled.)

Go also provides channels for synchronization and data exchange between goroutines, and the select statement, which looks like a switch but chooses among channels that are ready to communicate.
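A small sketch of select: it blocks until one of its cases can communicate, then runs that case; if several are ready at once, it picks one at random. The channels and delays here are arbitrary:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	c1 := make(chan string)
	c2 := make(chan string)

	go func() { time.Sleep(50 * time.Millisecond); c1 <- "from c1" }()
	go func() { time.Sleep(10 * time.Millisecond); c2 <- "from c2" }()

	// Receive from whichever channel becomes ready first, twice.
	for i := 0; i < 2; i++ {
		select {
		case msg := <-c1:
			fmt.Println(msg)
		case msg := <-c2:
			fmt.Println(msg)
		}
	}
}
```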

Go's support for concurrency is real: one program can create an enormous number of goroutines (one test program created 1.3 million), and each goroutine starts with a small stack that grows and shrinks as needed. Goroutines are not free, but they are lightweight.
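To make that concrete, here is a sketch that spawns a large number of goroutines and waits for them all; 100,000 is a deliberately conservative stand-in for the 1.3 million mentioned above:

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	// Goroutines are cheap but not free: each starts with a few KB
	// of stack that grows and shrinks on demand.
	var wg sync.WaitGroup
	for i := 0; i < 100000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
		}()
	}
	wg.Wait()
	fmt.Println("all goroutines finished")
}
```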


Summary


The best example I know for telling concurrency and parallelism apart is the dining philosophers problem. In the classic version there are not enough knives and forks to go around, so the philosophers must coordinate to ensure that someone can always eat. If there were enough utensils for everyone, no coordination would be needed and each philosopher could happily eat on his own.
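A minimal sketch of the coordinated case in Go, with forks as mutexes. Deadlock is avoided by having each philosopher grab the lower-numbered fork first (resource ordering), which is one of several classic fixes:

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	const n = 5
	forks := make([]sync.Mutex, n) // one fork between each pair
	var wg sync.WaitGroup

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			// Grab the lower-numbered fork first so no cycle of
			// waiting philosophers can form (no deadlock).
			first, second := id, (id+1)%n
			if first > second {
				first, second = second, first
			}
			forks[first].Lock()
			forks[second].Lock()
			fmt.Printf("philosopher %d eats\n", id)
			forks[second].Unlock()
			forks[first].Unlock()
		}(i)
	}
	wg.Wait()
}
```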

The biggest difference between concurrency and parallelism: to have concurrency, the independent executions must coordinate with one another, using tools such as locks, semaphores, or queues.

Parallelism also involves many independent executions, but they do not need to coordinate; each just runs, computes its result, and the results are combined at the end.

Concurrency is a way of decomposing and organizing a problem; parallelism is a way of executing. They are different, but related.

Most of our problems can be cut into smaller pieces, each run as an independent execution that coordinates with the others: that is concurrency.

Such a design is not done for the sake of parallelism, yet it converts naturally into a parallelizable one, and it also helps ensure correctness.

Concurrency scales well everywhere: single-core machines, multi-core machines, networks, or any other platform;

parallelism, on the other hand, is simply unattainable on a single-core machine, while on a multi-core machine a concurrent design can evolve into parallelism quite naturally.

The outcome of a concurrent system is indeterminate; to guarantee a deterministic result you need locks or other mechanisms. The result of a parallel computation is determinate.

Go supports concurrency and can execute goroutines in parallel (the number of P's is set with runtime.GOMAXPROCS). Built on CSP, it provides goroutines, channels, and the select statement, and encourages coordination through message passing rather than through locks and similar tools. That style of coordination scales better.
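A sketch of the message-passing style: instead of guarding a counter with a lock, one goroutine owns the state and everyone else talks to it over channels ("don't communicate by sharing memory; share memory by communicating"). The channel names are mine:

```go
package main

import "fmt"

// counter owns the total; others can only reach it via channels.
func counter(inc <-chan int, get chan<- int) {
	total := 0
	for {
		select {
		case n := <-inc:
			total += n
		case get <- total:
		}
	}
}

func main() {
	inc := make(chan int)
	get := make(chan int)
	go counter(inc, get)

	for i := 0; i < 10; i++ {
		inc <- 1 // each send completes only once counter receives it
	}
	fmt.Println("total:", <-get) // prints: total: 10
}
```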


Wikipedia defines concurrent computing as follows:

Concurrent computing is a form of computing in which several computations are executing during overlapping time periods – concurrently – instead of sequentially (one completing before the next starts). This is a property of a system – this may be an individual program, a computer, or a network – and there is a separate execution point or "thread of control" for each computation ("process"). A concurrent system is one where a computation can make progress without waiting for all other computations to complete – where more than one computation can make progress at "the same time".

Concurrent computing is related to but distinct from parallel computing, though these concepts are frequently confused, and both can be described as "multiple processes executing at the same time". In parallel computing, execution literally occurs at the same instant, for example on separate processors of a multi-processor machine – parallel computing is impossible on a (single-core) single processor, as only one computation can occur at any instant (during any single clock cycle).(This is discounting parallelism internal to a processor core, such as pipelining or vectorized instructions. A single-core, single-processor machine may be capable of some parallelism, such as with a coprocessor, but the processor itself is not.) By contrast, concurrent computing consists of process lifetimes overlapping, but execution need not happen at the same instant.

For example, concurrent processes can be executed on a single core by interleaving the execution steps of each process via time slices: only one process runs at a time, and if it does not complete during its time slice, it is paused, another process begins or resumes, and then later the original process is resumed. In this way multiple processes are part-way through execution at a single instant, but only one process is being executed at that instant.

Concurrent computations may be executed in parallel, for example by assigning each process to a separate processor or processor core, or distributing a computation across a network. This is known as task parallelism, and this type of parallel computing is a form of concurrent computing.

The exact timing of when tasks in a concurrent system are executed depends on the scheduling, and tasks need not always be executed concurrently. For example, given two tasks, T1 and T2:

  • T1 may be executed and finished before T2
  • T2 may be executed and finished before T1
  • T1 and T2 may be executed alternately (time-slicing)
  • T1 and T2 may be executed simultaneously at the same instant of time (parallelism)

The word "sequential" is used as an antonym for both "concurrent" and "parallel"; when these are explicitly distinguished, concurrent/sequential and parallel/serial are used as opposing pairs.

Concurrency itself is defined as follows:

In computer science, concurrency is a property of systems in which several computations are executing simultaneously, and potentially interacting with each other. 

The computations may be executing on multiple cores in the same chip, preemptively time-shared threads on the same processor, or executed on physically separated processors. 

A number of mathematical models have been developed for general concurrent computation, including Petri nets, process calculi, the Parallel Random Access Machine model, the Actor model, and the Reo Coordination Language.

Because computations in a concurrent system can interact with each other while they are executing, the number of possible execution paths in the system can be extremely large, and the resulting outcome can be indeterminate. Concurrent use of shared resources can be a source of indeterminacy leading to issues such as deadlock and starvation.

The design of concurrent systems often entails finding reliable techniques for coordinating their execution, data exchange, memory allocation, and execution scheduling to minimize response time and maximize throughput.

And parallelism is defined as follows:

Parallel computing is a form of computation in which many calculations are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently ("in parallel").

By contrast, parallel computing by data parallelism may or may not be concurrent computing – a single process may control all computations, in which case it is not concurrent, or the computations may be spread across several processes, in which case this is concurrent. For example, SIMD (single instruction, multiple data) processing is (data) parallel but not concurrent – multiple computations are happening at the same instant (in parallel), but there is only a single process. Examples of this include vector processors and graphics processing units (GPUs). By contrast, MIMD (multiple instruction, multiple data) processing is both data parallel and task parallel, and is concurrent; this is commonly implemented as SPMD (single program, multiple data), where multiple programs execute concurrently and in parallel on different data.

Concurrency has one property that parallelism lacks: interaction. In a concurrent program, the independent executions can affect one another.
