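A few words up front:
This weekend a former classmate mentioned in our group chat that he had found an interesting article (the one below, on the psychology of readable code) and wanted to translate it, so I volunteered to work on the translation with him. While reading it, a question occurred to me: why do so many people care about code readability? I do think readability matters and that it benefits collaborative development. It reminded me of a book I read recently in my spare time, Sapiens: A Brief History of Humankind (《人類簡史》). Alongside the history of our species, it offers an explanation of how humans came to dominate the planet and sit at the top of the food chain. One reason it gives is that humans evolved the ability to imagine: people can talk, with a straight face, about things they have never seen as if they really existed, such as gods, the technologies of science fiction, and concepts like nations, sovereignty, science, democracy and all kinds of -isms. Getting everyone to believe in these imagined concepts created a framework in which even strangers can cooperate. Cooperation is the key point. The ability of millions of people to work toward the same goal is one of the reasons humans were able to outcompete other species. Going further would be off topic. What I want to stress is how important cooperation is; it is what let humanity achieve what it has today. I want to apply the same idea to our company's R&D: individual ability does not have to be exceptional, but the ability to develop collaboratively must be. How do we improve it? One way is code readability. I believe readable code is the foundation of our collaboration; if nobody can understand the code, there is nothing to collaborate on, and no way to raise the team's productivity. That is why I want to share the following article with everyone.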
Translation collaborators:
https://github.com/a1023293003 隨諭
https://github.com/lwhile lwhile
Original article: https://medium.com/@egonelbre...
Psychology of Code Readability
By no means should this be regarded as truth, but rather a model that I’ve found extremely helpful in understanding and finding better ways of writing code.
I think one of the things every programmer strives for is writing better code. Readability is one of the aspects of “good code”. There have been many papers and books written on the topic, however I find many of them lacking. Not because of the recommendations, but rather the analysis part.
(My reading: the books on the market tell you what to do, but not why.)
What makes some piece of code more readable than another? It’s one thing to say that it uses better variable names, but what makes a certain variable name easier to read? I really mean digging deeper into the human psyche. It is our brain that is doing all the processing after all.
Psychology Primer
As any programmer knows, we have limited capacity to think about things. This is our working memory limit. There’s an old myth going around that we can hold 7±2 objects in our head. It is known as “The Magical Number Seven” and it isn’t entirely accurate. This number has been refined to 4±1 and some even suggest there isn’t a limit, but rather a degradation of ideas over time. For all intents and purposes we can assume that we have a small number of ideas we can process in our head at a given time. The exact number isn’t that important.
But some would still confidently say that they can handle problems involving more than 4 ideas. Luckily there’s another process going on in our brain called chunking. Our brain automatically groups information pieces into larger pieces (chunks).
Dates and phone-numbers are good examples of this:
From these chunks we build up our long term memory. I like to imagine it as a large web consisting of many chunks, chunk sequences and groupings.
(Chunked memory: for example, remembering a phone number in segments. This leads to the advice that a method should do only one thing; the way the brain stores information in chunks explains why.)
You might guess from this image that moving from one place to another in memory is slow. And you would be right. In UX there’s a concept called singular focus of attention, which means that we can focus on a single thing at a time. It also has a friend called locus of attention, which says that our attention is also localized in space.
(In other words, when an unrelated piece of information is suddenly inserted into a stream of information, the brain has to spend time building the connection. Hence: a method should do only one thing!)
You might think this is the same thing as the working memory limit, however there is a slight difference. Working memory capacity describes how big our focusing area is; the focus/locus of attention says that we can only do that when there is a place in our brain that contains the ideas.
The focus and locus of attention are important to know, because the switching cost is significant. It is even slower when we need to create new chunks and groupings. It also goes the other way: the more familiar something is, the less time it takes to make it our focus.
(This paragraph makes the same point: unrelated information inserted into the flow forces the brain to spend time building connections, so a method should do only one thing.)
We also remember things better when we are in a similar context. This is called the encoding specificity principle. This means that by designing our encoding and recalling conditions we can better design what we remember.
Translator's note: context similarity is the encoding specificity principle in psychology. Recall works best when the context at recall matches the context at learning. A familiar scene stirs up feelings; seeing an object brings its owner to mind.
In an experiment divers were assigned to memorize words on land and under water, then recall them on land or in water. The best results were for people who memorized and recalled on land. Surprisingly the second best were the people that memorized and recalled in water. This showed that the context where you learn things has an impact on how well you can remember things.
To make things shorter, I’ll use context to refer to “focus and locus of attention” and how it relates to other chunks and loci. Effectively our brain is moving from one context to another. When we move our focus of attention we also remember what our previous contexts were, until our memory fades.
Translator's note: this description reminds me very much of processes and threads competing for the CPU.
From these contexts and chunks we build up mental representations and a mental model. There’s a slight difference between these two things. A mental representation is our internal cognitive symbol for representing the external world or a mental process. A mental model can be thought of as an explanation of a mental representation. Often these terms are used interchangeably.
Mental models have a vital importance in our ability to precisely describe a solution to a problem. There are many different mental models possible for a single problem, each having their own benefits and problems.
All of these ideas sound nice and precise, however our brains are quite imprecise. There are many other problems with our brain.
Our brains need to do more work when dealing with abstractions.
When ideas are similar their chunks are related and linked in our brains in a similar way. This leads to our brain being unable to “rebuild the contexts properly” because we are uncertain which chunk is the right one. Example: I and 1; O and 0.
Ambiguity is another source of uncertainty. When a thing is ambiguous there are multiple interpretations for the same thing. Homonyms are the best example of this property. Example: crane, the bird or the machine.
(Don't pick ambiguous variable names! The reason is explained below.)
Uncertainty causes us to slow down. It might be a few milliseconds, but that can be enough to disrupt our state of flow or make us use more working memory than necessary.
There are of course interruptions that can disrupt our working memory, but there are also “smaller interruptions” called noise. If someone is saying random numbers while you are trying to calculate, you can end up accidentally processing them and using up some of your working memory. This can also happen visually on screen when there are many irrelevant things between the important things.
(This also explains why a method should do only one thing: unrelated things take up the brain's working memory, and once that memory is full, it crashes.)
Our brains also have trouble processing negation, a finding supported by many studies. The effect of negation depends on the context, but negation should be used with care.
All of these together add up to cognitive load. It is the total amount of mental effort being used. Our processing capacity decreases with prolonged cognitive load and it is restored with rest. With prolonged cognitive load our minds also start to wander.
(Short breaks of five to ten minutes, the Pomodoro technique and so on.)
Translator's note: cognitive load theory assumes that human cognitive architecture consists of working memory and long-term memory. Working memory, also called short-term memory, has limited capacity and can hold only 3-5 basic items or chunks of information at a time. When information has to be processed, working memory can handle only 2-3 items at once, because the interactions between the stored elements also take up working memory space, which reduces the number of items that can be processed simultaneously.
(A method should do only one thing.)
If this is new information to you, then I highly suggest taking a break now. These form the fundamental properties that code analysis will rest upon.
I’m going to use the term programming artifact. By that I mean everything that is created as a result of programming. It might be a method you write, type declarations for a function, variable names, comments, Unreal Engine Blueprints, UML diagrams etc. Effectively anything that is a direct result of programming.
Here are a few recommendations, rules-of-thumb and paradigms analyzed in the context of psychology. By no means is this an exhaustive list or even a guide on what exactly to do. Probably there are many places where the analysis could be better, but this is more about showing how we can gain deeper insight into code readability by using psychology.
Length is not a virtue in a name; clarity of expression is. (Rob Pike)
Let’s take a simple for loop:
A. for(i=0 to N)
B. for(theElementIndex=0 to theNumberOfElementsInTheList)
(I often use long names at work; names that are too long really do take time to read.)
Most programmers would recommend A. Why?
B uses longer names, which prevents us from recognizing this as a single chunk. The longer name also doesn’t help create a better context; effectively it is just noise.
However, let’s imagine different ways of writing packages / units / modules / namespaces:
A. strings.IndexOf(x, y)
B. s.IndexOf(x, y)
C. std.utils.strings.IndexOf(x, y)
D. IndexOf(x, y)
In example B the namespace s is too short and doesn’t help “to find the right chunk”.
In example C the namespace std.utils.strings is too long; most of it is unnecessary, because strings itself is descriptive enough (unless you need to use multiple of them).
In example D, without a namespace, the call becomes ambiguous: you might be unsure where IndexOf comes from and what it is related to.
(That is, you can't tell what this IndexOf method is for.)
It’s important to mention that, if all of the code is dealing with strings, it will be quite easy to assume that IndexOf is some string-related function. In such cases, even the strings part might be too noisy. For example: int16.Add(a, b), compared to a + b, would be much harder to read.
(There is no single rule for how to name variables; it depends heavily on personal judgment. Some people find a given name sufficient, others don't. So I think agreeing on naming conventions within our team would be a good idea.)
With variables it would be easy to conclude that “modification is bad, because it makes it harder to track what is happening”. But let’s take these examples:
// A.
func foo() (int, int) {
    sum, sumOfSquares := 0, 0
    for _, v := range values {
        sum += v
        sumOfSquares += v * v
    }
    return sum, sumOfSquares
}
// B.
func GCD(a, b int) int {
    for b != 0 {
        a, b = b, a % b
    }
    return a
}
// C.
func GCD(a, b int) int {
    if b == 0 {
        return a
    }
    return GCD(b, a % b)
}
Here foo is probably easiest to understand. Why? The problem isn’t modifying the variables, but rather how they are modified. A doesn’t have any complex interactions, which both B and C do. I would also guess that even though C doesn’t have modifications, our brain still processes it as such.
// D.
sum = sum + v.x
sum = sum + v.y
sum = sum + v.z
sum = sum + v.w
// E.
sum1 := v.x
sum2 := sum1 + v.y
sum3 := sum2 + v.z
sum4 := sum3 + v.w
Here is another example where the modification-based version (D) is easier to follow. E introduces new variables for the same idea; effectively, the different variables become noise.
Let’s take another for loop:
A. for(i = 0; i < N; i++)
B. for(i = 0; N > i; i++)
C. for(i = 0; i <= N-1; i += 1)
D. for(i = 0; N-1 >= i; i += 1)
How long did it take for you to figure out what each line is doing? For anyone who has been coding for a while, A probably took the least time. Why is that?
The main reason is familiarity. To be more precise, we have a chunk in our long-term memory for A, but not for any of the others. This means that for the others we need to do more processing before we can extract the meaning and concept.
(Practices everyone already knows.)
For any complete beginner, all of these would be processed quite similarly. They wouldn’t notice that one is “better” than any other.
A proficient programmer reads A as a single chunk or idea: “i is looped for N items”. However a beginner reads this as “We initialize i to zero. Then we test each time whether i is still smaller than N. Then we add one to i.”
A is what you call the “idiomatic way” of writing the for loop. It’s not really better in terms of intrinsic complexity. However, most programmers can read it more easily, because it is part of our common vocabulary.
Most languages have an idiomatic way of writing things. There are even papers and books about them, starting with APL idioms, C++ idioms and more structural idioms like in GoF Design Patterns. These books can be regarded as a vocabulary for writing sentences and paragraphs, such that they will be recognized by people.
There’s however a downside to all of this. The more idioms there are, the bigger the vocabulary you need in order to understand something. Languages with unlimited flexibility often suffer due to this. People end up creating “idioms” that help them write more concise code, however everybody else will be slowed down by them.
(Unwritten rules in the code are hard on newcomers, who have to memorize a lot of them. Some can't be avoided; it's best to write them down in a document that every new hire reads first.)
With regard to repeated structures, names such as “model” and “controller” act as chunks that remind us how these structures relate to each other.
Frameworks, micro-architectures and game engines all try to create and enforce such relations. This means people have to spend less time figuring out how things communicate and are wired up. Once you grok the structures it becomes easier to jump from one code base to another.
However the main factor with all of this is consistency. The more consistent the code base is in naming, formatting, structure, interaction etc., the easier it is to jump into arbitrary code and understand it.
(Consistency means following one common rule, for example camelCase for all variable names.)
As previously mentioned, uncertainty can cause stuttering when reading or writing code.
Let’s take ambiguity as our first example. The simplest example would be [1,2,3].filter(v => v >= 2). The question is, what will this produce: “2 and 3” or “1”? It’s a simple question, but it can cause a reading/writing stutter when you don’t use it day in, day out.
Translator's note: does it keep the elements greater than or equal to 2, or filter them out?
(Do we want [1] or [2, 3]?)
The source of the stutter is ambiguity. In the real world there are two uses for a filter: one is to keep the part that gets stuck in the filter, and the other is to keep the part that passes through it. For example, when you have gold in water, you want to get rid of the water. When you have dirt in the water, you probably want to get rid of the dirt.
Even if we precisely define what filter does, it can still cause stutter because it’s hardwired with two meanings in our brain. The common solution is to use functions such as select, discard, keep.
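To make the select/discard/keep idea concrete, here is a minimal Go sketch. The names Keep and Discard are hypothetical helpers chosen for illustration, not functions from any standard library, and the code assumes Go 1.18 or later for generics.

```go
package main

import "fmt"

// Keep returns the elements for which pred reports true.
// Hypothetical helper: the name says which side of the test survives.
func Keep[T any](items []T, pred func(T) bool) []T {
	var out []T
	for _, it := range items {
		if pred(it) {
			out = append(out, it)
		}
	}
	return out
}

// Discard returns the elements for which pred reports false.
func Discard[T any](items []T, pred func(T) bool) []T {
	return Keep(items, func(it T) bool { return !pred(it) })
}

func main() {
	atLeastTwo := func(v int) bool { return v >= 2 }
	fmt.Println(Keep([]int{1, 2, 3}, atLeastTwo))    // [2 3]
	fmt.Println(Discard([]int{1, 2, 3}, atLeastTwo)) // [1]
}
```

With these names the reader no longer has to recall which of the two real-world meanings of "filter" applies; the call site states it.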
We can also attach meaning in different ways, such as types. For example: instead of GetUser(string) you can declare type CustomerID string and write GetUser(CustomerID), which makes clear that the interpretation is “get user using a customer id” instead of other possibilities such as “get user by name”.
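As a sketch of how a named type can carry that meaning in Go: CustomerID and GetUser come from the text above, while the User struct, the in-memory users map and the sample data are assumptions made up for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// CustomerID is a distinct type, so a plain string variable
// cannot be passed where a customer id is expected.
type CustomerID string

// User and the users map are illustrative stand-ins for real storage.
type User struct {
	ID   CustomerID
	Name string
}

var users = map[CustomerID]User{
	"c-42": {ID: "c-42", Name: "Ada"},
}

// GetUser looks a user up by customer id, as the signature now states.
func GetUser(id CustomerID) (User, error) {
	u, ok := users[id]
	if !ok {
		return User{}, errors.New("no user with that customer id")
	}
	return u, nil
}

func main() {
	u, err := GetUser("c-42") // an untyped constant converts implicitly
	fmt.Println(u.Name, err)  // Ada <nil>

	name := "Ada"
	// GetUser(name) would not compile: name is a string, not a CustomerID.
	// The explicit conversion makes the (wrong) intent visible.
	_, err = GetUser(CustomerID(name))
	fmt.Println(err) // no user with that customer id
}
```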
Similarity is also easy to understand conceptually. For example, having variables such as total1, total2, total3 can lead to copy-paste mistakes, or to losing track of what each one means over a longer piece of code. Names such as sum, sum_of_squares, total_error can provide more meaning.
Having multiple names for the same thing can also be a source of confusion when moving between packages. For example, one package uses the variable name c, another cl, and a third client. It’s interesting to think about special variables such as this and self.
Ambiguity and similarity are not a problem just at the source level. Eric Evans noted this in DDD with the Ubiquitous Language pattern. The notion is that in different contexts such as billing and shipping, words such as “client” can have widely different usages and meanings, so it’s helpful to keep a vocabulary around to ensure that everyone communicates clearly.
We have all seen the “stupid beginner examples” of commenting:
// makes variable i go from 0 to 99
for(var i = 0; i < 100; i++) {
    // sets value 4 to variable a
    var a = 4;
}
(Presumably “stupid” refers to commenting every single line.)
While it may look stupid, it might have some purpose. Think about learning a second or third language. You usually learn the new language by understanding the translation in your primary language. These are the “chunks” written out explicitly.
Once you have learned the “chunk”, the comments become noise, because you already know that information by looking at the second line.
As programmers get better, the intent of comments becomes to condense information and to provide a context for understanding code: why a particular approach was taken when doing X, or what needs to be considered when modifying the code.
Effectively, it’s for setting up the right mental model for reading the code.
(In other words, add comments only where the code is critical, hard to understand, or relies on unwritten rules.)
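A small sketch of the difference, reusing the GCD loop from earlier. The comments in the first function only restate the code; the comment on the second records context a reader cannot get from the code itself. The name gcdNoisy is invented here purely for the contrast.

```go
package main

import "fmt"

// gcdNoisy shows comments that become noise for an experienced reader:
// each one repeats what the next line already says.
func gcdNoisy(a, b int) int {
	// loop while b is not zero
	for b != 0 {
		// set a to b and b to a modulo b
		a, b = b, a%b
	}
	// return a
	return a
}

// GCD returns the greatest common divisor of a and b (Euclid's algorithm).
//
// Callers are expected to pass non-negative values: Go's % takes the sign
// of the dividend, so with negative inputs the result may be negative.
// Keep that in mind before removing sign normalization at call sites.
func GCD(a, b int) int {
	for b != 0 {
		a, b = b, a%b
	}
	return a
}

func main() {
	fmt.Println(gcdNoisy(48, 18), GCD(48, 18)) // 6 6
	fmt.Println(GCD(48, -18))                  // -6
}
```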
Working memory limitation leads us to decompose and partition our code into different interacting pieces. We must be mindful of how we relate different pieces and how they interact.
For example, when we have a very deep inheritance chain and we use things from all the different inheritance levels, the class might be too complicated, even if each class has maybe two methods and each method is five lines of code. The class and all its parents form a single “whole”. Illustratively, you can count each “inheritance step” as a “single idea” that you need to remember when you use that particular class.
The other side of contexts is moving between function calls. Each call is a “context in our mental model”, so we need to remember where we came from and how it relates to the current situation. The deeper the call stack, the more stuff we have to keep in mind.
One way to reduce the depth of our mental model contexts is to clearly separate them. One such example is early return:
// Version 1: nested condition
public void SomeFunction(int age) {
    if (age >= 0) {
        // Do Something
    } else {
        System.out.println("invalid age");
    }
}
// Version 2: early return
public void SomeFunction(int age) {
    if (age < 0) {
        System.out.println("invalid age");
        return;
    }
    // Do Something
}
In the first version, when we read the “Do Something” part we understand it only happens when the age is non-negative. However, when we reach the “else” part we may have forgotten what the condition was, because at that point the condition can be quite far away.
The second version is somewhat nicer. We no longer need to keep multiple “contexts” in our head, and can focus instead on a single context that is set up and verified by multiple checks at the beginning.
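The same shape with more than one check, as a Go sketch. The function register, its parameters and its messages are hypothetical; the point is only that every guard returns early, so the remaining code runs in one verified context.

```go
package main

import (
	"errors"
	"fmt"
)

// register is a made-up example function. Each guard rejects one bad
// input and returns, so nothing below it needs an else branch.
func register(name string, age int) (string, error) {
	if name == "" {
		return "", errors.New("missing name")
	}
	if age < 0 {
		return "", errors.New("invalid age")
	}
	// From here on: name is non-empty and age is non-negative.
	return fmt.Sprintf("%s (%d)", name, age), nil
}

func main() {
	msg, err := register("Ada", 36)
	fmt.Println(msg, err) // Ada (36) <nil>

	_, err = register("Ada", -1)
	fmt.Println(err) // invalid age
}
```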
One of the usual recommendations is “don’t have global variables”. But when a variable is set during startup and never changed again, is that a problem? The problem isn’t in the “variableness” or “globalness” of something, but rather in how it affects our capability to understand code. When something is modified at a distance, we cannot build a contained model of it. The “globalness” of course clutters the namespace (depending on the language) and means there are more places it can be accessed from. Of course there are many other things that have the same properties, such as “Singleton”. So, why is it considered better than a global variable?
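A rough Go sketch of “modified at a distance”. All names and numbers here (taxPercent, discountCents, applyPromo and the price functions) are invented for illustration; amounts are in cents to keep the arithmetic in integers.

```go
package main

import "fmt"

// taxPercent is set once and never reassigned: global, but easy to model.
var taxPercent = 20

// discountCents is changed by applyPromo, far from where it is read.
var discountCents = 0

func applyPromo() { discountCents = 500 }

// priceCents depends on hidden state: the same call gives a different
// result after applyPromo has run somewhere else in the program.
func priceCents(base int) int {
	return base + base*taxPercent/100 - discountCents
}

// priceWithDiscount makes the dependency an explicit parameter,
// so the reader can build a contained model of it.
func priceWithDiscount(base, discount int) int {
	return base + base*taxPercent/100 - discount
}

func main() {
	fmt.Println(priceCents(1000)) // 1200
	applyPromo()
	fmt.Println(priceCents(1000))             // 700
	fmt.Println(priceWithDiscount(1000, 500)) // 700
}
```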
The single responsibility principle (SRP) is easy to understand with these concepts. It tries to ensure that we have proper chunks for a thing. This constraint often makes chunks smaller. Having a single responsibility also means that we end up with things that have a smaller working memory need. However, we need to consider that when we separate a class or function into multiple pieces we introduce many new artifacts. When these artifacts are deeply bound together we may not even gain the benefits of SRP.
Carmack’s comments on inlined functions are a good example of this. The three examples he gave were these:
// A.
void MinorFunction1( void ) { }
void MinorFunction2( void ) { }
void MinorFunction3( void ) { }
void MajorFunction( void ) {
    MinorFunction1();
    MinorFunction2();
    MinorFunction3();
}
// B.
void MajorFunction( void ) {
    MinorFunction1();
    MinorFunction2();
    MinorFunction3();
}
void MinorFunction1( void ) { }
void MinorFunction2( void ) { }
void MinorFunction3( void ) { }
// C.
void MajorFunction( void ) {
    {
        // MinorFunction1
    }
    {
        // MinorFunction2
    }
    {
        // MinorFunction3
    }
}
By making the pieces smaller we made the chunks smaller, however understanding the system became harder. We cannot read our code from top to bottom and understand what it does; instead we have to jump around in the code base to read it. Version C preserves the linear ordering while still maintaining the conceptual chunks.
Overall we can summarize code readability as trying to balance different aspects:
1. Names help us retrieve the right chunks from memory and help us figure out their meaning. Too long a name can end up being noisy in our code. Too short a name may not help us figure out its true meaning. Bad names are misleading and confusing.
2. To minimize the cost of shifting attention, we try to write all related code close together. To minimize the burden on our working memory, we try to split the code into smaller and more fathomable units.
3. Using common vocabulary allows the author as well as the team to rely on previous code-reading experience. That means reading, understanding and contributing to code is easier. Using unique solutions in places where a common one would do can slow down new readers of that code.
In practice there is no “perfect” way of organizing code, but there are many trade-offs. While I focused on readability, it is never the end goal; there are many other things to consider, like reliability, maintainability, performance and speed of prototyping.