爲何數組是從0開始的

時間 2019-11-09

標籤爲何數組開始简体版

原文原文鏈接

本文經過彙總一些網上搜集到的資料，總結出大部分編程語言中數組下標從0開始的緣由html

本博客已經遷移至：

http://cenalulu.github.io/linux

本篇博文已經遷移，閱讀全文請點擊：

http://cenalulu.github.io/linux/why-array-start-from-zero/git

背景

咱們知道大部分編程語言中的數組都是從0開始編號的，即array[0]是數組的第一個元素。這個和咱們平時生活中從1開始編號的習慣相比顯得很反人類。那麼到底是什麼樣的緣由讓大部分編程語言數組都聽從了這個神奇的習慣呢？本文最初是受stackoverflow上的一個問題的啓發，經過蒐集和閱讀了一些資料在這裏作個總結。固然，本文摘錄較多的過程結論，若是你想把這篇文章當作快餐享用的話，能夠直接跳到文章末尾看結論。github

最先的緣由

在回答大部分咱們沒法解釋的詭異問題時，咱們最經常使用的辯詞一般是歷史緣由。那麼，歷史又是出於什麼緣由，使用了0標號數組呢？Mike Hoye就是本着這麼一種追根刨地的科學精神爲咱們找到了解答。如下是一些他的重要結論的摘錄翻譯：編程

據做者的說法，C語言中從0開始標號的作法是沿用了BCPL這門編程語言的作法。而BCPL中若是一個變量是指針的話，那麼該指針能夠指向一系列連續的相同類型的數值。那麼p+0就表明了這一串數值的第一個。在BCPL中數組第5個元素的寫法是p!5，而C語言中把寫法改爲了p[5]，也就是如今的數組。具體原文摘錄以下：數組

If a BCPL variable represents a pointer, it points to one or more consecutive words of memory. These words are the same size as BCPL variables. Just as machine code allows address arithmetic so does BCPL, so if p is a pointer p+1 is a pointer to the next word after the one p points to. Naturally p+0 has the same value as p. The monodic indirection operator ! takes a pointer as it’s argument and returns the contents of the word pointed to. If v is a pointer !(v+I) will access the word pointed to by v+I.服務器

至於爲何C語言中爲何使用[]方括號來表示數組下標，這個設計也有必定來歷。據C語言做者的說法是方括號是現代鍵盤上惟一較爲容易輸入的成對符號（不用shift）不信你對着鍵盤找找？編程語言

爲何這個反人類設計在一段時間內一直沒有被改變

根據Mike的說法，BCPL是被設計在IBM硬件環境下編譯運行的。在1960後的很長一段時間內，服務器硬件幾乎被IBM統治。一個城市內也許至於一臺超級計算機，還須要根據時間配額使用。當你當天的配額用完之後，你的程序就被徹底清出計算隊列。甚至連計算結果都不給你保留，死無全屍。這個時候寫一段高效的程序，就顯得比什麼都重要了。而這時0下標數組又體現了出了它的另外一個優點，就是：相較於1下標數組，它的編譯效率更高。原文摘錄以下：ide

So: the technical reason we started counting arrays at zero is that in the mid-1960’s, you could shave a few cycles off of a program’s compilation time on an IBM 7094. The social reason is that we had to save every cycle we could, because if the job didn’t finish fast it might not finish at all and you never know when you’re getting bumped off the hardware because the President of IBM just called and fuck your thesis, it’s yacht-racing time.post

此外，還有另一種說法。在C語言中有指針的概念，而指針數組標號其實是一個偏移量而不是計數做用。例如對於指針p，第N個元素是*(p+N)，指針指向數組的第一個元素就是*(p+0)，

一些現代語言爲何仍然使用這種作法

上文中提到的爲了計較分秒的編譯時間而使用0下標數組，在硬件飛速發展的今天顯然是沒必要要的。那麼爲何一些新興語言，如Python依然選擇以0做爲數組第一個元素呢？難道也是歷史緣由？對於這個問題，Python的做者Guido van Rossum也有本身的答案。這裏大體歸納一下做者的用意：從0開始的半開放數組寫法在表示子數組（或者子串）的時候格外的便捷。例如：a[0:n]表示了a中前n個元素組成的新數組。若是咱們使用1開始的數組寫法，那麼就要寫成a[1:n+1]。這樣就顯得不是很優雅。那麼問題來了，Python數組爲何使用半開放，即[m,n)左閉合右開發的寫法呢？這個理解起來就比較簡單，讀者能夠參考http://www.cs.utexas.edu/users/EWD/ewd08xx/EWD831.PDF做爲擴展閱讀。下面摘錄一段Python做者的原話：

Using 0-based indexing, half-open intervals, and suitable defaults (as Python ended up having), they are beautiful: a[:n] and a[i:i+n]; the former is long for a[0:n].
Using 1-based indexing, if you want a[:n] to mean the first n elements, you either have to use closed intervals or you can use a slice notation that uses start and length as the slice parameters. Using half-open intervals just isn't very elegant when combined with 1-based indexing. Using closed intervals, you'd have to write a[i:i+n-1] for the n items starting at i. So perhaps using the slice length would be more elegant with 1-based indexing? Then you could write a[i:n]. And this is in fact what ABC did -- it used a different notation so you could write a@i|n.(See http://homepages.cwi.nl/~steven/abc/qr.html#EXPRESSIONS.)

總結

從0標號的數組傳統，沿用了這麼長時間的緣由主要列舉以下：

在計算資源缺少的過去，0標號的寫法能夠節省編譯時間
現代語言中0標號能夠更優雅的表示數組字串
在支持指針的語言中，標號被視做是偏移量，所以從0開始更符合邏輯

參考文獻

(http://developeronline.blogspot.com/2008/04/why-array-index-should-start-from-0.html)
[http://exple.tive.org/blarg/2013/10/22/citation-needed/]
[https://plus.google.com/115212051037621986145/posts/YTUxbXYZyfi]
[http://stackoverflow.com/questions/7320686/why-does-the-indexing-start-with-zero-in-c]