Browser fingerprinting: HTTP cookies, browser fingerprints, fraud detection, browser ID hashes, plugin information, canvas, font information

 

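Browser cookies and browsing privacy, explained
http://www.iefans.net/cookie-yinsi-guanxi/

Browsers · Internet · 2013-07-05 · 6104 reads

In this article, "cookie" means a browser cookie (also called an "HTTP cookie"). The main job of a browser cookie is to let a website store small pieces of information. For example, suppose you once logged in to a forum from your own browser. The next time you open the forum's login page, the user name is already filled in and you only need to type the password. How does the login page know which account you used last time? The site stored a cookie in your browser containing the account name from your previous login.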

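How cookies are implemented

This section is for technically minded readers. If that is not you, skip to the next section and save yourself the time.

How a website sets a cookie (write)

1. When you click a bookmark or type an address into the address bar, the browser sends an HTTP request to the corresponding website.
2. The website's server receives the request and sends the requested content (a web page, an image, and so on) back to the browser. This is the HTTP response. If the site wants to set a cookie, it includes a cookie-setting instruction in the response headers. For example: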
Set-Cookie: user=xxxx; Path=/; Domain=www.example.com
This example sets one cookie. Its name is user, its value is xxxx, and it is bound to the domain www.example.com.
3. When the browser receives this instruction, it stores the cookie on your computer.
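To make the write step concrete, here is a minimal sketch of how a client could split the Set-Cookie value from the example into a name, a value and attributes. The function name parseSetCookie is made up for illustration, and the parsing is deliberately simplified (no quoting rules, no date handling); real browsers follow the full cookie specification.

```javascript
// Simplified sketch: split a Set-Cookie header value into its parts.
// parseSetCookie is a hypothetical helper, not a browser API.
function parseSetCookie(headerValue) {
  const [pair, ...attributeParts] = headerValue.split(";").map((s) => s.trim());
  const eq = pair.indexOf("=");
  if (eq === -1) return null; // not a name=value pair

  const cookie = {
    name: pair.slice(0, eq),
    value: pair.slice(eq + 1),
    attributes: {},
  };
  for (const part of attributeParts) {
    if (part === "") continue;
    const i = part.indexOf("=");
    // Attribute names are case-insensitive; flag attributes have no value.
    const key = (i === -1 ? part : part.slice(0, i)).toLowerCase();
    cookie.attributes[key] = i === -1 ? true : part.slice(i + 1);
  }
  return cookie;
}

const parsedSetCookie = parseSetCookie("user=xxxx; Path=/; Domain=www.example.com");
console.log(parsedSetCookie.name, parsedSetCookie.value, parsedSetCookie.attributes);
```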

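How a website reads a cookie (read)

Suppose that a few days later you visit www.example.com again (the previous visit already set a cookie). The browser sees that it holds a cookie matching this address, so it puts the cookie into the HTTP request it sends to the server. The request header looks like this: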
Cookie: user=xxxx
When the server receives this HTTP request, it can read the cookie's name and value from that header.
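On the server side, reading the cookie amounts to splitting the Cookie request header into name/value pairs. The sketch below assumes a hypothetical helper parseCookieHeader; the second cookie (theme=dark) is invented only to show that one header can carry several pairs.

```javascript
// Simplified sketch: turn a Cookie request header value into a Map.
// parseCookieHeader is a hypothetical helper, not part of any framework.
function parseCookieHeader(headerValue) {
  const cookies = new Map();
  for (const pair of headerValue.split(";")) {
    const trimmed = pair.trim();
    const eq = trimmed.indexOf("=");
    if (eq === -1) continue; // skip malformed pairs
    cookies.set(trimmed.slice(0, eq), trimmed.slice(eq + 1));
  }
  return cookies;
}

const requestCookies = parseCookieHeader("user=xxxx; theme=dark");
console.log(requestCookies.get("user"));
```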

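Characteristics of cookies

Small storage capacity

In English, a cookie is a small sweet biscuit. The word already hints that a cookie can hold only a small amount of information. As the mechanism above shows, a cookie can store only plain text, and not much of it, because cookies are read and written through HTTP headers, which are limited in length (browsers typically cap a single cookie at around 4 KB). Small as it is, a cookie can do a lot. Consider the following example.

Example: a website has many pages, and each page carries many ads. The site wants to record which ads each visitor clicks. That is too much data to fit comfortably in a cookie, so the site usually stores only a unique user identifier in the cookie and keeps the click records (which ad was clicked, and when) on the server. On your next visit the site reads the identifier from the cookie and, because the identifier is unique, uses it to look up everything the server has stored about you.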
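The ad-click example can be sketched as a small in-memory store keyed by the identifier kept in the cookie. Everything here is hypothetical: the cookie name visitor_id, the ad identifiers and the helper names are made up for illustration, and a real site would use a database rather than a Map.

```javascript
const crypto = require("crypto");

// Server-side storage: visitor identifier -> list of click records.
const clicksByVisitor = new Map();

// Reuse the identifier from the cookie if present, otherwise mint a new one.
function visitorIdFrom(cookies) {
  return cookies.get("visitor_id") || crypto.randomUUID();
}

function recordClick(visitorId, adId, clickedAt = new Date()) {
  const clicks = clicksByVisitor.get(visitorId) || [];
  clicks.push({ adId, clickedAt });
  clicksByVisitor.set(visitorId, clicks);
}

function clicksFor(visitorId) {
  return clicksByVisitor.get(visitorId) || [];
}

// First visit: no cookie yet, so the server assigns an identifier.
const firstVisitId = visitorIdFrom(new Map());
recordClick(firstVisitId, "ad-17");

// Later visit: the browser sends the cookie back, so the history is linked.
const laterVisitId = visitorIdFrom(new Map([["visitor_id", firstVisitId]]));
recordClick(laterVisitId, "ad-42");
console.log(clicksFor(laterVisitId).map((click) => click.adId));
```

The cookie itself carries only the short identifier; the growing click history stays on the server.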

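Bound to a domain and path

As the mechanism above shows, a cookie is tied to the address (domain and path) of the HTTP request. Cookies set by sites on different domains are therefore isolated from one another. The browser enforces this isolation for security. One more detail: the domain a cookie is bound to may begin with a dot. For example: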
Set-Cookie: user=xxxx; Path=/; Domain=.example.com
A cookie set this way can be read by every subdomain of example.com (for example www.example.com or ftp.example.com).
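The subdomain rule can be sketched as a suffix check on host names. This is a simplified version of the domain matching in RFC 6265, which ignores a leading dot in the Domain attribute; the sketch leaves out IP addresses and public-suffix checks, and the function name cookieDomainMatches is an assumption for illustration.

```javascript
// Simplified sketch of cookie domain matching (no IP or public-suffix handling).
function cookieDomainMatches(requestHost, cookieDomain) {
  const host = requestHost.toLowerCase();
  const domain = cookieDomain.toLowerCase().replace(/^\./, ""); // drop leading dot
  // Match the domain itself, or any host that ends with ".<domain>".
  return host === domain || host.endsWith("." + domain);
}

console.log(cookieDomainMatches("www.example.com", ".example.com"));
console.log(cookieDomainMatches("ftp.example.com", ".example.com"));
console.log(cookieDomainMatches("badexample.com", ".example.com"));
```

Requiring the "." before the domain is what keeps a look-alike host such as badexample.com from receiving the cookie.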

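Types of cookies

First-party cookies vs. third-party cookies

Start with the difference between first-party and third-party cookies, because it matters most for privacy. An example makes it clear.

Example: you read the news on Sina, and the Sina page embeds an ad from Alibaba (assume both the Sina page and the embedded ad set cookies). Once the browser has loaded the whole page, it holds cookies from both Sina and Alibaba. The Sina cookie is a first-party cookie, because Sina is the site you chose to visit. The Alibaba cookie is a third-party cookie, because you are visiting Sina and Alibaba is an unrelated third party.

In-memory vs. file-based

By storage method, cookies fall into two kinds: in-memory cookies (session cookies) and file-based cookies (persistent cookies). An in-memory cookie disappears when the browser closes. A file-based cookie stays on disk after the browser closes. The cookies that raise privacy concerns are mainly the second kind.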

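What are cookies legitimately used for?

This year's 3·15 Gala on CCTV attacked cookies over privacy as if they were a monster. CCTV's coverage was the kind of thing that scares people who don't know the technology. In fact cookies have both benefits and drawbacks, and they are so widely used because they are genuinely useful. Here are a few examples.

Example 1: automatic login

Many web-based mail services offer automatic login. The first time you open the mailbox page you enter your user name and password; when you come back a few days later you do not have to enter them again (Gmail and Hotmail work this way). This is possible because the mail site stored a cookie in your browser, and the information in that cookie shows that you are already logged in.

Example 2: a personalized interface

Suppose a forum lets anonymous users choose the page's font style and size. The forum can store those font settings in a cookie, and the next time you visit with the same browser the fonts are set up for you automatically.

Summary

Generally speaking, cookies with a legitimate purpose are mostly first-party cookies. Third-party cookies are mostly used to collect advertising data and track user behaviour.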

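How do cookies leak privacy?

A cookie is a double-edged sword: it has many uses, but it also has drawbacks, and a major one is privacy.

Example 1

Suppose you use both Gmail and Google Search, as many Google users do. After you log in to Gmail, a cookie stores information identifying who you are, and that information remains in the cookie even after you log out of Gmail. When you later use Google Search, Google can use the cookie to tell which Gmail user sent each search request. You may ask how Gmail and Google Search can share a cookie when they are on different domains. As mentioned earlier, some cookies are bound to a domain that begins with a dot, and such cookies can be read by every subdomain. Gmail lives at mail.google.com and Google Search at www.google.com, so both can read a cookie bound to .google.com. Note: I use Google as the example because most readers of my blog are Google users. The issue is not unique to Google. Baidu, Tencent, Alibaba, Qihoo 360 and others have a similar issue (each has a search function and its own account system).

Example 2

Many websites use cookies to track what you do on the site: how often you visit, which pages you usually read, and how long you stay on each page. From this data the site can work out your personal preferences, which is where personal privacy comes in. Note that there are many tricks for collecting personal information with cookies. For reasons of space I list only these two.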

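Always use private browsing mode

Private browsing mode was covered in the previous article in this series, so I won't repeat it here. In private browsing mode, all cookies from the session disappear when the browser is closed. This setting can be inconvenient, though (security and convenience usually pull in opposite directions). Try it for a while and see whether you can live with it.

Summary

The techniques described so far all apply to a single browser. They are enough in most cases but fall short in some special ones. For example, you use Gmail all the time and rely on its automatic login, so you cannot disable cookies for the .google.com domain (doing so would break automatic login to Gmail). Yet when you search with Google you don't want Google to know who you are. What to do? That is the subject of the next installment: using several browsers with different techniques. via: the 編程隨想 blog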
 

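What is a browser fingerprint, and how does it leak our privacy?
http://www.iefans.net/liulanqi-zhiwen-ruhe-xielou-yinsi/

Browsers · Internet · 2014-01-24 · 12957 reads

An earlier post shared basic techniques for keeping the browser from leaking your online privacy. For readers whose privacy needs and technical skills are both modest, that post is about enough. This one continues with browser issues for readers who want stronger privacy and are willing to tinker. It explains how a browser's "fingerprint" exposes your privacy and shares some defensive tips along the way.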

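What is a "fingerprint"?

Everyone knows a fingerprint as the pattern of ridges on a fingertip, and that each person's fingerprint is unique. If you read much about information security you will also meet "fingerprint" as a figure of speech (operating system fingerprints, network protocol stack fingerprints, and so on). The idea in IT is similar to the one in criminal investigation: when you need to determine the type or category of some object but cannot access the object directly, you use various techniques to obtain some of its characteristics, then infer its type or category from them.

What is the "information content" of a fingerprint?

In IT, all sorts of characteristics can serve as a fingerprint, so you need to judge which ones work better. To discuss that, we first need the idea of a fingerprint's information content. An example helps: suppose you want to locate someone in a school. Knowing only the person's sex does not help much (it rules out only about 1/2 of the people). If instead you don't know the sex but do know the birthday, locating the person is much easier (it rules out about 364/365 of the people and leaves about 1/365). Why? Because a birthday is more distinctive than a sex, so it carries more information. The example shows that the more distinctive a characteristic is, the more information it carries, and vice versa. The more information a characteristic carries, the smaller the group it narrows the target down to.

How is a fingerprint's information content measured? Bits

(This section involves secondary-school mathematics. If your maths is weak or maths scares you, feel free to skip it.) In IT, the information content of a fingerprint can be measured in bits. To keep it simple, take sex again. There are only two possibilities, male or female, and the ratio is roughly even. So once you know someone's sex, you narrow the group to 1/2 of its original size. In IT terms, the characteristic "sex" carries one bit of information. By the same reasoning:
  • "A characteristic carries 3 bits of information" means it has 8 roughly equally likely values (8 is 2 to the power 3). Knowing it narrows the target to one eighth.
  • "A characteristic carries 7 bits of information" means it has 128 roughly equally likely values (128 = 2^7). Knowing it narrows the target to 1/128.
Now birthdays. Ignoring leap years, a birthday has 365 possible values (also roughly evenly distributed), so it carries about 8.51 bits. Why 8.51? Because 2 to the power 8.51 is approximately 365. Knowing someone's birthday therefore narrows the group to 1/365. With these examples you should now have a rough feel for a fingerprint's information content.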
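The bit counts above are base-2 logarithms of the number of equally likely values. A minimal sketch, assuming the values really are evenly distributed (the function name bitsOfInformation is made up for illustration):

```javascript
// Information content, in bits, of a characteristic with N equally likely values.
function bitsOfInformation(equallyLikelyValues) {
  return Math.log2(equallyLikelyValues);
}

console.log(bitsOfInformation(2).toFixed(2));   // sex: about 1 bit
console.log(bitsOfInformation(8).toFixed(2));   // about 3 bits
console.log(bitsOfInformation(128).toFixed(2)); // about 7 bits
console.log(bitsOfInformation(365).toFixed(2)); // birthday: about 8.51 bits
```

When the values are not evenly distributed, a rare value carries more bits than a common one, so this formula is only the simplest case.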

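Combining several fingerprints

If you can obtain several fingerprints that are independent of one another, you can locate the target far more precisely. For example, to locate someone in a company, knowing both the person's birthday and Chinese zodiac sign gives a precision of 1/4380 (1/4380 = 1/12 * 1/365). The fractions multiply when characteristics are combined, so the group shrinks sharply. Why the emphasis on "independent"? Suppose what you know is the birthday and the Western star sign. The precision is still 1/365, because the birthday already contains the star sign. Only mutually independent characteristics (mathematicians would say "orthogonal") can be multiplied when combined.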
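Multiplying the fractions is the same as adding the bit counts. The sketch below works through the birthday plus zodiac example under the same assumptions as the text (independent characteristics, evenly distributed values); the function name combinedBits is hypothetical.

```javascript
// Bits carried by several independent characteristics, each given as its
// number of equally likely values. Independence is what allows simple addition.
function combinedBits(valueCounts) {
  return valueCounts.reduce((sum, n) => sum + Math.log2(n), 0);
}

const birthdayAndZodiacBits = combinedBits([365, 12]);
console.log(birthdayAndZodiacBits.toFixed(2));            // about 12.10 bits
console.log(Math.round(2 ** birthdayAndZodiacBits));      // one in 4380
```

For the birthday plus star sign case the addition would overcount, because the second characteristic adds nothing the first did not already reveal.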

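What is a "browser fingerprint"?

When you visit a website, your browser necessarily exposes certain information to the site. Why "necessarily"? Because some of this information is part of the HTTP protocol (used here in the broad sense that includes HTTPS). Whenever you access a site over HTTP, the browser must send this information to the site's server. To repeat: HTTP is the foundation of the Web, and any access to the Web through a browser goes over HTTP. A web server can therefore always obtain certain information about your browser (exactly which information is covered below).

How does a browser fingerprint expose privacy?

The browser fingerprint mechanism resembles cookies in some ways. If you have forgotten what cookies do, review the earlier post first. Two examples illustrate the privacy harm that browser fingerprints cause.

For sites that do not require login

If your browser accepts cookies, then on your first visit a site stores a cookie in your browser containing a unique identifier. On your next visit the server reads the cookie, recognizes from the identifier that you have been there before, and knows what you did last time. Defeating this kind of cookie is easy: clear the browser's cookies between the two visits and the site can no longer identify you this way. But clearing cookies does nothing against browser fingerprints. If your browser has a highly distinctive fingerprint, then on your first visit the site records the fingerprint on the server along with what you did on the site. On your next visit the server reads the fingerprint again, compares it with the stored ones, and knows whether you have been there before and what you did during that visit.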

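For sites that require login

If a site does not use fingerprint tracking, you can register several accounts (sock puppets) on it. To switch identity, log out, clear the browser's cookies, and log in with another account. The site cannot tell. Once the site uses fingerprint tracking, the same trick fails: you are still using the same browser, so the fingerprint is the same, and the site's server software can infer that the two accounts belong to the same person.

Browser fingerprints are stealthier and more dangerous than cookies

The comparison above covers two identity-tracking techniques, browser fingerprints and cookies. They work on the same principle: both use some distinctive information to pin down who you are. The essential differences are:
  1. A cookie has to be stored in the browser, so the user can discover it and delete it.
  2. A browser fingerprint stores nothing on the client, so the user does not notice it and cannot delete it (in other words, you cannot even tell whether the site you are visiting collects browser fingerprints).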
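The server-side matching described above can be sketched as hashing the reported attributes and grouping accounts by the resulting value. This is an illustration under stated assumptions, not how any particular site or library does it: SHA-256 is an arbitrary choice of hash, the account names are invented, and the helper names are hypothetical. The attribute values are borrowed from the sample output later in this document.

```javascript
const { createHash } = require("crypto");

// Reduce a set of browser attributes to one fingerprint string.
function fingerprintOf(attributes) {
  // Sort the keys so the same attributes always serialize the same way.
  const canonical = Object.keys(attributes)
    .sort()
    .map((key) => `${key}=${attributes[key]}`)
    .join("\n");
  return createHash("sha256").update(canonical).digest("hex");
}

// Server-side record: fingerprint -> set of accounts seen with it.
const accountsByFingerprint = new Map();

function recordLogin(account, attributes) {
  const fingerprint = fingerprintOf(attributes);
  if (!accountsByFingerprint.has(fingerprint)) {
    accountsByFingerprint.set(fingerprint, new Set());
  }
  const accounts = accountsByFingerprint.get(fingerprint);
  accounts.add(account);
  return [...accounts]; // every account seen from this browser so far
}

const sameBrowser = {
  language: "zh-CN",
  colorDepth: 24,
  screenResolution: "800,1280",
  timezone: "Asia/Shanghai",
  platform: "Win32",
};
recordLogin("alice", sameBrowser);
// Cookies were cleared in between, but the attributes did not change.
console.log(recordLogin("bob", sameBrowser));
```

Nothing is stored in the browser at any point, which is why clearing cookies between the two logins makes no difference.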

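What information makes up a browser fingerprint?

A browser exposes many kinds of information to a website. The common ones are:

User Agent

An earlier post in this series briefly explained what the User Agent is. If you already know, read on.

Screen resolution

This one is self-explanatory. One addition: it covers not only the screen dimensions but also the colour depth (whether the screen is 16-bit, 24-bit or 32-bit colour).

Time zone

Also self-explanatory. We are presumably all in UTC+8.

Browser plugin information

That is, which plugins your browser has installed. Once more: browser "plugins" and "extensions" are two different things, so don't confuse them. An earlier post in this series explained the difference.

Browser font information

Font information related to the browser. If your browser has the Flash or Java plugin installed, it may expose some font information. That is why the post on keeping the browser from leaking your online privacy warned about the risks of browser plugins.

HTTP ACCEPT

This is a field in the HTTP request headers. Since most readers are not IT professionals, I won't go into it here.

Other

Those are the common browser fingerprints. Other information can serve as a browser fingerprint too, but for reasons of space I won't list and explain it all. If you are interested, read the documentation on Mozilla's website.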

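How to view your own browser's fingerprint

The privacy problem caused by browser fingerprints was probably first exposed by the Electronic Frontier Foundation (EFF) in 2010. The EFF later provided a page that lets you view your own browser's fingerprint. In the middle of that page is a big red "TEST ME" button. Click it, wait a few seconds, and a table appears with the fingerprint information of your current browser. The table lists, for each fingerprint item, its information content and the share of browsers with the same value. Information content was explained earlier in this article. Just remember that the more information an item carries, the more distinctive it is, and the more distinctive a fingerprint is, the greater the threat to privacy. This post is getting long, so I'll stop here. Next time I'll share how to guard against the privacy risks of browser fingerprints. via: the 編程隨想 blog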
 
 
Fingerprint.js - Browser fingerprinting and fraud detection https://fingerprintjs.com/
 

fingerprint hash 187c8e293354eb2d15d9363a6f52f393
index.js:42 userAgent = Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safa
index.js:42 language = zh-CN
index.js:42 colorDepth = 24
index.js:42 deviceMemory = not available
index.js:42 hardwareConcurrency = 2
index.js:42 screenResolution = 800,1280
index.js:42 availableScreenResolution = 800,1227
index.js:42 timezoneOffset = -480
index.js:42 timezone = Asia/Shanghai
index.js:42 sessionStorage = true
index.js:42 localStorage = true
index.js:42 indexedDb = true
index.js:42 addBehavior = false
index.js:42 openDatabase = true
index.js:42 cpuClass = not available
index.js:42 platform = Win32
index.js:42 plugins = com.sogou.sogoupdfviewer,,application/pdf,pdf,Native Widget Plugin,This plugin allow you to use the
index.js:42 canvas = canvas winding:yes,canvas fp:
index.js:42 webgl = not available
index.js:42 webglVendorAndRenderer = undefined
index.js:42 adBlock = false
index.js:42 hasLiedLanguages = false
index.js:42 hasLiedResolution = false
index.js:42 hasLiedOs = false
index.js:42 hasLiedBrowser = false
index.js:42 touchSupport = 0,false,false
index.js:42 fonts = Arial,Arial Black,Arial Narrow,Calibri,Cambria,Cambria Math,Comic Sans MS,Consolas,Courier,Courier N
index.js:42 audio = 124.04344752358156

 

time 361
index.js:37 fingerprint hash b303b5c23680c363a36afc5764f3a275
index.js:42 userAgent = Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.109
index.js:42 language = en-US
index.js:42 colorDepth = 24
index.js:42 deviceMemory = 8
index.js:42 hardwareConcurrency = 2
index.js:42 screenResolution = 800,1280
index.js:42 availableScreenResolution = 800,1227
index.js:42 timezoneOffset = -480
index.js:42 timezone = Asia/Shanghai
index.js:42 sessionStorage = true
index.js:42 localStorage = true
index.js:42 indexedDb = true
index.js:42 addBehavior = false
index.js:42 openDatabase = true
index.js:42 cpuClass = not available
index.js:42 platform = Win32
index.js:42 plugins = Chrome PDF Plugin,Portable Document Format,application/x-google-chrome-pdf,pdf,Chrome PDF Viewer,,ap
index.js:42 canvas = canvas winding:yes,canvas fp:
index.js:42 webgl = 
index.js:42 webglVendorAndRenderer = Google Inc.~ANGLE (Mobile Intel(R) 4 Series Express Chipset Family Direct3D9Ex vs_3_0 ps_3_0)
index.js:42 adBlock = false
index.js:42 hasLiedLanguages = false
index.js:42 hasLiedResolution = false
index.js:42 hasLiedOs = false
index.js:42 hasLiedBrowser = false
index.js:42 touchSupport = 0,false,false
index.js:42 fonts = Arial,Arial Black,Arial Narrow,Calibri,Cambria,Cambria Math,Comic Sans MS,Consolas,Courier,Courier N
index.js:42 audio = 124.0434474653739

 

Fingerprinting - MozillaWiki https://wiki.mozilla.org/Fingerprinting

Overview

The EFF published an excellent study in May, detailing some of the various methods of fingerprinting a browser. See http://www.eff.org/deeplinks/2010/05/every-browser-unique-results-fom-panopticlick. They found that, over their study of around 1 million visits to their study website, 83.6% of the browsers seen had a unique fingerprint; among those with Flash or Java enabled, 94.2%. This does not include cookies! They ranked the various bits of information in order of importance (i.e. how useful they are in uniquely identifying a browser): things like UA string, what addons are installed, and the font list of the system. We need to go through these, one by one, and do what we can to reduce the number of bits of information (entropy) it provides. In their study, they placed a lower bound on the fingerprint distribution of 18.1 bits of entropy. (This means that, choosing a browser at random, at best one in 286,777 other browsers will share its fingerprint.)
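As a quick check on how the two figures in that last sentence relate, "one in N" corresponds to log2(N) bits. The sketch below only converts the number quoted above; it does not reproduce the study's calculation.

```javascript
// "One in N browsers shares this fingerprint" expressed as bits of entropy.
const oneInBrowsers = 286777;
const entropyBits = Math.log2(oneInBrowsers);
console.log(entropyBits.toFixed(2)); // about 18.13, which the text quotes as 18.1
```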

Data

The following data is taken from the published paper, https://panopticlick.eff.org/browser-uniqueness.pdf:

Entropy of various pieces of browser information

Variable            Entropy (bits)
plugins             15.4
fonts               13.9
user agent          10.0
http accept         6.09
screen resolution   4.83
timezone            3.04
supercookies        2.12
cookies enabled     0.353

In all cases, data was either collected or inferred via HTTP, or collected by JS code and posted back to the server via AJAX.

Plugins

The PluginDetect JS library was used to check for 8 common plugins on that platform, plus extra code to estimate the Acrobat Reader version. Data sent by AJAX post.

IE does not allow enumeration via navigator.plugins[]. Starting in Firefox 28 (bug 757726), Firefox restricts which plugins are visible to content enumerating navigator.plugins[]. This change does not disable any plugins; it just hides some plugin names from enumeration. Websites can still check whether a particular hidden plugin is installed by directly querying navigator.plugins[] like navigator.plugins["Silverlight Plug-In"].

This code change will reduce browser uniqueness by "cloaking" uncommon plugin names from navigator.plugins[] enumeration. If a website does not use the "Adobe Acrobat NPAPI Plug-in, Version 11.0.02" plugin, why does it need to know that the "Adobe Acrobat NPAPI Plug-in, Version 11.0.02" plugin is installed? If a website does need to know whether the plugin is installed or meets minimum version requirements, it can still check navigator.plugins["Adobe Acrobat NPAPI Plug-in, Version 11.0.02"] or navigator.mimeTypes["application/vnd.fdf"].enabledPlugin (to workaround problem plugins that short-sightedly include version numbers in their names, thus allow only individual plugin versions to be queried).

For example, the following JavaScript reveals my installed plugins:

for (const plugin of navigator.plugins) { console.log(plugin.name); }

"Shockwave Flash"
"QuickTime Plug-in 7.7.3"
"Default Browser Helper"
"Unity Player"
"Google Earth Plug-in"
"Silverlight Plug-In"
"Java Applet Plug-in"
"Adobe Acrobat NPAPI Plug-in, Version 11.0.02"
"WacomTabletPlugin"

navigator.plugins["Unity Player"].name // get cloaked plugin by name
"Unity Player"

But with plugin cloaking, the same JavaScript will not reveal as much personally-identifying information about my browser because all plugin names except Flash, Shockwave (Director), Java, and QuickTime are hidden from navigator.plugins[] enumeration:

for (const plugin of navigator.plugins) { console.log(plugin.name); }

"Shockwave Flash"
"QuickTime Plug-in 7.7.3"
"Java Applet Plug-in"

In theory, all plugin names could be cloaked because web content can query navigator.plugins[] by plugin name. Unfortunately, we could not cloak all plugin names because many popular websites check for Flash or QuickTime by enumerating navigator.plugins[] and comparing plugin names one by one, instead of just asking for navigator.plugins["Shockwave Flash"] by name. These websites should be fixed.

The policy of which plugin names are uncloaked can be changed in the about:config pref plugins.enumerable_names. The pref’s value is a comma-separated list of plugin name prefixes (so the prefix "QuickTime" will match both "QuickTime Plug-in 6.4" and "QuickTime Plug-in 7.7.3"). The default pref cloaks all plugin names except Flash, Shockwave (Director), Java, and QuickTime. To cloak all plugin names, set the pref to the empty string "" (without quotes). To cloak no plugin names, set the pref to magic value "*" (without quotes).
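The prefix rule for plugins.enumerable_names can be sketched as a filter over plugin names. This is only an illustration of the behaviour described above, not Firefox's implementation: the function name enumerablePlugins is made up, and the pref value "Shockwave,QuickTime,Java" is an assumed stand-in for the default, which is not quoted in the text.

```javascript
// Which plugin names stay visible to enumeration for a given pref value.
function enumerablePlugins(installedNames, prefValue) {
  if (prefValue === "*") return [...installedNames]; // magic value: cloak nothing
  const prefixes = prefValue
    .split(",")
    .map((prefix) => prefix.trim())
    .filter((prefix) => prefix !== ""); // empty pref: no prefixes, cloak everything
  return installedNames.filter((name) =>
    prefixes.some((prefix) => name.startsWith(prefix))
  );
}

const installedPlugins = [
  "Shockwave Flash",
  "QuickTime Plug-in 7.7.3",
  "Default Browser Helper",
  "Unity Player",
  "Google Earth Plug-in",
  "Silverlight Plug-In",
  "Java Applet Plug-in",
  "Adobe Acrobat NPAPI Plug-in, Version 11.0.02",
  "WacomTabletPlugin",
];

// Assumed pref value for illustration only.
console.log(enumerablePlugins(installedPlugins, "Shockwave,QuickTime,Java"));
```

With the assumed value, the visible names are the same three that the cloaked enumeration above prints.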

Fonts

System fonts collected by Flash or Java applet, if installed, and sent via AJAX post. Font list was not sorted, which provides a bit or two of additional entropy. We can ask Adobe to either limit this list by default; or ask them to implement an API such that we can provide the list to them; or (made possible by OOPP) replace the OS API calls they use to get the font list, and give them our own. None of these things are easy, but given that this is #1, we should definitely do something here. The fastest option is probably to hack the OS API calls ourselves.

Font lists can also be determined by CSS introspection. We could perhaps reduce the available set to a smaller number of common fonts; and back off (exponentially?) if script attempts to brute-force the list. Could require that sites provide unusual fonts via WOFF?

User Agent

Detected from HTTP header. Pretty simple fix, but has the potential for breakage (as with any UA change!). For instance: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.7) Gecko/20100106 Ubuntu/9.10 (karmic) Firefox/3.5.7. Remedies: remove the last point digit in the Firefox and Gecko versions, and the Gecko build date; for Linux, remove distribution and version; possibly remove CPU. Windows is actually the least unique since the OS version string only identifies the major version (e.g. XP), and by far the majority of users are on it.

Remove language and "Firefox" as well?

Boris Zbarsky points out that most parts of the UA lead to bad sniffing. Irish "ga-IE" and "Minefield" get detected as IE. Sites incorrectly sniff based on OS. Sites sniff for Gecko years rather than Gecko versions. Going from 3.0.9 to 3.0.10 probably breaks things. And quite a few sites sniff for "Firefox", which is a threat to the continued freedom of the web. So removing things from the UA string has a long-term positive effect on compatibility as well as privacy.

There is another issue with UA spoofing. For some reason, Components.classes and Components.interfaces exist in the content-window JavaScript namespace. Gregory Fleischer used this to test for the existence of ephemeral interfaces to fingerprint both OS and Firefox version, down to the minor revision (FF3.5.3 was the latest release at the time). He has a number of other fingerprinting demos you should investigate as well. -- mikeperry

Filed bugs. -- Hsivonen 09:33, 18 June 2010 (UTC)

HTTP ACCEPT

Example: text/html, */* ISO-8859-1,utf-8;q=0.7,*;q=0.7 gzip,deflate en-us,en;q=0.5. Not sure we can do much here?

Screen resolution

Example: 1280x800x24. Can't mess with this, except perhaps to always report "24" for the color depth -- of dubious value.

Mapping "32" to "24" or vice versa in the color depth would reduce entropy by ~0.9 bits. May be worthwhile.
Torbutton takes two countermeasures with respect to screen resolution: quantising AvailWidth and AvailHeight, and setting Width and Height to the values of AvailWidth and AvailHeight. Torbutton currently errs in not doing this if the window is maximised. These measures might be appropriate in private browsing mode. --  Pde 03:12, 15 June 2010 (UTC)
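The two countermeasures can be sketched as follows. This is an illustration, not Torbutton's code: the step size of 50 pixels and the names quantize and reportedScreen are assumptions, and the input values are taken from the sample fingerprint output earlier in this document.

```javascript
// Round a dimension down to a multiple of `step` so fewer distinct values exist.
function quantize(value, step) {
  return Math.floor(value / step) * step;
}

// Report quantised available dimensions, and reuse them as width and height.
function reportedScreen(real, step = 50) {
  const availWidth = quantize(real.availWidth, step);
  const availHeight = quantize(real.availHeight, step);
  return { availWidth, availHeight, width: availWidth, height: availHeight };
}

console.log(
  reportedScreen({ width: 800, height: 1280, availWidth: 800, availHeight: 1227 })
);
```

A larger step hides more, at the cost of pages laying themselves out for a size that differs more from the real one.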

Timezone

Too useful to break.

Supercookies

The reported entropy includes only whether the following were enabled: DOM localStorage, DOM sessionStorage, and (for IE) userData. It did not test Flash LSOs, Silverlight cookies, HTML5 databases, or DOM globalStorage. We can't do anything to prevent testing whether these are enabled, but we can lock them down for third parties, as we will with cookies.

For Flash and Silverlight we need to pressure them to implement better APIs for controlling and clearing stored data. This is undoubtedly more important than anything else on this list, though it was ignored in this study since it does not fit within their definition of fingerprinting. We could be aggressive here by using the new Flash API for private browsing mode very liberally; or do something with the OS APIs as mentioned above.

Cookies enabled

Irrelevant due to low amount of entropy.

Extra credit

Other fingerprinting methods were mentioned, but not included, in the study. A Gartner report on fingerprinting services was referenced in the study, which will undoubtedly be interesting to read.

Examples:

Other data acquired via plugins

Undoubtedly Flash and Java provide other interesting tidbits. ActiveX and Silverlight, for example, allow querying the "CPU type and many other details". More study needed here.

Clock skew measurements

"41st Parameter looks at more than 100 parameters, and at the core of its algorithm is a time differential parameter that measures the time difference between a user’s PC (down to the millisecond) and a server’s PC." We can't break the millisecond resolution of Date.now, but we could try adding a small (< 100ms) offset to it. This would be generated per-origin, and would last for some relatively short time: life of session, life of tab, etc. Would have to be careful that it can't be reversed.

Clock skew measurement isn't really a browser issue; it tends to be exposed by the operating system at the TCP level. It would be appropriate to assume that an attacker can obtain 4-6 bits of information about the identity of a host by this method. -- Pde 02:55, 15 June 2010 (UTC)
This is not 100% correct. According to RFC 1323 sections 3.2 and 4.2.2, timestamps may only be used if the initial syn packet (not syn+ack) contains a timestamp field. This is a property of the client OS, and may be controllable on some platforms. The timestamp value is also not absolute, but is typically some arbitrary number of milliseconds with no specific reference point. TLS also has a timestamp, but this value is fully controlled by Firefox. -- mikeperry
Agree that one could turn off the TCP RTTM option at the OS layer. My naive intuition is that all modern OSes have this turned on, and turning it off would be a radical intervention bad for congestion avoidance and possibly fingerprintable itself. Note that clock skew is a function of how fast a clock ticks, not of what time the clock has. An arbitrary reference point is sufficient for measuring clock skew. -- Pde 08:23, 9 December 2010 (PST)
Note also that it's not just clock skew, but also clock precision that can allow for fingerprinting - both in terms of how long certain operations take on a system and in terms of user action. For example, Scout Analytics provides software to fingerprint users based on typing cadence. One can also imagine tight loops of timed JavaScript that fingerprint users based on certain resource-intensive calls. One possibility might be to quantize Date values to the second, and then add random, monotonically increasing amounts of milliseconds to subsequent calls during private browsing mode. -- mikeperry
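One way to read the last suggestion is sketched below: round the real time down to the second, then make each call return a slightly larger value than the previous one. This is purely illustrative and not something Firefox is known to do; makeCoarseClock, the injectable now and random parameters, and the 1 to 10 ms increment are all assumptions.

```javascript
// Returns a clock whose values are quantized to the second, plus random,
// strictly increasing millisecond offsets on successive calls.
function makeCoarseClock(now = () => Date.now(), random = Math.random) {
  let last = 0;
  return function coarseNow() {
    const wholeSecond = Math.floor(now() / 1000) * 1000;
    // Start from the later of the current second and the last value returned,
    // so the clock never runs backwards.
    const base = Math.max(wholeSecond, last);
    last = base + 1 + Math.floor(random() * 10); // advance by 1 to 10 ms
    return last;
  };
}

const coarseNow = makeCoarseClock();
const firstReading = coarseNow();
const secondReading = coarseNow();
console.log(secondReading > firstReading);
```

Script still sees time move forward, but the differences between readings no longer reflect how long an operation really took.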

TCP stack

"ThreatMetrix claims that it can detect irregularities in the TCP/IP stack and can pierce through proxy servers". Not sure what this means yet.

nmap's host fingerprinting options (and source code) are the first place to start for understanding the TCP/IP stack issues. Again, there's not much the browser can do about that.
As for "pierce through proxy servers", my best guess is that they use the raw socket infrastructure provided by Flash, which does not respect the browser's proxy settings, in order to learn the client's IP. Not sure if Java and Silverlight have similar problems. --  Pde 02:58, 15 June 2010 (UTC)

JS behavioral tests

Can be used to gather information about whether certain addons are installed, exact browser version, etc. Probably nothing we can do here.

Recommend privacy-related addons and services

"TorButton has evolved to give considerable thought to fingerprint resistance [19] and may be receiving the levels of scrutiny necessary to succeed in that project [15]. NoScript is a useful privacy enhancing technology that seems to reduce fingerprintability."

"We identified only three groups of browser with comparatively good resistance to fingerprinting: those that block JavaScript, those that use TorButton, and certain types of smartphone."

We should study what TorButton does, and see if we can integrate some of its features. We can also recommend it, NoScript, and Flashblock to users. We could suggest improvements to relevant addons, such as providing options for blocking third party but not first party content. (This doesn't strictly solve anything, but makes gathering the data more difficult, since the third party now relies on the first party to collect it.)

Unfortunately Flashblock does not appear to prevent Flash from reading and writing LSOs, so it's doubtful it can be relied upon to protect against fingerprinting. --  Pde 03:00, 15 June 2010 (UTC)

User interface

Things like geolocation, database access and such require the user to grant permission for a given site. For geolocation, this is done with an infobar. We should do everything we can to make it clear to users what they're providing, and give them centralized control of those permissions in the privacy panel. This is what the UX privacy proposals seek to do.

HTML5 Canvas

"After plugins and plugin-provided information, we believe that the HTML5 Canvas is the single largest fingerprinting threat browsers face today." - Tor Project. Original research: Pixel Perfect: Fingerprinting Canvas in HTML5, demo: HTML5 Canvas Fingerprinting.

See Also
