Verification Mind Games: How to Think Like a Verifier

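1. Effective verification requires verification engineers to think differently from designers. Specifically, verification is concerned with finding bugs in the design while adhering strictly to the protocol, hunting for corner cases, and keeping zero tolerance for inconsistencies in the design.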

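Mindset: a set of fixed attitudes that people hold, sometimes described as mental inertia, groupthink, or a paradigm. Its influence is hard to counteract during analysis and decision making.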

As a simple example, look at any verification engineer job posting and you will find a list of languages, methodologies, tools, and domain knowledge.
 
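Many experienced engineers can learn to read a specification, draw up a verification plan, and write enough code to implement it, yet still fall short of the purpose of verification at some critical point.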

This article tries to reveal the verification mindset through a series of key choices that arise in a verification environment:
   1. Is it the verification environment's duty to accurately replicate the real world?
   2. Is it acceptable for the testbench and/or testcases to make use of design signals?
   3. Is it worthwhile to target corner cases that designers consider invalid?

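A competent verification mindset differs greatly from a design mindset, and it takes years of experience, mentoring, and hard-won struggle to develop.
It runs through every decision, every line of code, every meeting, and every project. Say no to design-centric verification!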

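At present, the verification mindset remains an underrated and underserved topic. In this article we analyze why it is so critical to verification effectiveness. We also discuss decisions that affect a testbench's debuggability. The article describes what a verification engineer should do, not how they should do it.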

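1.  INTEROPERATING VERSUS STRESSING
 One of the most important challenges in verification is drawing the line between stressing the design and interoperating with it. Only with the right mindset can that line be drawn correctly.
   Common areas where the question comes up:
      1. Clock and data recovery (CDR)
      2. Handshaking under error conditions
      3. Status indicators, such as FIFO full and empty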

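A. Clock and Data Recovery
    1. Some protocols specify that data is sent and received without an accompanying clock signal. A design that implements such a protocol needs a clock and data recovery function to extract the clock from the incoming data. A verification component (VC) developed for this design faces a similar situation: no clock is sent or received, only data.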


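As shown in the figure, when implementing the VC monitor, the verification engineer must decide how to receive the data coming from the DUT. Should the monitor implement a CDR algorithm, which the design has already implemented? No. A better solution is to implement a phase-locked loop (PLL) algorithm based on the same reference clock that the DUT uses, and to sample the incoming data with the PLL output.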

This approach accomplishes the following:
  1.  It verifies that the DUT’s data stream is in sync with the  reference clock
  2.  It avoids any possibility of the testbench masking a  problem because the CDR algorithm is too tolerant
  3.  It has better simulation performance than doing costly  checks on data rates
  4.  It is faster and simpler to implement than CDR
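Compare the verification perspective with the design perspective:
When building an RTL CDR component, the designer strives to make the algorithm as robust as possible, so that it can interoperate with a wide range of external devices. The verification engineer instead tries to make the algorithm as intolerant as possible, so it stresses the design and fails on the slightest deviation from the protocol specification.
The goal of verification is therefore not to replicate reality but to verify the design as thoroughly as possible, even if the result is not entirely realistic.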
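The intolerant-monitor idea can be sketched in Python. This is a simplified illustration, not the PLL described above: it assumes the monitor only sees a list of data-edge timestamps, and the names `StrictMonitor`, `bit_period_ps`, and `tolerance_ps` are hypothetical. The bit grid is anchored once at the first edge and never re-adjusted, so drift that a CDR would silently track becomes a failure.

```python
from dataclasses import dataclass


class ProtocolViolation(Exception):
    """Raised when the DUT's data stream drifts from the reference clock."""


@dataclass
class StrictMonitor:
    bit_period_ps: int   # nominal bit period derived from the reference clock
    tolerance_ps: int    # deliberately tight; the value is an assumption

    def check_edges(self, edge_times_ps):
        """Check that every data edge sits on the reference-clock bit grid.

        No clock is recovered from the data: the grid is anchored at the
        first edge and never re-adjusted, so accumulated drift is flagged
        instead of being tracked and hidden.
        """
        if not edge_times_ps:
            return
        origin = edge_times_ps[0]
        for t in edge_times_ps[1:]:
            offset = (t - origin) % self.bit_period_ps
            # Distance to the nearest grid point, whichever side it is on.
            error = min(offset, self.bit_period_ps - offset)
            if error > self.tolerance_ps:
                raise ProtocolViolation(
                    f"edge at {t} ps is {error} ps off the bit grid"
                )
```

With a 1000 ps bit period and a 20 ps tolerance, edges at 0, 1000, 3005, and 7990 ps pass, while an edge at 2050 ps raises `ProtocolViolation`.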

 
B.  Handshaking Error Handling
   Most protocols require feedback in the form of ACK/NACK packets to indicate whether an error occurred in the most recently received transfer.


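1. With a design mindset, the VC would simply follow the protocol specification: automatically send an ACK when it receives an error-free transfer, and automatically send a NACK when it receives a transfer with an error.
2. This overlooks one thing: under normal conditions the DUT never produces an erroneous transfer, so the testbench would never return a NACK.
3. The verification component is responsible for creating this error scenario, so the testcase writer must be able to manually control the VC to send a NACK in response.
4. There is also the abnormal handshake case, in which no handshake signal is returned to the design at all. This requires the VC to support error injection.
5. A VC that only implements the complete protocol leads to verification failure and wasted effort.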
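The points above can be sketched as a responder with a testcase override. This is a minimal Python illustration under assumed names (`ResponderVC`, `force_next`, `Response`); none of them come from the original paper. By default the VC follows the protocol, but the testcase can force a NACK on a good transfer or withhold the handshake entirely.

```python
from enum import Enum


class Response(Enum):
    ACK = "ACK"
    NACK = "NACK"
    NONE = "NONE"   # handshake withheld entirely


class ResponderVC:
    """Hypothetical VC that answers DUT transfers with ACK/NACK."""

    def __init__(self):
        self._forced = []   # queue of testcase-forced responses

    def force_next(self, response: Response) -> None:
        """Testcase hook: override the response to the next transfer."""
        self._forced.append(response)

    def respond(self, transfer_ok: bool) -> Response:
        # A forced response wins, even if the transfer was error-free.
        if self._forced:
            return self._forced.pop(0)
        # Default behaviour follows the protocol.
        return Response.ACK if transfer_ok else Response.NACK
```

A testcase would call `force_next(Response.NACK)` before a known-good transfer and then check that the DUT retransmits or otherwise recovers.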

C.  Status Indicators and Clocks (a well-known classic, omitted here)

OUTSIDE-THE-BOX VERIFICATION PLANNING 
A. Corner case identification:
   1. Function Input Parameters
       a. Whether or not something is valid or invalid is in fact irrelevant; what is important is, can such a scenario ever happen, and if the answer is "yes", then it must be simulated to ensure that the design recovers from it.
       b. It was the responsibility of the  verification engineer to take a higher-level view of things in  order to build the best possible verification environment. 
   
   2. Register Accesses:
      Consider this situation: a design is specified to have a low-power mode that can be activated by writing a ‘1’ to a given register bit.  The bit’s default value is ‘0’, making the device be in normal mode by default.  In th
      1. The following tests come to mind easily:
           - Write ‘1’ to the low-power bit
           - Check that the device enters low-power mode
           - Write ‘0’ to the low-power bit
           - Check that the device exits low-power mode
      2. But one case has not been considered:
           What happens when a ‘0’ is written to the bit when the bit is already ‘0’ (or a ‘1’ when it is already ‘1’)?
         A critical bug could be hiding here: writing ‘0’ while the bit is already ‘0’ might incorrectly put the device into low-power mode.
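To illustrate why redundant writes deserve their own test, here is a small Python sketch. `LowPowerModel` is a hypothetical stand-in for the DUT with a deliberately planted bug, and `run_register_test` is an assumed helper; neither comes from the original paper. The test writes every combination of current and written value and checks the mode after each write.

```python
class LowPowerModel:
    """Hypothetical stand-in for the DUT, with an optional planted bug."""

    def __init__(self, buggy: bool = False):
        self.bit = 0
        self.low_power = False
        self._buggy = buggy

    def write(self, value: int) -> None:
        if self._buggy and value == self.bit:
            # Planted bug: a redundant write toggles the mode.
            self.low_power = not self.low_power
        else:
            self.low_power = bool(value)
        self.bit = value


def run_register_test(dut) -> list:
    """Write every (current, written) combination and check the mode each time."""
    failures = []
    for value in (0, 1, 1, 0, 0):   # covers 0->0, 0->1, 1->1, 1->0, 0->0
        before = dut.bit
        dut.write(value)
        if dut.low_power != bool(value):
            failures.append((before, value))
    return failures
```

A test that only writes ‘1’ and then ‘0’ passes on the buggy model; the sequence above exposes the bug on both the 0-to-0 and 1-to-1 writes.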

The verification plan covers everything that can possibly happen, whether or not the DUT is intended to handle it, and regardless of what the designers may say.
As you can imagine, doing error injection in creative ways greatly expands the search space for finding bugs, and so experience and a degree of gut-feeling is required to target those areas most likely to be concealing real bugs.  Where is the line between interoperating with the design and stressing it?

PRIORITIZING DEBUGGABILITY 
1. A good testbench prioritizes debuggability!
    
    A.  Protocols with Bi-Directional Ports 
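Some devices try to save on pins and board trace routing by employing bi-directional ports for data and/or clock signals.  By the nature of such protocols, at least two devices are responsible for driving the data and clock signals; otherwise bi-directional ports would not be used.

The above is one way to implement the interface, but its debuggability is poor: the DUT and all VCs connect to the same bi-directional port, so it is hard to tell which component is actually driving the bus.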

Using a verification mindset, we make use of both  unidirectional and bi-directional signals to achieve both ease of  debug and adherence to the protocol.  

(In Figure 10, "(highz1, strong0)" means "when the signal is assigned a ‘1’, it takes on ‘Z’; when it is assigned a ‘0’, it takes on ‘0’").
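The benefit of splitting the bus can be modelled in a few lines of Python. This is a behavioural sketch only, not HDL and not the code from the paper: each component keeps its own unidirectional output, each output is mapped through the (highz1, strong0) rule, and the shared line is resolved from them. The function names and the pull-up on the shared line are assumptions.

```python
def open_drain(value: int) -> str:
    """Map an assigned bit to a (highz1, strong0) drive: 1 -> 'Z', 0 -> '0'."""
    return "Z" if value == 1 else "0"


def resolve_bus(drivers: dict) -> tuple:
    """Resolve the shared line and report who is pulling it low.

    `drivers` maps a component name to the bit it assigns on its own
    unidirectional output. The shared line is assumed to have a pull-up,
    so it reads '1' when every driver is 'Z'.
    """
    drives = {name: open_drain(bit) for name, bit in drivers.items()}
    pulling_low = sorted(name for name, d in drives.items() if d == "0")
    line = "0" if pulling_low else "1"
    return line, pulling_low
```

Because every driver remains visible separately, a low bus can be traced to the component responsible, which is exactly what a single shared bi-directional signal hides.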

Inside the DUT, the signal is driven as follows:


ROUNDING OUT THE MINDSET 

To what extent must the verification component follow the design protocol?

The DUT is sending data without an accompanying clock. Should my VC do Clock-Data-Recovery?... No.

The protocol has bi-directional signals with the potential for multiple masters. Should I split each signal into two at the VC interface level?... Yes.

 


V.  ROUNDING OUT THE MINDSET

A. No Coverage without Checking
  The earlier example of writing ‘0’ to the low-power enable bit when it is already ‘0’ highlights another part of the verification mindset: never do coverage on anything in the absence of doing checks.  This rules out doing register value coverage, because it is misleading at best and a waste of compute resources in a large system-on-a-chip (SOC).
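One way to enforce this rule is to make the coverage collector refuse any sample that was not accompanied by a passing check. The Python sketch below is illustrative only; `CheckedCoverage` and its methods are hypothetical names, not part of any real coverage tool.

```python
class CheckedCoverage:
    """Hypothetical collector that records a scenario only if it was checked."""

    def __init__(self):
        self.hits = {}

    def sample(self, scenario: str, checked: bool, passed: bool) -> None:
        # An unchecked or failing scenario proves nothing, so it earns no coverage.
        if checked and passed:
            self.hits[scenario] = self.hits.get(scenario, 0) + 1

    def covered(self, scenario: str) -> bool:
        return self.hits.get(scenario, 0) > 0
```

Under this scheme, merely writing ‘0’ to the low-power bit never counts as covered; the write counts only once the device mode has been checked afterwards.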

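B.  Approach to Debugging
    Waveform-based debugging is a poor fit for a VC, because many operations consume no simulation time and stimulus generation relies on dynamic data structures. The right mindset is that the best debug tool is the log file itself. For the log file:
    1. Use appropriate and consistent message formats.
    2. This does not mean waveforms are never used for debugging, but logs should be relied on more.
    3. If a problem cannot be debugged from the log file, the messaging needs to be improved.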
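A consistent message format is easy to enforce by routing every message through one formatter. The sketch below uses Python's standard `logging` module; the field layout (simulation time, severity, component, message) and the helper names `make_vc_logger` and `log_event` are assumptions chosen for illustration.

```python
import io
import logging


def make_vc_logger(name: str, stream) -> logging.Logger:
    """Build a logger whose every line has the same fields in the same order."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    # sim_time and component are supplied by the caller through `extra`.
    handler.setFormatter(logging.Formatter(
        "%(sim_time)10d ns | %(levelname)-7s | %(component)-12s | %(message)s"
    ))
    logger.addHandler(handler)
    return logger


def log_event(logger, sim_time: int, component: str, level: int, msg: str) -> None:
    """Emit one log line stamped with simulation time and component name."""
    logger.log(level, msg, extra={"sim_time": sim_time, "component": component})
```

Because every line carries the simulation time and the component name in fixed columns, the log can be filtered and compared across runs without opening a waveform viewer.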

C.  Zoom-In, Zoom-Out Thinking 
When zoomed-in the engineer does tasks such as:
- Understand design specifications
- Write verification plans based on the specification
- Write code to implement the verification plan
- Write testcase code
- Debug failing testcases

Verification engineers must also do a series of tasks while zoomed-out such as:
- Decide which design features to focus on to maximize bug discovery
- Devise creative ways to tease bugs out
- Allocate time so as to get the most important checking and coverage for the effort

The main reason for this difference is that verifiers need to deal with a larger scope than designers do.
   1. Unlike designers, who must finish their work before tapeout, verification can continue after tapeout, or be abandoned.
   2. To work at this scope, verifiers need access to a wide range of information: system architecture, design hot-spots, project schedule, and client deliverables.

D. What Are We Trying to Accomplish Here? 

E.  Coverage, Not Testcases 

F.  Liaison between Design Architect and Design Engineer 

G.  Quitting on First Error

Finally, a brief summary:

1. The protocol says that when an error condition is detected, the design must send a NACK packet.  Should my VC automatically send NACK too?... No.

2. The design indicates FIFO fullness with signals "full" and "empty".  Can my VC or testcase make use of them to prevent overflow/underflow?... Yes, but only if checks are made on them.

3. Is it sufficient to limit error injection to   the scenarios for which the DUT has   detection capabilities?... No.

4. A design has a register bit that defaults  to ‘0’, and causes the DUT to enter  low-power mode when written with ‘1’.   Is writing ‘0’ when it is already at ‘0’  important to test?... Yes.

5. Can I snoop the design’s internal clock  to synchronize my VC to?... No.

6. Should I ask myself "what is it I'm trying to accomplish here?" when embarking on a new verification task?... Yes.

7. Is it my responsibility to ensure that the design architect and design engineer are on the same page?... Yes.

8. Should I regularly step back from low-level implementation and take a high-level view of the verification effort as a whole?... Yes.

9. A design draws circles of radius given  by an input parameter.  Is testing a  radius of zero important?... Yes

10. Should I implement coverage on  individual register values?...No.

11. Should I be relying mainly on waveforms  to debug my VC?...No.

12. Is coverage closure more important than  testcase passing rate?... Yes.


13. Is it necessary to allow a simulation to continue running after it has encountered an error?... No.



