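GitHub project address | GitHub project address |
---|---|
Assignment requirements | Link to the assignment requirements |
Pair partner's blog | Partner's link |
Personal blog address | Personal address |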
This is the two of us pair programming on the key code; by working together we eventually got it implemented.

PSP2.1 | Personal Software Process Stages | Estimated time (minutes) | Actual time (minutes) |
---|---|---|---|
Planning | Planning | 40 | 55 |
· Estimate | · Estimate how long the task will take | 90 | 90 |
Development | Development | 60 | 75 |
· Analysis | · Requirements analysis (including learning new technologies) | 30 | 35 |
· Design Spec | · Write the design document | 15 | 18 |
· Design Review | · Design review (review the design document with colleagues) | 20 | 20 |
· Coding Standard | · Coding standard (define suitable conventions for the current work) | 5 | 5 |
· Design | · Detailed design | 30 | 40 |
· Coding | · Coding | 120 | 150 |
· Code Review | · Code review | 40 | 60 |
· Test | · Testing (self-testing, fixing code, committing changes) | 50 | 70 |
Reporting | Reporting | 100 | 120 |
· Test Report | · Test report | 50 | 55 |
· Size Measurement | · Measure the workload | 20 | 20 |
· Postmortem & Process Improvement Plan | · Postmortem and process improvement plan | 15 | 10 |
Total | | 685 | 823 |
b. Design and implementation process
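Broadly, apart from the class that holds the main function, there is one large class, getFile, with seven methods. One shared method, getDic, builds the dictionary: it stores each word that is at least four characters long (the code checks `Length >= 4`) and does not start with a digit, together with its number of occurrences, and returns a Hashtable. The method getWordFre sorts that dictionary by occurrence count and returns a dynamic array (ArrayList). The remaining methods call getWordFre directly and use the returned array to implement their own features. The unit tests exercise the functionality of each of these methods. The flowchart of the whole program is shown below.
[Figure: flowchart of the whole program]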
```csharp
// getDic: count word frequencies in a text file and store them in a Hashtable.
public Hashtable getDic(string pathName, ref Hashtable wordList)
{
    Regex rg = new Regex("[0-9A-Za-z-]+");  // matches candidate words
    Regex regNum = new Regex("^[0-9]");     // matches a leading digit

    // The using block closes the reader even if an exception is thrown.
    using (StreamReader sr = new StreamReader(pathName))
    {
        string line;
        while ((line = sr.ReadLine()) != null)  // read line by line
        {
            MatchCollection mc = rg.Matches(line);
            for (int i = 0; i < mc.Count; i++)
            {
                string mcTmp = mc[i].Value.ToLower();  // case-insensitive
                // Keep words of at least 4 characters that do not start with a digit.
                if (mcTmp.Length >= 4 && !regNum.IsMatch(mcTmp))
                {
                    if (!wordList.ContainsKey(mcTmp))
                    {
                        wordList.Add(mcTmp, 1);  // first occurrence: add as a key
                    }
                    else
                    {
                        wordList[mcTmp] = (int)wordList[mcTmp] + 1;  // otherwise increment
                    }
                }
            }
        }
    }
    return wordList;
}
```
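getDic(string pathName, ref Hashtable wordList) extracts every word from the text and records each word's frequency in a Hashtable. It opens the file with StreamReader and reads it line by line in a while loop, matching the words on each line with a regular expression. The for loop inside the while filters the matches: words that meet the conditions are added to the dictionary and the rest are discarded. Finally the method returns the Hashtable.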
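As a point of comparison, the same counting idea can be sketched with the generic `Dictionary<string, int>` instead of Hashtable, which avoids the casts. This is only a minimal sketch, not the project's code: the class name `WordCounterSketch` and the method `CountWords` are made up for illustration, and reading from a `TextReader` rather than a path is an assumption that makes the logic easy to try on a string.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the getFile class from the post.
public static class WordCounterSketch
{
    private static readonly Regex WordPattern = new Regex("[0-9A-Za-z-]+");

    // Counts words read from any TextReader (a file, or a StringReader in tests).
    public static Dictionary<string, int> CountWords(TextReader reader)
    {
        var counts = new Dictionary<string, int>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            foreach (Match m in WordPattern.Matches(line))
            {
                string word = m.Value.ToLowerInvariant();
                // Same filter as getDic: at least 4 characters, no leading digit.
                if (word.Length < 4 || char.IsDigit(word[0])) continue;
                counts.TryGetValue(word, out int n);  // n is 0 when the key is missing
                counts[word] = n + 1;
            }
        }
        return counts;
    }

    public static void Main()
    {
        var counts = CountWords(new StringReader("Hello hello world\n123abc file-name the"));
        foreach (var pair in counts)
            Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
    }
}
```

In the sample input, "123abc" is dropped because it starts with a digit and "the" because it is too short, while "file-name" is kept because the pattern allows hyphens, just as in getDic.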
```csharp
public ArrayList getWordFre(string pathName, ref Hashtable wordList)
{
    getFile Wordlist = new getFile();
    Hashtable Wordlist_fre = Wordlist.getDic(pathName, ref wordList);

    // Start from the keys in alphabetical order.
    ArrayList keysList = new ArrayList(Wordlist_fre.Keys);
    keysList.Sort();

    // Insertion sort by descending frequency. The sort is stable,
    // so words with equal counts stay in alphabetical order.
    for (int i = 1; i < keysList.Count; i++)
    {
        string tmp = keysList[i].ToString();
        int valueTmp = (int)wordList[keysList[i]];  // occurrence count
        int j = i;
        while (j > 0 && valueTmp > (int)wordList[keysList[j - 1]])
        {
            keysList[j] = keysList[j - 1];  // shift less frequent words right
            j--;
        }
        keysList[j] = tmp;  // insert at its sorted position
    }
    return keysList;
}
```
getWordFre(string pathName, ref Hashtable wordList) sorts the words in the wordList passed to it by frequency, converting the Hashtable's keys into a dynamic array (ArrayList) and returning it.
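The same ordering (highest count first, ties in alphabetical order) can also be expressed with LINQ. The sketch below is an alternative, not the method used in the project; `FrequencySortSketch` and `SortByFrequency` are hypothetical names, and it assumes the counts are held in a generic dictionary and that ordinal string comparison is acceptable for breaking ties.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the getFile class from the post.
public static class FrequencySortSketch
{
    // Returns the words ordered by descending count; ties are broken alphabetically.
    public static List<string> SortByFrequency(IDictionary<string, int> counts)
    {
        return counts
            .OrderByDescending(pair => pair.Value)
            .ThenBy(pair => pair.Key, StringComparer.Ordinal)
            .Select(pair => pair.Key)
            .ToList();
    }

    public static void Main()
    {
        var counts = new Dictionary<string, int>
        {
            { "world", 2 }, { "hello", 2 }, { "file", 5 }, { "name", 1 }
        };
        // Prints: file, hello, world, name
        Console.WriteLine(string.Join(", ", SortByFrequency(counts)));
    }
}
```

Compared with the hand-written insertion sort, this leaves the sorting algorithm to the library and states the tie-breaking rule explicitly.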
```csharp
public void write(string outputPath, ref Hashtable wordList, int lines, int words,
                  int characters, int wordsOutNumFla, int wordsOutNum, int m, string inputPath)
{
    getFile Wordlist = new getFile();
    ArrayList keysList1 = Wordlist.getPhrase(inputPath, outputPath, ref wordList, m);
    // getWordFre re-runs getDic on outputPath before sorting wordList by frequency.
    ArrayList keysList = Wordlist.getWordFre(outputPath, ref wordList);

    // Without the -n flag, fall back to the default of 10 words.
    if (wordsOutNumFla != 1)
    {
        wordsOutNum = 10;
    }
    // Never index past the end when there are fewer distinct words.
    int count = Math.Min(wordsOutNum, keysList.Count);

    // The using block flushes and closes the file, even on an exception.
    using (StreamWriter sw = new StreamWriter(outputPath))
    {
        sw.WriteLine("characters:{0}", characters);
        sw.WriteLine("words:{0}", words);
        sw.WriteLine("lines:{0}", lines);
        for (int i = 0; i < count; i++)
        {
            sw.WriteLine("<{0}>:{1}", keysList[i], wordList[keysList[i]]);
        }
        sw.WriteLine("Phrases of length {0}:\n", m);
        foreach (string j in keysList1)
        {
            sw.WriteLine("<{0}>:{1}", j, 1);  // every phrase is written with a count of 1
        }
    }
}
```
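Writing the file is fairly simple, but one detail matters: after opening a file you must close it, otherwise opening it a second time to append to it raises an error. I originally wrote the file in two passes and forgot to close it after the first one, which caused exactly this error, so it is worth remembering.
This method receives the total number of characters, words, and lines to write, the word frequencies, and the flag wordsOutNumFla that goes with the number of most frequent words to output. wordsOutNumFla decides whether to output the default ten most frequent words or the number given after the -n argument.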
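To illustrate the point about closing the file between two writes, here is a minimal sketch that writes in two passes, with each writer wrapped in a `using` block so it is closed automatically. The class `WriteSketch`, the method `WriteReport`, and the temporary file name are assumptions made for the example; only the output format of the three lines follows the write method above.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the getFile class from the post.
public static class WriteSketch
{
    public static void WriteReport(string path, int characters, int words, int lines)
    {
        // First pass: create or overwrite the file. Leaving the using block
        // flushes and closes the writer, even if an exception is thrown.
        using (var sw = new StreamWriter(path, false))
        {
            sw.WriteLine("characters:{0}", characters);
            sw.WriteLine("words:{0}", words);
        }

        // Second pass: reopen the same file in append mode. This works
        // because the first writer has already released the file.
        using (var sw = new StreamWriter(path, true))
        {
            sw.WriteLine("lines:{0}", lines);
        }
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "write_sketch.txt");
        WriteReport(path, 120, 20, 3);
        Console.Write(File.ReadAllText(path));
    }
}
```

If the first writer were left open, the second `new StreamWriter(path, true)` would typically fail because the file is still in use, which matches the error described above.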
```csharp
[TestMethod]
public void getHangNum()
{
    string input_path = "C:/Users/羅偉誠/Desktop/input.txt";
    Hashtable wordList = new Hashtable();
    getFile c = new getFile();
    ArrayList keysList = c.getWordFre(input_path, ref wordList);
    int lines = c.getHangNum(input_path);  // number of lines in the input file
}
```
The test result is as shown in the screenshot above; there were no problems.
```csharp
[TestMethod]
public void getWordNum1()
{
    string input_path = "C:/Users/羅偉誠/Desktop/input.txt";
    Hashtable wordList = new Hashtable();
    getFile c = new getFile();
    ArrayList keysList = c.getWordFre(input_path, ref wordList);
    int words = c.getWordNum(input_path);  // number of words in the input file
}
```

```csharp
[TestMethod]
public void getCharactersNum1()
{
    string input_path = "C:/Users/羅偉誠/Desktop/input.txt";
    Hashtable wordList = new Hashtable();
    getFile c = new getFile();
    ArrayList keysList = c.getWordFre(input_path, ref wordList);
    int characters = c.getCharactersNum(input_path);  // number of characters in the input file
}
```
```csharp
// Fragment of the main program; the flags, paths, and counters are declared before this point.
try
{
    if (inputPathFla == 1 || outputPathFla == 1)
    {
        Hashtable wordList = new Hashtable();
        getFile c = new getFile();
        ArrayList keysList = c.getWordFre(input_path, ref wordList);
        lines = c.getHangNum(input_path);
        words = c.getWordNum(input_path);
        characters = c.getCharactersNum(input_path);
        c.write(out_put, ref wordList, lines, words, characters,
                wordsOutNumFla, wordsOutNum, m, input_path);
        Console.WriteLine("Finished writing the file; see {0}\n", out_put);
    }
    else
    {
        Console.WriteLine("Please use the -i and -o arguments to specify the input and output paths\n");
    }
}
catch (Exception)
{
    Console.WriteLine("Please check that the input path is correct");
}
```
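```csharp
// Fragment of the phrase extraction; line, m, and the list k are defined before this point.
Regex rg = new Regex("[A-Za-z]+");  // match words with a regular expression
MatchCollection mc = rg.Matches(line);

// Slide a window of m consecutive words across the line.
for (int i = 0; i < mc.Count - m + 1; i++)
{
    string mcTmp = "";
    int t = i;
    for (int q = 0; q < m; q++)
    {
        mcTmp += mc[t].Value.ToLower() + " ";  // each word is followed by a space
        t++;
    }
    k.Add(mcTmp);
}
```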
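The sliding-window idea in the fragment above can be tried in isolation with a small self-contained sketch. The names `PhraseSketch` and `GetPhrases` are hypothetical, and unlike the fragment this version joins the words without a trailing space, which is a deliberate simplification rather than the project's behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the getFile class from the post.
public static class PhraseSketch
{
    private static readonly Regex WordPattern = new Regex("[A-Za-z]+");

    // Returns every run of m consecutive words on the line, lower-cased
    // and joined by single spaces.
    public static List<string> GetPhrases(string line, int m)
    {
        var phrases = new List<string>();
        if (m <= 0) return phrases;

        var words = new List<string>();
        foreach (Match match in WordPattern.Matches(line))
            words.Add(match.Value.ToLowerInvariant());

        // Window start i is valid while i + m does not run past the last word.
        for (int i = 0; i + m <= words.Count; i++)
            phrases.Add(string.Join(" ", words.GetRange(i, m)));
        return phrases;
    }

    public static void Main()
    {
        foreach (string p in GetPhrases("Monday Tuesday Wednesday Thursday", 3))
            Console.WriteLine(p);
    }
}
```

For a line with four words and m = 3, this yields two phrases, "monday tuesday wednesday" and "tuesday wednesday thursday"; a line with fewer than m words yields none.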
After this round of pair programming, we summed up the benefits of pair programming.