LeetCode Solution Report: Word Ladder II

Date: 2020-06-08

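I won't restate the problem here; see https://oj.leetcode.com/problems/word-ladder-ii/

I reworked this problem over and over for two and a half days. I tried all sorts of approaches and kept getting TLE. I finally understand why this problem has the lowest acceptance rate on LeetCode: its time limit is extremely strict.

The version just before my accepted (AC) code got, at best, past the test cases with word length 7, and then hit TLE.

So where exactly was the problem? I looked at it from the angles of the algorithm, the STL data structures, and code-level optimization. Unfortunately, even at the end I never figured out why one version gets AC and the other gets TLE. (Both are my own code and my own ideas, which is bizarre...)

Either way, I learned quite a lot from this problem, so here is a summary.

When I first got this problem, the first thing that came to mind was brute-force search: replace each letter of the word in turn, then search onward from each resulting word.

BFS or DFS?

Let me first quote two answers from Stack Overflow.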

That heavily depends on the structure of the search tree and the number and location of solutions. If you know a solution is not far from the root of the tree, a breadth first search (BFS) might be better. If the tree is very deep and solutions are rare, depth first search (DFS) might take an extremely long time, but BFS could be faster. If the tree is very wide, a BFS might need too much memory, so it might be completely impractical. If solutions are frequent but located deep in the tree, BFS could be impractical. If the search tree is very deep you will need to restrict the search depth for depth first search (DFS), anyway (for example with iterative deepening).

--------------------------------------------------------------------------------------

BFS is going to use more memory depending on the branching factor... however, BFS is a complete algorithm... meaning if you are using it to search for something in the lowest depth possible, BFS will give you the optimal solution. BFS space complexity is O(b^d)... the branching factor raised to the depth (can be A LOT of memory).

DFS on the other hand, is much better about space however it may find a suboptimal solution. Meaning, if you are just searching for a path from one vertex to another, you may find the suboptimal solution (and stop there) before you find the real shortest path. DFS space complexity is O(|V|)... meaning that the most memory it can take up is the longest possible path.

They have the same time complexity.

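For this problem it is easy to picture roughly what the DFS/BFS search tree looks like.

If we choose DFS (that is, brute-force recursion in the broad sense):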

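// Brute-force DFS sketch: try every single-letter change and recurse.
void search(string &word, string &end, unordered_set<string> &dict, int level)
{
     // reached the target word
     if(word == end)
         return;

     // a ladder can never be longer than the dictionary
     if(level == (int)dict.size())
         return;

     for(int i = 0; i < (int)word.length(); i++)
     {
           for(char ch = 'a'; ch <= 'z'; ch++)
           {
                    string tmp = word;
                    if(tmp[i] == ch)
                             continue;
                    tmp[i] = ch;
                    // only follow words that exist in the dictionary
                    if(dict.count(tmp) > 0)
                          search(tmp, end, dict, level+1);
           }
     }
}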

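This way, we have to traverse the entire search tree, record every possible solution path, and then compare them to output the shortest ones. Many nodes are repeated, so the time complexity is very high. Some people ask whether we can prune. The answer is that we can't here: if we cut off words that have already been visited, the search becomes incomplete, because a word first reached along a long path may also lie on a shorter one.

So plain brute-force search won't do; it is unbearably inefficient.

Looked at differently, if we connect every pair of adjacent words (words that differ in exactly one letter), we get a graph. And isn't Dijkstra's algorithm the classic graph algorithm for shortest paths?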
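To make the "adjacent words form a graph" idea concrete, here is a minimal sketch of building that graph by single-letter substitution. The helper names `neighbors` and `buildGraph` are hypothetical (they do not appear in the accepted code), and the sketch assumes lowercase words of equal length.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical helper: every word in `dict` that differs from `word`
// in exactly one position.
std::vector<std::string> neighbors(const std::string &word,
                                   const std::unordered_set<std::string> &dict)
{
    std::vector<std::string> result;
    std::string tmp = word;
    for (std::size_t i = 0; i < tmp.size(); ++i) {
        const char original = tmp[i];
        for (char ch = 'a'; ch <= 'z'; ++ch) {
            if (ch == original) continue;
            tmp[i] = ch;
            if (dict.count(tmp) > 0) result.push_back(tmp);
        }
        tmp[i] = original;  // restore before moving to the next position
    }
    return result;
}

// Hypothetical helper: undirected adjacency list over all words in `dict`.
std::unordered_map<std::string, std::vector<std::string>>
buildGraph(const std::unordered_set<std::string> &dict)
{
    std::unordered_map<std::string, std::vector<std::string>> graph;
    for (const std::string &w : dict) graph[w] = neighbors(w, dict);
    return graph;
}
```

Trying 26 letters at each position costs about 26 * L dictionary lookups per word of length L, which is usually far cheaper than comparing every pair of dictionary words.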

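So we could build the graph as an adjacency list and run Dijkstra's algorithm on it, with every edge weight set to 1. That certainly works. The approach is also well engineered and clearly modular, so it is less error-prone. But it costs time to build the graph and then to run Dijkstra; the latter alone is O(n^2) in the basic array-based implementation, so it may still time out, or at the very least it is not the best method.

What about DFS after building the graph? Unfortunately, on an undirected graph with cycles, DFS can only traverse the nodes; forget about finding shortest paths with it. (Note that running DFS on the graph also produces a search tree, but it is completely different from the search tree of the recursive brute-force search above, mainly because DFS on the graph does not give a strict predecessor/successor ordering, which shortest paths require.)

Now, let's look at an example.

The key to solving the problem is how to optimize this graph, both in its data structure and in the algorithm run on it.

Looking at it, we can see that the graph has no edge weights, so Dijkstra's algorithm is unnecessary. A simple BFS is enough, since BFS can find shortest paths in this kind of graph.

As Wikipedia puts it: if all edges have the same length, breadth-first search is optimal; that is, the first solution it finds is guaranteed to have the fewest edges from the root.

So, starting from the source, the path taken the first time we reach the end word is a shortest path. The time complexity is O(|V|+|E|), which is lower than Dijkstra's, especially when there are not many edges.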

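Is that all? Of course not; we can optimize further.

Back to the original question: is this graph good enough? Does it reflect the essence of the problem? By the essence I mean two things. First, does it carry a strict predecessor/successor relation (since we must output all transformation sequences)? Second, is the number of edges too large, and can it be reduced?

In fact, a nearly ideal graph should look like this.

This graph has two obvious features: it is directed, with a clear layered structure, and it has no redundant edges. It describes the structure of the solution perfectly.

So we need a strategy when building the graph. You may ask how I came up with it.

Think of it this way: we apply single-letter changes to a word w and obtain w1, w2, w3, .... The results of this round become the successor nodes of w. Borrowing the idea of BFS, we save these nodes; when the next round starts, we take them out as the new parent nodes and repeat the same steps.

Here we need to split the nodes into layers. The graph above clearly has three layers. No queue is used here, but the idea is the same as with a queue. Because a queue by itself does not expose the layer boundaries, we set up two data structures while building the graph, one for the current layer and one for the next layer, and use them alternately to hold parent nodes and successor nodes.

At the same time, we must make sure that every node in the current layer differs from every node in the layers above it. Imagine that a pot were attached under tot: any path built through it could only be longer than the path through the pot that sits in the same layer as tot. How do we achieve this? We delete all nodes of the upper layers from the dictionary set before the current layer picks its words, so the current layer cannot pick a word that already appeared above. Note that the dictionary must be updated only after the whole current layer has been processed; do not update it as soon as a single word is found. For example, when we obtain dog, we must not delete dog from the dictionary set right away. Otherwise, when hog later generates dog, dog is no longer in the dictionary and the result is incomplete. Put simply, nodes within the same layer may repeat. In the graph above, dog could also be drawn as two nodes, pointed to by dot and hog respectively; I did not draw it that way, for simplicity.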
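The deferred dictionary update can be sketched as a single "expand one layer" step. The names `expandLayer` and `ParentMap` are hypothetical, and the map is stored child-to-parent here purely for illustration; the point is that words are erased from `unvisited` only after the whole layer has been expanded, so dog can be recorded under both dot and hog.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical alias: maps a word to all of its predecessors in the layer above.
using ParentMap =
    std::unordered_map<std::string, std::unordered_set<std::string>>;

// Hypothetical helper: expands every word of `current` by one letter change
// and returns the next layer. Words found are removed from `unvisited` only
// AFTER the whole layer is done, so one word may collect several parents.
std::unordered_set<std::string> expandLayer(
    const std::unordered_set<std::string> &current,
    std::unordered_set<std::string> &unvisited,
    ParentMap &parents)
{
    std::unordered_set<std::string> next;
    for (const std::string &word : current) {
        std::string tmp = word;
        for (std::size_t i = 0; i < tmp.size(); ++i) {
            const char original = tmp[i];
            for (char ch = 'a'; ch <= 'z'; ++ch) {
                if (ch == original) continue;
                tmp[i] = ch;
                if (unvisited.count(tmp) > 0) {
                    next.insert(tmp);
                    parents[tmp].insert(word);
                }
            }
            tmp[i] = original;
        }
    }
    // Deferred update: drop the new layer from the dictionary in one go.
    for (const std::string &w : next) unvisited.erase(w);
    return next;
}
```

Calling this twice, starting from a layer that contains only hot and a dictionary of hop, tot, dot, pot, hog, dog and cog, should yield the two lower layers of the example, with dog recorded under both dot and hog.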

The final data structure should look like this, similar to an adjacency list:

hot---> hop, tot, dot, pot, hog

dot--->dog

hog--->dog, cog

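OK. At this point the problem is basically solved; what remains is generating the paths. That is simple: on this "special" graph we can run DFS directly and return when a node hits the target word.

Is that it? No more optimizations? No, we can still optimize.

When generating paths, if we can search from the bottom up, we avoid useless nodes such as hop, pot and tot, which greatly improves efficiency. It is easy to do: when building the data structure, swap the two nodes of each edge, as shown below.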

dog--->dot, hog

cog--->hog

hop--->hot

tot--->hot

dot--->hot

pot--->hot

hog--->hot

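Put plainly, we just build a reverse adjacency list.

By the way, I haven't mentioned the termination condition of the whole program. If the end word is found, finish searching the current layer and exit. If it is not found, sooner or later the dictionary is emptied or a layer produces no new words, and we exit then.

Enough talk; here is the code.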
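As a small worked example of walking the reverse adjacency list from the end word back to the start word, here is a sketch. The names `ReverseGraph` and `collectPaths` are hypothetical and separate from the accepted code; the graph is passed in ready-made.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical alias: word -> its predecessors in the layer above.
using ReverseGraph = std::unordered_map<std::string, std::vector<std::string>>;

// Hypothetical helper: walks from `word` up to `start` along parent links and
// appends every complete ladder (in start-to-end order) to `out`.
void collectPaths(const ReverseGraph &parents, const std::string &word,
                  const std::string &start, std::vector<std::string> &path,
                  std::vector<std::vector<std::string>> &out)
{
    path.push_back(word);
    if (word == start) {
        // `path` runs end -> start, so store a reversed copy
        out.emplace_back(path.rbegin(), path.rend());
    } else {
        auto it = parents.find(word);
        if (it != parents.end()) {
            for (const std::string &p : it->second)
                collectPaths(parents, p, start, path, out);
        }
    }
    path.pop_back();  // backtrack
}
```

With the reverse list shown above, starting from cog only visits hog and hot, so dead ends such as hop, tot and pot are never touched.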

#include <algorithm>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>
using namespace std;

class Solution {
public:
vector<string> temp_path;
vector<vector<string>> result_path;

// DFS over the reverse adjacency list, walking from the end word back to the
// start word. Each complete path is reversed before it is stored.
void GeneratePath(unordered_map<string, unordered_set<string>> &path, const string &start, const string &end)
{
    temp_path.push_back(start);
    if(start == end)
    {
        vector<string> ret = temp_path;
        reverse(ret.begin(),ret.end());
        result_path.push_back(ret);
        return;
    }

    for(auto it = path[start].begin(); it != path[start].end(); ++it)
    {
            GeneratePath(path, *it, end);
            temp_path.pop_back();
    }
}
vector<vector<string>> findLadders(string start, string end, unordered_set<string> &dict)
{
    temp_path.clear();
    result_path.clear();

    // the two sets that hold the current layer and the next layer
    unordered_set<string> current_step;
    unordered_set<string> next_step;

    // reverse adjacency list: word -> predecessors in the layer above
    unordered_map<string, unordered_set<string>> path;

    unordered_set<string> unvisited = dict;
    
    if(unvisited.count(start) > 0)
        unvisited.erase(start);
    
    current_step.insert(start);

    // expand layer by layer until the end word appears or the dictionary is used up
    while( current_step.count(end) == 0 && unvisited.size() > 0 )
    {
        for(auto pcur = current_step.begin(); pcur != current_step.end(); ++pcur)
        {
            string word = *pcur;

            for(int i = 0; i < (int)start.length(); ++i)
            {
                for(int j = 0; j < 26; j++)
                {
                    string tmp = word;
                    if( tmp[i] == 'a' + j )
                        continue;
                    tmp[i] = 'a' + j;
                    if( unvisited.count(tmp) > 0 )
                    {
                        next_step.insert(tmp);
                        path[tmp].insert(word);
                    }
                }
            }
        }

        // no new word can be reached: no ladder exists
        if(next_step.empty()) break;
        // update the dictionary only after the whole layer has been processed
        for(auto it = next_step.begin() ; it != next_step.end(); ++it)
        {
            unvisited.erase(*it);
        }

        current_step = next_step;
        next_step.clear();
    }
    
    // walk backwards from end to start to collect every shortest ladder
    if(current_step.count(end) > 0)
        GeneratePath(path, end, start);

    return result_path;
}
};

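In addition, here is another version. It is written rather messily, but it uses the traditional queue approach, with two marker variables to detect changes of layer. It also got AC.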

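#include <algorithm>
#include <queue>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>
using namespace std;

class Solution {
public: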
vector<vector<string>> output;
vector<string> cur;

void FindPath(unordered_map<string, unordered_set<string>> &graph, const string &start, const string &end)
{
    cur.push_back(start);
    if(start == end)
    {
        vector<string> ret = cur;
        reverse(ret.begin(),ret.end());
        output.push_back(ret);
        return;
    }

    for(auto it2 = graph[start].begin(); it2 != graph[start].end(); ++it2)
    {
            FindPath(graph, *it2, end);
            cur.pop_back();
    }
}


vector<vector<string>> findLadders(string start, string end, unordered_set<string> & _dict)
{
    unordered_set<string> dict = _dict;
    if(dict.count(start) >0)
        dict.erase(start);

    output.clear();
    cur.clear();
    
    unordered_map<string, unordered_set<string>> graph;
    queue<string> q;
    unordered_map<string, int> depth;
    
    q.push(start);
    depth[start] = 0;
    
    bool found = false;
    
    int cur_deep = 0;
    int pre_deep = 0;

    while(!q.empty())
    {

        string word = q.front();
        q.pop();
        
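        // pre_deep / cur_deep are the two markers: they differ exactly when
        // the word just popped belongs to a new layer
        pre_deep = cur_deep;
        cur_deep = depth[word];

        if(pre_deep != cur_deep)
        {
            // the previous layer is complete; stop if it already reached end
            if(depth.count(end) > 0)
            {
                found = true;
                break;
            }
            // every dictionary word (plus start) has a depth: nothing left to explore
            else if(depth.size() == dict.size() + 1)
                break;
        }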


        for( int i = 0; i < start.length(); ++i)
        {
            for(char ch = 'a'; ch <= 'z'; ch++)
            {
                string tmp = word;
                if(tmp[i] != ch)
                {
                    tmp[i] = ch;
                    
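                    int t = depth.count(tmp);
                    // accept a new dictionary word, or a word already placed in
                    // the next layer (it gains one more predecessor)
                    if((t == 0 && dict.count(tmp) > 0) || (t > 0 && depth[tmp] == cur_deep + 1) )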
                    {
                             graph[tmp].insert(word);
                             if(t == 0)
                             {
                                q.push(tmp);
                                depth[tmp] = cur_deep + 1;
                             }
                    }
                }
            }
        }
    }

    if(found)
    {
        FindPath(graph, end, start);
    }

    return output;
}
};
