Implementing a Binary Tree in C: Counting Words with a Binary Tree

Date: 2019-11-20

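Yesterday I took Tencent's 2015 online mock exam.

The first of its four major questions asked for the design of a word-counting program.

To remember the day, I decided to implement it in code today.

The core data structure I will use is a binary tree.

(If you want to see how a simple binary tree is implemented, see my other article: http://www.cnblogs.com/landpack/p/4783120.html)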

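Problem

The words I need to count are hard-coded directly in the program.

The reason is to leave out the confusion that file input and output would bring.

Each of my articles usually covers only one topic.

That also makes it easier for me to review later.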

Solution

First, we need to define a struct, as shown in the following code:

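// Buffer size for one word (31 characters plus '\0').
// An enum constant is used because a const int is not a valid array size for a struct member in C.
enum { LONGEST_WORD = 32 };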

struct binary_tree {
    char str[LONGEST_WORD];
    int count;
    struct binary_tree * left;
    struct binary_tree * right;
};

typedef struct binary_tree node;

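Note that the longest word length is fixed as a constant; I think 32 should be enough for now.

If the text to be counted is a chemistry paper, I suggest increasing the number, because chemical formulas are usually long.

Then comes our struct, which should be easy to understand.

Because C does not provide the BOOL type I want, I wrote the code below myself.

This definition is very useful, and it is usually preferable to #define.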

enum BOOL {
    NO,
    YES
};  

typedef enum BOOL BOOL;
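
As an aside, compilers that support C99 or later also ship the standard header stdbool.h, which provides bool, true and false. The following is only a minimal sketch of that alternative; the helper name is_empty_word is hypothetical and does not appear elsewhere in this article.

```c
#include <stdbool.h>  /* C99 and later: bool, true, false */

/* Hypothetical helper: reports whether a word is the empty string. */
bool is_empty_word(const char *word)
{
    return word[0] == '\0';
}
```

Calling is_empty_word("") yields true and is_empty_word("hey") yields false. The hand-written enum above works just as well and keeps the YES/NO naming used in the rest of the code.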

Next, we need to know how two words are compared.

So we need a function called cmp.

It is implemented as follows:

BOOL cmp(char * s, char * t)
{
    int i;
    for (i = 0; s[i] == t[i]; i++)
        if ( s[i] == '\0' )
            return NO;
    return (s[i] - t[i]) < 0 ? NO:YES;
}

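It walks both strings at the same time and then post-processes the difference at the first mismatch.

This way only two outcomes are returned: YES when s sorts after t, and NO when s sorts before t or the two are equal. Otherwise there would be three kinds of values (negative, 0, positive).

That would be inconvenient for the work that follows.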
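
To make the two-valued contract concrete, here is a sketch that builds the same YES/NO answer on top of the standard strcmp, which returns the three-way result. The name cmp_with_strcmp is hypothetical, and the sketch assumes plain ASCII words, where it agrees with cmp above.

```c
#include <string.h>

enum BOOL { NO, YES };
typedef enum BOOL BOOL;

/* Hypothetical equivalent of cmp: YES only when s sorts after t.
   strcmp returns a negative value, 0, or a positive value;
   the first two cases are both folded into NO. */
BOOL cmp_with_strcmp(const char *s, const char *t)
{
    return strcmp(s, t) > 0 ? YES : NO;
}
```

Because "sorts before" and "equal" both map to NO, a caller that needs to tell them apart has to call the function a second time with the arguments swapped: if both calls return NO, the two words are equal.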

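Next, what should we do when cmp returns YES?

And what should we do when it returns NO?

For that we need an insert function, which places a word into the left or the right subtree depending on the comparison: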

void insert(node ** tree, char * val) {
    node * temp = NULL;
    if(!(*tree)) {
        temp = (node*)malloc(sizeof(node));
        temp->left = temp->right = NULL;
        temp->str = val;    //issue code ...
        temp->count = 1;
        *tree = temp;
        return ;
    }

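    if(cmp(val, (*tree)->str)) {            // val sorts after this node
        insert(&(*tree)->left,val);
    }else if (cmp((*tree)->str, val)) {     // val sorts before this node (arguments swapped)
        insert(&(*tree)->right,val);
    }else{                                  // same word: count it again
        (*tree)->count++;
    }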
}

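This code is almost identical to the code in the article mentioned earlier (the C binary tree implementation), which explains it in detail.

Here I mainly want to discuss the line commented "issue code": unless that line is changed, the program will not even compile.

However, I will deliberately not fix it right away and keep writing.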

We need a function that destroys the nodes:

void deltree(node * tree) {
    if(tree) {
        deltree(tree->left);
        deltree(tree->right);
        free(tree);
    }
}
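
After deltree returns, the caller's root pointer still holds the old address. As a sketch only, the variant below takes the address of the pointer and resets it to NULL once the nodes are freed. The name deltree_and_reset is hypothetical, and LONGEST_WORD is declared as an enum constant here so that the snippet compiles on its own.

```c
#include <stdlib.h>

enum { LONGEST_WORD = 32 };   /* enum constant so the array size is valid C */

struct binary_tree {
    char str[LONGEST_WORD];
    int count;
    struct binary_tree *left;
    struct binary_tree *right;
};
typedef struct binary_tree node;

/* Hypothetical variant of deltree: frees children first (post-order),
   then the node itself, and finally clears the caller's pointer. */
void deltree_and_reset(node **tree)
{
    if (tree == NULL || *tree == NULL)
        return;
    deltree_and_reset(&(*tree)->left);
    deltree_and_reset(&(*tree)->right);
    free(*tree);
    *tree = NULL;
}
```

With this variant, calling deltree_and_reset(&root) a second time is harmless, because root is already NULL.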

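To see our result, we need a way to traverse the tree.

Let's choose in-order traversal here! Because insert puts the larger word into the left subtree, this prints the words in descending order.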

void print_inorder(node * tree) {
    if(tree) {
        print_inorder(tree->left);
        printf("[%s\t\t\t]count:[%d]\n",tree->str,tree->count);
        print_inorder(tree->right);
    }
}
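
The same left, node, right recursion can compute values instead of printing them. As a sketch, the hypothetical helper total_words below adds up every count field, giving the total number of words inserted. LONGEST_WORD is declared as an enum constant here so that the snippet compiles on its own.

```c
#include <stddef.h>

enum { LONGEST_WORD = 32 };   /* enum constant so the array size is valid C */

struct binary_tree {
    char str[LONGEST_WORD];
    int count;
    struct binary_tree *left;
    struct binary_tree *right;
};
typedef struct binary_tree node;

/* Hypothetical helper: sums the count of every node with an in-order walk.
   An empty tree contributes 0. */
int total_words(const node *tree)
{
    if (tree == NULL)
        return 0;
    return total_words(tree->left) + tree->count + total_words(tree->right);
}
```

For a tree holding "hello" twice, "hey" four times and "ok" once, total_words returns 7.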

After including the header files stdio.h and stdlib.h, we write the main function:

int main(int argc, char ** argv)
{
    node * root;
    node * tmp;
    //int i;

    root = NULL;
    /* Inserting nodes into tree */
    insert(&root,"hello");
    insert(&root,"hey");
    insert(&root,"hello");
    insert(&root,"ok");
    insert(&root,"hey");
    insert(&root,"hey");
    insert(&root,"hey");

    printf("In Order Display\n");
    print_inorder(root);

    /* Deleting all nodes of tree */
    deltree(root);
}

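Building it with gcc gives the following result:

Sure enough, the issue code is the problem: str is a char array, and unlike a value of a type such as int, an array cannot be assigned with '='.

So we need a copy function: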

void mystrcpy(char *s, char *t)
{
    while ((*s++ = *t++) != '\0')
        ;
}
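
Note that mystrcpy trusts the caller: a word of LONGEST_WORD characters or more would write past the end of str. The following is a sketch of a bounded copy under that concern; the name mystrcpy_bounded and its size parameter are hypothetical, not part of the original program.

```c
#include <stddef.h>

/* Hypothetical bounded variant of mystrcpy: copies at most size - 1
   characters from src and always terminates dst when size > 0.
   Longer input is silently truncated. */
void mystrcpy_bounded(char *dst, const char *src, size_t size)
{
    size_t i;

    if (size == 0)
        return;
    for (i = 0; i + 1 < size && src[i] != '\0'; i++)
        dst[i] = src[i];
    dst[i] = '\0';
}
```

Inside insert it could be called as mystrcpy_bounded(temp->str, val, sizeof temp->str). Truncation is only one possible policy; rejecting over-long words would be another.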

The complete code is as follows:

#include <stdio.h>
#include <stdlib.h>


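// Buffer size for one word (31 characters plus '\0').
// An enum constant is used because a const int is not a valid array size for a struct member in C.
enum { LONGEST_WORD = 32 };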

struct binary_tree {
    char str[LONGEST_WORD];
    int count;
    struct binary_tree * left;
    struct binary_tree * right;
};

typedef struct binary_tree node;

enum BOOL {
    NO, 
    YES 
};

typedef enum BOOL BOOL;

BOOL cmp(char * s, char * t)
{
    int i;
    for (i = 0; s[i] == t[i]; i++)
        if ( s[i] == '\0' )
            return NO; 
    return (s[i] - t[i]) < 0 ? NO:YES;
}
void mystrcpy(char *s, char *t) 
{
    while ((*s++ = *t++) != '\0')
        ;   
}

void insert(node ** tree, char * val) {
    node * temp = NULL;
    if(!(*tree)) {
        temp = (node*)malloc(sizeof(node));
        temp->left = temp->right = NULL;
        //temp->str = val;  //issue code ...
        mystrcpy(temp->str,val);
        temp->count = 1;
        *tree = temp;
        return ;
    }

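    if(cmp(val, (*tree)->str)) {            // val sorts after this node
        insert(&(*tree)->left,val);
    }else if (cmp((*tree)->str, val)) {     // val sorts before this node (arguments swapped)
        insert(&(*tree)->right,val);
    }else{                                  // same word: count it again
        (*tree)->count++;
    }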
}

void deltree(node * tree) {
    if(tree) {
        deltree(tree->left);
        deltree(tree->right);
        free(tree);
    }
}

void print_inorder(node * tree) {
    if(tree) {
        print_inorder(tree->left);
        printf("[%s\t\t\t]count:[%d]\n",tree->str,tree->count);
        print_inorder(tree->right);
    }
}




int main(int argc, char ** argv)
{
    node * root;
    node * tmp;
    //int i;

    root = NULL;
    /* Inserting nodes into tree */
    insert(&root,"hello");
    insert(&root,"hey");
    insert(&root,"hello");
    insert(&root,"ok");
    insert(&root,"hey");
    insert(&root,"hey");
    insert(&root,"hey");


    printf("In Order Display\n");
    print_inorder(root);


    /* Deleting all nodes of tree */

    deltree(root);
}

The final output is as follows:

Discussion

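So this program is now complete!

There is still a lot that could be optimized, and more features could be added.

For example, looking up how many times a specific string appears.

Or finding the line numbers on which a specific string appears, and so on.

We will improve it gradually in the future.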
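
As a starting point for the first idea, here is a sketch of a lookup that returns the stored count of one word. The name find_count is hypothetical. It assumes the same ordering that insert uses (words that sort after a node go to its left subtree), and LONGEST_WORD is declared as an enum constant so that the snippet compiles on its own.

```c
#include <stddef.h>

enum { LONGEST_WORD = 32 };   /* enum constant so the array size is valid C */

struct binary_tree {
    char str[LONGEST_WORD];
    int count;
    struct binary_tree *left;
    struct binary_tree *right;
};
typedef struct binary_tree node;

enum BOOL { NO, YES };
typedef enum BOOL BOOL;

/* Same contract as the article's cmp: YES only when s sorts after t. */
BOOL cmp(const char *s, const char *t)
{
    int i;
    for (i = 0; s[i] == t[i]; i++)
        if (s[i] == '\0')
            return NO;
    return (s[i] - t[i]) < 0 ? NO : YES;
}

/* Hypothetical lookup: returns the count stored for word, or 0 if absent. */
int find_count(const node *tree, const char *word)
{
    while (tree != NULL) {
        if (cmp(word, tree->str))
            tree = tree->left;       /* word sorts after this node */
        else if (cmp(tree->str, word))
            tree = tree->right;      /* word sorts before this node */
        else
            return tree->count;      /* neither is larger: exact match */
    }
    return 0;
}
```

After the insert calls in main, find_count(root, "hey") would return 4, and a word that was never inserted returns 0. Tracking line numbers would need extra per-node storage, which this sketch does not attempt.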