Some optimization techniques for embedded C code

Compiler optimization options

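-flto (link-time optimization): at compile time GCC emits GIMPLE, and at link time the whole program is optimized as if it were a single compilation unit. This requires the GCC linker plugin, and the option must also be specified when building library files. If a library was built with this option, it can take part in the optimization when that library is linked.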
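-On (optimization level): there are seven levels: 0, g, 1, 2, 3, s and fast. Level 0 does no optimization. fast is the most aggressive, but it may cause problems because it also enables optimizations that are not valid for all standard-compliant programs. Between 3 and s you may need to experiment to find the one that gives acceptable results. The g level is suitable for debugging, so it can serve as a first check for errors that would appear at higher optimization levels.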
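-ffunction-sections and -fdata-sections, combined with the linker option --gc-sections, remove unused functions and data. These options are not enabled by any -On level, so they have to be passed explicitly. Also note: with avr-gcc, -fdata-sections causes problems when the variable attributes io and address are used.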
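Function attributes: for example, __attribute__((flatten)) inlines the calls made inside the function; __attribute__((always_inline)) inlines the function into the functions that call it; __attribute__((optimize("Os"))) sets the optimization level for a single function. This level can differ from the project-wide optimization option, which avoids errors that only appear at a particular optimization level.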
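The three attributes above can be sketched as follows. This is only an illustration: the helper names (swap_nibbles, add_checksum, checksum, checksum_small) are hypothetical and do not come from the original project.

```c
#include <stdint.h>

/* Hypothetical helper: always_inline asks GCC to inline it into every
 * caller, even when optimization is otherwise disabled. */
static inline __attribute__((always_inline)) uint8_t swap_nibbles(uint8_t v) {
    return (uint8_t)((v << 4) | (v >> 4));
}

/* Ordinary helper with no inlining hint of its own. */
static uint8_t add_checksum(uint8_t sum, uint8_t v) {
    return (uint8_t)(sum + v);
}

/* flatten: GCC tries to inline every call made inside this function body,
 * including the call to add_checksum. */
__attribute__((flatten)) uint8_t checksum(const uint8_t *buf, uint8_t len) {
    uint8_t sum = 0;
    for (uint8_t i = 0; i < len; i++) {
        sum = add_checksum(sum, swap_nibbles(buf[i]));
    }
    return sum;
}

/* optimize("Os"): this one function is compiled for size, whatever -On
 * level the rest of the project uses. */
__attribute__((optimize("Os"))) uint8_t checksum_small(const uint8_t *buf, uint8_t len) {
    uint8_t sum = 0;
    while (len--) {
        sum = add_checksum(sum, *buf++);
    }
    return sum;
}
```

Whether each attribute actually pays off depends on the target and compiler version, so it is worth comparing the generated assembly (for example with -S) before and after.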

Code-level optimization

1. Macros: using macros together with -On and -flto can give good optimization results at compile time, but not always. For example:

// The macro version optimizes worse than the static table below.
#define sprtbl ((uint8_t[][2]){{4, 2}, ..., {3, 128}})

// static uint8_t sprtbl[][2] = {
//     {4, 2}, {0, 4}, {5, 8}, {1, 16}, {6, 32}, {2, 64}, {3, 128},
// };

static inline uint8_t spr(uint32_t baud) {
    uint8_t m = F_CPU / baud;
    for (int i = 0; i < 7; i++) {
        if (m < sprtbl[i][1]) {
            return sprtbl[i][0];
        }
    }
    return 3;
}

2. As in the code above, defining a function as inline can give the same optimization effect as a macro.
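A minimal, self-contained sketch of the static-table variant with an inline function is shown below. The table values are the ones from the commented-out table above; the const qualifier on the table, the F_CPU value of 16 MHz, and the wrapper spr_for_1mhz are assumptions added for illustration.

```c
#include <stdint.h>

#ifndef F_CPU
#define F_CPU 16000000UL /* assumed clock frequency for this sketch */
#endif

/* {return value, upper bound} pairs, copied from the static table in the text. */
static const uint8_t sprtbl[][2] = {
    {4, 2}, {0, 4}, {5, 8}, {1, 16}, {6, 32}, {2, 64}, {3, 128},
};

static inline uint8_t spr(uint32_t baud) {
    /* Like the original, the ratio is narrowed to 8 bits,
     * so this assumes F_CPU / baud < 256. */
    const uint8_t m = (uint8_t)(F_CPU / baud);
    for (uint8_t i = 0; i < sizeof sprtbl / sizeof sprtbl[0]; i++) {
        if (m < sprtbl[i][1]) {
            return sprtbl[i][0];
        }
    }
    return 3;
}

/* Hypothetical caller: with a constant argument, the compiler has everything
 * it needs to evaluate the lookup at compile time. */
uint8_t spr_for_1mhz(void) {
    return spr(1000000UL);
}
```

Because the function is static inline and the table is const, a call with a constant baud rate gives the optimizer the same information a macro would, while keeping type checking and a single definition.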

3. If you define a structure of I/O pointers, qualifying it with const gives better optimization results. For example:

typedef struct usart_io_s {
    volatile uint8_t* ubrrl;
    volatile uint8_t* ubrrh;
    ... // other io
} usart_io_t;

#define usart0_io { &UBRR0L, ... }
#define usart1_io { &UBRR1L, ... }

volatile struct usart_buffer_s {
    ...
} rx[USART_COUNT] = {}, tx[USART_COUNT] = {};

const struct usart_handle_s {
    usart_io_t                      io;
    volatile struct usart_buffer_s* rx;
    volatile struct usart_buffer_s* tx;
    ... // other fields
} usart[USART_COUNT] = {
    {usart0_io, &rx[0], &tx[0]},
    {usart1_io, &rx[1], &tx[1]},
}; // When rx and tx are pointers, the usart[] structure can be optimized. For example, accesses to the io members can be optimized into I/O instructions.
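The layout above can be tried on a host machine with the sketch below. It is an assumption-laden illustration: the fake_ubrr* variables stand in for the real UBRRnL/UBRRnH registers, the buffer fields (count, data) are invented because the original elides them, and usart_set_divisor and usart_rx_push are hypothetical helpers.

```c
#include <stdint.h>

#define USART_COUNT 2

/* Stand-ins for memory-mapped registers so the sketch runs on a host.
 * Hypothetical: on AVR these would be UBRR0L, UBRR0H, and so on. */
static volatile uint8_t fake_ubrr0l, fake_ubrr0h, fake_ubrr1l, fake_ubrr1h;

typedef struct usart_io_s {
    volatile uint8_t *ubrrl;
    volatile uint8_t *ubrrh;
} usart_io_t;

/* Mutable state lives outside the handle. The fields are hypothetical. */
struct usart_buffer_s {
    uint8_t count;
    uint8_t data[8];
};
static volatile struct usart_buffer_s rx[USART_COUNT], tx[USART_COUNT];

/* The handle holds only addresses, so the whole table can be const. */
static const struct usart_handle_s {
    usart_io_t io;
    volatile struct usart_buffer_s *rx;
    volatile struct usart_buffer_s *tx;
} usart[USART_COUNT] = {
    {{&fake_ubrr0l, &fake_ubrr0h}, &rx[0], &tx[0]},
    {{&fake_ubrr1l, &fake_ubrr1h}, &rx[1], &tx[1]},
};

/* Write a 16-bit divisor through the constant register pointers. */
static inline void usart_set_divisor(uint8_t id, uint16_t divisor) {
    *usart[id].io.ubrrh = (uint8_t)(divisor >> 8);
    *usart[id].io.ubrrl = (uint8_t)(divisor & 0xFF);
}

/* Append one byte to the receive buffer reached through the handle. */
static inline void usart_rx_push(uint8_t id, uint8_t byte) {
    volatile struct usart_buffer_s *b = usart[id].rx;
    if (b->count < sizeof b->data) {
        b->data[b->count] = byte;
        b->count++;
    }
}
```

Keeping the changing data behind pointers is what lets the handle table itself stay const, so a call such as usart_set_divisor(0, ...) with a constant index can be resolved to fixed addresses.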

4. Use const inside functions to define local constants.

[code A] 
void port_init(int id, int io) {
    const port_io_t p = pio[id / 8];
    if (io == 1) {
        *p.ddr |= ...;
    } else {
        *p.ddr &= ...;
    }
}
Compared with the code below, this version can be optimized further:
[code B]
void port_init(int id, int io) {
    if (io == 1) {
        *pio[id / 8].ddr |= ...;
    } else {
        *pio[id / 8].ddr &= ...;
    }
}
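Even if pio in the code above is declared const, A still gets more optimization than B (subject to a precondition; see the code that follows).
For example: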
void main() {
  port_init(PA0, OUTPUT);
  ...
  port_init(PA7, OUTPUT);
}
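If you call port_init fewer than 3 times (avr-gcc 5.4), there is little difference between A and B; with more than 3 calls, the difference between A and B is obvious. (Why 3? It suddenly feels as if the GCC authors are fellow Taoists: "three begets all things".)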
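A host-runnable sketch of variant A follows. The original elides the right-hand side of the |= and &= statements, so the bit mask 1 << (id % 8) used here is an assumption, as are the fake_ddr* stand-in registers and the INPUT/OUTPUT values.

```c
#include <stdint.h>

typedef struct port_io_s {
    volatile uint8_t *ddr;
} port_io_t;

/* Stand-ins for the data direction registers (hypothetical; on AVR these
 * would be DDRA, DDRB, and so on). */
static volatile uint8_t fake_ddra, fake_ddrb;

static const port_io_t pio[] = {
    {&fake_ddra},
    {&fake_ddrb},
};

enum { INPUT = 0, OUTPUT = 1 }; /* assumed encoding */

static inline void port_init(int id, int io) {
    /* Variant A: copy the table entry into a const local once,
     * then use the copy on both branches. */
    const port_io_t p = pio[id / 8];
    const uint8_t mask = (uint8_t)(1u << (id % 8)); /* assumed bit mapping */
    if (io == OUTPUT) {
        *p.ddr |= mask;
    } else {
        *p.ddr &= (uint8_t)~mask;
    }
}
```

To reproduce the comparison from the text, build this and the B variant for the real target with several port_init calls and compare the generated assembly, since the effect depends on the compiler version and the number of call sites.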

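5. Bit-field optimization: bit-fields simplify access to the function bits of I/O registers. A single assignment is equivalent to the three steps of read, modify and write, and the price is of course a data synchronization problem in a multitasking environment. If you access different bit-fields of the same (volatile) register several times, the compiler generates assembly that accesses the register repeatedly. The approach below lets the code be optimized into a single access.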

Define the I/O register as a union, as follows:
    union sreg_t {
        uint8_t i8;
        struct {
            unsigned c : 1; //!< carry flag.
            unsigned z : 1; //!< zero flag.
            unsigned n : 1; //!< negative flag.
            unsigned v : 1; //!< two's complement overflow flag.
            unsigned s : 1; //!< sign bit.
            unsigned h : 1; //!< half carry flag.
            unsigned t : 1; //!< bit copy storage.
            unsigned i : 1; //!< global interrupt enable.
        };
    };
In use:
union sreg_t sreg = {0};
sreg.c = 1;
sreg.i = 1;

// The actual write; SREG refers to the status register
SREG = sreg.i8;
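The same idea can be exercised on a host with the sketch below. It differs slightly from the snippet above: it first reads the current register value into the local copy, so the untouched bits are preserved. The fake_sreg variable is a hypothetical stand-in for SREG, and the expected bit positions assume GCC's bit-field layout on a little-endian target (first field in bit 0), which is implementation-defined.

```c
#include <stdint.h>

union sreg_t {
    uint8_t i8;
    struct {
        unsigned c : 1; /* carry flag */
        unsigned z : 1; /* zero flag */
        unsigned n : 1; /* negative flag */
        unsigned v : 1; /* two's complement overflow flag */
        unsigned s : 1; /* sign bit */
        unsigned h : 1; /* half carry flag */
        unsigned t : 1; /* bit copy storage */
        unsigned i : 1; /* global interrupt enable */
    };
};

/* Hypothetical stand-in for the volatile status register. */
static volatile uint8_t fake_sreg;

void sreg_set_carry_and_irq(void) {
    union sreg_t s;
    s.i8 = fake_sreg; /* one volatile read */
    s.c = 1;          /* these two updates touch only the local copy */
    s.i = 1;
    fake_sreg = s.i8; /* one volatile write */
}
```

Because the local union is not volatile, the compiler is free to merge the two bit-field updates, and the register itself is read once and written once.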

The techniques above apply to GCC.
