[Digest] Learning from Imbalanced Classes

Date: 2019-12-12

Original article: Learning from Imbalanced Classes

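Class imbalance is a classic problem that comes up often in data mining, computational advertising, NLP, and similar work. The original article summarizes the approaches that may help, and it is worth consulting: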

Do nothing. Sometimes you get lucky and nothing needs to be done. You can train on the so-called natural (or stratified) distribution and sometimes it works without need for modification.

Balance the training set in some way:

Oversample the minority class.

Undersample the majority class.

Synthesize new minority-class examples.

Throw away minority examples and switch to an anomaly detection framework.
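
The resampling options above can be sketched in a few lines of plain Python. This is a minimal illustration, not the original article's code: the helper names (`random_oversample`, `random_undersample`, `smote_like`) and the toy data are assumptions made for this example, and `smote_like` is only a simplified take on SMOTE-style interpolation between nearby minority points.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate random examples until every class matches the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - count):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        pool = [x for x, lab in zip(X, y) if lab == label]
        for x in rng.sample(pool, target):
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out


def smote_like(X_min, n_new, k=3, seed=0):
    """Create n_new points by interpolating between a minority point and
    one of its k nearest minority neighbours (simplified, SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(X_min)
        neighbours = sorted(
            (p for p in X_min if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # position along the segment base -> other
        synthetic.append([a + t * (b - a) for a, b in zip(base, other)])
    return synthetic


# Hypothetical toy data: 10 positives, 90 negatives, one feature each.
X = [[float(i)] for i in range(100)]
y = [1 if i < 10 else 0 for i in range(100)]

X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)
X_min = [x for x, lab in zip(X, y) if lab == 1]
X_syn = smote_like(X_min, n_new=20)
```

Resampling of this kind is normally applied to the training split only, so that the evaluation data keeps the natural class distribution.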

At the algorithm level, or after it:

Adjust the class weight (misclassification costs).

Adjust the decision threshold.

Modify an existing algorithm to be more sensitive to rare classes.

Construct an entirely new algorithm to perform well on imbalanced data.
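
The first two algorithm-level options, class weights and decision thresholds, can be illustrated without any particular learning library. The sketch below is an assumption-laden example rather than the article's method: the function names and the toy scores are invented here, the weighting rule is the common "balanced" heuristic `n_samples / (n_classes * class_count)`, and F1 is used as the selection metric only for illustration.

```python
from collections import Counter


def balanced_class_weights(y):
    """Weight each class inversely to its frequency:
    n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}


def f1_at_threshold(scores, y_true, threshold):
    """F1 of the positive class when predicting 1 for score >= threshold."""
    pairs = list(zip(scores, y_true))
    tp = sum(1 for s, t in pairs if s >= threshold and t == 1)
    fp = sum(1 for s, t in pairs if s >= threshold and t == 0)
    fn = sum(1 for s, t in pairs if s < threshold and t == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(scores, y_true):
    """Pick the observed score that maximizes F1 when used as the cutoff."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda th: f1_at_threshold(scores, y_true, th))


# Hypothetical validation labels and classifier scores (2 positives in 10).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.1, 0.1, 0.15, 0.2, 0.2, 0.25, 0.35, 0.3, 0.45]

weights = balanced_class_weights(y_true)
default_f1 = f1_at_threshold(scores, y_true, 0.5)
threshold = best_threshold(scores, y_true)
tuned_f1 = f1_at_threshold(scores, y_true, threshold)
```

On this toy data the default cutoff of 0.5 predicts no positives at all, while a lower cutoff recovers both. The threshold should be chosen on held-out validation data, not on the test set.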
