Parsing CSV in Java

opencsv

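The two main Java libraries for reading CSV are opencsv and javacsv. On inspection, javacsv was last updated on 2014-12-10 and has not been maintained for a long time. opencsv is released under the Apache 2.0 license and is still maintained, so I decided to use opencsv.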

csv

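A CSV (comma-separated values) file is a plain-text file whose fields are separated by commas by default. When opened in Excel it appears laid out in cells, but that is Excel's own processing. Opening it in Notepad or Sublime Text shows the raw text.
For the examples that follow, here is a test.csv: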

header1,header2,header3
1,a,10
2,b,20
3,c,30
4,d,40
5,e,50
6,f,60

Reading approaches

  • The most basic approach: read line by line, then process each line
CSVReader reader = new CSVReader(new InputStreamReader(new FileInputStream("test.csv"),"gbk"));

      /*
       * Read line by line; readNext() returns null at end of file
       */
      String[] strArr = null;
      while((strArr = reader.readNext())!=null){
          System.out.println(strArr[0]+"---"+strArr[1]+"----"+strArr[2]);
      }

      reader.close();
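
To make clear what each String[] in the loop above represents, here is a minimal sketch that uses only the JDK rather than opencsv. The class NaiveCsvLines and its parse method are hypothetical names invented for this illustration, and the comma split is a deliberate simplification, not how opencsv parses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical JDK-only helper, for illustration; this is not opencsv code.
class NaiveCsvLines {

    /** Splits each non-empty line on commas; the -1 limit keeps trailing empty fields. */
    static List<String[]> parse(String text) {
        List<String[]> rows = new ArrayList<>();
        for (String line : text.split("\\R")) {
            if (line.isEmpty()) {
                continue; // ignore blank lines
            }
            rows.add(line.split(",", -1));
        }
        return rows;
    }

    public static void main(String[] args) {
        String csv = "header1,header2,header3\n1,a,10\n2,b,20\n";
        for (String[] row : parse(csv)) {
            System.out.println(row[0] + "---" + row[1] + "----" + row[2]);
        }
    }
}
```

A plain split like this breaks as soon as a quoted field contains a comma (for example "x,y"), which is one practical reason to leave the parsing to a library such as opencsv.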

Binding a CSV file to beans

Reading line by line is the most basic way to work. opencsv also provides "strategy"-based mapping that binds CSV rows to beans.

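Strategy overview

Take a look at the inheritance hierarchy of the strategies.
[Figure: inheritance hierarchy of the mapping strategies (image not available)]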

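Interface and implementations

  • MappingStrategy 
    • The top-level mapping interface
  • HeaderColumnNameMappingStrategy 
    • Column-name mapping strategy. It reads the first line of the CSV file as the header, e.g. header1,header2,header3, and then calls setHeader1, setHeader2, and setHeader3 on the bean to set the values. This strategy therefore requires column names to match the bean's property names exactly; a column with no matching property is left null and no error is raised. When annotations are used, the name given in the annotation must match the column name in the CSV.
  • ColumnPositionMappingStrategy 
    • Column-position mapping strategy. It has no notion of a header, so every line is read as data, including the first. The columnMapping array lists bean properties: the first entry corresponds to the first CSV column, the second entry to the second column, and so on.
  • HeaderColumnNameTranslateMappingStrategy 
    • Column-header translation strategy. Unlike HeaderColumnNameMappingStrategy, the bean's property names may differ from the CSV column headers; a map specifies the correspondence.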
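
The column-name strategy described above (header name to setter call) can be sketched with plain reflection. This is a simplified illustration under my own assumptions, not opencsv's implementation: the class HeaderNameMappingSketch, the nested Row bean, and the helper setterName are all hypothetical.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of name-based mapping; not opencsv code.
class HeaderNameMappingSketch {

    /** Hypothetical bean with two String properties. */
    public static class Row {
        private String header1;
        private String header2;
        public void setHeader1(String v) { header1 = v; }
        public void setHeader2(String v) { header2 = v; }
        public String getHeader1() { return header1; }
        public String getHeader2() { return header2; }
    }

    /** "header1" becomes "setHeader1". */
    static String setterName(String column) {
        return "set" + Character.toUpperCase(column.charAt(0)) + column.substring(1);
    }

    /** Treats the first line as the header and maps every later line to a Row. */
    static List<Row> map(List<String[]> lines) {
        List<Row> beans = new ArrayList<>();
        String[] header = lines.get(0);
        for (String[] line : lines.subList(1, lines.size())) {
            Row bean = new Row();
            for (int i = 0; i < header.length && i < line.length; i++) {
                if (header[i].isEmpty()) {
                    continue; // unnamed column, nothing to bind
                }
                try {
                    Method setter = Row.class.getMethod(setterName(header[i]), String.class);
                    setter.invoke(bean, line[i]);
                } catch (NoSuchMethodException e) {
                    // No property with this name: skip the column, the field stays null.
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException(e);
                }
            }
            beans.add(bean);
        }
        return beans;
    }

    public static void main(String[] args) {
        List<String[]> lines = Arrays.asList(
                new String[] {"header1", "header2"},
                new String[] {"1", "a"});
        Row first = map(lines).get(0);
        System.out.println(first.getHeader1() + "," + first.getHeader2());
    }
}
```

The silent skip in the NoSuchMethodException branch mirrors the behaviour described above: a mismatched column name produces a null field rather than an error, so typos in headers are easy to miss.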

Mapping in practice

Java POJO class

public class SimpleBeanInfo {

    private String header1;


    private String header2;

    private String header3;

    public String getHeader1() {
        return header1;
    }

    @Override
    public String toString() {
        return "SimpleBeanInfo [header1=" + header1 + ", header2=" + header2
                + ", header3=" + header3 + "]";
    }

    public void setHeader1(String header1) {
        this.header1 = header1;
    }

    public String getHeader2() {
        return header2;
    }

    public void setHeader2(String header2) {
        this.header2 = header2;
    }

    public String getHeader3() {
        return header3;
    }

    public void setHeader3(String header3) {
        this.header3 = header3;
    }
}

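Mapping by column index

  • Put simply, this is column-position mapping: each column position in the CSV file corresponds to a property of the bean

  • Without annotations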

CSVReader reader = new CSVReader(new InputStreamReader(new FileInputStream("test.csv"),"gbk"));
      /*
       * Map rows to a class by column position
       */
       // The first CSV column maps to header1, the second to header2, the third to header3
      String[] columnMapping={"header1","header2","header3"};
      ColumnPositionMappingStrategy<SimpleBeanInfo> mapper = new ColumnPositionMappingStrategy<SimpleBeanInfo>();
      mapper.setColumnMapping(columnMapping);
      mapper.setType(SimpleBeanInfo.class);
      /* */ 
      CsvToBean<SimpleBeanInfo>  csvToBean = new CsvToBean<SimpleBeanInfo>();

      List<SimpleBeanInfo> list = csvToBean.parse(mapper, reader);

      for(SimpleBeanInfo e : list){
          System.out.println(e);
      }

  • With annotations
public class SimpleBeanInfo {

    @CsvBindByPosition(position=0)
    private String header1;


    @CsvBindByPosition(position=1)
    private String header2;

    @CsvBindByPosition(position=2)
    private String header3;
}
CSVReader reader = new CSVReader(new InputStreamReader(new FileInputStream("test.csv"),"gbk"));
 ColumnPositionMappingStrategy<SimpleBeanInfo> mapper = new ColumnPositionMappingStrategy<SimpleBeanInfo>();
      mapper.setType(SimpleBeanInfo.class);
      CsvToBean<SimpleBeanInfo>  csvToBean = new CsvToBean<SimpleBeanInfo>();

      List<SimpleBeanInfo> list = csvToBean.parse(mapper, reader);

      for(SimpleBeanInfo e : list){
          System.out.println(e);
      }
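
Because position-based mapping has no header concept, the header line of test.csv comes back as an ordinary bean. The JDK-only sketch below illustrates that effect; PositionMappingSketch and its skipFirstLine flag are hypothetical and only stand in for the idea. With opencsv itself, the usual remedy is to have the reader skip the first line (recent versions offer CSVReaderBuilder.withSkipLines for this; check the version you use).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of position-based mapping; not opencsv code.
class PositionMappingSketch {

    /**
     * Maps column i of every line to the property named columnMapping[i].
     * Each row is represented as a property-to-value map to keep the sketch short.
     */
    static List<Map<String, String>> map(String[] columnMapping,
                                         List<String[]> lines,
                                         boolean skipFirstLine) {
        List<Map<String, String>> out = new ArrayList<>();
        for (int r = skipFirstLine ? 1 : 0; r < lines.size(); r++) {
            String[] line = lines.get(r);
            Map<String, String> props = new LinkedHashMap<>();
            for (int i = 0; i < columnMapping.length && i < line.length; i++) {
                props.put(columnMapping[i], line[i]);
            }
            out.add(props);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] columnMapping = {"header1", "header2", "header3"};
        List<String[]> lines = Arrays.asList(
                new String[] {"header1", "header2", "header3"},
                new String[] {"1", "a", "10"});
        // Without skipping, the header line is mapped like any data line.
        System.out.println(map(columnMapping, lines, false));
        System.out.println(map(columnMapping, lines, true));
    }
}
```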

Mapping by column name

  • Without annotations
CSVReader reader = new CSVReader(new InputStreamReader(new FileInputStream("test.csv"),"gbk"));

      /* */ 

      HeaderColumnNameMappingStrategy<SimpleBeanInfo> mapper = new
              HeaderColumnNameMappingStrategy<SimpleBeanInfo>();
      mapper.setType(SimpleBeanInfo.class);
      CsvToBean<SimpleBeanInfo>  csvToBean = new CsvToBean<SimpleBeanInfo>();

      List<SimpleBeanInfo> list = csvToBean.parse(mapper, reader);

      for(SimpleBeanInfo e : list){
          System.out.println(e);
      }
  • With annotations
public class SimpleBeanInfo {

    @CsvBindByName(column="header1")
    private String header1;


    @CsvBindByName(column="header2")
    private String header2;

    @CsvBindByName(column="header3")
    private String header3;
}
CSVReader reader = new CSVReader(new InputStreamReader(new FileInputStream("test.csv"),"gbk"));

      HeaderColumnNameMappingStrategy<SimpleBeanInfo> mapper = new
              HeaderColumnNameMappingStrategy<SimpleBeanInfo>();
      mapper.setType(SimpleBeanInfo.class);
      CsvToBean<SimpleBeanInfo>  csvToBean = new CsvToBean<SimpleBeanInfo>();

      List<SimpleBeanInfo> list = csvToBean.parse(mapper, reader);

      for(SimpleBeanInfo e : list){
          System.out.println(e);
      }

Mapping with column-name translation

CSVReader reader = new CSVReader(new InputStreamReader(new FileInputStream("test.csv"),"gbk"));
      /*
       * Map rows to a class by translating column names
      */
      HeaderColumnNameTranslateMappingStrategy<SimpleBeanInfo> mapper = 
              new HeaderColumnNameTranslateMappingStrategy<SimpleBeanInfo>();
      mapper.setType(SimpleBeanInfo.class);

      Map<String,String> columnMapping = new HashMap<String,String>();
      columnMapping.put("header1", "header1");// header1 in the CSV maps to header1 in the bean
      columnMapping.put("header2", "header2");
      columnMapping.put("header3", "header3");
      mapper.setColumnMapping(columnMapping);

      CsvToBean<SimpleBeanInfo>  csvToBean = new CsvToBean<SimpleBeanInfo>();

      List<SimpleBeanInfo> list = csvToBean.parse(mapper, reader);

      for(SimpleBeanInfo e : list){
          System.out.println(e);
      }
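
The example above maps each header to an identically named property, which hides what the translation step buys you. The JDK-only sketch below uses different names on each side. The class HeaderTranslateSketch and the property names id, code, and amount are hypothetical and chosen only for illustration; this is not opencsv's implementation.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of header-to-property translation; not opencsv code.
class HeaderTranslateSketch {

    /** Returns, for each CSV header, the bean property it maps to (null if unmapped). */
    static String[] translate(String[] csvHeader, Map<String, String> columnMapping) {
        String[] properties = new String[csvHeader.length];
        for (int i = 0; i < csvHeader.length; i++) {
            properties[i] = columnMapping.get(csvHeader[i]);
        }
        return properties;
    }

    public static void main(String[] args) {
        Map<String, String> columnMapping = new HashMap<>();
        columnMapping.put("header1", "id");      // CSV column header1 fills property id
        columnMapping.put("header2", "code");    // CSV column header2 fills property code
        columnMapping.put("header3", "amount");  // CSV column header3 fills property amount

        String[] header = {"header1", "header2", "header3"};
        System.out.println(Arrays.toString(translate(header, columnMapping)));
    }
}
```

Once the header row has been translated this way, the rest of the mapping proceeds as in the column-name case, just with the translated property names.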

Filters

opencsv provides filters that can exclude certain lines, such as page headers and page footers.

  • Every filter must implement the CsvToBeanFilter interface
public class MyCsvToBeanFilter implements CsvToBeanFilter {

        public boolean allowLine(String[] line) {
            // Exclude lines whose first column equals "1"
            if("1".equals(line[0])){
                return false;
            }
            return true;
        }

}
MyCsvToBeanFilter filter = new MyCsvToBeanFilter();
 List<SimpleBeanInfo> list = csvToBean.parse(mapper, reader,filter);
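
The same allow/reject decision can be expressed as a predicate over the raw String[] line. The sketch below uses only the JDK and is an assumption-laden illustration: LineFilterSketch, skipBlankAndHeader, and keepAllowed are hypothetical names. The rule it applies (drop blank lines and repeated header lines) could equally be placed inside an allowLine implementation like the one above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of line filtering; not opencsv code.
class LineFilterSketch {

    /** Accepts a line unless it is empty, blank, or starts with the given header name. */
    static Predicate<String[]> skipBlankAndHeader(String firstHeaderName) {
        return line -> line.length > 0
                && !(line.length == 1 && line[0].trim().isEmpty())
                && !firstHeaderName.equals(line[0]);
    }

    /** Keeps only the lines the predicate allows, preserving order. */
    static List<String[]> keepAllowed(List<String[]> lines, Predicate<String[]> allowLine) {
        List<String[]> kept = new ArrayList<>();
        for (String[] line : lines) {
            if (allowLine.test(line)) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String[]> lines = Arrays.asList(
                new String[] {"header1", "header2", "header3"},
                new String[] {"1", "a", "10"},
                new String[] {""},
                new String[] {"2", "b", "20"});
        for (String[] line : keepAllowed(lines, skipBlankAndHeader("header1"))) {
            System.out.println(String.join(",", line));
        }
    }
}
```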

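Converters

Bean properties are not always strings; they may be numbers, dates, and so on. Everything read from a CSV file is a string, however, so in these cases a converter is needed.
Here we define a SimpleBeanFieldConverter that extends AbstractBeanField.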

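public class SimpleBeanFieldConverter extends AbstractBeanField<SimpleBeanInfo> {

    // Called by opencsv with the raw cell text; the return value is assigned to the annotated field.
    @Override
    protected Object convert(String value) throws CsvDataTypeMismatchException,
            CsvRequiredFieldEmptyException, CsvConstraintViolationException {
        Field f = getField();
        // Only the "date" field is handled here; it expects values such as 2016-05-01.
        if("date".equals(f.getName())){
            try {
                return new SimpleDateFormat("yyyy-MM-dd").parse(value);
            } catch (ParseException e) {
                // An unparseable value is logged and the field is left null.
                e.printStackTrace();
            }
        }
        return null;
    }
}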

Add a header4 column to test.csv

header1,header2,header3,header4
1,a,10,2016-05-01
2,b,20,2016-05-02
3,c,30,2016-05-03
4,d,40,2016-05-04
5,e,50,2016-05-05
6,f,60,2016-05-06

Add a property to SimpleBeanInfo

@CsvCustomBindByPosition(position=3,converter=SimpleBeanFieldConverter.class)
private Date date;

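Output
Because ColumnPositionMappingStrategy also parses the header line, the first line triggers an exception whose stack trace is printed, and the converter returns null for that row. In the remaining rows the header4 column has been converted to a date. (The row whose first column is 1 does not appear, presumably because the filter from the previous section was still in place.) What if more than one column needs converting? Add the annotation (@CsvCustomBindByPosition or @CsvCustomBindByName) to each such property, then extend convert(String value) accordingly.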

java.text.ParseException: Unparseable date: "header4"
    at java.text.DateFormat.parse(DateFormat.java:357)
    at test_maven.SimpleBeanFieldConverter.convert(SimpleBeanFieldConverter.java:24)
    at com.opencsv.bean.AbstractBeanField.setFieldValue(AbstractBeanField.java:70)
    at com.opencsv.bean.CsvToBean.processField(CsvToBean.java:245)
    at com.opencsv.bean.CsvToBean.processLine(CsvToBean.java:220)
    at com.opencsv.bean.CsvToBean.processLine(CsvToBean.java:189)
    at com.opencsv.bean.CsvToBean.parse(CsvToBean.java:166)
    at com.opencsv.bean.CsvToBean.parse(CsvToBean.java:133)
    at test_maven.TestCSV.main(TestCSV.java:46)
SimpleBeanInfo [header1=header1, header2=header2, header3=header3, date=null]
SimpleBeanInfo [header1=2, header2=b, header3=20, date=Mon May 02 00:00:00 CST 2016]
SimpleBeanInfo [header1=3, header2=c, header3=30, date=Tue May 03 00:00:00 CST 2016]
SimpleBeanInfo [header1=4, header2=d, header3=40, date=Wed May 04 00:00:00 CST 2016]
SimpleBeanInfo [header1=5, header2=e, header3=50, date=Thu May 05 00:00:00 CST 2016]
SimpleBeanInfo [header1=6, header2=f, header3=60, date=Fri May 06 00:00:00 CST 2016]
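
The advice above about extending convert for several columns can be sketched as a dispatch on the field name. The JDK-only example below is an illustration under stated assumptions, not a drop-in opencsv converter: the class FieldConversionSketch is hypothetical, it assumes header3 should become an Integer and date a java.time.LocalDate, and it reports bad input with an exception instead of returning null.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

// Hypothetical sketch of per-field conversion; not opencsv code.
class FieldConversionSketch {

    /** Converts the raw CSV text according to the name of the target field. */
    static Object convert(String fieldName, String value) {
        if (value == null || value.trim().isEmpty()) {
            return null; // an empty cell becomes null
        }
        String v = value.trim();
        switch (fieldName) {
            case "date":
                try {
                    return LocalDate.parse(v); // ISO format, e.g. 2016-05-01
                } catch (DateTimeParseException e) {
                    throw new IllegalArgumentException("Not a yyyy-MM-dd date: " + v, e);
                }
            case "header3":
                try {
                    return Integer.valueOf(v);
                } catch (NumberFormatException e) {
                    throw new IllegalArgumentException("Not an integer: " + v, e);
                }
            default:
                return v; // every other field stays a string
        }
    }

    public static void main(String[] args) {
        System.out.println(convert("date", "2016-05-01"));
        System.out.println(convert("header3", "10"));
        System.out.println(convert("header1", "a"));
    }
}
```

Throwing on bad input makes a header line that slips through (such as the "header4" value in the trace above) fail loudly, whereas returning null lets it through as a half-filled bean.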