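When we export database data to an Excel file and the data volume is very large, the export may consume a lot of memory and cause an OOM. Even without an OOM, generating the Excel file may take so long that the request times out. This is where POI's SXSSF (org.apache.poi.xssf.streaming) comes in.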
HSSF: Excel 97(-2007) file format
XSSF: Excel 2007 OOXML (.xlsx) file format
HSSF is the POI Project's pure Java implementation of the Excel '97(-2007) file format. XSSF is the POI Project's pure Java implementation of the Excel 2007 OOXML (.xlsx) file format.
HSSF and XSSF provide ways to create, modify, read and write XLS spreadsheets.
Since 3.8-beta3, POI provides a low-memory footprint SXSSF API built on top of XSSF.
SXSSF is an API-compatible streaming extension of XSSF to be used when very large spreadsheets have to be produced, and heap space is limited. SXSSF achieves its low memory footprint by limiting access to the rows that are within a sliding window, while XSSF gives access to all rows in the document. Older rows that are no longer in the window become inaccessible, as they are written to the disk.
In auto-flush mode the size of the access window can be specified, to hold a certain number of rows in memory. When that value is reached, the creation of an additional row causes the row with the lowest index to be removed from the access window and written to disk. Or, the window size can be set to grow dynamically; it can be trimmed periodically by an explicit call to flushRows(int keepRows) as needed.
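To make the window behaviour described above concrete, here is a toy model in plain Java. It is only a sketch of the idea, not POI's actual implementation: the class name RowWindowSketch and all of its methods are hypothetical, and an in-memory list stands in for the temporary file on disk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy model of a row access window (hypothetical; not POI's real code). */
public class RowWindowSketch {
    private final int windowSize; // -1 means "never flush automatically"
    private final TreeMap<Integer, String> window = new TreeMap<>();
    private final List<String> flushed = new ArrayList<>(); // stands in for the temp file

    public RowWindowSketch(int windowSize) {
        this.windowSize = windowSize;
    }

    /** Adds a row; once the window is full, the lowest-index row is written out first. */
    public void createRow(int rowIndex, String content) {
        if (windowSize > 0 && window.size() >= windowSize) {
            flushRows(windowSize - 1);
        }
        window.put(rowIndex, content);
    }

    /** Writes out the oldest rows until at most keepRows remain in memory. */
    public void flushRows(int keepRows) {
        while (window.size() > keepRows) {
            flushed.add(window.pollFirstEntry().getValue());
        }
    }

    /** Only rows still inside the window can be read or modified. */
    public boolean isAccessible(int rowIndex) {
        return window.containsKey(rowIndex);
    }

    public int flushedCount() {
        return flushed.size();
    }

    public static void main(String[] args) {
        RowWindowSketch sheet = new RowWindowSketch(100);
        for (int r = 0; r < 1000; r++) {
            sheet.createRow(r, "row-" + r);
        }
        System.out.println(sheet.isAccessible(0));   // false: row 0 left the window
        System.out.println(sheet.isAccessible(999)); // true: still in memory
        System.out.println(sheet.flushedCount());    // 900
    }
}
```

With a window of 100 rows, memory holds at most 100 rows no matter how many are created. With -1 nothing is written out until flushRows is called explicitly.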
Due to the streaming nature of the implementation, SXSSF has limitations compared to XSSF. Most notably, only the rows inside the window are accessible at any point in time.
So how does SXSSF reduce memory consumption? It writes data to a temporary file, which lowers memory usage and the chance of an OOM error.
// turn off auto-flushing and accumulate all rows in memory
SXSSFWorkbook wb = new SXSSFWorkbook(-1);
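You can also pass -1 to the constructor to turn off automatic flushing to the file and keep all data in memory.
This solves the OOM problem, but all the data still has to be written to a temporary file before the response can be sent, so the request timeout problem remains unsolved.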
The Excel 2007 OOXML (.xlsx) file format is essentially a zip file. We can rename the .xlsx extension to .zip and then unzip it:
$ mv output.xlsx output.zip
$ unzip output.zip -d output
$ tree output/
output/
├── [Content_Types].xml
├── _rels
├── docProps
│   ├── app.xml
│   └── core.xml
└── xl
    ├── _rels
    │   └── workbook.xml.rels
    ├── sharedStrings.xml
    ├── styles.xml
    ├── workbook.xml
    └── worksheets
        └── sheet1.xml

5 directories, 8 files
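We can see that the unzipped Excel file contains the files listed above. styles.xml holds the styles we defined (font, text size, color, alignment and so on), and the worksheets directory holds our data.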
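The same check can be done from Java with nothing but java.util.zip, which is a quick way to confirm that an .xlsx file is an ordinary zip archive. The sketch below is only an illustration: the class XlsxEntryLister and its sampleArchive helper are hypothetical names, and the sample archive contains placeholder content rather than a real workbook.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class XlsxEntryLister {
    /** Returns the entry names of a zip stream in archive order. */
    public static List<String> listEntries(InputStream in) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(in)) {
            for (ZipEntry entry = zis.getNextEntry(); entry != null; entry = zis.getNextEntry()) {
                names.add(entry.getName());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return names;
    }

    /** Hypothetical stand-in archive with placeholder content, used when no file is given. */
    public static byte[] sampleArchive() {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        String[] names = {"[Content_Types].xml", "xl/styles.xml", "xl/worksheets/sheet1.xml"};
        try (ZipOutputStream zos = new ZipOutputStream(bos)) {
            for (String name : names) {
                zos.putNextEntry(new ZipEntry(name));
                zos.write("<placeholder/>".getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Pass the path of a real .xlsx file, or run without arguments to list the sample.
        try (InputStream in = args.length > 0
                ? new FileInputStream(args[0])
                : new ByteArrayInputStream(sampleArchive())) {
            listEntries(in).forEach(System.out::println);
        }
    }
}
```

Run against a real workbook, for example `java XlsxEntryLister.java output.xlsx`, it should print the same paths as the tree listing, one per line.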
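By analysing the data format in detail, we can control the writing of the xlsx file ourselves and write the data directly to the response stream instead of a temporary file, which solves the request timeout problem.
Sample code: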
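// Groovy sketch: sheets, sheetName, responseStream, genHeaderStyle and
// genTemplateFile are defined elsewhere in the application
XSSFWorkbook wb = new XSSFWorkbook()
XSSFCellStyle headerStyle = genHeaderStyle(wb)
sheets.each { sheet ->
    def xssfSheet = wb.createSheet(sheet.name)
    sheet.setXSSFSheet(xssfSheet)
    sheet.setHeaderStyle(headerStyle)
}
// save the still empty workbook (styles, sheet list) as a template file
File template = genTemplateFile(wb)
ZipOutputStream zos = new ZipOutputStream(responseStream)
ZipFile templateZip = new ZipFile(template)
Enumeration<? extends ZipEntry> templateEntries = templateZip.entries()
try {
    byte[] buf = new byte[8192]
    while (templateEntries.hasMoreElements()) {
        // copy all template content to the ZipOutputStream zos
        // except the sheet itself
        ZipEntry entry = templateEntries.nextElement()
        if (entry.getName() != sheetName) {
            zos.putNextEntry(new ZipEntry(entry.getName()))
            InputStream is = templateZip.getInputStream(entry)
            int n
            while ((n = is.read(buf)) != -1) zos.write(buf, 0, n)
            is.close()
        }
    }
    zos.putNextEntry(new ZipEntry(sheetName)) // now the sheet
    OutputStreamWriter sheetOut = new OutputStreamWriter(zos, "UTF-8")
    try {
        sheetOut.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>")
        // the SpreadsheetML namespace is needed for Excel to accept the sheet
        sheetOut.write("<worksheet xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\"><sheetData>")
        // write the content: rows and cells
        sheetOut.write("</sheetData></worksheet>")
    } finally { sheetOut.close() } // also closes zos and finishes the zip
} finally {
    zos.close()
    templateZip.close()
}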
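Here, the template contains index information such as which styles were created and how many sheets there are. This information goes at the front of the ZIP file, and the sheet data comes last.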
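The ordering described above (template entries first, sheet data last) can be sketched in plain Java using only java.util.zip. This is a minimal sketch under several assumptions, not the code from the original project: the class StreamingXlsxSketch, the fixed entry name xl/worksheets/sheet1.xml, and the sampleTemplate helper are hypothetical, every cell is written as an inline string, and cell references are omitted for brevity. In a web application, `out` would be the response stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Sketch: copy the template entries first, then stream the sheet XML last. */
public class StreamingXlsxSketch {
    // Assumed entry name of the single sheet inside the archive.
    public static final String SHEET_ENTRY = "xl/worksheets/sheet1.xml";
    static final String NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

    public static void write(InputStream template, Iterator<List<String>> rows, OutputStream out) {
        try (ZipInputStream tpl = new ZipInputStream(template);
             ZipOutputStream zos = new ZipOutputStream(out)) {
            byte[] buf = new byte[8192];
            for (ZipEntry e = tpl.getNextEntry(); e != null; e = tpl.getNextEntry()) {
                if (e.getName().equals(SHEET_ENTRY)) continue; // written below instead
                zos.putNextEntry(new ZipEntry(e.getName()));
                int n;
                while ((n = tpl.read(buf)) != -1) zos.write(buf, 0, n);
                zos.closeEntry();
            }
            zos.putNextEntry(new ZipEntry(SHEET_ENTRY));
            Writer w = new OutputStreamWriter(zos, StandardCharsets.UTF_8);
            w.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");
            w.write("<worksheet xmlns=\"" + NS + "\"><sheetData>");
            while (rows.hasNext()) w.write(rowXml(rows.next())); // rows are never held together
            w.write("</sheetData></worksheet>");
            w.flush(); // push buffered characters into the entry before closing it
            zos.closeEntry();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    /** Renders one row, with every cell as an XML-escaped inline string. */
    static String rowXml(List<String> cells) {
        StringBuilder sb = new StringBuilder("<row>");
        for (String cell : cells) {
            String text = cell.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
            sb.append("<c t=\"inlineStr\"><is><t>").append(text).append("</t></is></c>");
        }
        return sb.append("</row>").toString();
    }

    /** Hypothetical placeholder template built in memory (not a valid workbook). */
    public static byte[] sampleTemplate() {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bos)) {
            for (String name : Arrays.asList("[Content_Types].xml", "xl/styles.xml", SHEET_ENTRY)) {
                zos.putNextEntry(new ZipEntry(name));
                zos.write("<placeholder/>".getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        List<List<String>> data = Arrays.asList(Arrays.asList("id", "name"), Arrays.asList("1", "a<b"));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        write(new ByteArrayInputStream(sampleTemplate()), data.iterator(), out);
        System.out.println(out.size() + " bytes written");
    }
}
```

Because the rows come from an Iterator, they can be pulled from a database cursor one at a time, and compressed bytes start flowing to the client as soon as the output buffers fill, long before the last row has been read.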
Original post on my blog: blog.yu000hong.com/2018/07/24/…