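Hello everyone. I have recently been building MiniExcel, a simple and efficient Excel tool for .NET that avoids OOM (out-of-memory) errors.
Most mainstream frameworks load all of the data into memory to make it easy to work with, but this leads to heavy memory consumption. MiniExcel instead writes its low-level logic from a Stream perspective, which can bring memory usage that would otherwise exceed 1,000 MB down to a few MB and avoid running out of memory. It suits scenarios such as a low-spec Azure App Service or reading large files.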
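It supports processing each row of data in real time.

The benchmark compares MiniExcel with mainstream frameworks using Test1,000,000x10.xlsx as the baseline: 10 million "HelloWorld" values in total, with a file size of 23 MB.

The benchmark logic can be viewed in MiniExcel.Benchmarks, and PRs are welcome. To run it: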
dotnet run -p .\benchmarks\MiniExcel.Benchmarks\ -c Release -f netcoreapp3.1 -- -f * --join
Results of the most recent run:

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19042
Intel Core i7-7700 CPU 3.60GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
[Host] : .NET Framework 4.8 (4.8.4341.0), X64 RyuJIT
Job-ZYYABG : .NET Framework 4.8 (4.8.4341.0), X64 RyuJIT
IterationCount=3 LaunchCount=3 WarmupCount=3

Method | Max memory usage | Mean time | Gen 0 | Gen 1 | Gen 2 |
---|---|---|---|---|---|
'MiniExcel QueryFirst' | 0.109 MB | 726.4 us | - | - | - |
'ExcelDataReader QueryFirst' | 15.24 MB | 10,664,238.2 us | 566000.0000 | 1000.0000 | - |
'MiniExcel Query' | 17.3 MB | 14,179,334.8 us | 367000.0000 | 96000.0000 | 7000.0000 |
'ExcelDataReader Query' | 17.3 MB | 22,565,088.7 us | 1210000.0000 | 2000.0000 | - |
'Epplus QueryFirst' | 1,452 MB | 18,198,015.4 us | 535000.0000 | 132000.0000 | 9000.0000 |
'Epplus Query' | 1,451 MB | 23,647,471.1 us | 1451000.0000 | 133000.0000 | 9000.0000 |
'OpenXmlSDK Query' | 1,412 MB | 52,003,270.1 us | 978000.0000 | 353000.0000 | 11000.0000 |
'OpenXmlSDK QueryFirst' | 1,413 MB | 52,348,659.1 us | 978000.0000 | 353000.0000 | 11000.0000 |
'ClosedXml QueryFirst' | 2,158 MB | 66,188,979.6 us | 2156000.0000 | 575000.0000 | 9000.0000 |
'ClosedXml Query' | 2,184 MB | 191,434,126.6 us | 2165000.0000 | 577000.0000 | 10000.0000 |

Method | Max memory usage | Mean time | Gen 0 | Gen 1 | Gen 2 |
---|---|---|---|---|---|
'MiniExcel Create Xlsx' | 15 MB | 11,531,819.8 us | 1020000.0000 | - | - |
'Epplus Create Xlsx' | 1,204 MB | 22,509,717.7 us | 1370000.0000 | 60000.0000 | 30000.0000 |
'OpenXmlSdk Create Xlsx' | 2,621 MB | 42,473,998.9 us | 1370000.0000 | 460000.0000 | 50000.0000 |
'ClosedXml Create Xlsx' | 7,141 MB | 140,939,928.6 us | 5520000.0000 | 1500000.0000 | 80000.0000 |

Install-Package MiniExcel -Version 0.10.0
# or
dotnet add package MiniExcel --version 0.10.0

For more, see the package on NuGet.

Query can return strongly typed IEnumerable data. Using Stream.Query is recommended, as it is relatively more efficient.
public class UserAccount
{
    public Guid ID { get; set; }
    public string Name { get; set; }
    public DateTime BoD { get; set; }
    public int Age { get; set; }
    public bool VIP { get; set; }
    public decimal Points { get; set; }
}

var rows = MiniExcel.Query<UserAccount>(path);

// or

using (var stream = File.OpenRead(path))
{
    // Braces are required here: a declaration cannot be the sole body of a using statement.
    var rows = stream.Query<UserAccount>();
}
Query can also return dynamic IEnumerable data, with columns keyed as A, B, C, D ... Z.
MiniExcel | 1 |
---|---|
Github | 2 |
var rows = MiniExcel.Query(path).ToList();

// or

using (var stream = File.OpenRead(path))
{
    var rows = stream.Query().ToList();

    Assert.Equal("MiniExcel", rows[0].A);
    Assert.Equal(1, rows[0].B);
    Assert.Equal("Github", rows[1].A);
    Assert.Equal(2, rows[1].B);
}
Note: when column names are duplicated, the rightmost column's data takes precedence.
Input Excel:
Column1 | Column2 |
---|---|
MiniExcel | 1 |
Github | 2 |
var rows = MiniExcel.Query(path, useHeaderRow: true).ToList();

// or

using (var stream = File.OpenRead(path))
{
    var rows = stream.Query(useHeaderRow: true).ToList();

    Assert.Equal("MiniExcel", rows[0].Column1);
    Assert.Equal(1, rows[0].Column2);
    Assert.Equal("Github", rows[1].Column1);
    Assert.Equal(2, rows[1].Column2);
}
Query First
var row = MiniExcel.Query(path).First();
Assert.Equal("HelloWorld", row.A);

// or

using (var stream = File.OpenRead(path))
{
    var row = stream.Query().First();
    Assert.Equal("HelloWorld", row.A);
}
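Query returns an IEnumerable with deferred execution, so unless it is really necessary, do not use ToList or similar methods that read all of the data into memory.
(Image: memory usage with and without calling ToList)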
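
To make the deferred-execution point concrete, here is a minimal sketch that does not use MiniExcel at all. `ReadRows` is a hypothetical stand-in for a streaming reader, and `RowsProduced` is a counter added only for illustration. It shows that `First()` pulls a single row from a lazy sequence while `ToList()` pulls every row.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDemo
{
    // Counts how many rows the hypothetical reader has actually produced.
    public static int RowsProduced;

    // Hypothetical stand-in for a streaming reader: rows are produced
    // one at a time, only when the caller asks for the next one.
    public static IEnumerable<string> ReadRows(int total)
    {
        for (var i = 0; i < total; i++)
        {
            RowsProduced++;
            yield return $"row {i}";
        }
    }

    // Resets the counter, lets the caller consume the sequence,
    // and reports how many rows were produced as a result.
    public static int CountProducedBy(Func<IEnumerable<string>, object> consume, int total)
    {
        RowsProduced = 0;
        consume(ReadRows(total));
        return RowsProduced;
    }

    public static void Main()
    {
        Console.WriteLine(CountProducedBy(rows => rows.First(), 1000));  // prints 1
        Console.WriteLine(CountProducedBy(rows => rows.ToList(), 1000)); // prints 1000
    }
}
```

The same reasoning applies to any lazily evaluated sequence: memory stays low only as long as the consumer processes rows one by one instead of materializing them.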
Anonymous or strongly typed:
var path = Path.Combine(Path.GetTempPath(), $"{Guid.NewGuid()}.xlsx");
MiniExcel.SaveAs(path, new[] {
    new { Column1 = "MiniExcel", Column2 = 1 },
    new { Column1 = "Github", Column2 = 2 }
});
DataTable:
var path = Path.Combine(Path.GetTempPath(), $"{Guid.NewGuid()}.xlsx");
var table = new DataTable();
{
    table.Columns.Add("Column1", typeof(string));
    table.Columns.Add("Column2", typeof(decimal));
    table.Rows.Add("MiniExcel", 1);
    table.Rows.Add("Github", 2);
}

MiniExcel.SaveAs(path, table);
Dapper:
using (var connection = GetConnection(connectionString))
{
    var rows = connection.Query(@"select 'MiniExcel' as Column1,1 as Column2 union all select 'Github',2");
    MiniExcel.SaveAs(path, rows);
}
IEnumerable<IDictionary<string, object>>
var values = new List<Dictionary<string, object>>()
{
    new Dictionary<string, object> { { "Column1", "MiniExcel" }, { "Column2", 1 } },
    new Dictionary<string, object> { { "Column1", "Github" }, { "Column2", 2 } }
};
MiniExcel.SaveAs(path, values);
Output:
Column1 | Column2 |
---|---|
MiniExcel | 1 |
Github | 2 |
using (var stream = File.Create(path)) { stream.SaveAs(values); }
For example:
Input Excel:
Test1 | Test2 | Test3 | Test4 | Test5 | Test6 | Column1 | Column2 |
---|---|---|---|---|---|---|---|
Test1 | Test2 | Test3 | Test4 | Test5 | Test6 | Column1 | Column2 |
public class ExcelAttributeDemo
{
    [ExcelColumnName("Column1")]
    public string Test1 { get; set; }
    [ExcelColumnName("Column2")]
    public string Test2 { get; set; }
    [ExcelIgnore]
    public string Test3 { get; set; }
    public string Test4 { get; set; }
    public string Test5 { get; }
    public string Test6 { get; private set; }
}

var rows = MiniExcel.Query<ExcelAttributeDemo>(path).ToList();
Assert.Equal("Column1", rows[0].Test1);
Assert.Equal("Column2", rows[0].Test2);
Assert.Null(rows[0].Test3);
Assert.Equal("Test4", rows[0].Test4);
Assert.Null(rows[0].Test5);
Assert.Null(rows[0].Test6);
Note: do not call ToList, ToArray, or similar methods here; doing so reads all of the data into memory.
using (var connection = new SQLiteConnection(connectionString))
{
    connection.Open();
    using (var transaction = connection.BeginTransaction())
    using (var stream = File.OpenRead(path))
    {
        // Rows are read lazily, so each one is inserted as it is streamed from the file.
        var rows = stream.Query();
        foreach (var row in rows)
            connection.Execute("insert into T (A,B) values (@A,@B)", new { row.A, row.B }, transaction: transaction);
        transaction.Commit();
    }
}
Performance:
public class ExcelController : Controller
{
    public IActionResult Download()
    {
        var values = new[] {
            new { Column1 = "MiniExcel", Column2 = 1 },
            new { Column1 = "Github", Column2 = 2 }
        };
        var stream = new MemoryStream();
        stream.SaveAs(values);
        // Rewind the stream so the response body is sent from the beginning.
        stream.Seek(0, SeekOrigin.Begin);
        return File(stream, "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", "demo.xlsx");
    }
}
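
One detail worth checking in a download action like the one above is the position of the MemoryStream. After a write, `Position` sits at the end of the data, so anything that then reads from the stream gets zero bytes unless it is rewound first. The following sketch uses only `System.IO`; the class name and the payload are illustrative assumptions and have nothing to do with MiniExcel's internals.

```csharp
using System;
using System.IO;
using System.Text;

public static class MemoryStreamRewindDemo
{
    // Writes a small payload, optionally rewinds, then reports how many
    // bytes a reader can get from the current position.
    public static int BytesReadable(bool rewind)
    {
        using (var stream = new MemoryStream())
        {
            var payload = Encoding.UTF8.GetBytes("Column1,Column2"); // 15 bytes
            stream.Write(payload, 0, payload.Length);

            if (rewind)
                stream.Seek(0, SeekOrigin.Begin);

            var buffer = new byte[64];
            return stream.Read(buffer, 0, buffer.Length);
        }
    }

    public static void Main()
    {
        Console.WriteLine(BytesReadable(rewind: false)); // prints 0
        Console.WriteLine(BytesReadable(rewind: true));  // prints 15
    }
}
```

If a downloaded file turns out to be empty, a missing rewind before handing the stream to `File(...)` is a likely cause.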
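By default, MiniExcel decides whether the data is xlsx or csv from the file extension or the Stream type. This detection can be wrong at times, in which case specify the type yourself.
stream.SaveAs(values, excelType: ExcelType.CSV);
// or
stream.SaveAs(values, excelType: ExcelType.XLSX);
// or
stream.Query(excelType: ExcelType.CSV);
// or
stream.Query(excelType: ExcelType.XLSX);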
Query results can also be consumed as IDictionary<string,object> data:
foreach (IDictionary<string, object> row in MiniExcel.Query(path))
{
    //..
}
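
Treating a row as a dictionary is handy when the column names are not known in advance. The sketch below assumes that a dynamic row behaves like an `ExpandoObject`, which implements `IDictionary<string, object>`. That is an assumption about the row type, not something confirmed above, and `MakeRow` and `Describe` are hypothetical helpers written for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

public static class DynamicRowDemo
{
    // Hypothetical row: built by hand here instead of being read from a file.
    public static dynamic MakeRow()
    {
        dynamic row = new ExpandoObject();
        row.A = "MiniExcel";
        row.B = 1;
        return row;
    }

    // Walks every column of a row without knowing the column names up front.
    public static string Describe(IDictionary<string, object> row)
    {
        var parts = new List<string>();
        foreach (var cell in row)
            parts.Add($"{cell.Key}={cell.Value}");
        return string.Join(", ", parts);
    }

    public static void Main()
    {
        var row = MakeRow();
        Console.WriteLine(row.A); // member access through dynamic
        Console.WriteLine(Describe((IDictionary<string, object>)row)); // dictionary access to the same data
    }
}
```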
This is just my humble contribution. GitHub: https://github.com/shps951023/MiniExcel