File-reading performance comparison (Perl, Python, VBA)

Test sample:

http://files.cnblogs.com/files/metree/TextSample.7z

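Background:

Processing text data is the first step in text mining. This post shows how to read a text file line by line (without printing the lines) in Perl, Python, and VBA. Every snippet below includes timing code, so the main purpose is to compare how fast each language reads text.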

Test setup:

1) Input file: a 94 MB text file.

2) Test environment:

  a) Operating system: Win7 64bit / SSD / i7-4900MQ CPU @ 2.80GHz / 16 GB RAM;

  b) Perl: Strawberry Perl 5.18.2.1-64bit (Cygwin 32bit is also installed on the same machine);

  c) Python: Python 3.3.5;

  d) VBA: Excel 2013 64bit, VBA 7.0 (only one Excel file was kept open during the test, which matters a lot!);

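Test results:

1) Under Strawberry Perl 5.18.2.1-64bit and Cygwin 32bit, reading the same file took 0.162 s and 0.128 s respectively.

2) Under Python 3.3.5, reading the same text took about 0.639 s.

3) Excel VBA, reading the text line by line with an empty loop body, took 2.855 s.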

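From the tests above:

1) Perl under Cygwin read the text fastest, at 0.128 s, and VBA was slowest, at 2.855 s. The two differ by a factor of roughly 20 (2.855 / 0.128 ≈ 22).

2) Perl under Cygwin read the text about 5 times faster than Python 3.3.5 (0.639 / 0.128 ≈ 5).

3) In terms of text-reading speed alone, Perl has a clear advantage.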
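
The ratios quoted above follow directly from the measured times. A minimal sketch that recomputes them is shown here; the dictionary labels and the helper name `slowdown_vs_fastest` are introduced only for illustration, while the timing values are copied from the results above.

```python
# Timings in seconds, copied from the test results above.
timings = {
    "Perl (Strawberry, 64bit)": 0.162,
    "Perl (Cygwin, 32bit)": 0.128,
    "Python 3.3.5": 0.639,
    "Excel VBA": 2.855,
}


def slowdown_vs_fastest(times):
    """Return each entry's time divided by the fastest time (hypothetical helper)."""
    fastest = min(times.values())
    return {name: t / fastest for name, t in times.items()}


ratios = slowdown_vs_fastest(timings)

# Print from fastest to slowest.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    print("{0}: {1:.1f}x".format(name, ratio))
```

With these numbers, Python comes out at about 5.0x and Excel VBA at about 22.3x the Cygwin Perl time, which matches the "about 5 times" and "roughly 20 times" figures.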

 

Core Perl code for reading text line by line

use strict;
use warnings;
use Time::HiRes qw(gettimeofday);

sub Test
{
    # sec: seconds
    # usec: microseconds
    my ($start_sec, $start_usec) = gettimeofday();

    #======================#
    # Place your code here!#
    #======================#
    # Use the low-precedence "or" so that die actually fires when open fails
    # ("||" would bind to the file name string instead).
    open(my $fh, '<', "/home/metree/a/CFGMML-RNC3014-192.168.1.9-20140408040026.txt")
        or die "cannot open the file: $!\n";

    #======================#
    # Read text row by row #
    #======================#
    while (<$fh>)
    {
        # while (<$fh>) already stores each line in $_;
        # reading again inside the loop would skip every other line.
        #print;
    }
    close $fh;

    my ($end_sec, $end_usec) = gettimeofday();

    # Compute the elapsed time in milliseconds
    my $timeDelta = ($end_usec - $start_usec) / 1000 + ($end_sec - $start_sec) * 1000;
    print "$timeDelta ms\n";
}

Test();

1;

Core Python code for reading text line by line

import datetime

starttime = datetime.datetime.now()
#===================#
# do something here #
#===================#
# Raw string, so the backslash in the Windows path is not treated as an escape.
f = open(r"d:\CFGMML-RNC3014-192.168.1.9-20140408040026.txt", "r")
line = f.readline()
while line:
    #print (line)
    line = f.readline()
f.close()  # the parentheses are required; a bare f.close does not close the file
endtime = datetime.datetime.now()
interval = endtime - starttime
print(interval)
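
For comparison, a common alternative in Python is to iterate over the file object directly and to time the loop with `time.perf_counter` (available since Python 3.3). The sketch below is not the code that produced the timings in this post. The function name `count_lines` and the small sample file it writes are hypothetical, used only so that the example runs on its own.

```python
import os
import tempfile
import time


def count_lines(path):
    """Read a file line by line without printing; return (line_count, seconds)."""
    start = time.perf_counter()
    count = 0
    with open(path, "r") as f:  # the file is closed automatically on exit
        for line in f:          # iterating a file object yields one line at a time
            count += 1
    elapsed = time.perf_counter() - start
    return count, elapsed


# Hypothetical small sample file, standing in for the real 94 MB input.
fd, sample_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    for n in range(1000):
        out.write("row {0}\n".format(n))

lines, seconds = count_lines(sample_path)
print("{0} lines in {1:.6f} s".format(lines, seconds))
os.remove(sample_path)
```

To time the real input, pass its path to `count_lines` instead of the sample file. Whether this variant is faster than the `readline()` loop on a given machine would have to be measured.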

Core VBA code for reading text line by line

Sub VBAtextReadline()

    Dim FileToOpenCsv
    Dim Begin
    Dim Over
    Dim fso_SeqCsv
    
    FileToOpenCsv = Application.GetOpenFilename("CSV files (*.*),*.*", 1, "Select the CSV files to import", , True)
    
    If Not IsArray(FileToOpenCsv) Then
        MsgBox "No file selected!"
        Exit Sub
    End If
    
    ' Start timing
    Begin = Timer
    
    Const ForReading = 1
    Const ForWriting = 2
    Const ForAppending = 8        
    i = 0 ' Counts the total number of lines read
    
    Application.ScreenUpdating = False
    Application.DisplayAlerts = False
    
    Set fso_SeqCsv = CreateObject("Scripting.FileSystemObject")
    
    For i_FilesNumCsv = LBound(FileToOpenCsv) To UBound(FileToOpenCsv)        
        Set SeqCsvFiles = fso_SeqCsv.OpenTextFile(FileToOpenCsv(i_FilesNumCsv), ForReading, True, TristateTrue)        
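        ' AtEndOfStream tests for the end of the file; AtEndOfLine only tests for the end of the current line
        Do While Not SeqCsvFiles.AtEndOfStream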
            SeqAlarm_Line = SeqCsvFiles.ReadLine
            'pmSglAlarmAll = Split(SeqAlarm_Line, Chr(44), -1)
            i = i + 1            
        Loop
    Next

    Application.ScreenUpdating = True
    Application.DisplayAlerts = True

    Over = Timer
    MsgBox ("Finished! Total run time: " & Over - Begin & "s." & " " & i)

End Sub