Spark: timing reduceByKey() versus reduceByKey(func, numPartitions)


import time

# Build a list of (key, value) pairs and parallelize it into an RDD.
# (The original used range(1, 10000000000), which could never be
# materialized as a Python list; a feasible size is used here.)
t = []
for i in range(1, 1000000):
    t.append((i, i))
tsc = sc.parallelize(t)

def fun1(d):
    # Time reduceByKey with the default number of partitions.
    t1 = time.time()
    d.reduceByKey(lambda x, y: x * y)
    t2 = time.time()
    return t2 - t1

def fun2(d):
    # Time reduceByKey with an explicit partition count of 10.
    t1 = time.time()
    d.reduceByKey(lambda x, y: x * y, 10)
    t2 = time.time()
    return t2 - t1
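For reference, reduceByKey merges all values that share a key using the supplied function. A minimal single-machine sketch of that semantics (reduce_by_key here is a hypothetical helper for illustration, not Spark's distributed implementation, which runs per-partition and then shuffles):

```python
def reduce_by_key(pairs, fn):
    # Fold the values of each key together with fn, mirroring what
    # RDD.reduceByKey computes (ignoring partitioning and shuffling).
    acc = {}
    for k, v in pairs:
        acc[k] = fn(acc[k], v) if k in acc else v
    return sorted(acc.items())

print(reduce_by_key([(1, 2), (1, 3), (2, 4)], lambda x, y: x * y))
# → [(1, 6), (2, 4)]
```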


>>> fun1(tsc)
0.033590078353881836
>>> fun2(tsc)
0.03184199333190918

Note that reduceByKey is a transformation, so Spark evaluates it lazily: both calls return almost immediately because they only record the new RDD's lineage. Neither timing includes the shuffle or the reduce work itself, which is why the two numbers are nearly identical regardless of the partition count. To compare the two settings fairly, trigger an action such as count() or collect() inside the timed region.
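One way to time the actual execution is to measure a zero-argument callable that ends in an action. A sketch, where time_action is a hypothetical helper (the commented line shows how it would be used against the RDD tsc inside a PySpark shell):

```python
import time

def time_action(thunk):
    # Time an arbitrary zero-argument callable. To benchmark Spark,
    # pass a lambda that finishes with an action (e.g. count()), so
    # the shuffle actually runs and is included in the measurement.
    t1 = time.time()
    result = thunk()
    return result, time.time() - t1

# Inside a PySpark shell this would look like:
#   _, secs = time_action(lambda: tsc.reduceByKey(lambda x, y: x * y, 10).count())

# Stand-in workload so the sketch runs without Spark:
res, secs = time_action(lambda: sum(range(1000)))
print(res, secs)
```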
