Setting Up a Flink Development Environment (Maven)

1. Download the Scala SDK

Download the SDK directly from http://www.scala-lang.org/download/ (for example, https://downloads.lightbend.com/scala/2.12.8/scala-2.12.8.msi).

 

2. Install the Scala plugin for IntelliJ IDEA

Go to File -> Settings -> Plugins, search for Scala, and install the plugin.

 

3. Download Maven from https://maven.apache.org/download.cgi

http://mirrors.shu.edu.cn/apache/maven/maven-3/3.6.0/binaries/apache-maven-3.6.0-bin.zip

 

4. Generate the project

mvn archetype:generate -DarchetypeGroupId=org.apache.flink -DarchetypeArtifactId=flink-quickstart-scala

or

mvn archetype:generate -DarchetypeGroupId=org.apache.flink -DarchetypeArtifactId=flink-quickstart-java -DarchetypeCatalog=https://repository.apache.org/content/repositories/snapshots/ -DarchetypeVersion=1.7-SNAPSHOT

 

5. Scala word-count example

package com.test.s

import org.apache.flink.api.scala._

object WordCount {

  def main(args: Array[String]): Unit = {

    val env = ExecutionEnvironment.getExecutionEnvironment

    // get input data
    val text = env.readTextFile("D:\\git\\test\\pom.xml")

    val counts = text.flatMap { _.toLowerCase.split("\\W+") filter { _.nonEmpty } }
      .map { (_, 1) }
      .groupBy(0)
      .sum(1)

    // counts.writeAsCsv("D:\\git\\test\\output.txt", "\n", " ")
    counts.print()
    env.execute("Socket Window WordCount")

  }
}
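To see what the flatMap / map / groupBy / sum chain computes without starting Flink, the same steps can be sketched on plain Scala collections. This is only an illustration under assumptions: the object name WordCountLocal, the helper names, and the sample input are hypothetical and not part of the original project, and ordinary collections stand in for Flink's DataSet.

```scala
object WordCountLocal {

  // Same tokenization as the Flink job: lower-case the line,
  // split on runs of non-word characters, and drop empty tokens.
  def tokenize(line: String): Seq[String] =
    line.toLowerCase.split("\\W+").filter(_.nonEmpty).toSeq

  // Pair every word with 1, group the pairs by word, and sum the ones.
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines.flatMap(tokenize)
      .map(word => (word, 1))
      .groupBy(_._1)
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample input, used only for illustration.
    val counts = countWords(Seq("Hello Flink", "hello, Scala!"))
    counts.toSeq.sortBy(_._1).foreach { case (word, n) => println(s"$word $n") }
  }
}
```

In the Flink version, groupBy(0) and sum(1) play the role of the last two steps here: they group the (word, 1) tuples on the first field and add up the second.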

 

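  • Running the sample exactly as written may produce the following error:
Exception in thread "main" java.lang.RuntimeException: No new data sinks have been defined since the last execution. The last execution refers to the latest call to 'execute()', 'count()', 'collect()', or 'print()'.
  • According to a referenced article, the cause is that print() already calls execute() internally, so the later env.execute() finds no new sink to run and fails. Commenting out env.execute() resolves the error.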
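The opposite case is worth sketching as well: when the results go to a file through the writeAsCsv call that is commented out in the sample, a sink is defined but nothing runs until env.execute() is called, so the execute() call has to stay. The following is a minimal sketch under assumptions: the object name WordCountToFile, the job name, and reading both paths from the command-line arguments are hypothetical choices, and the Flink Scala dependency from the generated project is assumed to be on the classpath.

```scala
import org.apache.flink.api.scala._

object WordCountToFile {

  def main(args: Array[String]): Unit = {
    // Hypothetical arguments: input file path, then output file path.
    val inputPath = args(0)
    val outputPath = args(1)

    val env = ExecutionEnvironment.getExecutionEnvironment

    val counts = env.readTextFile(inputPath)
      .flatMap { _.toLowerCase.split("\\W+").filter(_.nonEmpty) }
      .map { (_, 1) }
      .groupBy(0)
      .sum(1)

    // writeAsCsv only declares a sink; the job has not run yet.
    counts.writeAsCsv(outputPath, "\n", " ")

    // With a sink declared and no print() call, execute() starts the job.
    env.execute("WordCount to file")
  }
}
```

In short, a job should use either print() alone or a file sink followed by env.execute(), not print() followed by env.execute() with no new sink in between.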