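concat_ws: joins strings using a specified separator.
For example:
Joining strings:
concat_ws("_", field1, field2) returns the values of the two fields joined by an underscore, i.e. "field1_field2".
Joining array elements:
concat_ws("_", [a,b,c]) returns "a_b_c".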
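The joining behaviour can be sketched in plain Java, without a Spark session. This is only an analogy: `ConcatWsSketch` and `concatWs` are hypothetical names made up for illustration, not Spark APIs. The sketch also assumes the Spark behaviour of skipping null inputs rather than returning null.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class ConcatWsSketch {

    // Hypothetical helper (not a Spark API): joins the non-null parts with the separator.
    public static String concatWs(String separator, List<String> parts) {
        return parts.stream()
                .filter(Objects::nonNull)              // null parts are skipped
                .collect(Collectors.joining(separator));
    }

    public static void main(String[] args) {
        System.out.println(concatWs("_", Arrays.asList("a", "b", "c")));   // a_b_c
        System.out.println(concatWs("_", Arrays.asList("a", null, "c")));  // a_c
    }
}
```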
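collect_set: gathers the values of each group into an array with duplicates removed; it is usually used together with group by.
For example, given the following table T_course: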
id | name | course |
1 | zhang san | Chinese |
2 | zhang san | Math |
3 | zhang san | English |
spark.sql("select name, collect_set(course) as course_set from T_course group by name");
The result is:
name | course_set |
zhang san | [Chinese,Math,English] |
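As a rough plain-Java analogy of what the query above computes, the sketch below groups (name, course) rows by name and keeps the distinct courses. `CollectSetSketch` and `collectSet` are hypothetical names, and the duplicate Math row is added only to show the de-duplication. Note that a `LinkedHashSet` keeps insertion order, whereas Spark makes no guarantee about the order of elements returned by collect_set.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class CollectSetSketch {

    // Hypothetical helper: each row is {name, course}; returns name -> distinct courses.
    public static Map<String, Set<String>> collectSet(String[][] rows) {
        Map<String, Set<String>> grouped = new LinkedHashMap<>();
        for (String[] row : rows) {
            // The set silently drops a course that was already collected for this name.
            grouped.computeIfAbsent(row[0], k -> new LinkedHashSet<>()).add(row[1]);
        }
        return grouped;
    }

    public static void main(String[] args) {
        String[][] rows = {
            {"zhang san", "Chinese"},
            {"zhang san", "Math"},
            {"zhang san", "English"},
            {"zhang san", "Math"}   // duplicate row, added for illustration only
        };
        collectSet(rows).forEach((name, courses) ->
                System.out.println(name + " | " + String.join(",", courses)));
        // prints: zhang san | Chinese,Math,English
    }
}
```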
Here is the code from the cloned license plate (套牌車) project:
import java.sql.Timestamp;
import java.util.Date;
import java.util.List;

import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.ForeachFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.functions;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.util.CollectionAccumulator;

public class TpcCompute2 {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .enableHiveSupport()
                .appName("TpcCompute2")
                .master("local")
                .getOrCreate();
        JavaSparkContext sc = new JavaSparkContext(spark.sparkContext());
        sc.setLogLevel("ERROR");

        // hphm, id, tgsj, lonlat; records are joined with '&'
        // ComputeUDF is the project's own UDF class (not shown here).
        spark.udf().register("getTpc", new ComputeUDF(), DataTypes.StringType);
        spark.sql("use traffic");
        spark.sql("select hphm,concat_ws('&',collect_set(concat_ws('_',id,kk_lon_lat,tgsj))) as concatValue "
                + "from t_cltgxx t where t.tgsj>'2015-01-01 00:00:00' group by hphm").show(false);

        Dataset<Row> cltgxxDF = spark.sql("select hphm,concatValue from "
                + "(select hphm,getTpc(concat_ws('&',collect_set(concat_ws('_',id,kk_lon_lat,tgsj)))) as concatValue "
                + "from t_cltgxx t where t.tgsj>'2015-01-01 00:00:00' group by hphm) "
                + "where concatValue is not null");
        cltgxxDF.show();

        // Create a collection accumulator
        CollectionAccumulator<String> acc = sc.sc().collectionAccumulator();
        cltgxxDF.foreach(new ForeachFunction<Row>() {
            @Override
            public void call(Row row) throws Exception {
                acc.add(row.getAs("concatValue"));
            }
        });

        // The original looped over an undefined variable; the list is now named accValues.
        List<String> accValues = acc.value();
        for (String id : accValues) {
            System.out.println("accValues: " + id);
            Dataset<Row> resultDF = spark.sql("select hphm,clpp,clys,tgsj,kkbh from t_cltgxx where id in ("
                    + id.split("_")[0] + "," + id.split("_")[1] + ")");
            resultDF.show();

            Dataset<Row> resultDF2 = resultDF
                    .withColumn("jsbh", functions.lit(new Date().getTime()))
                    .withColumn("create_time", functions.lit(new Timestamp(new Date().getTime())));
            resultDF2.show();

            resultDF2.write()
                    .format("jdbc")
                    .option("url", "jdbc:mysql://lin01.cniao5.com:3306/traffic?characterEncoding=UTF-8")
                    .option("dbtable", "t_tpc_result")
                    .option("user", "root")
                    .option("password", "123456")
                    .mode(SaveMode.Append)
                    .save();
        }
    }
}
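The nested expression concat_ws('&', collect_set(concat_ws('_', id, kk_lon_lat, tgsj))) packs all pass records of one plate into a single string: fields are separated by '_' and records by '&'. A minimal sketch of how such a string could be unpacked again is shown below. `PassRecordParseSketch`, `parse`, and the sample values (including the longitude/latitude format) are assumptions made for illustration; the project's actual ComputeUDF is not shown in this post and may work differently.

```java
import java.util.ArrayList;
import java.util.List;

public class PassRecordParseSketch {

    // Hypothetical helper: splits "id_lonlat_tgsj&id_lonlat_tgsj&..." into [id, lonlat, tgsj] triples.
    public static List<String[]> parse(String concatValue) {
        List<String[]> records = new ArrayList<>();
        for (String record : concatValue.split("&")) {
            // Limit of 3 keeps everything after the second '_' in the last field.
            records.add(record.split("_", 3));
        }
        return records;
    }

    public static void main(String[] args) {
        // Made-up sample data, assuming lon/lat is stored as "lon,lat".
        String sample = "1_116.40,39.90_2015-01-02 08:00:00&2_121.47,31.23_2015-01-02 08:30:00";
        for (String[] r : parse(sample)) {
            System.out.println("id=" + r[0] + " lonlat=" + r[1] + " tgsj=" + r[2]);
        }
    }
}
```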
Output format of the spark.sql statements: