reduceByKey Does Not Aggregate in a Spark Program Written with the Java API (Custom Type as the Key)

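When writing Spark with the Java API, if the key of a PairRDD is a custom type, that type must override hashCode and equals. Otherwise, records with the same key value are not aggregated: the default implementations inherited from Object compare by identity, so two separately created objects with identical fields count as different keys.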
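
The effect can be reproduced without a Spark cluster. The following is a minimal sketch, assuming a plain HashMap stands in for the hash-based grouping that reduceByKey performs; KeyAggregationDemo, PlainKey, ValueKey and countByKey are hypothetical names introduced only for this illustration and are not Spark APIs.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical demo class; a HashMap is used here as a stand-in for Spark's grouping.
class KeyAggregationDemo {

    // Key WITHOUT equals/hashCode overrides: falls back to Object identity.
    static final class PlainKey {
        final String name;
        PlainKey(String name) { this.name = name; }
    }

    // Key WITH value-based equals/hashCode.
    static final class ValueKey {
        final String name;
        ValueKey(String name) { this.name = name; }

        @Override
        public boolean equals(Object o) {
            return o instanceof ValueKey && Objects.equals(name, ((ValueKey) o).name);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(name);
        }
    }

    // Adds 1 per occurrence of a key, similar in spirit to
    // mapping each record to (key, 1) and reducing with (a, b) -> a + b.
    static <K> Map<K, Integer> countByKey(List<K> keys) {
        Map<K, Integer> counts = new HashMap<>();
        for (K key : keys) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<PlainKey, Integer> plain =
                countByKey(Arrays.asList(new PlainKey("tom"), new PlainKey("tom")));
        Map<ValueKey, Integer> value =
                countByKey(Arrays.asList(new ValueKey("tom"), new ValueKey("tom")));

        // Identity-based keys stay separate; value-based keys are merged.
        System.out.println("PlainKey entries: " + plain.size());
        System.out.println("ValueKey entries: " + value.size());
    }
}
```

With PlainKey the two "tom" records remain separate entries, while with ValueKey they collapse into a single entry whose count is 2. This is the same behaviour difference that shows up after reduceByKey.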

For example, with the User type as the key:

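// Serializable so that Spark's default Java serializer can ship keys across the shuffle
public class User implements java.io.Serializable {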
	
	private String name;
	private String age;
	public String getName() {
		return name;
	}
	public void setName(String name) {
		this.name = name;
	}
	public String getAge() {
		return age;
	}
	public void setAge(String age) {
		this.age = age;
	}
	@Override
	public int hashCode() {
		final int prime = 31;
		int result = 1;
		result = prime * result + ((age == null) ? 0 : age.hashCode());
		result = prime * result + ((name == null) ? 0 : name.hashCode());
		return result;
	}
	@Override
	public boolean equals(Object obj) {
		if (this == obj)
			return true;
		if (obj == null)
			return false;
		if (getClass() != obj.getClass())
			return false;
		User other = (User) obj;
		if (age == null) {
			if (other.age != null)
				return false;
		} else if (!age.equals(other.age))
			return false;
		if (name == null) {
			if (other.name != null)
				return false;
		} else if (!name.equals(other.name))
			return false;
		return true;
	}
	
}


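Eclipse can usually generate the hashCode and equals methods for a class automatically, so no hand-written implementation is needed.

In special cases, the HashCodeBuilder and EqualsBuilder utility classes from the commons-lang3 package can be used to produce the two methods instead: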

 

package run.aaa.spark;

import org.apache.commons.lang3.builder.EqualsBuilder;
import org.apache.commons.lang3.builder.HashCodeBuilder;

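// Serializable so that Spark's default Java serializer can ship keys across the shuffle
public class User implements java.io.Serializable {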
	
	private String name;
	private String age;
	public String getName() {
		return name;
	}
	public void setName(String name) {
		this.name = name;
	}
	public String getAge() {
		return age;
	}
	public void setAge(String age) {
		this.age = age;
	}
	@Override
	public int hashCode() {
		return HashCodeBuilder.reflectionHashCode(this);
	}
	@Override
	public boolean equals(Object obj) {
		return EqualsBuilder.reflectionEquals(this, obj);
	}
	
}
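
If adding a commons-lang3 dependency is not an option, the JDK's own java.util.Objects helpers can serve a similar purpose. This is a minimal sketch of that alternative, not the approach used in the article; the class name UserKey and its constructor are hypothetical and chosen only to avoid clashing with the User class above.

```java
import java.util.Objects;

// Hypothetical JDK-only variant of the User key; the name UserKey is illustrative.
final class UserKey implements java.io.Serializable {

    private final String name;
    private final String age;

    UserKey(String name, String age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public int hashCode() {
        // Objects.hash accepts null fields, so no explicit null checks are needed.
        return Objects.hash(name, age);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj == null || getClass() != obj.getClass()) {
            return false;
        }
        UserKey other = (UserKey) obj;
        // Objects.equals is null-safe: two nulls are equal, null vs non-null is not.
        return Objects.equals(name, other.name) && Objects.equals(age, other.age);
    }
}
```

Listing the fields explicitly avoids reflection on every call, at the cost of having to update both methods whenever a field is added.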