
java - SetUniqueList, HashSet and Set don't remove duplicates from a List of an object

Question:

What I've been trying to do is to sort objects in a List and remove duplicate objects from the same List.

Here is the class of the object

public class Word implements Comparable<Word> {
    private String wordName;
    private int number;

    // There are only simple constructors, getters and setters

    // This compareTo might be irrelevant for this question
    @Override
    public int compareTo(Word word) {
        int compareNumber = ((Word) word).getNumber();
        return compareNumber - this.number;
    }
}

Here is part of the main method

public class CommentEvaluationTester {
    final static private List<String> WordsList = new ArrayList<>();

    public static void main(String[] args) {
        boolean isContained;
        String comment = "";
        // This "comment" actually has a long string value

        for (String word : WordsInDB) {
            // WordsInDB is a List, containing String values
            isContained = comment.toLowerCase().contains(word.toLowerCase());
            if (isContained) {
                WordsList.add(word);
            }
        }

        List WordsListWithNumber = new ArrayList<>();
        for (String word : WordsList) {
            int occurrences = Collections.frequency(WordsList, word);
            Word addWord = new Word(word, occurrences);
            WordsListWithNumber.add(addWord);
        }

        // This might be irrelevant too
        Collections.sort(WordsListWithNumber, new Comparator<Word>() {
            @Override
            public int compare(Word w1, Word w2) {
                return w2.getNumber() - w1.getNumber();
            }
        });

At this stage, "WordsListWithNumber" list contains several instances of "Word", and I've been trying to eliminate duplicates from this List.

I have found several approaches on Stack Overflow.

  1. SetUniqueList

    List<Word> NoDup = SetUniqueList.setUniqueList(WordsListWithNumber);

  2. HashSet

    HashSet hs = new HashSet();
    hs.addAll(WordsListWithNumber);
    WordsListWithNumber.clear();
    WordsListWithNumber.addAll(hs);

  3. Set

    Set<Word> noDupSet = new LinkedHashSet<Word>(WordsListWithNumber);
    List<Word> noDup = new ArrayList<>();
    noDup.addAll(noDupSet);

I've confirmed that all of these methods remove duplicates from a List of "String", but none of them seems to remove duplicates from a List of this class.

I checked the contents of the list like this, but both elements show the same value.

Word testWord = (Word) noDup.get(0);
System.out.println("test1: noDup.get(0) : " + testWord.getWordName() + " , number : " + testWord.getNumber());

testWord = (Word) noDup.get(1);
System.out.println("test2: noDup.get(1) : " + testWord.getWordName() + " , number : " + testWord.getNumber());

I'd appreciate any insight you could give.

P.S.

I realized that the "number" property should have been named "quantity". Some people seem to think this "number" property is something like an ID number, but it actually indicates how many times the same word appears in "WordsList".

I would like to compare by "wordName", not by "number".

Sorry for the confusion; I'm not a native English speaker.

Answer:

If you want to remove duplicates from a List, you need to specify when you consider two items to be duplicates. It is important to specify because in your case there are at least 4 possible interpretations for what it means for word1 and word2 to be duplicates:

  1. word1 == word2.
  2. word1.number == word2.number
  3. word1.wordName.equals(word2.wordName)
  4. word1.number == word2.number && word1.wordName.equals(word2.wordName)

You have indicated that you mean 3.
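
To make the four interpretations concrete, here they are written out as predicates. This is only a sketch: the stripped-down Word below and its two-argument constructor are assumptions (the question only says there are simple constructors), and DuplicateNotions and its constant names are made up for illustration.

```java
import java.util.function.BiPredicate;

// Stand-in for the asker's Word class; the constructor shape is assumed.
class Word {
    final String wordName;
    final int number;

    Word(String wordName, int number) {
        this.wordName = wordName;
        this.number = number;
    }
}

// Hypothetical holder for the four notions of "duplicate".
class DuplicateNotions {
    // 1. Same object in memory.
    static final BiPredicate<Word, Word> SAME_INSTANCE = (a, b) -> a == b;
    // 2. Same count.
    static final BiPredicate<Word, Word> SAME_NUMBER = (a, b) -> a.number == b.number;
    // 3. Same text (the one the question asks for).
    static final BiPredicate<Word, Word> SAME_NAME = (a, b) -> a.wordName.equals(b.wordName);
    // 4. Same count and same text.
    static final BiPredicate<Word, Word> SAME_BOTH = SAME_NUMBER.and(SAME_NAME);

    public static void main(String[] args) {
        Word first = new Word("java", 2);
        Word second = new Word("java", 3);
        System.out.println(SAME_INSTANCE.test(first, second)); // false
        System.out.println(SAME_NUMBER.test(first, second));   // false
        System.out.println(SAME_NAME.test(first, second));     // true
        System.out.println(SAME_BOTH.test(first, second));     // false
    }
}
```

The same two objects are duplicates under one definition and distinct under the other three, which is why a Set cannot guess which one you want.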

The way you specify what you mean by duplicates is by overriding the equals method. You can do that as follows.

@Override
public boolean equals(Object object) {
    return object instanceof Word && ((Word) object).wordName.equals(wordName);
}

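Whenever you override the equals method you must also override hashCode, because hash-based collections such as HashSet and LinkedHashSet use the hash code to pick a bucket before they ever call equals. (Search SO for the full explanation.)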

Here is a possible hashCode method for Word.

@Override
public int hashCode() {
    return wordName.hashCode(); 
}
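
If you want to see the effect of leaving hashCode out, a small experiment along these lines may help. The class names EqualsOnly, EqualsAndHashCode and HashCodeContractDemo are hypothetical and exist only for this sketch; they are not part of the question's code.

```java
import java.util.HashSet;
import java.util.Set;

class HashCodeContractDemo {
    // equals is overridden but hashCode is NOT, so hashCode still comes
    // from Object and has nothing to do with the name.
    static class EqualsOnly {
        final String name;

        EqualsOnly(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof EqualsOnly && ((EqualsOnly) other).name.equals(name);
        }
    }

    // Same idea, with a hashCode that agrees with equals.
    static class EqualsAndHashCode {
        final String name;

        EqualsAndHashCode(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof EqualsAndHashCode
                    && ((EqualsAndHashCode) other).name.equals(name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    // Adds both objects to a HashSet and reports how many it kept.
    static int distinctCount(Object a, Object b) {
        Set<Object> set = new HashSet<>();
        set.add(a);
        set.add(b);
        return set.size();
    }

    public static void main(String[] args) {
        EqualsOnly a = new EqualsOnly("java");
        EqualsOnly b = new EqualsOnly("java");
        System.out.println(a.equals(b)); // true
        // Almost always 2: the hash codes differ, so equals is never consulted.
        System.out.println(distinctCount(a, b));
        // 1: equal objects now report equal hash codes.
        System.out.println(distinctCount(new EqualsAndHashCode("java"),
                                         new EqualsAndHashCode("java")));
    }
}
```

The first count is "almost always" 2 rather than always, because two unrelated objects can in principle end up with the same default hash code.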

If you do this you will find that if list is a List<Word>, you can remove duplicates by writing

list = new ArrayList<Word>(new LinkedHashSet<Word>(list));
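
Putting the pieces together, a self-contained version might look like the sketch below. The constructor and getters are my guess at the "simple" ones mentioned in the question, and RemoveDuplicateWords and withoutDuplicates are names invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Assumed shape of the asker's Word, plus the equals/hashCode pair from above.
class Word {
    private final String wordName;
    private final int number;

    Word(String wordName, int number) {
        this.wordName = wordName;
        this.number = number;
    }

    String getWordName() {
        return wordName;
    }

    int getNumber() {
        return number;
    }

    // Two Words are duplicates when their names match (interpretation 3).
    @Override
    public boolean equals(Object object) {
        return object instanceof Word && ((Word) object).wordName.equals(wordName);
    }

    // Must agree with equals: equal names give equal hash codes.
    @Override
    public int hashCode() {
        return wordName.hashCode();
    }
}

class RemoveDuplicateWords {
    // LinkedHashSet drops duplicates and keeps first-seen order.
    static List<Word> withoutDuplicates(List<Word> words) {
        return new ArrayList<Word>(new LinkedHashSet<Word>(words));
    }

    public static void main(String[] args) {
        List<Word> list = Arrays.asList(
                new Word("java", 2), new Word("set", 1), new Word("java", 2));
        for (Word word : withoutDuplicates(list)) {
            System.out.println(word.getWordName() + " " + word.getNumber());
        }
        // prints "java 2" then "set 1"
    }
}
```

Because LinkedHashSet preserves insertion order, a list that was sorted before deduplication stays sorted afterwards.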

For full details on how to write equals, hashCode and compareTo methods (your compareTo method subtracts two ints, which can overflow and return the wrong sign when the values are far apart), I recommend the book Effective Java by Joshua Bloch.
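
On the compareTo point, here is a rough illustration of the overflow and one way around it using Integer.compare. CompareOverflowDemo and its method names are hypothetical. With non-negative counts like yours the subtraction cannot actually overflow, but the idiom breaks as soon as the values may be negative.

```java
class CompareOverflowDemo {
    // Subtraction-based comparison in the style of the question's compareTo.
    static int bySubtraction(int thisNumber, int otherNumber) {
        return otherNumber - thisNumber;
    }

    // Overflow-safe equivalent that keeps the same descending order.
    static int byIntegerCompare(int thisNumber, int otherNumber) {
        return Integer.compare(otherNumber, thisNumber);
    }

    public static void main(String[] args) {
        int small = -2;
        int large = Integer.MAX_VALUE;
        // large - small wraps around to a negative value: wrong sign.
        System.out.println(bySubtraction(small, large) > 0);    // false
        // Integer.compare never overflows.
        System.out.println(byIntegerCompare(small, large) > 0); // true
    }
}
```

Inside Word, the body of compareTo could then be written as return Integer.compare(word.getNumber(), this.number);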

Good luck!
