
apache spark - Scala Function Filtering Field for Only Numbers

Question:

I have the following functions:

def isAllDigits(x: String) = x forall Character.isDigit

def filterNum(x: (Int, String)): Boolean = {
  accumNum.add(1)
  if (isAllDigits(x._2)) false
  else true
}

I am passing in key/value pairs and I want to check that the values are numeric. For some reason it is filtering out:

res10: Array[(Int, String)] = Array((1,18964), (2,39612), (3,1), (4,""), (5,""), (6,""), (7,""), (8,""), (9,1), (10,""))

but allowing this:

res9: Array[(Int, String)] = Array((18,1000.0), (22,23.99), (18,1001.0), (22,23.99), (18,300.0), (22,23.99), (18,300.0), (22,23.99), (18,300.0), (22,23.99))

Does .isDigit only allow doubles? I am also confused as to why, when x is (Int, String), the double/int being passed in is treated as a string.

Edit:

I am using this function in Spark with the following:

val numFilterRDD = numRDD.filter(filterNum)

numRDD.take() example:

res11: Array[(Int, String)] = Array((1,18964), (2,39612), (3,1), (4,""), (5,""), (6,""), (7,""), (8,""), (9,1), (10,""), (11,""), (16,""), (18,1000.0), (19,""), (20,""), (21,""), (22,23.99), (23,""), (24,""), (25,""))

Answer:

The problem is that forall checks each character separately. For a double, the check reaches the decimal point, which by itself is not a digit:

Character.isDigit('.') //false
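To make the per-character behaviour concrete, here is a small sketch that applies the same kind of check to three values shaped like the ones in the question. The object name DigitCheckDemo and the val names are made up for illustration, and `_.isDigit` is used in place of `Character.isDigit`, which should behave the same for these inputs. Note the empty string: forall is vacuously true when there are no characters, which is presumably why the "" rows are treated like the integer rows.

```scala
object DigitCheckDemo {
  // Same idea as the question's helper: true only if every character is a digit.
  def isAllDigits(x: String): Boolean = x.forall(_.isDigit)

  // Every character is a digit, so the check passes.
  val plainInt: Boolean = isAllDigits("18964")

  // The '.' character is not a digit, so the whole check fails.
  val decimal: Boolean = isAllDigits("1000.0")

  // No characters at all: forall has nothing to reject, so it returns true.
  val empty: Boolean = isAllDigits("")
}
```

Since filterNum returns false when isAllDigits is true, the integer and empty values are dropped and the decimal values are kept, which matches the output shown in the question.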

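You might be better off using a regex. String.matches tests the whole string, so no anchors are needed; the decimal part is made optional so that single-digit values such as 1 still match:

x matches """\d+(\.\d+)?"""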
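A regex-based check could be wired into a filter predicate roughly as follows. This is only a sketch under a few assumptions: the goal is taken to be keeping the rows whose value is numeric, the names NumericFilter, isNumeric and keepNumeric are hypothetical, the pattern accepts unsigned integers and simple decimals only (no sign, no exponent), and a local Seq stands in for the RDD so the example runs without Spark.

```scala
object NumericFilter {
  // One or more digits, optionally followed by a decimal point and more digits.
  // String.matches tests the entire string, so ^ and $ are not required.
  private val NumericPattern = """\d+(\.\d+)?"""

  def isNumeric(s: String): Boolean = s.matches(NumericPattern)

  // Predicate that keeps pairs whose value is numeric (assumed intent).
  def keepNumeric(x: (Int, String)): Boolean = isNumeric(x._2)

  // Sample rows shaped like the question's data; a local Seq stands in for the RDD.
  val sample: Seq[(Int, String)] =
    Seq((1, "18964"), (3, "1"), (4, ""), (18, "1000.0"), (22, "23.99"))

  // Integers and decimals are kept; the empty string is dropped.
  val kept: Seq[(Int, String)] = sample.filter(keepNumeric)
}
```

The same predicate should be usable with the RDD from the question, for example numRDD.filter(NumericFilter.keepNumeric). If the opposite is wanted (keeping only the non-numeric rows), negate the predicate.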