
python - REGEX Searching in pymongo

Question:

I am attempting to create a search in pymongo using REGEX. After the match, I want the data to be appended to a list in the module. I thought that I had everything set, but no matter what I set for the REGEX it returns 0 results. The code is below:

REGEX = '.*\.com'

def myModule(self, data):
    # after importing everything and setting up the collection function in the DB I call the following:
    cursor = collection.find({'multiple.layers.of.data' : REGEX})
    matches = []
    for x in cursor:
        matches.append(x)
    return matches

This is but one module of three I am using to filter through a huge amount of JSON files that have been stored in a MongoDB. However, no matter how I change the pattern, for example writing it as /.*.com/ in the query or using $regex in mongo, it never finds my data or appends it to the list.

EDIT: Adding in the full code along with what I am trying to identify:

RegEx = '.*\.com' #Or RegEx = re.compile('.*\.com')

def filterData(self, data):
    db = self.client[self.dbName]
    collection = db[self.collectionName]
    cursor = collection.find({'data.item11.sub.level3': {'$regex': RegEx}})
    data = []
    for x in cursor:
        data.append(x)
    return data

I am attempting to parse through JSON data in a MongoDB. The data is structured like so:

"data": {
    "0": {
        "item1": "something",
        "item2": 0,
        "item3": 000,
        "item4": 000000000,
        "item5": 000000000,
        "item6": "0000",
        "item7": 00,
        "item8": "0000",
        "item9": 00,
        "item10": "useful",
        "item11": {
            "0000": {
                "sub": {
                    "level": "letter",
                    "level1": 0000,
                    "level2": 0000000000,
                    "level3": "domain.com"
                },
                "more_data": "words"
            }
        }
    }
}

UPDATE: After further testing it appears as though I need to include all of the layers in the search. Thus, it should look like

collection.find({'data.0.item11.0000.sub.level3': {'$regex': RegEx}}).

However, the "0" can be anything from 1 to 50 and the "0000" is randomly generated. Is there a way to treat these two keys as variables so that the query steps into them no matter what the value is? Each will always be a numeric value.

Answer:

Well, you need to tell mongodb the string should be treated as a regular expression, using the $regex operator:

cursor = collection.find({'multiple.layers.of.data' : {'$regex': REGEX}})

I think simply replacing REGEX = '.*\.com' with import re; REGEX = re.compile('.*\.com') might also work, but I'm not sure (would rely on a specific handling in the pymongo driver).
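To illustrate the difference, here is a minimal sketch that only builds the filter documents, so it runs without a database. The helper name regex_filter is hypothetical and the field path is the placeholder from the question. A plain string value asks for an exact equality match, while $regex or a compiled pattern asks for a pattern match; PyMongo encodes a compiled Python pattern as a BSON regular expression, so both pattern forms should select the same documents.

```python
import re

PATTERN = r'.*\.com'  # raw string keeps the backslash intact


def regex_filter(field, pattern):
    """Build a filter that asks MongoDB for a regex match on `field` (hypothetical helper)."""
    return {field: {'$regex': pattern}}


# Equality filter: matches only documents whose value is literally the text '.*\.com'.
equality_filter = {'multiple.layers.of.data': PATTERN}

# Pattern filters: either form can be passed to collection.find().
operator_filter = regex_filter('multiple.layers.of.data', PATTERN)
compiled_filter = {'multiple.layers.of.data': re.compile(PATTERN)}

# Local sanity check of the pattern itself, using Python's re module.
# $regex is unanchored, which corresponds to re.search rather than re.fullmatch.
sample_matches = bool(re.search(PATTERN, 'domain.com'))
```

Either operator_filter or compiled_filter would then be passed as the first argument to collection.find().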


EDIT:

Regarding the wildcard part of the question: The answer is no.

In a nutshell, values that are unknown should never be used as keys, because it makes querying very inefficient. There are no 'wild card' queries on key names.
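If the stored documents cannot be changed, one fallback is to fetch them and do the matching in Python instead of on the server. The sketch below assumes the layout from the question (data, then an unknown key, then item11, then another unknown key, then sub.level3); the function name and the sample keys are made up for illustration.

```python
import re


def find_level3_matches(document, pattern):
    """Return (entry_key, item_key, value) for each level3 value that matches pattern."""
    regex = re.compile(pattern)
    matches = []
    # Iterate over the unknown keys instead of naming them in a query path.
    for entry_key, entry in document.get('data', {}).items():
        for item_key, item in entry.get('item11', {}).items():
            value = item.get('sub', {}).get('level3')
            if isinstance(value, str) and regex.search(value):
                matches.append((entry_key, item_key, value))
    return matches


# Sample document with made-up keys standing in for the unknown ones.
sample = {
    'data': {
        '0': {'item11': {'4821': {'sub': {'level3': 'domain.com'}}}},
        '1': {'item11': {'9077': {'sub': {'level3': 'example.org'}}}},
    },
}

result = find_level3_matches(sample, r'.*\.com')
```

In practice the function would be called on each document yielded by collection.find(). That reads every document over the network, so it is likely to be slow on a large collection.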

It is better to restructure the database so that values that are unknown are stored as values, not as keys.
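As a rough sketch of what such a restructuring could look like, the function below turns the two levels of unknown keys into lists of subdocuments and stores each former key as an ordinary value. The field names entries, entry_id, items and item_id are assumptions chosen for this example, not anything from the original data.

```python
def restructure(document):
    """Turn the dynamic keys under 'data' and 'item11' into lists of subdocuments."""
    entries = []
    for entry_id, entry in document['data'].items():
        # Copy the fixed fields, then store the former key as a value.
        fixed = {k: v for k, v in entry.items() if k != 'item11'}
        fixed['entry_id'] = entry_id
        fixed['items'] = [
            dict(item, item_id=item_id)
            for item_id, item in entry.get('item11', {}).items()
        ]
        entries.append(fixed)
    return {'entries': entries}


original = {
    'data': {
        '0': {
            'item10': 'useful',
            'item11': {
                '0000': {
                    'sub': {'level': 'letter', 'level3': 'domain.com'},
                    'more_data': 'words',
                },
            },
        },
    },
}

restructured = restructure(original)

# With fixed key names, one dotted path reaches every level3 value.
query = {'entries.items.sub.level3': {'$regex': r'.*\.com'}}
```

Because dot notation steps through arrays of embedded documents, a filter like query no longer needs to know the number or the random identifier in advance.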

See:

MongoDB wildcard in the key of a query

http://groups.google.com/group/mongodb-user/browse_thread/thread/32b00d38d50bd858

https://groups.google.com/forum/#!topic/mongodb-user/TnAQMe-5ZGs
