当前位置: 动力学知识库 > 问答 > 编程问答 >

How to Update Multiple Array Elements in mongodb

问题描述:

I have a Mongo document which holds an array of elements.

I'd like to reset the .handled attribute of all objects in the array where .profile = XX.

The document is in the following form:

 {

"_id" : ObjectId("4d2d8deff4e6c1d71fc29a07"),

"user_id" : "714638ba-2e08-2168-2b99-00002f3d43c0",

"events" : [

{

"handled" : 1,

"profile" : 10,

"data" : "....."

}

{

"handled" : 1,

"profile" : 10,

"data" : "....."

}

{

"handled" : 1,

"profile" : 20,

"data" : "....."

}

...

]

}

so, I tried the following:

.update({"events.profile":10},{$set:{"events.$.handled":0}},false,true)

however it updates only the first matched array element in each document. (That's the defined behavior for $ - the positional operator.)

How can I update all matched array elements?

网友答案:

At this moment it is not possible to use the positional operator to update all items in an array. See JIRA http://jira.mongodb.org/browse/SERVER-1243

As a work around you can:

  • Update each item individually (events.0.handled events.1.handled ...) or...
  • Read the document, do the edits manually and save it replacing the older one (check "Update if Current" if you want to ensure atomic updates)
网友答案:

What worked for me was this:

db.collection.find({ _id: ObjectId('4d2d8deff4e6c1d71fc29a07') })
  .forEach(function (doc) {
    doc.events.forEach(function (event) {
      if (event.profile === 10) {
        event.handled=0;
      }
    });
    db.collection.save(doc);
  });

I think it's clearer for mongo newbies and anyone familiar with JQuery & friends.

网友答案:

This can also be accomplished with a while loop which checks to see if any documents remain that still have subdocuments that have not been updated. This method preserves the atomicity of your updates (which many of the other solutions here do not).

var query = {
    events: {
        $elemMatch: {
            profile: 10,
            handled: { $ne: 0 }
        }
    }
};

while (db.yourCollection.find(query).count() > 0) {
    db.yourCollection.update(
        query,
        { $set: { "events.$.handled": 0 } },
        { multi: true }
    );
}

The number of times the loop is executed will equal the maximum number of times subdocuments with profile equal to 10 and handled not equal to 0 occur in any of the documents in your collection. So if you have 100 documents in your collection and one of them has three subdocuments that match query and all the other documents have fewer matching subdocuments, the loop will execute three times.

This method avoids the danger of clobbering other data that may be updated by another process while this script executes. It also minimizes the amount of data being transferred between client and server.

网友答案:

I'm amazed this still hasn't been addressed in mongo. Overall mongo doesn't seem to be great when dealing with sub-arrays. You can't count sub-arrays simply for example.

I used Javier's first solution. Read the array into events then loop through and build the set exp:

var set = {}, i, l;
for(i=0,l=events.length;i<l;i++) {
  if(events[i].profile == 10) {
    set['events.' + i + '.handled'] = 0;
  }
}

.update(objId, {$set:set});

This can be abstracted into a function using a callback for the conditional test

网友答案:

This does in fact relate to the long standing issue at http://jira.mongodb.org/browse/SERVER-1243 where there are in fact a number of challenges to a clear syntax that supports "all cases" where mutiple array matches are found. There are in fact methods already in place that "aid" in solutions to this problem, such as Bulk Operations which have been implemented after this original post.

It is still not possible to update more than a single matched array element in a single update statement, so even with a "multi" update all you will ever be able to update is just one mathed element in the array for each document in that single statement.

The best possible solution at present is to find and loop all matched documents and process Bulk updates which will at least allow many operations to be sent in a single request with a singular response. You can optionally use .aggregate() to reduce the array content returned in the search result to just those that match the conditions for the update selection:

db.collection.aggregate([
    { "$match": { "events.handled": 1 } },
    { "$project": {
        "events": {
            "$setDifference": [
               { "$map": {
                   "input": "$events",
                   "as": "event",
                   "in": {
                       "$cond": [
                           { "$eq": [ "$$event.handled", 1 ] },
                           "$$el",
                           false
                       ]
                   }
               }},
               [false]
            ]
        }
    }}
]).forEach(function(doc) {
    doc.events.forEach(function(event) {
        bulk.find({ "_id": doc._id, "events.handled": 1  }).updateOne({
            "$set": { "events.$.handled": 0 }
        });
        count++;

        if ( count % 1000 == 0 ) {
            bulk.execute();
            bulk = db.collection.initializeOrderedBulkOp();
        }
    });
});

if ( count % 1000 != 0 )
    bulk.execute();

The .aggregate() portion there will work when there is a "unique" identifier for the array or all content for each element forms a "unique" element itself. This is due to the "set" operator in $setDifference used to filter any false values returned from the $map operation used to process the array for matches.

If your array content does not have unique elements you can try an alternate approach with $redact:

db.collection.aggregate([
    { "$match": { "events.handled": 1 } },
    { "$redact": {
        "$cond": {
            "if": {
                "$eq": [ { "$ifNull": [ "$handled", 1 ] }, 1 ]
            },
            "then": "$$DESCEND",
            "else": "$$PRUNE"
        }
    }}
])

Where it's limitation is that if "handled" was in fact a field meant to be present at other document levels then you are likely going to get unexepected results, but is fine where that field appears only in one document position and is an equality match.

Future releases ( post 3.1 MongoDB ) as of writing will have a $filter operation that is simpler:

db.collection.aggregate([
    { "$match": { "events.handled": 1 } },
    { "$project": {
        "events": {
            "$filter": {
                "input": "$events",
                "as": "event",
                "cond": { "$eq": [ "$$event.handled", 1 ] }
            }
        }
    }}
])

And all releases that support .aggregate() can use the following approach with $unwind, but the usage of that operator makes it the least efficient approach due to the array expansion in the pipeline:

db.collection.aggregate([
    { "$match": { "events.handled": 1 } },
    { "$unwind": "$events" },
    { "$match": { "events.handled": 1 } },
    { "$group": {
        "_id": "$_id",
        "events": { "$push": "$events" }
    }}        
])

In all cases where the MongoDB version supports a "cursor" from aggregate output, then this is just a matter of choosing an approach and iterating the results with the same block of code shown to process the Bulk update statements. Bulk Operations and "cursors" from aggregate output are introduced in the same version ( MongoDB 2.6 ) and therefore usually work hand in hand for processing.

In even earlier versions then it is probably best to just use .find() to return the cursor, and filter out the execution of statements to just the number of times the array element is matched for the .update() iterations:

db.collection.find({ "events.handled": 1 }).forEach(function(doc){ 
    doc.events.filter(function(event){ return event.handled == 1 }).forEach(function(event){
        db.collection.update({ "_id": doc._id },{ "$set": { "events.$.handled": 0 }});
    });
});

If you are aboslutely determined to do "multi" updates or deem that to be ultimately more efficient than processing multiple updates for each matched document, then you can always determine the maximum number of possible array matches and just execute a "multi" update that many times, until basically there are no more documents to update.

A valid approach for MongoDB 2.4 and 2.2 versions could also use .aggregate() to find this value:

var result = db.collection.aggregate([
    { "$match": { "events.handled": 1 } },
    { "$unwind": "$events" },
    { "$match": { "events.handled": 1 } },
    { "$group": {
        "_id": "$_id",
        "count": { "$sum": 1 }
    }},
    { "$group": {
        "_id": null,
        "count": { "$max": "$count" }
    }}
]);

var max = result.result[0].count;

while ( max-- ) {
    db.collection.update({ "events.handled": 1},{ "$set": { "events.$.handled": 0 }},{ "multi": true })
}

Whatever the case, there are certain things you do not want to do within the update:

  1. Do not "one shot" update the array: Where if you think it might be more efficient to update the whole array content in code and then just $set the whole array in each document. This might seem faster to process, but there is no guarantee that the array content has not changed since it was read and the update is performed. Though $set is still an atomic operator, it will only update the array with what it "thinks" is the correct data, and thus is likely to overwrite any changes occurring between read and write.

  2. Do not calculate index values to update: Where similar to the "one shot" approach you just work out that position 0 and position 2 ( and so on ) are the elements to update and code these in with and eventual statement like:

    { "$set": {
        "events.0.handled": 0,
        "events.2.handled": 0
    }}
    

    Again the problem here is the "presumption" that those index values found when the document was read are the same index values in th array at the time of update. If new items are added to the array in a way that changes the order then those positions are not longer valid and the wrong items are in fact updated.

So until there is a reasonable syntax determined for allowing multiple matched array elements to be processed in single update statement then the basic approach is to either update each matched array element in an indvidual statement ( ideally in Bulk ) or essentiall work out the maximum array elements to update or keep updating until no more modified results are returned. At any rate, you should "always" be processing positional $ updates on the matched array element, even if that is only updating one element per statement.

Bulk Operations are in fact the "generalized" solution to processing any operations that work out to be "multiple operations", and since there are more applications for this than merely updating mutiple array elements with the same value, then it has of course been implemented already, and is the best way to presently approach solving this problem.

网友答案:

I just wanted to add another solution that worked for me and is pretty straightforward. Here it's just an array of tags (strings) so to update a tag called "test" to "changed", just do this:

myDocuments.find({tags: "test" }, {fields: {_id: 1}}).forEach(function (doc) {
    myDocuments.update(
        {_id: doc._id, tags: "test"}, 
        {$set:{'tags.$': "changed"}});
    });
分享给朋友:
您可能感兴趣的文章:
随机阅读: