Create, Update, and Delete Documents in MongoDB - BunksAllowed

Inserting and Saving Documents

Inserts are the basic method for adding data to MongoDB. To insert a document into a collection, use the collection’s insert method:
> db.foo.insert({"bar" : "baz"})
This will add an "_id" key to the document (if one does not already exist) and save it to MongoDB.
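The "_id" behaviour can be illustrated without a running server. What follows is a minimal in-memory sketch in Python, not MongoDB's implementation: insert_document is a hypothetical helper, a plain list stands in for the collection, and a random hex string stands in for the ObjectId that would normally be generated.

```python
import uuid

def insert_document(collection, document):
    """Append a copy of document to an in-memory list, adding "_id" if absent."""
    stored = dict(document)
    if "_id" not in stored:
        # Stand-in for an ObjectId; this is an assumption made for illustration.
        stored["_id"] = uuid.uuid4().hex
    collection.append(stored)
    return stored["_id"]

foo = []
generated_id = insert_document(foo, {"bar": "baz"})            # "_id" is added
explicit_id = insert_document(foo, {"_id": 1, "bar": "qux"})   # "_id" is kept
```

The second call shows the "if one does not already exist" case: a document that already carries an "_id" is stored with that value untouched.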

MongoDB does not do any sort of code execution on inserts, so they are not vulnerable to injection attacks. Traditional injection attacks are impossible with MongoDB, and alternative injection-type attacks are easy to guard against in general, but inserts are particularly invulnerable.

Removing Documents


Now that there’s data in our database, let’s delete it.

> db.users.remove()
This will remove all of the documents in the users collection. This doesn’t actually remove the collection, and any indexes created on it will still exist.
The remove function optionally takes a query document as a parameter. When it’s given, only documents that match the criteria will be removed. Suppose, for instance, that we want to remove everyone from the mailing.list collection where the value for "opt-out" is true:
> db.mailing.list.remove({"opt-out" : true})
Once data has been removed, it is gone forever. There is no way to undo the remove or recover deleted documents.
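To make the role of the query document concrete, here is a rough in-memory sketch in Python. It assumes simple equality matching only; matches and remove are hypothetical helpers written for this illustration, and the sample addresses are made up.

```python
def matches(document, query):
    """True when every key in query is present in document with an equal value."""
    return all(key in document and document[key] == value
               for key, value in query.items())

def remove(collection, query=None):
    """Delete matching documents in place; an empty query matches everything."""
    query = query or {}
    kept = [doc for doc in collection if not matches(doc, query)]
    removed = len(collection) - len(kept)
    collection[:] = kept
    return removed

mailing_list = [
    {"email": "a@example.com", "opt-out": True},
    {"email": "b@example.com", "opt-out": False},
    {"email": "c@example.com"},
]
removed = remove(mailing_list, {"opt-out": True})   # only the first entry goes
```

Calling remove(mailing_list) with no query afterwards would empty the list, mirroring the no-argument remove shown earlier.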
Removing documents is usually a fairly quick operation, but if you want to clear an entire collection, it is faster to drop it (and then re-create any indexes). For example, in Python, suppose we insert a million dummy elements with the following:
for i in range(1000000):
    collection.insert({"foo": "bar", "baz": i, "z": 10 - i})
Now we’ll try to remove all of the documents we just inserted, measuring the time it takes. First, here’s a simple remove:
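import time
# Connection and remove() belong to the legacy PyMongo API used for this timing;
# current releases provide MongoClient and delete_many({}) instead.
from pymongo import Connection
db = Connection().foo
collection = db.bar
start = time.time()
collection.remove()
collection.find_one()
total = time.time() - start
# Two decimal places, to match the figure quoted below.
print("%.2f seconds" % total)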
On a MacBook Air, this script prints “46.08 seconds.”
If the remove and find_one are replaced by db.drop_collection("bar"), the time drops to 0.01 seconds! This is obviously a vast improvement, but it comes at the expense of granularity: we cannot specify any criteria. The whole collection is dropped, and all of its indexes are deleted.

Updating Documents

Once a document is stored in the database, it can be changed using the update method. update takes two parameters: a query document, which locates documents to update, and a modifier document, which describes the changes to make to the documents found.
Updates are atomic: if two updates happen at the same time, whichever one reaches the server first will be applied, and then the next one will be applied. Thus, conflicting updates can safely be sent in rapid-fire succession without any documents being corrupted: the last update will “win.”
The simplest type of update fully replaces a matching document with a new one. This can be useful to do a dramatic schema migration. For example, suppose we are making major changes to a user document, which looks like the following:
{
"_id" : ObjectId("4b2b9f67a1f631733d917a7a"),
"name" : "joe",
"friends" : 32,
"enemies" : 2
}
We want to change that document into the following:
{
"_id" : ObjectId("4b2b9f67a1f631733d917a7a"),
"username" : "joe",
"relationships" :
{
"friends" : 32,
"enemies" : 2
}
}
We can make this change by replacing the document using an update:
> var joe = db.users.findOne({"name" : "joe"});
> joe.relationships = {"friends" : joe.friends, "enemies" : joe.enemies};
{
"friends" : 32,
"enemies" : 2
}
> joe.username = joe.name;
"joe"
> delete joe.friends;
true
> delete joe.enemies;
true
> delete joe.name;
true
> db.users.update({"name" : "joe"}, joe);
Now, doing a findOne shows that the structure of the document has been updated.
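The same reshaping can be written as a small Python function operating on a dict. This is only a sketch of the transformation itself: restructure is a hypothetical helper, and the "_id" is shown as a plain string rather than an ObjectId to keep the example free of driver dependencies.

```python
def restructure(user):
    """Build the new-style user document from the old one, keeping "_id"."""
    return {
        "_id": user["_id"],
        "username": user["name"],
        "relationships": {
            "friends": user["friends"],
            "enemies": user["enemies"],
        },
    }

old_joe = {"_id": "4b2b9f67a1f631733d917a7a", "name": "joe",
           "friends": 32, "enemies": 2}
new_joe = restructure(old_joe)
```

Building a fresh document, rather than deleting keys one at a time, makes it harder to leave an old key behind by accident.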
A common mistake is matching more than one document with the criteria and then creating a duplicate "_id" value with the second parameter. The database will throw an error for this, and nothing will be changed.
For example, suppose we create several documents with the same "name", but we don’t realize it:
> db.people.find()
{"_id" : ObjectId("4b2b9f67a1f631733d917a7b"), "name" : "joe", "age" : 65}
{"_id" : ObjectId("4b2b9f67a1f631733d917a7c"), "name" : "joe", "age" : 20}
{"_id" : ObjectId("4b2b9f67a1f631733d917a7d"), "name" : "joe", "age" : 49}
Now, if it’s Joe #2’s birthday, we want to increment the value of his "age" key, so we might say this:
> joe = db.people.findOne({"name" : "joe", "age" : 20});
{
"_id" : ObjectId("4b2b9f67a1f631733d917a7c"),
"name" : "joe",
"age" : 20
}
> joe.age++;
> db.people.update({"name" : "joe"}, joe);
E11001 duplicate key on update
What happened? When you call update, the database will look for a document matching {"name" : "joe"}. The first one it finds will be the 65-year-old Joe. It will attempt to replace that document with the one in the joe variable, but there’s already a document in this collection with the same "_id". Thus, the update will fail, because "_id" values must be unique. The best way to avoid this situation is to make sure that your update always specifies a unique document, perhaps by matching on a key like "_id".
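The failure, and the fix of matching on "_id", can be reproduced with a small in-memory model. This is a sketch under simplifying assumptions: replace_first and DuplicateKeyError are hypothetical names defined here, "first match" means first in list order, and the "_id" values are shortened strings.

```python
class DuplicateKeyError(Exception):
    """Raised when a replacement would give two documents the same "_id"."""

def replace_first(collection, query, replacement):
    """Replace the first document matching query, enforcing unique "_id" values."""
    for index, doc in enumerate(collection):
        if all(doc.get(key) == value for key, value in query.items()):
            for other_index, other in enumerate(collection):
                if other_index != index and other["_id"] == replacement["_id"]:
                    raise DuplicateKeyError(replacement["_id"])
            collection[index] = dict(replacement)
            return True
    return False

people = [
    {"_id": "7b", "name": "joe", "age": 65},
    {"_id": "7c", "name": "joe", "age": 20},
    {"_id": "7d", "name": "joe", "age": 49},
]
joe = dict(people[1], age=21)   # Joe #2 after his birthday

try:
    replace_first(people, {"name": "joe"}, joe)   # matches the 65-year-old first
    failed = False
except DuplicateKeyError:
    failed = True

replace_first(people, {"_id": joe["_id"]}, joe)   # unambiguous match on "_id"
```

The first call fails without changing anything, while the second touches only the intended document.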

Usually only certain portions of a document need to be updated. Partial updates can be done extremely efficiently by using atomic update modifiers. Update modifiers are special keys that can be used to specify complex update operations, such as altering, adding, or removing keys, and even manipulating arrays and embedded documents.

Suppose we were keeping website analytics in a collection and wanted to increment a counter each time someone visited a page. We can use update modifiers to do this increment atomically. Each URL and its number of page views is stored in a document that looks like this:
{
"_id" : ObjectId("4b253b067525f35f94b60a31"),
"url" : "www.example.com",
"pageviews" : 52
}
Every time someone visits a page, we can find the page by its URL and use the "$inc" modifier to increment the value of the "pageviews" key.
> db.analytics.update({"url" : "www.example.com"},
... {"$inc" : {"pageviews" : 1}})
Now, if we do a find, we see that "pageviews" has increased by one.
> db.analytics.find()
{
"_id" : ObjectId("4b253b067525f35f94b60a31"),
"url" : "www.example.com",
"pageviews" : 53
}
When using modifiers, the value of "_id" cannot be changed. (Current MongoDB releases treat "_id" as immutable, so whole-document replacement cannot change it either.) Values for any other key, including other uniquely indexed keys, can be modified.

Getting started with the "$set" modifier

"$set" sets the value of a key. If the key does not yet exist, it will be created. This can be handy for updating schema or adding user-defined keys. For example, suppose you have a simple user profile stored as a document that looks something like the following:
> db.users.findOne()
{
"_id" : ObjectId("4b253b067525f35f94b60a31"),
"name" : "joe",
"age" : 30,
"sex" : "male",
"location" : "Wisconsin"
}
This is a pretty bare-bones user profile. If the user wanted to store his favorite book in his profile, he could add it using "$set":
> db.users.update({"_id" : ObjectId("4b253b067525f35f94b60a31")},
... {"$set" : {"favorite book" : "war and peace"}})
Now the document will have a “favorite book” key:
> db.users.findOne()
{
"_id" : ObjectId("4b253b067525f35f94b60a31"),
"name" : "joe",
"age" : 30,
"sex" : "male",
"location" : "Wisconsin",
"favorite book" : "war and peace"
}
If the user decides that he actually enjoys a different book, "$set" can be used again to change the value:
> db.users.update({"name" : "joe"},
... {"$set" : {"favorite book" : "green eggs and ham"}})
"$set" can even change the type of the key it modifies. For instance, if our fickle user decides that he actually likes quite a few books, he can change the value of the “favorite book” key into an array:
> db.users.update({"name" : "joe"},
... {"$set" : {"favorite book" :
... ["cat's cradle", "foundation trilogy", "ender's game"]}})
If the user realizes that he actually doesn’t like reading, he can remove the key altogether with "$unset":
> db.users.update({"name" : "joe"},
... {"$unset" : {"favorite book" : 1}})
Now the document will be the same as it was at the beginning of this example.
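The round trip above (add a key, change its type, remove it) can be mimicked on a Python dict. This is a rough sketch of the semantics for top-level keys only; apply_set and apply_unset are hypothetical helpers, not driver functions.

```python
def apply_set(document, fields):
    """Sketch of "$set": create or overwrite each top-level key."""
    document.update(fields)

def apply_unset(document, fields):
    """Sketch of "$unset": drop each named key if it is present."""
    for key in fields:
        document.pop(key, None)

user = {"name": "joe", "age": 30, "sex": "male", "location": "Wisconsin"}
original = dict(user)

apply_set(user, {"favorite book": "war and peace"})
apply_set(user, {"favorite book": ["cat's cradle", "foundation trilogy",
                                   "ender's game"]})
book_type = type(user["favorite book"])   # the value is now a list
apply_unset(user, {"favorite book": 1})
```

After the final call the dict compares equal to the copy taken at the start, which is the "same as it was at the beginning" claim in miniature.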
You can also use "$set" to reach in and change embedded documents:
> db.blog.posts.findOne()
{
"_id" : ObjectId("4b253b067525f35f94b60a31"),
"title" : "A Blog Post",
"content" : "...",
"author" : {
"name" : "joe",
"email" : "joe@example.com"
}
}
> db.blog.posts.update({"author.name" : "joe"}, {"$set" : {"author.name" : "joe schmoe"}})
> db.blog.posts.findOne()
{
"_id" : ObjectId("4b253b067525f35f94b60a31"),
"title" : "A Blog Post",
"content" : "...",
"author" : {
"name" : "joe schmoe",
"email" : "joe@example.com"
}
}
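Dotted keys such as "author.name" are paths into embedded documents. A minimal Python sketch of that path walking follows; set_path is a hypothetical helper, and it handles nested dicts only, not arrays.

```python
def set_path(document, path, value):
    """Set a value addressed by a dotted path such as "author.name"."""
    *parents, last = path.split(".")
    target = document
    for key in parents:
        # Create intermediate embedded documents when they are missing.
        target = target.setdefault(key, {})
    target[last] = value

post = {
    "title": "A Blog Post",
    "content": "...",
    "author": {"name": "joe", "email": "joe@example.com"},
}
set_path(post, "author.name", "joe schmoe")
```

Only the addressed key changes; the sibling "email" key and the rest of the post are left alone.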
You must always use a $ modifier for adding, changing, or removing keys. A common error people often make when starting out is to try to set the value of "foo" to "bar" by doing an update that looks like this:
> db.coll.update(criteria, {"foo" : "bar"})
This will not function as intended. It actually does a full-document replacement, replacing the matched document with {"foo" : "bar"}. Always use $ operators for modifying individual key/value pairs.
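The difference between the two forms is easy to see side by side. The sketch below is an in-memory approximation that understands only "$set"; apply_update is a hypothetical helper, and it assumes, as the earlier replacement example does, that a replaced document keeps its "_id".

```python
def apply_update(document, update):
    """Return the document produced by a replacement or a "$set" update."""
    if any(key.startswith("$") for key in update):
        result = dict(document)
        result.update(update.get("$set", {}))
        return result
    # No "$" keys: the whole document is replaced, keeping only "_id".
    return {"_id": document["_id"], **update}

doc = {"_id": 1, "foo": "old", "keep": True}
replaced = apply_update(doc, {"foo": "bar"})            # "keep" is lost
modified = apply_update(doc, {"$set": {"foo": "bar"}})  # "keep" survives
```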

Incrementing and decrementing

The "$inc" modifier can be used to change the value for an existing key or to create a new key if it does not already exist. It is very useful for updating analytics, karma, votes, or anything else that has a changeable, numeric value.
Suppose we are creating a game collection where we want to save games and update scores as they change. When a user starts playing, say, a game of pinball, we can insert a document that identifies the game by name and user playing it:
> db.games.insert({"game" : "pinball", "user" : "joe"})
When the ball hits a bumper, the game should increment the player’s score. As points in pinball are given out pretty freely, let’s say that the base unit of points a player can earn is 50. We can use the "$inc" modifier to add 50 to the player’s score:
> db.games.update({"game" : "pinball", "user" : "joe"},
... {"$inc" : {"score" : 50}})
If we look at the document after this update, we’ll see the following:
> db.games.findOne()
{
"_id" : ObjectId("4b2d75476cc613d5ee930164"),
"game" : "pinball",
"user" : "joe",
"score" : 50
}
The score key did not already exist, so it was created by "$inc" and set to the increment amount: 50.
If the ball lands in a “bonus” slot, we want to add 10,000 to the score. This can be accomplished by passing a different value to "$inc":
> db.games.update({"game" : "pinball", "user" : "joe"},
... {"$inc" : {"score" : 10000}})
Now if we look at the game, we’ll see the following:
> db.games.find()
{
"_id" : ObjectId("4b2d75476cc613d5ee930164"),
"game" : "pinball",
"user" : "joe",
"score" : 10050
}
The "score" key existed and had a numeric value, so the server added 10,000 to it. "$inc" is similar to "$set", but it is designed for incrementing (and decrementing) numbers. "$inc" can be used only on values of type integer, long, or double. If it is used on any other type of value, it will fail. This includes types that many languages will automatically cast into numbers, like nulls, booleans, or strings of numeric characters:
> db.foo.insert({"count" : "1"})
> db.foo.update({}, {$inc : {count : 1}})
Cannot apply $inc modifier to non-number
Also, the value of the "$inc" key must be a number. You cannot increment by a string, array, or other non-numeric value. Doing so will give a “Modifier "$inc" allowed for numbers only” error message. To modify other types, use "$set" or one of the array operations described in a moment.
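The rules for "$inc" (create the key if it is missing, add to it if it is numeric, fail otherwise) fit in a few lines of Python. This is a sketch of those rules only: apply_inc is a hypothetical helper, and TypeError stands in for the server's error messages.

```python
def apply_inc(document, key, amount):
    """Sketch of "$inc": add amount to a numeric key, creating it if missing."""
    def is_number(value):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)

    if not is_number(amount):
        raise TypeError("increment must be a number")
    if key not in document:
        document[key] = amount
    elif is_number(document[key]):
        document[key] += amount
    else:
        raise TypeError("cannot increment a non-number")

game = {"game": "pinball", "user": "joe"}
apply_inc(game, "score", 50)      # key created and set to 50
apply_inc(game, "score", 10000)   # existing number incremented

counter = {"count": "1"}
try:
    apply_inc(counter, "count", 1)   # a numeric string is not a number
    rejected = False
except TypeError:
    rejected = True
```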

Array modifiers

An extensive class of modifiers exists for manipulating arrays. Arrays are common and powerful data structures: not only are they lists that can be referenced by index, but they can also double as sets.
Array operators can be used only on keys with array values. For example, you cannot push onto an integer or pop off of a string. Use "$set" or "$inc" to modify scalar values.
"$push" adds an element to the end of an array if the specified key already exists and creates a new array if it does not. For example, suppose that we are storing blog posts and want to add a "comments" key containing an array. We can push a comment onto the nonexistent "comments" array, which will create the array and add the comment:
> db.blog.posts.findOne()
{
"_id" : ObjectId("4b2d75476cc613d5ee930164"),
"title" : "A blog post",
"content" : "..."
}
> db.blog.posts.update({"title" : "A blog post"}, {$push : {"comments" :
... {"name" : "joe", "email" : "joe@example.com", "content" : "nice post."}}})
> db.blog.posts.findOne()
{
"_id" : ObjectId("4b2d75476cc613d5ee930164"),
"title" : "A blog post",
"content" : "...",
"comments" : [
{
"name" : "joe",
"email" : "joe@example.com",
"content" : "nice post."
}
]
}
Now, if we want to add another comment, we can simply use "$push" again:
> db.blog.posts.update({"title" : "A blog post"}, {$push : {"comments" :
... {"name" : "bob", "email" : "bob@example.com", "content" : "good post."}}})
> db.blog.posts.findOne()
{
"_id" : ObjectId("4b2d75476cc613d5ee930164"),
"title" : "A blog post",
"content" : "...",
"comments" : [
{
"name" : "joe",
"email" : "joe@example.com",
"content" : "nice post."
},
{
"name" : "bob",
"email" : "bob@example.com",
"content" : "good post."
}
]
}
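The two cases of "$push" (create the array, then append to it) can be sketched on a Python dict. apply_push is a hypothetical helper written for illustration, and the comments are trimmed to two fields for brevity.

```python
def apply_push(document, key, value):
    """Sketch of "$push": append to an array, creating it if the key is missing."""
    if key in document and not isinstance(document[key], list):
        raise TypeError("cannot push onto a non-array value")
    document.setdefault(key, []).append(value)

post = {"title": "A blog post", "content": "..."}
apply_push(post, "comments", {"name": "joe", "content": "nice post."})
apply_push(post, "comments", {"name": "bob", "content": "good post."})
```

The first call creates "comments" with one element; the second appends to the end, so the comments keep their arrival order.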
A common use is wanting to add a value to an array only if the value is not already present. This can be done using a "$ne" in the query document. For example, to push an author onto a list of citations, but only if he isn’t already there, use the following:
> db.papers.update({"authors cited" : {"$ne" : "Richie"}},
... {$push : {"authors cited" : "Richie"}})
This can also be done with "$addToSet", which is useful for cases where "$ne" won’t work or where "$addToSet" describes what is happening better.
For instance, suppose you have a document that represents a user. You might have a set of email addresses that they have added:
> db.users.findOne({"_id" : ObjectId("4b2d75476cc613d5ee930164")})
{
"_id" : ObjectId("4b2d75476cc613d5ee930164"),
"username" : "joe",
"emails" : [
"joe@example.com",
"joe@gmail.com",
"joe@yahoo.com"
]
}
When adding another address, you can use "$addToSet" to prevent duplicates:
> db.users.update({"_id" : ObjectId("4b2d75476cc613d5ee930164")},
... {"$addToSet" : {"emails" : "joe@gmail.com"}})
> db.users.findOne({"_id" : ObjectId("4b2d75476cc613d5ee930164")})
{
"_id" : ObjectId("4b2d75476cc613d5ee930164"),
"username" : "joe",
"emails" : [
"joe@example.com",
"joe@gmail.com",
"joe@yahoo.com"
]
}
> db.users.update({"_id" : ObjectId("4b2d75476cc613d5ee930164")},
... {"$addToSet" : {"emails" : "joe@hotmail.com"}})
> db.users.findOne({"_id" : ObjectId("4b2d75476cc613d5ee930164")})
{
"_id" : ObjectId("4b2d75476cc613d5ee930164"),
"username" : "joe",
"emails" : [
"joe@example.com",
"joe@gmail.com",
"joe@yahoo.com",
"joe@hotmail.com"
]
}
You can also use "$addToSet" in conjunction with "$each" to add multiple unique values, which cannot be done with the "$ne"/"$push" combination. For instance, we could use these modifiers if the user wanted to add more than one email address:
> db.users.update({"_id" : ObjectId("4b2d75476cc613d5ee930164")}, {"$addToSet" :
... {"emails" : {"$each" : ["joe@php.net", "joe@example.com", "joe@python.org"]}}})
> db.users.findOne({"_id" : ObjectId("4b2d75476cc613d5ee930164")})
{
"_id" : ObjectId("4b2d75476cc613d5ee930164"),
"username" : "joe",
"emails" : [
"joe@example.com",
"joe@gmail.com",
"joe@yahoo.com",
"joe@hotmail.com",
"joe@php.net",
"joe@python.org"
]
}
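Treating the array as a set comes down to a membership check before each append. Here is a minimal Python sketch of "$addToSet" combined with "$each"; add_to_set is a hypothetical helper that always takes a list of candidate values.

```python
def add_to_set(document, key, values):
    """Sketch of "$addToSet" with "$each": append only values not yet present."""
    array = document.setdefault(key, [])
    for value in values:
        if value not in array:
            array.append(value)

user = {"username": "joe",
        "emails": ["joe@example.com", "joe@gmail.com", "joe@yahoo.com"]}
add_to_set(user, "emails", ["joe@gmail.com"])     # already present: no change
add_to_set(user, "emails", ["joe@hotmail.com"])   # new: appended
add_to_set(user, "emails", ["joe@php.net", "joe@example.com", "joe@python.org"])
```

In the last call only the two new addresses are appended, which matches the final shell output above.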
There are a few ways to remove elements from an array. If you want to treat the array like a queue or a stack, you can use "$pop", which can remove elements from either end. {$pop : {key : 1}} removes an element from the end of the array. {$pop : {key : -1}} removes it from the beginning.
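As a quick illustration of the two directions, here is a Python sketch of "$pop". apply_pop is a hypothetical helper, and this sketch simply leaves an empty array unchanged.

```python
def apply_pop(document, key, direction):
    """Sketch of "$pop": 1 removes the last element, -1 removes the first."""
    array = document[key]
    if array:
        array.pop(-1 if direction == 1 else 0)

queue = {"items": ["a", "b", "c", "d"]}
apply_pop(queue, "items", 1)    # removes "d" from the end
apply_pop(queue, "items", -1)   # removes "a" from the beginning
```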
Sometimes an element should be removed based on specific criteria, rather than its position in the array. "$pull" is used to remove elements of an array that match the given criteria. For example, suppose we have a list of things that need to be done but not in any specific order:
> db.lists.insert({"todo" : ["dishes", "laundry", "dry cleaning"]})
If we do the laundry first, we can remove it from the list with the following:
> db.lists.update({}, {"$pull" : {"todo" : "laundry"}})
Now if we do a find, we’ll see that there are only two elements remaining in the array:
> db.lists.find()
{
"_id" : ObjectId("4b2d75476cc613d5ee930164"),
"todo" : [
"dishes",
"dry cleaning"
]
}
Pulling removes all matching elements, not just a single match. If you have an array that looks like [1, 1, 2, 1] and pull 1, you’ll end up with a single-element array, [2].
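Both "$pull" examples can be checked with a one-line filter in Python. This sketch assumes matching by simple equality; apply_pull is a hypothetical helper.

```python
def apply_pull(document, key, value):
    """Sketch of "$pull": remove every array element equal to value."""
    document[key] = [item for item in document[key] if item != value]

todo = {"todo": ["dishes", "laundry", "dry cleaning"]}
apply_pull(todo, "todo", "laundry")

numbers = {"values": [1, 1, 2, 1]}
apply_pull(numbers, "values", 1)   # every 1 is removed, not just the first
```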

Array manipulation becomes a little trickier when we have multiple values in an array and want to modify some of them. There are two ways to manipulate values in arrays: by position or by using the positional operator (the "$" character).
Arrays use 0-based indexing, and elements can be selected as though their index were a document key. For example, suppose we have a document containing an array with a few embedded documents, such as a blog post with comments:
> db.blog.posts.findOne()
{
"_id" : ObjectId("4b329a216cc613d5ee930192"),
"content" : "...",
"comments" : [
{
"comment" : "good post",
"author" : "John",
"votes" : 0
},
{
"comment" : "i thought it was too short",
"author" : "Claire",
"votes" : 3
},
{
"comment" : "free watches",
"author" : "Alice",
"votes" : -1
}
]
}
If we want to increment the number of votes for the first comment, we can say the following:
> db.blog.posts.update({"_id" : ObjectId("4b329a216cc613d5ee930192")},
... {"$inc" : {"comments.0.votes" : 1}})
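A key such as "comments.0.votes" mixes document keys with an array index. The Python sketch below shows one way such a path could be resolved; resolve_parent is a hypothetical helper, and the post is trimmed to two comments.

```python
def resolve_parent(document, path):
    """Walk a dotted path, treating segments as indexes when inside a list."""
    *parents, last = path.split(".")
    target = document
    for segment in parents:
        if isinstance(target, list):
            target = target[int(segment)]   # "0" selects the first element
        else:
            target = target[segment]
    return target, last

post = {"comments": [
    {"comment": "good post", "author": "John", "votes": 0},
    {"comment": "i thought it was too short", "author": "Claire", "votes": 3},
]}
parent, key = resolve_parent(post, "comments.0.votes")
parent[key] += 1   # the increment itself
```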
In many cases, though, we don’t know what index of the array to modify without querying for the document first and examining it. To get around this, MongoDB has a positional operator, "$", that figures out which element of the array the query document matched and updates that element. For example, if we have a user named John who updates his name to Jim, we can replace it in the comments by using the positional operator:
> db.blog.posts.update({"comments.author" : "John"},
... {"$set" : {"comments.$.author" : "Jim"}})
The positional operator updates only the first match. Thus, if John had left more than one comment, his name would be changed only for the first comment he left.
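The "first match only" rule is the part most easily overlooked, so here is a small Python sketch of it. set_first_match is a hypothetical helper, and the third comment is invented so that John appears twice.

```python
def set_first_match(document, array_key, match_key, match_value, field, value):
    """Sketch of "$": update field on the first array element that matches."""
    for element in document[array_key]:
        if element.get(match_key) == match_value:
            element[field] = value
            return True   # stop after the first match
    return False

post = {"comments": [
    {"comment": "good post", "author": "John"},
    {"comment": "free watches", "author": "Alice"},
    {"comment": "thanks", "author": "John"},
]}
updated = set_first_match(post, "comments", "author", "John", "author", "Jim")
authors = [c["author"] for c in post["comments"]]
```

Only the first of John's comments is renamed; the later one still says "John".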

Happy Exploring!
