MongoDB has supported geospatial queries for a while, and in the upcoming 1.7 release it’ll get even better. Let’s take a look at how easy it is to query MongoDB in an idiomatic geospatial manner.
First, as described in the docs, you’ll need your data structured a certain way, and then you’ll need to add a 2d index. In my example, I’ll have documents that contain non-geospatial data (names, dates, other stuff) and then an embedded “LOC” document which contains “LAT” and “LONG” fields. These are arbitrary names; what’s important is that you get the order right. Your “location” field must contain exactly 2 values, and you can store them in two ways:
- array
- nested document with keys
For example:
{ LOC: [38, -102] } is a valid “location” field because it contains an array with 2 values.
{ LOC: { LAT: 38, LONG: -102 } } is a valid location field because it contains a nested document with two keys, and those keys contain numeric values.
Note that the order is important! You’ll insert the data in the order in which you’d query the data. In general, stick with lat/long or x/y order.
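If you want to sanity-check documents before inserting them, a rough sketch like the one below could help. The function name isValidLocation is my own invention (it is not part of MongoDB or its shell), and it only checks the two shapes described above: a 2-element numeric array, or a nested document with exactly two numeric values.

```javascript
// Hypothetical helper, not a MongoDB API: checks that a value has one of
// the two "location" shapes described above.
function isValidLocation(loc) {
  const isNum = (v) => typeof v === "number" && Number.isFinite(v);
  if (Array.isArray(loc)) {
    // Array form: exactly two numeric values, e.g. [38, -102]
    return loc.length === 2 && loc.every(isNum);
  }
  if (loc !== null && typeof loc === "object") {
    // Nested-document form: exactly two keys holding numeric values
    const values = Object.values(loc);
    return values.length === 2 && values.every(isNum);
  }
  return false;
}

console.log(isValidLocation([38, -102]));                // true
console.log(isValidLocation({ LAT: 38, LONG: -102 }));   // true
console.log(isValidLocation({ LAT: "38", LONG: -102 })); // false
```

Note that this says nothing about the order of the values; keeping that consistent is still up to you.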
Sample data
As you can see, this “treatment unit” is positioned at roughly [38,-102]
Adding the 2d Index
Let’s add an index:
> db.treatmentunits.ensureIndex( { LOC: "2d" } )
Cool, that was easy.
Querying with geospatial operators
Exact matches are rarely useful, and they become even less so the more granular your lat/long storage becomes. You’ll notice in this example that I have ridiculously precise location data (it’s precise b/c it’s fake). Still, if you want to query for exact matches, you’d do this:
> db.treatmentunits.find( { LOC: [38,-102] } )
Due to the precision of my data, this will yield no results. So let’s widen the net by using the “$near” operator:
> db.treatmentunits.find( { LOC: { $near: [38,-102] } } )
That’s better… lots of results. By default, Mongo will give you the closest 100 results. You probably don’t want that. So, let’s tighten it up by setting some “bounds”, using the $maxDistance operator:
> db.treatmentunits.find( { LOC: { $near: [38,-102], $maxDistance: 5 } } )
You might be thinking: how’s that different from using limit(x)? Simple: limit() is merely a cap on the number of documents returned. If you want only 5, you get only 5. But by using $maxDistance, you’re not specifying how many results you want but rather how close the locations must be to your target location in order to be included in the results. If you want the closest 10 locations that meet a $maxDistance of 5, that’s where you’d use limit:
> db.treatmentunits.find( { LOC: { $near: [38,-102], $maxDistance: 5 } } ).limit(10)
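To make the difference between $maxDistance and limit() concrete, here’s a small in-memory sketch of the semantics. It is not how MongoDB implements $near (the real thing uses the 2d index); nearSketch, planarDistance and the sample documents are made up for illustration, and I’m assuming LOC is stored in array form and that distance is plain flat-plane distance in the same units as the coordinates.

```javascript
// Hypothetical helper: flat (planar) distance between two [x, y] points.
function planarDistance(a, b) {
  return Math.hypot(a[0] - b[0], a[1] - b[1]);
}

// Hypothetical in-memory stand-in for $near + $maxDistance + limit().
// The default limit of 100 mirrors the default mentioned above.
function nearSketch(docs, target, maxDistance = Infinity, limit = 100) {
  return docs
    .map((doc) => ({ doc, dist: planarDistance(doc.LOC, target) }))
    .filter((entry) => entry.dist <= maxDistance) // the $maxDistance part
    .sort((x, y) => x.dist - y.dist)              // nearest first
    .slice(0, limit)                              // the limit() part
    .map((entry) => entry.doc);
}

// Made-up sample documents
const sampleUnits = [
  { name: "A", LOC: [38.5, -102.5] },
  { name: "B", LOC: [45, -110] },
  { name: "C", LOC: [38.1, -102.1] },
  { name: "D", LOC: [41, -102] },
];

const nearestNames = nearSketch(sampleUnits, [38, -102], 5, 2).map((d) => d.name);
console.log(nearestNames); // [ 'C', 'A' ]
```

B is dropped because it is too far away, and D is dropped only because the limit of 2 has already been reached.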
Now, picture a map in your head, and draw a box somewhere on that map. Let’s say you want to find results whose location is within that bounding box. You’ll use the $within and $box operators, like so:
> db.treatmentunits.find( { LOC: { $within : { $box : [[40,-120], [48,-108]] } } } )
Now that’s a big box, covering much of the northwestern US (assuming lat/long order), so it’s going to return a lot of records. Tighten up the box to limit your results.
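If it helps to see what the box test boils down to, here’s a rough sketch in plain JavaScript. The inBox helper is hypothetical, and treating points on the edge as inside is my assumption, not something I’ve verified against MongoDB. The first corner is the lower-left and the second is the upper-right.

```javascript
// Hypothetical helper: true when point lies inside the box given as
// [lowerLeft, upperRight]. Edge points count as inside (an assumption).
function inBox(point, box) {
  const [lowerLeft, upperRight] = box;
  return (
    point[0] >= lowerLeft[0] && point[0] <= upperRight[0] &&
    point[1] >= lowerLeft[1] && point[1] <= upperRight[1]
  );
}

const bigBox = [[40, -120], [48, -108]];
const tightBox = [[43, -116], [45, -112]]; // a smaller box inside bigBox

console.log(inBox([44, -114], bigBox));   // true
console.log(inBox([41, -119], bigBox));   // true
console.log(inBox([41, -119], tightBox)); // false
console.log(inBox([38, -102], bigBox));   // false
```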
Finally, to find records located within a given radius of a center point, you’ll use the $within and $center operators:
> center = [38, -102]
> radius = 10
> db.treatmentunits.find({ LOC: { $within: { $center: [center, radius] } } } )
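Conceptually, the circle query keeps every point whose distance from the center is no more than the radius. Here’s a minimal sketch of that check; inCircle is a made-up helper, and I’m assuming flat-plane distance with the radius expressed in the same units as the coordinates.

```javascript
// Hypothetical helper: true when point is within radius of center,
// using flat (planar) distance.
function inCircle(point, center, radius) {
  return Math.hypot(point[0] - center[0], point[1] - center[1]) <= radius;
}

const centerPoint = [38, -102];

console.log(inCircle([41, -106], centerPoint, 10)); // true  (distance 5)
console.log(inCircle([50, -102], centerPoint, 10)); // false (distance 12)
```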
You may be wondering at this point, “Why don’t you have any sort() applied?” The answer is: when you use $near, MongoDB returns the results already sorted by distance, nearest first. ($within only filters by shape; it makes no promise about order.)