NoSQL is a buzzword I’ve been hearing about for quite some time. It’s just one of those things that you hear about and want to try out but never get the chance. I was even more motivated when I came across this article, which attempted to summarize the “7 Surprising Trends That Show What Tech Skills You Need to Succeed” and stated: “NoSQL is a small but growing niche.” I just had to try coding a NoSQL “Hello World” example.
I had no prior experience with NoSQL. After a little reading, I understood NoSQL to be simply a common name for databases which are not tabular and (naturally) do not use SQL. In simple words, these databases do not force us to store our data in tables, as the more popular and traditional databases do. Instead, data can be stored hierarchically, which is usually how real-world business data is shaped. Personally I like this concept. Long before I started to use LINQ to SQL as an OR/M, I used to provide a data layer which was good not only for retrieving data, but also for converting DataSets to business objects/entities and vice versa. I disliked the fact that databases force us to store business objects as tabular data when the objects themselves are not tabular. Moreover, although I have been using business objects consistently for years now, there’s no doubt in my mind that my thinking is affected by the relational model, as you always have to keep in mind that in the end, your hierarchical objects will be stored in a tabular database.
Seems like there is a variety of NoSQL database architectures and features, some of which are designed to handle very large masses of data and scale up; others are more limited in capabilities. Turns out that one has to do homework in order to choose the “right database”. I was reading some more and thought I’d try out mongoDB, which seemed like a leading solution in the field.
Installing it was actually quite simple: just go to the downloads section of mongoDB and download the latest version suitable for your machine (as of this writing, the latest version was 1.8.2, released in June 2011). After the download, perform the following:
1. Unzip the downloaded zip file.
2. Create a directory for the db files. This seems to default to data\db in the root of the drive you use, but you can use the command line to override the default. Let’s assume that you create x:\data\db
3. Open the command line (cmd) and go to the directory where you unzipped the db.
4. Type: mongod.exe --dbpath x:\data\db
That’s it. No lengthy downloads; no lengthy wizard installations as in SQL Server; no impossibly cumbersome Oracle installations; no complex MySQL installations etc. You get the idea.
mongoDB drivers for C#
In order to use mongoDB in C#, there are several alternatives. I usually prefer an official driver over an open source driver, as long as one is available. My guess is that once an official driver exists, it will probably be more up to date and more consistent with the product itself. However, I did notice that LINQ support doesn’t really exist in the current official driver (version 1.1), although it does seem to exist in the open source driver, so you might want to consider that one after all, at least as long as the official driver does not support it. After I downloaded the official driver and unzipped it, all that remained was to Add a Reference to the two assemblies from a VS (console) project.
I went on to read the “CSharp Driver Tutorial“. Unfortunately the documentation is quite lacking, and as one of the commenters (“Mike”) put it: it is more of a reference than a tutorial. So I had to read the documentation/reference and put together a quick start for myself.
In mongoDB, the basic type stored and retrieved is BSON, which stands for Binary JSON. I was unfamiliar with BSON, so I read a little. It seems that BSON resembles JSON and adds support for data types which are not JSON types, such as Date or BinData. It is also claimed that “Compared to JSON, BSON is designed to be efficient both in storage space and scan-speed. Large elements in a BSON document are prefixed with a length field to facilitate scanning. In some cases, BSON will use more space than JSON due to the length prefixes and explicit array indices.” I found the fact that mongoDB uses a JSON-like format for its entities quite cool and easy to work with, especially when it comes to web programming. The mongoDB assemblies contain a set of API methods which interact with the basic BsonDocument class. This is important to understand, as it makes everything that follows trivial.
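To make the BsonDocument idea a bit more concrete, here is a minimal sketch that builds a JSON-like document carrying two of the non-JSON types mentioned above. It assumes the official driver's MongoDB.Bson assembly is referenced; the class name BsonSketch, the method BuildSample and the field names are made up for illustration. No database server is needed for this part.

```csharp
using System;
using MongoDB.Bson;

public static class BsonSketch
{
    // Builds a JSON-like document that also carries two non-JSON types.
    // Field names are hypothetical.
    public static BsonDocument BuildSample()
    {
        return new BsonDocument
        {
            { "first", "John" },                                             // plain string, as in JSON
            { "born", new DateTime(1980, 1, 1, 0, 0, 0, DateTimeKind.Utc) }, // becomes a BSON date
            { "photo", new BsonBinaryData(new byte[] { 1, 2, 3 }) }          // becomes BinData
        };
    }

    public static void Main()
    {
        BsonDocument doc = BuildSample();
        Console.WriteLine(doc.ToJson());         // JSON-like text rendering of the document
        Console.WriteLine(doc["born"].BsonType); // every value knows its own BSON type
    }
}
```

The indexer returns a BsonValue, so checking BsonType (or using the As... properties) is how you find out what is actually stored under a given name.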
Here’s the first example I made for opening a connection, inserting and retrieving JSON-like entities:
- Line 11 opens a connection to a database server. The parameterless Create() seems to use the local machine as the database server. If no such server is available, you will soon enough receive an exception. You can also pass a connection string to reach other servers. The documentation states that a single instance is created per server, and a connection pool is implicitly created as required.
- Line 12 returns the actual required database on the server. Seems like you can provide a non-existing string here and a database will still be returned.
- Line 13: according to the documentation, calling RequestStart isn’t mandatory but is recommended, in order to ensure that a series of operations is performed on the same connection. The documentation even states that this is required “in order to guarantee correct results.” If you use RequestStart, you must either call RequestDone to decrement an inner counter, or simply use the using(…) statement, which calls the Dispose() method (as shown in the example above).
- Line 15: as no tables are involved, mongoDB uses the term Collections to represent a collection of BsonDocuments. GetCollection doesn’t actually fetch the contents of the collection named in the argument; it returns a lightweight reference which is used to perform the different CRUD operations.
- Lines 17-18: we finally get to the point where we insert items into our collection. I created BsonDocuments on the fly, with JSON-like attributes. The Insert seems to take place immediately; hence, I believe that RequestStart is not transactional as I hoped it would be.
- Line 20: Find methods are the way to retrieve information from mongoDB. They return cursors which can be iterated upon. FindAll returns a non-filtered cursor. If we would like to return a filtered cursor, we have to use one of the other Find methods which accepts a Query object as its argument. Unfortunately, LINQ is not supported in the current official driver.
- Lines 21-26 iterate over the cursor. Note that the items returned are of BsonDocument type, and the indexer returns a BsonValue subclass. BsonValue is an abstract base class for the different types which inherit from it, such as BsonString or BsonDateTime. From there you can downcast to the type you need.
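For reference, the flow described in these notes can be sketched roughly as follows. This is not the original listing, so the line numbers in the notes do not map onto it. It assumes the 1.x official driver assemblies are referenced and a mongod is running on the local machine; the database name "hello", the collection name "people", the helper NewPerson and the field names are all hypothetical.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Driver;

public static class HelloMongo
{
    // Builds a JSON-like entity on the fly (hypothetical field names).
    public static BsonDocument NewPerson(string first, string last)
    {
        return new BsonDocument { { "first", first }, { "last", last } };
    }

    public static void Main()
    {
        // Parameterless Create() targets a server on the local machine.
        MongoServer server = MongoServer.Create();
        // A database name that does not exist yet is still accepted here.
        MongoDatabase db = server.GetDatabase("hello");

        // Keep the operations below on the same connection; disposing the
        // returned object plays the role of calling RequestDone.
        using (server.RequestStart(db))
        {
            // A reference to the collection, not its contents.
            MongoCollection<BsonDocument> col = db.GetCollection("people");

            col.Insert(NewPerson("John", "Smith"));
            col.Insert(NewPerson("Jane", "Jones"));

            // FindAll returns an unfiltered cursor of BsonDocuments.
            foreach (BsonDocument doc in col.FindAll())
            {
                // The indexer yields a BsonValue; AsString downcasts it.
                Console.WriteLine(doc["first"].AsString + " " + doc["last"].AsString);
            }
        }
    }
}
```

Running the sketch twice inserts the same two people again, since nothing here clears the collection.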
That’s it. Quite simple.
Querying seems like the weak link in the current driver, as there’s no LINQ support. The different Find methods only accept IMongoQuery objects, which seem to be a narrow list of supported classes. For example, if you’d like to find entities which have “Smith” for a last name, you’ll have to use:
var cursor = col.FindAs<BsonDocument>(Query.EQ("last", "Smith"));
If you’d like all entities with a last name of ‘Jo’ or later in the alphabet, here’s a possibility:
If you’d like something more complex, like “all last names which begin with ‘S’”, you’ll have to combine Query.And( ) with Query.GTE( ) and possibly Query.LT( ).
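Here is a hedged sketch of how the two range queries above might look with the query builder. It assumes the 1.x official driver, a local mongod, and the hypothetical "hello" database and "people" collection; UpperBound and Print are helpers of my own, not part of the driver. The “begins with ‘S’” case is expressed as the range "S" <= last < "T", which is only meant for simple ASCII prefixes.

```csharp
using System;
using System.Collections.Generic;
using MongoDB.Bson;
using MongoDB.Driver;
using MongoDB.Driver.Builders;

public static class QuerySketch
{
    // Exclusive upper bound for a prefix search: bump the last character
    // by one, e.g. "S" -> "T". Assumes a non-empty prefix whose last
    // character is not char.MaxValue.
    public static string UpperBound(string prefix)
    {
        char last = prefix[prefix.Length - 1];
        return prefix.Substring(0, prefix.Length - 1) + (char)(last + 1);
    }

    static void Print(string title, IEnumerable<BsonDocument> docs)
    {
        Console.WriteLine(title);
        foreach (BsonDocument doc in docs)
            Console.WriteLine("  " + doc["last"].AsString);
    }

    public static void Main()
    {
        MongoCollection<BsonDocument> col =
            MongoServer.Create().GetDatabase("hello").GetCollection("people");

        // Last name of "Jo" or later in the alphabet.
        Print("last >= Jo", col.Find(Query.GTE("last", "Jo")));

        // Last names beginning with "S": "S" <= last < "T".
        Print("last starts with S", col.Find(Query.And(
            Query.GTE("last", "S"),
            Query.LT("last", UpperBound("S")))));
    }
}
```

Both queries compare strings case-sensitively, so a last name stored as "smith" would not match the second one.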
I liked the following part: the data can be serialized and deserialized to strongly typed classes. For simple typed classes, this can be done implicitly without going to too much trouble. It seems that all there is to do is create a class with the relevant properties, add a BsonObjectId member, use generics, and you’re done:
- Lines 6-11 define a Person class with properties and an Object Id.
- Line 21 returns a reference for a Person-typed collection.
- Line 22 inserts a typed Person object into the collection.
- Line 23 returns a Person cursor!
- Lines 24-31 iterate over the Persons in the cursor.
- Line 26 refers to the ‘last’ property of the Person (not a hashmap lookup!)
- Lines 28-29 demonstrate an Update operation.
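A rough sketch of the typed approach described in these notes follows. Again, it is not the original listing and its line numbers differ. It assumes the 1.x official driver and a local mongod; the database and collection names and the property names are hypothetical. I use Save for the update step and materialize the cursor with ToList() before modifying anything, which is my own precaution rather than something the documentation asks for.

```csharp
using System;
using System.Linq;
using MongoDB.Bson;
using MongoDB.Driver;

// A simple typed entity: plain properties plus an object id member.
public class Person
{
    public BsonObjectId Id { get; set; }
    public string first { get; set; }
    public string last { get; set; }
}

public static class TypedSketch
{
    public static void Main()
    {
        MongoDatabase db = MongoServer.Create().GetDatabase("hello");

        using (db.Server.RequestStart(db))
        {
            // A reference to a Person-typed collection.
            MongoCollection<Person> col = db.GetCollection<Person>("typed_people");

            // Insert a typed object; the Id is left unset here.
            col.Insert(new Person { first = "John", last = "Smith" });

            // FindAll now yields Person objects instead of BsonDocuments.
            foreach (Person p in col.FindAll().ToList())
            {
                Console.WriteLine(p.last); // a real property, not a lookup by name

                // Update: change the object and write it back.
                p.first = "Johnny";
                col.Save(p);
            }
        }
    }
}
```

A separate collection is used on purpose: documents with elements that have no matching property on Person may fail to deserialize.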
If you require more complex serializations, you’ll probably get some ideas from this tutorial.
Complex data types and null
I was curious to see how mongoDB and its driver deal with complex types, such as binary and datetime, and how they deal with null. For example, as mongoDB is a hierarchical db, I could store different items in a collection. Some would have missing data, which is equivalent to a null value in a relational database. As you can see below, mongoDB behaved as expected and did just fine with missing data and complex types:
- Line 12 has a byte array property.
- Line 13 has a nullable DateTime.
- Line 26 inserts a Person with all the attributes.
- Line 27 inserts a Person without the byte array or the nullable DateTime.
- The two watches show how the two entities Deserialize as expected (note the nulls in the second object).
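The experiment can be sketched along these lines, under the same assumptions as before (1.x official driver, local mongod, hypothetical names). DetailedPerson is a made-up class name; the byte array and the nullable DateTime stand in for the two extra members mentioned in the notes.

```csharp
using System;
using System.Linq;
using MongoDB.Bson;
using MongoDB.Driver;

public class DetailedPerson
{
    public BsonObjectId Id { get; set; }
    public string first { get; set; }
    public string last { get; set; }
    public byte[] photo { get; set; }    // binary data
    public DateTime? born { get; set; }  // nullable date
}

public static class ComplexTypesSketch
{
    public static DetailedPerson Full()
    {
        return new DetailedPerson
        {
            first = "John",
            last = "Smith",
            photo = new byte[] { 1, 2, 3 },
            born = new DateTime(1980, 1, 1, 0, 0, 0, DateTimeKind.Utc)
        };
    }

    // Same entity without the byte array or the date.
    public static DetailedPerson Partial()
    {
        return new DetailedPerson { first = "Jane", last = "Jones" };
    }

    public static void Main()
    {
        MongoDatabase db = MongoServer.Create().GetDatabase("hello");
        MongoCollection<DetailedPerson> col =
            db.GetCollection<DetailedPerson>("detailed_people");

        col.Insert(Full());
        col.Insert(Partial());

        // Read everything back and show which members came back as null.
        foreach (DetailedPerson p in col.FindAll().ToList())
        {
            Console.WriteLine("{0}: photo={1}, born={2}",
                p.first,
                p.photo == null ? "null" : p.photo.Length + " bytes",
                p.born.HasValue ? p.born.Value.ToString("u") : "null");
        }
    }
}
```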
I believe that this post just touches the tip of the iceberg. Quite a few topics remain untouched so far: Indexes, Replication, “Foreign Keys” (i.e. whether FK-like relations are possible between collections and entities), Transactions, and many others. But the general idea is much clearer to me now. To be honest, I really love the idea of replacing the tabular database, but I reckon it’ll be some time before I actually switch to working with one on a “professional level”.
LINQ support is something I desperately miss in the current official driver.
Note: You might want to investigate RavenDB, which looks promising, but I’m uncertain how mature it is at this time. mongoDB’s current poor documentation and lack of LINQ support are certainly disappointing, and I really hope that the guys there make a worthy and noticeable change.