First attempt at NoSQL: C# and mongoDB

08 Aug

NoSQL is a buzzword I’ve been hearing for quite some time. It’s one of those things you hear about and want to try out but never get the chance. I became even more motivated when I came across this article, which attempted to summarize the “7 Surprising Trends That Show What Tech Skills You Need to Succeed” and stated: “NoSQL is a small but growing niche.” I just had to try coding a NoSQL “Hello World” example.

I had no prior experience with NoSQL. After a little reading, I understood NoSQL to be simply a common name for databases which are not tabular and (naturally) do not use SQL. In simple words, these databases do not force us to store our data in tables, as the more popular and traditional databases do. Instead, data can be stored hierarchically, which is usually how real-world business data is shaped. Personally I like this concept. Long before I started to use LINQ to SQL as an OR/M, I used to write a data layer which not only retrieved data but also converted DataSets to business objects/entities and vice versa. I disliked the fact that databases force us to store business objects as tabular data when the objects themselves are not tabular. Moreover, although I have been using business objects consistently for years now, there’s no doubt in my mind that my thinking is affected by the relational model, as you always have to keep in mind that in the end, your hierarchical objects will be stored in a tabular database.

There seems to be a variety of NoSQL database architectures and features, some of which are designed to handle very large amounts of data and to scale; others are more limited in capabilities. It turns out that one has to do some homework in order to choose the “right database”. I read some more and thought I’d try out mongoDB, which seemed like a leading solution in the field.

Installation

Actually, this was quite simple: just go to the downloads section of mongoDB and download the latest version suitable for your machine (as of this writing, the latest version was 1.8.2, released in June 2011). After downloading, do the following:

1. Unzip the downloaded zip file.

2. Create a directory for the db files. This seems to default to data\db in the root of the drive you use, but you can use the command line to override the default. Let’s assume that you create x:\data\db

3. Open the command line (cmd) and go to the directory where you unzipped the db.

4. Type: mongod.exe --dbpath x:\data\db

That’s it. No lengthy downloads; no lengthy wizard installations as in SQL Server; no impossibly cumbersome Oracle installations; no complex MySQL installations etc. You get the idea.

mongoDB drivers for C#

There are several alternatives for using mongoDB from C#. I usually prefer an official driver over an open source driver as long as one is available. My guess is that once an official driver exists, it will probably be more up to date and consistent with the product itself. However, I did notice that LINQ support doesn’t really exist in the current official driver (version 1.1), although it does seem to exist in the open source driver, so you might want to consider it after all, at least as long as the official driver does not support it. After I downloaded the official driver and unzipped it, all that remained was to Add a Reference to the two assemblies from a VS (console) project.

I went on to read the “CSharp Driver Tutorial“. Unfortunately the documentation is quite lacking, and as one of the commenters (“Mike”) noted, it is more of a reference than a tutorial. So I had to read the documentation/reference and put together a quick start for myself.

Quick start

In mongoDB, the basic type stored and retrieved is BSON, which stands for Binary JSON. I was unfamiliar with BSON, so I read a little. It seems that BSON resembles JSON and adds support for data types which are not JSON types, such as Date or BinData. It is also claimed that “Compared to JSON, BSON is designed to be efficient both in storage space and scan-speed. Large elements in a BSON document are prefixed with a length field to facilitate scanning. In some cases, BSON will use more space than JSON due to the length prefixes and explicit array indices.” I found the fact that mongoDB uses a JSON-like format for its entities quite cool and easy to work with, especially when it comes to web programming. The mongoDB assemblies contain a set of API methods which interact with the basic BsonDocument class. This is important to understand, as it makes what follows trivial.

Here’s the first example I made for opening a connection, inserting and retrieving JSON-like entities:

Here’s an explanation:

  • Line 11 opens a connection to a database server. The parameterless Create() seems to use the local machine as the database server by default. If no server is available there, you will receive an exception soon enough. You can also pass a connection string to reach other servers. The documentation states that a single instance is created per server, and a connection pool is implicitly created as required.
  • Line 12 returns the required database on the server. It seems that you can provide the name of a non-existing database here and a database object will still be returned.
  • Line 13: according to the documentation, it isn’t mandatory but recommended to call RequestStart in order to ensure that a series of operations is performed on the same connection. The documentation even states that this is required “in order to guarantee correct results.” If you use RequestStart, you must either call RequestDone to decrement the inner counter involved, or simply use the using(…) statement, which calls the Dispose() method (as shown in the example above).
  • Line 15: as no tables are involved, mongoDB uses the term Collections to represent a collection of BsonDocuments. GetCollection doesn’t actually return the entire contents of the collection named in the argument, but a simple reference which is used to perform the different CRUD operations.
  • Lines 17-18: we finally get to the point where we insert items into our collection. I created BsonDocuments on the fly, with JSON-like attributes. The Insert seems to take place immediately; hence, I believe that RequestStart is not transactional as I hoped it would be.
  • Line 20: Find methods are the way to retrieve information from mongoDB. They return cursors which can be iterated over. FindAll returns a non-filtered cursor. If we would like a filtered cursor, we have to use one of the other Find methods, which accept a Query object as an argument. Unfortunately, LINQ is not supported in the current official driver.
  • Lines 21-26 iterate over the cursor. Note that the items returned are of BsonDocument type, and the indexer returns a BsonValue subclass. BsonValue is an abstract base class for the different value types, such as BsonString or BsonDateTime. From there you can downcast to the type you need.
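The steps described above can be sketched roughly as follows. This is not the original listing, so the line numbers in the explanation will not match it. It is a minimal sketch against the 1.x official driver (the MongoDB.Bson and MongoDB.Driver assemblies), assuming a mongod instance is running on the local machine; the database name "test", the collection name "people" and the field names are hypothetical.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Driver;

class QuickStart
{
    static void Main()
    {
        // Parameterless Create() targets a mongod on the local machine.
        MongoServer server = MongoServer.Create();

        // "test" is a hypothetical database name.
        MongoDatabase db = server.GetDatabase("test");

        // Keep the following operations on one connection; Dispose() ends the request.
        using (server.RequestStart(db))
        {
            // A reference to the collection, not its contents ("people" is hypothetical).
            var col = db.GetCollection("people");

            // Insert two JSON-like documents built on the fly.
            col.Insert(new BsonDocument { { "first", "John" }, { "last", "Smith" } });
            col.Insert(new BsonDocument { { "first", "Jane" }, { "last", "Jones" } });

            // FindAll() returns an unfiltered cursor.
            // Assumes every document in the collection has both fields.
            foreach (BsonDocument item in col.FindAll())
            {
                // The indexer returns a BsonValue; AsString converts it to a .NET string.
                string first = item["first"].AsString;
                string last = item["last"].AsString;
                Console.WriteLine("{0} {1}", first, last);
            }
        }
    }
}
```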

That’s it. Quite simple.

Filtering

This seems like the weak link in the current driver as there’s no LINQ support. The different Find methods only accept IMongoQuery objects, which seem to be a narrow list of supported classes. For example, if you’d like to Find entities which have a “Smith” for a last name, you’ll have to use:

var cursor = col.FindAs(Query.EQ("last", "Smith"));

If you’d like all entities with a last name of ‘Jo’ or later in the alphabet, here’s a possibility:

If you’d like something more complex like “all last names which begin with ‘S’”, you’ll have to combine a Query.And( ), Query.GTE( ) method and possibly Query.LT( ).
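As a rough illustration of the two simpler filters, here is a minimal sketch using the Query builder from the MongoDB.Driver.Builders namespace of the 1.x official driver. It assumes a local mongod and a hypothetical "people" collection of documents that have a string "last" field; it is not the original listing. The combined "begins with S" case is left out because how Query.And treats two conditions on the same field may depend on the driver version.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Driver;
using MongoDB.Driver.Builders;

class FilterSketch
{
    static void Main()
    {
        MongoServer server = MongoServer.Create();       // local mongod assumed
        MongoDatabase db = server.GetDatabase("test");   // hypothetical database name
        var col = db.GetCollection("people");            // hypothetical collection name

        // Exact match: last == "Smith".
        var smiths = col.Find(Query.EQ("last", "Smith"));
        foreach (BsonDocument item in smiths)
        {
            Console.WriteLine("EQ:  " + item["last"].AsString);
        }

        // Range match: last >= "Jo", i.e. "Jo" or later in string order.
        var fromJo = col.Find(Query.GTE("last", "Jo"));
        foreach (BsonDocument item in fromJo)
        {
            Console.WriteLine("GTE: " + item["last"].AsString);
        }
    }
}
```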

Typed Entities

I liked the following part: the data can be serialized and deserialized to strongly typed classes. For simple typed classes, this can be done implicitly without going to too much trouble. It seems that all there is to do is create a class with the relevant properties, add a BsonObjectId member, use generics and you’re done:

  • Lines 6-11 define a Person class with properties and an Object Id.
  • Line 21 returns a reference for a Person-typed collection.
  • Line 22 inserts a typed Person object into the collection.
  • Line 23 returns a Person cursor!
  • Lines 24-31 iterate over the Persons in the cursor.
  • Line 26 refers to the ‘last’ property of the Person (not a hashmap lookup!)
  • Lines 28-29 demonstrate an Update operation.
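The typed approach described above might look roughly like this. It is a minimal sketch against the 1.x official driver, not the original listing, so the line numbers in the bullets do not apply to it. The Person property names and the "typedPeople" collection name are hypothetical. The Id is declared as ObjectId here, following the driver author's suggestion in the comments below; the post itself used BsonObjectId.

```csharp
using System;
using System.Linq;
using MongoDB.Bson;
using MongoDB.Driver;

// A plain class; the member named Id is mapped to the document's _id.
public class Person
{
    public ObjectId Id { get; set; }
    public string first { get; set; }
    public string last { get; set; }
}

class TypedSketch
{
    static void Main()
    {
        MongoServer server = MongoServer.Create();       // local mongod assumed
        MongoDatabase db = server.GetDatabase("test");   // hypothetical database name

        // A Person-typed reference to a (hypothetical) collection.
        MongoCollection<Person> col = db.GetCollection<Person>("typedPeople");

        // Insert a typed object; the driver serializes it to BSON.
        col.Insert(new Person { first = "John", last = "Smith" });

        // FindAll() now yields Person objects. The cursor is copied to a list
        // first so that documents are not saved while the cursor is still open.
        foreach (Person p in col.FindAll().ToList())
        {
            Console.WriteLine(p.last);   // a real property, not a dictionary lookup

            // Update: change the object and save it back under the same Id.
            p.last = p.last.ToUpper();
            col.Save(p);
        }
    }
}
```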

If you require more complex serializations, you’ll probably get some ideas from this tutorial.

Complex data types and null

I was curious to see how mongoDB and its driver deal with complex types, such as binary and datetime, and how they deal with null. For example, as mongoDB is a hierarchical db, I could store different items in a collection. Some would have missing data, which is equivalent to a null value in a relational database. As you can see below, mongoDB behaved as expected and did just fine with missing data and complex types:

  • Line 12 has a byte array property.
  • Line 13 has a nullable DateTime.
  • Line 26 inserts a Person with all the attributes.
  • Line 27 inserts a Person without the byte array or the nullable DateTime.
  • The two watches show how the two entities Deserialize as expected (note the nulls in the second object).
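A comparable experiment can be sketched as follows. This is a minimal sketch against the 1.x official driver rather than the original listing; the property names "photo" and "birth" and the collection name "complexPeople" are hypothetical, and a local mongod is assumed. Instead of debugger watches, it prints whether each optional member came back as null.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Driver;

public class Person
{
    public ObjectId Id { get; set; }
    public string last { get; set; }
    public byte[] photo { get; set; }      // binary data
    public DateTime? birth { get; set; }   // nullable date
}

class ComplexTypesSketch
{
    static void Main()
    {
        MongoServer server = MongoServer.Create();       // local mongod assumed
        MongoDatabase db = server.GetDatabase("test");   // hypothetical database name
        MongoCollection<Person> col = db.GetCollection<Person>("complexPeople");

        // One Person with every member set...
        col.Insert(new Person
        {
            last = "Smith",
            photo = new byte[] { 1, 2, 3 },
            birth = new DateTime(1980, 1, 1, 0, 0, 0, DateTimeKind.Utc)
        });

        // ...and one without the byte array or the date.
        col.Insert(new Person { last = "Jones" });

        // Read both back and report which optional members are null.
        foreach (Person p in col.FindAll())
        {
            Console.WriteLine("{0}: photo is null = {1}, birth is null = {2}",
                p.last, p.photo == null, !p.birth.HasValue);
        }
    }
}
```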

Summary

I believe that this post just touches the tip of the iceberg. Quite a few topics remain untouched so far: Indexes, Replication, “Foreign Keys” (i.e. whether FK-like relations are possible between collections and entities), Transactions, and many others. But the general idea is much clearer to me now. To be honest, I really love the idea of replacing the tabular database, but I reckon it’ll be some time before I actually switch to working with one on a “professional level”.

LINQ support is something I desperately miss in the current official driver.

Note: You might want to investigate RavenDB, which looks promising, but I’m uncertain how mature it is at this time. mongoDB’s current poor documentation and lack of LINQ support are certainly disappointing, and I really hope that the guys there make a worthy and noticeable change.


Posted on 08/08/2011 in Software Development


10 responses to “First attempt at NoSQL: C# and mongoDB”

  1. Roger Jennings (@rogerjenn)

    10/08/2011 at 21:07

    I recommend you try Ayende’s RavenDB. It appears to me to be reasonably well matured at this point.

    Roger Jennings
    OakLeaf Blog: http://oakleafblog.blogspot.com

     
  2. Mathias Stearn

    11/08/2011 at 01:05

    I would suggest exploring the full mongodb documentation at http://www.mongodb.org/display/DOCS/Home in addition to just the C# docs as most of the information applies to all languages even if the examples use javascript syntax.

    To solve the “starts with S” problem, I’d suggest using regular expressions (http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-RegularExpressions). It would look something like col.find({"last": /^S/}) or in C# col.FindAs(Query.Matches("last", "^S")).

    If you are looking for a “quickstart” guide, there is a nice tutorial at try.mongodb.org and another written by a third party at mongly.com.

     
    • evolpin

      11/08/2011 at 06:29

      Thanks for your pointers Mathias. The regular expression seems to provide a reasonable solution at this time; nevertheless, I still prefer LINQ support.

       
  3. Robert Stam

    11/08/2011 at 20:30

    I share your opinion that LINQ support is very important. Look for it to appear as soon as we can get it in there!

    The existing documentation is called a “tutorial”, but like you, others have also pointed out that is longer than a tutorial normally is. We are planning to refactor the documentation into both a smaller tutorial type documentation and an expanded reference type documentation. If only there were more hours in the day… :) Thanks for writing this tutorial, I expect readers will find it useful.

    RequestStart is very rarely needed. The sentence that includes the words “in order to guarantee correct results” starts with the word “sometimes”! I have edited that paragraph of the documentation to emphasize that RequestStart is rarely needed and to describe one scenario where it is needed.

    One other minor point: when using POCOs it is usually easier to use ObjectId instead of BsonObjectId as the data type of your Id field. The difference is minor, but since BsonObjectId can be thought of as a BsonValue wrapper around an ObjectId it adds a level of indirection.

    Thanks for trying out MongoDB and for writing about it!

     
  4. Mark

    21/08/2011 at 15:54

    Yes Linq support would be the icing on the cake – I am really looking at using mongdb with the CQRS pattern as the querying engine in an MVC application – as in theory it seems well suited to this model. Does anyone have any experience of this – especially in relation to migration of complex relational data models to use a noSQL solution. Converting relational data to a noSQL solution in practice seems quite a daunting task and currently not sure how this would be achieved – I am in the early stages of research so hopefully these questions will be answered.

    The whole concept of using a noSQL database to map directly to view models for read only queries seem very appealing in relation to both scalability and performance for internet based applications.

     
  5. Ryan

    13/01/2012 at 02:08

    Hi thanks for your tutorial it really helped alot. I have some questions though how do i retrieve values from nested documents like in your first example and is there a better way to deal with nulls coming in than using a try catch

     
  6. Dorf

    16/01/2012 at 04:37

    Hi nice tutorial how would you go about retrieving a value from an array or an array of documents this is what i have but i am not sure of the correct syntax or if you can even do it like this

    var cursor = col.FindAll();
    foreach (var item in cursor)
    {
    BsonArray Phones;
    var phone = item["Phones[0]"].AsBsonDocument["Number"].AsString;
    }

    the mongodb i am trying to get it from is like this
    "Phones" : [{
    "Number" : "(08)85682115",
    "Type" : 0.0,
    "VisibleTo" : 0.0
    }],

     
