2012-08-29

GROUP BY with MongoDB (and Scala and Morphia)

This is an example of how to run a query similar to GROUP BY from the SQL world, but in MongoDB. MongoDB doesn't have GROUP BY, but it has a very similar group command.

In my example I want to return the number of songs per artist, therefore I'm grouping by artist name.

db.MongoDbSong.group(
    {key: {artistName: true},
    initial: {totalSongs: 0},
    reduce: function(obj, prev) { prev.totalSongs++;}
})

This example is from my MongoDB + Scala + Gradle + Morphia app. It turns out that Morphia does not help with that at the moment, so we have to run the query raw and then process the results.

class SongPerArtistMining @Inject() (private val persistenceProvider: MorphiaMongoDbPersistenceProvider) {

 private val query = "db.MongoDbSong.group("+
    "{key: {artistName: true},"+
    "initial: {totalSongs: 0},"+
    "reduce: function(obj, prev) { prev.totalSongs++;}"+
    "})";
 
 private val formatter = NumberFormat.getInstance(Locale.ENGLISH)


 def mine(maxResults: Int): Option[List[(String, Int)]] = {
   val mongo = persistenceProvider.getMongo
   val result = mongo.getDB("audiolicious").eval(query)
   result match {
     case list: BasicDBList => return Some(processAndRetrieveResults(list, maxResults))
     case _ => println("Error while obtaining stats. Result of unknown type [%s].".format(result))
         return None;
   }
 }

Essentially, this does it:

val result = mongo.getDB("audiolicious").eval(query)

Processing results

And here's how to process results.

private def processAndRetrieveResults(list: BasicDBList, maxResults: Int): List[(String, Int)] = {
   return processResults(list).sortWith(compareSecondValueInteger).slice(0, maxResults)
 }

 def compareSecondValueInteger(e1: (String, Int), e2: (String, Int)) = e1._2 > e2._2
 
 private def processResults(list: BasicDBList): List[(String, Int)] = {
   var results : List[(String, Int)] = List()
   for (i <- 0 until list.size()) {
     val item = list.get(i)
     item match {
       case dbObject: BasicDBObject => results ::= processResult(dbObject)
       case _ => println("Error while obtaining stats. Result item of unknown type [%s].".format(item.getClass()))
     }
   }
   return results
 }
 
 private def processResult(dbObject: BasicDBObject): (String, Int) = {
   val artistName = dbObject.get("artistName").toString()
   val totalSongs = parseDouble(dbObject.get("totalSongs").toString(), formatter).intValue()
   return (artistName, totalSongs)
 }

 // http://stackoverflow.com/a/9542323
 def parseDouble(s: String, nf: NumberFormat) = {
   val pp = new ParsePosition(0)
   val d = nf.parse(s, pp)
   if (pp.getErrorIndex == -1) d.doubleValue else 0
 }

Abandoned Sanatorium - Beelitz-Heilstätten (my pictures)

Hi, these pictures are from Beelitz-Heilstätten, the abandoned sanatorium in Germany, where Adolf Hitler once was a patient.

Beelitz-Heilstätten

2012-08-25

Unique compound index with Morphia (MongoDb) and Scala

Hi, here's how to create a unique compound index with Morphia and Scala. The example is from my Gradle + Scala + MongoDb application Also, here's Part 1 of my tutorial on building a Scala + Gradle + Guice applications.

package me.m1key.audioliciousmigration.entities.mongodb
import com.google.code.morphia.annotations.Id
import com.google.code.morphia.annotations.Entity
import org.bson.types.ObjectId
import com.google.code.morphia.annotations.Indexes
import com.google.code.morphia.annotations.Index

@Entity
@Indexes(Array(new Index(value = "name, albumName", unique = true)))
class MongoDbSong(val name: String, val albumName: String) {

 @Id
 var id: ObjectId = _
 
 override def equals(that: Any) = {
   that match {
     case that: MongoDbSong => that.name == this.name &&
             that.albumName == this.albumName
     case _ => false
   }
 }

 override def hashCode(): Int = name.hashCode()

}

Remember to run this code in your app:

datastore.ensureIndexes()

Also, my final note here is this: If you make any such changes to your MongoDb schema, you have to drop the collection from the Mongo client.

db.MongoDbSong.drop()