database - MongoDB: count occurrences of a substring in a collection
Hello, I'm a MongoDB beginner. I have a database of IRC chat logs. The document structure is simple:
{ "_id" : ObjectId("000"), "user" : "username", "message" : "foobar foobar potato idontknow", "time" : NumberLong(1451775601469) }
I have thousands of these, and I want to count the number of occurrences of the string "foobar". I have googled the issue and found aggregations. They look complicated, and I haven't found an example for an issue this "simple". I'd be glad if someone pointed me in the right direction for research, and I wouldn't mind an example command that does what I want. Thank you.
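There is no built-in operator that solves your request.
You can try this query, but it has poor performance, because it reads every document in the collection and runs the regular expression in the shell: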
db.chat.find().forEach(function(doc){ print(doc["user"] + " > " + ((doc["message"].match(/foobar/g) || []).length)) })
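The query above prints one count per document. If a single total for the whole collection is wanted, the same idea can be sketched in plain JavaScript as below. The helper name countOccurrences and the sampleDocs array are assumptions made for illustration; in the shell, the loop body would go inside the forEach callback over db.chat.find() instead.

```javascript
// Hypothetical helper (not part of MongoDB): counts non-overlapping
// occurrences of a literal substring in a string.
function countOccurrences(text, needle) {
  if (typeof text !== "string" || needle === "") {
    return 0;
  }
  // Escape regex metacharacters so the needle is matched literally.
  const escaped = needle.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const matches = text.match(new RegExp(escaped, "g"));
  return matches ? matches.length : 0;
}

// Stand-in for the documents a cursor from db.chat.find() would yield.
const sampleDocs = [
  { user: "username", message: "foobar foobar potato idontknow" },
  { user: "other", message: "potato foobar" },
  { user: "quiet", message: "nothing here" }
];

// Sum the per-document counts into one total.
let total = 0;
for (const doc of sampleDocs) {
  total += countOccurrences(doc.message, "foobar");
}
console.log("total occurrences: " + total);
```

This still visits every document, so it shares the performance caveat of the query above.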
If you change the message field to an array, you can apply an aggregation
...
Edit:
If you add an array of the split words to each entry, you can apply an aggregation.
Sample:
{ "_id" : ObjectId("569bb7040586bcb40f7d2539"), "user" : "username", "fullmessage" : "foobar foobar potato idontknow", "message" : [ "foobar", "foobar", "potato", "idontknow" ], "time" : NumberLong(1451775601469) }
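One way to get from the original document shape to the sample above is to split the message on whitespace. The sketch below does that in plain JavaScript; the function names splitWords and withWordArray are hypothetical, and splitting on whitespace is an assumption about what counts as a word (punctuation stays attached to its word).

```javascript
// Hypothetical helper: splits a message into words on runs of whitespace,
// dropping empty strings caused by leading or trailing spaces.
function splitWords(message) {
  return String(message).split(/\s+/).filter(function (word) {
    return word.length > 0;
  });
}

// Hypothetical helper: returns a copy of the document in the sample shape,
// keeping the original text in "fullmessage" and the words in "message".
function withWordArray(doc) {
  return Object.assign({}, doc, {
    fullmessage: doc.message,
    message: splitWords(doc.message)
  });
}

const original = {
  user: "username",
  message: "foobar foobar potato idontknow",
  time: 1451775601469
};
console.log(JSON.stringify(withWordArray(original)));
```

In the shell, the converted fields would then have to be written back to each document with an update; that step is left out here because it depends on the MongoDB version in use.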
Aggregation: create a new entry for each array element ($unwind), match the given word (foobar, in this case) and count the matched results per message.
db.chat.aggregate([
  { "$unwind" : "$message" },
  { "$match" : { "message" : { "$regex" : "foobar", "$options" : "i" } } },
  { "$group" : { "_id" : { "_id" : "$_id", "user" : "$user", "time" : "$time", "fullmessage" : "$fullmessage" }, "count" : { "$sum" : 1 } } },
  { "$project" : { "_id" : "$_id._id", "user" : "$_id.user", "time" : "$_id.time", "fullmessage" : "$_id.fullmessage", "count" : "$count" } }
])
Result:
[ { "_id" : ObjectId("569bb7040586bcb40f7d2539"), "count" : 2, "user" : "username", "time" : NumberLong(1451775601469), "fullmessage" : "foobar foobar potato idontknow" } ]
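To make clearer what the pipeline computes, here is a rough in-memory imitation of its stages in plain JavaScript. It is only a sketch for reasoning about the result, not how MongoDB executes the aggregation, and the function name countMatchingWords is hypothetical. Note that each array element is counted at most once, so the count is the number of words containing the pattern, and documents with no matching word drop out of the result.

```javascript
// Hypothetical in-memory imitation of the $unwind / $match / $group stages.
// "word" is treated as a regular expression, as $regex does.
function countMatchingWords(docs, word) {
  // Case-insensitive, like "$options" : "i". No "g" flag, so test() keeps no state.
  const pattern = new RegExp(word, "i");
  return docs
    .map(function (doc) {
      // Unwind + match: keep only the array elements that match the pattern.
      const matched = doc.message.filter(function (element) {
        return pattern.test(element);
      });
      // Group: one result per original document with the number of matches.
      return {
        user: doc.user,
        time: doc.time,
        fullmessage: doc.fullmessage,
        count: matched.length
      };
    })
    // Documents whose elements were all filtered out do not appear at all.
    .filter(function (result) {
      return result.count > 0;
    });
}

const chat = [
  {
    user: "username",
    fullmessage: "foobar foobar potato idontknow",
    message: ["foobar", "foobar", "potato", "idontknow"],
    time: 1451775601469
  },
  {
    user: "other",
    fullmessage: "just potato",
    message: ["just", "potato"],
    time: 1451775601470
  }
];
console.log(JSON.stringify(countMatchingWords(chat, "foobar")));
```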