viernes, 12 de agosto de 2016

Mongotail 2.0 with new performance features, SSL and MongoDB 3.2 support

A couple of weeks ago I released Mongotail 2.0.0, with new features and support for the latest version of MongoDB 3.2. Also one of the most important features is the ability to login against a database with SSL authentication, and new options that give us more information about what is happening in our database, like the consumed time by a query, or how many registries the engine had to "walk" to find a particular record.

What is Mongotail? mongotail is a command line tool to output any operation from a Mongo database in the standard output. You can see the operations collected by the database profiler from your console.

In the following steps we will see how to use these new features, and at the end of the post are the instructions to update or install the tool.

Activating the profiler of a database

You need to activate the profiler of a database to see what is happening. In the following steps we will assume that we are connecting against a database called sales in your local computer (localhost). For more instructions of how to connect with an external database and other options, executes mongotail --help.

First at all, go to the console and activate the profiler with the -l 2 option:

$ mongotail sales -l 2
Profiling level set to level 2

Watching the logs with the new features

Once the profiling is activated, let started to see the logs. Go to a Mongo client and just execute some queries, or launch the application that connects with your database, and all the operations will be registered by the profiler. Then to see the latest instructions executed, go back to the command line and execute:

$ mongotail sales
2016-07-27 22:30:36.822 UPDATE  [mongeez] : {"type": {"$exists": false}}, {"$set": {"type": "changeSetExecution"}}. 0 updated.
2016-07-27 22:30:36.828 QUERY   [mongeez] : {"type": "configuration"}. 1 returned.
2016-07-27 22:30:36.839 COUNT   [operations] : {"user": "mrsarm", "type": ...

The first change you can see compared with previous versions is the final information at each line counting how many records were affected (0 updated. and 1 returned.).

But the more interesting feature is to see the changes in "real time" with the -f option, and occasionally filter the result with grep to find a particular operation (this is an old feature). For example, to see in "real time" only new changes, you can use the -f option and pipe the command with grep like the following:

$ mongotail sales -f | egrep "(INSERT|UPDATE|REMOVE)"

The -m option

One of the options is -m, --metadata, that give us extra metadata fields from the profiler. Fields may vary depending of the operation and the MongoDB version, some well known fields are: millis, nscanned, docsExamined, execStats and lockStats.

This is an example getting the latest 2 queries (-n 2) with the milliseconds taken by each query and some statistics about the execution:

$ mongotail sales -n 2 -m millis execStats
2016-08-10 02:01:47.646 QUERY   [users] : {"_id": ObjectId("56aa65eff7e82fee8b4433c3")}. 1 returned. millis: 0, execStats: {"needYield": 0, "docsExamined": 1, "saveState": 0, "restoreState": 0, "isEOF": 1, "nReturned": 1, "needTime": 0, "keysExamined": 1, "executionTimeMillisEstimate": 0, "invalidates": 0, "works": 1, "advanced": 1, "stage": "IDHACK"}
2016-08-10 02:04:29.654 REMOVE  [invites] : {"_id": ObjectId("57aa8b16e4b0aef63a46c4e7")}. 1 deleted. millis: 0, execStats: {}

If you want to know how many records MongoDB needs to "walk" to find a particular result, use with -m nscanned (doesn't work for all types of operations):

$ mongotail sales -n 3 -m nscanned
2016-08-10 02:04:29.654 QUERY   [ops] : {"deleted": {"$exists": false}, "prod_id": "367133"}. 2 returned. nscanned: 9

The -v option

But, how many meta-data fields are available for a query? As I said before, it depends of the operation and the MongoDB version, but you can see all the information of each operation if you use the -v, --verbose option instead of -m option. All the information will printed in "RAW" format, without the pre-formatting Mongotail does, so I don't recommend this option unless you want to know what information is available for a given query, so the next execution you can use with -m FIELD_I_FOUND_INTERESTING ...

$ mongotail sales -v
{"cursorExhausted": true, "writeConflicts": 0, "docsExamined": 1, "keysExamined": 1, "execStats": {"needYield": 0, "saveState": 0, "restoreState": 0, "isEOF": 1, "inputStage": {"needYield": 0, "docsExamined": 1, "saveState": 0, "restoreState": 0, "isEOF": 0, "inputStage": {"saveState": 0, "isEOF": 0, "seenInvalidated": 0, "keysExamined": 1, "nReturned": 1, "invalidates": 0, "keyPattern": {"_id": 1}, "isUnique": true, "needTime": 0, "isMultiKey": false, "executionTimeMillisEstimate": 0, "dupsTested": 0, "restoreState": 0, "direction": "forward", "indexName": "_id_", "isSparse": false, "advanced": 1, "stage": "IXSCAN", "dupsDropped": 0, "needYield": 0, "isPartial": false, "indexBounds": {"_id": ["[ObjectId("'5751b9a64c5c774e25434d14'), ObjectId("'5751b9a64c5c774e25434d14')]"]}, "works": 1, "indexVersion": 1}, "nReturned": 1, "needTime": 0, "filter": {"$not": {"deleted": {"$exists": true}}}, "executionTimeMillisEstimate": 0, "alreadyHasObj": 0, "invalidates": 0, "works": 1, "advanced": 1, "stage": "FETCH"}, "nReturned": 1, "needTime": 0, "executionTimeMillisEstimate": 0, "limitAmount": 1, "invalidates": 0, "works": 1, "advanced": 1, "stage": "LIMIT"}, "responseLength": 423, "millis": 0, "ts": ISODate("2016-08-10T22:56:44.951Z"), "nreturned": 1, "client": "172.17.0.4", "locks": {"Global": {"acquireCount": {"r": 2}}, "Collection": {"acquireCount": {"r": 1}}, "Database": {"acquireCount": {"r": 1}}}, "numYield": 0, "user": "", "keyUpdates": 0, "query": {"filter": {"_id": ObjectId("5751b9a64c5c774e25434d14")}, "find": "users", "ntoreturn": -1}, "ns": "sales.users", "allUsers": [], "op": "query"}
...

Find slow queries in MongoDB

When you activate the profiler with Mongotail, you can choose level 1 profiling instead of level 2. Level 1 configure the profiler system to log only "slow" operations. Then you have to set the threshold in milliseconds for the profile to consider an operation "slow". In the following example the threshold is set to 5 milliseconds:

$ mongotail sales -l 1
Profiling level set to level 1
$ mongotail sales -s 5
Threshold profiling set to 5 milliseconds

Then when you check your databases only operations that take 5 or more milliseconds will be displayed.

The -i option

The option -i, --info is a working in progress feature, but for now it gives you some useful information about your database and the MongoDB instance you are connecting to.

Considerations for production environments

Mongotail purpose is allow users debug their databases to find a particular problem, and see what is happening in real time. However the level chosen can affect performance, and the queries are written to the profiler log, which might have information security implications for production deployments. Remember to setup your database profiling level to 0 again after debugging your data:

$ mongotail sale -l 0

The profiling level configuration is not saved in any place and it's applied on the "fly", so restart the MongoDB instance always will set to level 0 (no profiling) for all databases again if you forgot doing so.

What about the new logging capabilities present in MongoDB 3.0+

Since Mongo 3.0 it's possible to debug the activity in a log file similarly other databases do, instead of save the activity in a capped collection like the MongoDB profiler does (Mongotail uses this system, watching, filtering and formatting the information saved at the system.profile collection). In this case you don't need Mongotail to see the activity, just look at the /var/log/mongodb/mongod.log file with any editor or with cat, less, tail... from the command line, but, like the profiler, the format used by the logging system is ugly, extremely verbose, and the information for a single query is displayed in many lines instead of a single one, getting less readable and more difficult to process the log using scripts.

Also some information available in the profiler system (therefore Mongotail) isn't registered in the logging system.

How to update the tool

If you have a previous version of Mongotail, in any Unix OS (Linux, OSX) you only need to execute:

$ sudo pip install mongotail --upgrade

Or if you don't have pip installed, try with easy_install:

$ sudo easy_install --upgrade mongotail

How to install the tool

Mongotail is available at PyPI Python Package Repository. You can find a more complete guide of how to install Mongotail in the README instructions. But basically to do it:

$ sudo pip install mongotail

Or if you don't have pip installed, it can be used easy_install to install Mongotail:

$ sudo easy_install mongotail

Whats next?

Well, next version I think it will be fine to add more "status" capability options, like get the consumed resource for a given instance in real time. Also I want to add some pretty print capabilities to the command, like colorizing the JSON output.

If you think there is a better feature to add, or you find a bug, go to the Issues section and post a new issue. Pull requests also are welcome!

Spanish documentation

If you want to see how to use Mongotail in "Español", you can see this previous post.