NewRelic, as we all know, is an awesome tool for monitoring our web applications. It’s like looking at an x-ray of our running web app and knowing what is happening internally.
Another area where NewRelic excels is its super easy integration with our code. In just a few steps we can have all the goodness of NewRelic integrated into our platform. But there are days when we are not so lucky and the integration just does not work as expected. Today was one such day, and what we found may help others faced with a similar issue.
We are developing a solution hosted on Azure WebRole and decided to integrate NewRelic into our web application for obvious reasons. We followed the instructions and registered ourselves, got the license key, installed the NewRelic package, did a deployment and waited!
It did not work. No logs! We had integrated NewRelic earlier and it had always worked without any issue.
We double checked: the configurations were correct and the license key was correct, but still no logs. The next obvious step was to look for help on the NewRelic site. This article gave some pointers on what should be checked.
As per the article, we looked at the web role event logs for errors, but there were none. The article also suggested enabling diagnostic logging for the NewRelic Agent deployed on the machine to find out why log generation was failing. This article described how to do that.
Working through the steps to generate diagnostic logs for the NewRelic Agent, we realized that the Agent itself was not installed on the machine we were probing. The agent should have been present at this location:
%ALLUSERSPROFILE%\New Relic\.NET Agent
But there was no trace of the agent in the %ALLUSERSPROFILE% folder. We now knew why no logs were being generated. The focus shifted to finding the reason for the Agent installation failure.
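The check above can also be scripted. This is only a minimal sketch, assuming the agent folder is New Relic\.NET Agent under %ALLUSERSPROFILE%; the function names are made up for illustration and are not part of any NewRelic tooling.

```python
import ntpath
import os


def agent_install_dir(env=None):
    """Build the folder where we expect the .NET Agent (an assumed location).

    `env` is a mapping of environment variables; it defaults to os.environ.
    Returns None when ALLUSERSPROFILE is not set (e.g. on a non-Windows box).
    """
    env = os.environ if env is None else env
    base = env.get("ALLUSERSPROFILE")
    if not base:
        return None
    # ntpath joins with backslashes regardless of the OS running the script.
    return ntpath.join(base, "New Relic", ".NET Agent")


def agent_is_installed(env=None):
    """True only if the expected agent folder exists on this machine."""
    path = agent_install_dir(env)
    return path is not None and os.path.isdir(path)


if __name__ == "__main__":
    print("Expected agent folder:", agent_install_dir())
    print("Agent folder present:", agent_is_installed())
```

Running it on the role instance over remote desktop gives a quick yes/no before digging into the installer logs.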
NewRelic does generate logs when installing the NewRelic Agent and Server Monitor. Looking at one of the log files (nr_install.log) on the web role machine we found these few lines of interest:
MSI (c) (2C:34) [09:38:08:853]: Client-side and UI is none or basic: Running entire install on the server.
MSI (c) (2C:34) [09:38:11:859]: Failed to grab execution mutex. System error 258.
MSI (c) (2C:34) [09:38:11:859]: Cloaking enabled.
On the web role this file is located in the role root folder, which in this case is on the same drive as the siteroot folder (the folder behind the IIS web application). Look at the newrelic.cmd file (which gets added to the web role project) to learn more about the location of the log files.
While the error was not conclusive, it was worth a search. After about five minutes of Googling we realized that this error occurs when Windows tries to start a new installation while another one is already in progress. Windows Installer runs only one installation at a time, and system error 258 is a wait timeout: our installer gave up waiting for the other installation to finish.
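If you have several role instances to check, scanning the install log for this message is easy to automate. Below is a small sketch, not an official tool: the function name is hypothetical, and it works on lines of text that have already been read, because the encoding of nr_install.log on your machine is something you would need to verify yourself.

```python
import re

# The message text is taken from the log excerpt above.
MUTEX_ERROR = re.compile(r"Failed to grab execution mutex\. System error (\d+)")


def find_mutex_failures(lines):
    """Return (line_number, system_error_code) for each mutex failure found."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = MUTEX_ERROR.search(line)
        if match:
            hits.append((number, int(match.group(1))))
    return hits


sample = [
    "MSI (c) (2C:34) [09:38:08:853]: Client-side and UI is none or basic: Running entire install on the server.",
    "MSI (c) (2C:34) [09:38:11:859]: Failed to grab execution mutex. System error 258.",
    "MSI (c) (2C:34) [09:38:11:859]: Cloaking enabled.",
]

print(find_mutex_failures(sample))  # [(2, 258)]
```

A non-empty result is a strong hint that another installer was running at the same moment.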
Our web role did indeed have another start-up task, and it too was installing a few components during role initialization. It seemed that the NewRelic Agent installer ran while that other installation was still going on.
We looked at the start-up task configuration in the role definition file (.csdef).
There were two tasks, one background and one simple. And if you have not guessed already, therein lies the problem. Azure starts background and foreground tasks asynchronously and does not wait for them to finish, whereas simple tasks run synchronously, one at a time. So the simple task started while the background task was still running, the two installations overlapped, and the NewRelic Agent installation failed.
The solution to this problem was to make the startup.cmd task a simple task as well. This way the two tasks run sequentially and do not tread on each other.
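To spot this kind of setup quickly, you can list the start-up tasks in a .csdef and flag any that are not simple. The sketch below is illustrative only: the embedded sample is a made-up definition (the service name, role name, and executionContext values are assumptions, not our real file), and it treats a missing taskType as simple, which is my understanding of the Azure default.

```python
import xml.etree.ElementTree as ET

# Hypothetical .csdef resembling the problematic setup: one background task
# followed by the simple NewRelic task.
SAMPLE_CSDEF = """
<ServiceDefinition name="MyService"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
  <WebRole name="MyWebRole">
    <Startup>
      <Task commandLine="startup.cmd" executionContext="elevated" taskType="background" />
      <Task commandLine="newrelic.cmd" executionContext="elevated" taskType="simple" />
    </Startup>
  </WebRole>
</ServiceDefinition>
""".strip()


def startup_tasks(csdef_text):
    """Return (commandLine, taskType) for every Task element, in file order."""
    root = ET.fromstring(csdef_text)
    tasks = []
    for element in root.iter():
        # Drop the "{namespace}" prefix so the check works with any xmlns.
        if element.tag.rsplit("}", 1)[-1] == "Task":
            tasks.append((element.get("commandLine"),
                          element.get("taskType", "simple")))
    return tasks


def non_simple_tasks(csdef_text):
    """Command lines of tasks that may overlap with the tasks after them."""
    return [cmd for cmd, kind in startup_tasks(csdef_text) if kind != "simple"]


print(startup_tasks(SAMPLE_CSDEF))
print(non_simple_tasks(SAMPLE_CSDEF))  # ['startup.cmd']
```

Any task reported by non_simple_tasks that runs an installer is a candidate for changing to taskType="simple".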
We made the change, redeployed the solution, and it all worked like a charm!
In case you are having problems with NewRelic agent logging, this area is worth probing into.