Tuesday, October 26, 2021

Antivirus software and performance impact

Sometime back, I received a request  about Appdynamics not fetching data in windows server 2016  but it works fine in other operating systems. To give you a background, Appdynamics fetches data via WMI scripts, which inturn obtain data from perfmon counters in regular time intervals.

I was presented with the following metrics for WMI scritps(written in VB) used by appdynamics and clearly shows windows 2016 is slower. 






Appdynamics vendor provided proof that it works in other companies where same OS was hosted.  

So the challange is to debug in our environment, I took an approach to trace all the system calls using procmon to determine the rootcause.




Root cause of this WMI script slowness in 2016 is due to MacAfee interfering with the process. If you see the snapshot above, McAfee took 352 sec which attributes to 90% of the total time. The same behavior is not found in 2012 version.

Mcafee accepted the issue and gave us a patch which resolved the problem. 

Debugging is always fun :)


Tuesday, October 19, 2021

IIS Application app pool getting crashed intermittently

I was debugging another interesting issue  where after doing a specific business transaction, Application pool getting stopped and service become unavailable.

Application logs doesn’t give much details

2021-10-08 19:48:25.8085 4.10.191 INFO Beginning Startup...

2021-10-08 19:48:25.8505 4.10.191 INFO CORS configuration found, allowing the following Origins: 

2021-10-08 19:48:25.8505 4.10.191 INFO Allowed Origin: https://xxx.hostname..local

2021-10-08 19:48:25.8505 4.10.191 INFO Applying CORS Configurations

2021-10-08 19:48:25.8505 4.10.191 WARN `Auth Configuration not found, defaulting to IWA

2021-10-08 19:48:25.8505 4.10.191 INFO Configuring for Intergrated Windows Authentication

2021-10-08 19:48:28.0246 4.10.191 INFO Authorization.Extension: Start parsing : "D:\inetpub\wwwroot \1.0\bin\Authorization.rules".

2021-10-08 19:48:28.1396 4.10.191 INFO Authorization Role Definitions have been created.

2021-10-08 19:48:28.1396 4.10.191 INFO Authorization.Extension: Successfully finished parsing : "D:\inetpub\wwwroot\1.0\bin\Authorization.rules".

2021-10-08 19:48:31.2079 4.10.191 INFO Startup Completed.

Another Http exception was also observed as below 

Message: Request is not available in this context
Stack:
[HelperMethodFrame]
System.Web.HttpContext.get_Request()
System.Web.HttpContextWrapper.get_Request()
NLog.Web.Internal.HttpContextExtensions.TryGetRequest(System.Web.HttpContextBase)
NLog.Web.LayoutRenderers.AspNetRequestValueLayoutRenderer.DoAppend(System.Text.StringBuilder, NLog.LogEventInfo)
NLog.LayoutRenderers.LayoutRenderer.RenderAppendBuilder(NLog.LogEventInfo, System.Text.StringBuilder)
NLog.Layouts.SimpleLayout.RenderAllRenderers(NLog.LogEventInfo, System.Text.StringBuilder)
NLog.Layouts.Layout.RenderAppendBuilder(NLog.LogEventInfo, System.Text.StringBuilder, Boolean)
NLog.Layouts.SimpleLayout.PrecalculateBuilder(NLog.LogEventInfo, System.Text.StringBuilder)
NLog.Targets.Target.PrecalculateVolatileLayoutsConcurrent(NLog.LogEventInfo)
NLog.Targets.Wrappers.AsyncTargetWrapper.Write(NLog.Common.AsyncLogEventInfo)
NLog.Targets.Wrappers.AsyncTargetWrapper.WriteAsyncThreadSafe(NLog.Common.AsyncLogEventInfo)
NLog.Targets.Target.WriteAsyncLogEvent(NLog.Common.AsyncLogEventInfo)
NLog.LoggerImpl.WriteToTargetWithFilterChain(NLog.Targets.Target, NLog.Filters.FilterResult, NLog.LogEventInfo, NLog.Common.AsyncContinuation)
NLog.LoggerImpl.Write(System.Type, NLog.Internal.TargetWithFilterChain, NLog.LogEventInfo, NLog.LogFactory)
NLog.Logger.Trace(System.String)
System.Web.HttpApplication.InitModulesCommon()
System.Web.HttpApplication.InitInternal(System.Web.HttpContext, System.Web.HttpApplicationState, System.Reflection.MethodInfo[])
System.Web.HttpApplicationFactory.GetNormalApplicationInstance(System.Web.HttpContext)
System.Web.HttpRuntime.ProcessRequestNotificationPrivate(System.Web.Hosting.IIS7WorkerRequest, System.Web.HttpContext)
System.Web.Hosting.PipelineRuntime.ProcessRequestNotificationHelper(IntPtr, IntPtr, IntPtr, Int32)
System.Web.Hosting.PipelineRuntime.ProcessRequestNotification(IntPtr, IntPtr, IntPtr, Int32)
DomainNeutralILStubClass.IL_STUB_ReversePInvoke(Int64, Int64, Int64, Int32)

 

Events logs

Events logs has below entries and it doesn’t have stack trace to debug

  •         A process serving application pool name.Psh' suffered a fatal communication error with the Windows Process Activation Service. The process id was '53196'. The data field contains the error number.
  •        Application pool name.Psh ' is being automatically disabled due to a series of failures in the process(es) serving that application pool.
  •         Application popup: w3wp.exe - System Error : A new guard page for the stack cannot be created.

 

To debug, I have configured debugdiag to capture memory dumps on crash and found couple of exceptions on the threads

  45   60 1ef18 00000214ce10fff0  1029220 Preemptive  0000000000000000:0000000000000000 00000214af874010 0     Ukn (Threadpool Worker)

  46   62 7884 00000214cdf937a0  1029220 Cooperative 00000214B811CB00:00000214B811D8D0 00000214af95f900 0     Ukn (Threadpool Worker)

  47   47 8b50 00000214ce0f28c0  8029220 Preemptive  0000000000000000:0000000000000000 00000214af874010 0     Ukn (Threadpool Completion Port)

  48   65 17114 00000214ce10a1f0  8029220 Preemptive  0000000000000000:0000000000000000 00000214af874010 0     Ukn (Threadpool Completion Port)

  49   63 7bd4 00000214cdbafd90  1029220 Preemptive  0000000000000000:0000000000000000 00000214af874010 0     Ukn (Threadpool Worker)

  50   61 1598 00000214ce0ab520  1029220 Preemptive  0000000000000000:0000000000000000 00000214af874010 0     Ukn (Threadpool Worker)

  51   58 16c04 00000214ce0f20f0  1029220 Preemptive  0000000000000000:0000000000000000 00000214af874010 0     Ukn (Threadpool Worker)

  52   57 1314c 00000214cdfaf0e0  1029220 Preemptive  0000000000000000:0000000000000000 00000214af874010 0     Ukn (Threadpool Worker)

  53   55 1ecfc 00000214cdfad060  1029220 Preemptive  0000000000000000:0000000000000000 00000214af874010 0     Ukn (Threadpool Worker)

  54   54 12004 00000214ce10c130  1029220 Preemptive  0000000000000000:0000000000000000 00000214af874010 0     Ukn (Threadpool Worker)

  55   53 128f0 00000214ce0f3860  1029220 Preemptive  00000214B811A580:00000214B811B8D0 00000214af95f900 0     Ukn (Threadpool Worker) System.Web.HttpException 00000214b811a498

  56   52 c7a0 00000214ce0fa0d0  1029220 Preemptive  00000214B80E8168:00000214B80E98D0 00000214af95f900 1     Ukn (Threadpool Worker) System.StackOverflowException 00000214afc91158

  57   48 e110 00000214ce0c0950  1029220 Preemptive  0000000000000000:0000000000000000 00000214af874010 0     Ukn (Threadpool Worker)

  58   34 7a4c 00000214ce0f0980  1029220 Preemptive  0000000000000000:0000000000000000 00000214af874010 0     Ukn (Threadpool Worker)

  59   39 3450 00000214ce0f1150  1029220 Preemptive  0000000000000000:0000000000000000 00000214af874010 0     Ukn (Threadpool Worker)

 

from the dump, Its clear that app pool crashed due to System.StackOverflowException on thread 56

 

Review the stack trace of thread 56 gives hint that automapper causing stack overflow


RootCause

Application has circular reference in its classes which are used on the workflow because of this when user attempt to perform type casting via automapper, stack is getting filled up.

Solution to the problem is to Avoid type conversion via auto mapper instead converting types using manual mapping.

 


Tuesday, October 12, 2021

Fiddler improves Service Response times?

One of the development teams approached me with an interesting issue. The issue description states the response time of a service call is faster by 100% when ever Fiddler is running.  I was requested to suggest if the same performance can be achieved without fiddler.

I took some Network traces  at the client with and with out fiddler and found some interesting observations

Without fiddler, Only 2 TCP connections were used between the client and the server



With Fiddler - there were 4 active parallel TCP connections resulting  faster response time




With the data it is clear more the parallel connections the better the response time. With the issue identified solution is even simpler

By default, we can create only two simultaneous connections to an HTTP server.   This can be increased to avoid a backlog of requests during times of very high transaction volume. It can be done in many ways as given in the link below

https://developer.cybersource.com/library/documentation/dev_guides/Simple_Order_API_Clients/html/Topics/Setting_the_Connection_Limit.htm


Claim Based Authorization

  1.      Claim Based Authorization ·         Token Validation: As requests come into the Ocelot API Gateway, the first step is to validat...