5 Replies Latest reply on Oct 8, 2009 2:55 PM by Cj Applequist

    How to Monitor Long Running Sessions.

    New Member

       

      How do I design a workflow to monitor itself for long running session tasks?

       

       

      I have tried using timers to send out emails after an hour, but then the workflow runs for as long as the timer (1 hr) even if the session runs successfully in 10 minutes, and the email is always sent anyway.

       

       

      I have tried using a worklet with two execution paths. One path contains a Timer Task and an Email Task to test for, and notify on, a long-running session. The other path has an Event Wait (file watch) Task and a Control Task that look for a "successful run" flag file (created by the session when it completes successfully) and then stop the worklet if the file is seen, which also stops the timer execution path. However, this method doesn't work either, because when the worklet is stopped the parent workflow is also stopped and it doesn't reschedule.
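The race described above (flag file versus timer) can also be expressed outside the workflow. The following is only a minimal sketch in Python, not PowerCenter functionality: the function name `wait_for_flag`, the flag path, and the polling interval are all hypothetical. It waits for the "successful run" flag file and gives up once the timeout passes, so the caller sends an alert only when the flag never appeared.

```python
import os
import time


def wait_for_flag(flag_path, timeout_s, poll_s=1.0):
    """Return True if flag_path appears within timeout_s seconds, else False.

    flag_path is assumed to be the "successful run" file the session writes.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        # Check for the flag first so a fast session returns immediately.
        if os.path.exists(flag_path):
            return True
        # Timer expired without the flag: treat the session as long running.
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

A caller would send the notification only when the function returns False. Because the check runs as its own process, it returns as soon as the flag shows up and never needs to stop the worklet or the parent workflow.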

       

       

      Are there any ideas on how to have a workflow monitor itself, or the sessions it contains, to make sure they are not taking longer than expected? I would like this to be self-contained within a specific workflow, as we already have external programs that monitor for long-running workflows/sessions in general (see below). Any help/ideas would be GREATLY appreciated.

       

       

      Why I want to monitor for long running sessions:

       

       

      We have an issue where a workflow that runs daily contains a session that hangs randomly. When the session hangs, it simply creates a 308-byte log (that cannot be viewed with the log viewer) and then sits idle until we manually intervene with the session and parent workflow. To correct the problem we have to manually kill the session at the OS level, delete the session log file, and then kick off the workflow containing the session again.
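The symptom above suggests a simple file-based heuristic. This is a sketch under stated assumptions, not a confirmed diagnostic: it treats a session log that is exactly 308 bytes and has not been modified for some time as a sign of the hang. The 308-byte size comes from this post only, and the function name `looks_hung` and the idle threshold are hypothetical.

```python
import os
import time

# Size of the stub log seen when the session hangs, as reported in this post.
# Assumption: other environments or versions may produce a different size.
HUNG_LOG_SIZE = 308


def looks_hung(log_path, min_idle_s, now=None):
    """Return True if the log is the stub size and idle for at least min_idle_s."""
    st = os.stat(log_path)
    # Allow the caller to pass a fixed "now" (epoch seconds) for repeatable checks.
    now = time.time() if now is None else now
    idle_s = now - st.st_mtime
    return st.st_size == HUNG_LOG_SIZE and idle_s >= min_idle_s
```

A check like this only flags a candidate; killing the session, deleting the log, and restarting the workflow would still follow the manual steps described above.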

       

       

      How we currently deal with the problem:

       

       

      I have an external Java program that is kicked off hourly by cron and uses the PowerCenter command-line tools to query the domain for long-running processes in a very general way. If there are any long-running processes, a notification email is sent out to be read, reviewed, and acted on manually. But again, this covers our entire environment and is not specific to the workflow in question.
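To make a general monitor like this specific to one workflow, the elapsed-time check can use a per-workflow limit. The sketch below is illustrative only and is not the Java program mentioned above: it assumes the run list (workflow name and start time) has already been obtained from the PowerCenter command-line tools by some other step, and the names `overdue_runs`, `runs`, and `limits` are hypothetical.

```python
from datetime import datetime, timedelta


def overdue_runs(runs, limits, now):
    """Return (workflow_name, elapsed) for runs that exceed their own limit.

    runs:   iterable of (workflow_name, start_time) for currently running workflows
    limits: dict mapping workflow_name -> timedelta; unlisted workflows are ignored
    now:    datetime to compare against
    """
    overdue = []
    for name, started in runs:
        limit = limits.get(name)
        elapsed = now - started
        # Only workflows with a configured limit are checked.
        if limit is not None and elapsed > limit:
            overdue.append((name, elapsed))
    return overdue
```

For example, passing `{"wf_daily": timedelta(hours=1)}` as `limits` (with `wf_daily` a made-up workflow name) would report only that workflow once it has run for more than an hour, leaving the general hourly check unchanged for everything else.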

       

       

       

       

       

       

       

       

        • 1. Re: How to Monitor Long Running Sessions.
          New Member

           

          Hi Millet,

           

           

          Is the workflow that's running for a long time a real-time workflow? Are you using this workflow to do CDC capture with the help of a condenser/real time?

           

           

          If not, why is the workflow running for a long time?

           

           

          Is the volume of data read into the workflow huge, by any chance? You say it's hanging.

           

           

          Is there a network rule in place to cut off any connection that exceeds a specific time limit?

           

           

          Thanks,

           

           

          Sri 

           

           

           

           

           

          • 2. Re: How to Monitor Long Running Sessions.
            New Member

             

            Sri,

             

             

            Thank you for the reply.

             

             

            As for the workflow, it is not a CDC/real-time session. The workflow runs so long because of a hung session task it contains. Once the session task is manually killed at the OS level, the workflow continues and finishes as normal. There is a problem with the session task whose cause we have yet to determine, because it happens randomly. We have an SR open for this and are better prepared to capture more information the next time it occurs. So far we have not been able to find a correlation with anything in the environment that could be causing the hanging session; other sessions in the same workflow work fine.

             

             

            I agree we need to determine and resolve the reason for the stalling session task (as I stated, we are working towards that), but that doesn't remove our interest in designing workflows to monitor themselves where possible, for instance to simply send out an email if that particular workflow is taking longer than it should. Again, we have a general monitoring tool in the form of a custom Java application called hourly by cron; after all, you should never rely solely on something monitoring itself. Any ideas or tips you, or anyone, could offer would be greatly appreciated.

             

             

             

             

             

            Thanks!

             

             

            • 3. Re: How to Monitor Long Running Sessions.
              sdepriest Seasoned Veteran

               

              bmillet,

               

               

              I don't have a solution for you about the self-monitoring workflow.  Our workflows run using a 3rd-party scheduler and we have Operations staff to monitor our critical jobs.  They know how long the jobs are supposed to take, so if they run significantly longer (an hour or two), we get a phone call in the middle of the night.

               

               

              I am posting a reply to let you know about the hanging-session problem we had in June 2009.  We would have 2-8 sessions randomly hang.  It had happened once in May and twice in June.  We also opened an SR about it, and tech support told us to run a Windows program called ADPlus.  I'll attach a text file with its download location and usage.  I sent the output of ADPlus to tech support, and they told us the log suggested that Oracle (target system and location of my Informatica repository) was not closing the TCP socket connection, and to contact Oracle support to resolve it.  We did that, and they suggested upgrading Oracle to 10.2.0.4.  We still have not upgraded Oracle (as of August), but the sessions have not hung since June 24th-25th (knock on wood).  Our workflows have also started executing much faster (about an hour faster) since June 25th, when a patch was applied to our SAN controller.  The Oracle server's storage is on the SAN.

               

               

              So my advice is to try out the ADPlus utility when the session hangs; it might be able to tell you and/or tech support something.

               

               

              • 4. Re: How to Monitor Long Running Sessions.
                New Member

                 

                You can add a Control Task after your session and configure it to "stop parent".

                 

                 

                This way the timer will be stopped and your workflow won't keep running for the full hour.

                 

                 

                • 5. Re: How to Monitor Long Running Sessions.
                  Cj Applequist New Member

                   

                  A very different approach: A PowerCenter developer named William Flood has created a dynamic scheduling tool based in PowerCenter using just a few database tables.  It can restart failed sessions a prescribed number of times, alert you to long running sessions, and schedule dependencies.  It's very cool and completely free (although it isn't what I'd call polished yet...)

                   

                   

                   

                   

                   

                  This may be what you're looking for and more, but it takes a bit to work through how it does its thing.  Will is a nice guy and his email is posted at the link below, which also provides the .xml of the PowerCenter mappings/sessions/workflows as well as the SQL for the tables needed to support it.  It's a different way of doing things, but it may be just what you're looking for.

                   

                   

                   

                   

                   

                  http://sites.google.com/site/loadmanagersite/Home