IA4J

Version 1

    Introduction

    • All sorts of wild things happen when your code leaves the safe and warm development environment. Unlike in the comfort of the debugger in your favorite IDE, when errors happen on a live server you had better come prepared. There are no more breakpoints, no step over or step into, and you can forget about adding that quick line of code to help you understand what just happened. In production, bad things happen first, and then you have to figure out what exactly went wrong. On a production server with tight time constraints, taking downtime to debug an issue is rarely acceptable. IA4J is built to collect the diagnostics you need with zero downtime.

    Use case / Scenario

    • The tool can be used when:
      1. Diagnostic information from a Java application is meager
      2. An error/exception/warning message is not informative enough and a thread stack is needed
      3. A thread stack, heap dump, or other diagnostics must be generated on a specific API invocation, or within a method on a certain condition, such as:
           At a specific line of code
           Upon receiving an exception
           When a condition is met
      4. The entry and exit values of an API need to be traced
      5. Diagnostics are needed when a method takes an unusually long time to execute
      6. An instrumented/fix class file must be loaded without bringing down the application

    Features

    1. Enables/disables instrumentation on the fly
    2. Traces the arguments passed to Java APIs
    3. Generates thread/heap dumps at different points of execution
    4. Enables injection of custom code at runtime without rebuilding the application
    5. Captures dumps and invokes external collectors/scripts on various events
    6. Tracks slow-running APIs and captures diagnostics for them
    7. Allows instrumentation to be added or changed at runtime without application shutdown
    8. Allows an instrumented/fix class file to be applied at runtime without application shutdown (EBF/patch testing in PROD)

    Architecture

    Configuration

    Files

    • ia4j.jar: Agent jar that contains the implementation for performing instrumentation

    • default-ia4j.props: Contains include and exclude statements and other configuration parameters

    • ia4j_script.sh: Launcher script for attaching the tool in Active mode on Linux/AIX

    • ia4j_script.bat: Launcher script for attaching the tool in Active mode on Windows

    Props file contents

    • Enable/disable tracing on the fly
      enableInstrumentation=true/false

    • Generate heap/thread dumps on demand
      ia4j.heapOnDemand=true/false
      ia4j.threadsOnDemand=true/false

    • Limit the number of thread/heap dump files generated
      ia4j.maxDumps=0
      ia4j.maxHeapDumps=5

    • Trace on startup
      ia4j.traceOnStartup=true/false

    • Location of ia4j trace directory
      ia4j.traceFilePrefix=<PATH/TO/IA4JLOGDIR>

    • Include & Exclude statements
      ia4j.includeXX=LOCATION~~EVENT=>ACTION
      Injects bytecode as specified by LOCATION~~EVENT=>ACTION
      Each injection is specified as a separate property whose name begins with the prefix “ia4j.include”

      ia4j.excludeXX=LOCATION~~EVENT=>ACTION
      Excludes matching locations from injection
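
As a rough illustration of how a props file of this shape can be read, the sketch below loads a few of the properties above with java.util.Properties and collects the include/exclude statements by key prefix. It is not IA4J's actual loader: the class name PropsSketch, the helper entriesWithPrefix, and the com.example sample values are assumptions made for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper, not part of IA4J: reads props text and groups
// the include/exclude statements by their key prefix.
class PropsSketch {

    // Parses props text held in a string (a real file would use a FileReader).
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Returns all properties whose key starts with the given prefix, sorted by key.
    static Map<String, String> entriesWithPrefix(Properties props, String prefix) {
        Map<String, String> result = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith(prefix)) {
                result.put(key, props.getProperty(key));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = String.join("\n",
            "enableInstrumentation=true",
            "ia4j.maxHeapDumps=5",
            "ia4j.include1=com.example.*::foo.*~~ENTRY=>VALUES",
            "ia4j.include2=com.example.*::bar.*~~EXIT=>STACK",
            "ia4j.exclude1=com.example.internal.*::.*~~ENTRY=>VALUES");
        Properties props = load(text);

        System.out.println("max heap dumps: " + props.getProperty("ia4j.maxHeapDumps"));
        System.out.println("includes: " + entriesWithPrefix(props, "ia4j.include"));
        System.out.println("excludes: " + entriesWithPrefix(props, "ia4j.exclude"));
    }
}
```

One detail worth knowing if the file is read this way: java.util.Properties keeps only the last value for a repeated key, so every include/exclude statement needs its own unique suffix.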

    LOCATION~~EVENT=>ACTION

    • Inject an “ACTION” to be performed on an “EVENT” at a “LOCATION”
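
To make the statement shape concrete, here is a minimal sketch that splits one include value into its class pattern, method pattern, and EVENT=>ACTION pairs. The class IncludeStatementSketch is hypothetical and is not IA4J's parser. It assumes a single CLASS::METHOD location per statement and that "~~" never occurs inside an action body.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser, not IA4J's own: splits a value of the form
// CLASS::METHOD~~EVENT=>ACTION[~~EVENT=>ACTION...] into its parts.
class IncludeStatementSketch {

    final String classPattern;
    final String methodPattern;
    // Each entry is a two-element array: {event, action}.
    final List<String[]> eventActions = new ArrayList<>();

    IncludeStatementSketch(String classPattern, String methodPattern) {
        this.classPattern = classPattern;
        this.methodPattern = methodPattern;
    }

    static IncludeStatementSketch parse(String value) {
        int firstRule = value.indexOf("~~");
        if (firstRule < 0) {
            throw new IllegalArgumentException("missing ~~EVENT=>ACTION: " + value);
        }
        String location = value.substring(0, firstRule);
        int sep = location.indexOf("::");
        if (sep < 0) {
            throw new IllegalArgumentException("missing CLASS::METHOD: " + value);
        }
        IncludeStatementSketch stmt = new IncludeStatementSketch(
            location.substring(0, sep), location.substring(sep + 2));

        // Everything after the location is a ~~-separated list of EVENT=>ACTION rules.
        for (String rule : value.substring(firstRule + 2).split("~~")) {
            int arrow = rule.indexOf("=>");
            if (arrow < 0) {
                throw new IllegalArgumentException("missing => in rule: " + rule);
            }
            stmt.eventActions.add(new String[] {
                rule.substring(0, arrow).trim(), rule.substring(arrow + 2).trim() });
        }
        return stmt;
    }

    public static void main(String[] args) {
        IncludeStatementSketch stmt =
            parse("com.informatica.test.*::bar.*~~EXIT=>VALUES~~EXIT=>STACK");
        System.out.println("class:  " + stmt.classPattern);
        System.out.println("method: " + stmt.methodPattern);
        for (String[] ea : stmt.eventActions) {
            System.out.println("on " + ea[0] + " do " + ea[1]);
        }
    }
}
```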

    Location

    • Location is a method within a class
          CLASS::METHOD

    • A Perl-style regex can be specified for CLASS or METHOD to do ‘bulk’ injection
          “DomainService.*” matches DomainServiceImpl
          “DomainService*” is invalid as a wildcard (in a regex, * only repeats the preceding character)

    • No injection is available for
            Interfaces
            Methods that are native or abstract
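
The difference between the two DomainService patterns can be checked with java.util.regex, whose syntax is largely Perl-compatible. This is a small sketch, assuming the pattern must match the whole class or method name; whether IA4J matches the full name or a substring is not stated here, and the class LocationRegexSketch is only illustrative.

```java
import java.util.regex.Pattern;

// Illustrative only: shows why ".*" is needed for bulk matching.
class LocationRegexSketch {

    // True when the entire name matches the regex (full-match assumption).
    static boolean matches(String regex, String name) {
        return Pattern.matches(regex, name);
    }

    public static void main(String[] args) {
        // ".*" means "any character, any number of times".
        System.out.println(matches("DomainService.*", "DomainServiceImpl")); // true

        // "e*" only means "zero or more 'e' characters", so "Impl" is not covered.
        System.out.println(matches("DomainService*", "DomainServiceImpl"));  // false
    }
}
```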

    Event      

    ENTRY – entry of a method
    EXIT – exit of a method
    EXCEPTION – when a method throws an exception
    ATLINE – at a particular line number of a method
    TIMED – when a method takes more than a given time
    CLSFILE – when a loaded class is to be replaced with a new class file (EBF/patch testing with many changes in the file)

    Action

    VALUES – log values (arguments for ENTRY, return value for EXIT, message and stack for EXCEPTION; does not apply to the remaining events)
    NOVALUES – log just the entry/exit/exception, without values
    STACK – log the stack trace of the thread at that point
    THREADS – write a thread dump to a .tdump file
    HEAP – generate a thread dump and a .hprof/IBM system dump file
    CMD – execute a system command (output is written to a .out file)
    CUSTOM – execute custom code (does not apply to TIMED) in a safe manner (within a try/catch block)
    UNSAFECUSTOM – execute custom code (does not apply to TIMED) in an unsafe manner (without a try/catch block)
    JSON – dump the object contents in JSON format (applies only to ENTRY)
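
The practical difference between CUSTOM and UNSAFECUSTOM is whether a failure in the injected code can reach the instrumented method. The sketch below mimics that difference in plain Java. The names CustomActionSketch, Snippet, runSafely, and runUnsafely are invented for illustration, and catching Exception (rather than some other type) is an assumption about what the generated try/catch looks like.

```java
import java.io.IOException;

// Illustrative only: models guarded vs. unguarded execution of injected code.
class CustomActionSketch {

    // Stand-in for a piece of injected custom code that may fail.
    interface Snippet {
        void run() throws Exception;
    }

    // CUSTOM-like: the failure is caught and reported, so the caller continues.
    static boolean runSafely(Snippet snippet) {
        try {
            snippet.run();
            return true;
        } catch (Exception e) {
            System.err.println("custom code failed: " + e);
            return false;
        }
    }

    // UNSAFECUSTOM-like: no guard, so the exception propagates to the caller.
    static void runUnsafely(Snippet snippet) throws Exception {
        snippet.run();
    }

    public static void main(String[] args) {
        Snippet failing = () -> { throw new IOException("this is a dummy io exception"); };

        System.out.println("safe run succeeded: " + runSafely(failing));

        try {
            runUnsafely(failing);
        } catch (Exception e) {
            System.out.println("unsafe run propagated: " + e.getMessage());
        }
    }
}
```

This is why the unguarded form is the one to use when the goal is to fake an exception inside the target method, while the guarded form is the better default for conditional logging.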

    Usage

    • How to launch/attach the tool to the target Java process
      Passive Mode
        Add the following JVM command-line arguments before starting the Java process
        -Xbootclasspath/a:<PATH_TO_ia4j.jar>
        -javaagent:<PATH_TO_ia4j.jar>=configFile=<PATH_TO_ia4j.props>

      Active Mode (attach to a running Java process)
         Launch ia4j_script.sh/bat
         # ./ia4j_script.sh <PATH_TO_JDK_HOME>

              Please enter the ia4j.jar path(For an example :/home/user/IA4J/ia4j.jar)=/tmp/ia4j.jar
              Please enter configuration file path(For an example :/home/user/IA4J/ia4j.props)=/tmp/node.props
              Enter pid =
              XXXXX
              Ia4j Attached to PID XXXXX
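
Both modes rely on the standard java.lang.instrument entry points: the JVM calls premain when an agent is given with -javaagent, and agentmain when an agent is loaded into a running process, passing the text after "=" as a single string. The sketch below shows how such a string could be turned into options. AgentEntrySketch and parseAgentArgs are hypothetical names, and the comma-separated key=value format is an assumption; only configFile=<path> appears in this document.

```java
import java.lang.instrument.Instrumentation;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative agent skeleton, not the IA4J implementation.
class AgentEntrySketch {

    // Splits "key=value,key=value" into a map; returns an empty map for no args.
    static Map<String, String> parseAgentArgs(String agentArgs) {
        Map<String, String> options = new LinkedHashMap<>();
        if (agentArgs == null || agentArgs.isEmpty()) {
            return options;
        }
        for (String pair : agentArgs.split(",")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                options.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return options;
    }

    // Passive mode: invoked by the JVM before main() when -javaagent is used.
    public static void premain(String agentArgs, Instrumentation inst) {
        System.out.println("config file: " + parseAgentArgs(agentArgs).get("configFile"));
    }

    // Active mode: invoked by the JVM when the agent is attached to a live process.
    public static void agentmain(String agentArgs, Instrumentation inst) {
        premain(agentArgs, inst);
    }

    public static void main(String[] args) {
        // Simulates the argument string from -javaagent:<jar>=configFile=<path>.
        System.out.println(parseAgentArgs("configFile=/tmp/ia4j.props"));
    }
}
```

For the JVM to find these methods, an agent jar names the class in the Premain-Class and Agent-Class attributes of its manifest.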

     

    Sample include and exclude statements for quick reference

    # Traces arguments on entry of matching methods
    ia4j.include1=com.informatica.test.*::foo.*~~ENTRY=>VALUES

    # Traces entry of matching methods but without values (in case a value contains sensitive information such as passwords)
    ia4j.include1b=com.informatica.test.*::foo.*~~ENTRY=>NOVALUES

    # Traces the return value and stack on exit of matching methods
    ia4j.include2=com.informatica.test.*::bar.*~~EXIT=>VALUES~~EXIT=>STACK

    # logs the exception and stack when thrown by matching methods
    ia4j.include3=com.informatica.test.*::bar.*~~EXCEPTION=>VALUES

    # Collects thread dump if matching method takes more than 10 seconds
    ia4j.include4=com.informatica.test.*::bar.*~~TIMED(10)=>THREADS

    # Collects heap dump (and thread dump implicitly) if matching method takes more than 60 seconds, and every 30 seconds after that until the method exits
    ia4j.include5=com.informatica.test.*::bar.*~~TIMED(60,30)=>HEAP

    # Triggers pstack on this process when method takes more than 60 seconds to complete
    ia4j.include6=com.informatica.test.*::bar.*~~TIMED(60)=>CMD{pstack $IA4J_PID >> pstack.output &}

    # Executes the specified script/command if the specified line number is reached in the matching method (both line number and method need to be specified)
    ia4j.include7=com.informatica.test.*::bar.*~~ATLINE(200)=>CMD{/tmp/collectdata.sh}

    # Logs a value at line xxx based on a custom condition
    ia4j.include8=com.informatica.test.*::bar.*~~ATLINE(200)=>CUSTOM{ if(errors.size() > 0) { TRACE(("" + errors));} }

    # Collect heap dump at line xxx based on a custom condition
    ia4j.include9=com.informatica.test.*::bar.*~~ATLINE(200)=>CUSTOM{ if(errors.size() > 0) { HEAP } }

    # Collect thread dump at a specific line
    ia4j.include9b=com.informatica.test.*::bar.*~~ATLINE(200)=>CUSTOM{ THREADS; }

    # Inject sleep at the specified line number on a custom condition
    ia4j.include9c=com.informatica.test.*::bar.*~~ATLINE(200)=>CUSTOM{ if(errors.size() > 0) { Thread.sleep(10000L);} }

    # Fake an exception at the specified line number on a custom condition (to test behavior)
    ia4j.include9d=com.informatica.test.*::bar.*~~ATLINE(200)=>UNSAFECUSTOM{ if(errors.size() > 0) { throw new java.io.IOException("this is a dummy io exception"); } }

    # Applies a class file at runtime
    ia4j.include10=com.informatica.test.ClassName::.*~~CLSFILE=>FILE{/path/to/ClassName.class}
    ia4j.include101=com.informatica.test.ClassName::.*~~CLSFILE=>FILE{C:\\path\\to\\ClassName$Innerclass.class}

    # Example for inner class
    ia4j.include11=com.informatica.test.ClassName\$InnerClassName::foo~~EVENT=>ACTION

    # Instruments multiple methods of the same class in a single include
    ia4j.include2mthds=com.informatica.test.classABC::foo~~EVENT1=>ACTION1~~EVENT2=>ACTION2::boo~~EVENT3=>ACTION3
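
The TIMED samples above take a threshold and, optionally, a repeat interval, as in TIMED(60,30). The sketch below works out when such an event would fire for a method of a given duration, following the wording of the sample comments (first at the threshold, then at every interval until the method exits). It is an interpretation of those comments, not IA4J code, and TimedEventSketch and triggerTimes are made-up names.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: models the firing schedule described for TIMED(threshold, repeat).
class TimedEventSketch {

    // Returns the elapsed seconds at which the action would fire.
    // repeatSec <= 0 means "fire once", as in TIMED(10).
    static List<Long> triggerTimes(long thresholdSec, long repeatSec, long methodDurationSec) {
        List<Long> times = new ArrayList<>();
        if (methodDurationSec <= thresholdSec) {
            return times; // method finished in time, nothing fires
        }
        times.add(thresholdSec);
        if (repeatSec > 0) {
            for (long t = thresholdSec + repeatSec; t < methodDurationSec; t += repeatSec) {
                times.add(t);
            }
        }
        return times;
    }

    public static void main(String[] args) {
        // TIMED(60,30) on a method that runs for 130 seconds.
        System.out.println(triggerTimes(60, 30, 130)); // [60, 90, 120]

        // TIMED(10) on a method that runs for 8 seconds: no dump.
        System.out.println(triggerTimes(10, 0, 8));    // []
    }
}
```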

    Use case: Domain DB refresh issue

    • Issue: The domain shut down abruptly due to a Domain DB refresh update failure

    FATAL [Domain Monitor] [DOM_10094] Cannot update the data for the master gateway node [NODE] within the refresh interval time [32000]. The node will not continue as a master gateway node. Verify that the connection to the domain configuration repository database is valid.

    • Challenges:
      The issue is intermittent
      Identifying the pattern of events
      Isolating the issue to the DB or the network

    • Code to be instrumented

          

     

    • IA4J include statements

    ia4j.include1=com.informatica.*MasterElector::updateMyRowInDB~~TIMED(8,5)=>THREADS~~ENTRY=>NOVALUES~~EXIT=>VALUES
    ia4j.include2=com.informatica.isp.domainservice.MasterElectData::getLastrefreshTime~~ENTRY=>NOVALUES~~EXIT=>VALUES

    Sample Run Output

    Files (Runnable & Source Code)

     

    System Requirements

    • JRE 1.8
    • Java process

    More Info

    • Technologies used:
      javassist
    • Steps to build
      /* TODO */

     

    Current Limitations / Bugs

    • JIRA query: type = Bug AND component = "ia4j" AND project = "GCS Tools"