8 Replies Latest reply on May 22, 2019 5:28 AM by user165569

    Significance of osgi parent workdir

    Akshay HB Guru

      Hi All,

       

      May I know the significance of the property "infapdo.osgi.parent.workdir" in Hive pushdown? Lately we have had a few job failures, and on checking the YARN application logs we observed the message below:

       

      ERROR [main] com.informatica.platform.dtm.executor.hive.boot.INFAEnvSetup: Error resetting the workspace [/tmp/infa/osgi-working-dir-1/osgi-dir]. Please check with the Administrator.

       

      I can see that the above property is set to /tmp/infa. Do the OSGi working directories get cleaned up during the execution of every pushdown job and then recreated?

       

      In addition, the YARN application log had the entry below:

      Caused by: java.io.FileNotFoundException: /tmp/infa_rpm/<Domain Name>/<DIS Name>/<MD5 Sum of rpm tar>/infa_rpm.tar/services/shared/jars/thirdparty/org.eclipse.osgi-org.eclipse.osgi-3.11.2.jar (No such file or directory)

       

      Per my understanding, a local copy of the rpm gets installed under /tmp/infa_rpm by default. As this is on the local (regular) filesystem, does the directory structure get created on the data nodes during the execution of map and reduce tasks if it is not already present?

       

      Also, the error started appearing after configuring the DIS on a grid. Earlier the DIS was running on just one node and had no issues.

       

      Thanks and Regards

      Akshay

        • 1. Re: Significance of osgi parent workdir
          Abhilash Mula Guru

          Is this version 10.2.1?

          It could be an issue with multiple DIS services doing the untar operation from HDFS to local /tmp. There should be a fix for this...can you raise a support ticket?

          • 2. Re: Significance of osgi parent workdir
            Akshay HB Guru

            Sorry, I forgot to mention the product version. Yes, it's 10.2.1 running on SUSE 11.2.

             

            I will open a case with the vendor. What about the OSGi parent working directory? I couldn't find much information on it. Could you shed some light?

             

            Thanks and Regards

            Akshay

            • 3. Re: Significance of osgi parent workdir
              Akshay HB Guru

              To add, not all jobs are failing. I see successful execution in native mode on the newly added grid node, and also in pushdown mode when jobs are dispatched by the DIS service process running on the newly added node.

               

              A few jobs were failing, so I set the DIS to run only on the node on which it was running earlier. After that, all pushdown jobs are executing fine.

               

              Is this a known issue with DIS on GRID?

               

              Thanks and Regards

              Akshay

              • 4. Re: Significance of osgi parent workdir
                user165569 Guru

                Hi Akshay,

                 

                When using the DIS on a grid for Hadoop pushdown jobs, please ensure that the MD5 checksum is the same across all nodes in the grid. Otherwise, the binaries can be archived frequently on the DIS nodes, which can cause job failures.

                 

                Regarding the OSGi directory, you can set the infa.osgi.enable.workdir.reuse property to false at the Hadoop connection level if you wish to clean up the files after every job run. Please also refer to the KB article below for more details:

                 

                https://kb.informatica.com/solution/23/Pages/62/516806.aspx

                 

                 

                Hope this helps.

                 

                Thanks,

                Ninju

                • 5. Re: Significance of osgi parent workdir
                  Akshay HB Guru

                  This was really helpful, Ninju.

                   

                  Is this a known bug in v10.2.1? I see different MD5 checksums on the nodes that are part of the grid. Let me get that fixed, and I will post an update after making the required changes.

                   

                  Thanks and Regards

                  Akshay

                  • 6. Re: Significance of osgi parent workdir
                    user165569 Guru

                    Hi Akshay,

                     

                    The MD5 checksum can differ across nodes if there are any additional files, or additional entries in the odbc.ini file, under INFA_HOME. The difference in MD5 is not a bug in Informatica; it is usually caused by changes made by the user.

                     

                    Thanks,
                    Ninju

                    • 7. Re: Significance of osgi parent workdir
                      Akshay HB Guru

                      Yes, that's right. When I compared the MD5 checksums of the files on both nodes, the only differences were in files that had been manually modified on one node and not on the other.

                       

                      For example, $INFA_HOME/java/jre/lib/logging.properties and $INFA_HOME/java/jre/lib/security/cacerts.
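
                      A per-file comparison like the one described above can be sketched in Python as follows. This is only an illustration: the function names md5_manifest and diff_manifests are hypothetical, and it hashes individual files under a directory, which is not necessarily how Informatica computes the checksum of the archive it pushes to HDFS.

```python
import hashlib
import os


def md5_manifest(root):
    """Map each regular file's path (relative to root) to its MD5 hex digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.isfile(full):
                continue  # skip broken symlinks and other non-regular entries
            digest = hashlib.md5()
            with open(full, "rb") as fh:
                # Read in 1 MiB chunks so large jars do not fill memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(full, root)] = digest.hexdigest()
    return manifest


def diff_manifests(a, b):
    """Return (only_in_a, only_in_b, changed) as sorted lists of relative paths."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    changed = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return only_a, only_b, changed
```

                      One way to use it would be to run md5_manifest on $INFA_HOME on each grid node, save the result (for example as JSON), and pass the two dictionaries to diff_manifests on either node.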

                       

                      Per my understanding, this shouldn't create any issue.

                       

                      As the tar archive gets pushed from the DIS service processes running on both nodes, I see an md5sum directory created on HDFS (/tmp/<SPN User>/<Domain>/<DIS>/<Hadoop Distribution directory>/infa_rpm/<md5sum>) for each of the two nodes.

                       

                      Per my understanding, a local copy of the same (the tar extract) is placed under /tmp on the Hadoop data nodes (/tmp/infa_rpm/<Domain>/<md5sum>/infa_rpm.tar/), but there I don't see md5sum directories for both grid nodes. Is this expected, or could it be the root cause of the issue?

                       

                      Caused by: java.io.FileNotFoundException: /tmp/infa_rpm/<Domain Name>/<DIS Name>/<MD5 Sum of rpm tar>/infa_rpm.tar/services/shared/jars/thirdparty/org.eclipse.osgi-org.eclipse.osgi-3.11.2.jar (No such file or directory)
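
                      To see which local extracts on a data node actually contain the jar named in the trace, something like the sketch below could be run on each data node. The function name check_extracts is hypothetical, and the code only assumes what is visible in the paths above: that extracts live in directories named infa_rpm.tar somewhere under /tmp/infa_rpm.

```python
import os

# Relative path of the jar, taken from the FileNotFoundException in the YARN log.
OSGI_JAR = os.path.join(
    "services", "shared", "jars", "thirdparty",
    "org.eclipse.osgi-org.eclipse.osgi-3.11.2.jar",
)


def check_extracts(root="/tmp/infa_rpm", jar_relpath=OSGI_JAR):
    """Return {extract_dir: jar_present} for every infa_rpm.tar directory under root."""
    results = {}
    for dirpath, dirnames, _filenames in os.walk(root):
        if "infa_rpm.tar" in dirnames:
            extract = os.path.join(dirpath, "infa_rpm.tar")
            results[extract] = os.path.isfile(os.path.join(extract, jar_relpath))
            dirnames.remove("infa_rpm.tar")  # do not descend into the extract itself
    return results


if __name__ == "__main__":
    for extract, present in sorted(check_extracts().items()):
        print(("OK      " if present else "MISSING ") + extract)
```

                      An extract reported as MISSING, or an md5sum directory that is absent altogether, would be worth mentioning in the support case.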

                       

                      Thanks and Regards

                      Akshay

                      • 8. Re: Significance of osgi parent workdir
                        user165569 Guru

                        Hi Akshay,

                         

                         

                         

                        The directory structure in the local directory is the expected one; there is no DIS-specific folder in the local rpm directory on the data nodes. The reported issue will not occur if the MD5 checksums of the nodes in the grid match.

                         

                         

                         

                        If you are hitting the "No such file or directory" error even with nodes in sync, please raise a support case so that we can assist you further.

                         

                         

                         

                        Thanks,

                         

                        Ninju