3 Replies Latest reply on Jul 29, 2014 9:38 AM by user193578

    Hadoop vs Nearline

    New Member

      Hi,

       

      Is Hadoop an alternative to the SAP BW Nearline solution? As far as I understand, Hadoop stores data and uses MapReduce for large-scale data processing. Please let me know the view of the Informatica NLS product team.

        • 1. Re: Hadoop vs Nearline
          Guru

          Hi Manesh,

           

          I am not sure what your objective is.

           

          But the difference between Hadoop and SAP BW Nearline is:

           

          • Hadoop - used for fast processing of data via algorithms such as MapReduce. It is not used to store the data.
          • SAP BW NLS - stores the data; it is not concerned with faster processing.

           

          Thanks,

          Aditya Prakash

          • 2. Re: Hadoop vs Nearline
            New Member

            Thanks, Aditya, for your response. Of course, Hadoop uses HDFS and MapReduce as its key technologies. My question was more about how Hadoop relates to SAP BW Nearline, if it relates to it at all.

             

            This question came up during a client discussion, where we were asked whether Hadoop can be an alternative to Nearline. That is the context for my question.

             

            Also, in recent SAP landscape diagrams, NLS is shown integrated with Hadoop.

             

            Regards,

             

            Manesh

            • 3. Re: Hadoop vs Nearline
              user193578 New Member

              Hi Manesh,

               

              Informatica Nearline stores the data in our columnar database, called Data Vault (formerly known as File Archive Service). The data files it creates can be stored in Hadoop HDFS.

               

              This does not mean you can query the data directly from Hadoop using Pig or Hive. The query still originates in SAP BW and goes through the Nearline interface to Data Vault, where it is executed; the required data is pulled on the fly from the data files residing in HDFS.
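
              The query path described above can be sketched as a toy model. This is only an illustration of the layering (SAP BW calls the Nearline interface, Data Vault executes the query, HDFS only holds the files). Every class, method, path, and value in it is hypothetical and is not a real Informatica, SAP, or Hadoop API.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Every class, path, and value below is hypothetical, for illustration only.
# None of this is a real Informatica, SAP, or Hadoop API.

ArchiveFile = Dict[str, List]  # column name -> column values


@dataclass
class HdfsStore:
    """Stands in for HDFS: holds archive files, runs no queries itself."""
    files: Dict[str, ArchiveFile] = field(default_factory=dict)

    def read_column(self, path: str, column: str) -> List:
        return self.files[path][column]


@dataclass
class DataVault:
    """Stands in for the columnar engine, where the query is executed."""
    store: HdfsStore

    def execute(self, path, select, where_column, where_value) -> List:
        # Pull only the columns the query needs from storage at query time.
        filter_col = self.store.read_column(path, where_column)
        select_col = self.store.read_column(path, select)
        return [v for v, f in zip(select_col, filter_col) if f == where_value]


@dataclass
class NearlineInterface:
    """Stands in for the Nearline interface that SAP BW calls."""
    vault: DataVault

    def query(self, path, select, where_column, where_value) -> List:
        # The interface only forwards the request; Data Vault does the work.
        return self.vault.execute(path, select, where_column, where_value)


# Made-up archive file with two columns, stored in the stand-in for HDFS.
store = HdfsStore(files={
    "/archive/sales_2012": {
        "region": ["EU", "US", "EU"],
        "amount": [100, 200, 250],
    }
})
nls = NearlineInterface(DataVault(store))

# The caller (SAP BW in the real landscape) never touches the store directly.
result = nls.query("/archive/sales_2012", "amount", "region", "EU")
print(result)
```

              In this sketch the storage layer has no query method at all, which mirrors the point that HDFS is only where the files live and the query logic stays in Data Vault.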

               

              Hope that is clear.


              Regards,
              Ricardo