
The attached Informatica Big Data Edition (BDE) Tools package is distributed as a compressed (zip) archive. It includes five tools that help validate connectivity to Hadoop cluster services from an Informatica BDE installation for the supported Hadoop distributions.

 

In summary, the five tools are:

  • HiveJDBCTest: Validates the JDBC connection to HiveServer2 and runs basic DML queries.
  • HiveCLITest: Validates that the client can submit a Hive job to the Hadoop cluster through the Hive CLI driver and runs basic Hive queries.
  • DisplayClientServerVersion: Displays the versions of the Informatica BDE Hadoop client libraries and of the Hadoop cluster/server libraries.
  • HDFSConnectionTest: Validates the ability to connect to, read from, and write to the Hadoop Distributed File System (HDFS).
  • HBaseConnectionTest: Validates the ability to connect to, read from, and write to HBase.
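
For orientation, the following is a minimal, hypothetical sketch of the kind of check HiveJDBCTest performs: opening a JDBC connection to HiveServer2 and running a simple query. It is not the tool's actual source; the host name, port, database, and credentials are placeholders, and it assumes the Apache Hive JDBC driver (hive-jdbc) and its dependencies are on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveServer2ConnectivityCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details; replace with your HiveServer2 host and port.
        String url = "jdbc:hive2://hiveserver2.example.com:10000/default";

        // Explicitly register the Hive JDBC driver (older driver versions require this).
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        try (Connection conn = DriverManager.getConnection(url, "hive", "");
             Statement stmt = conn.createStatement();
             // A simple query confirms that HiveServer2 accepts and executes statements.
             ResultSet rs = stmt.executeQuery("SHOW DATABASES")) {
            while (rs.next()) {
                System.out.println("Database: " + rs.getString(1));
            }
            System.out.println("HiveServer2 JDBC connectivity check passed.");
        }
    }
}
```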

 

Refer to the Big Data Edition Tools User Guide (included in the zip archive) for more information. This guide is written for the Informatica administrator responsible for installing and configuring Informatica and related tools, and it assumes familiarity with the Hadoop ecosystem, including MapReduce, YARN, HDFS, Hive, and HBase.

This release includes:

 

BDE Update 3 features

  • Hadoop PAM: IBM BigInsights 4.1, Cloudera CDH 5.5
  • Enhancement in the Big Data Config utility
  • HiveServer2 Integration with Big Data Edition

BDE Update 3 PAM additions

  • Cloudera CDH 5.5
  • IBM BigInsights 4.1

 

Release Notes:

https://kb.informatica.com/proddocs/Product%20Documentation/4/IN_BDE961HF3Update3_ReleaseNotes_en.pdf

 

Note: This EBF includes the Update 3 features and PAM support listed above, and it can be applied directly on top of 9.6.1 HotFix 3 or on top of 9.6.1 HotFix 3 Update 2.

BDE Update 2 features

  • Hive 0.14 feature support: support for the Update Strategy transformation in the Hadoop mode of execution.
  • Active Directory KDC support: support for an Active Directory based Kerberos domain controller for Hortonworks and Cloudera (a minimal client-side sketch follows this list).
  • Merge of the 9.6.1 HF2 Update 1 features.
  • EBF merges mentioned in the release notes section.
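
As a rough illustration of what Kerberos-enabled cluster access looks like from a client, the sketch below uses Hadoop's UserGroupInformation API to authenticate with a keytab before reading from HDFS. This is not part of the BDE update itself; the principal, keytab path, and HDFS path are hypothetical placeholders, and whether the KDC is MIT Kerberos or an Active Directory domain controller is transparent to this code.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberizedHdfsCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Tell the Hadoop client that the cluster uses Kerberos authentication.
        conf.set("hadoop.security.authentication", "kerberos");

        // Hypothetical principal and keytab; in an AD setup the principal lives in
        // the Active Directory domain that acts as the Kerberos KDC.
        UserGroupInformation.setConfiguration(conf);
        UserGroupInformation.loginUserFromKeytab(
                "infa_user@EXAMPLE.COM", "/etc/security/keytabs/infa_user.keytab");

        // A simple HDFS listing verifies that the Kerberos ticket is accepted.
        try (FileSystem fs = FileSystem.get(conf)) {
            for (FileStatus status : fs.listStatus(new Path("/tmp"))) {
                System.out.println(status.getPath());
            }
        }
    }
}
```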

 

EBF16193 adds the following feature in addition to the above:

  • Hortonworks 2.3 support for Informatica BDE

Informatica Big Data Edition - 9.6.1 HotFix 3 Update 2 - Release Notes - (English)

PAM for Informatica 9.6.1 Hotfix 3 (Update 2) - Big Data Edition (Hadoop)

The release includes the following:

 

  • Hive 0.14 feature support: support for the Update Strategy transformation in the Hadoop mode of execution.
  • Active Directory KDC support: support for an Active Directory based Kerberos domain controller for Hortonworks and Cloudera.
  • Release on top of 9.6.1 HotFix 3 (all the HF3 fixes, such as the MM performance enhancements, are now available for BDE).
  • Merge of the 9.6.1 HF2 Update 1 features.
  • EBF merges mentioned in the release notes.

 


This release includes:

  • Hadoop distributions on premises
    • Support for new versions: CDH 5.4, MapR 4.0.2
  • Hadoop distributions in the cloud
    • Support for Cloudera and Hortonworks on Microsoft Azure and Amazon EC2
  • Performance
    • HBase writer performance enhancement
    • Big Data Edition on Tez
  • Ease of Big Data Edition configuration (Phase 1)

Informatica Big Data Edition - 9.6.1 HotFix 2 Update 1 - Release Notes - (English)

So you're an IT manager (or an architect, or a business analyst, ...), and a vendor tells you that the amount of data in your org is exploding, and unless you buy from them RIGHT NOW, then you'll be left with an unmanageable Big Mess.   By now you have probably heard it often enough that you're not sure whether to believe it.  You've gotten this far without the latest newfangled tool from company X, you're not yet up to your eyeballs in messes, so why the urgency?  Why not just keep doing what you're doing?   

 

Good question.  Actually, any vendor that tries to sell this way is missing the point.  I've heard the same pitches - enough times to wonder if they are all just copying the same script.  "You're drowning in data, blah blah."  Where's the imagination?    Instead the point should be about the opportunities to be seized, not the problems to solve.  It should be about the business value of that data - and helping you understand how you can benefit from it.  For example...

 

- One of our customers, a retail company, is staying closer to their repeat customers by inviting them to engage via social media, to comment on new consumer products and product trends, and give feedback, both positive and negative.  The opportunity is to get more real-time insights into the pulse of their customer base, be more responsive and tailor their product offerings and their whole shopping experience accordingly.  The results - better competitiveness, and more customer loyalty.

 

- Another customer, a financial institution serving high net-worth clients, had been using our products for nightly batch updates of their data warehouse and other analysis tools.  Old school ETL.  But they realized, over the years, that markets change quickly and market data volumes are only increasing.  Their front-desk operations can't wait until tomorrow to see what is happening today - they need to know same day.  Plus it no longer was practical to process ever-increasing batch volumes within the same nightly batch runs.  So they moved to a real-time mode.  Turns out our technology makes this transition simple.  No "rip and replace" necessary.  The results - much better awareness and responsiveness to market trends - at little additional cost.

 

- And many others planning to move more of their traditional data processing to a NoSQL scalable environment (Hadoop), not only so they can keep up with volumes, but so they can run analytics against that data more quickly, resulting in more actionable intelligence faster.  It's also about maintaining that data at lower cost, as Hadoop can easily scale on commodity hardware.

 

In short, it's not just about avoiding the Big Mess of ever increasing volume, variety, velocity of data.  It's about extracting Value from that data.

 

Everybody needs to take Big Data seriously.  It's easy to think of it as just a mess to be managed.  It takes a little more effort to think of how much you can benefit from it, but the benefits will be huge for those who do.  It just takes a little bit of imagination.

 

 

Dominic
