Cloud Data Integration : 2013 : July

Last Tuesday we held our first ever Informatica Cloud #TechTuesdays session, "Better Business Intelligence: Blazing Fast Data Replication for Cloud Apps".

 

We focused on how cloud applications such as Salesforce were built from the ground up to allow rapid customization and, as a result, contain numerous custom objects and custom fields. Consequently, when extracting data from Salesforce into a staging database, it is important to ensure that all changed fields have propagated into the relevant mappings for the business intelligence process.

 

The demo focused on four critical elements of a data replication process:

 

  • Replicating Deleted & Archived Rows
  • Auto-creation of Database Tables
  • “Full Load” vs. “Incremental Load”
  • Using the Salesforce Bulk API

 

Here are the slides from the session, along with the entire recording, featuring Ron Lunasin, who heads our Cloud Platform Adoption team.

 

Session 2 takes place tomorrow, Tuesday, July 30th at 8:30am PDT and will focus on Big Data, specifically how to perform data warehousing through the cloud with Amazon Redshift.

One of the questions we often hear from people who see Informatica Cloud integration for the first time (after they are amazed by how easy it is to get up and running and use, of course!) is:

 

“We have PowerCenter, how can we best take advantage of the cloud integration service?”

 

There are whitepapers on the topic of Hybrid IT and Extending PowerCenter, but I thought I’d put the question to Mark Murray, one of Informatica’s long-time technical specialists. Here is his summary:

 

Self-Service

  • Informatica Cloud enables solution support teams to be self-sufficient. It requires very little training to set up and administer.
  • With Informatica Cloud, business unit subject matter experts and/or data analysts can be self-sufficient in defining initial data integration needs.
    • As complexity grows, center of excellence / integration competency center (ICC) experts can easily provide assistance. With Informatica Cloud integration, mappings and mapplets (known as Plugins in Informatica Cloud) can be shared between cloud and on-premise environments.
 
Software as a Service

  • Standards-based and delivered as an on-demand service, Informatica Cloud makes it easier than ever to load, synchronize, and replicate data. And new process automation and data masking features, not to mention data quality and MDM, add more cloud data management capabilities with each release.
  • With Informatica Cloud, total cost of ownership (TCO) is lower. Because it is SaaS-based with subscription pricing, customers typically highlight minimal administrative and support costs, rapid implementation time, and lower development cost as key benefits.
  • Ease of maintenance (automatic upgrades) and the need for fewer resources are also key strengths. As a result, Informatica Cloud has proven to be great for emerging markets and new cloud data integration projects with limited resources.
  • It’s ever evolving and growing! Informatica Cloud Integration delivers an upgrade approximately three times a year, so there are always new features, functions, services, and connectors available to you.

 

Cloud Connectors and Templates

  • It’s hard to keep up with all of the new Cloud Connectors that are now available via the Informatica Marketplace Cloud Integration Mall. Check it out for yourself!
  • How do I activate or access a new connector in my Informatica Cloud Org? It’s easy! Just navigate to the Administrator | Connectors tab, select Free Trial, and you’re good to go!
  • Cloud Integration Templates go beyond simply connecting two endpoints. They recognize the repeatable patterns and allow our customers to jumpstart projects and take advantage of best practices.
  • And the great part about the extensive Cloud Connectors and Cloud Integration Templates is that customers, partners, and employees can access the tools we use to build them, and build and share their own. Details are available on our Developer site.
 
Thanks for the overview, Mark! I hope this is enough to get more PowerCenter users started with an Informatica Cloud Integration trial. We're interested in how our customers and partners answer this question as well. Please share your thoughts here.

As companies adopt software as a service (SaaS) applications, such as a CRM solution like Salesforce, the collection of customer information moves from on-premise to the cloud, but the need for business intelligence remains. Cloud application data silos and data quality are issues in any business intelligence initiative. Utilizing customer information can provide a substantial competitive advantage in your industry; however, it needs to be put to work.

 

A majority of companies have been attempting to put business intelligence strategies in place with little success. Issues arise when companies assume that purchasing a business intelligence application guarantees a successful program. These tools must be used in tandem with effective strategies; the applications themselves are not a strategy. Current tactics revolve around creating synergy between the applications and the data.

 

According to Julie Hall in her article Business Intelligence: The Missing Link in Your CRM Strategy, “An often-overlooked component of CRM is the process of applying lessons learned from customer information to enhance business and customer relationship behaviors. Even after purchasing BI software, it's important that you invest both time and money in defining how the information will be used for business advantage.”

 

Developing a solid business intelligence strategy requires correctly leveraging customer information, and the results will improve the company's relationship with its current customer base. Utilizing a variety of statistical models and analytics will provide a range of metrics to monitor successful activities and best practices for your field. These can then be used to examine a variety of factors around consumer information. As Julie mentions, a critical step in any BI strategy is to ensure you're analyzing results based on the most relevant information available. The better you maintain correct data for every customer interaction and result, and the more up to date you keep the conditions surrounding those interactions, the better your chances of predicting key influencing behaviors.

 

Business intelligence can assist companies across a spectrum of industries, from retail to healthcare and technology. Managing areas such as customer relations has been shown to significantly increase revenues and profits through the advantages that better BI provides. Getting data in and out of your cloud applications and keeping it clean is difficult without the right tools, which is not so different from the challenge with business intelligence. A strong strategy needs to be in place when adopting new SaaS applications.

 

Join the #TechTuesday webinar on BI for cloud apps next Tuesday, July 23rd, to continue this discussion. RSVP here

This is a continuation of the discussion we started in Part 1 of this topic on Cloud Integration Templates, where we saw how to define field link rules using regular expressions. Refer to that post for other details. Let's recap the benefits of that approach:

 

  1. We did not have to specify rules for each individual field separately
  2. If the number of fields of each of these types changes in the future, these rules will still handle them
  3. We have broken down the data structure into field sets of different logical types, for which we can define type-specific transformations

 

The template we used is shown in the screenshot below. The rules described in the earlier post link fields from the object “SQ” to the object “EXPRegEx”. Now we need to define transformations for each field set.

[Screenshot: the template data flow, with the objects “SQ” and “EXPRegEx”]
For that, the ability to perform #3 above is very important. Having defined the field links like this, if we then had to define transformations using individual field names, it would diminish the benefits of using regex patterns for links.

 

Thankfully, templates allow a user to define transformations on regex patterns as well! Now we will see how we can do that.

 

For this we need to use what is called a “macro definition” in expressions. We use a declaration field to identify all field name patterns, and then define pattern-specific transformations on those.

 

In the screenshot below, the first field in the Expression is the declaration discussed above. It declares four patterns:

 

  Name  Pattern     Description
  QQ    Qty$        Any expression defined on the name “QQ” will be applied to all fields with names ending with Qty
  QY    Qtd$|Ytd$   Any expression defined on the name “QY” will be applied to all fields with names ending with Qtd or Ytd
  DF    Dt$|Date$   Any expression defined on the name “DF” will be applied to all fields with names ending with Dt or Date
  CC    Comment\d   Any expression defined on the name “CC” will be applied to all fields with names having the text “Comment” followed by a digit

[Screenshot: the Expression, showing the declaration field and the pattern-based output fields]
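
To make the pattern semantics concrete, here is a minimal sketch using plain Python regular expressions (standard regex only, not Informatica Cloud template syntax; the field names are from our sample structure) showing which fields each declared pattern selects:

  import re

  # The four patterns from the declaration above; the names are arbitrary labels.
  patterns = {
      "QQ": r"Qty$",       # names ending with Qty
      "QY": r"Qtd$|Ytd$",  # names ending with Qtd or Ytd
      "DF": r"Dt$|Date$",  # names ending with Dt or Date
      "CC": r"Comment\d",  # names containing "Comment" followed by a digit
  }

  fields = ["TotalOrderedQty", "PaidAmtYtd", "CreatedDate",
            "MostRecentTransactionDt", "Comment1", "StatusCode"]

  for name, regex in patterns.items():
      print(name, [f for f in fields if re.search(regex, f)])
  # QQ ['TotalOrderedQty']
  # QY ['PaidAmtYtd']
  # DF ['CreatedDate', 'MostRecentTransactionDt']
  # CC ['Comment1']
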
The names used for the patterns above are just for convenience and do not mean anything in particular. Now a quick word about the actual field names and expressions used in this Expression:

 

Let’s consider the field “OUT2_%DF%”. Since we have four date fields in the input, the actual Custom Integration Task will create four fields with the values below:

 

  Field                          Expression
  OUT2_CreatedDate               TO_DATE(CreatedDate,'yyyymmdd')
  OUT2_LastModifiedDate          TO_DATE(LastModifiedDate,'yyyymmdd')
  OUT2_FirstTransactionPostedDt  TO_DATE(FirstTransactionPostedDt,'yyyymmdd')
  OUT2_MostRecentTransactionDt   TO_DATE(MostRecentTransactionDt,'yyyymmdd')

 

As you can see, the task has “vertically expanded” the expression defined on the field pattern. If more date fields are added to the input file structure in the future, this will handle them automatically.
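
Here is a rough Python sketch of that vertical expansion, under the assumption that expansion simply means generating one output field and expression per matching input field (the actual mechanics inside the task may differ):

  import re

  date_fields = ["CreatedDate", "LastModifiedDate",
                 "FirstTransactionPostedDt", "MostRecentTransactionDt"]

  # One expression defined on the DF pattern expands to one expression per field.
  for f in date_fields:
      assert re.search(r"Dt$|Date$", f)  # each field matches the DF pattern
      print(f"OUT2_{f} = TO_DATE({f},'yyyymmdd')")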

 

We do not have to use prefixes like OUT2_ here, but doing so allows us to configure link rules or expressions downstream if necessary. For example, we can create a link rule that uses “^OUT2” in the pattern to link only these fields to another object.

 

Hope this example illustrates how you can identify and work with logical field sets using regex patterns throughout your data flow. Feel free to post your questions on the Informatica Cloud Developer Community as you try these!

As we define data integration patterns using Templates on Informatica Cloud, one of the most powerful features is the ability to specify link rules (the rules that define how fields are linked between objects) in a variety of ways. You can define these using field names, field data types, an externally created file of field pairs (a “Dictionary” file), and so on. One of these options is the use of “patterns”: you can use regex patterns to identify field sets on which to perform transformations.

 

Let me admit something before I get into the details. For somebody who is writing this to help others, I am usually quite confused by those strange-looking combinations of special characters. Despite having done my share of “awk”, “grep”, and find/replace in “vi” using such tokens, it all looks very unfamiliar when I come back to it after a few days. On such occasions I always wish I had written some illustrations down somewhere so I could refer to them.

 

So I am hoping this post will help me as much as it might help others.

 

When we look at these within the context of specifying rules for linking fields, we don’t have to go all “^ShipTo|BillTo\s{0,1}\d{2}[#\-]$”, at least not in the very first rule. In fact, I don’t expect most such rules to be very complicated. Let’s start with something simple.

 

Let’s assume we are reading from a sample file or table with the field names listed below:

  • CustomerId
  • CustomerKey
  • RecordType
  • CustName
  • CustDesc
  • CreatedDate
  • LastModifiedDate
  • FirstTransactionPostedDt
  • MostRecentTransactionDt
  • TotalOrderedQty
  • TotalReceivedQty
  • InvoicedAmtQtd
  • PaidAmtQtd
  • InvoiceAmtYtd
  • PaidAmtYtd
  • Comment1
  • Comment2
  • Comment3
  • Comment4
  • StatusCode
  • CurrencyCode
  • ShipToLocation
  • BillToLocation

 

Now, if you are one of those fortunate people who have never had to deal with data definitions created by somebody else, you might think this is a convenient structure created to demonstrate the point, and to some extent it is. But for those of us who do have to work with such structures, this looks very much like real life.

 

The use case here is that we need to load this data into some other table/file. While loading it, we need to transform some of these values differently (a sketch of how these requirements map to field-name patterns follows the list):

  1. All date fields need to be converted to a “datetime” data type
  2. All QTD/YTD amounts need to be converted to amounts in the company currency
  3. Only the first 80 characters of each comment field need to be used
  4. Fields with “Total” in the name need to be linked to fields with similar names, but without the “Total”
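
As a rough illustration in plain Python regex terms (a sketch only, not the template rule syntax), the four requirements above correspond to four field-name patterns:

  import re

  fields = [
      "CustomerId", "CustomerKey", "RecordType", "CustName", "CustDesc",
      "CreatedDate", "LastModifiedDate", "FirstTransactionPostedDt",
      "MostRecentTransactionDt", "TotalOrderedQty", "TotalReceivedQty",
      "InvoicedAmtQtd", "PaidAmtQtd", "InvoiceAmtYtd", "PaidAmtYtd",
      "Comment1", "Comment2", "Comment3", "Comment4",
      "StatusCode", "CurrencyCode", "ShipToLocation", "BillToLocation",
  ]

  requirements = {
      "1. date fields":     r"Dt$|Date$",   # names ending with Dt or Date
      "2. QTD/YTD amounts": r"Qtd$|Ytd$",   # names ending with Qtd or Ytd
      "3. comment fields":  r"Comment\d",   # Comment followed by a digit
      "4. Total fields":    r"^Total",      # names starting with Total
  }

  for label, regex in requirements.items():
      print(label, "->", [f for f in fields if re.search(regex, f)])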

 

Moreover, such a structure can undergo changes over time, with more fields of some type being added (e.g., a “Comment5” gets added later). So you want to future-proof your DI flow to handle such cases.

 

To achieve this let’s refer to a template that defines different ways of transforming this data to implement the requirements listed above.

 

This template depicts a pattern where we read some customer records from a flat file, perform some transformations on the values, and then write to another flat file. The use of files here is just for illustration; this can be done between any source/target types supported on Informatica Cloud. The rules described below are between the object “SQ” and the object “EXPRegEx”.

[Screenshot: the template data flow from the source file through “SQ” and “EXPRegEx” to the target]
Let’s look at the rules for linking.

 

Rule #1: Link fields with names ending with “Dt” or “Date” to fields with identical names

 

The screenshot below shows the actual rule definition. For the “from” side we use the pattern “Dt$|Date$”, which matches any name that ends with either “Dt” or “Date”; this links the four date fields in the list above. For the “to” side we use “$0”, which means these fields will be linked to fields with identical names. Note that we have used the “Pattern” option in the link rule.

[Screenshot: link rule definition for Rule #1]
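
In plain regex terms, the behavior of this rule looks roughly like the following Python sketch (the rule engine itself works differently; this only shows the matching and the link-to-identical-name outcome):

  import re

  source_fields = ["CustomerId", "CreatedDate", "LastModifiedDate",
                   "FirstTransactionPostedDt", "MostRecentTransactionDt",
                   "StatusCode"]

  # "from" pattern Dt$|Date$ selects the fields; "to" pattern $0 links each
  # one to a field with the identical name.
  links = {f: f for f in source_fields if re.search(r"Dt$|Date$", f)}
  print(links)
  # {'CreatedDate': 'CreatedDate', 'LastModifiedDate': 'LastModifiedDate',
  #  'FirstTransactionPostedDt': 'FirstTransactionPostedDt',
  #  'MostRecentTransactionDt': 'MostRecentTransactionDt'}
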
For other field sets we will use similar rule definitions.

 

Rule #2: Qtd/Ytd amounts to be linked as-is

 

This will be done in a similar way:

[Screenshot: link rule definition for Rule #2]
Rule #3: All comment fields to be linked as-is

 

The list above contains comment fields with a digit suffix in the name. We define a rule that applies to all such fields:

[Screenshot: link rule definition for Rule #3]
Rule #4: Quantity fields with names starting with “Total” to be linked, without the “Total” prefix

[Screenshot: link rule definition for Rule #4]
The empty “to” pattern here means a field named “TotalOrderedQty” will be linked to a field “OrderedQty”, dropping the “Total” at the start.
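
Again in plain Python regex terms (a sketch, not the rule engine), an empty replacement on the “^Total” pattern produces exactly that renaming:

  import re

  fields = ["TotalOrderedQty", "TotalReceivedQty", "StatusCode"]

  # Empty "to" pattern: the matched "Total" prefix is replaced with nothing.
  links = {f: re.sub(r"^Total", "", f) for f in fields if re.match(r"Total", f)}
  print(links)  # {'TotalOrderedQty': 'OrderedQty', 'TotalReceivedQty': 'ReceivedQty'}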

 

This allows us to link fields based on naming patterns. There are several benefits to doing this:

  1. We did not have to specify rules for each individual field separately
  2. If the number of fields of each of these types changes in the future, these rules will still handle them
  3. We have broken down the data structure into field sets of different logical types, for which we can define type-specific transformations

 

What next?

 

The first time I tried this, at this point I almost felt like the fish in the last scene of Finding Nemo: they have crossed the street in a plastic bag and made it into the ocean, but are wondering how to get out of the bag.

 

To solve it, we are going to define the transformations in a similar way, where we can use regex patterns. Let’s look at that in the second part of this post!

Informatica Cloud will be holding weekly webinars starting Tuesday July 23rd, covering a wide range of deeply technical issues with experts from our product management and engineering team. The first session is on performing BI with cloud applications such as Salesforce using Data Replication. Subsequent sessions focus on Big Data with Amazon Redshift, web services, SAP integration, integration templates, and much more.

 

You can view the full list of sessions here and register for any that you find interesting. The schedule currently runs until October 1, and will be updated with more sessions in late August.

 

Please submit any questions you'd like answered during the webinars about Informatica Cloud, and we will do our best to answer them during a session relevant to the topic. You can submit your questions on this Community site, in the Informatica Cloud LinkedIn group, or by mentioning the @infacloud Twitter handle.

With new healthcare reforms, increasing competition and a more difficult path to bringing new products to market, pharmaceutical and life sciences companies face many unique challenges.

 

The right approach to customer relationship management (CRM), cloud data integration, and master data management (MDM) allows these organizations to overcome these obstacles and unlock their information potential with a single customer view so they can:

 

  • Deliver superior customer service
  • Improve customer loyalty and retention rates
  • Collaborate and innovate to stay ahead
  • Reduce time to market for new products
  • Improve sales productivity and reduce cost of sales
  • Maximize marketing spend
  • and more!

 

Informatica is hosting a webinar on July 16th that will go over the basics of cloud MDM and how it helps improve Salesforce and Veeva adoption and return on investment. Register here.

 

Additional resources: