Troubleshooting intelligent structure models

Consider the following troubleshooting tips when you create intelligent structure models.
Using differently structured files causes data loss.
If the intelligent structure model does not match the input file or only partially matches the input file, there might be data loss.
For example, you created a model for a sample file that contains rows with six fields of data: computer ID, computer IP address, access URL, username, password, and access timestamp. However, some of the input files contain rows with nine fields of data: computer ID, computer name, computer IP address, country of origin, access URL, username, password, access code, and access timestamp. In that case, the data might be misidentified, and some data might be designated as unidentified data.
If some input files contain more types of data than other input files, or different types of data, for best results create a sample file that contains all the different types of data.
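Before you create the model, it can help to check whether the input files contain row shapes that the sample file lacks. The following Python sketch is not part of Intelligent Structure Discovery. It assumes comma-delimited text input, and the function names field_count_profile and rows_not_covered are hypothetical. It reports field counts that occur in an input file but not in the sample file.

```python
import csv
import io
from collections import Counter


def field_count_profile(text, delimiter=","):
    """Count how many non-empty rows have each number of fields."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return Counter(len(row) for row in reader if row)


def rows_not_covered(sample_text, input_text, delimiter=","):
    """Return the field counts that occur in the input but not in the sample."""
    sample_counts = field_count_profile(sample_text, delimiter)
    input_counts = field_count_profile(input_text, delimiter)
    return sorted(n for n in input_counts if n not in sample_counts)
```

If rows_not_covered returns a non-empty list, consider adding rows with those shapes to the sample file before you create the model. A matching field count does not guarantee matching field types, so treat this as a first check only.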
Data from PDF forms was not modeled or parsed.
An intelligent structure model can model and parse the data within PDF form fields but not data outside the fields. A field title, or other data outside the field, will not be identified.
Data from Microsoft Word was not modeled or parsed.
An intelligent structure model can model and parse data within Microsoft Word tables. All other data is collected as unparsed data.
Error: Unsupported field names might cause data loss.
Do not use duplicate names for different elements.
If you use Big Data Management 10.2.1, ensure that the names of output groups follow Informatica Developer naming conventions. An element name must contain only English letters (A-Z, a-z), numerals (0-9), and underscores. Do not use reserved logical terms, and do not start element names with a number.
In later versions of Big Data Management or Data Engineering Integration, Intelligent Structure Discovery replaces special characters in element names with underscores and inserts underscores before element names that start with numerals and before element names that are reserved logical terms.
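The renaming rules described above can be approximated with the following Python sketch, for example to preview names before you create a model on version 10.2.1. This is an illustration of the stated rules, not the product's implementation. The RESERVED_TERMS set is a hypothetical placeholder because the actual list of reserved logical terms is defined by the product, and the case-insensitive comparison is an assumption.

```python
import re

# Hypothetical placeholder; the real reserved logical terms are product-defined.
RESERVED_TERMS = {"and", "or", "not"}


def sanitize_element_name(name, reserved=RESERVED_TERMS):
    """Apply the naming rules from the text to a single element name."""
    # Replace every character that is not an English letter, numeral, or underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # Insert an underscore before names that start with a numeral
    # or that are reserved terms (assumed case-insensitive here).
    if cleaned[:1].isdigit() or cleaned.lower() in reserved:
        cleaned = "_" + cleaned
    return cleaned
```

For example, under these assumptions, the name access-url becomes access_url and the name 2nd field becomes _2nd_field.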
Intelligent Structure Discovery assigns long records to an Unassigned Data field.
Intelligent Structure Discovery assigns records that exceed the maximum record size to an Unassigned Data field. The default maximum record size is 640,000 bytes.
You can increase the maximum record size by configuring one of the DTM JVM properties of the Data Integration Server service in Administrator.
Use the following syntax to define the maximum record size:
-DISD_MAX_RECORD_SIZE=<maximum record size in bytes>
For example, to define a maximum record size of 2 MB, enter the following value for the JVMOption1 property:
-DISD_MAX_RECORD_SIZE=2000000
For best results, set a maximum record size that does not exceed 10 MB.
For more information about configuring Data Integration Server service properties, see the Administrator help.
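As a minimal sketch, the following Python helper builds the option string from a size in bytes and rejects values above the recommended 10 MB limit. The helper name max_record_size_option is hypothetical, and the sketch reads 1 MB as 1,000,000 bytes to match the 2 MB example above. It only formats the value. You still enter the result in the JVM option property in Administrator.

```python
DEFAULT_MAX_RECORD_SIZE = 640_000   # default from the text, in bytes
RECOMMENDED_CEILING = 10_000_000    # 10 MB, assuming 1 MB = 1,000,000 bytes


def max_record_size_option(size_bytes):
    """Return the JVM option string for the given maximum record size."""
    if size_bytes <= 0:
        raise ValueError("maximum record size must be a positive number of bytes")
    if size_bytes > RECOMMENDED_CEILING:
        raise ValueError("maximum record size exceeds the recommended 10 MB")
    return f"-DISD_MAX_RECORD_SIZE={size_bytes}"
```

Calling max_record_size_option(2_000_000) produces the same value as the 2 MB example above.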
When you try to base a model on a sample ORC file that contains Union data, the model creation fails.
Intelligent Structure Discovery doesn't process the Union data type in ORC input. Select a file that doesn't contain Union data to base the model on.