One of the most important tasks in the data integration process is to set realistic expectations. The term data integration conjures up images of diverse databases, software, equipment, and personnel seamlessly coordinated into a fully functioning alliance, free from the persistent headaches that characterize less comprehensive information management systems. Think again. The requirements analysis phase offers one of the best opportunities in the process to recognize and assimilate the full complexity of the data integration task. In-depth attention to this analysis is perhaps the most important ingredient in creating a system that will live to see adoption and maximum usage.

As the field of data integration progresses, other common obstacles and compensating solutions will become easier to identify. Current integration practice has already highlighted some familiar challenges and strategies to address them, as described below.

Uncorrelated Data

For most transportation agencies, data integration involves synchronizing massive amounts of variable, heterogeneous data produced by internal legacy systems, each with its own data format. Legacy systems may have been built around flat files, network databases, or hierarchical databases, as opposed to newer generations of databases that use the relational model. Data in different formats from external sources continues to be added to legacy databases to increase the value of the information. Each generation, product, and internally developed system has unique needs to address when storing or extracting data. Data integration may therefore involve a variety of strategies for dealing with heterogeneity. In some cases the effort becomes a major data homogenization exercise, which may not improve the quality of the data offered.

Corrupted Data

Data quality is a major concern in any data integration strategy.
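One way to make data quality concerns concrete is to automate simple checks. The sketch below is a minimal, hypothetical example; the field names (route_id, measured_on, traffic_count) and rules are illustrative assumptions, and a real agency would tailor both to its own schemas.

```python
def check_record(record):
    """Return a list of data-quality problems found in one record."""
    problems = []
    # Completeness: required fields must be present and non-empty.
    # These field names are hypothetical, for illustration only.
    for field in ("route_id", "measured_on", "traffic_count"):
        if not record.get(field):
            problems.append(f"missing {field}")
    # Validity: a traffic count must be a non-negative integer.
    count = record.get("traffic_count")
    if count is not None and (not isinstance(count, int) or count < 0):
        problems.append("traffic_count is not a non-negative integer")
    return problems

records = [
    {"route_id": "I-80", "measured_on": "2023-05-01", "traffic_count": 1250},
    {"route_id": "", "measured_on": "2023-05-01", "traffic_count": -4},
]
# Map each failing record's position to the problems found in it.
report = {i: check_record(r) for i, r in enumerate(records) if check_record(r)}
print(report)  # only the second record is flagged
```

Running checks like these before conversion, rather than after, is what keeps legacy impurities from accumulating downstream.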
Data must be cleansed before conversion and integration; otherwise an agency will almost certainly face serious data problems later. Legacy data impurities have a cumulative effect, and by nature they fall hardest on users of large data volumes. If the underlying information is corrupt, so are the decisions made from it. It is not uncommon for previously undiscovered data quality issues to emerge in the process of cleaning information for use by the integrated system. The problem of bad data leads agencies to institute procedures for regularly checking the quality of the information they use, but it is not always clear who holds ultimate responsibility for this work.

Challenges Related to Lack of Storage Capacity

The unexpected need for additional performance and capacity is one of the most common challenges in data integration, particularly in data warehousing. Two storage requirements generally come into play: extensibility and scalability. Anticipating the magnitude of growth in an environment where storage needs can increase exponentially once a system goes live raises concerns that storage costs will outweigh the benefits of data integration. Loading such massive amounts of data can push the limits of hardware and software, and could force developers into expensive remedies if the planned system's architecture had to be adapted to process much larger volumes of data.

Unexpected Fees

Data integration costs are driven largely by elements that are difficult to quantify.
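The scalability concern raised under storage capacity, namely that storage can grow exponentially once a system goes live, can be illustrated with a simple compounding projection. The starting volume and monthly growth rate below are hypothetical, chosen only to show how quickly exponential growth outruns initial capacity planning.

```python
def project_storage(initial_gb, monthly_growth, months):
    """Return projected storage in GB for each month, compounding monthly."""
    sizes = []
    size = initial_gb
    for _ in range(months):
        size *= 1 + monthly_growth
        sizes.append(round(size, 1))
    return sizes

# Hypothetical scenario: 500 GB at go-live, growing 15% per month.
# At that rate the volume roughly doubles within five months.
projection = project_storage(500, 0.15, 12)
print(projection)
```

Even a back-of-the-envelope projection like this, run during requirements analysis, helps an agency decide whether the planned architecture and storage budget can absorb post-launch growth.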