Concluding Remarks

In this article, we have proposed temporal data update methodologies for data warehousing. The goal here is to come up with mechanisms for capturing transaction lineage for each record in data warehouse tables. We identified the key areas of temporal data warehouse refreshes based on practical experience in data warehouse implementation. Many database applications need to retain all previous and current states of data from accountability and traceability perspectives. We provided methodologies to extract operational data from heterogeneous operational sources and to capture them in staging areas of the data warehouse with transaction lineage. We showed how to pull source data from more than one source staging tables in DW, how to perform join operations with inequality predicates for more than one joins and then load them in analytical subject area tables. We have provided SQL syntax to pull and load data from staging area tables to analytical tables maintaining transaction lineage.

We proposed load methodologies by way of database specific load utilities and software components such as stored procedures and macros. We showed onetime full refresh and subsequent incremental refreshes of temporal data in conventional data warehouse. In order to implement temporal data update methodologies we proposed a data model consisting of four additional columns, such as 'row effective date', 'row effective time', 'row expired date' and 'row expired time', to mark row effective date/time and row expired date/time against each row in the table. In order to make each row unique we proposed the row effective date and time columns to be part of primary key.

The transaction lineage that we capture in data warehouse provides different applications and analytical community read data whatever the subset of data they need is based on different time slices. Temporal joins have been presented for temporal relations. Also suggestions have been made to take advantage of several indices offered by current commercial databases.

We proposed several business views based on timestamp filters for use by different applications. Application specific business views have been defined with timestamps as needed by individual applications. There are several different business application needs which can be filled by providing a separate set of views with different timestamp filters as required. The application specific views with timestamp filters are defined and dynamically recompiled right after each individual application's upstream subject area refreshes are completed per service level agreement (SLA). The temporal data update methodologies presented in this article should sufficiently meet the needs of application owners and customers as it takes into consideration several factors such as providing common data source with different time-varying data.