Geetika Agrawal - San Jose CA, US Mary Ann Roth - San Jose CA, US Peter Martin Schwarz - San Jose CA, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 7/00 G06F 17/00
US Classification:
707/10, 707/100, 707/2, 707/103Y, 709/203
Abstract:
Various embodiments of a method to access metadata from a plurality of data servers from a federated database management system are provided. In one embodiment, a request for metadata, from a client application, is received by the federated database management system. Data servers which are accessible from the federated database management system are identified. For each data server, metadata describing data of a data source of that data server is retrieved in accordance with the application request. The retrieved metadata from each of the data servers is aggregated to produce an aggregated result in a uniform format. The aggregated result is provided. In another embodiment, for each data server, a source metadata request for metadata of that data server is generated in accordance with the application request and a source metadata application programming interface. A view is created based on the source metadata request for metadata for each data server.
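The first embodiment in the abstract (identify the servers, retrieve metadata from each, aggregate into a uniform format) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the `ColumnInfo` record, the `SERVERS` registry, the two stand-in fetch functions, and the native key names are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ColumnInfo:
    """Uniform record returned to the client (field names are assumptions)."""
    server: str
    table: str
    column: str
    data_type: str


# Stand-ins for two data servers, each exposing its catalog in a native shape.
def fetch_srv_a() -> List[dict]:
    return [{"TABLE_NAME": "EMP", "COLUMN_NAME": "ID", "DATA_TYPE": "NUMBER"}]


def fetch_srv_b() -> List[dict]:
    return [{"tabname": "DEPT", "colname": "NAME", "typename": "VARCHAR"}]


# Hypothetical registry: server name -> (fetch function, native key per uniform field).
SERVERS: Dict[str, Tuple[Callable[[], List[dict]], Dict[str, str]]] = {
    "srv_a": (fetch_srv_a, {"table": "TABLE_NAME", "column": "COLUMN_NAME", "data_type": "DATA_TYPE"}),
    "srv_b": (fetch_srv_b, {"table": "tabname", "column": "colname", "data_type": "typename"}),
}


def get_columns(table: Optional[str] = None) -> List[ColumnInfo]:
    """Collect column metadata from every registered server in one format."""
    result = []
    for server, (fetch, keys) in SERVERS.items():
        for row in fetch():
            info = ColumnInfo(server, row[keys["table"]], row[keys["column"]], row[keys["data_type"]])
            # Apply the client's request (here: an optional table-name filter).
            if table is None or info.table == table:
                result.append(info)
    return result
```

Calling `get_columns()` returns one list covering both servers, while `get_columns("DEPT")` narrows the aggregated result to a single table.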
Lucian Popa - San Jose CA, US Mary Ann Roth - San Jose CA, US Craig Salter - Hamilton, CA
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 7/00
US Classification:
707/100
Abstract:
A method and system for specifying, in a schema mapping framework, a mapping between a source schema and a target schema. The source and target schemas are schemas included in respective groups of registered, heterogeneous schemas. The source and target schemas may be of different types. Serialized versions of the source and target schemas include source objects and target objects, respectively. A mapping model is serialized into mapping objects that include logical references representing the source objects and logical references representing the target objects. The logical references are resolved to the source objects and target objects, thereby storing pointers to the source objects and to the target objects. After resolving the logical references, the mapping model includes the logical references and the pointers to the source and target objects.
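The resolution step described in the abstract, where logical references in mapping objects are resolved to the actual source and target objects while the references themselves are kept, might look roughly like the sketch below. The class names, the `rel:` and `xml:` reference syntax, and the registry dictionaries are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SchemaObject:
    path: str  # location of the object inside its schema
    kind: str  # schema type, e.g. relational column or XML element


@dataclass
class MappingObject:
    source_ref: str  # logical reference, survives serialization
    target_ref: str
    source: Optional[SchemaObject] = None  # pointer, filled in by resolve()
    target: Optional[SchemaObject] = None


def resolve(mappings: List[MappingObject],
            sources: Dict[str, SchemaObject],
            targets: Dict[str, SchemaObject]) -> None:
    """Store pointers to the referenced objects; keep the logical references."""
    for m in mappings:
        m.source = sources[m.source_ref]
        m.target = targets[m.target_ref]


# Hypothetical registered schemas of different types.
sources = {"rel:customer.name": SchemaObject("customer.name", "relational column")}
targets = {"xml:/client/fullName": SchemaObject("/client/fullName", "XML element")}

mappings = [MappingObject("rel:customer.name", "xml:/client/fullName")]
resolve(mappings, sources, targets)
```

After `resolve` runs, each mapping object carries both the original logical references and direct pointers to the source and target objects, which mirrors the final state the abstract describes.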
Using Data Mining Algorithms Including Association Rules And Tree Classifications To Discover Data Rules
Mary Ann Roth - San Jose CA, US Blayne Harold Chard - San Jose CA, US Yannick Saillet - Stuttgart, DE Harald Clyde Smith - Groveland MA, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06N 5/02
US Classification:
706/47, 706/59, 707/776, 707/797, 707/E17.044
Abstract:
Provided are a method, system, and article of manufacture for using a data mining algorithm to discover data rules. A data set including multiple records is processed to generate data rules for the data set. Each record has a record format including a plurality of fields and each rule provides a predicted condition for one field based on at least one predictor condition in at least one other field. The generated data rules are provided to a user interface to enable a user to edit the generated data rules. The data rules are stored in a rule repository to be available to use to validate data sets having the record format.
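As a rough illustration of rules of the form "predictor condition in one field implies predicted condition in another", the sketch below uses a simple support and confidence count. This is a stand-in chosen for brevity, not the algorithm claimed in the patent; the `Rule` tuple, the thresholds, and the sample records are all assumptions.

```python
from collections import Counter, defaultdict, namedtuple
from itertools import permutations

Rule = namedtuple("Rule", "predictor predictor_value predicted predicted_value confidence")


def discover_rules(records, min_support=2, min_confidence=0.9):
    """Find 'if field A == a then field B == b' rules over same-format records."""
    fields = list(records[0].keys())
    rules = []
    for predictor, predicted in permutations(fields, 2):
        counts = defaultdict(Counter)  # predictor value -> counts of predicted values
        for r in records:
            counts[r[predictor]][r[predicted]] += 1
        for pval, seen in counts.items():
            tval, hits = seen.most_common(1)[0]
            confidence = hits / sum(seen.values())
            if hits >= min_support and confidence >= min_confidence:
                rules.append(Rule(predictor, pval, predicted, tval, confidence))
    return rules


def violations(record, rules):
    """Return the stored rules that a record with the same format breaks."""
    return [r for r in rules
            if record.get(r.predictor) == r.predictor_value
            and record.get(r.predicted) != r.predicted_value]


records = (
    [{"country": "US", "currency": "USD"}] * 3
    + [{"country": "DE", "currency": "EUR"}] * 2
)
rules = discover_rules(records)
```

In the workflow the abstract describes, the list returned by `discover_rules` would be shown to a user for editing and then stored; `violations` hints at how stored rules could later validate another data set with the same record format.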
Dynamically Building And Populating Data Marts With Data Stored In Repositories
Benjamin Honzal - Boeblingen, DE Holger Kache - San Jose CA, US Mary A. Roth - San Jose CA, US Guenter A. Sauter - Ridgefield CT, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 17/30
US Classification:
707/600, 707/695, 707/806
Abstract:
Methods, systems, and articles of manufacture for constructing and populating data marts with dimensional data models from a set of data repositories that contain factual and association information about a set of related assets are disclosed. An intermediate data warehouse is generated to process the facts and associations for each asset. Using the intermediate warehouse, one or more data marts are generated with fact tables, dimensions, and hierarchies to fully model the information available for each asset.
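The two-stage flow in the abstract (repositories feed an intermediate warehouse, which in turn feeds a data mart with fact and dimension tables) can be sketched as follows. The repository layout, the function names, and the `owned_by` relation are hypothetical, and hierarchies are left out to keep the example short.

```python
def build_intermediate(repositories):
    """Merge facts and associations from all repositories, keyed by asset."""
    warehouse = {}
    for repo in repositories:
        for asset, name, value in repo.get("facts", []):
            entry = warehouse.setdefault(asset, {"facts": {}, "associations": []})
            entry["facts"][name] = value
        for asset, relation, other in repo.get("associations", []):
            entry = warehouse.setdefault(asset, {"facts": {}, "associations": []})
            entry["associations"].append((relation, other))
    return warehouse


def build_mart(warehouse, relation):
    """Derive one dimension (from an association) and a fact table that references it."""
    dimension = {}  # dimension member -> surrogate key
    fact_rows = []
    for asset in sorted(warehouse):
        entry = warehouse[asset]
        for rel, other in entry["associations"]:
            if rel == relation:
                key = dimension.setdefault(other, len(dimension) + 1)
                fact_rows.append({"asset": asset, f"{relation}_key": key, **entry["facts"]})
    return dimension, fact_rows


# Hypothetical repositories: one holds facts, the other holds associations.
repositories = [
    {"facts": [("t1", "row_count", 100), ("t2", "row_count", 5)]},
    {"associations": [("t1", "owned_by", "alice"), ("t2", "owned_by", "alice")]},
]
warehouse = build_intermediate(repositories)
dimension, fact_rows = build_mart(warehouse, "owned_by")
```

Keeping the intermediate structure separate from the mart means several marts, each built around a different association, could be generated from the same merged data.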
Common Interface To Access Catalog Information From Heterogeneous Databases
Geetika Agrawal - San Jose CA, US Mary Ann Roth - San Jose CA, US Peter Martin Schwarz - San Jose CA, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 7/00 G06F 17/30
US Classification:
707/770, 709/203
Abstract:
Various embodiments of a system and computer program product to access metadata from a plurality of data servers from a federated database management system are provided. In one embodiment, a request for metadata, from a client application, is received by the federated database management system. Data servers which are accessible from the federated database management system are identified. For each data server, metadata describing data of a data source of that data server is retrieved in accordance with the application request. The retrieved metadata from each of the data servers is aggregated to produce an aggregated result in a uniform format. The aggregated result is provided. In another embodiment, for each data server, a source metadata request for metadata of that data server is generated in accordance with the application request and a source metadata application programming interface. A view is created based on the source metadata request for metadata for each data server.
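The second embodiment, in which a source-specific metadata request is generated per data server and a view is created from it, can be approximated with Python's built-in `sqlite3` module. This is only a sketch under stated assumptions: the catalog table names, the request strings, and the extra combined view `all_columns` are invented for illustration, and a single in-memory database stands in for separate servers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two hypothetical native catalogs with different column names.
conn.executescript("""
CREATE TABLE srv_a_columns (table_name TEXT, column_name TEXT);
CREATE TABLE srv_b_syscols (tabname TEXT, colname TEXT);
INSERT INTO srv_a_columns VALUES ('EMP', 'ID');
INSERT INTO srv_b_syscols VALUES ('DEPT', 'NAME');
""")

# One source metadata request per server, each projecting a uniform shape.
SOURCE_REQUESTS = {
    "srv_a": "SELECT 'srv_a' AS server, table_name AS tbl, column_name AS col FROM srv_a_columns",
    "srv_b": "SELECT 'srv_b' AS server, tabname AS tbl, colname AS col FROM srv_b_syscols",
}


def create_metadata_views(conn, requests):
    """Create one view per server, plus a combined view over all of them."""
    names = sorted(requests)
    for name in names:
        conn.execute(f"CREATE VIEW meta_{name} AS {requests[name]}")
    union = " UNION ALL ".join(f"SELECT * FROM meta_{name}" for name in names)
    conn.execute(f"CREATE VIEW all_columns AS {union}")


create_metadata_views(conn, SOURCE_REQUESTS)
rows = conn.execute("SELECT server, tbl, col FROM all_columns ORDER BY server").fetchall()
```

A client can then query `all_columns` without knowing how each server names its catalog tables.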
Lucian Popa - San Jose CA, US Mary Anne Roth - San Jose CA, US Craig Salter - Ontario, CA
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 7/00
US Classification:
707/601, 707/758, 707/774
Abstract:
A method, system and program product for specifying, in a schema mapping framework, a mapping between a source schema and a target schema. The source and target schemas are schemas included in respective groups of registered, heterogeneous schemas. The source and target schemas may be of different types. Serialized versions of the source and target schemas include source objects and target objects, respectively. A mapping model is serialized into mapping objects that include logical references representing the source objects and logical references representing the target objects. The logical references are resolved to the source objects and target objects, thereby storing pointers to the source objects and to the target objects. After resolving the logical references, the mapping model includes the logical references and the pointers to the source and target objects.
Using A Data Mining Algorithm To Generate Format Rules Used To Validate Data Sets
Jacques Joseph Labrie - Sunnyvale CA, US David Thomas Meeks - Ashland MA, US Mary Ann Roth - San Jose CA, US Yannick Saillet - Stuttgart, DE
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 17/00
US Classification:
707/694, 706/46
Abstract:
Provided are a method, system, and article of manufacture for using a data mining algorithm to generate format rules used to validate data sets. A data set has a plurality of columns and records providing data for each of the columns. Selection is received of at least one format column for which format rules are to be generated and selection is received of at least one predictor column. A format mask column is generated for each selected format column. For records in the data set, a value in the at least one format column is converted to a format mask representing a format of the value in the format column, and the format mask is stored in the format mask column in the record for which the format mask was generated. The at least one predictor column and the at least one format mask column are processed to generate at least one format rule. Each format rule specifies a format mask associated with at least one condition in the at least one predictor column.
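To make the format-mask idea concrete, here is a minimal sketch. The mask alphabet (`9` for digits, `A` for letters, other characters kept as-is), the `_mask` column suffix, the confidence threshold, and the sample data are all assumptions; a simple majority count stands in for the data mining algorithm.

```python
from collections import Counter, defaultdict


def format_mask(value: str) -> str:
    """Replace digits with '9' and letters with 'A'; keep other characters."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)
    return "".join(out)


def add_mask_column(records, format_col):
    """Return copies of the records with a generated format mask column."""
    mask_col = format_col + "_mask"
    return [{**r, mask_col: format_mask(r[format_col])} for r in records]


def format_rules(records, predictor, format_col, min_confidence=0.8):
    """Map each predictor value to the mask it predicts, if confident enough."""
    mask_col = format_col + "_mask"
    counts = defaultdict(Counter)
    for r in add_mask_column(records, format_col):
        counts[r[predictor]][r[mask_col]] += 1
    rules = {}
    for pval, seen in counts.items():
        mask, hits = seen.most_common(1)[0]
        if hits / sum(seen.values()) >= min_confidence:
            rules[pval] = mask
    return rules


records = [
    {"country": "US", "postal": "95120"},
    {"country": "US", "postal": "10001"},
    {"country": "CA", "postal": "K1A 0B1"},
    {"country": "CA", "postal": "M5V 2T6"},
]
rules = format_rules(records, "country", "postal")
```

Each entry in `rules` pairs a condition on the predictor column with the format mask expected in the format column.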
Using A Data Mining Algorithm To Generate Rules Used To Validate A Selected Region Of A Predicted Column
Mary Ann Roth - San Jose CA, US Yannick Saillet - Stuttgart, DE
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 17/00
US Classification:
707/694, 706/46
Abstract:
Provided are an article of manufacture, system, and method for using a data mining algorithm to generate rules used to validate a selected region of a predicted column. A data set has a plurality of columns and records providing data for each of the columns. Selection is received of at least one predicted column for which rules are to be generated and at least one region of the selected at least one predicted column, wherein each region specifies data positions in the column. The data set is processed to determine association relationships among data in at least one predictor column and subsequences in the selected at least one region of the at least one predicted column. At least one rule is generated from the relationships specifying a condition involving at least one predictor column that predicts at least one value in the selected region of the at least one predicted column.
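A selected region here means a range of character positions within the predicted column. The sketch below derives rules that tie a predictor value to the subsequence found in that region; the `(start, end)` region encoding, the function names, the threshold, and the sample records are assumptions, and a majority count again stands in for the association analysis.

```python
from collections import Counter, defaultdict


def region_rules(records, predictor, predicted, region, min_confidence=0.8):
    """Map each predictor value to the subsequence it predicts in the region."""
    start, end = region  # character positions, end exclusive
    counts = defaultdict(Counter)
    for r in records:
        counts[r[predictor]][r[predicted][start:end]] += 1
    rules = {}
    for pval, seen in counts.items():
        sub, hits = seen.most_common(1)[0]
        if hits / sum(seen.values()) >= min_confidence:
            rules[pval] = sub
    return rules


def is_valid(record, predictor, predicted, region, rules):
    """Check only the selected region; records without a matching rule pass."""
    expected = rules.get(record[predictor])
    start, end = region
    return expected is None or record[predicted][start:end] == expected


records = [
    {"dept": "HR", "emp_id": "HR-001"},
    {"dept": "HR", "emp_id": "HR-002"},
    {"dept": "IT", "emp_id": "IT-104"},
]
rules = region_rules(records, "dept", "emp_id", (0, 2))
```

Because only positions 0 to 1 are checked, the numeric tail of `emp_id` is free to vary without breaking a rule.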
ISBN (Books And Publications)
Uncertainty in the Geologic Environment: From Theory to Practice, Proceedings of Uncertainty '96, July 31-August 3, 1996, Madison, Wisconsin
Creekside Elementary School Bloomington MN 1964-1967, Oak Grove Elementary School Bloomington MN 1967-1969, River Ridge Elementary School Bloomington MN 1969-1970, Oak Grove Intermediate School Bloomington MN 1970-1973, Jefferson Junior High School Minneapolis MN 1973-1976
Community:
Keith Barden, Melissa Schmidtbauer, John Haga, Sherry Benjamin, Yvonne Capouch, Allen Friedland