IBM
Principal Research Staff Member at IBM Research - Almaden
IBM
Research Staff Member
Education:
University of Erlangen - Nuernberg, Germany Jun 1993
Doctorates, Doctor of Philosophy, Computer Science
University of Erlangen - Nuremberg 1983 - 1993
Doctorates, Doctor of Philosophy, Computer Science
Mehmet Altinel - San Jose CA, US Christof Bornhoevd - San Francisco CA, US Chandrasekaran Mohan - San Jose CA, US Mir Hamid Pirahesh - San Jose CA, US Berthold Reinwald - Los Gatos CA, US Saileshwar Krishnamurthy - Palo Alto CA, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 17/30
US Classification:
707/3
Abstract:
A local database cache enabling persistent, adaptive caching of either full or partial content of a remote database is provided. Content of tables comprising a local cache database is defined on per-table basis. A table is either: defined declaratively and populated in advance of query execution, or is determined dynamically and asynchronously populated on-demand during query execution. Based on a user input query originally issued against a remote DBMS and referential cache constraints between tables in a local database cache, a Janus query plan, comprising local, remote, and probe query portions is determined. A probe query portion of a Janus query plan is executed to determine whether up-to-date results can be delivered by the execution of a local query portion against a local database cache, or whether it is necessary to retrieve results from a remote database by executing a remote query portion of Janus query plan.
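The control flow described in this abstract can be illustrated with a minimal sketch. It assumes a toy in-memory "remote" table and a toy local cache, both plain dictionaries. The names `JanusPlan`, `plan_for`, `REMOTE_ORDERS`, and `cached_orders` are hypothetical and do not come from the patent. The sketch covers only the probe-then-choose step, not cache constraints or asynchronous population.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical stand-ins for a remote DBMS table and a partial local cache.
REMOTE_ORDERS = {"alice": [101, 102], "bob": [201]}
cached_orders = {"alice": [101, 102]}  # only alice's rows are cached locally


@dataclass
class JanusPlan:
    """A plan with probe, local, and remote portions (illustrative only)."""
    probe: Callable[[], bool]        # can the cache answer this query?
    local: Callable[[], List[int]]   # query portion run against the cache
    remote: Callable[[], List[int]]  # query portion run against the remote DB

    def execute(self) -> Tuple[str, List[int]]:
        # Run the probe first; it decides which branch delivers the result.
        if self.probe():
            return "local", self.local()
        return "remote", self.remote()


def plan_for(customer: str) -> JanusPlan:
    """Build a plan for the query 'order ids of one customer'."""
    return JanusPlan(
        probe=lambda: customer in cached_orders,
        local=lambda: cached_orders[customer],
        remote=lambda: REMOTE_ORDERS.get(customer, []),
    )


if __name__ == "__main__":
    print(plan_for("alice").execute())  # answered from the local cache
    print(plan_for("bob").execute())    # falls back to the remote table
```

In this sketch the probe is a simple key-membership check; a real system would derive it from the query predicates and the cache constraints.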
Ariel Fuxman - San Jose CA, US Peter Jay Haas - San Jose CA, US Berthold Reinwald - San Jose CA, US Yannis Sismanis - San Jose CA, US Ling Wang - San Jose CA, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 17/30 G06F 7/00
US Classification:
707/769, 707/602, 707/718
Abstract:
A method is disclosed for conducting a query to transform data in a pre-existing database, the method comprising: collecting database information from the pre-existing database, the database information including inconsistent dimensional tables and fact tables; running an entity discovery process on the inconsistent dimensional tables and the fact tables to produce entity mapping tables; using the entity mapping tables to resolve the inconsistent dimensional tables into resolved dimensional tables; and running the query on a resolved database to obtain a query result, the resolved database including the resolved dimensional table.
Rajesh P. Manjrekar - San Jose CA, US Berthold Reinwald - San Jose CA, US John Sismanis - San Jose CA, US Wensheng Wu - San Jose CA, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 7/00 G06F 17/30
US Classification:
707/738, 707/737, 707/736, 707/802, 707/803
Abstract:
A system and method for automatically discovering topical structures of databases includes a model builder adapted to compute various kinds of representations for the database based on schema information and data values of the database. A plurality of base clusterers is also provided, one for each representation. Each base clusterer is adapted to perform, for the representation, preliminary topical clustering of tables within the database to produce a plurality of clusters, such that each of the clusters corresponds to a set of tables on the same topic. A meta-clusterer aggregates results of the clusterers into a final clustering, such that the final clustering comprises a plurality of the clusters. A representative finder identifies representative tables from the clusters in the final clustering. The representative finder identifies at least one representative table for each of the clusters in the final clustering.
Methods For Obtaining Improved Text Similarity Measures Which Replace Similar Characters With A String Pattern Representation By Using A Semantic Data Tree
Rema Ananthanarayanan - New Delhi, IN Sreeram V. Balakrishnan - Los Altos CA, US Yuen Y. Lo - Cambridge MA, US Berthold Reinwald - San Jose CA, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 17/00
US Classification:
706/45
Abstract:
The embodiments of the invention provide methods for obtaining improved text similarity measures. More specifically, a method of measuring similarity between at least two electronic documents begins by identifying similar terms between the electronic documents. This includes basing similarity between the similar terms on patterns, wherein the patterns can include word patterns, letter patterns, numeric patterns, and/or alphanumeric patterns. The identifying of the similar terms also includes identifying multiple pattern types between the electronic documents. Moreover, the basing of the similarity on patterns identifies terms within the electronic documents that are within a category of a hierarchy. Specifically, the identifying of the terms reviews a hierarchical data tree, wherein nodes of the tree represent terms within the electronic documents. Lower nodes of the tree have specific terms; and, wherein higher nodes of the tree have general terms.
Mehmet Altinel - San Jose CA, US Chandrasekaran Mohan - San Jose CA, US Mir Hamid Pirahesh - San Jose CA, US Berthold Reinwald - Los Gatos CA, US Saileshwar Krishnamurthy - Palo Alto CA, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 17/00
US Classification:
707/769
Abstract:
A local database cache enabling persistent, adaptive caching of either full or partial content of a remote database is provided. Content of tables comprising a local cache database is defined on per-table basis. A table is either: defined declaratively and populated in advance of query execution, or is determined dynamically and asynchronously populated on-demand during query execution. Based on a user input query originally issued against a remote DBMS and referential cache constraints between tables in a local database cache, a Janus query plan, comprising local, remote, and probe query portions is determined. A probe query portion of a Janus query plan is executed to determine whether up-to-date results can be delivered by the execution of a local query portion against a local database cache, or whether it is necessary to retrieve results from a remote database by executing a remote query portion of Janus query plan.
Ariel Fuxman - San Jose CA, US Peter Jay Haas - San Jose CA, US Berthold Reinwald - San Jose CA, US Yannis Sismanis - San Jose CA, US Ling Wang - San Jose CA, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 7/00
US Classification:
707/737
Abstract:
A method is disclosed for conducting a query to transform data in a pre-existing database, the method comprising: collecting database information from the pre-existing database, the database information including inconsistent dimensional tables and fact tables; running an entity discovery process on the inconsistent dimensional tables and the fact tables to produce entity mapping tables; using the entity mapping tables to resolve the inconsistent dimensional tables into resolved dimensional tables; and running the query on a resolved database to obtain a query result, the resolved database including the resolved dimensional table.
Method For Estimating The Number Of Distinct Values In A Partitioned Dataset
Kevin Scott Beyer - San Jose CA, US Rainer Gemulla - Dresden, DE Peter Jay Haas - San Jose CA, US Berthold Reinwald - San Jose CA, US John Sismanis - San Jose CA, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 17/00 G06F 17/30
US Classification:
707/713, 707/600, 707/698, 707/719, 707/747
Abstract:
The task of estimating the number of distinct values (DVs) in a large dataset arises in a wide variety of settings in computer science and elsewhere. The present invention provides synopses for DV estimation in the setting of a partitioned dataset, as well as corresponding DV estimators that exploit these synopses. Whenever an output compound data partition is created via a multiset operation on a pair of (possibly compound) input partitions, the synopsis for the output partition can be obtained by combining the synopses of the input partitions. If the input partitions are compound partitions, it is not necessary to access the synopses for all the base partitions that were used to construct the input partitions. Superior (in certain cases near-optimal) accuracy in DV estimates is maintained, especially when the synopsis size is small. The synopses can be created in parallel, and can also handle deletions of individual partition elements.
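The idea of combining per-partition synopses can be sketched as follows. The abstract does not say which synopsis is used, so this example assumes a basic k-minimum-values (KMV) synopsis: keep the k smallest hash values of a partition, and estimate the distinct count as (k - 1) divided by the k-th smallest hash. The function names `unit_hash`, `kmv_synopsis`, `merge_union`, and `estimate_dv` are hypothetical. Only union is shown; other multiset operations and deletions are outside this sketch.

```python
import hashlib
from typing import Iterable, List


def unit_hash(value: object) -> float:
    """Map a value to a pseudo-random point in [0, 1) via SHA-256."""
    digest = hashlib.sha256(repr(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def kmv_synopsis(values: Iterable[object], k: int) -> List[float]:
    """Keep the k smallest distinct hash values of one partition."""
    return sorted({unit_hash(v) for v in values})[:k]


def merge_union(a: List[float], b: List[float], k: int) -> List[float]:
    """Synopsis of the union, built from the two input synopses alone."""
    return sorted(set(a) | set(b))[:k]


def estimate_dv(synopsis: List[float], k: int) -> float:
    """Estimate the number of distinct values from a KMV synopsis."""
    if len(synopsis) < k:
        # Fewer than k distinct hashes were seen, so the count is exact.
        return float(len(synopsis))
    return (k - 1) / synopsis[k - 1]


if __name__ == "__main__":
    k = 256
    part_a = range(0, 6000)
    part_b = range(4000, 10000)  # overlaps part_a on 2000 values
    merged = merge_union(kmv_synopsis(part_a, k), kmv_synopsis(part_b, k), k)
    print(round(estimate_dv(merged, k)))  # true distinct count is 10000
```

The merged synopsis equals the one that would be built directly over the combined data, which is why the base partitions need not be rescanned. A larger k gives a more accurate estimate at the cost of a larger synopsis.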
Larry Brown - Austin TX, US James C. Kleewein - San Jose CA, US Berthold Reinwald - Los Gatos CA, US Peter M. Schwarz - San Jose CA, US Charles Daniel Wolfson - Austin TX, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 17/30
US Classification:
707/702
Abstract:
The present invention provides a method, system and program product for integrating a service external to a database into a database such that the service may be easily invoked from the database. Preferably, the service is a web service available over the internet. The service may be invoked from any of a number of invoking mechanisms of the database. In a first specific embodiment, the mechanism comprises a user-defined function within an SQL statement. In a second specific embodiment, the mechanism comprises a virtual table. In a third specific embodiment, the mechanism comprises a stored procedure. In a fourth specific embodiment, the mechanism comprises a trigger. In a fifth specific embodiment, the mechanism comprises a federated table accessed via a nickname and implemented using a wrapper.
YouTube
Interview with Berthold Reinwald, the Brain ...
Duration:
14m 15s
sfspark.org: Berthold Reinwald, Apache SystemML
IBM is presenting at data.bythebay.io... Scalable machine learning is...
Duration:
1h 23m 8s
AI and OpenPOWER Meetup - Berthold Reinwald
This meetup was held in Mountain View on 25th March, 2018. Title: Scal...
Duration:
24m 42s
sfspark.org: Berthold Reinwald Spark/ML Q&A ...
IBM is presenting at data.bythebay.io... Berthold and Alexy talk about...
Duration:
21m 46s
AI Use-cases | Berthold Reinwald | IBM Research
AI Use-cases by Berthold Reinwald, IBM Research.
Duration:
45m 20s
Scalable Machine Learning with SystemML
Speaker: Berthold Reinwald Title/Affiliation... Principal Research Sta...