
Genealogical Datasets Integration

This project, entitled “Strategies for the Integration of Genealogical Datasets”, was conducted between the fall of 2007 and the fall of 2008 by Prof. H. Daniel Wagner of the Weizmann Institute, together with two research collaborators in Poland: Kamila Klauzinska of the Jagiellonian University in Krakow and Jakub Zajdel of the University of Silesia in Katowice.

It successfully developed algorithms that enable the merging of diverse genealogical datasets, using civil and cemetery records from a small Polish town (shtetl) as a case study, and thereby achieved a breakthrough in a complex technical area relevant to genealogists and family historians generally.

A large number of local and regional Jewish genealogical databases have survived in Europe. Such databases, when suitably integrated – or “merged” – lead not only to improved genealogical and biographical depictions of individuals and a more accurate (and probably much larger) account of the number of victims in the Holocaust, but also to the multi-generational reconstruction of lost branches of the shared Jewish family tree in Eastern Europe and elsewhere.

Developing and demonstrating a merging exercise of modest dimensions was the essence of the project. Its general aim was to integrate details about individuals listed in different databases relevant to a single ancestral town, namely Zdunska Wola, thereby increasing our knowledge about these individuals and eventually reconstructing their family trees. The specific purpose was to develop algorithms and software for full data extraction from genealogical and Jewish-oriented databases and for processing the data retrieved, with a view to merging the discrete databases and progressively reconstructing family trees. A gradualist approach was consciously adopted, focusing at the outset on a well-defined issue, namely the merging of metrical death data and cemetery listings in Zdunska Wola.
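As a purely illustrative sketch of what merging metrical death data with cemetery listings can involve, the Python fragment below pairs records whose names are similar and whose death years are close. It is not the project's algorithm: the Record fields, the similarity threshold, the year tolerance and the sample names are all assumptions invented for this example, and the name comparison relies on the standard library's difflib rather than on any phonetic matching scheme.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
import unicodedata


@dataclass(frozen=True)
class Record:
    """A simplified person record (hypothetical fields, for illustration only)."""
    source: str       # e.g. "death_register" or "cemetery"
    given_name: str
    surname: str
    death_year: int


def normalize(name: str) -> str:
    """Lowercase a name and drop combining diacritics (e.g. 'ń' becomes 'n')."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower().strip()


def name_similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 for two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_candidate_match(r1: Record, r2: Record,
                       threshold: float = 0.8,
                       year_tolerance: int = 1) -> bool:
    """Treat two records as a candidate pair if both names and the year agree.

    The threshold and tolerance are arbitrary assumptions, not tuned values.
    """
    if abs(r1.death_year - r2.death_year) > year_tolerance:
        return False
    return (name_similarity(r1.surname, r2.surname) >= threshold
            and name_similarity(r1.given_name, r2.given_name) >= threshold)


def link_records(deaths: list[Record], graves: list[Record]) -> list[tuple[Record, Record]]:
    """Compare every death record with every grave record (quadratic, small data only)."""
    return [(d, g) for d in deaths for g in graves if is_candidate_match(d, g)]


# Invented sample data, not taken from any real register or cemetery.
deaths = [Record("death_register", "Abram", "Wroński", 1890)]
graves = [
    Record("cemetery", "Abraham", "Wronski", 1891),
    Record("cemetery", "Lewi", "Wronski", 1890),
]
pairs = link_records(deaths, graves)
```

In practice, candidate pairs produced this way would still need manual review, since spelling variants, transliteration from Hebrew or Yiddish, and differing calendar conventions can all produce false matches or missed ones.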

This limited endeavour achieved its immediate goals and can now serve as a pilot project to compare results from a small-scale manually merged database with those from the algorithms to be developed. The long-term significance and contribution of the research relate to the challenge of developing as yet nonexistent integration tools for the merging of very large Jewish genealogical databases of different types.

Click here for a preliminary article on this project by Prof. H. Daniel Wagner, from AVOTAYNU, xxiv, 1 (Spring 2008), pp. 8-10

Click here for the Final Report on this project submitted by Prof. Wagner, March 2009

Click here for a PowerPoint presentation on this project made by Prof. Daniel Wagner at the 15th Congress of the World Union of Jewish Studies (August 2009)

Click here for a narrative description of this project by Kamila Klauzinska (from October 2009), based on a paper she presented at the 29th IAJGS Conference of Jewish Genealogy (Philadelphia, August, 2009)

Click here for the PowerPoint presentation accompanying Ms. Klauzinska’s description of this project