Preserving logical and functional dependencies in synthetic tabular data

Chaithra Umesh, Kristian Schultz, Manjunath Mahendra, Saptarshi Bej, Olaf Wolkenhauer*

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

3 Scopus citations

Abstract

Dependencies among attributes are a common aspect of tabular data. However, whether existing tabular data generation algorithms preserve these dependencies while generating synthetic data is yet to be explored. In addition to the existing notion of functional dependencies, we introduce the notion of logical dependencies among the attributes in this article. Moreover, we provide a measure to quantify logical dependencies among attributes in tabular data. Utilizing this measure, we compare several state-of-the-art synthetic data generation algorithms and test their capability to preserve logical and functional dependencies on several publicly available datasets. We demonstrate that currently available synthetic tabular data generation algorithms do not fully preserve functional dependencies when they generate synthetic datasets. In addition, we also showed that some tabular synthetic data generation models can preserve inter-attribute logical dependencies. Our review and comparison of the state-of-the-art reveal research needs and opportunities to develop task-specific synthetic tabular data generation models.

Original languageEnglish
Article number111459
JournalPattern Recognition
Volume163
DOIs
StatePublished - Jul 2025

Keywords

  • Functional dependencies
  • Generative models
  • Logical dependencies
  • Synthetic tabular data

Fingerprint

Dive into the research topics of 'Preserving logical and functional dependencies in synthetic tabular data'. Together they form a unique fingerprint.

Cite this