Algorithmic Perspective of Strongly Possible Keys and Functional Dependencies
Abstract
It is common to encounter missing values in database tables. For an incomplete table, a possible world can be obtained by replacing any missed value with a value from the attribute (infinite) domain. A possible key (possible functional dependency) is satisfied in an incomplete table ”T” if there exists a possible world of ”T” that satisfies the key (the functional dependency) constraint. If all possible worlds of ”T” satisfy the key (functional dependency), then we say that ”T” satisfies a certain key (functional dependency). The concept of strongly possible worlds was introduced recently that considers only the active domain (the set of values that are already appearing in each attribute in the table), in a way that a strongly possible world
is obtained by replacing any missing value with a value from the corresponding attributes active domain. So, a strongly possible key spKey (functional dependency spFD) is satisfied by a table ”T” if there exists a strongly possible world that satisfies the key (functional dependency). In this paper, we investigate the approximation measures of spKeys and spFDs when the strongly possible constraint is not satisfied by a given table. We introduce the g5 measures which is the ratio of the minimum number of tuples that need to be added so that the constraint is satisfied. The measure g3 represent the ratio of the minimum number of tuples that need to be removed so that the table satisfies the constraint. We introduce a new measure
g5, which is the ratio of the minimum number of tuples to be added to the table so the result satisfies the constraint. Where adding new tuples with new values will extend the active domain. We prove that g3 is an upper bound of g5 for a constraint in a table. Furthermore, g3 and g5 are independent of each other, where there exist tables of some large number of tuples that satisfy g3 − g5 = p/q for any rational number 0 ≤ p/q < 1. We study the complexity of determining these approximate measures.
Full Text:
PDFReferences
M. Al-Atar and A. Sali. Approximate keys
and functional dependencies in incomplete
databases with limited domains. In Foun-
dations of Information and Knowledge Sys-
tems 12th International Symposium, FoIKS
Helsinki, Finland, June 20–23, 2022
Proceedings, volume 13388 of LNCS, pages
–167. Springer Nature Switzerland AG,
M. Al-Atar and A. Sali. Strongly possible
functional dependencies for sql. Acta Cyber-
netica, 2022.
M. Alattar and A. Sali. Keys in rela-
tional databases with nulls and bounded
domains. In European Conference on Ad-
vances in Databases and Information Sys-
tems, pages 33–50. Springer, 2019.
M. Alattar and A. Sali. Functional depen-
dencies in incomplete databases with lim-
ited domains. In International Symposium
on Foundations of Information and Knowl-
edge Systems, pages 1–21. Springer, 2020.
M. Alattar and A. Sali. Strongly possible
keys for sql. Journal on Data Semantics,
(2):85–99, 2020.
L. Bertossi. Database repairs and consis-
tent query answering: Origins and further
developments. In Proceedings of the 38th
ACM SIGMOD-SIGACT-SIGAI Symposium
on Principles of Database Systems, pages 48–
, 2019.
J. Biskup and L. Wiese.
A sound
and complete model-generation procedure
for consistent and confidentiality-preserving
databases. Theoretical Computer Science,
(31):4044–4072, 2011.
A. Farhangfar, L. A. Kurgan, and
W. Pedrycz.
A novel framework for
imputation of missing values in databases.
IEEE Transactions on Systems, Man, and
Cybernetics-Part A: Systems and Humans,
(5):692–709, 2007.
L. A. Goodman and W. H. Kruskal. Mea-
sures of association for cross classifications.
Measures of association for cross classifica-
tions, pages 2–34, 1979.
C. Giannella and E. Robertson. On approx-
imation measures for functional dependen-
cies. Information Systems, 29(6):483–507,
H. Köhler, U. Leck, S. Link, and X. Zhou.
Possible and certain keys for sql. The VLDB
Journal, 25(4):571–596, 2016.
J. Kivinen and H. Mannila. Approximate
inference of functional dependencies from
relations. Theoretical Computer Science,
(1):129–149, 1995.
W. Lipski Jr. On databases with incomplete
information. Journal of the ACM (JACM),
(1):41–70, 1981.
L. Lovász and M. Plummer. Matching theory,
volume 367. American Mathematical Soc.,
J. Wijsen. Foundations of query answering
on inconsistent databases. ACM SIGMOD
Record, 48(3):6–16, 2019.
DOI: https://doi.org/10.31449/inf.v48i4.4655
This work is licensed under a Creative Commons Attribution 3.0 License.