Abstract
We consider documents as words and trees over an alphabet Σ and study how to compare them with regular schemas over an alphabet Σ′. Given an input document I, we decide whether it can be transformed into a document J that is ε-close to some target schema T, and we show that this approximate decision problem can be solved efficiently. In the simple case where the transformation is the identity, we describe an approximate algorithm that decides whether I is close to a target regular schema (DTD). This property is testable, i.e., it can be decided in time independent of the size of the input document, just by sampling I. In the general case, the Structural Consistency problem asks whether there is a transducer \(\mathcal{T}\) with at most m states such that I is ε-close to some I′ whose image \(\mathcal{T}(I')\) is both close to T and of size comparable to the size of I. We show that Structural Consistency is also testable, i.e., it can be decided by sampling I.
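To illustrate what "testable by sampling" can mean for words, here is a minimal sketch of a generic sampling tester for membership in a regular language. It is not the algorithm of this paper (the abstract does not specify it): it assumes a schema given as a partial DFA and simply checks that randomly chosen blocks of the word can occur inside some accepted word. The names `useful_states`, `is_feasible_factor`, `sample_test`, `block_len`, and `num_samples` are hypothetical. In a real tester the last two would be chosen from ε and the automaton rather than from the length of the word; no particular choice is claimed here.

```python
import random

def useful_states(dfa, start, accepting):
    """States reachable from `start` that can also reach an accepting state.
    `dfa` is a partial transition map: (state, symbol) -> state."""
    forward, stack = {start}, [start]
    while stack:
        q = stack.pop()
        for (p, _), r in dfa.items():
            if p == q and r not in forward:
                forward.add(r)
                stack.append(r)
    backward, changed = set(accepting), True
    while changed:
        changed = False
        for (p, _), r in dfa.items():
            if r in backward and p not in backward:
                backward.add(p)
                changed = True
    return forward & backward

def is_feasible_factor(block, dfa, useful):
    """True if `block` labels a path that stays inside the useful states,
    i.e. the block can occur as a factor of some accepted word."""
    current = set(useful)
    for symbol in block:
        current = {dfa[(q, symbol)] for q in current if (q, symbol) in dfa} & useful
        if not current:
            return False
    return True

def accepts(word, dfa, start, accepting):
    """Exact membership test, used only when the word is shorter than a block."""
    q = start
    for symbol in word:
        if (q, symbol) not in dfa:
            return False
        q = dfa[(q, symbol)]
    return q in accepting

def sample_test(word, dfa, start, accepting, block_len, num_samples, rng=None):
    """Accept if every sampled block is a feasible factor (one-sided error:
    a word in the language is never rejected)."""
    rng = rng or random.Random()
    if len(word) <= block_len:
        return accepts(word, dfa, start, accepting)
    useful = useful_states(dfa, start, accepting)
    for _ in range(num_samples):
        i = rng.randrange(len(word) - block_len + 1)
        if not is_feasible_factor(word[i:i + block_len], dfa, useful):
            return False
    return True

# Toy schema: the language (ab)* over the alphabet {a, b}.
AB_STAR = {(0, "a"): 1, (1, "b"): 0}
```

The number of symbols read is at most `block_len * num_samples`, independent of the length of the word. A tester in the sense of the abstract would additionally guarantee that words ε-far from the schema are rejected with high probability. That guarantee depends on the distance used and on how the parameters are set, neither of which this sketch addresses.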
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
de Rougemont, M., Vieilleribière, A. (2010). Approximate Structural Consistency. In: van Leeuwen, J., Muscholl, A., Peleg, D., Pokorný, J., Rumpe, B. (eds) SOFSEM 2010: Theory and Practice of Computer Science. SOFSEM 2010. Lecture Notes in Computer Science, vol 5901. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11266-9_57
DOI: https://doi.org/10.1007/978-3-642-11266-9_57
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11265-2
Online ISBN: 978-3-642-11266-9
eBook Packages: Computer Science, Computer Science (R0)