Carrasco, Rafael C.; Rico-Juan, Juan Ramón
A similarity between probabilistic tree languages: application to XML document families

Citation: CARRASCO JIMÉNEZ, Rafael Carlos; RICO JUAN, Juan Ramón. "A similarity between probabilistic tree languages: application to XML document families". Pattern Recognition. Vol. 36, No. 9 (Sept. 2003). ISSN 0031-3203, pp. 2197-2199
URI: http://hdl.handle.net/10045/14023
DOI: 10.1016/S0031-3203(02)00320-5
ISSN: 0031-3203 (Print)

Abstract: We describe a general approach to compute a similarity measure between distributions generated by probabilistic tree automata that may be used in a number of applications in the pattern recognition field. In particular, we show how this similarity can be computed for families of structured (XML) documents. In such case, the use of regular expressions to specify the right part of the expansion rules adds some complexity to the task.

Keywords: Distance between tree languages, Similarity of structured documents
Publisher: Elsevier
Type: info:eu-repo/semantics/article