Smoothing and compression with stochastic k-testable tree languages

Authors: Rico-Juan, Juan Ramón; Calera Rubio, Jorge; Carrasco, Rafael C.

Citation: RICO JUAN, Juan Ramón; CALERA RUBIO, Jorge; CARRASCO JIMÉNEZ, Rafael Carlos. "Smoothing and compression with stochastic k-testable tree languages". Pattern Recognition. Vol. 38, No. 9 (Sept. 2005). ISSN 0031-3203, pp. 1420-1430

URI: http://hdl.handle.net/10045/14022
DOI: 10.1016/j.patcog.2004.03.024
ISSN: 0031-3203 (Print)

Abstract: In this paper, we describe some techniques to learn probabilistic k-testable tree models, a generalization of the well-known k-gram models, that can be used to compress or classify structured data. These models are easy to infer from samples and allow for incremental updates. Moreover, as shown here, backing-off schemes can be defined to solve data sparseness, a problem that often arises when using trees to represent the data. These features make them suitable to compress structured data files at a better rate than string-based methods.

Keywords: Tree grammars, Stochastic models, Smoothing, Backing-off, Data compression

Publisher: Elsevier
Type: info:eu-repo/semantics/article