Restoring Semantically Incomplete Document Collections Using Lexical Signatures (Conference Paper)


  • Unexpected changes create a problem when managing missing resources in a digital collection. In decentralized and distributed collections such as Walden's Paths, a missing point or an incomplete resource is a serious problem because it can interrupt the continuity of the narration and render the collection semantically incomplete. We foresee two possible scenarios when resources cannot be found. In the first, we have access to a copy of the missing document or to its lexical signatures, which allows us to find the missing resource. The second case is more interesting to us: what happens if we do not have any valid metadata associated with the missing resource? To solve this problem, we used the lexical signatures of valid documents within a collection to find suitable replacements for absent resources. Our results show that traditional similarity metrics do not adequately convey the relationships between the elements in the collections. Our analyses also showed that our procedures were able to restore the semantic integrity of incomplete document collections. © 2013 Springer-Verlag.

published proceedings

  • Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

author list (cited authors)

  • Meneses, L., Barthwal, H., Singh, S., Furuta, R., & Shipman, F.

citation count

  • 0

complete list of authors

  • Meneses, Luis; Barthwal, Himanshu; Singh, Sanjeev; Furuta, Richard; Shipman, Frank

publication date

  • October 2013