Restoring Semantically Incomplete Document Collections Using Lexical Signatures
Unexpected changes complicate the management of missing resources in a digital collection. In decentralized and distributed collections such as Walden's Paths, a missing point or an incomplete resource is a serious problem because it can interrupt the continuity of the narration and render the collection semantically incomplete. We foresee two possible scenarios when a resource cannot be found. In the first, we have access to a copy of the missing document or to its lexical signatures, which allows us to find the missing resource. The second case is more interesting to us: what happens if we have no valid metadata associated with the missing resource? To solve this problem, we used the lexical signatures of the valid documents within a collection to find suitable replacements for absent resources. Our results show that traditional similarity metrics do not adequately convey the relationships between the elements in the collections. Our analyses also showed that our procedures were able to restore the semantic integrity of incomplete document collections. © 2013 Springer-Verlag.
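The abstract does not say how the lexical signatures are computed or combined. As a rough illustration only, the sketch below assumes a signature is a document's top-k terms ranked by TF-IDF, and that the signatures of the remaining valid documents are pooled into one search query for a replacement. The function names, the smoothing in the IDF term, and the pooling rule are all hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def lexical_signature(doc, collection, k=5):
    """Return the k terms of `doc` with the highest TF-IDF score.

    Assumption: IDF is computed over `collection` with add-one smoothing.
    Ties are broken alphabetically so the output is deterministic.
    """
    n = len(collection)
    doc_freq = Counter()
    for other in collection:
        doc_freq.update(set(tokenize(other)))
    term_freq = Counter(tokenize(doc))
    scores = {
        term: count * math.log((1 + n) / (1 + doc_freq[term]))
        for term, count in term_freq.items()
    }
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    return ranked[:k]


def replacement_query(valid_docs, k=5):
    """Pool the signatures of the valid documents into one query string.

    Assumption: terms that appear in more signatures are ranked first.
    """
    pooled = Counter()
    for doc in valid_docs:
        pooled.update(lexical_signature(doc, valid_docs, k))
    ranked = sorted(pooled.items(), key=lambda pair: (-pair[1], pair[0]))
    return " ".join(term for term, _ in ranked[:k])


docs = [
    "walden pond thoreau cabin",
    "walden pond thoreau woods",
    "walden pond ice winter",
]
print(lexical_signature(docs[0], docs, k=2))
print(replacement_query(docs, k=2))
```

The resulting query string could be submitted to a search engine to look for candidate replacements. How candidates are ranked and accepted is left open here, since the abstract gives no details on that step.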
author list (cited authors)
Meneses, L., Barthwal, H., Singh, S., Furuta, R., & Shipman, F.