TY - JOUR
T1 - Value-Based File Retention
T2 - File Attributes as File Value and Information Waste Indicators
AU - Wijnhoven, Fons
AU - Amrit, Chintan
AU - Dietz, Pim
PY - 2014
Y1 - 2014
AB - Several file retention policy methods propose that a file retention policy should be based on file value. Though such a retention policy might increase the value of accessible files, the method to arrive at such a policy is underresearched. This article discusses how one can arrive at a method for developing file retention policies based on the use values of files. The method’s applicability is initially assessed through a case study at Capgemini, Netherlands. In the case study, we hypothesize that one can develop a file retention policy by testing causal relations between file attributes (as used by file retention methods) and the use value of files. Unfortunately, most file attributes used by file retention methods have a weak correlation with file value, resulting in the conclusion that these methods do not well select out high- and low-value files. This would imply the ineffectiveness of the used attributes in our study or errors in our conceptualization of file value. We continue with the last possibility and develop indicators for file utility (with low utility being waste). With this approach we were able to detect waste files, in a sample of files, with an accuracy of 80%. We therefore not only suggest further research in information waste detection as part of a file retention policy, but also to further explore other file attributes that could better predict file value and file utility.
KW - Methodology
KW - Case study
KW - Quantitative
KW - Data mining
UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-84901507457&partnerID=MN8TOARS
U2 - 10.1145/2567656
DO - 10.1145/2567656
M3 - Article
SN - 1936-1955
VL - 4
JO - ACM Journal of Data and Information Quality
JF - ACM Journal of Data and Information Quality
IS - 4
M1 - 15
ER -