Dealing with poor data quality of OSINT data in fraud risk analysis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic

Abstract

Governmental organizations responsible for keeping certain types of fraud under control often use data-driven methods, both for immediate detection of fraud and for fraud risk analysis aimed at targeting inspections more effectively. A blind spot in such methods is that the source data often represents a 'paper reality': fraudsters will attempt to disguise themselves in the data they supply, painting a world in which they do nothing wrong. This blind spot can be counteracted by enriching the data with traces and indicators from more 'real-world' sources such as social media and the internet. One of the crucial data management problems in accomplishing this enrichment is how to capture and handle data quality problems. The presentation will start with a real-world example, which also serves as the starting point for generalizing the problem in terms of information combination and enrichment (ICE). We then present the ICE technology and show how data quality problems can be managed with probabilistic databases. In terms of the 4 V's of big data -- volume, velocity, variety and veracity -- this presentation focuses on the third and fourth V's: variety and veracity.
Original language: Undefined
Title of host publication: SIKS workshop Smart Auditing
Place of Publication: Tilburg
Publisher: Tilburg University
Pages: -
Number of pages: 1
ISBN (Print): not assigned
Publication status: Published - 25 Feb 2015

Publication series

Name
Publisher: Tilburg University

Keywords

  • EWI-25899
  • METIS-312540
  • IR-95713
