Publication

SmartClean: an incremental data cleaning tool

dc.contributor.author: Oliveira, Paulo
dc.contributor.author: Rodrigues, Fátima
dc.contributor.author: Henriques, Pedro
dc.date.accessioned: 2013-05-15T11:42:13Z
dc.date.available: 2013-05-15T11:42:13Z
dc.date.issued: 2009
dc.description.abstract: This paper presents the SmartClean tool, whose purpose is to detect and correct data quality problems (DQPs). Compared with existing tools, SmartClean has one main advantage: the user does not need to specify the execution sequence of the data cleaning operations. To that end, an execution sequence was developed, and the problems are manipulated (i.e., detected and corrected) following that sequence. The sequence also supports the incremental execution of the operations. The paper presents the underlying architecture of the tool and describes its components in detail. The validity of the tool and, consequently, of the architecture is demonstrated through a case study. Although SmartClean also has cleaning capabilities at all other levels, this paper describes only those related to the attribute value level.
dc.identifier: DOI 10.1109/QSIC.2009.67
dc.identifier.isbn: 978-1-4244-5912-4
dc.identifier.issn: 1550-6002
dc.identifier.uri: http://hdl.handle.net/10400.22/1583
dc.language.iso: eng
dc.peerreviewed: yes
dc.publisher: IEEE
dc.relation.ispartofseries: Quality Software
dc.relation.publisherversion: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5381543
dc.subject: Data cleaning
dc.subject: Data quality problems
dc.subject: Detection
dc.subject: Correction
dc.subject: Architecture
dc.subject: Tool
dc.title: SmartClean: an incremental data cleaning tool
dc.type: conference object
dspace.entity.type: Publication
oaire.citation.conferencePlace: Jeju, South Korea
oaire.citation.endPage: 457
oaire.citation.startPage: 452
oaire.citation.title: 9th International Conference on Quality Software
rcaap.rights: closedAccess
rcaap.type: conferenceObject

Files

Original bundle
Name: COM_PauloOliveira_2009_GECAD.pdf
Size: 115.47 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission
Description: