Publication

Feature selection based on dataset variance optimization using Hybrid Sine Cosine: Firehawk algorithm (HSCFHA)

dc.contributor.author: Raza Moosavi, Syed Kumayl
dc.contributor.author: Saadat, Ahsan
dc.contributor.author: Abaid, Zainab
dc.contributor.author: Ni, Wei
dc.contributor.author: Li, Kai
dc.contributor.author: Guizani, Mohsen
dc.date.accessioned: 2024-03-22T15:30:09Z
dc.date.available: 2024-03-22T15:30:09Z
dc.date.issued: 2024
dc.description.abstract: Feature selection plays a pivotal role in preprocessing data for machine learning (ML) models. It entails choosing a subset of pertinent features to enhance the model’s accuracy and minimize overfitting. Wrapper methods based on metaheuristics are one approach to feature selection, leveraging the predictive accuracy of a learning algorithm to form a condensed set of features. Traditionally, this method uses K-Nearest Neighbor (KNN) for maximizing accuracy as its cost function. However, this approach often yields less than optimal results in large sample spaces and demands considerable computational resources. To circumvent the shortcomings of this approach, this work proposes a novel metaheuristic algorithm, termed the Hybrid Sine Cosine Firehawk Algorithm. Furthermore, a novel feature selection technique is designed that uses this hybrid algorithm to eliminate insignificant and redundant features by incorporating the minimization of dataset variance in the cost function. Additionally, the hybridization of multiple metaheuristic algorithms produces the best features of each algorithm to improve the exploration ability. The proposed technique is tested on 22 University of California Irvine datasets containing low, medium and high dimensional datasets and compared to the traditional KNN-based approach. The technique is also compared with other state-of-the-art metaheuristic techniques, namely Particle Swarm Optimizer, Grey Wolf Optimizer, Whale Optimization Algorithm, Hybrid Ant Colony Optimizer and Improved Binary Bat Algorithm. The results show significant improvements over previous techniques in terms of minimal loss in essential data while reducing the size of the raw data in considerably less time, as well as a well-balanced confusion matrix. [pt_PT]
dc.description.version: info:eu-repo/semantics/publishedVersion [pt_PT]
dc.identifier.citation: Syed Kumayl Raza Moosavi, Ahsan Saadat, Zainab Abaid, Wei Ni, Kai Li, Mohsen Guizani, Feature selection based on dataset variance optimization using Hybrid Sine Cosine – Firehawk Algorithm (HSCFHA), Future Generation Computer Systems, Volume 155, 2024, Pages 272-286, ISSN 0167-739X, https://doi.org/10.1016/j.future.2024.02.017 [pt_PT]
dc.identifier.doi: https://doi.org/10.1016/j.future.2024.02.017 [pt_PT]
dc.identifier.issn: 0167-739X
dc.identifier.uri: http://hdl.handle.net/10400.22/25218
dc.language.iso: eng [pt_PT]
dc.peerreviewed: yes [pt_PT]
dc.publisher: Elsevier [pt_PT]
dc.relation: UIDP/UIDB/04234/2020 [pt_PT]
dc.relation: PTDC/EEICOM/3362/2021 [pt_PT]
dc.relation.publisherversion: https://www.sciencedirect.com/science/article/pii/S0167739X24000621 [pt_PT]
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/ [pt_PT]
dc.subject: Feature selection [pt_PT]
dc.subject: Metaheuristic algorithms [pt_PT]
dc.subject: Machine learning [pt_PT]
dc.subject: Hybrid Sine Cosine [pt_PT]
dc.subject: Firehawk Algorithm [pt_PT]
dc.subject: Optimization algorithms [pt_PT]
dc.subject: Dataset variance optimization [pt_PT]
dc.subject: Classification [pt_PT]
dc.title: Feature selection based on dataset variance optimization using Hybrid Sine Cosine: Firehawk algorithm (HSCFHA) [pt_PT]
dc.type: journal article
dspace.entity.type: Publication
oaire.citation.endPage: 286 [pt_PT]
oaire.citation.startPage: 272 [pt_PT]
oaire.citation.title: Future Generation Computer Systems: The International Journal of eScience (FGCS) [pt_PT]
oaire.citation.volume: 155 [pt_PT]
person.familyName: Li
person.givenName: Kai
person.identifier.ciencia-id: EE10-B822-16ED
person.identifier.orcid: 0000-0002-0517-2392
rcaap.rights: closedAccess [pt_PT]
rcaap.type: article [pt_PT]
relation.isAuthorOfPublication: 21f3fb85-19c2-4c89-afcd-3acb27cedc5e
relation.isAuthorOfPublication.latestForDiscovery: 21f3fb85-19c2-4c89-afcd-3acb27cedc5e
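
Illustrative note on the abstract: the traditional wrapper baseline it mentions scores a candidate feature subset by the accuracy of a KNN classifier trained on only those features. The following is a minimal sketch of such a cost function, not the paper's HSCFHA method or its variance-based cost, neither of which is specified in this record. The function names, the weight `alpha`, the neighbor count `k`, the subset-size penalty, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows."""
    ranked = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def project(rows, mask):
    """Keep only the columns whose mask entry is 1."""
    return [[v for v, keep in zip(row, mask) if keep] for row in rows]


def wrapper_cost(mask, train_X, train_y, test_X, test_y, k=3, alpha=0.99):
    """Hypothetical wrapper cost: weighted KNN error plus a subset-size penalty.

    Lower is better. `alpha` trades classification error against the
    fraction of features kept (an assumed, commonly used weighting).
    """
    if not any(mask):
        return 1.0  # empty subset cannot classify; assign the worst cost
    tr, te = project(train_X, mask), project(test_X, mask)
    wrong = sum(knn_predict(tr, train_y, q, k) != y for q, y in zip(te, test_y))
    error = wrong / len(test_y)
    kept_ratio = sum(mask) / len(mask)
    return alpha * error + (1 - alpha) * kept_ratio


# Toy data (made up): feature 0 separates the classes, feature 1 is noise.
train_X = [[0.0, 5.0], [0.1, -3.0], [0.2, 9.0], [1.0, 4.0], [1.1, -2.0], [0.9, 8.0]]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [[0.05, 4.2], [0.95, 8.8]]
test_y = [0, 1]

cost_informative = wrapper_cost([1, 0], train_X, train_y, test_X, test_y)
cost_noise = wrapper_cost([0, 1], train_X, train_y, test_X, test_y)
```

A metaheuristic wrapper would evaluate many such binary masks and keep the one with the lowest cost; on this toy data the mask that keeps only the informative feature scores lower than the mask that keeps only the noisy one. Every candidate mask requires a full KNN pass over the data, which is the computational burden the abstract attributes to the traditional approach.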

Files

Original bundle
Name: CISTER-TR-240203.pdf
Size: 1.73 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission