Clustering of renewable energy assets to enhance performance evaluation

Abreu, Sara Isabel Gonçalves

http://hdl.handle.net/10400.22/25980

Use this identifier to cite this record.

Name:	Description:	Size:	Format:
Tese_5246.pdf		10.73 MB	Adobe PDF

Authors

Abreu, Sara Isabel Gonçalves

Advisor(s)

Rodrigues, Maria de Fátima Coutinho

Abstract(s)

This study clusters solar inverters and wind turbines to help Enlitia’s clients identify assets similar to their own based on historical power production, meteorological data, and power curve characteristics. This knowledge enables clients to optimize resource allocation and operational strategies, thereby avoiding unnecessary costs. The project falls under the category of Data Mining and follows the CRISP-DM methodology. A crucial step in this approach is data cleaning, which involves treating null and duplicated values and removing unnecessary features. During data cleaning, outliers are identified and removed using various methods. For wind turbines, outliers are treated based on the power curve, which is defined by the power produced and the wind speed. For solar inverters, outliers are treated using the I-V curve, which relates DC power to DC voltage and DC current. Following data cleaning, the clustering phase begins. This project employs algorithms from three clustering categories: classical, ensemble, and time series clustering. Principal Component Analysis (PCA) is applied to the datasets to reduce computational costs while preserving at least 90% of the original variation in the data. If feature reduction would retain less than this minimum variation, the feature values are only normalized. The resulting datasets are used in classical and ensemble clustering. In classical clustering, five hierarchical, two partitional, one soft, one model-based, and two density-based algorithms are applied. Five evaluation indices, including the silhouette score and the Davies-Bouldin index, assess the resulting segmentations. The top three classical algorithms proceed to ensemble clustering, where combinations of two and three algorithms are formed using majority voting with weighted label assignment based on the best segmentations.
Finally, two time series clustering algorithms are applied, with the datasets reduced to two components using PCA. The final step involves evaluating all obtained segmentations. The scores of each algorithm indicate that time significantly explains the variation in the data. For both solar and wind datasets, time series clustering produces the best segmentations.
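
The ensemble step described above (majority voting with weighted label assignment) can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `weighted_majority_vote`, the example labelings, and the weights are all hypothetical, and the sketch assumes the cluster labels have already been aligned across algorithms so that the same label denotes the same cluster in every segmentation.

```python
from collections import defaultdict


def weighted_majority_vote(labelings, weights):
    """Combine several cluster labelings of the same samples.

    labelings: list of equal-length label lists, one per base algorithm.
               Assumption: labels are already aligned across algorithms,
               i.e. label 0 means the same cluster in every labeling.
    weights:   one non-negative weight per labeling (hypothetical; it could
               be derived from an evaluation index such as the silhouette
               score of each segmentation).
    """
    if len(labelings) != len(weights):
        raise ValueError("need exactly one weight per labeling")
    n_samples = len(labelings[0])
    if any(len(labels) != n_samples for labels in labelings):
        raise ValueError("all labelings must cover the same samples")

    consensus = []
    for i in range(n_samples):
        votes = defaultdict(float)
        for labels, weight in zip(labelings, weights):
            votes[labels[i]] += weight
        # The label with the highest total weight wins;
        # ties are broken in favour of the smallest label.
        consensus.append(min(votes, key=lambda lab: (-votes[lab], lab)))
    return consensus


# Hypothetical outputs of three base algorithms on five assets.
labelings = [
    [0, 0, 1, 1, 2],
    [0, 1, 1, 1, 2],
    [0, 0, 1, 2, 2],
]
weights = [0.5, 0.3, 0.2]  # hypothetical quality-based weights
consensus = weighted_majority_vote(labelings, weights)
print(consensus)
```

In practice, cluster labels produced by different algorithms are arbitrary, so a label-matching step would be needed before voting; that step is omitted here.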

Keywords

Renewable Energy; Solar Inverters; Wind Turbines; Data Mining; CRISP-DM; Clustering; Time Series; PCA

URI

http://hdl.handle.net/10400.22/25980

Collections

ISEP - DM - Engenharia e Gestão Industrial
