Title

Missing data estimation for 1–6 h gaps in energy use and weather data using different statistical methods.

Authors

Claridge, David E.; Hui Chen

Abstract

Analysing hourly energy use to determine retrofit savings or diagnose system problems frequently requires rehabilitation of short periods of missing data. This paper evaluates four methods for rehabilitating short periods of missing data. Single variable regression, polynomial models, Lagrange interpolation, and linear interpolation models are developed, demonstrated, and used to fill 1–6 h gaps in weather data, heating data and cooling data for commercial buildings. The methodology for comparing the performance of the four different methods for filling data gaps uses 11 1-year data sets to develop different models and fill over 500 000 ‘pseudo-gaps’ 1–6 h in length for each model. These pseudo-gaps are created within each data set by assuming data is missing, then these gaps are filled and the ‘filled’ values compared with the measured values. Comparisons are made using four statistical parameters: mean bias error (MBE), root mean square error, sum of the absolute errors, and coefficient of variation of the sum of the absolute errors. Comparison based on frequency within specified error limits is also used. A linear interpolation model or a polynomial model with hour-of-day as the independent variable both fill 1–6 missing hours of cooling data, heating data or weather data, with accuracy clearly superior to the single variable linear regression model and to the Lagrange model. The linear interpolation model is the simplest and most convenient method, and generally showed superior performance to the polynomial model when evaluated using root mean square error, sum of the absolute errors, or frequency of filling within set error limits as criteria. The eighth-order polynomial model using time as the independent variable is a relatively simple, yet powerful approach that provided somewhat superior performance for filling heating data and cooling data if MBE is the criterion as is often the case when evaluating retrofit savings. Likewise, a tenth-order polynomial model provided the best performance when filling dew-point temperature data when MBE is the criterion. It is possible that the results would differ somewhat for other data sets, but the strength of the linear and polynomial models relative to the other models evaluated seems quite robust. Copyright © 2006 John Wiley & Sons, Ltd.
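
The pseudo-gap procedure described in the abstract can be illustrated with a short Python sketch. It is not the authors' implementation. The function names (`fill_linear`, `fill_polynomial`, `evaluate`), the 12 h fitting window on each side of the gap, and the sign convention for the errors (filled minus measured) are all assumptions made for illustration, since the abstract does not specify them. Only two of the four methods are shown, and the coefficient of variation of the sum of the absolute errors is omitted because its exact definition is not given here.

```python
import numpy as np
from numpy.polynomial import Polynomial


def fill_linear(series, start, length):
    """Fill series[start:start+length] on a straight line between its two neighbours."""
    left = series[start - 1]
    right = series[start + length]
    # Fraction of the way from the left neighbour to the right neighbour.
    frac = np.arange(1, length + 1) / (length + 1)
    return left + frac * (right - left)


def fill_polynomial(series, start, length, order=8, margin=12):
    """Fill a gap from a polynomial in time fitted to the hours around it.

    Assumption: the fit uses `margin` known hours on each side of the gap;
    the abstract does not state the fitting window.
    """
    lo, hi = start - margin, start + length + margin
    hours = np.arange(lo, hi)
    known = (hours < start) | (hours >= start + length)
    # Polynomial.fit rescales the time axis internally, which keeps a
    # high-order least-squares fit numerically stable.
    poly = Polynomial.fit(hours[known], series[lo:hi][known], order)
    return poly(np.arange(start, start + length))


def evaluate(series, filler, length, margin=12):
    """Create a pseudo-gap at every admissible hour, fill it, and score the result."""
    series = np.asarray(series, dtype=float)
    errors = []
    for start in range(margin, len(series) - length - margin + 1):
        filled = filler(series, start, length)
        measured = series[start:start + length]
        errors.append(filled - measured)  # assumed sign: filled minus measured
    errors = np.concatenate(errors)
    return {
        "MBE": errors.mean(),
        "RMSE": np.sqrt(np.mean(errors ** 2)),
        "SAE": np.abs(errors).sum(),
    }


if __name__ == "__main__":
    # Synthetic hourly signal with a daily cycle, standing in for measured data.
    t = np.arange(24 * 14)
    demo = 20.0 + 5.0 * np.sin(2 * np.pi * t / 24)
    for name, filler in [("linear", fill_linear), ("polynomial", fill_polynomial)]:
        print(name, evaluate(demo, filler, length=3))
```

Because every pseudo-gap is cut from measured data, the error of each filled value is known exactly, which is what allows the methods to be ranked by MBE, root mean square error, or sum of the absolute errors.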

Subjects

ESTIMATION theory; DATA analysis; STATISTICS; HEATING; COOLING; TEMPERATURE; DEW point; REGRESSION analysis; INTERPOLATION; COMMERCIAL buildings

Publication

International Journal of Energy Research, 2006, Vol 30, Issue 13, p1075

ISSN

0363-907X

Publication type

Academic Journal

DOI

10.1002/er.1207
