Problem 1.1#

Integrated Energy Grids

Problem 1.1. Analyzing solar and wind generation time series.

Note

If you have not yet set up Python on your computer, you can execute this tutorial in your browser via Google Colab. Click on the rocket in the top right corner and launch “Colab”. If that doesn’t work download the .ipynb file and import it in Google Colab.

Then install pandas and numpy by executing the following command in a Jupyter cell at the top of the notebook.

!pip install pandas numpy scipy
import pandas as pd
from scipy import fftpack
import numpy as np
import matplotlib.pyplot as plt

Data import#

In this example, wind data from https://zenodo.org/record/3253876#.XSiVOEdS8l0 and solar PV data from https://zenodo.org/record/2613651#.X0kbhDVS-uV is used. The data is downloaded in csv format and saved in the ‘data’ folder. The Pandas package is used as a convenient way of managing the datasets.

For convenience, the column including date information is converted into Datetime and set as index

data_pv = pd.read_csv('data/pv_optimal.csv',sep=';')
data_pv.index = pd.DatetimeIndex(data_pv['utc_time'])
data_wind = pd.read_csv('data/onshore_wind_1979-2017.csv',sep=';')
data_wind.index = pd.DatetimeIndex(data_wind['utc_time'])

The data format can now be analyzed using the .head() function to show the first lines of the data set

data_pv.head()
utc_time AUT BEL BGR BIH CHE CYP CZE DEU DNK ... MLT NLD NOR POL PRT ROU SRB SVK SVN SWE
utc_time
1979-01-01 00:00:00+00:00 1979-01-01T00:00:00Z 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1979-01-01 01:00:00+00:00 1979-01-01T01:00:00Z 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1979-01-01 02:00:00+00:00 1979-01-01T02:00:00Z 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1979-01-01 03:00:00+00:00 1979-01-01T03:00:00Z 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
1979-01-01 04:00:00+00:00 1979-01-01T04:00:00Z 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

5 rows × 33 columns

We will use Denmark as an example for this solution

country = 'DNK'

1) Start by plotting the capacity factors for wind and solar throughout the first two weeks in January and the first two weeks in July. Do this for the most recent year for which you have available data.

data_pv.loc['2017-01-01':'2017-01-14'][country].plot(kind='line', ylabel='Capacity Factor solar PV', color='orange')
<Axes: xlabel='utc_time', ylabel='Capacity Factor solar PV'>
../_images/e3428482a0f2e26c020439ab93d9ba9b9449642f52a67b79c55b61df08c3db33.png
data_pv.loc['2017-06-01':'2017-06-14'][country].plot(kind='line', ylabel='Capacity Factor solar PV', color='orange')
<Axes: xlabel='utc_time', ylabel='Capacity Factor solar PV'>
../_images/75fc3362436232adfeb6aa75d7c6897c021da33b11eb9c0b5875f9ef33ad7e67.png
data_wind.loc['2017-01-01':'2017-01-14'][country].plot(kind='line', ylabel='Capacity Factor Onshore Wind')
<Axes: xlabel='utc_time', ylabel='Capacity Factor Onshore Wind'>
../_images/643423388360f640f2e2047536fec5ee715c0b0d6c6e0554b0911901a2f4ffb1.png
data_wind.loc['2017-06-01':'2017-06-14'][country].plot(kind='line', ylabel='Capacity Factor Onshore Wind')
<Axes: xlabel='utc_time', ylabel='Capacity Factor Onshore Wind'>
../_images/4a76f8574d365cde259bab57a3fd22d3740fbeca65c83e0ebb814ff83422ace1.png

2)Calculate the average capacity factor for every day of the year and plot them. Do the same for capacity factors averaged by week and month. Based on steps 1 and 2, what are the dominant frequencies for every technology?

Daily, weekly and monthly means are calculated using the pandas .groupby function combined with the pandas.Grouper automaticly creationg groups with the desired size.

# Daily
daily_mean_pv = data_pv.loc['2017-01-01':'2017-12-31'][country].resample('D').mean()
daily_mean_wind = data_wind.loc['2017-01-01':'2017-12-31'][country].resample('D').mean()
# Weekly
weekly_mean_pv = data_pv.loc['2017-01-01':'2017-12-31'][country].resample('W').mean()
weekly_mean_wind = data_wind.loc['2017-01-01':'2017-12-31'][country].resample('W').mean()
# Monthly
monthly_mean_pv = data_pv.loc['2017-01-01':'2017-12-31'][country].resample('ME').mean()
monthly_mean_wind = data_wind.loc['2017-01-01':'2017-12-31'][country].resample('ME').mean()
daily_mean_pv.plot(kind='line', ylabel='Capacity Factor solar PV', color='orange')
<Axes: xlabel='utc_time', ylabel='Capacity Factor solar PV'>
../_images/961eb78fae7d46f8fe0c515bd4b44316bde39a430ceb3ed5d3eefd01b055feec.png
weekly_mean_pv.plot(kind='line', ylabel='Capacity Factor solar PV', color='orange')
<Axes: xlabel='utc_time', ylabel='Capacity Factor solar PV'>
../_images/f882bd080739e059a03fcb37f0ce4f08df32319d39cea129c9307481cea335c5.png
monthly_mean_pv.plot(kind='line', ylabel='Capacity Factor solar PV', color='orange')
<Axes: xlabel='utc_time', ylabel='Capacity Factor solar PV'>
../_images/776460f00c07ddd9175e4039a103c9260da3a13af3eb4741f6dab0836b2e603a.png
daily_mean_wind.plot(kind='line', ylabel='Capacity Factor onshore wind')
<Axes: xlabel='utc_time', ylabel='Capacity Factor onshore wind'>
../_images/6faea4148cc5abc8955657df30fe9877311843aa18a0932643f8c56d0a07e2b2.png
weekly_mean_wind.plot(kind='line', ylabel='Capacity Factor onshore wind')
<Axes: xlabel='utc_time', ylabel='Capacity Factor onshore wind'>
../_images/8154d638c8c810db848155521b69f6d9c9f4a4d411e25e62a5ad953bb305bbb0.png
monthly_mean_wind.plot(kind='line', ylabel='Capacity Factor onshore wind')
<Axes: xlabel='utc_time', ylabel='Capacity Factor onshore wind'>
../_images/831671af3b0feea7ddd6aabf18f022a8d11f6956b2bbd2da79934486ab18f674.png

Based on the figures shown above the following trends are seen:

  • Solar PV has a clear daily pattern seen on the capacity factor plot for the first two weeks of the year. High capacity factors are seen in the middle of the day and zero at night.

  • Solar PV furthermore, has a clearly seasonal pattern, with high capacity factors in summer, and lower in winter.

  • Analyzing the daily average capacity factor of wind, fluctuations showing approximately a weekly period are seen.

  • A slightly seasonal pattern is also seen for wind, with more wind in the winter and less in the summer.

3)One useful way of investigating the previous question is by calculating the Fast Fourier Transform (FTT) power spectra of the time series. Do so and plot the power spectra for wind and solar capacity factor time series. Are these results in agreement with Sections 1 and 2?

The function fft.fft() is used to calculate the Fast Fourier Transform \(a_{\omega}=\int^T_0 X(t) e^{i \omega t}dt\) where \(X(t)\) is the time series (wind or solar) that we are analyzing.

The power spectra \(\| a_{\omega}\|^2\) is plotted to identify the dominant frequencies.

t_sampling=1 # sampling rate, 1 data per hour
x = np.arange(1,8761, t_sampling) 
y = data_pv.loc['2017-01-01':'2017-12-31'][country]
n = len(x)
y_fft = np.fft.fft(y)/n #n for normalization    
frq = np.arange(0,1/t_sampling,1/(t_sampling*n))        
period = np.array([1/f for f in frq]) 
/tmp/ipykernel_2105/3780475453.py:7: RuntimeWarning: divide by zero encountered in scalar divide
  period = np.array([1/f for f in frq])

We plot the power spectra as a function of the period (1/frequency).

plt.semilogx(period[1:n//2],
             abs(y_fft[1:n//2])/np.max(abs(y_fft[1:n//2])), 
             color='orange',
             linewidth=2)  
plt.xlabel('cycling period (hours)')

#We add lines indicating day, week, month 
plt.axvline(x=24, color='lightgrey', linestyle='--')
plt.axvline(x=24*7, color='lightgrey', linestyle='--')
plt.axvline(x=24*30, color='lightgrey', linestyle='--')
plt.axvline(x=8760, color='lightgrey', linestyle='--') 
plt.text(26, 0.95, 'day', horizontalalignment='left', color='dimgrey', fontsize=14)
plt.text(24*7+20, 0.95, 'week', horizontalalignment='left', color='dimgrey', fontsize=14)
plt.text(24*30+20, 0.95, 'month', horizontalalignment='left', color='dimgrey', fontsize=14)
Text(740, 0.95, 'month')
../_images/405ae57eca23a97f3d7095a02824d13832a1e2a19e6fd1038819ccce7792aaef.png
t_sampling=1 # sampling rate, 1 data per hour
x = np.arange(1,8761, t_sampling) 
y = data_wind.loc['2017-01-01':'2017-12-31'][country]
n = len(x)
y_fft = np.fft.fft(y)/n #n for normalization    
frq = np.arange(0,1/t_sampling,1/(t_sampling*n))        
period = np.array([1/f for f in frq]) 
/tmp/ipykernel_2105/2483825668.py:7: RuntimeWarning: divide by zero encountered in scalar divide
  period = np.array([1/f for f in frq])
plt.semilogx(period[1:n//2],
             abs(y_fft[1:n//2])/np.max(abs(y_fft[1:n//2])), 
             linewidth=2)  
plt.xlabel('cycling period (hours)')

#We add lines indicating day, week, month 
plt.axvline(x=24, color='lightgrey', linestyle='--')
plt.axvline(x=24*7, color='lightgrey', linestyle='--')
plt.axvline(x=24*30, color='lightgrey', linestyle='--')
plt.axvline(x=8760, color='lightgrey', linestyle='--') 
plt.text(26, 0.95, 'day', horizontalalignment='left', color='dimgrey', fontsize=14)
plt.text(24*7+20, 0.95, 'week', horizontalalignment='left', color='dimgrey', fontsize=14)
plt.text(24*30+20, 0.95, 'month', horizontalalignment='left', color='dimgrey', fontsize=14)
Text(740, 0.95, 'month')
../_images/fa12bb76c260926ef21ea5fe35a340cf9232876e52414d82979f82ab382e7d60.png

Solar PV main fluctuations show a daily cycle, while onshore wind shows significant fluctuations in the synoptic timescale (with a frequency of around one week)

4)Plot the duration curve (sorted capacity factors values) for every technology. What percentage of the potential wind and solar energy will be lost if the potential generation is curtailed for the 100 hours with the highest capacity factors? What about if curtailment affects the 1000 hours with the highest capacity factors?

Solar PV#

duration_pv = data_pv.loc['2017-01-01':'2017-12-31'][country].sort_values(ascending=False,ignore_index=True)
duration_pv.plot(kind='line', ylabel='Capacity Factor solar PV', color='orange')
<Axes: ylabel='Capacity Factor solar PV'>
../_images/fcfe68f75e4d15a16818191707e0455266b0081588af2923d1f297f9845b0d04.png
def curtailment_loss(duration_data, hours):
    full_load_hours_per_year = sum(duration_data)
    curtailment_loss = sum(duration_data[:hours])/full_load_hours_per_year
    print('{:.1f} % annual production is lost if the {} hours with highest capacity factor are curtailed'.format(curtailment_loss*100,hours))
print('PV curtailment loss')
curtailment_loss(duration_pv,10)
curtailment_loss(duration_pv,100)
PV curtailment loss
0.8 % annual production is lost if the 10 hours with highest capacity factor are curtailed
7.6 % annual production is lost if the 100 hours with highest capacity factor are curtailed

Onshore wind#

duration_wind = data_wind.loc['2017-01-01':'2017-12-31'][country].sort_values(ascending=False,ignore_index=True)
duration_wind.plot(kind='line', ylabel='Capacity Factor onshore wind')
<Axes: ylabel='Capacity Factor onshore wind'>
../_images/7a2aea1ffc4e060f8fdbc35cc0f778598a7ca5b653f07a4b4186b8fd44f307f1.png
print('Wind curtailment loss')
curtailment_loss(duration_wind,100)
curtailment_loss(duration_wind,1000)
Wind curtailment loss
3.5 % annual production is lost if the 100 hours with highest capacity factor are curtailed
29.4 % annual production is lost if the 1000 hours with highest capacity factor are curtailed

The duration curves for solar and wind are fundamentally different. Solar PV includes more than 4000 hours of zero capacity factors (the nights!) and the maximum capacity factor is 0.7 (It is very difficult for the entire Denmark to have a clear sky simultaneously). Wind duration curve includes a very low number of hours with zero capacity factor and a maximum of 0.9

5) Calculate the ramps for every technology for every hour of the year. We define ramp as the difference between the capacity factor in an hour and the capacity factor in the previous hour. Plot the distribution of ramps for wind and solar. Which technology has the higher variation?

ramps_pv = data_pv.loc['2017-01-01':'2017-12-31'][country].diff()
ramps_pv.plot(kind='hist', xlabel='ramps solar PV', color='orange')
<Axes: xlabel='ramps solar PV', ylabel='Frequency'>
../_images/07dccfde9db679f211b5421dc6639220a4cc1372069f084ee7207c542da264da.png
ramps_wind = data_wind.loc['2017-01-01':'2017-12-31'][country].diff()
ramps_wind.plot(kind='hist', xlabel='ramps wind')
<Axes: xlabel='ramps wind', ylabel='Frequency'>
../_images/7081b3eb935f07e04b0d35110012d0058ee246ebebe2c66016d52e4056e6cd7f.png

Let’s look now at the interannual variability. For every technology, calculate the annual average capacity factor for the most recent year for which you have data. Then, calculate the annual capacity factor for every year for which you have data. Estimate the average value for all the years and the year-to-year variance.

Solar PV#

annual_mean_pv = data_pv[country].resample('YE').mean()
annual_mean_pv.plot(kind='line', ylabel='Capacity Factor solar PV', color='orange')
<Axes: xlabel='utc_time', ylabel='Capacity Factor solar PV'>
../_images/6a65a403053fb9a5e2b46f25ff134b1956fef7542ddcdca97b75da611307b9d1.png
print('Average annual mean {:.2f}, \nNormalized Standar deviation {:.2f} %'.format(annual_mean_pv.mean(),annual_mean_pv.std()/annual_mean_pv.mean()))
Average annual mean 0.10, 
Normalized Standar deviation 0.05 %

Onshore wind#

annual_mean_wind = data_wind[country].resample('YE').mean()
annual_mean_wind.plot(kind='line', ylabel='Capacity Factor onshore wind')
<Axes: xlabel='utc_time', ylabel='Capacity Factor onshore wind'>
../_images/64233d60f9e24c9ec5c41dc843b73c33cc45a5331d73fe32b95db1fb63e37f09.png
print('Average annual mean {:.2f}, \nNormalized Standar deviation {:.2f} %'.format(annual_mean_wind.mean(),annual_mean_wind.std()/annual_mean_wind.mean()))
Average annual mean 0.27, 
Normalized Standar deviation 0.07 %

Repeat steps 1 to 5 for the electricity and heating demand time series.

Data import#

data_el = pd.read_csv('data/electricity_demand.csv',sep=';')
data_el.index = pd.DatetimeIndex(data_el['utc_time'])
data_heat = pd.read_csv('data/heat_demand.csv',sep=';')
data_heat.index = pd.DatetimeIndex(data_heat['utc_time'])

1) Plot of demand throughout two weeks#

data_el.loc['2015-01-01':'2015-01-14'][country].plot(kind='line', ylabel='Electricity demand', color='black')
<Axes: xlabel='utc_time', ylabel='Electricity demand'>
../_images/afecbe5023dfe02ece3ccb0d31f08c19e7276ab7054f95c990f78037645ebb3d.png
data_heat.loc['2015-01-01':'2015-01-14'][country].plot(kind='line', ylabel='Electricity demand', color='red')
<Axes: xlabel='utc_time', ylabel='Electricity demand'>
../_images/2984fa68ba21c88f686ffa5dbb1318a7c0f905d0af08793d7e420790be44a2e1.png
### 1) Plot of daily, weekly, annual demand
# Daily
daily_mean_el = data_el.loc['2015-01-01':'2015-12-31'][country].resample('D').mean()
daily_mean_heat = data_heat.loc['2015-01-01':'2015-12-31'][country].resample('D').mean()
# Weekly
weekly_mean_el = data_el.loc['2015-01-01':'2015-12-31'][country].resample('W').mean()
weekly_mean_heat = data_heat.loc['2015-01-01':'2015-12-31'][country].resample('W').mean()
# Monthly
monthly_mean_el = data_el.loc['2015-01-01':'2015-12-31'][country].resample('ME').mean()
monthly_mean_heat = data_heat.loc['2015-01-01':'2015-12-31'][country].resample('ME').mean()
daily_mean_el.plot(kind='line', ylabel='demand', color='black')
daily_mean_heat.plot(kind='line', color='red')
<Axes: xlabel='utc_time', ylabel='demand'>
../_images/f2e7917b8bac989ef46265de67272173a484b7906837089767819f625a2fe335.png
weekly_mean_el.plot(kind='line', ylabel='demand', color='black')
weekly_mean_heat.plot(kind='line', color='red')
<Axes: xlabel='utc_time', ylabel='demand'>
../_images/851443e7d6f0f3123a344290a9ba1759e737561a405009f45251854ee9b947f8.png
monthly_mean_el.plot(kind='line', ylabel='demand', color='black')
monthly_mean_heat.plot(kind='line', color='red')
<Axes: xlabel='utc_time', ylabel='demand'>
../_images/2a69a21baedbdc265392674ab86f5a6403301760adcdcc9d7322a1fa5d011d3e.png

Fast Fourier power spectra#

t_sampling=1 # sampling rate, 1 data per hour
x = np.arange(1,8761, t_sampling) 
y = data_el.loc['2015-01-01':'2015-12-31'][country]
n = len(x)
y_fft = np.fft.fft(y)/n #n for normalization    
frq = np.arange(0,1/t_sampling,1/(t_sampling*n))        
period = np.array([1/f for f in frq]) 

plt.semilogx(period[1:n//2],
             abs(y_fft[1:n//2])/np.max(abs(y_fft[1:n//2])), 
             color='black',
             linewidth=2)  
plt.xlabel('cycling period (hours)')

#We add lines indicating day, week, month 
plt.axvline(x=24, color='lightgrey', linestyle='--')
plt.axvline(x=24*7, color='lightgrey', linestyle='--')
plt.axvline(x=24*30, color='lightgrey', linestyle='--')
plt.axvline(x=8760, color='lightgrey', linestyle='--') 
plt.text(26, 0.95, 'day', horizontalalignment='left', color='dimgrey', fontsize=14)
plt.text(24*7+20, 0.95, 'week', horizontalalignment='left', color='dimgrey', fontsize=14)
plt.text(24*30+20, 0.95, 'month', horizontalalignment='left', color='dimgrey', fontsize=14)
/tmp/ipykernel_2105/1496012896.py:7: RuntimeWarning: divide by zero encountered in scalar divide
  period = np.array([1/f for f in frq])
Text(740, 0.95, 'month')
../_images/a9857cf1a13ac9a8f66409177cff9e8ae7a11503a079bb561221b03b2dd7051d.png
t_sampling=1 # sampling rate, 1 data per hour
x = np.arange(1,8761, t_sampling) 
y = data_heat.loc['2015-01-01':'2015-12-31'][country]
n = len(x)
y_fft = np.fft.fft(y)/n #n for normalization    
frq = np.arange(0,1/t_sampling,1/(t_sampling*n))        
period = np.array([1/f for f in frq]) 

plt.semilogx(period[1:n//2],
             abs(y_fft[1:n//2])/np.max(abs(y_fft[1:n//2])), 
             color='red',
             linewidth=2)  
plt.xlabel('cycling period (hours)')

#We add lines indicating day, week, month 
plt.axvline(x=24, color='lightgrey', linestyle='--')
plt.axvline(x=24*7, color='lightgrey', linestyle='--')
plt.axvline(x=24*30, color='lightgrey', linestyle='--')
plt.axvline(x=8760, color='lightgrey', linestyle='--') 
plt.text(26, 0.95, 'day', horizontalalignment='left', color='dimgrey', fontsize=14)
plt.text(24*7+20, 0.95, 'week', horizontalalignment='left', color='dimgrey', fontsize=14)
plt.text(24*30+20, 0.95, 'month', horizontalalignment='left', color='dimgrey', fontsize=14)
/tmp/ipykernel_2105/3748597511.py:7: RuntimeWarning: divide by zero encountered in scalar divide
  period = np.array([1/f for f in frq])
Text(740, 0.95, 'month')
../_images/3e6936679a8e3e8c9e6b0e3e9c725b31411e368358e32fd1ec8ece46a52c8104.png

Duration curves#

duration_el = data_el.loc['2015-01-01':'2015-12-31'][country].sort_values(ascending=False,ignore_index=True)
duration_heat = data_heat.loc['2015-01-01':'2015-12-31'][country].sort_values(ascending=False,ignore_index=True)
duration_el.plot(kind='line', ylabel='demand', color='black')
duration_heat.plot(kind='line', ylabel='demand', color='red')
<Axes: ylabel='demand'>
../_images/a24d84231221938e2c885142acbe6b5f44635a01fbae527bb7b223bd40fe571b.png

Ramps#

ramps_el = data_el.loc['2015-01-01':'2015-12-31'][country].diff()
ramps_el.plot(kind='hist', xlabel='electricity demand ramps (MWh)', color='black')
<Axes: xlabel='electricity demand ramps (MWh)', ylabel='Frequency'>
../_images/9d2da0e8dbb9629fc182a0a1f27492600052d6671d9aa1e3c95e3039a2c6878f.png
ramps_heat = data_heat.loc['2015-01-01':'2015-12-31'][country].diff()
ramps_heat.plot(kind='hist', xlabel='heating demand ramps (MWh)', color='red')
<Axes: xlabel='heating demand ramps (MWh)', ylabel='Frequency'>
../_images/09e46073b5f4941b79e6566fd59156b9b80e231c73714f8043872a893b6dea13.png