4 minute read

Summary: Tableau is a popular visualization software. I collected Texas outage data and visualize it in Tableau.

Contents

1. Introduction

I want to visualize the Texas power outage during Feb.10 to Feb.19 hour by hour in Tableau.

  • Firstly, collecting outage data, weather history data and Texas county data.
  • Secondly, processing the data.
  • Lastly, visualizing the processed data in Tableau.

Notebook for collecting and processing data can be found here

2. Data Collection

In this part, I will collect outage data, weather history data and Texas county data.

2.1 Outage data

outage = pd.read_excel('outage_feb1019.xlsx', sheet_name='OutageData')
print(f'Outage size: {outage.shape}')
outage.head(3)

name

outage['FUEL TYPE'].unique()

array(['WIND', 'NG', 'COAL', 'SOLAR', 'ESR', 'HYDRO', 'NUCLEAR'], dtype=object)

2.2 County data

cnty = pd.read_csv('Texas_Counties_Centroid_Map.csv')
print(f'County size: {cnty.shape}')
cnty.head(3)

County size: (254, 8) name

2.3 Weather history data

##scrape weather history data hourly
stations = Stations()
# Set time period
start = datetime(2021, 2, 10)
end = datetime(2021, 2, 19, 23, 59)

# for idx in tqdm(range(len(cnty))):
# for idx in range(len(cnty)):
idx_s = 200 #change this in tiny step to avoid blocking
idx_e = len(cnty)
for idx in tqdm(range(idx_s, idx_e)):
    cnty_name = cnty['CNTY_NM'][idx]
    lat = cnty['X (Lat)'][idx].round(4)
    lon = cnty['Y (Long)'][idx].round(4)
    stations = stations.nearby(lon, lat)
    station = stations.fetch(1)
    
    # Get daily data for 2018
    data = Hourly(station.index[0], start, end) #https://dev.meteostat.net/python/hourly.html#example
    data = data.fetch()
    data.reset_index(inplace=True) #the time was used as index, reset it
    data['COUNTY'] = cnty_name
    
    if idx == idx_s:
        hourly_data = data
    else:
        hourly_data = hourly_data.append(data)
        
        
hourly_data.reset_index(inplace=True)
hourly_data.to_csv('hourly_data_200_last.csv')

Combining all weather data together:

hourly_data_0_99 = pd.read_csv('hourly_data_0_99.csv')
hourly_data_100_149 = pd.read_csv('hourly_data_100_149.csv')
hourly_data_150_199 = pd.read_csv('hourly_data_150_199.csv')
hourly_data_200_last = pd.read_csv('hourly_data_200_last.csv')

hourly_data = pd.concat([hourly_data_0_99, hourly_data_100_149, hourly_data_150_199, hourly_data_200_last], axis=0)

Cleaning the data names:

hourly_data.drop(['Unnamed: 0', 'index'], inplace=True, axis=1)
hourly_data.rename(columns={'COUNTY': 'CNTY_NM'}, inplace=True)
hourly_data.reset_index(drop=True, inplace=True)
hourly_data.to_csv('hourly_data.csv')
hourly_data.head(3)

name

3. Data processing

3.1 Picking useful features from weather data

Adding county information to weather data and pick usefull features from the weather data.

hourly_data = pd.read_csv('hourly_data.csv') 
cnty_xycnty = cnty[['X (Lat)', 'Y (Long)', 'CNTY_NM', 'CNTY_NBR', 'FIPS']] 
weather_data = hourly_data.merge(cnty_xycnty, how='inner', on='CNTY_NM') #addX (Lat)', 'Y (Long) to weather data
weather_data['CNTY_NM'] = weather_data['CNTY_NM'].str.upper()
# weather_data['time'] = pd.Timestamp(weather_data['time'])
weather_data['time'] = pd.to_datetime(weather_data['time']) #str to timestamp
# df['Timestamp'] = pd.to_datetime(df['Timestamp'], unit='s')
weather_data['State'] = 'Texas'
weather_cut = weather_data[['State', 'CNTY_NM', 'X (Lat)', 'Y (Long)', 'CNTY_NBR', 'FIPS', 'time', 'temp', 'coco']]
print(f'weather_cut shape: {weather_cut.shape}')
weather_cut.head(3)

weather_cut shape: (59760, 9) name

3.2 Picking useful features from the outage data

outage_cut = outage.drop(['STATION LONG NAME', 'UNIT NAME', 'RESOURCE ENTITY'], axis=1)
outage_cut.rename(columns={'SEASONAL MAX MW (HSL)': 'MAX_MW', 'AVAILABLE MW AFTER OUTAGE/DERATE': 'AVAI_MW', 'MW REDUCTION FROM OUTAGE/DERATE': 'REDU_MW'}, inplace=True)
outage_cut.head(3)

name

3.3 Adding outage info to weather_cut_drop.

For each county at each hour, checking if it’s suffering outage or not. The total MAX_MW, AVAIL_MW, REDU_MW will be added.

'''
Adding columns for reduced MW with different fuel type
'''
weather_cut_drop = weather_cut.copy()
# outage['FUEL TYPE'].unique():  ['WIND', 'NG', 'COAL', 'SOLAR', 'ESR', 'HYDRO', 'NUCLEAR']
rng = len(weather_cut_drop)
weather_cut_drop = weather_cut.copy()

for FT in outage['FUEL TYPE'].unique():
    weather_cut_drop[FT + '_recu_mw'] = pd.DataFrame([0] * rng)
    
weather_cut_drop.head(3)

Calculating total reduced power of each fuel type for each county at every hour.

'''
loop through every pair
'''
rng = len(weather_cut_drop)
for index_weather in tqdm(range(rng)):
    for index_outage in range(len(outage_cut)):
        if weather_cut_drop['CNTY_NM'][index_weather] == outage_cut['COUNTY'][index_outage]: 
            #think aobut the efficiency on accesing data in dataframe
            if weather_cut_drop['time'][index_weather] > outage_cut['START'][index_outage] and \
                            weather_cut_drop['time'][index_weather] < outage_cut['END'][index_outage] :
                FT = outage_cut['FUEL TYPE'][index_outage]
                weather_cut_drop[FT + '_recu_mw'][index_weather] = \
                    weather_cut_drop[FT + '_recu_mw'][index_weather] + outage_cut['REDU_MW'][index_outage]
weather_cut_drop.to_csv('cnty_weather_out_v2.csv')
weather_cut_drop.head(3)

4. Visualization in Tableau

Import cnty_weather_out_v2.csv to Tableau and visualize the data.

4.1 Evolution of outage(left) and temperature(right)

The slider can help user select time.

name

4.2 Comparing the evolution in one map

The color represents the temperature, the size of dot represents the total reduced power. The slider can help user select time.

name

4.3 Heatmap of counties’ total reduced power at each time

name

4.4 Heatmap of counties’ temperature at each time

name

4.5 Line chart of each county’s temperature and reduced power

In drop-down menu, the user can select county.

name

4.6 Total reduced power by fuel type

name

4.7 Top X counties suffer power outage at each time

The slider can help user select time. User can set X in the text box.

name

Comments