2020 SciPy - Jacob Dice¶

Geospatial analysis of Housing Stock at Risk of Sea-Level Rise¶

In the jupyter notebook below, I describe a research project I've been working with colleagues at the Federal Reserve Bank of Kansas City and NOAA (David Rodziewicz, Christopher Amante, and Eugene Wahl).

Please start with the embedded video
Read through the code example
Check out our recently-posted research working paper (link)
If you have any follow-up questions, don't hesitate to email me at: Jacob.Dice@kc.frb.org

from IPython.display import YouTubeVideo
YouTubeVideo('vk7ZmUXINPg', width=900, height=600)

from pathlib import Path
import rasterio as rio
import fiona
import geopandas as gpd
import pandas as pd
import numpy as np
from geopandas.tools import sjoin
import matplotlib.pyplot as plt
import re
import operator

To determine the effects of sea level rise at a property level, we first have to identify the exact elevation each property in our sample will go underwater. To do this, we use two ways of deriving the inundation elevation of a property:

DEM (Digital Elevation Model) Elevation
Shapefile Elevation

The DEM file is a raster file (tiled data) that defines the elevation for any given point on the coastline. The shapefile is a series of polygons describing the coastline boundary at 0-10 ft.

The shapefile, while a less granular match (only 1ft increments), was computed accounting for hydrological connectivity. Essentially, if there are natural barriers/dams preventing the water from flowing inland, the shapefile correctly identifies these regions as not-inundated.

To optimize for match granularity and fidelity, we match to both files. If the DEM elevation resides within the 1ft tolerance of the shapefile, we inundate at the DEM elevation. If the shapefile indicates that the property will innundate at a higher elevation than the DEM elevation (indicating hydrologic connectivity is at play), we inundate the property at the shapefile elvation since we can't estimate a better proxy.

Below, I show how to use rasterio and geopandas to perform these matches on 3 made-up properties in Florida. In our research working paper (Kansas City Federal Reserve), our sample was ~5.4M properties across 15 Metros.

#Specify file paths to DEM file:
#For the research paper, we use a variant of what can be downloaded at: https://coast.noaa.gov/slrdata/
DEM_FILE_PATH = Path('../data/FL_MFL_2_GCS_5m_MHHWft.tif')

#Specify file paths to SHP file:
#Source: https://coast.noaa.gov/slrdata/
SHP_FILE_PATH = Path('../data/FL_MFL_slr_data_dist/FL_MFL_slr_final_dist.gdb')

#Generate dataframe of made-up properties on the coastline of FL.
houses = pd.DataFrame({'house_id': ['a','b','c'],
                       'lat': [25.772332, 25.974418 , 25.945423],
                       'lon': [-80.206236, -80.133745, -80.141970],
                      })

#Convert to geopandas dataframe (will make spatial joins easier)
houses = gpd.GeoDataFrame(houses,
                          geometry=gpd.points_from_xy(houses['lon'], houses['lat']))
#set coordinate-reference-system metadata in geopandas
houses.crs = "EPSG:4326"
houses

#state shapefile (only to generate map below): https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_state_5m.zip
states = gpd.read_file('../data/state_shapefile/cb_2018_us_state_5m.shp')
states = states[states['STUSPS']=='FL']
states

fig, ax = plt.subplots(ncols=2, figsize=(10,5))
states.boundary.plot(ax=ax[0])
houses.plot(color='green', alpha=1, markersize=80, ax=ax[0])

states.boundary.plot(ax=ax[1])
houses.plot(color='green', alpha=1, markersize=10, ax=ax[1])
ax[1].set_xlim(xmin=-81, xmax=-79.5)
ax[1].set_ylim(ymin=25.2, ymax=26.5)
ax[0].set_title('Florida Coastline')
ax[1].set_title('Miami - 3 made up properties')
plt.show()

Query DEM height (rasterio)¶

coord_tuples = [tuple(x) for x in houses[['lon','lat']].values]

#generate list of sampled elevations from raster file.
with rio.open(DEM_FILE_PATH) as dem:
    #DEM is only a single layer, so we just need the 0th index
    dem_heights = [i[0] for i in dem.sample(coord_tuples)]
    
houses['DEM'] = dem_heights

houses

Query Shapefile height (geopandas)¶

layers = fiona.listlayers(SHP_FILE_PATH)
print(layers)

['FL_MFL1_low_0ft', 'FL_MFL1_low_10ft', 'FL_MFL1_low_1ft', 'FL_MFL1_low_2ft', 'FL_MFL1_low_3ft', 'FL_MFL1_low_4ft', 'FL_MFL1_low_5ft', 'FL_MFL1_low_6ft', 'FL_MFL1_low_7ft', 'FL_MFL1_low_8ft', 'FL_MFL1_low_9ft', 'FL_MFL1_slr_0ft', 'FL_MFL1_slr_10ft', 'FL_MFL1_slr_1ft', 'FL_MFL1_slr_2ft', 'FL_MFL1_slr_3ft', 'FL_MFL1_slr_4ft', 'FL_MFL1_slr_5ft', 'FL_MFL1_slr_6ft', 'FL_MFL1_slr_7ft', 'FL_MFL1_slr_8ft', 'FL_MFL1_slr_9ft', 'FL_MFL2_low_0ft', 'FL_MFL2_low_10ft', 'FL_MFL2_low_1ft', 'FL_MFL2_low_2ft', 'FL_MFL2_low_3ft', 'FL_MFL2_low_4ft', 'FL_MFL2_low_5ft', 'FL_MFL2_low_6ft', 'FL_MFL2_low_7ft', 'FL_MFL2_low_8ft', 'FL_MFL2_low_9ft', 'FL_MFL2_slr_0ft', 'FL_MFL2_slr_10ft', 'FL_MFL2_slr_1ft', 'FL_MFL2_slr_2ft', 'FL_MFL2_slr_3ft', 'FL_MFL2_slr_4ft', 'FL_MFL2_slr_5ft', 'FL_MFL2_slr_6ft', 'FL_MFL2_slr_7ft', 'FL_MFL2_slr_8ft', 'FL_MFL2_slr_9ft']

#Typically this is a fast process.
#In this case, the coastline polygons are very complex, and take a few minutes per layer to load into memory

for layer in layers:
 if "low" not in layer:
    print("Merging: {}".format(layer))
    gdf = gpd.read_file(SHP_FILE_PATH,
        driver='FileGDB',
        layer=layer)
    
    gdf = gdf.to_crs("EPSG:4326")
    
    gdf[layer] = True
    gdf = gdf[[layer, 'geometry']]
    houses = sjoin(houses, gdf, how ='left', op='within').drop(['index_right'], axis=1)

Merging: FL_MFL1_slr_0ft
Merging: FL_MFL1_slr_10ft
Merging: FL_MFL1_slr_1ft
Merging: FL_MFL1_slr_2ft
Merging: FL_MFL1_slr_3ft
Merging: FL_MFL1_slr_4ft
Merging: FL_MFL1_slr_5ft
Merging: FL_MFL1_slr_6ft
Merging: FL_MFL1_slr_7ft
Merging: FL_MFL1_slr_8ft
Merging: FL_MFL1_slr_9ft
Merging: FL_MFL2_slr_0ft
Merging: FL_MFL2_slr_10ft
Merging: FL_MFL2_slr_1ft
Merging: FL_MFL2_slr_2ft
Merging: FL_MFL2_slr_3ft
Merging: FL_MFL2_slr_4ft
Merging: FL_MFL2_slr_5ft
Merging: FL_MFL2_slr_6ft
Merging: FL_MFL2_slr_7ft
Merging: FL_MFL2_slr_8ft
Merging: FL_MFL2_slr_9ft

# There is now a column for each of the shapefile coastline polygons.
# The values are true/nan value if the property was/wasn't contained within the given coastline elevation
print(houses)

  house_id        lat        lon                    geometry       DEM  \
0        a  25.772332 -80.206236  POINT (-80.20624 25.77233)  2.363548   
1        b  25.974418 -80.133745  POINT (-80.13375 25.97442)  4.911385   
2        c  25.945423 -80.141970  POINT (-80.14197 25.94542)  5.717912   

  FL_MFL1_slr_0ft FL_MFL1_slr_10ft FL_MFL1_slr_1ft FL_MFL1_slr_2ft  \
0             NaN              NaN             NaN             NaN   
1             NaN              NaN             NaN             NaN   
2             NaN              NaN             NaN             NaN   

  FL_MFL1_slr_3ft  ... FL_MFL2_slr_10ft FL_MFL2_slr_1ft FL_MFL2_slr_2ft  \
0             NaN  ...             True             NaN             NaN   
1             NaN  ...             True             NaN             NaN   
2             NaN  ...             True             NaN             NaN   

  FL_MFL2_slr_3ft FL_MFL2_slr_4ft FL_MFL2_slr_5ft FL_MFL2_slr_6ft  \
0             NaN             NaN             NaN             NaN   
1             NaN             NaN            True            True   
2             NaN             NaN             NaN            True   

   FL_MFL2_slr_7ft FL_MFL2_slr_8ft FL_MFL2_slr_9ft  
0             True            True            True  
1             True            True            True  
2             True            True            True  

[3 rows x 27 columns]

#make list of tuples of form: (shapefile name, elevation)

elev_filter = re.compile(r'^.*_(\d{1,2})ft$')
elev_cols = list(filter(elev_filter.search, list(houses)))
elev_heights = [int(re.search(r'^.*_(\d{1,2})ft$',nm).group(1)) for nm in elev_cols]
elev_zip = sorted(zip(elev_cols, elev_heights), key=operator.itemgetter(1))
print("Elevation heights found: {}".format(elev_zip))

Elevation heights found: [('FL_MFL1_slr_0ft', 0), ('FL_MFL2_slr_0ft', 0), ('FL_MFL1_slr_1ft', 1), ('FL_MFL2_slr_1ft', 1), ('FL_MFL1_slr_2ft', 2), ('FL_MFL2_slr_2ft', 2), ('FL_MFL1_slr_3ft', 3), ('FL_MFL2_slr_3ft', 3), ('FL_MFL1_slr_4ft', 4), ('FL_MFL2_slr_4ft', 4), ('FL_MFL1_slr_5ft', 5), ('FL_MFL2_slr_5ft', 5), ('FL_MFL1_slr_6ft', 6), ('FL_MFL2_slr_6ft', 6), ('FL_MFL1_slr_7ft', 7), ('FL_MFL2_slr_7ft', 7), ('FL_MFL1_slr_8ft', 8), ('FL_MFL2_slr_8ft', 8), ('FL_MFL1_slr_9ft', 9), ('FL_MFL2_slr_9ft', 9), ('FL_MFL1_slr_10ft', 10), ('FL_MFL2_slr_10ft', 10)]

#Find most restrictive shapefile match by iterating through the sorted list of elevations.
#Update column shapefile_elevation to the lowest elevation match

houses['shapefile_elevation'] = np.nan

for shape_name, elev in elev_zip:
    #This should work for shapefiles with multiple shapes with same elevation. Only update if it's nan or elev is less than current set value
    houses.loc[(houses[shape_name] == True) &
            ((np.isnan(houses["shapefile_elevation"])) | (elev < houses["shapefile_elevation"])),
            'shapefile_elevation'] = elev
    
houses = houses.drop(columns=elev_cols)
houses

houses['inundation_elevation'] = np.where(houses['DEM'] > (houses['shapefile_elevation']-1),
                                          houses['DEM'],
                                          houses['shapefile_elevation'])

houses

Results:¶

At this point, we have mapped the elevation of inundation for each property in the sample. We can proceed to applying a variety of sea level rise models, however in our paper we highlight a model by Robert Kopp in a 2014 paper. Please see our research working paper for more information: https://www.kansascityfed.org/publications/research/rwp

	house_id	lat	lon	geometry
0	a	25.772332	-80.206236	POINT (-80.20624 25.77233)
1	b	25.974418	-80.133745	POINT (-80.13375 25.97442)
2	c	25.945423	-80.141970	POINT (-80.14197 25.94542)