Deep Spatio-Temporal Residual Networks for Citywide Crowd Flows Prediction
Junbo Zhang, Yu Zheng, Dekang Qi (Microsoft Research) 2017
Keras implementation : https://github.com/lucktroy/DeepST
Introduction
 Forecasting the flow of crowds

In this paper, we predict two types of crowd flows : inflow and outflow
 Inflow and outflow of crowds are affected by the following
 Spatial dependencies
 Temporal dependencies
 External influence : such as weather, events
 Contributions
 ST-ResNet employs convolution-based residual networks to model nearby and distant spatial dependencies between any two regions
 Three categories of temporal properties are considered : temporal closeness, period, and trend. ST-ResNet uses three residual networks to model them, respectively
 ST-ResNet dynamically aggregates the outputs of the three aforementioned networks.
Formulation of Crowd Flows Problem
 Region : we partition a city into an I*J grid map
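 Inflow/outflow : Let P be a collection of trajectories at the t^{th} time interval. For a grid (i, j) that lies at the i^{th} row and j^{th} column, the inflow and outflow of the crowds at time interval t are defined respectively as
 x_t^{in,i,j} = \sum_{Tr \in P} |\{k > 1 : g_{k-1} \notin (i, j) \wedge g_k \in (i, j)\}|
 x_t^{out,i,j} = \sum_{Tr \in P} |\{k \geq 1 : g_k \in (i, j) \wedge g_{k+1} \notin (i, j)\}|
where
 Tr : g_1 \to g_2 \to \cdots \to g_{|Tr|} is a trajectory in P
 g_k is the geospatial coordinate
 g_k \in (i, j) means the point g_k lies within grid (i, j), and vice versa
 |\cdot| denotes the cardinality of a set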
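The definition above amounts to counting grid-cell crossings between consecutive points of each trajectory. A minimal sketch, assuming trajectories are given as lists of planar points; `crowd_flows`, `unit_cell`, and the toy 1 x 1 cell mapping are hypothetical names introduced only for illustration:

```python
from collections import defaultdict


def crowd_flows(trajectories, cell_of):
    """Count inflow and outflow per grid cell for one time interval.

    trajectories: iterable of point sequences, one sequence per trajectory.
    cell_of: helper (assumed) that maps a point to its grid cell (i, j).
    """
    inflow = defaultdict(int)
    outflow = defaultdict(int)
    for traj in trajectories:
        cells = [cell_of(point) for point in traj]
        for prev, cur in zip(cells, cells[1:]):
            if prev != cur:
                outflow[prev] += 1  # the trajectory leaves cell prev
                inflow[cur] += 1    # the trajectory enters cell cur
    return dict(inflow), dict(outflow)


def unit_cell(point):
    # Toy mapping (assumption): cells are 1 x 1 squares, row = y, column = x.
    x, y = point
    return (int(y), int(x))


trajs = [
    [(0.2, 0.3), (1.4, 0.1), (1.6, 0.8), (1.5, 1.5)],  # cells (0,0), (0,1), (0,1), (1,1)
    [(1.1, 1.9), (1.2, 0.5)],                          # cells (1,1), (0,1)
]
inflow, outflow = crowd_flows(trajs, unit_cell)
```

Consecutive points that stay in the same cell contribute to neither count, which matches the requirement that one point lies inside the grid and its neighbor outside.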
Deep Spatio-Temporal Residual Networks

ST-ResNet comprises four major components modeling temporal closeness, period, trend, and external influence, respectively.

First, we turn inflow and outflow throughout a city at each time interval into a 2-channel image-like matrix.
 Then, we divide the time axis into three fragments, denoting recent time, near history, and distant history. The 2-channel flow matrices of the intervals in each time fragment are then fed into the first three components separately to model the aforementioned three temporal properties: closeness, period, and trend
 The three components share the same network structure (a convolutional network with a sequence of Residual Units)
 The outputs of the three components are fused as X_{Res} based on parameter matrices, which assign different weights to the results of different components in different regions.
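Splitting the time axis into the three fragments can be sketched as plain index selection. This is only an illustration: the fragment lengths `l_c`, `l_p`, `l_q`, the 30-minute interval (so one day is 48 intervals and one week is 336), and the helper names are assumptions, not values taken from these notes.

```python
import numpy as np


def dependent_indices(t, l_c, l_p, l_q, p, q):
    """Interval indices (oldest first) of the three fragments used to predict interval t."""
    closeness = [t - k for k in range(l_c, 0, -1)]   # recent time
    period = [t - k * p for k in range(l_p, 0, -1)]  # near history, step p
    trend = [t - k * q for k in range(l_q, 0, -1)]   # distant history, step q
    return closeness, period, trend


def stack_frames(flows, indices):
    """Concatenate the selected (2, I, J) flow matrices along the first axis."""
    return np.concatenate([flows[i] for i in indices], axis=0)


T, I, J = 1200, 4, 5
flows = np.zeros((T, 2, I, J))  # placeholder data: T intervals of 2-channel I x J maps
c_idx, p_idx, q_idx = dependent_indices(t=1000, l_c=3, l_p=2, l_q=1, p=48, q=336)
x_c0 = stack_frames(flows, c_idx)  # shape (2 * l_c, I, J)
```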

In the external component, we manually extract some features from external datasets, such as weather conditions and events, and feed them into a two-layer fully-connected neural network.
 X_{Res} and the output of the external component X_{Ext} are integrated together. Then, the final output is mapped into [-1, 1] using the tanh function.
Structures of the First Three Components
 Do not use subsampling, but only convolutions
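 closeness component
 [X_{t-l_c}, X_{t-(l_c-1)}, \cdots, X_{t-1}] (the closeness dependent sequence) : concatenate them along the first axis (time interval) into one tensor X_c^{(0)} \in R^{2 l_c \times I \times J}
 X_c^{(0)} is followed by a convolution (Conv1)
 Conv1 : X_c^{(1)} = f(W_c^{(1)} * X_c^{(0)} + b_c^{(1)}), where * denotes convolution and f is an activation function
 Residual Unit : X_c^{(l+1)} = X_c^{(l)} + F(X_c^{(l)}; \theta_c^{(l)}), l = 1, \cdots, L
 stack L residual units to capture very large citywide dependencies
 The residual function F is two combinations of “ReLU + Convolution”; “BatchNormalization” can be added before each ReLU. On top of the L-th residual unit, we append a convolutional layer (Conv2)
 output of the closeness component is X_c^{(L+2)}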
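 period component
 Assume that there are l_p time intervals from the period fragment and the period is p; the period dependent sequence is [X_{t-l_p \cdot p}, X_{t-(l_p-1) \cdot p}, \cdots, X_{t-p}]
 output : X_p^{(L+2)}
 in the implementation, p is equal to one day (daily periodicity)
 trend component
 l_q is the length of the trend dependent sequence and q is the trend span
 input : [X_{t-l_q \cdot q}, X_{t-(l_q-1) \cdot q}, \cdots, X_{t-q}]
 output : X_q^{(L+2)}
 in the implementation, q is equal to one week (weekly trend)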
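The structure shared by the three components (a first convolution, stacked residual units built from “ReLU + Convolution”, and a final convolution) can be sketched in plain NumPy. This is a forward-pass illustration only, not the authors' Keras code: the 3x3 kernels, the filter count, the random weights, the placement of a ReLU before the last convolution, and the omission of BatchNormalization are all assumptions.

```python
import numpy as np


def conv2d_same(x, w, b):
    """Zero-padded 'same' convolution. x: (C_in, I, J), w: (C_out, C_in, k, k), b: (C_out,)."""
    c_out, _, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    _, n_rows, n_cols = x.shape
    out = np.zeros((c_out, n_rows, n_cols))
    for di in range(k):
        for dj in range(k):
            patch = xp[:, di:di + n_rows, dj:dj + n_cols]
            out += np.einsum('oc,cij->oij', w[:, :, di, dj], patch)
    return out + b[:, None, None]


def relu(x):
    return np.maximum(x, 0.0)


def residual_unit(x, params):
    """Return x + F(x), where F is two 'ReLU + Convolution' stages."""
    (w1, b1), (w2, b2) = params
    h = conv2d_same(relu(x), w1, b1)
    h = conv2d_same(relu(h), w2, b2)
    return x + h


def component(x0, conv1, units, conv2):
    """First convolution -> stacked residual units -> last convolution (2 output channels)."""
    x = conv2d_same(x0, *conv1)
    for params in units:
        x = residual_unit(x, params)
    return conv2d_same(relu(x), *conv2)  # ReLU placement here is an assumption


rng = np.random.default_rng(0)


def init(c_out, c_in, k=3):
    # Random weights stand in for trained parameters.
    return rng.normal(scale=0.1, size=(c_out, c_in, k, k)), np.zeros(c_out)


seq_len, I, J, filters, n_units = 3, 8, 8, 4, 2
x0 = rng.normal(size=(2 * seq_len, I, J))  # stacked 2-channel flow matrices
units = [(init(filters, filters), init(filters, filters)) for _ in range(n_units)]
out = component(x0, init(filters, 2 * seq_len), units, init(2, filters))
```

Because there is no subsampling, the output keeps the I x J grid size, and each additional residual unit widens the region of the city that can influence one output cell.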
The Structure of the External Component

The external component mainly considers weather, holiday events, and metadata (DayOfWeek, Weekday/Weekend)

Stack two fully-connected layers upon the external feature vector E_t
 first layer : embedding layer
 second layer : maps low to high dimensions so that the output has the same shape as X_t
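The two fully-connected layers could look roughly like the following forward pass. The feature count, the hidden size, the ReLU after the first layer, and the random weights are assumptions made for illustration; `external_component` is a hypothetical helper name.

```python
import numpy as np


def external_component(e_t, w1, b1, w2, b2, out_shape):
    """Two fully-connected layers mapping external features to a flow-shaped tensor."""
    hidden = np.maximum(e_t @ w1 + b1, 0.0)  # first layer: embedding (ReLU is an assumption)
    out = hidden @ w2 + b2                   # second layer: low -> high dimension
    return out.reshape(out_shape)            # same shape as a 2-channel flow matrix


rng = np.random.default_rng(0)
n_features, n_hidden, I, J = 28, 10, 8, 8    # illustrative sizes
e_t = rng.random(n_features)                 # stand-in for weather/holiday/metadata features
w1 = rng.normal(scale=0.1, size=(n_features, n_hidden))
b1 = np.zeros(n_hidden)
w2 = rng.normal(scale=0.1, size=(n_hidden, 2 * I * J))
b2 = np.zeros(2 * I * J)
x_ext = external_component(e_t, w1, b1, w2, b2, (2, I, J))
```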
Fusion
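 flows of two regions are all affected by closeness, period, and trend, but the degrees of influence may be very different ; hence a parametric-matrix-based fusion
 X_{Res} = W_c \circ X_c^{(L+2)} + W_p \circ X_p^{(L+2)} + W_q \circ X_q^{(L+2)}
 \circ is the Hadamard product (i.e., element-wise multiplication)
 W_c, W_p, W_q are learnable parameters
 fusing the external component : \hat{X}_t = tanh(X_{Res} + X_{Ext})
 objective : minimize the mean squared error between the predicted flow matrix and the true flow matrix, L(\theta) = \|X_t - \hat{X}_t\|_2^2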
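The parametric-matrix-based fusion, the tanh mapping, and the mean squared error objective can be sketched as follows. The tensor shape and the random values are placeholders; in training, the weight matrices would be learned rather than sampled.

```python
import numpy as np


def fuse(x_c, x_p, x_q, w_c, w_p, w_q, x_ext):
    """Weight each component per region and channel, add the external output, apply tanh."""
    x_res = w_c * x_c + w_p * x_p + w_q * x_q  # element-wise (Hadamard) products
    return np.tanh(x_res + x_ext)


def mse(x_true, x_pred):
    """Mean squared error between the true and the predicted flow matrix."""
    return float(np.mean((x_true - x_pred) ** 2))


rng = np.random.default_rng(0)
shape = (2, 4, 4)  # (inflow/outflow, I, J), illustrative
x_c, x_p, x_q, x_ext = (rng.normal(size=shape) for _ in range(4))
w_c, w_p, w_q = (rng.normal(size=shape) for _ in range(3))
x_hat = fuse(x_c, x_p, x_q, w_c, w_p, w_q, x_ext)
loss = mse(np.zeros(shape), x_hat)
```

Since each weight matrix has the same shape as the component outputs, every grid cell gets its own mix of closeness, period, and trend.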
Experiments
 Datasets
 Baselines
 HA : historical average of the flows in the corresponding periods (same time in previous weeks)
 ARIMA, SARIMA, VAR
 ST-ANN : It first extracts spatial (nearby 8 regions’ values) and temporal (8 previous time intervals) features, which are then fed into an artificial neural network.
 DeepST : (Zhang et al. 2016)
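As a point of reference, the HA baseline can be sketched as an average over the same time slot in earlier weeks. The number of weeks, the week length in intervals, and the function name are assumptions for illustration; the notes do not specify them.

```python
import numpy as np


def historical_average(flows, t, week, n_weeks):
    """Predict interval t as the mean of the same slot in up to n_weeks previous weeks.

    flows: array of shape (T, 2, I, J); week: number of intervals per week.
    """
    idx = [t - k * week for k in range(1, n_weeks + 1) if t - k * week >= 0]
    if not idx:
        raise ValueError("no history available for interval t")
    return flows[idx].mean(axis=0)


# Toy data: every cell of interval s holds the value s, and a "week" is 4 intervals.
T, I, J = 12, 2, 2
flows = np.arange(T, dtype=float).reshape(T, 1, 1, 1) * np.ones((1, 2, I, J))
pred = historical_average(flows, t=10, week=4, n_weeks=2)  # averages intervals 6 and 2
```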
 Preprocessing
 min-max normalization : scale the data into [-1, 1] (the output range of tanh)
 one-hot encoding for external data
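Both preprocessing steps are short enough to sketch directly. The helper names and the example values are illustrative; the inverse transform is included on the assumption that predictions are rescaled to real flow counts before evaluation.

```python
import numpy as np


def minmax_scale(x, x_min, x_max):
    """Scale values from [x_min, x_max] into [-1, 1]."""
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0


def minmax_inverse(y, x_min, x_max):
    """Map values from [-1, 1] back to [x_min, x_max]."""
    return (y + 1.0) / 2.0 * (x_max - x_min) + x_min


def one_hot(index, n_classes):
    """Encode a categorical value (e.g. day of week) as a one-hot vector."""
    vec = np.zeros(n_classes)
    vec[index] = 1.0
    return vec


flows = np.array([0.0, 50.0, 100.0])
scaled = minmax_scale(flows, flows.min(), flows.max())
restored = minmax_inverse(scaled, flows.min(), flows.max())
day_of_week = one_hot(2, 7)
```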
 Result
Comments