Normalising bounded graph measures
Creating scale-free local network analysis
Abstract
Graph measures that are calculated within a radius cutoff produce values of local relevance. To encode the true local relativity of such a value, the graph measure can be normalised at the local level. This is achieved by ranking the betweenness of a segment within the subgraph defined by the radius of interest. This provides new insights that complement the current normalisation methods, which are calculated at the full graph scale. The approach is demonstrated using space syntax analysis and its use of radius cutoffs in calculations. The measure of "Ranked Choice" provides the same value range at all scales and does not suffer from localised hotspots as Normalised Choice does. This creates the opportunity for a standardised, scale-free measure that could be applied to a network of any size.
Space syntax & Segment Angular choice
The space syntax measure of choice provides an indication of the hierarchy of movement potential and the structure and syntax of space. Choice is often calculated using a graph distance radius around the segment in question, as this allows the analysis to portray movement potential at local, city-wide or regional scales. Therefore portraying the relative hierarchy of segments within a network at all scales is a core objective in translating choice values into legible insight.
Normalised Angular Choice (NACH)
Choice is generally distributed exponentially because all segments in the graph contribute to the choice calculation. The current standard for normalising segment angular choice is Normalised Angular Choice (NACH), formulated as:
log(choice + 1) / log(total_depth + 3)
Hillier, B., Yang, T., Turner, A., (2012)
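As a minimal sketch, the formula above can be written as a small function. The function name `nach` and the example values are hypothetical, chosen only for illustration; the choice of logarithm base does not matter because the bases cancel in the ratio.

```python
import math


def nach(choice: float, total_depth: float) -> float:
    """Normalised Angular Choice: log(choice + 1) / log(total_depth + 3)."""
    return math.log(choice + 1) / math.log(total_depth + 3)


# Hypothetical values for illustration only (not taken from any dataset).
# A segment with zero choice always maps to 0.0, whatever its depth.
print(nach(choice=0, total_depth=50))
print(nach(choice=120_000, total_depth=4_000))
```

The `+ 1` keeps the numerator defined for segments with zero choice, and the `+ 3` keeps the denominator above zero for any non-negative total depth.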
NACH was developed to counteract the fact that choice measures are influenced more by segregated designs than by integrated designs. Combining depth with choice in theory controls for graph size, meaning all network types can be compared.
...we can indeed not only compare cities, but streets in one city with streets in others. We can even compare a street with a city
Ibid (2012)
NACH values are therefore used as a way of comparing morphologies in all contexts and scales, and of comparing the level of structure of different cities. However, NACH has limitations for certain purposes.
Continuous Urban Areas
NACH works best within contained urban graphs.
NACH means and maxima make most sense for continuous urban systems. If you expand the city to include partially unurbanised regions, for example expanding London to the M25, then global values will in general be rather lower, reflecting the lack of development in some areas. This will also lead to problems in local analysis of area.
Ibid (2012)
This creates issues when analysing rural/urban and local/global structures at the same time.
NACH is therefore sensitive to the size and composition of the network it is calculated for, which is problematic for networks larger than a city.
For NACH to accurately portray local hierarchy in a large network, it would therefore be necessary to create the graph for each city-region separately and/or to exclude certain segments. This creates issues of scalability and consistency when portraying NACH values.
Calculating true hierarchy
Choice calculations bound by radius are by definition only considering the sub-graph within that radius. A hierarchy is by definition a ranking of elements within a system, therefore an effective method of calculating the localised hierarchy would be to rank any radius-bound graph measure within the subset of the graph from which it was derived. By doing this a truly scale-free indication of hierarchy can be created.
Ranking Choice
In the case of choice this could require finding the percentile rank of a choice value within its radius (although other ranking methods could be employed). This provides a 0 to 1 percentile rank of the segment within its radius, meaning a segment with a value of 0.5 is in the middle of the local hierarchy no matter what scale or graph size is being analysed.
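A small worked example may make the 0 to 1 range concrete. The sketch below assumes the rank is defined as (rank - 1) / (n - 1), with tied values sharing the lowest rank; the function name and the choice values are hypothetical.

```python
def percent_rank(value, values):
    """Return (rank - 1) / (n - 1), where rank is 1 plus the number of
    strictly smaller values, so tied values share the lowest rank."""
    n = len(values)
    if n < 2:
        return 0.0  # a lone segment has no local hierarchy to rank against
    smaller = sum(1 for v in values if v < value)
    return smaller / (n - 1)


# Hypothetical choice values for the segments inside one radius.
local_choice = [3, 10, 25, 40, 900]
print(percent_rank(25, local_choice))   # 0.5, the middle of the local hierarchy
print(percent_rank(900, local_choice))  # 1.0, the top of the local hierarchy
```

The extreme value of 900 does not stretch the scale: the middle segment stays at 0.5 however large the top value becomes.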
Analysis & Visualisation Potential
By providing the actual hierarchical value of a segment, analysis can be done in terms of relative local hierarchy and structure, as opposed to total graph hierarchy. Direct comparisons can be made between different parts of a system and different scales.
By encoding all values within a consistent 0 to 1 scale that always corresponds to hierarchy, visualisation can be made similarly consistent. The rank can be intrinsically linked to visual styles such as thickness, colour or scale, based on visual best practice. The user is able to view the relative hierarchy of the network without prejudice and without needing to make ad-hoc decisions concerning graph size or visualisation.
Method of calculation
The ideal method would involve the following pseudo-code:
for each segment in graph:
    local_segments = all segments within calculated_radius of segment
    sort local_segments by measure
    rank of segment = (position of segment in sorted local_segments - 1) /
                      (count of local_segments - 1)
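The pseudo-code above can be sketched in Python under a few assumptions: the graph is a plain adjacency dictionary whose nodes stand for segments and whose edge weights are graph distances, the measure (for example choice at the same radius) has already been calculated for every segment, and ties share the lowest rank. The names `within_radius` and `ranked_measure` are hypothetical, and counting the smaller values replaces the explicit sort while giving the same rank. This is an illustration, not an optimised implementation.

```python
import heapq


def within_radius(graph, source, radius):
    """Segments whose shortest-path distance from source is <= radius
    (Dijkstra's algorithm, abandoned once the radius is exceeded)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, weight in graph[node].items():
            nd = d + weight
            if nd <= radius and nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                heapq.heappush(heap, (nd, neighbour))
    return set(dist)


def ranked_measure(graph, measure, radius):
    """Rank each segment's measure within its own radius-bound sub-graph."""
    ranks = {}
    for segment in graph:
        local = within_radius(graph, segment, radius)
        n = len(local)
        smaller = sum(1 for s in local if measure[s] < measure[segment])
        ranks[segment] = smaller / (n - 1) if n > 1 else 0.0
    return ranks


# Hypothetical four-segment chain a - b - c - d with unit distances.
graph = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0},
}
measure = {"a": 5, "b": 50, "c": 20, "d": 1}
print(ranked_measure(graph, measure, radius=1.0))
# {'a': 0.0, 'b': 1.0, 'c': 0.5, 'd': 0.0}
```

Segment c scores 0.5 because, within its own radius, it sits between b and d; its position relative to segments outside that radius plays no part.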
Practical Implementation
In practice, DepthmapX and similar space syntax or graph processing packages do not accommodate such calculations. It is anticipated that a purpose-made software approach, which calculates such measures while it is processing the sub-graph, would allow this calculation to occur quickly and efficiently within memory.
Calculation Approach
Space Syntax Open Maps GB, a graph of the GB road network, is used for demonstration (see the Appendix). Both measures are visualised using a 16 class equal interval colour map, to ensure that both measures have a roughly equal distribution of colour for each level of the hierarchy.
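For illustration, an equal interval classification of this kind could be sketched as follows. The function name and the `lo` and `hi` bounds are hypothetical: for the rank the bounds are 0 and 1 by construction, whereas for NACH they would have to be taken from the data.

```python
def equal_interval_class(value, lo=0.0, hi=1.0, classes=16):
    """Map a value to a class index 0..classes-1 using equal-width bins
    between lo and hi; values outside the bounds are clamped."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    position = (value - lo) / (hi - lo)
    position = min(max(position, 0.0), 1.0)
    # The top bound belongs to the last class rather than a 17th one.
    return min(int(position * classes), classes - 1)


print(equal_interval_class(0.0))  # 0
print(equal_interval_class(0.5))  # 8
print(equal_interval_class(1.0))  # 15
```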
Due to software, power, and time limitations, the calculation used a square bounding box estimation to determine the sub-graph for each segment and then ranked the choice value within this bounding box using PostgreSQL's own percent_rank function.
Approximating the sub-graph with either a square bounding box or a circular buffer offers a quicker method at the cost of accuracy. It is unclear to what degree these approximations deviate from a 'true' measure of rank, which would be calculated within the actual sub-graph itself.
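The two approximations can be compared with a small in-memory sketch. This is not the PostgreSQL query used for the calculation; it is a hypothetical Python analogue that assumes each segment is reduced to a single point (such as its midpoint) and scans all points by brute force, where a real implementation would use a spatial index.

```python
import math


def rank_in_window(points, measure, radius, circular=False):
    """Rank each segment among the segments whose point falls inside a
    square box of half-width radius, or a circle of that radius."""
    ranks = {}
    for seg, (x, y) in points.items():
        local = []
        for other, (ox, oy) in points.items():
            dx, dy = abs(ox - x), abs(oy - y)
            if circular:
                inside = math.hypot(dx, dy) <= radius
            else:
                inside = dx <= radius and dy <= radius
            if inside:
                local.append(measure[other])
        n = len(local)
        smaller = sum(1 for v in local if v < measure[seg])
        ranks[seg] = smaller / (n - 1) if n > 1 else 0.0
    return ranks


# Hypothetical segment midpoints and choice values.
points = {"a": (0, 0), "b": (1, 0), "c": (1, 1), "d": (5, 5)}
measure = {"a": 10, "b": 30, "c": 20, "d": 99}

print(rank_in_window(points, measure, radius=1))
# {'a': 0.0, 'b': 1.0, 'c': 0.5, 'd': 0.0}
print(rank_in_window(points, measure, radius=1, circular=True))
# {'a': 0.0, 'b': 1.0, 'c': 0.0, 'd': 0.0}
```

Segment c changes rank between the two windows because the corner of the square reaches a, which lies beyond the circular radius. Neither window follows the network itself, so both may include segments that are close in space but far apart in graph distance.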
NACH vs Choice Rank Comparison
It appears that choice rank always provides an equalised definition of local hierarchy, without the localised anomalies of NACH at 2km. This holds true at any location within the GB network.
At the 10km scale, however, NACH seems to have a more exponential distribution of values than choice rank, with a stronger distinction between background and foreground networks. This is because choice rank normalises locally, whereas NACH applies a logarithmic normalisation across the graph as a whole.
The visualisations raise questions concerning distribution and how to best portray hierarchy.
Figure: Visual comparison at 2km and 10km (left: NACH, right: Rank Choice; top: Barnsbury, middle: London, bottom: Peterborough).
Objectives for distribution
Given that the pure choice value is often too unwieldy for direct analysis, transformations are necessary. The question, however, is what the objectives for analysis are when transforming the values.
Should high values be few and lower values many (or vice versa), in order to better distinguish foreground from background (at the cost of local background detail)?
Does a more even ranked distribution represent hierarchy better, allowing a more localised and fine-grained analysis?
Choice values can be normalised and re-distributed in many ways; this is both an opportunity and a risk. There is a line, however, between picking out different dimensions for different purposes and creating inadvertent bias in analysis.
Conclusion
NACH has become a primary form of normalisation that accounts for both the exponential distribution (by logging) and the excessive weight of deep segments (by dividing by depth). However, NACH is prone to localised clusters of high value on relatively isolated non-urban segments.
By ranking graph edges based on their local sub-graph, one can create a scale-free method of analysing and visualising hierarchy that is not sensitive to graph size or composition. At local scales especially, it avoids the local cluster anomalies to which NACH is prone. This scale-free nature creates potential for a consistent approach to both working with and viewing such data.
There are other opportunities for experimentation. Ranking need not be limited to calculation subgraphs or whole graphs; it could use a more flexible boundary, perhaps ranking within a study area or ranking specific segments for comparison. Furthermore, ranking need not be the only way to normalise locally: a minimum-maximum normalisation may also offer new insights when calculated at the sub-graph level.
Given NACH's limitations and the ever larger networks that can be processed, it would be prudent to open up further study into how choice (and other bounded sub-graph) values are normalised.
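As a hedged illustration of the minimum-maximum alternative, the sketch below rescales a value between the lowest and highest values found in its sub-graph. The function name and the choice values are hypothetical.

```python
def local_min_max(value, local_values):
    """Rescale value to 0..1 between the minimum and maximum of the
    values found in its local sub-graph."""
    lo, hi = min(local_values), max(local_values)
    if hi == lo:
        return 0.0  # every local value is identical, so there is no spread
    return (value - lo) / (hi - lo)


# Hypothetical choice values for the segments inside one radius.
local_choice = [3, 10, 25, 40, 900]
print(local_min_max(25, local_choice))   # about 0.025
print(local_min_max(900, local_choice))  # 1.0
```

Unlike a rank, this keeps the shape of the local distribution: the median segment scores about 0.025 rather than 0.5, because the single value of 900 dominates the range.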