OTP2 Using European Data Standards

Building with Netex Data

One important new feature of OTP2 is the ability to load Netex data. Netex is a European specification for transit data exchange, comparable in purpose to GTFS but broader in scope. An EU directive aims to have all EU countries sharing Netex data by the end of 2019.

Different countries are currently using different incompatible "profiles" of Netex, but an effort is underway to converge on a single European standard profile. This is based in large part on the Norwegian profile, and Norway's national passenger information and ticketing agency Entur has contributed the OTP2 Netex loading code. Therefore if you'd like to try loading Netex data, Norway is a good place to start.

The Norwegian Netex data can be downloaded from the Entur developer pages. There is a column of Netex download links partway down the page, and the first row is for all of Norway.

Full OSM data for Norway can be downloaded from the Geofabrik Norway downloads page. Get the norway-latest.osm.pbf file, which can then be filtered to remove buildings and other unused data before loading into OTP using a command like the one below. This filtering step can be skipped if you don't have the necessary Osmium tools installed.

osmium tags-filter norway-latest.osm.pbf w/highway w/public_transport=platform w/railway=platform w/park_ride=yes r/type=restriction -o norway-filtered.osm.pbf -f pbf,add_metadata=false,pbf_dense_nodes=true

Be sure to move the original unfiltered file out of your graph inputs directory, or rename it with a suffix such as norway-latest.osm.pbf.ignore. Otherwise OTP2 will try to include both the filtered and unfiltered OSM data in your graph.
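A quick pre-build check can catch a stray unfiltered extract. The sketch below is a hypothetical helper, not part of OTP2, and it assumes that OTP2 picks up any file in the inputs directory whose name ends in .pbf (which is consistent with the .ignore renaming trick described above).

```python
from pathlib import Path


def find_osm_inputs(input_dir):
    """Return the names of files in input_dir that end in .pbf (case-insensitive)."""
    return sorted(
        p.name
        for p in Path(input_dir).iterdir()
        if p.is_file() and p.name.lower().endswith(".pbf")
    )


def check_single_osm_input(input_dir):
    """Raise ValueError if more than one OSM extract would be picked up."""
    found = find_osm_inputs(input_dir)
    if len(found) > 1:
        raise ValueError(f"multiple OSM files in {input_dir}: {found}")
    return found
```

Calling check_single_osm_input("/path/to/graph/inputs") before starting the build returns the list of detected files, or fails early if both the filtered and unfiltered extracts are present.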

The build-config.json for a Norwegian graph using Netex data looks like this:

{
  "areaVisibility": true,
  "parentStopLinking": true,
  "platformEntriesLinking": true,
  "osmWayPropertySet": "norway",
  "islandWithoutStopsMaxSize": 5,
  "islandWithStopsMaxSize": 5,
  "dataImportReport": true,
  "netex" : {
    "moduleFilePattern" : ".*-netex\\.zip",
    "sharedFilePattern" : "_stops.xml",
    "sharedGroupFilePattern" : "_(\\w{3})_shared_data.xml",
    "groupFilePattern" : "(\\w{3})_.*\\.xml",
    "netexFeedId": "RB"
  }
}

Note the special "netex" section. Its regular expression patterns tell OTP2 how to find the different kinds of Netex XML files within the single ZIP archive you downloaded.
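To see what these patterns select, it can help to try them against a few file names. The sketch below uses Python's re module as a stand-in for Java's regex engine (the two agree for these simple patterns). It assumes that a pattern must match the whole entry name and that the shared patterns are tried before the group pattern; both are assumptions about the loader, and the entry names are hypothetical.

```python
import json
import re

# Pattern strings as they appear in build-config.json. Parsing the JSON turns
# each "\\" into a single backslash before the regex engine sees it.
NETEX_JSON = r'''
{
  "sharedFilePattern": "_stops.xml",
  "sharedGroupFilePattern": "_(\\w{3})_shared_data.xml",
  "groupFilePattern": "(\\w{3})_.*\\.xml"
}
'''

PATTERNS = {key: re.compile(value) for key, value in json.loads(NETEX_JSON).items()}


def classify(entry_name):
    """Return (kind, group) for a ZIP entry name; group is None if not captured."""
    if PATTERNS["sharedFilePattern"].fullmatch(entry_name):
        return ("shared", None)
    match = PATTERNS["sharedGroupFilePattern"].fullmatch(entry_name)
    if match:
        return ("shared-group", match.group(1))
    match = PATTERNS["groupFilePattern"].fullmatch(entry_name)
    if match:
        return ("group", match.group(1))
    return ("ignored", None)


# Hypothetical entry names, for illustration only.
for name in ["_stops.xml", "_RUT_shared_data.xml", "RUT_RUT-Line-1.xml", "README.txt"]:
    print(name, classify(name))
```

The three-character capture group (\w{3}) is what ties a group's shared file to the other files with the same prefix.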

Once you have the graph inputs (the OSM PBF file, the Netex ZIP file, and the build-config.json) saved together in a directory, you can instruct OTP2 to build a graph from these inputs:

java -Xmx10G -jar otp2.jar --build /path/to/graph/inputs

This should produce a file graph.obj in the same directory as your inputs. Building this Norway graph takes approximately 16 minutes (without elevation data, as configured above), and can be done within 10GB of heap memory (JVM switch -Xmx10G). Increasing that to 12 or 14GB might speed it up a bit if you have the space. The Graph file it produces is just under 600MB. The server will take about 30 seconds to load this Graph and start up, and will consume about 4GB of heap memory under light use.

You can then start up an OTP server with a command like this:

java -Xmx6G -jar otp2.jar --load /path/to/graph

Once the server has started, go to http://localhost:8080 in a browser to try it out using OTP's built-in testing web client. Try some long trips like Oslo to Bergen and see if you can get long-distance trains and flights as alternatives. You might need to increase the walking limit above its very low default value.

Adding SIRI Realtime Data

Another important feature in OTP2 is the ability to use SIRI realtime data. Within the EU data standards, SIRI is analogous to GTFS-RT: a way to apply realtime updates on top of schedule data. Although SIRI is technically a distinct specification from Netex, both use the Transmodel vocabulary, which allows SIRI messages to reference entities in Netex schedule data. Like GTFS-RT, SIRI is consumed by OTP2 using "graph updaters". These are configured in the router-config.json file, which is placed in the same directory as the graph.obj file and loaded at server startup. A router-config.json with SIRI updaters looks like this:

{
    "updaters": [
        {
            "type": "siri-sx-updater",
            "frequencySec": 60,
            "url": "https://api.example.com/siri",
            "feedId": "siri-sx",
            "blockReadinessUntilInitialized": true
        },
        {
            "type": "siri-et-updater",
            "frequencySec": 20,
            "previewIntervalMinutes": 180,
            "url": "https://api.example.com/siri",
            "feedId": "siri-et",
            "blockReadinessUntilInitialized": true
        },
        {
            "type": "siri-vm-updater",
            "frequencySec": 60,
            "url": "https://api.example.com/siri",
            "feedId": "siri-vm",
            "blockReadinessUntilInitialized": true
        },
        {
            "type": "raptor-transit-layer",
            "updateIntervalSeconds": 20
        }
    ]
}

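The first three updaters fetch three different kinds of SIRI data: Situation Exchange (SX), which carries text notices analogous to GTFS-RT Alerts; Estimated Timetable (ET), which carries predicted arrival and departure times analogous to GTFS-RT TripUpdates; and Vehicle Monitoring (VM), which carries vehicle locations and status analogous to GTFS-RT VehiclePositions.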

These updaters can handle differential updates, but they use a polling approach rather than the message-oriented streaming approach of the GTFS-RT Websocket updater. The SIRI server keeps track of its clients (here, the OTP2 updaters) and sends each one only the things that have changed since that client's last polling operation.
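The polling-with-differential-updates idea can be illustrated with a toy in-memory model. This is only a sketch of the general technique under simplifying assumptions; the class and method names are hypothetical and do not correspond to any real SIRI endpoint or to OTP2 code.

```python
class DeltaFeed:
    """Toy feed that remembers what each polling client has already received."""

    def __init__(self):
        self._version = 0
        self._changes = []    # list of (version, entity_id, payload)
        self._last_seen = {}  # client_id -> feed version at that client's last poll

    def publish(self, entity_id, payload):
        """Record a new or changed entity."""
        self._version += 1
        self._changes.append((self._version, entity_id, payload))

    def poll(self, client_id):
        """Return only the entities changed since this client's previous poll."""
        since = self._last_seen.get(client_id, 0)
        delta = {}
        for version, entity_id, payload in self._changes:
            if version > since:
                delta[entity_id] = payload  # a later change overwrites an earlier one
        self._last_seen[client_id] = self._version
        return delta


feed = DeltaFeed()
feed.publish("trip-1", "2 minutes late")
first = feed.poll("otp-a")    # contains trip-1
second = feed.poll("otp-a")   # empty: nothing changed since the last poll
```

A client that has never polled before receives everything published so far, which corresponds to the initial full load that a newly started updater needs.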

Note that between these SIRI updaters and the GTFS-RT Websocket updater, we now have both polling and streaming examples of GTFS-RT "incrementality" semantics, so we should be able to finalize that part of the specification.

The final updater regularly copies the realtime data into a format suitable for use by OTP2's new Raptor router. Without this updater, the realtime data will be received and cataloged but will not be visible to the router.
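The effect of this periodic copy can be pictured with a small model in which updaters write to one structure and the router reads from a separately published snapshot. This is a conceptual sketch only; the names are hypothetical and the real transit layer is considerably more involved.

```python
import copy


class RealtimeStore:
    """Toy model: updaters write live data, the router reads a published snapshot."""

    def __init__(self):
        self._live = {}      # written by the realtime updaters
        self._snapshot = {}  # read by the router

    def apply_update(self, trip_id, delay_seconds):
        """Record a realtime delay; the router does not see it yet."""
        self._live[trip_id] = delay_seconds

    def publish_snapshot(self):
        """Copy the live data into the router's view (the periodic step)."""
        self._snapshot = copy.deepcopy(self._live)

    def router_view(self):
        """Return the data the router would currently use."""
        return self._snapshot


store = RealtimeStore()
store.apply_update("trip-1", 120)
before = store.router_view()   # still empty: update received but not published
store.publish_snapshot()
after = store.router_view()    # now includes trip-1
```

In this model, never calling publish_snapshot corresponds to omitting the final updater: updates accumulate but the router keeps routing on stale data.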

TODO: explain the use of blockReadinessUntilInitialized with load balancers.