# Planetiler Architecture

![Architecture Diagram](diagrams/architecture.png)

Planetiler builds a map in 3 phases:

1. [Process Input Files](#1-process-input-files) according to the [Profile](#profiles) and write vector tile features to intermediate files on disk
2. [Sort Features](#2-sort-features) by tile ID
3. [Emit Vector Tiles](#3-emit-vector-tiles) by iterating through sorted features to group by tile ID, encoding, and writing to the output tile archive

User-defined [profiles](#profiles) customize the behavior of each part of this pipeline.

## 1) Process Input Files

First, Planetiler reads [SourceFeatures](planetiler-core/src/main/java/com/onthegomap/planetiler/reader/SourceFeature.java) from each input source:

- For "simple sources", [NaturalEarthReader](planetiler-core/src/main/java/com/onthegomap/planetiler/reader/NaturalEarthReader.java) or [ShapefileReader](planetiler-core/src/main/java/com/onthegomap/planetiler/reader/ShapefileReader.java) can get the latitude/longitude geometry directly from each feature
- For OpenStreetMap `.osm.pbf` files, [OsmReader](planetiler-core/src/main/java/com/onthegomap/planetiler/reader/osm/OsmReader.java) needs to make 2 passes through the input file to construct feature geometries:
  - pass 1:
    - nodes: store node latitude/longitude locations in-memory or on disk using [LongLongMap](planetiler-core/src/main/java/com/onthegomap/planetiler/collection/LongLongMap.java)
    - ways: nothing
    - relations: call `preprocessOsmRelation` on the [profile](#profiles) and store the information returned for each relation of interest, along with relation member IDs, in-memory using a [LongLongMultimap](planetiler-core/src/main/java/com/onthegomap/planetiler/collection/LongLongMultimap.java)
  - pass 2:
    - nodes: emit a point source feature
    - ways:
      - look up the latitude/longitude for each node ID to get the way geometry, and look up the relations that contain the way
      - emit a source feature with the reconstructed geometry, which can be either a line or a polygon depending on the `area` tag and whether the way is closed
      - if this way is part of a multipolygon, also save the way geometry in-memory for later in a [LongLongMultimap](planetiler-core/src/main/java/com/onthegomap/planetiler/collection/LongLongMultimap.java)
    - relations: for any multipolygon relation, fetch the member geometries and attempt to reconstruct the multipolygon geometry using [OsmMultipolygon](planetiler-core/src/main/java/com/onthegomap/planetiler/reader/osm/OsmMultipolygon.java), then emit a polygon source feature with the reconstructed geometry if successful

Then, for each [SourceFeature](planetiler-core/src/main/java/com/onthegomap/planetiler/reader/SourceFeature.java), generate vector tile features according to the [profile](#profiles) in a worker thread (default 1 per core):

- Call the `processFeature` method on the [profile](#profiles) for each source feature
- For every vector tile feature added to the [FeatureCollector](planetiler-core/src/main/java/com/onthegomap/planetiler/FeatureCollector.java):
  - Call [FeatureRenderer#accept](planetiler-core/src/main/java/com/onthegomap/planetiler/render/FeatureRenderer.java) which, for each zoom level the feature appears in, does the following:
    - Scale the geometry to that zoom level
    - Simplify it in screen pixel coordinates
    - Use [TiledGeometry](planetiler-core/src/main/java/com/onthegomap/planetiler/render/TiledGeometry.java) to slice the geometry into subcomponents that appear in every tile it touches using the stripe clipping algorithm derived from [geojson-vt](https://github.com/mapbox/geojson-vt):
      - `sliceX` splits the geometry into vertical slices for each "column" representing the X coordinate of a vector tile
      - `sliceY` splits each "column" into "rows" representing the Y coordinate of a vector tile
      - Use an [IntRangeSet](planetiler-core/src/main/java/com/onthegomap/planetiler/collection/IntRangeSet.java) to optimize processing for large filled areas (like oceans)
      - If any features wrapped past -180 or 180 degrees longitude, repeat with a 360 or -360 degree offset
    - Reassemble each vector tile geometry and round to tile precision (4096x4096)
      - For polygons, [GeoUtils#snapAndFixPolygon](planetiler-core/src/main/java/com/onthegomap/planetiler/geo/GeoUtils.java) uses [JTS](https://github.com/locationtech/jts) utilities to fix any topology errors (e.g. self-intersections) introduced by rounding. This is very expensive, but necessary since clients like [MapLibre GL JS](https://github.com/maplibre/maplibre-gl-js) produce rendering artifacts for invalid polygons.
  - Encode the feature into a compact binary format using [FeatureGroup#newRenderedFeatureEncoder](planetiler-core/src/main/java/com/onthegomap/planetiler/collection/FeatureGroup.java), consisting of a sortable 64-bit `long` key (zoom, x, y, layer, sort order) and a binary value encoded using [MessagePack](https://msgpack.org/) (feature group/limit, feature ID, geometry type, tags, geometry)
  - Add the encoded feature to a [WorkQueue](planetiler-core/src/main/java/com/onthegomap/planetiler/worker/WorkQueue.java)

Finally, a single-threaded writer reads encoded features off the work queue and writes them to disk using [ExternalMergeSort#add](planetiler-core/src/main/java/com/onthegomap/planetiler/collection/ExternalMergeSort.java):

- Write features to a "chunk" file until that file hits a size limit (e.g. 1GB), then start writing to a new file

## 2) Sort Features

[ExternalMergeSort](planetiler-core/src/main/java/com/onthegomap/planetiler/collection/ExternalMergeSort.java) sorts all of the intermediate features using a worker thread per core:

- Read each "chunk" file into memory
- Sort the features it contains by 64-bit `long` key
- Write the chunk back to disk

## 3) Emit Vector Tiles

[TileArchiveWriter](planetiler-core/src/main/java/com/onthegomap/planetiler/archive/TileArchiveWriter.java) is the main driver.

First, a single-threaded reader reads features from disk:

- [ExternalMergeSort](planetiler-core/src/main/java/com/onthegomap/planetiler/collection/ExternalMergeSort.java) emits sorted features by doing a k-way merge using a priority queue of iterators reading from each sorted chunk
- [FeatureGroup](planetiler-core/src/main/java/com/onthegomap/planetiler/collection/FeatureGroup.java) collects consecutive features in the same tile into a `TileFeatures` instance, dropping features in the same group over the grouping limit to limit point label density
- Then [TileArchiveWriter](planetiler-core/src/main/java/com/onthegomap/planetiler/archive/TileArchiveWriter.java) groups tiles into variable-sized batches for workers to process (complex tiles get their own batch to ensure workers stay busy while the writer thread waits for finished tiles in order)

Then, process tile batches in worker threads (default 1 per core):

- For each tile in the batch, first check whether it has the same contents as the previous tile, to avoid re-encoding the same thing for large filled areas (e.g. oceans)
- Encode using [VectorTile](planetiler-core/src/main/java/com/onthegomap/planetiler/VectorTile.java)
- gzip each encoded tile
- Pass the batch of encoded vector tiles to the writer thread

Finally, a single-threaded writer writes encoded vector tiles to the output archive format:

- For MBTiles, create the largest prepared statement supported by SQLite (999 parameters)
- Iterate through finished vector tile batches until the prepared statement is full, flush to disk, then repeat
- Then flush any remaining tiles at the end

## Profiles

To customize the behavior of this pipeline, custom profiles implement the [Profile](planetiler-core/src/main/java/com/onthegomap/planetiler/Profile.java) interface to override:

- what vector tile features to generate from an input feature
- what information from OpenStreetMap relations we need to save for later use
- how to post-process vector features grouped into a tile before emitting

A Java project can implement this interface and add arbitrarily complex processing when overriding the methods.

The [custommap](planetiler-custommap) project defines a [ConfiguredProfile](planetiler-custommap/src/main/java/com/onthegomap/planetiler/custommap/ConfiguredProfile.java) implementation that loads instructions from a YAML config file to dynamically control how schemas are generated without needing to write or compile Java code.
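
As a rough illustration of the profile idea, the sketch below shows how a profile can decide which tile features to generate from each input feature and then post-process the result. It deliberately does not use Planetiler's real classes: `MiniSourceFeature`, `MiniTileFeature`, `MiniProfile`, `PeakProfile`, and `run` are hypothetical stand-ins invented for this example, and the real `Profile`, `SourceFeature`, and `FeatureCollector` types have richer APIs. Only the `processFeature` hook name is taken from the description above. For simplicity, the driver treats all generated features as if they fell in a single tile.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ProfileSketch {
  // Hypothetical stand-in for an input feature: a geometry kind plus tags.
  record MiniSourceFeature(String geometryType, Map<String, String> tags) {}

  // Hypothetical stand-in for a generated vector tile feature.
  record MiniTileFeature(String layer, int minZoom, Map<String, String> attrs) {}

  // Hypothetical, simplified version of the hooks a profile can override.
  interface MiniProfile {
    // Decide which tile features (if any) to generate from one input feature.
    void processFeature(MiniSourceFeature source, List<MiniTileFeature> out);

    // Optionally rewrite the features grouped into one tile before emitting.
    default List<MiniTileFeature> postProcess(List<MiniTileFeature> items) {
      return items;
    }
  }

  // Example profile: turn peak points into "mountain_peak" features.
  static class PeakProfile implements MiniProfile {
    private static final int MAX_PER_TILE = 2; // arbitrary limit for the example

    @Override
    public void processFeature(MiniSourceFeature source, List<MiniTileFeature> out) {
      boolean isPeak = "point".equals(source.geometryType())
        && "peak".equals(source.tags().get("natural"));
      if (isPeak) {
        String name = source.tags().getOrDefault("name", "");
        out.add(new MiniTileFeature("mountain_peak", 7, Map.of("name", name)));
      }
    }

    @Override
    public List<MiniTileFeature> postProcess(List<MiniTileFeature> items) {
      // Keep only the first MAX_PER_TILE features to limit density.
      return items.size() > MAX_PER_TILE
        ? new ArrayList<>(items.subList(0, MAX_PER_TILE))
        : items;
    }
  }

  // Minimal driver: feed every input feature to the profile, then post-process.
  static List<MiniTileFeature> run(MiniProfile profile, List<MiniSourceFeature> inputs) {
    List<MiniTileFeature> out = new ArrayList<>();
    for (MiniSourceFeature feature : inputs) {
      profile.processFeature(feature, out);
    }
    return profile.postProcess(out);
  }

  public static void main(String[] args) {
    List<MiniSourceFeature> inputs = List.of(
      new MiniSourceFeature("point", Map.of("natural", "peak", "name", "Mount Washington")),
      new MiniSourceFeature("line", Map.of("highway", "primary")));
    System.out.println(run(new PeakProfile(), inputs));
  }
}
```

The point of the sketch is the separation of concerns: the pipeline owns reading, rendering, sorting, and writing, while the profile only answers "what should this input become?" and "how should a finished tile be cleaned up?".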