Handling Map Cluster Data

Written by Paul

Introduction

When developing a map-based service, handling pin data is essential, and implementing clustering for multiple pins is often necessary. Key considerations in clustering include:
  • Determining the "distance level" for clustering.
  • Managing clusters based on zoom levels in client SDKs like Mapbox, Kakao Map, or Naver Map.

Model Setup

Using a Prisma schema, the required fields are lat, lng, and geohash, as shown below:
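model Location {
  id      Int    @id @default(autoincrement()) // Prisma models need a unique identifier
  lat     Float  // Float (or Decimal), not Int: coordinates are fractional
  lng     Float
  geohash String
  title   String
  // Additional fields as needed
}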
These fields support querying (via geohash) and clustering (via lat and lng).
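The geohash column has to be filled in when a location is saved. With the ngeohash library used later in this post, that would presumably be a call to ngeohash.encode(lat, lng, precision). To show what the stored value looks like and why prefix queries work, here is a minimal, dependency-free sketch of the standard geohash encoding. The name encodeGeohash is illustrative only and not part of any library.

```typescript
// Geohash base32 alphabet (digits plus letters, without a, i, l, o).
const BASE32 = '0123456789bcdefghjkmnpqrstuvwxyz';

// Illustrative re-implementation of geohash encoding (hypothetical helper).
function encodeGeohash(lat: number, lng: number, precision: number): string {
  let latMin = -90;
  let latMax = 90;
  let lngMin = -180;
  let lngMax = 180;
  let hash = '';
  let bits = 0;
  let bitCount = 0;
  let useLng = true; // bits alternate between longitude and latitude

  while (hash.length < precision) {
    if (useLng) {
      const mid = (lngMin + lngMax) / 2;
      if (lng >= mid) {
        bits = (bits << 1) | 1;
        lngMin = mid;
      } else {
        bits = bits << 1;
        lngMax = mid;
      }
    } else {
      const mid = (latMin + latMax) / 2;
      if (lat >= mid) {
        bits = (bits << 1) | 1;
        latMin = mid;
      } else {
        bits = bits << 1;
        latMax = mid;
      }
    }
    useLng = !useLng;
    bitCount += 1;
    if (bitCount === 5) {
      // Every 5 bits become one base32 character.
      hash += BASE32.charAt(bits);
      bits = 0;
      bitCount = 0;
    }
  }
  return hash;
}

// The coarse cell is a prefix of the fine one for the same point.
const storedHash = encodeGeohash(57.64911, 10.40744, 9);
const coarseCell = encodeGeohash(57.64911, 10.40744, 4); // "u4pr"
```

Because each extra character only subdivides the previous cell, every point inside the cell "u4pr" has a stored hash that begins with "u4pr". That is the property the startsWith query on the server relies on.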

Querying Location Data with Geohash

On the backend, map data is queried, clustered, and sent to the client as an HTTP response. The following outlines how to query location data using geohash.

Client Side

In map SDKs such as Mapbox, Kakao Map, or Naver Map, you can retrieve the bounds of the area currently visible to the user as four values: SouthWestLng, SouthWestLat, NorthEastLng, and NorthEastLat. To convert these bounds into geohashes, install the ngeohash library:
$ yarn add ngeohash
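Using ngeohash, generate every geohash cell that covers these bounds as follows:
import ngeohash from 'ngeohash';
// precision comes from getGeohashPrecision, defined below.
const precision = getGeohashPrecision(zoomLevel);
// Query two levels coarser than that, but never below 1.
const bboxes = ngeohash.bboxes(
  swlat,
  swlng,
  nelat,
  nelng,
  Math.max(precision - 2, 1)
);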
To adjust precision based on zoom level, use the following function:
const getGeohashPrecision = (zoomLevel: number) => {
  if (zoomLevel <= 3) return 2;
  if (zoomLevel <= 6) return 4;
  if (zoomLevel <= 9) return 6;
  if (zoomLevel <= 12) return 7;
  return 8;
};
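As a worked example of the arithmetic, the sketch below tabulates the precision for a few zoom levels together with the coarser precision that actually reaches ngeohash.bboxes (two levels lower, with a floor of 1). The helper getBboxPrecision and the sample zoom levels are illustrative, and the thresholds assume a Mapbox-style zoom where larger numbers mean a closer view.

```typescript
// Same thresholds as the function above.
const getGeohashPrecision = (zoomLevel: number): number => {
  if (zoomLevel <= 3) return 2;
  if (zoomLevel <= 6) return 4;
  if (zoomLevel <= 9) return 6;
  if (zoomLevel <= 12) return 7;
  return 8;
};

// Hypothetical helper: the precision passed to ngeohash.bboxes.
const getBboxPrecision = (zoomLevel: number): number =>
  Math.max(getGeohashPrecision(zoomLevel) - 2, 1);

// zoom 2 -> 2 / 1, zoom 5 -> 4 / 2, zoom 8 -> 6 / 4,
// zoom 11 -> 7 / 5, zoom 15 -> 8 / 6
const precisionTable = [2, 5, 8, 11, 15].map((zoom) => ({
  zoom,
  precision: getGeohashPrecision(zoom),
  bboxPrecision: getBboxPrecision(zoom),
}));
```

Each step down in precision makes the cells larger, so fewer prefixes are needed to cover the viewport and the request stays small.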
The bboxes array now contains all geohash values within the map’s boundaries. Pass these geohashes to the server via HTTP.
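The request itself is not shown in this post. A minimal sketch, assuming the prefixes travel as a single comma-separated geohashes query parameter; the helper name buildLocationsUrl and the /api/locations path are placeholders, not part of the original setup.

```typescript
// Hypothetical helper: builds the request URL from the geohash prefixes.
function buildLocationsUrl(baseUrl: string, geohashes: string[]): string {
  // Drop duplicates so the query string stays as short as possible.
  const unique = geohashes.filter(
    (hash, index) => geohashes.indexOf(hash) === index
  );
  return `${baseUrl}?geohashes=${encodeURIComponent(unique.join(','))}`;
}

const url = buildLocationsUrl('/api/locations', ['wydm', 'wydq', 'wydm']);
// url === '/api/locations?geohashes=wydm%2Cwydq'
```

The resulting URL can be passed to whichever HTTP client the app already uses.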

Server Side

After parsing the geohashes from the query string, use them to query location data as follows:
const data = await database.location.findMany({
  where: {
    OR: geohashes.map((prefix) => ({
      geohash: { startsWith: prefix },
    })),
  },
});
The result will include all location data within the requested geohash regions.
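The parsing step mentioned above is not shown. Since the prefixes come straight from the client, it seems worth validating them before they reach the query. The following is a sketch under assumptions: the parameter is a comma-separated string, and both the name parseGeohashes and the MAX_PREFIXES cap of 200 are arbitrary choices for illustration.

```typescript
// Geohash characters: digits and lowercase letters except a, i, l, o.
const GEOHASH_PATTERN = /^[0-9b-hjkmnp-z]{1,12}$/;

// Arbitrary cap so one request cannot expand into a huge OR clause.
const MAX_PREFIXES = 200;

// Hypothetical helper: turns the raw query value into clean prefixes.
function parseGeohashes(raw: string | undefined): string[] {
  if (!raw) return [];
  const result: string[] = [];
  for (const part of raw.split(',')) {
    const hash = part.trim().toLowerCase();
    // Skip malformed values and duplicates.
    if (!GEOHASH_PATTERN.test(hash) || result.indexOf(hash) !== -1) continue;
    result.push(hash);
    if (result.length >= MAX_PREFIXES) break;
  }
  return result;
}

const geohashes = parseGeohashes('wydm, WYDQ,bad!,wydm,,ailo');
// geohashes is ['wydm', 'wydq']
```

An empty result is a reasonable signal to return an empty response rather than run the query at all.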

Clustering with K-means

For clustering, you can use the ml-kmeans library:
import { kmeans } from 'ml-kmeans';
const limit = 20;
const kmeansData = kmeans(
  data.map((item) => [Number(item.lat), Number(item.lng)]),
  limit,
  {}
);
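The limit value is the number of clusters (K). A fixed value rarely fits every viewport, so replace it with one derived from the data size:
// Fewer than 100 points: one cluster per point, so nothing is merged.
// Otherwise: at most one cluster per requested geohash.
const limit =
  data.length < 100
    ? data.length
    : Math.min(data.length, geohashes.length);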
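One edge case the formula does not cover is an empty result set, where limit becomes 0. As far as I can tell, ml-kmeans rejects a K that is not a positive integer or that exceeds the number of points, so a guard seems prudent. A minimal sketch, with chooseClusterCount as a hypothetical helper that wraps the same rule:

```typescript
// Hypothetical helper: same rule as above, clamped to a usable range.
// Returns 0 when there is nothing to cluster, so the caller can skip kmeans.
function chooseClusterCount(pointCount: number, geohashCount: number): number {
  if (pointCount <= 0) return 0;
  const raw =
    pointCount < 100 ? pointCount : Math.min(pointCount, geohashCount);
  // Keep K between 1 and the number of points.
  return Math.max(1, Math.min(raw, pointCount));
}

const kForSmallSet = chooseClusterCount(50, 3); // 50: one cluster per point
const kForLargeSet = chooseClusterCount(500, 32); // 32: one per geohash
const kForEmptySet = chooseClusterCount(0, 32); // 0: skip clustering
```

With this in place, the call site can return an empty list when the result is 0 and otherwise pass the value to kmeans as before.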
Once clusters are created, you can associate the cluster IDs with the queried data:
// kmeansData.clusters[i] is the cluster index assigned to data[i].
const clusterData = kmeansData.clusters.map((clusterId, index) => ({
  ...data[index],
  clusterId,
}));

Aggregating Clustered Data

Aggregate data for each cluster using reduce:
type ClusterAccumulator = {
  [key: string]: {
    title: string;
    sumLat: number;
    sumLng: number;
    locations: { title: string; lat: number; lng: number }[];
  };
};
const locationPins = clusterData.reduce((acc, cluster) => {
  const { clusterId } = cluster;
  if (!acc[clusterId]) {
    acc[clusterId] = { title: '', sumLat: 0, sumLng: 0, locations: [] };
  }
  // Running sums are divided by the member count later to get the centroid.
  acc[clusterId].sumLat += Number(cluster.lat);
  acc[clusterId].sumLng += Number(cluster.lng);
  acc[clusterId].locations.push({
    title: cluster.title,
    lat: cluster.lat ? Number(cluster.lat) : 0,
    lng: cluster.lng ? Number(cluster.lng) : 0,
  });
  return acc;
}, {} as ClusterAccumulator);
Now, transform the aggregated locationPins into a serialized DTO to return in the response:
return Object.keys(locationPins).map((key) => {
  const { sumLat, sumLng, locations } = locationPins[key];
  return new LocationPinDTO({
    lat: sumLat / locations.length, // centroid latitude
    lng: sumLng / locations.length, // centroid longitude
    count: locations.length,
    locations,
  });
});
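To make the aggregation concrete, here is a small self-contained sketch with sample data. LocationPinDTO is not shown in this post, so the LocationPin interface below is an assumed shape inferred from the fields passed to its constructor; toLocationPins and the sample points are likewise illustrative.

```typescript
interface ClusteredLocation {
  title: string;
  lat: number;
  lng: number;
  clusterId: number;
}

// Assumed response shape, mirroring the fields given to LocationPinDTO.
interface LocationPin {
  lat: number;
  lng: number;
  count: number;
  locations: { title: string; lat: number; lng: number }[];
}

// Groups points by cluster and averages their coordinates.
function toLocationPins(items: ClusteredLocation[]): LocationPin[] {
  const groups: { [clusterId: string]: ClusteredLocation[] } = {};
  for (const item of items) {
    const key = String(item.clusterId);
    if (!groups[key]) groups[key] = [];
    groups[key].push(item);
  }
  return Object.keys(groups).map((key) => {
    const members = groups[key];
    const sumLat = members.reduce((sum, m) => sum + m.lat, 0);
    const sumLng = members.reduce((sum, m) => sum + m.lng, 0);
    return {
      lat: sumLat / members.length,
      lng: sumLng / members.length,
      count: members.length,
      locations: members.map((m) => ({ title: m.title, lat: m.lat, lng: m.lng })),
    };
  });
}

const samplePins = toLocationPins([
  { title: 'Cafe', lat: 37.25, lng: 127.0, clusterId: 0 },
  { title: 'Bakery', lat: 37.75, lng: 127.5, clusterId: 0 },
  { title: 'Harbor', lat: 35.5, lng: 129.0, clusterId: 1 },
]);
// Cluster 0 -> centroid (37.5, 127.25), count 2
// Cluster 1 -> the single point (35.5, 129.0), count 1
```

The centroid is a plain arithmetic mean of the coordinates, which is adequate for the small areas a map viewport usually covers.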

Client Rendering

On the client side, parse the response data and render it on the map. Render a single pin if count is 1; otherwise, render a cluster icon.
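The rendering call differs between Mapbox, Kakao Map, and Naver Map, but the pin-or-cluster decision can be kept SDK-independent. A minimal sketch, assuming the response shape described above; MarkerSpec and toMarkerSpec are hypothetical names, and the actual marker creation is left to the SDK in use.

```typescript
// Assumed response shape from the server.
interface LocationPin {
  lat: number;
  lng: number;
  count: number;
  locations: { title: string; lat: number; lng: number }[];
}

// SDK-neutral description of what to draw.
type MarkerSpec =
  | { kind: 'pin'; lat: number; lng: number; title: string }
  | { kind: 'cluster'; lat: number; lng: number; label: string };

// Hypothetical helper: a single location becomes a pin, anything else a cluster.
function toMarkerSpec(pin: LocationPin): MarkerSpec {
  if (pin.count === 1 && pin.locations.length === 1) {
    const only = pin.locations[0];
    return { kind: 'pin', lat: only.lat, lng: only.lng, title: only.title };
  }
  // Clusters sit at the centroid and show how many locations they contain.
  return { kind: 'cluster', lat: pin.lat, lng: pin.lng, label: String(pin.count) };
}

const singleSpec = toMarkerSpec({
  lat: 35.5,
  lng: 129.0,
  count: 1,
  locations: [{ title: 'Harbor', lat: 35.5, lng: 129.0 }],
});
const clusterSpec = toMarkerSpec({
  lat: 37.5,
  lng: 127.25,
  count: 2,
  locations: [
    { title: 'Cafe', lat: 37.25, lng: 127.0 },
    { title: 'Bakery', lat: 37.75, lng: 127.5 },
  ],
});
```

Each MarkerSpec can then be mapped to the marker or custom overlay type of the chosen SDK.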