Author(s): Wang Mengyao; Liu Qijie; Li Jiaye; Li Tiejian
Keywords: Global-scale river network platform; Cloud-native architecture; Lakehouse; Distributed computing
Abstract: Managing and analyzing global river network data is challenging because the structured data reach terabyte scale and include variable-length data types. Traditional database systems such as MySQL and PostgreSQL are often insufficient for indexing and storing such datasets. This paper introduces the "River Network" platform, a distributed cloud-native system designed to meet complex analytical and query requirements. The architecture uses a hierarchical node topology (Star, Planet, Satellite) to accommodate data centers with differing capabilities and network latencies. By adopting a Lakehouse architecture built on ParadeDB and object storage, the system unifies Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) workflows. The platform also integrates AI-assisted development and replaces traditional Message Passing Interface (MPI) and Java-based GIS stacks with Ray, Dask, and Rust-based cloud-native toolchains, improving computational efficiency and rendering performance.
Year: 2026