Extracting Urban Multilane Road Information from OSM Road Networks Using Random Forests

Date: 2019-12-14

Tags: random forest, OSM, urban lane information extraction

Original paper: Multilane roads extracted from the OpenStreetMap urban road network using random forests. DOI: 10.1111/tgis.12514

https://www.baidu.com/s?wd=%E4%BD%BF%E7%94%A8%E9%9A%8F%E6%9C%BA%E6%A3%AE%E6%9E%97%E5%AE%9E%E7%8E%B0OSM%E8%B7%AF%E7%BD%91%E5%9F%8E%E5%B8%82%E5%A4%9A%E8%BD%A6%E9%81%93%E4%BF%A1%E6%81%AF%E6%8F%90%E5%8F%96&rsv_spt=1&rsv_iqid=0x864e4cd80000ce4f&issp=1&f=8&rsv_bp=1&rsv_idx=2&ie=utf-8&tn=baiduhome_pg&rsv_enter=1&rsv_sug3=68&rsv_sug1=18&rsv_sug7=101&rsv_sug2=0&inputT=36415&rsv_sug4=75202

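OSM-assisted extraction of 3D vector road boundaries from vehicle-borne laser point clouds: https://www.hanspub.org/journal/PaperInformation.aspx?paperID=24702

A first look at osmnx: https://blog.csdn.net/qq_32002189/article/details/88687028

A polygon-based method for extracting arterial roads from urban road networks: http://www.docin.com/p-775607614.html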

1. High-precision 3D road information plays an important role in intelligent transportation, urban planning, and management. A mobile laser scanning system can quickly capture the 3D information of a street scene, but it is difficult to extract a complete and accurate road boundary directly from the raw point cloud because of the large data volume, occlusion, and the complexity of urban street scenes. OpenStreetMap (OSM) is crowdsourced geographic data that can assist road extraction from mobile laser point clouds. This paper proposes a 3D road boundary extraction algorithm that integrates two-dimensional OSM vector data with vehicle-borne laser point cloud data. First, a point cloud feature map is constructed by analyzing the spatial distribution characteristics of the scanning points. OSM provides the initial position, and the road boundary is then extracted from the point cloud feature map with an improved active contour model. Experiments use StreetMapper data. The results show that the proposed algorithm can repair boundary information that is missing because of point cloud defects and can extract 3D road boundary information accurately and completely, which indicates strong robustness and applicability.

2.

3. The volunteered geographic information (VGI) collected in OpenStreetMap (OSM) has been used in many applications. Extracting multilane roads and establishing a high level of expressed detail play important roles in the field of automated cartographic generalization. An accurate and detailed extraction process benefits geographic analysis, urban region division, and road network construction, as well as transportation application services. The road networks in OSM have a high level of detail and complex structures; however, they also include many duplicate lines, which reduce the efficiency and increase the difficulty of extracting multilane roads. To resolve these problems, this work proposes a machine-learning-based approach in which the road networks are first converted from lines to polygons. Then, various geometric descriptors, including compactness, width, circularity, area, perimeter, complexity, parallelism, shape descriptor, and width-to-length ratio, are used to train a random forest (RF) classifier and identify the candidates. Finally, another RF is trained to evaluate the candidates using all the geometric descriptors and topological features; the outputs of this second trained RF are the predicted multilane roads. An experiment using OSM data from Beijing, China validated the proposed method, which performs effectively when extracting multilane roads from OSM.
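The two-stage classification described in the abstract can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the data are synthetic, the nine geometric columns and two topological columns are hypothetical placeholders for the descriptors named above, and the second forest is trained on all training polygons for simplicity, whereas the paper trains it on the candidates.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in data (an assumption, not the paper's Beijing dataset).
n = 400
labels = rng.integers(0, 2, size=n)  # 1 = polygon belongs to a multilane road
# Nine placeholder geometric descriptors; positives are shifted so the toy
# problem is learnable.
geometric = rng.normal(size=(n, 9)) + labels[:, None] * 1.5
# Two hypothetical topological features.
topological = rng.normal(size=(n, 2)) + labels[:, None] * 1.0

train, test = np.arange(0, 300), np.arange(300, n)

# Stage 1: geometric descriptors only, used to pick candidate polygons.
stage1 = RandomForestClassifier(n_estimators=100, random_state=0)
stage1.fit(geometric[train], labels[train])
candidates = test[stage1.predict(geometric[test]) == 1]

# Stage 2: geometric + topological features, applied to the candidates only.
full = np.hstack([geometric, topological])
stage2 = RandomForestClassifier(n_estimators=100, random_state=0)
stage2.fit(full[train], labels[train])
multilane_ids = candidates[stage2.predict(full[candidates]) == 1]

print(f"{len(candidates)} candidates, {len(multilane_ids)} predicted multilane")
```

The point of the second stage is that topological features only make sense once a set of candidate polygons exists, so they enter the pipeline after the geometric filter.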

1 | INTRODUCTION

As information technology has improved, cartography has largely switched from digitization to informatization and has begun to focus on automatic mapping requirements, including multiscale expressions of spatial data in geographic information science (GIS), series scale-map production, updating multiscale geospatial databases, and so on. This process is termed "smart cartography" and has been widely researched (Wang, 2010). One hot research topic is the ability to automatically derive small-scale road networks from large-scale road networks, which form the most important feature on many maps. Multiscale road network cartography lies at the core of many analysis and application studies. As multilane roads play an important role in city road network transportation patterns from the fine to the coarse-grained level, their functional hierarchy is crucial (Heinzle & Anders, 2007; Heinzle, Anders, & Sester, 2006; Zhang, 2004).

In recent years, volunteered geographic information (VGI) such as the OpenStreetMap (OSM) project has been widely used for updating spatial databases, in spatial analysis, and in many other applications (Xu, Chen, Xie, & Wu, 2017) because every user can become a contributor (Goodchild, 2007; Li & Qian, 2010). The development of global positioning system (GPS) devices, which can acquire personal geographical location information (Zou, Yu, & Cao, 2017), has made highly detailed OSM road network data easy to obtain. Multiscale expressions of road networks and the production of multiscale maps have engendered many new research opportunities. Such studies are helpful in studying the automatic synthesis of road networks and in improving the production of map data. The OSM wiki defines a "lanes" tag to specify how many traffic lanes a highway has. However, most road layers lack the tag, and OSM road network data have almost no clear indication of multilane road properties; thus, they are of limited use for research on the functional levels of roads. The goal of this study was to extract multilane roads from OSM urban road networks. This study was undertaken for the following reasons:

1. The multilane roads in an urban road network form a framework for the construction of urban road networks. Generally, the multilane roads in urban road networks have high traffic capacity and represent the urban traffic flow model. Thus, analyzing the traffic flow of multilane roads is very important when constructing urban road networks.

2. High level of detail (LoD) data concerning urban road networks are required when building road network databases. The data quality of multilane roads is directly related to the data quality of road network data at different scales, which affects the quality of multiscale map expression. Therefore, it is important to study the most appropriate way to extract the multilane roads from a road network to establish an application database.

3. The multilane roads of an urban road network play important roles in geographical analysis, traffic analysis, traffic application services, and so on. The multilane road network also plays an important role in building urban road network models, as well as at the function level.
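To make the point about the sparse "lanes" tag concrete, the sketch below counts how many highway ways in an OSM XML extract carry that tag. The XML snippet is a tiny hand-written example, not real data, and `lanes_coverage` is a hypothetical helper name; only the `way`/`tag` element layout with `k` and `v` attributes follows the OSM XML format.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written OSM XML extract (illustrative only).
OSM_XML = """<osm version="0.6">
  <way id="1"><nd ref="1"/><nd ref="2"/>
    <tag k="highway" v="primary"/><tag k="lanes" v="4"/></way>
  <way id="2"><nd ref="2"/><nd ref="3"/>
    <tag k="highway" v="residential"/></way>
  <way id="3"><nd ref="3"/><nd ref="4"/>
    <tag k="highway" v="secondary"/></way>
  <way id="4"><nd ref="5"/><nd ref="6"/>
    <tag k="building" v="yes"/></way>
</osm>"""


def lanes_coverage(xml_text):
    """Return (number of highway ways, number of those with a lanes tag)."""
    root = ET.fromstring(xml_text)
    highways = tagged = 0
    for way in root.iter("way"):
        tags = {t.get("k"): t.get("v") for t in way.findall("tag")}
        if "highway" not in tags:
            continue  # skip buildings and other non-road ways
        highways += 1
        tagged += "lanes" in tags
    return highways, tagged


total, with_lanes = lanes_coverage(OSM_XML)
print(f"{with_lanes} of {total} highway ways carry a lanes tag")
```

Running the same count on a real city extract is one quick way to check how far the tag alone would get you before resorting to a geometric method.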

This study extracted multilane roads from the OSM road network using a random forest (RF)-based method. Most of the multilane roads in a city are represented by multiple lanes, which can be considered as several closed polygons formed by their intersection points. Therefore, multilane roads can be extracted using polygon analysis techniques (Li, Fan, Luan, Yang, & Liu, 2014), and this study proposes a polygon-based intelligent extraction method for the multilane roads of urban road networks. The proposed method uses more effective shape descriptors for circularity, complexity, and compactness to describe the multilane polygons. By combining these shape descriptors with the topological characteristics of polygons between roads, some candidate polygons are evaluated by another trained RF. This method is both highly feasible and introduces no loss of precision, making it a significant step in improving and optimizing road networks.
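As an illustration of why shape descriptors help, the sketch below computes one common compactness measure, the isoperimetric quotient 4πA/P², for two polygons. The paper's exact definitions of compactness and circularity are not given in this excerpt, so this formula is an assumption, and the two example polygons are made up.

```python
import math


def area_and_perimeter(ring):
    """Shoelace area and perimeter of a simple polygon given as (x, y) vertices."""
    twice_area = 0.0
    perimeter = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        twice_area += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(twice_area) / 2.0, perimeter


def compactness(ring):
    """Isoperimetric quotient: 1 for a circle, close to 0 for thin slivers."""
    area, perimeter = area_and_perimeter(ring)
    return 4.0 * math.pi * area / perimeter ** 2


block = [(0, 0), (100, 0), (100, 100), (0, 100)]  # ordinary city block
sliver = [(0, 0), (500, 0), (500, 10), (0, 10)]   # gap between two carriageways

print(round(compactness(block), 3), round(compactness(sliver), 3))
```

The long, thin polygon enclosed by two parallel carriageways scores far lower than an ordinary block, which is the kind of separation a classifier can exploit.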

The remainder of this article is organized as follows. Section 2 provides an overview of prior work related to this study. Section 3 describes in detail the method for extracting multilane roads using the RF. Section 4 describes and discusses the experimental results, and Section 5 presents concluding remarks.

2 | RELATED WORK

Road network synthesis is an important research field in cartography, and considerable research has been conducted on matching, recognizing, and extracting roads (Kuntzsch, Sester, & Brenner, 2016; Volker & Fritsch, 1999; Xiong, 2000). Numerous extraction methods for multilane roads exist, including manual, semi-automated, and fully automated approaches. In the early stage, road-level attributes were used as the extraction metric (Wang, 1994); however, this method is limited by factors such as data quality and data providers' expertise. Some scholars have proposed the concept of a "stroke," which is defined as a road that is connected, unbranched, and coherent; multilane roads can then be selected according to the stroke order (Thomson, 2006; Thomson & Richardson, 1999; Yang, Luan, & Li, 2011). The stroke value can be calculated from multiple attributes such as road length (Chaudhry & Mackaness, 2005), connectivity between strokes (Zhang, 2005), and so on (Jiang & Claramunt, 2004). Indeed, the stroke concept is an effective structural model that allows road network analysis based on the importance of every road path, even without other information (Mackaness, Ruas, & Sarjakoski, 2011). However, the stroke concept does not consider spatial topology; therefore, it can be accurate only at the local level. In recent years, several methods have been proposed for extracting road networks based on their geometric features, topological relations, and spatial distribution characteristics (Guo, Qian, Huang, He, & Liu, 2014; He, Qian, Liu, Wang, & Hu, 2015). Among these, some have introduced intelligent algorithms, including a case study approach (Guo et al., 2014), a method that used the genetic algorithm (Wang & Deng, 2005), and another that used a neural network (Balboa & López, 2008; Zhou & Li, 2014). The case study methodology simplifies the complex extraction process but depends highly on an expert case library. The genetic algorithm is time-consuming, and the genetic model can experience convergence problems. Although the intelligent methods used to analyze road networks each have their own advantages and disadvantages, further research and scientific and technological advances should continue to improve them.
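The stroke idea can be sketched as a junction rule: a segment continues into whichever outgoing segment deviates least from its direction. The code below is an illustrative reading of that idea, not the procedure of any cited paper; the function names and the 45-degree threshold are assumptions.

```python
import math


def heading(seg):
    """Direction of a segment ((x1, y1), (x2, y2)) in radians."""
    (x1, y1), (x2, y2) = seg
    return math.atan2(y2 - y1, x2 - x1)


def deflection_deg(incoming, outgoing):
    """Absolute turning angle in degrees when moving from incoming to outgoing."""
    diff = heading(outgoing) - heading(incoming)
    diff = (diff + math.pi) % (2 * math.pi) - math.pi  # wrap to [-pi, pi)
    return abs(math.degrees(diff))


def best_continuation(incoming, candidates, max_deflection=45.0):
    """Pick the candidate with the smallest deflection, or None if all turn too sharply."""
    if not candidates:
        return None
    best = min(candidates, key=lambda seg: deflection_deg(incoming, seg))
    return best if deflection_deg(incoming, best) <= max_deflection else None


incoming = ((0, 0), (10, 0))
branches = [((10, 0), (10, 10)), ((10, 0), (20, 1)), ((10, 0), (10, -10))]
chosen = best_continuation(incoming, branches)
print(chosen)
```

Because the rule looks only at one junction at a time, it reflects the limitation noted above: the result is consistent locally but takes no account of the wider topology.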

Some studies of multilane road extraction are based on lines: parallel lines in proximity are defined as multilane roads when they exhibit the appropriate angles, lengths, and distances, and they are connected by growing a buffer to generate the road network (Yang et al., 2011; Zhang, 2009). However, because some VGI data are of poor quality, such as the road network data in OSM, it is both time-consuming and error-prone to extract multilane roads using only lines (Li et al., 2014). Fortunately, a new approach based on polygon analysis has been proposed (Li et al., 2014), which converts road lines to polygons to better describe the road network. That study used a support vector machine (SVM) to classify multilane roads. Polygon analysis is a better approach for solving the poor-data problem of VGI data, but it requires capable polygon shape descriptors and an effective method to determine the polygons that represent multilane roads (Li et al., 2014).
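The line-based criterion (similar angle, small separation, comparable length) can be sketched as a pairwise test on straight segments. The thresholds and function names below are illustrative assumptions, not values from the cited studies.

```python
import math


def direction_deg(seg):
    """Undirected orientation of a segment in degrees, in [0, 180)."""
    (x1, y1), (x2, y2) = seg
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0


def angle_between(a, b):
    d = abs(direction_deg(a) - direction_deg(b))
    return min(d, 180.0 - d)


def length(seg):
    (x1, y1), (x2, y2) = seg
    return math.hypot(x2 - x1, y2 - y1)


def point_to_line_distance(p, seg):
    """Perpendicular distance from point p to the infinite line through seg."""
    (x1, y1), (x2, y2) = seg
    return abs((x2 - x1) * (y1 - p[1]) - (x1 - p[0]) * (y2 - y1)) / length(seg)


def looks_like_parallel_pair(a, b, max_angle=10.0, max_gap=30.0, min_len_ratio=0.5):
    """True when two segments are nearly parallel, close together and of similar length."""
    midpoint_b = ((b[0][0] + b[1][0]) / 2.0, (b[0][1] + b[1][1]) / 2.0)
    gap = point_to_line_distance(midpoint_b, a)
    ratio = min(length(a), length(b)) / max(length(a), length(b))
    return angle_between(a, b) <= max_angle and gap <= max_gap and ratio >= min_len_ratio


eastbound = ((0, 0), (100, 0))
westbound = ((100, 12), (0, 12))   # same road, opposite digitizing direction
cross_street = ((0, 0), (0, 100))

print(looks_like_parallel_pair(eastbound, westbound),
      looks_like_parallel_pair(eastbound, cross_street))
```

A test like this has to be run on every nearby pair of lines and is sensitive to digitizing errors, which is part of why the polygon-based approach is attractive for noisy VGI data.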

In contrast to the abovementioned studies, and by taking full advantage of the polygon analysis method, we aim to extract the multilane roads from OSM data using a machine learning RF-based approach. In this study, polygon circularity, parallelism, and width are defined, and shape descriptors are extracted using discrete Fourier transforms. Combined with some other geometric features such as compactness, circularity, perimeter, and complexity, these data form the input to one RF that extracts first-stage candidates. Then, a second RF evaluates the first-stage multilane road candidate polygons to generate the final set of multilane roads. The model is trained using the input dataset by adding the proposed topological relationships, including topological intensity and topological connections based on the candidates.
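The excerpt does not say how the Fourier-based shape descriptors are built, so the following is only one textbook construction, offered as an assumption: resample the polygon boundary at equal arc-length steps, treat the points as complex numbers, take the discrete Fourier transform, and normalize the coefficient magnitudes. All function names and parameter values are hypothetical.

```python
import cmath
import math


def resample(ring, n_points=64):
    """Return n_points spaced equally by arc length along a closed ring.
    Assumes the ring has no repeated consecutive vertices."""
    closed = ring + ring[:1]
    seg_len = [math.hypot(q[0] - p[0], q[1] - p[1]) for p, q in zip(closed, closed[1:])]
    step = sum(seg_len) / n_points
    points, seg, walked = [], 0, 0.0  # walked = arc length at the start of segment seg
    for i in range(n_points):
        target = i * step
        while walked + seg_len[seg] < target:
            walked += seg_len[seg]
            seg += 1
        t = (target - walked) / seg_len[seg]
        (x1, y1), (x2, y2) = closed[seg], closed[seg + 1]
        points.append((x1 + t * (x2 - x1), y1 + t * (y2 - y1)))
    return points


def fourier_descriptors(ring, n_coeffs=4, n_points=64):
    """Normalized DFT magnitudes |c_k| / |c_1| for k = 2 .. n_coeffs + 1."""
    signed_area2 = sum(x1 * y2 - x2 * y1
                       for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]))
    if signed_area2 < 0:  # enforce counterclockwise order so c_1 dominates
        ring = ring[::-1]
    z = [complex(x, y) for x, y in resample(ring, n_points)]
    n = len(z)

    def coeff(k):
        return sum(z[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) / n

    scale = abs(coeff(1))  # dividing by |c_1| removes scale
    # c_0 (the centroid) is skipped, which removes translation;
    # using magnitudes removes rotation.
    return [abs(coeff(k)) / scale for k in range(2, 2 + n_coeffs)]


square = [(0, 0), (100, 0), (100, 100), (0, 100)]
rect = [(0, 0), (200, 0), (200, 20), (0, 20)]
print([round(v, 3) for v in fourier_descriptors(square)])
print([round(v, 3) for v in fourier_descriptors(rect)])
```

The resulting vector is unchanged when the polygon is moved, scaled or rotated, which makes it usable as a classifier feature alongside the scalar descriptors.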

3 | METHODOLOGY

A road network is composed of lines. The complex topological relations between the road segments allow the entire road network to be regarded as a group of polygons. The multilane roads in these road networks always contain some parallel lines; therefore, the polygons that describe multilane roads can be recognized by these features (Figure 1). The approach used in this article attempts to find some geometric and topological descriptors for the polygons; then, the RFs are applied to perform a binary classification of the polygons into multilane roads or non-multilane roads.
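The line-to-polygon step can be sketched with Shapely, assuming that library is available; the excerpt does not say which tool the authors used. The four lines below are a made-up example of a dual carriageway crossed by two side streets.

```python
from shapely.geometry import LineString
from shapely.ops import polygonize, unary_union

# Two parallel carriageways (y = 0 and y = 10) crossed by two side streets.
lines = [
    LineString([(-10, 0), (210, 0)]),
    LineString([(-10, 10), (210, 10)]),
    LineString([(0, -10), (0, 20)]),
    LineString([(200, -10), (200, 20)]),
]

# unary_union splits the lines at their intersections (noding);
# polygonize then builds the faces enclosed by the noded line work.
noded = unary_union(lines)
polygons = list(polygonize(noded))

for poly in polygons:
    print(round(poly.area, 1), round(poly.length, 1))
```

The dangling line ends outside the crossings are ignored, so the only face is the long, thin strip between the two carriageways, which is exactly the kind of polygon the classifier is meant to flag.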

3.1 | Data preprocessing

OSM is a free worldwide vector map dataset created by volunteers from all over the world. Because some volunteers lack professional training, the OSM dataset has several problems in terms of both data quality and data availability. First, some road data are created repeatedly by different volunteers; thus, the OSM data, which lack professional checking, may contain repeated lines. Second, some of the contributions by non-professional volunteers may be incorrect (Goodchild & Li, 2012). For example, there are unreasonable angles between lines, disconnected lines, and even entangled lines. Such configurations cannot exist on the ground. There is no multilane road attribute in the OSM road network data; therefore, we cannot simply extract the multilane roads in the urban road network based on pre-existing attributes. Instead, we must analyze the characteristics of the road network data carefully and perform high-quality processing of the original OSM road network data to obtain processed data that meet the requirements of the method studied in this article.
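One small piece of such preprocessing, dropping lines that were digitized more than once, might look like the sketch below. It is an assumption about how the step could be done, not the authors' procedure: it only catches lines with identical coordinates (in either direction, after rounding), and near-duplicates with slightly shifted vertices would need a distance-based comparison instead.

```python
def canonical_key(coords, precision=6):
    """Direction-independent key for a polyline given as a list of (x, y) tuples."""
    rounded = tuple((round(x, precision), round(y, precision)) for x, y in coords)
    return min(rounded, rounded[::-1])


def drop_repeated_lines(ways):
    """Keep the first occurrence of each polyline and discard exact repeats."""
    seen, kept = set(), []
    for coords in ways:
        key = canonical_key(coords)
        if key not in seen:
            seen.add(key)
            kept.append(coords)
    return kept


ways = [
    [(116.30, 39.90), (116.31, 39.90), (116.32, 39.91)],
    [(116.32, 39.91), (116.31, 39.90), (116.30, 39.90)],  # same line, reversed
    [(116.30, 39.95), (116.31, 39.95)],
]
cleaned = drop_repeated_lines(ways)
print(len(ways), "->", len(cleaned))
```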
