28 | 28 | "cell_type": "markdown", |
29 | 29 | "metadata": {}, |
30 | 30 | "source": [ |
31 | | - "The `arcgis.learn` module has an efficient point cloud classification model called RandLA-Net <a href=\"#References\">[1]</a>, which can be used to classify a large number of points in a point cloud dataset. In general, point cloud datasets are gathered using LiDAR sensors, which apply a laser beam to sample the earth's surface and generate high-precision x, y, and z points. These points, are known as 'point clouds' and are commonly generated through the use of terrestrial and airborne LiDAR.\n", |
| 31 | + "The `arcgis.learn` module has an efficient point cloud classification model called RandLA-Net <a href=\"#references\">[1]</a>, which can be used to classify a large number of points in a point cloud dataset. In general, point cloud datasets are gathered using LiDAR sensors, which apply a laser beam to sample the earth's surface and generate high-precision x, y, and z points. These points, are known as 'point clouds' and are commonly generated through the use of terrestrial and airborne LiDAR.\n", |
32 | 32 | "\n", |
33 | 33 | "Point clouds are collections of 3D points that carry the location, measured in x, y, and z coordinates. These points also have some additional information like 'GPS timestamps', 'intensity', and 'number of returns'. The intensity represents the returning strength from the laser pulse that scanned the area, and the number of returns shows how many times a given pulse returned. LiDAR data can also be fused with RGB (red, green, and blue) bands, derived from imagery taken simultaneously with the LiDAR survey. \n", |
34 | 34 | "\n", |
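As a quick, hedged illustration of the per-point attributes described in this cell, the snippet below reads a LAS tile with the open-source `laspy` package (an assumption; the notebook itself does not use `laspy`, and inside ArcGIS Pro the same attributes are usually inspected through LAS dataset tools). The file name is hypothetical, and the GPS-time and RGB dimensions are assumed to be present.

```python
# A minimal sketch, assuming the open-source `laspy` (v2.x) package is installed
# and that 'survey.las' is a hypothetical LiDAR tile with GPS-time and RGB attributes.
import laspy

las = laspy.read("survey.las")  # hypothetical file path

# Core geometry: scaled x, y, z coordinates of every return
print(las.x[:5], las.y[:5], las.z[:5])

# Additional per-point attributes discussed above
print(las.intensity[:5])          # strength of the returning laser pulse
print(las.number_of_returns[:5])  # how many returns the emitted pulse produced
print(las.gps_time[:5])           # GPS timestamp of each return

# RGB bands exist only when imagery was fused with the LiDAR survey
if {"red", "green", "blue"} <= set(las.point_format.dimension_names):
    print(las.red[:5], las.green[:5], las.blue[:5])
```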
|
63 | 63 | "source": [ |
64 | 64 | "When it comes to classifying point clouds, deep learning and neural networks are a great choice since they offer a scalable and efficient architecture. They have enormous potential to make manual or semi-assisted classification modes of point clouds a thing of the past. With that in mind, we can take a closer look at the RandLA-Net model included in `arcgis.learn` and how it can be used for point cloud classification.\n", |
65 | 65 | "\n", |
66 | | - "RandLA-Net is a unique architecture that utilizes random sampling and a local feature aggregator to improve efficient learning and semantic segmentation on a large-scale for point clouds. Compared to existing approaches, RandLA-Net is up to 200 times faster and surpasses state-of-the-art benchmarks like Semantic3D and SemanticKITTI. Its effective local feature aggregation approach preserves complex local structures and delivers significant memory and computational gains over other methods <a href=\"#References\">[1]</a>." |
| 66 | + "RandLA-Net is a unique architecture that utilizes random sampling and a local feature aggregator to improve efficient learning and semantic segmentation on a large-scale for point clouds. Compared to existing approaches, RandLA-Net is up to 200 times faster and surpasses state-of-the-art benchmarks like Semantic3D and SemanticKITTI. Its effective local feature aggregation approach preserves complex local structures and delivers significant memory and computational gains over other methods <a href=\"#references\">[1]</a>." |
67 | 67 | ] |
68 | 68 | }, |
69 | 69 | { |
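Before diving into the architecture, a hedged sketch of how the model is typically driven from `arcgis.learn` may help connect the discussion back to the API. The export path, batch size, epoch count, and model name below are placeholders, and the training data is assumed to have been exported beforehand (for example with the Prepare Point Cloud Training Data tool in ArcGIS Pro).

```python
# A minimal sketch of the usual arcgis.learn point cloud workflow; paths and
# hyperparameters are placeholders, not recommendations.
from arcgis.learn import prepare_data, RandLANet

# Point to previously exported point cloud training data (hypothetical path)
data = prepare_data(r"C:\data\pointcloud_training_data",
                    dataset_type="PointCloud",
                    batch_size=2)

# Instantiate the RandLA-Net model on the prepared data bunch
model = RandLANet(data)

# Find a reasonable learning rate, then train for a few epochs
lr = model.lr_find()
model.fit(epochs=10, lr=lr)

# Save the trained model for later inference on unclassified point clouds
model.save("randlanet_10e")
```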
|
83 | 83 | "</center>\n", |
84 | 84 | "</p>\n", |
85 | 85 | "<br>\n", |
86 | | - "<center>Figure 2. The detailed architecture of RandLA-Net. (N, D) represents the number of points and feature dimension respectively. FC: Fully Connected layer, LFA: Local Feature Aggregation, RS: Random Sampling, MLP: shared Multi-Layer Perceptron, US: Up-sampling, DP: Dropout <a href=\"#References\">[1]</a>.</center>" |
| 86 | + "<center>Figure 2. The detailed architecture of RandLA-Net. (N, D) represents the number of points and feature dimension respectively. FC: Fully Connected layer, LFA: Local Feature Aggregation, RS: Random Sampling, MLP: shared Multi-Layer Perceptron, US: Up-sampling, DP: Dropout <a href=\"#references\">[1]</a>.</center>" |
87 | 87 | ] |
88 | 88 | }, |
89 | 89 | { |
|
105 | 105 | "- The final semantic label of each point is predicted by three fully-connected layers, (N, 64) → (N, 32) → (N, n<sub>class</sub>), and a dropout layer. The dropout ratio is 0.5.\n", |
106 | 106 | "\n", |
107 | 107 | "\n", |
108 | | - "- The output of RandLA-Net is the predicted semantics of all points, with a size of N × n<sub>class</sub>, where n<sub>class</sub> is the number of classes <a href=\"#References\">[1]</a>.\n" |
| 108 | + "- The output of RandLA-Net is the predicted semantics of all points, with a size of N × n<sub>class</sub>, where n<sub>class</sub> is the number of classes <a href=\"#references\">[1]</a>.\n" |
109 | 109 | ] |
110 | 110 | }, |
111 | 111 | { |
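To make the shape bookkeeping in the bullets above concrete, here is a minimal PyTorch sketch of just the prediction head they describe: fully connected layers mapping (N, 64) → (N, 32) → (N, n_class), with a 0.5 dropout applied before the last layer. This is an illustration of the idea, not the `arcgis.learn` implementation, and the input feature size and class count are assumptions.

```python
# A minimal PyTorch sketch of the per-point prediction head; feature sizes follow
# the (N, 64) -> (N, 32) -> (N, n_class) description above. Illustrative only.
import torch
import torch.nn as nn

class PredictionHead(nn.Module):
    def __init__(self, in_features: int = 32, n_class: int = 8):
        super().__init__()
        self.fc1 = nn.Linear(in_features, 64)   # (N, in_features) -> (N, 64)
        self.fc2 = nn.Linear(64, 32)             # (N, 64) -> (N, 32)
        self.drop = nn.Dropout(p=0.5)            # dropout ratio of 0.5
        self.fc3 = nn.Linear(32, n_class)        # (N, 32) -> (N, n_class)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.drop(x)
        return self.fc3(x)  # per-point class scores, size N x n_class

# Example: 1,000 decoded point features with 32 channels, 8 target classes (assumed values)
head = PredictionHead(in_features=32, n_class=8)
scores = head(torch.randn(1000, 32))
print(scores.shape)  # torch.Size([1000, 8])
```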
|
125 | 125 | "</center>\n", |
126 | 126 | "</p>\n", |
127 | 127 | "<br>\n", |
128 | | - "<center>Figure 3. RandLA-Net utilizes downsampling of point clouds at each layer, while still preserving important features required for precise classification <a href=\"#References\">[1]</a>.</center>" |
| 128 | + "<center>Figure 3. RandLA-Net utilizes downsampling of point clouds at each layer, while still preserving important features required for precise classification <a href=\"#references\">[1]</a>.</center>" |
129 | 129 | ] |
130 | 130 | }, |
131 | 131 | { |
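The random sampling referenced in Figure 3 is computationally trivial, which is what lets the network scale to large point clouds; a brief illustrative sketch of one downsampling step (keeping a quarter of the points and their features) is shown below. The tensor sizes are arbitrary.

```python
# A minimal sketch of one random downsampling step, purely illustrative of Figure 3.
import torch

points = torch.randn(10_000, 3)     # x, y, z coordinates
features = torch.randn(10_000, 32)  # per-point features from the previous layer

keep = torch.randperm(points.shape[0])[: points.shape[0] // 4]  # keep 25% of points
points_ds, features_ds = points[keep], features[keep]
print(points_ds.shape, features_ds.shape)  # torch.Size([2500, 3]) torch.Size([2500, 32])
```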
|
153 | 153 | "</center>\n", |
154 | 154 | "</p>\n", |
155 | 155 | "<br>\n", |
156 | | - "<center>Figure 4. Illustration of the dilated residual block which significantly increases the receptive field (dotted circle) of each point, colored points represent the aggregated features. L: Local spatial encoding, A: Attentive pooling <a href=\"#References\">[1]</a>.</center>" |
| 156 | + "<center>Figure 4. Illustration of the dilated residual block which significantly increases the receptive field (dotted circle) of each point, colored points represent the aggregated features. L: Local spatial encoding, A: Attentive pooling <a href=\"#references\">[1]</a>.</center>" |
157 | 157 | ] |
158 | 158 | }, |
159 | 159 | { |
|
164 | 164 | "\n", |
165 | 165 | "In an attentive pooling unit, the attention mechanism is used to automatically learn important local features and aggregate neighboring point features while avoiding the loss of crucial information. It also maintains the focus on the overall objective, which is to learn complex local structures in a point cloud by considering the relative importance of neighboring point features.\n", |
166 | 166 | "\n", |
167 | | - "Lastly in the dilated residual block unit, the receptive field is increased for each point by stacking multiple LocSE and Attentive Pooling units. This dilated residual block operates by cheaply dilating the receptive field and expanding the effective neighborhood through feature propagation (see Figure 4). Stacking more and more units enhances the receptive field and makes the block more powerful, which may compromise the overall computation efficiency and lead to overfitting. Hence, in RandLA-Net, two sets of LocSE and Attentive Pooling are stacked as a standard residual block to achieve a balance between efficiency and effectiveness <a href=\"#References\">[1]</a>." |
| 167 | + "Lastly in the dilated residual block unit, the receptive field is increased for each point by stacking multiple LocSE and Attentive Pooling units. This dilated residual block operates by cheaply dilating the receptive field and expanding the effective neighborhood through feature propagation (see Figure 4). Stacking more and more units enhances the receptive field and makes the block more powerful, which may compromise the overall computation efficiency and lead to overfitting. Hence, in RandLA-Net, two sets of LocSE and Attentive Pooling are stacked as a standard residual block to achieve a balance between efficiency and effectiveness <a href=\"#references\">[1]</a>." |
168 | 168 | ] |
169 | 169 | }, |
170 | 170 | { |
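To ground the attentive pooling idea in code, the sketch below scores each neighboring feature with a shared function, converts the scores into attention weights with a softmax, and sums the weighted features into one aggregated feature per point, followed by another shared MLP. It is a minimal illustration of the mechanism under assumed tensor shapes, not the `arcgis.learn` implementation.

```python
# A minimal sketch of attentive pooling over K neighbors, assuming input of shape
# (N, K, D): N points, K neighbors each, D-dimensional features. Illustrative only.
import torch
import torch.nn as nn

class AttentivePooling(nn.Module):
    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        self.score_fn = nn.Linear(d_in, d_in, bias=False)  # shared scoring function
        self.mlp = nn.Linear(d_in, d_out)                   # shared MLP after aggregation

    def forward(self, neighbor_feats: torch.Tensor) -> torch.Tensor:
        # Learn per-neighbor attention weights (softmax over the K neighbors)
        scores = torch.softmax(self.score_fn(neighbor_feats), dim=1)
        # Weighted sum over the K neighbors -> one aggregated feature per point
        agg = torch.sum(scores * neighbor_feats, dim=1)  # (N, D)
        return self.mlp(agg)                             # (N, d_out)

# Example: 1,000 points, 16 neighbors each, 16-dimensional features (assumed sizes)
pool = AttentivePooling(d_in=16, d_out=32)
out = pool(torch.randn(1000, 16, 16))
print(out.shape)  # torch.Size([1000, 32])
```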
|
419 | 419 | ], |
420 | 420 | "metadata": { |
421 | 421 | "kernelspec": { |
422 | | - "display_name": "Python 3 (ipykernel)", |
| 422 | + "display_name": "Python [conda env:conda-arcgispro-py3-clone] *", |
423 | 423 | "language": "python", |
424 | | - "name": "python3" |
| 424 | + "name": "conda-env-conda-arcgispro-py3-clone-py" |
425 | 425 | }, |
426 | 426 | "language_info": { |
427 | 427 | "codemirror_mode": { |
|
433 | 433 | "name": "python", |
434 | 434 | "nbconvert_exporter": "python", |
435 | 435 | "pygments_lexer": "ipython3", |
436 | | - "version": "3.11.10" |
| 436 | + "version": "3.11.11" |
437 | 437 | }, |
438 | 438 | "toc": { |
439 | 439 | "base_numbering": 1, |
|