Index

A

Accuracy, 
classification, 361, 364, 397
detection, 80, 104, 125
localization, 4, 68, 87, 236, 253
positioning, 237, 251
Adversarial, 
training, 17
Adversarial variational Bayes (AVB), 13, 17
Adversarially learned inference (ALI), 13, 19
Aerial image scenes, 62
Alarm detection, 344
Amazon Mechanical Turk (AMT), 151
Ambient light sensors, 249
Anchor point relation (APR), 225
Annotated datasets, 280
Application programming interface (API), 246
Architecture, 11, 20, 22, 25, 26, 29, 105, 106, 109, 110, 112, 117, 121, 209, 214–216, 220, 238, 241, 244, 249, 387, 388, 391, 392
baseline, 388
CNNs, 15
FuseNet, 57
fusion, 103, 109, 125, 214, 215
model, 27
multimodal, 31
network, 43–45, 48, 49, 81, 386
optimal, 109
performance, 109
TRPN, 112
Artificial intelligence (AI), 240
Artificial neural network (ANN), 71, 247
Automated drone, 257
Automated localization, 239
Automatic guided vehicles (AGV), 251
Automatic multimodal coding systems, 3
Autonomous driving (AD), 1, 10, 102, 160, 202, 238, 244, 252, 260, 261, 384
Autonomous driving vehicles, 202, 207, 248, 252
Autonomous drone, 192
Autonomous mobile robots (AMR), 251, 262

B

Baseline, 
architecture, 388
SegNet architecture, 60
TRPN, 122
Bayesian fusion, 344
Benchmark datasets, 11, 49
BicycleGAN, 151
BigLittle architecture, 244
Binary class probabilities, 368, 373
Binary classification, 356, 367, 368
Binary classification fusion, 368
Bonnland dataset, 321, 331
Bounding box (BB), 4, 43, 80, 81, 85–87, 92, 93, 95, 96, 98, 102, 103, 112, 113, 115, 117, 118
Broadcast GNSS signal, 219
Building information modeling (BIM), 264
Buildings, 
enhanced detection, 376
key geometric features, 326
Bundle adjustment (BA), 224

C

Camera, 
coordinate system, 210
localization, 81
motion, 160, 162, 173
networks, 254
parameters, 160, 263
sensor, 213
smart, 210
standard, 210
system, 245
tracking, 223
trajectory, 235
Candidate architecture, 242
Cellular networks, 216
Central fusion unit, 216
Centralized architecture, 203, 204, 207–209, 212–215, 219–224, 227–229, 232–235, 240–247, 249, 250
Centralized fusion, 214
CEVA deep neural network (CDNN), 242
Cityscapes dataset, 32, 34, 154
Classes, 
borders, 363
classification, 368
land cover, 359, 369
probabilities, 81, 82, 84, 85, 347, 351, 352, 354, 368
target, 32
Classification, 69, 77, 80–82, 91, 92, 95, 107, 225, 238, 241, 247, 253, 317, 321, 324, 344, 346–349, 354, 356, 358, 361, 363, 364, 367, 369, 370, 372, 375, 376, 387, 391
accuracy, 361, 364, 397
dataset, 6, 281
fusion, 3
land cover, 6, 346, 377
layer, 84, 113
map, 354, 355
margin, 351
methods, 387
models, 107, 365
performance, 357
problem, 357, 365
process, 344, 347
scene, 11, 308, 317, 323, 331, 337
standard, 365
supervised, 365
target, 108, 112
tasks, 67
unsupervised, 345
Classification scores (CS), 112, 115
Classifier, 4, 46, 47, 81, 82, 84, 85, 317, 322, 344–346, 351, 353, 367, 372
CNNs, 80
confidence, 352
head, 48
network, 46
trained, 321
CMOS camera, 210, 263
CNNs, 11, 43–45, 47, 67, 81, 98, 106, 222, 253, 365
architecture, 15
classifier, 80
deep, 42
training, 285
training process, 283
Coarse tracking, 167, 168, 170, 171, 181, 182
Color, 
cameras, 152, 264
image, 4, 75, 136, 138, 139, 144, 146, 147, 149, 151, 152, 154
segmentation images, 146
space fusion, 54
Computation architectures, 202
Computer aided design (CAD), 251
Computer vision (CV), 1–3, 137, 201, 221, 247, 384
Concat fusion, 123
Concat layer, 116
Concatenation, 12, 13, 29, 104, 105, 111, 119, 120, 130, 353, 367
Concatenation fusion, 105, 109, 111
Concatenation fusion scheme, 4, 104
Conditional adversarial networks, 146, 147
Conditional random field (CRF), 344
Conduct multitask learning, 117
Confusion, 372, 376
matrix, 55, 61, 345
Connected components (CC), 323
Consistency loss, 139
Constrain, 10
Constructive Solid Geometry (CSG), 330
Convolutional, 
layers, 30, 45, 46, 48, 55, 62, 69, 73, 75–77, 81, 83, 84, 110–113, 116–118, 147, 388
networks, 44
neural networks, 3–6, 10, 42, 66–68, 70, 75, 79–81, 84, 92, 222, 253, 332, 346, 365
Convolutional accelerators (CA), 246
Convolutional neural network (CNN), 6, 10, 67, 106, 138, 222, 253, 365, 367
Credibilist decision fusion, 344
Crime scene, 256
Cross correlation function (CCF), 225
Cross entropy loss, 285
Cumulative distribution functions (CDF), 315
Cumulative matching characteristic (CMC), 151
Cumulative weight brightness transfer function (CWBTF), 137
Curb detection, 252
Cyber physical systems (CPS), 262
CycleGAN, 21, 22, 145

D

Data fusion, 44, 60, 212, 214, 215, 227, 229, 230, 233, 234, 252, 253, 259, 260, 344, 370
algorithms, 227, 230
detection, 260
filter, 203
multimodal, 11
system, 227
Data modalities, 385
Dataset, 
annotation, 142
augmentation, 154
classification, 6, 281
EuRoC, 231
GTSRB, 81
KAIST, 118
multimodal, 231, 386
pedestrian, 107
statistics, 294
target, 298
traffic sign, 68
Decentralized architecture, 203, 204, 207–209, 212–215, 219–224, 227–229, 232–235, 240–247, 249, 250
Decoder, 10, 13, 14, 19, 24–27, 36, 44, 47
depth, 29
features, 47
functions, 27
networks, 10, 13, 27
Deconvolutional layer, 147
Deep, 
autoencoder architecture, 387
CNNs, 42
convolutional neural networks, 138, 246, 344
convolutional neural networks architecture, 106
learning, 2, 42, 98, 222, 224, 236–238, 241, 245, 253, 266, 280, 281
algorithms, 280
approaches, 222, 384, 387
CNN framework, 3
methods, 49
models, 47, 51
techniques, 280
Deep learning accelerator (DLA), 245
Deep neural network (DNN), 12, 19, 104, 236, 246, 280, 283, 337
Depth, 
branches, 48
channel, 397
cues, 384
data, 49, 57, 58, 62, 384, 392, 398
data stream, 386, 387
decoder, 29
estimation, 11, 29–32, 87
methods, 31
task, 32
feature, 58, 388
feature maps, 391, 392
filtering, 51, 53, 54, 57, 59
fusion, 42, 44, 47, 50, 53
fusion approaches, 57
fusion networks, 42
image, 12, 36, 42, 43, 47, 48, 50, 51, 53, 56, 60, 61, 397
information, 44, 48, 53, 58, 384
map, 26–29, 34, 36, 211, 236, 313, 314
measure, 3
modality, 397
model, 393
network, 387, 393, 395
perception, 383, 384
reconstruction, 10
relative, 384
representation, 48
scene, 47
sensors, 210, 248, 384, 397
slice, 76
stream, 391, 399
stream network, 393, 395
values, 29, 48, 51
variables, 171
Detection, 109, 118, 125, 128, 216, 259, 260, 263, 332, 333, 337, 343, 344, 364, 366, 372, 374
accuracy, 80, 104, 125
algorithm, 244
data fusion, 260
loss term, 115, 119, 123
multimodal pedestrian, 4, 103, 104, 108, 109, 112, 116, 125, 130
pedestrian, 4, 101–103, 105–110, 121, 123, 124, 138, 280
performances, 119, 123, 125
results, 103, 108, 119, 120, 123, 124
target, 105, 112
traffic sign, 81
Differential GNSS, 207, 219, 220
Digital elevation model (DEM), 235, 258
Digital signal processors (DSP), 242
Digital surface models (DSM), 49, 50
Direct memory access (DMA), 244
Direct sparse odometry (DSO), 160, 165
Discriminative features, 4
Discriminator network, 19, 147
Distillation loss, 386, 396
Drone, 163, 187, 189, 190, 200, 202, 212, 242, 247, 251, 256–258
devices, 253
Dropout layer, 69, 73, 75–77, 83, 84
Dynamic vision sensor (DVS), 210

E

eBee drone, 258
Electronic control units (ECU), 261
Encoder, 10, 13, 14, 18, 19, 24–27, 36, 44, 47, 48
Encoder network, 13, 139, 147
Ensemble classifier, 58
Erroneous detection, 363
ETRIMS dataset, 333
Euclidean loss, 387, 391
EuRoC dataset, 5, 160, 163, 186, 189, 191, 193, 196, 255
Evidential fusion rule, 345
Expectation maximization (EM), 309
Extended Kalman filter (EKF), 213, 228

F

Feature, 
fusion, 110, 111, 119, 130
layers, 119
learning, 80, 81, 92, 98, 103, 110
maps, 4, 6, 27, 76, 95, 104, 105, 110, 111, 113, 115–117, 119–121, 123, 130, 387, 390, 394, 396
modalities, 12
Floating diffusion, 208
Floating Point Unit (FPU), 244
Functional safety features, 244
FuseNet, 48, 55, 57, 58, 60, 62
architecture, 57
NRG, 60–62
RGB, 57, 58, 62
Fusion, 372
algorithms, 217, 343
approaches, 49, 62, 343, 349
architecture, 103, 109, 125, 214, 215
classifications, 3, 216
depth, 42, 44, 47, 50, 53
feature, 110, 111, 119, 130
filter, 227, 260
framework, 376
functions, 111, 119
layer, 112
methods, 346, 347, 363, 377
model classification result, 356
multimodal, 2, 4, 103, 105, 110
multimodal feature, 119
pose, 224
problem, 346
process, 3, 358, 362
purpose, 360
result, 350, 353, 367, 368
rules, 6, 344, 345, 347–353, 358, 363, 372, 373, 376, 377
stages, 109
step, 364
strategy, 343
supervised, 353, 372, 376
system, 235
unit, 214
Fuzzy classifier, 346

G

Generative adversarial network (GAN), 4, 5, 13, 15, 16, 19, 22, 24, 136, 138, 145, 147, 154, 236
Generator network, 147
Geometrical features, 222, 234
Global positioning system (GPS), 205
GNSS, 5, 201–203, 205, 216, 218, 220, 226, 232, 233, 235, 237, 249, 252, 265
accuracy, 219
antennas, 206
measurements, 233
position, 207
receiver, 219
satellites, 206
services, 206
signal, 206, 216, 219, 220
spoofing attack, 233
Graphic processing units (GPU), 242
Gravity direction, 173, 174, 182, 185, 192, 193, 196
Ground sampling distance (GSD), 51
Ground truth, 20, 28, 29, 32, 34, 56, 60–62, 91, 93, 96, 137, 187, 238, 254, 266, 347, 351, 353, 361, 363, 364, 367, 369, 370, 372, 374–377
bounding box, 93
labels, 49
traffic sign, 96

H

Halfway fusion, 109, 125, 129
Hallucinated feature maps, 387, 393
Hallucination, 
learning, 387, 393
loss, 392
network, 6, 385, 386, 391–393, 395–397, 399
network learning process, 396
Handcrafted features, 387
Heterogeneous sensor, 201, 247, 248, 253, 266
Heterogeneous sensor data fusion, 212, 216
Heterogeneous sensor fusion, 239
Hidden layer, 69, 71, 73, 75–77, 83, 84
Hierarchical architectures, 203, 204, 207–209, 212–215, 219–224, 227–229, 232–235, 240–247, 249, 250
High dynamic range (HDR), 244
High performance computer (HPC), 246
HOG features, 79, 107
Holographic processing unit (HPU), 250
HS classification, 362, 363
Hyspex sensors, 360

I

Image signal processors (ISP), 242
ImageNet classes, 285, 293
ImageNet dataset, 118, 333
IMU measurements, 162, 172, 173, 232, 260
Independent classifiers, 344
Indian regional navigation satellite system (IRNSS), 206
Indoor, 
datasets, 62
environments, 216, 218, 237
images, 45
images segmentation, 45
images semantic segmentation, 62
localization, 216, 237, 259, 265
navigation, 218
pedestrian navigation, 232
positioning, 220
scene, 51, 57, 62
scene understanding, 211
semantic segmentation, 49
situations, 220
Inertial data fusion, 230
Inertial measurement unit (IMU), 2, 5, 160, 203
Inertial navigation system (INS), 201
Information fusion, 108
Infrared cameras, 136, 253
Infrared sensors, 4
Insufficient training data, 80
Integral channel features (ICF), 106
Integrated circuit (IC), 209
Intellectual property (IP), 241
ISPRS dataset, 43, 51, 53, 58, 62
ISPRS Vaihingen dataset, 58, 60
Iterative dual correspondence (IDC), 225

J

Joint multitask training, 12
Joint supervision, 106

K

KAIST, 
dataset, 108, 118
multimodal dataset, 109
testing dataset, 105, 118, 121, 123, 125
Kalman filter (KF), 227
Keyframe pose, 182
Kinematic GNSS, 207

L

Label decoders, 28
Label encoders, 28
Land cover, 49, 342, 343, 346, 357, 360, 365, 366
classes, 359, 369
classification, 6, 346, 377
fusion scheme, 343
urban, 357
Laser imaging detection, 201
Layer, 45, 46, 71–73, 75–77, 83, 84, 110, 116, 333, 335, 336, 387, 388, 390, 391
classification, 84, 113
fusion, 112
multimodal feature fusion, 110
network, 73
Layer for classification, 106
Learned features, 92, 98
Learning, 4, 13, 15, 67, 68, 75, 125, 130, 235, 238, 247, 253, 266, 281–283, 285, 385, 392, 393, 396, 399
algorithms, 68, 69
deep, 2, 42, 98, 222, 224, 236–238, 241, 245, 253, 266, 280, 281
embedding models, 293
feature, 80, 81, 92, 98, 103, 110
hallucination, 387, 393
machine, 244, 387
multimodal feature, 110
multitask, 10–12, 36, 130
package, 3
paradigm, 385, 391, 392, 399
problems, 24, 236
procedure, 385
process, 68, 69, 387, 392, 393, 396
rate, 29, 74, 119, 285, 394
representations, 6, 286, 398
rule, 73
supervised, 68, 69, 73, 75–77, 83, 84
unsupervised, 68–70, 73, 75–77, 83, 84
Least class confusions, 372
LiDAR, 2, 42, 201, 211–213, 218, 225, 226, 231, 234, 235, 237, 252, 262, 265, 342, 344, 345
measurements, 225, 234
sensors, 5, 202, 211, 212, 216, 225, 237
technology, 256
ToF, 251
Local features, 221, 222
Local networks, 262
Local steering kernel (LSK), 108
Localization, 203, 204, 207–209, 212–215, 219–224, 227–229, 232–235, 240–247, 249, 250
accuracy, 4, 68, 87, 236, 253
algorithms, 249
applications, 247, 253
approaches, 218
camera, 81
error, 96, 97, 218
indoor, 216, 237, 259, 265
maps, 96
methods, 217, 255
multimodal, 5, 202, 218, 221, 226, 227, 231, 237, 239, 248, 255, 259, 261, 265, 266
pedestrian, 202, 255, 259
performance, 96
problem, 223
process, 235, 236
purposes, 202, 222, 231
step, 217
system, 238, 239, 259, 260, 266
techniques, 238, 251–253, 255
Localizing ground penetrating radar (LGPR), 255
Long term evolution (LTE), 249
Loss, 
function, 26, 28, 74, 75, 147, 385, 386, 391, 392, 395
multitask, 103, 105, 117
hallucination, 392
localization, 391
Loss term, 113
detection, 119
for classification, 113
for detection, 112, 115, 123
for scene prediction, 123
segmentation supervision, 117

M

Machine learning, 3, 66, 67, 221, 222, 243, 246, 247, 256, 280
Machine learning performance, 243
Margin loss, 282
Markov random fields (MRF), 347
Masked depth maps, 393
Maximally stable color regions (MSCR), 148
Maximally stable extremal region (MSER), 148, 221
Maximization fusion, 111
Mean intersection over union (MIOU), 30
Megapixel, 
cameras, 258
RGB camera, 258
video camera, 250
Micro aerial vehicle (MAV), 160
MIRFlickr dataset, 295, 298, 304
Misclassifications, 363, 364, 374
Miss rate, 118, 119, 123
Mixed reality, 262
Mobile industry processor interface (MIPI), 245
Modalities, 6, 10, 12, 26, 29, 235, 237, 238, 251, 265, 384–386, 388, 390, 393, 394, 396, 398
Modalities feature, 12
Model, 
architecture, 27
depth, 393
performance, 399
scene prediction, 113
spatiotemporal features, 387
Monocular camera, 162
Monocular video camera, 233
Monomodal localization, 218
Motion features, 388
Motion sensors, 223
Multicore architecture, 246
Multilayer perceptrons, 72
Multimodal, 
aggregated feature, 108
architecture, 31
characteristics, 113
conversion, 36
data, 5, 6, 12, 281, 388
data fusion, 11
deep learning, 2, 13
deep learning techniques, 11
feature, 
fusion, 119
fusion architectures, 4
learning, 110
maps, 103, 110–113, 124
fusion, 2, 4, 103, 105, 110
fusion architectures, 105
image, 5, 19, 103, 105, 108, 113, 116, 118, 121
image generation, 139
image translation, 145
localization, 5, 202, 218, 221, 226, 227, 231, 237, 239, 248, 255, 259, 261, 265, 266
networks, 12
pedestrian, 117, 118, 120, 124
detection, 4, 103, 104, 108, 109, 112, 116, 125, 130
detectors, 104, 105, 109, 130
detectors performance, 122
detectors quantitative performance, 118
performances, 111
reconstruction, 150
retrieval, 5
scene understanding, 2, 3
segmentation, 117, 130
segmentation supervision, 116, 124
architectures, 116, 117, 130
infusion architectures, 104
joint training, 105
networks, 124
semantic segmentation, 3
sensing technology, 108
thermal image generation, 150
Multimodality, 146
Multiple, 
data modalities, 386
layers, 72
modalities, 31
multispectral datasets, 138
sensors, 90, 250
stream architecture, 388
stream networks, 388
Multipurpose applications, 230
Multiscale discriminator architecture, 21
Multispectral, 
datasets, 140, 142
pedestrian detection performance, 4
semantic segmentation, 154
sensors, 368
ThermalWorld dataset, 137, 154
Multistage pipeline, 80
Multitask, 
learning, 10–12, 36, 130
framework, 10
methods, 12
scheme, 106
setting, 21
loss function, 103, 105, 117
network, 10
training, 28

N

Navigation purposes, 205, 218
Network, 
architecture, 43–45, 48, 49, 81, 386
branches, 48, 58
classifier, 46
configurations, 390
depth, 387, 393, 395
distillation, 386
hallucination, 6, 385, 386, 391–393, 395–397, 399
in semantic segmentation, 42
input, 47
interface communication tasks, 241
layer, 73
multitask, 10
optimization, 149
parameters, 49
performance, 50, 56, 77
processing, 386
SegNet, 48
single, 80
structure, 21, 47, 48, 55
Neural networks, 4, 12, 13, 20, 42, 48, 51, 67–73, 75, 81, 82, 84, 91, 98, 242, 244, 286
architecture, 62, 238
classification, 253, 266
convolutional, 3–6, 10, 42, 66–68, 70, 75, 79–81, 84, 92, 222, 253, 332, 346, 365
Neural processing unit (NPU), 242
Next generation mobile network (NGMN), 217
Nighttime scenes, 113, 115, 121–124, 128
Nighttime segmentation prediction, 117
Noisy, 
depth, 397
depth data, 397
thermal images, 108
training data, 283
NRG based network, 61
NTU dataset, 397
NTU RGB, 386, 387, 390, 393, 394, 399

O

Object detection, 2, 4, 11, 44, 67, 79, 91, 92, 98, 106, 222, 236, 384, 400
Obstacle detection, 256
Online localization, 201, 217
Optimal, 
architecture, 109
fusion architecture, 116
segmentation fusion scheme, 125
Optimum performances, 286
Outdoor localization, 218, 232, 238
Outdoor scenes, 149

P

Paired training data, 22
Particle filter (PF), 229
Pavia datasets, 363, 364
Peak performance, 241
Pedestrian, 
dataset, 107
detection, 4, 101–103, 105–110, 121, 123, 124, 138, 280
joint training, 110
learning task, 103
methods, 108
performance, 106
localization, 202, 255, 259
multimodal, 117, 118, 120, 124
Perception sensors, 202, 207, 234, 265
Performance, 
architecture, 109
classification, 357
comparable, 388
comparison, 5, 125, 283, 294
gain, 32, 106, 123, 125, 384
improvements, 36, 294
localization, 96
measure, 152, 397
metrics, 31, 241
model, 399
multimodal pedestrian detection, 119, 124, 125
multitask, 36
network, 50, 56, 77
pedestrian detection, 106
positioning, 220
scene prediction, 121, 122
segmentation supervision, 117, 125
Photometric camera parameters, 167
Pixelwise classification, 333
Pooling layer, 47, 69, 73, 75–77, 81, 83, 84, 113
Portable cameras, 237
Pose, 203, 204, 207–209, 212–215, 219–224, 227–229, 232–235, 240–247, 249, 250
change, 226
estimate, 168, 181
estimation, 6, 182, 192, 221, 225, 246, 256, 263, 308–310, 312–314, 337
fusion, 224
information, 238
relative, 168, 176, 202, 223–225, 309, 310
Position accuracy, 204, 220
Positioning, 
accuracy, 237, 251
indoor, 220
performance, 220
sensors, 201
Posterior class probabilities, 348, 358
Potential relocalization, 263
Power networks, 256
Powerful computing architecture, 266
Pretrained SegNet, 57
Privileged information, 385, 386, 392, 399
learning theories, 385
source, 392
theory, 386
PRO cameras, 139, 140, 144
Probabilistic autoencoder, 17
Professional mapping drone, 257
Programmable vision accelerators (PVA), 244
Pseudo random code (PRC), 206

Q

Quadratic entropy (QE), 349

R

Random forest, 4, 6, 67–69, 73, 75–77, 79, 81–84, 91, 92, 95, 98, 317, 343, 345, 346, 353, 358, 365, 367
Random forest classification, 82, 92
Recognition performance, 91–93, 152
RegDB dataset, 138
Region proposal network (RPN), 112
ReID datasets, 140
ReID performance, 136, 138, 139, 151, 152, 154
Relative, 
accuracy, 258
depth, 384
features, 317
pose, 168, 176, 202, 223–225, 309, 310
pose error, 189
Relocalization, 223
Remote sensing, 1–3, 6, 44, 45, 47, 49, 62, 342, 344, 364
Remote sensing modalities, 6
ResNet layer, 390
RF classifier, 321, 368
RGB, 
feature maps, 393
FuseNet, 57, 58, 62
image, 11, 26–29, 32, 34, 42, 47, 48, 51, 53
image decoders, 28
image encoder, 28
image generation from depth, 28
SegNet, 57, 58
stream networks, 395
Robot localization, 234
Root mean square error (RMSE), 187

S

Safety features, 266
Salient features, 225
Satisfactory accuracy, 363
Scene, 
building, 313
classification, 11, 308, 317, 323, 331, 337
complexity, 317
components, 2
conditions, 115, 121, 122
decomposition, 323, 324
depth, 47
geometry, 5, 47, 236
information, 117
mapping, 202, 255, 256
prediction, 122
model, 113
networks, 113, 121
performance, 121, 122
reconstruction, 2, 234, 308
segmentation, 47
understanding, 2, 10, 201, 202, 249, 384
Scene prediction network (SPN), 113, 121
Segmentation, 11, 12, 42, 47, 116, 117, 125, 141, 147, 344, 347
images, 147
masks supervision learning, 103
multimodal, 117, 130
outputs, 117
prediction, 117
problems, 43
process, 57
scene, 47
semantic, 2, 4, 10–12, 26, 27, 29–33, 41, 42, 81, 109, 116
supervision, 103, 104, 110, 116–118, 124, 125, 130
joint training, 103, 124
loss term, 117
performance, 117, 125
thermal, 146–148
SegNet, 48, 56, 57, 60, 62
HSD, 58
network, 48
NRG, 58, 60
NRG classifier, 61
NRGD, 58
NRGD outperforms, 58
RGB, 57, 58
RGB network, 62
RGBD, 57
RGBN, 57
Semantic, 
feature, 222
feature maps, 110
labels, 10–12, 26, 28–30, 32, 34, 144
scene understanding, 42
segmentation, 2, 4, 10–12, 26, 27, 29–33, 41, 42, 81, 109, 116
Sensors, 
data, 251
data processing, 213
depth, 210, 248, 384, 397
fusion architectures, 214
multiple, 90, 250
multispectral, 368
networks, 216
positioning, 201
specifications, 204
Shape distribution histogram (SDH), 107
Siamese network, 42, 62
Siamese network structures, 48
SIFT features, 79
Simultaneous localization and mapping (SLAM), 5, 201, 243
Single, 
block, 213
channel, 148
color image, 139, 146, 147, 152
exposure, 209
frame, 243
fusion processor, 214
image, 95
image classifier, 42
input color image, 152
instruction, 209
model, 330
network, 80
patch, 91
sample, 79
sensor, 213, 232
Smart, 
camera, 210
glasses, 202, 205, 207, 248, 249, 251
phone, 66–68, 89, 90, 97, 205, 218, 220, 241–243, 248, 249, 253, 254, 259, 260
phone localization application, 254
Social networks, 241
Software Development Kit (SDK), 249
Spaceborne sensors, 342
Spoofed GNSS signal, 233
Synchronous dynamic random access memory (SDRAM), 245
Stanford dataset, 43, 50, 51, 56
Static random access memory (SRAM), 246
Stereo camera, 210, 231, 251, 262
Stereo camera sensor, 251
Stereo thermal images, 107
Stochastic gradient descent (SGD), 18, 55
Student network, 392
Supervised, 
classification, 365
deep learning, 281
fusion, 353, 372, 376
learning, 68, 69, 73, 75–77, 83, 84
learning algorithms, 87
Supervision, 4, 23, 24, 105, 109, 116, 117, 124, 130, 386
Supervision information, 106
Supervision segmentation, 103, 104, 110, 116–118, 124, 125, 130
Support Vector Machine (SVM), 353, 367
SVM classifier, 358, 360
SVM supervised fusions, 372
Synthetic datasets, 363

T

Target, 
classes, 32
classification, 108, 112
dataset, 298
detection, 105, 112
Teacher learning phase, 393
Teacher network, 386, 387, 391, 392
Temperature sensors, 203
Text embedding space, 284, 298, 302
Text embeddings, 5, 281, 283–286, 289, 293, 294, 298, 299, 302, 304
Thermal, 
cameras, 136, 138, 152, 154
images, 107, 108, 121, 122, 136, 138, 139, 142, 144, 145, 148, 149, 151–153
segmentation, 146–148
streams, 111, 120, 121, 123
Toulouse dataset, 359, 363, 364
Traffic signs, 67, 68, 81, 85–89, 91, 92, 94–98
bounding boxes, 91
dataset, 68
detection, 81
Trained, 
classifier, 321
CNN, 284
model, 285
network, 139
TRPN, 112, 113, 116, 119, 123–125, 130
architecture, 112
baseline, 122
feature maps, 125
models, 119, 123

U

Ultrasonic sensors, 253
Underwater localization purposes, 233
Unmanned aerial vehicles (UAV), 2, 202, 233
Unmanned ground vehicle (UGV), 237
Unmatched detections, 118
Unpaired RGB images, 30
Unpooling layers, 47
Unscented Kalman filter (UKF), 213, 229
Unscented transform (UT), 229
Unsupervised, 
classification, 345
feature learning methods, 80
learning, 68–70, 73, 75–77, 83, 84
Urban, 
footprint detection, 365
land cover, 357
land cover classification, 343, 357
scene classification, 317
scene reconstruction, 6, 337
scenes, 102, 222

V

Vaihingen dataset, 50
Vehicle localization, 226
Vehicle pose, 233
VGG network, 44, 45, 47
VHR sensors, 357, 365
Video classification, 390
Video graphics array (VGA), 210
Visible features, 108
Visible pedestrian detection, 105, 106, 109
Vision processing unit (VPU), 246

W

Weak features, 111, 121
WebVision, 287, 293
Weighted parallel iterative closest point (WPICP), 226
Wide video graphics array (WVGA), 231
WiFi communication networks, 216
WiFi localization, 232
Window classification, 107