1.1 Introduction to Heterogeneous Computing
1.4 Concurrency and Parallel Programming Models
1.6 Message-Passing Communication
1.7 Different Grains of Parallelism
1.8 Heterogeneous Computing with OpenCL
Chapter 2: Device architectures
2.3 The Architectural Design Space
Chapter 3: Introduction to OpenCL
3.3 The OpenCL Execution Model
3.4 Kernels and the OpenCL Programming Model
3.6 The OpenCL Runtime with an Example
3.7 Vector Addition Using an OpenCL C++ Wrapper
3.8 OpenCL for CUDA Programmers
Chapter 5: OpenCL runtime and concurrency model
5.1 Commands and the Queuing Model
5.3 The Kernel Execution Domain: Work-Items, Work-Groups, and NDRanges
5.4 Native and Built-In Kernels
Chapter 6: OpenCL host-side memory model
Chapter 7: OpenCL device-side memory model
7.1 Synchronization and Communication
Chapter 8: Dissecting OpenCL on a heterogeneous system
8.1 OpenCL on an AMD FX-8350 CPU
8.2 OpenCL on the AMD Radeon R9 290X GPU
8.3 Memory Performance Considerations in OpenCL
Chapter 9: Case study: Image clustering
9.2 The Feature Histogram on the CPU
Chapter 10: OpenCL profiling and debugging
10.2 Profiling OpenCL Code Using Events
10.5 Analyzing Kernels Using CodeXL
10.6 Debugging OpenCL Kernels Using CodeXL
Chapter 11: Mapping high-level programming languages to OpenCL 2.0: A compiler writer’s perspective
11.2 A Brief Introduction to C++ AMP
11.3 OpenCL 2.0 as a Compiler Target
11.4 Mapping Key C++ AMP Constructs to OpenCL
11.7 How Shared Virtual Memory in OpenCL 2.0 Fits in
11.8 Compiler Support for Tiling in C++ AMP
11.10 Data Movement Optimization
11.11 Binomial Options: A Full Example
Chapter 12: WebCL: Enabling OpenCL acceleration of Web applications
12.4 Interoperability with WebGL
12.8 Status and Future of WebCL