Introduction

It is customary to preface a work with an explanation of the author's aim, why he wrote the book, and the relationship in which he believes it to stand to other earlier or contemporary treatises on the same subject. In the case of a technical work, however, such an explanation seems not only superfluous but, in view of the nature of the subject matter, even inappropriate and misleading. In this sense, a technical book is similar to a book about anatomy. We are quite sure that we do not as yet possess the subject matter itself, the content of the science, simply by reading around it, but must in addition exert ourselves to know the particulars by examining real cadavers and by performing real experiments. Technical knowledge requires a similar exertion in order to achieve any level of competence.

Besides the reader's desire to be hands-on rather than heads-down, a book about Kinect development offers some additional challenges due to its novelty. The Kinect seemed to arrive ex nihilo in November of 2010, and attempts to interface with the Kinect technology, originally intended only to be used with the Xbox gaming system, began almost immediately. The popularity of these efforts to hack the Kinect appears to have taken even Microsoft unawares.

Several frameworks for interpreting the raw feeds from the Kinect sensor were released prior to Microsoft's official reveal of the Kinect SDK in July of 2011, including libfreenect, developed by the OpenKinect community, and OpenNI, developed primarily by PrimeSense, vendors of one of the key technologies used in the Kinect sensor. The surprising nature of the Kinect's release, as well as Microsoft's apparent failure to anticipate the overwhelming desire on the part of developers, hobbyists and even research scientists to play with the technology, may give the impression that the Kinect SDK is a hodgepodge or even a briefly flickering fad.

The gesture recognition capabilities made affordable by the Kinect, however, have been researched at least since the late 1970s. A brief search on YouTube for the phrase “put that there” will bring up Chris Schmandt's 1979 work with the MIT Media Lab demonstrating key Kinect concepts such as gesture tracking and speech recognition. The influence of Schmandt's work can be seen in Mark Lucente's work with gesture and speech recognition in the 1990s for IBM Research on a project called DreamSpace. These early concepts came together in the central image from Steven Spielberg's 2002 film Minority Report that captured viewers' imaginations concerning what the future should look like. That image was of Tom Cruise waving his arms and manipulating his computer screens without touching either the monitors or any input devices. In the middle of an otherwise dystopic society filled with robotic spiders, ubiquitous marketing and panopticon police surveillance, Steven Spielberg offered us a vision not only of a possible technological future but of a future we wanted.

Although Minority Report was intended as a vision of technology 50 years in the future, the first concept videos for the Kinect, code-named Project Natal, started appearing only seven years after the movie's release. One of the first things people noticed about the technology with respect to its cinematic predecessor was that the Kinect did not require Tom Cruise's three-fingered, blue-lit gloves to function. We had not only caught up to the future as envisioned by Minority Report in record time but had even surpassed it.

The Kinect is new only in the sense that it has recently become affordable and fit for mass production. As pointed out above, it has been anticipated in research circles for over 30 years. The principal concepts of gesture recognition have not changed substantially in that time. Moreover, the cinematic exploration of gesture-recognition devices demonstrates that the technology has succeeded in making a deep connection with people's imaginations, filling a need we did not know we had.

In the near future, readers can expect to see Kinect sensors built into monitors and laptops as gesture-based interfaces gain ground in the marketplace. Over the next few years, Kinect-like technology will begin appearing in retail stores, public buildings, malls and multiple locations in the home. As the hardware improves and becomes ubiquitous, the authors anticipate that the Kinect SDK will become the leading software platform for working with it. Although Microsoft was slow out of the gate with the Kinect SDK, its expertise in platform development, its ownership of the technology, and its intimate experience with the Kinect for game development afford it remarkable advantages over the alternatives. While making predictions about the future of technology has been shown, over the past few years, to be a treacherous endeavor, the authors posit with some confidence that skills gained in developing with the Kinect SDK will not become obsolete in the near future.

Even more important, however, developing with the Kinect SDK is fun in a way that typical development is not. The pleasure of building your first skeleton tracking program is difficult to describe. It is in order to share this ineffable experience (an experience familiar to anyone who still remembers their first software program and became a software developer in the belief that this sense of joy and accomplishment was repeatable) that we have written this book.

About This Book

This book is for the inveterate tinkerer who cannot resist playing with code samples before reading the instructions on why the samples are written the way they are. After all, you bought this book in order to find out how to play with the Kinect sensor and replicate some of the exciting scenarios you may have seen online. We understand if you do not want to wade through detailed explanations before seeing how far you can get with the samples on your own. At the same time, we have included in-depth information about why the Kinect SDK works the way it does, along with guidance on the tricks and pitfalls of working with the SDK. You can always go back and read this information at a later point as it becomes important to you.

The chapters are provided in roughly sequential order, with each chapter building upon the chapters that went before. They begin with the basics, move on to image processing and skeleton tracking, then address more sophisticated scenarios involving complex gestures and speech recognition. Finally, they demonstrate how to combine the SDK with other code libraries in order to build complex effects. The appendix offers an overview of mathematical and kinematic concepts that you will want to become familiar with as you plan out your own unique Kinect applications.

Chapter Overview

Chapter 1: Getting Started

Your imagination is running wild with ideas and cool designs for applications. There are a few things to know first, however. This chapter will cover the surprisingly long history that led up to the creation of the Kinect for Windows SDK. It will then provide step-by-step instructions for downloading and installing the libraries and tools needed to develop applications for the Kinect.

Chapter 2: Application Fundamentals

This chapter guides the reader through the process of building a Kinect application. At the completion of this chapter, the reader will have the foundation needed to write relatively sophisticated Kinect applications using the Microsoft SDK. This includes getting data from the Kinect to display a live image feed as well as a few tricks to manipulate the image stream. The basic code introduced here is common to virtually all Kinect applications.

Chapter 3: Depth Image Processing

The depth stream is at the core of Kinect technology. This code-intensive chapter explains the depth stream in detail: what data the Kinect sensor provides and what can be done with this data. Examples include creating images where users are identified and their silhouettes are colored, as well as simple tricks using the silhouettes to determine the distance of the user from the Kinect and from other users.

Chapter 4: Skeleton Tracking

By using the data from the depth stream, the Microsoft SDK can determine human shapes. This is called skeleton tracking. The reader will learn how to get skeleton tracking data, what that data means and how to use it. At this point, you will know enough to have some fun. Walkthroughs include visually tracking skeleton joints and bones, and creating some basic games.

Chapter 5: Advanced Skeleton Tracking

There is more to skeleton tracking than just creating avatars and skeletons. Sometimes reading and processing raw Kinect data is not enough, because the data can be volatile and unpredictable. This chapter provides tips and tricks to smooth out this data to create more polished applications. In this chapter we will also move beyond the depth image and work with the live image. Using the data produced by the depth image and the visual of the live image, we will work with an augmented reality application.

Chapter 6: Gestures

The next level in Kinect development is processing skeleton tracking data to detect user gestures. Gestures make interacting with your application more natural. In fact, there is a whole field of study dedicated to natural user interfaces (NUI). This chapter will introduce NUI and show how it affects application development. Kinect is so new that well-established gesture libraries and tools are still lacking. This chapter will give guidance to help define what a gesture is and how to implement a basic gesture library.

Chapter 7: Speech

The Kinect is more than just a sensor that sees the world. It also hears it. The Kinect has an array of microphones that allows it to detect and process audio. This means that the user can use voice commands as well as gestures to interact with an application. In this chapter, you will be introduced to the Microsoft Speech Recognition SDK and shown how it is integrated with the Kinect microphone array.

Chapter 8: Beyond the Basics

This chapter introduces the reader to much more complex development that can be done with the Kinect. It addresses useful tools and ways to manipulate depth data to create complex applications and advanced Kinect visuals.

Appendix A: Kinect Math

This appendix covers the basic math skills and formulas needed when working with the Kinect. It provides only the practical information needed for development tasks.

What You Need to Use This Book

The Kinect SDK requires the Microsoft .NET Framework 4.0. To build applications with it, you will need either Visual Studio 2010 Express or another version of Visual Studio 2010. The Kinect SDK may be downloaded at http://www.kinectforwindows.org/download/.

The samples in this book are written with WPF 4 and C#. The Kinect SDK merely provides a way to read and manipulate the sensor streams from the Kinect device. Additional technology is required in order to display this data in interesting ways. For this book we have selected WPF, the preeminent vector graphic platform in the Microsoft stack as well as a platform generally familiar to most developers working with Microsoft technologies. C#, in turn, is the .NET language with the greatest penetration among developers.

About the Code Samples

The code samples in this book have been written for version 1.0 of the Kinect for Windows SDK released on February 1st, 2012. You are invited to copy any of the code and use it as you will, but the authors hope you will actually improve upon it. Book code, after all, is not real code. Each project and snippet found in this book has been selected for its ability to illustrate a point rather than its efficiency in performing a task. Where possible we have attempted to provide best practices for writing performant Kinect code, but whenever good code collided with legible code, legibility tended to win.

More painful to us, given that both authors work for a design agency, was the realization that the book you hold in your hands needed to be about Kinect code rather than about Kinect design. To this end, we have reined in our impulse to build elaborate presentation layers in favor of spare, workmanlike designs.

The source code for the projects described in this book is available for download at http://www.apress.com/9781430241041. This is the official home page of the book. You can also check for errata and find related Apress titles here.
