Detecting Different Skeletal Poses Using Kinematics with Kinect Joint Tracking
Introduction
The ability to accurately detect and classify different skeletal poses is crucial in various fields, including human-computer interaction, augmented reality, and sports analysis. One of the most popular devices for capturing human skeletal data is the Kinect, which provides detailed joint position information. This article outlines, step by step, how to use Kinect joint tracking to detect and classify different skeletal poses.
Step 1: Understanding Kinematics and Joint Data
The Kinect v2 captures the x, y, z coordinates of 25 joints per tracked person. By default, these coordinates are expressed relative to the Kinect sensor itself. This means that any movement of the Kinect will change the captured coordinates, making it difficult to compare postures captured at different times or with different sensor positions.
One way to mitigate this issue is by shifting the origin of the coordinate space to the Spine Base. By doing so, all joint values are made invariant to the Kinect's position in the room. This step is crucial as it ensures a consistent reference point for all data points, making subsequent analysis and classification more straightforward.
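The origin shift described above can be sketched as follows. This is a minimal example using NumPy; the joint array here is random dummy data standing in for a real frame from the Kinect SDK or PyKinect2, and the Spine Base index of 0 matches the Kinect v2 joint ordering.

```python
import numpy as np

# Dummy frame of 25 joints, each with (x, y, z) in sensor coordinates.
# In a real pipeline these values would come from the Kinect SDK / PyKinect2.
joints = np.random.rand(25, 3)

SPINE_BASE = 0  # index of the Spine Base joint in the Kinect v2 joint order

def to_spine_base_frame(joints):
    """Translate all joints so the Spine Base becomes the origin."""
    return joints - joints[SPINE_BASE]

relative = to_spine_base_frame(joints)
# The Spine Base is now at (0, 0, 0) regardless of where the sensor stood.
```

Because the shift is a pure translation, distances and angles between joints are unchanged; only the reference point moves.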
Step 2: Collecting Data
To create a comprehensive dataset for training a machine learning model, it is necessary to collect data for as many postures as possible. For this example, let's aim to classify four different postures: Sitting, Standing, Bending, and Lying down. Each of these postures will need to be represented by data collected from at least ten individuals to ensure a robust dataset.
The process involves capturing joint data for each posture and saving it in a structured format. One popular format for such data is CSV (Comma Separated Values). Each row in the CSV file corresponds to a single data point (pose), and each column holds one coordinate of a particular joint. With 25 joints and three coordinates per joint, the CSV file will have 75 feature columns, typically followed by one extra column for the posture label.
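A pose can be flattened into one CSV row like this. The code below is a sketch: the column names (`joint0_x`, etc.) and the in-memory buffer are illustrative choices, not part of any Kinect API, and the dummy pose stands in for real captured joints.

```python
import csv
import io

NUM_JOINTS = 25

def make_header():
    # One x/y/z column per joint: joint0_x, joint0_y, joint0_z, ...
    cols = []
    for j in range(NUM_JOINTS):
        cols += [f"joint{j}_x", f"joint{j}_y", f"joint{j}_z"]
    return cols + ["label"]

def pose_to_row(joints, label):
    """Flatten [(x, y, z), ...] into a single CSV row ending with the label."""
    return [coord for joint in joints for coord in joint] + [label]

# Example: one dummy pose of 25 joints, labelled 1 ("Standing").
pose = [(0.0, 0.0, 0.0)] * NUM_JOINTS
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(make_header())
writer.writerow(pose_to_row(pose, 1))
```

In a real collection session you would open a file instead of a `StringIO` buffer and append one row per captured frame.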
Step 3: Labeling Postures
Once the data has been collected and cleaned, the next step is to label the different postures. This is a crucial step as it helps to train the machine learning model with the correct classifications. In this example, we can assign the following labels:
Sitting: 0, Standing: 1, Bending: 2, Lying: 3. These labels will be used when training a scikit-learn classifier to associate the extracted features (joint coordinates) with their respective postures.
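The label mapping can be kept in a small dictionary; the mapping itself is arbitrary, it only has to stay consistent across the whole dataset.

```python
# Integer encoding for the four postures used in this example.
POSTURE_LABELS = {"Sitting": 0, "Standing": 1, "Bending": 2, "Lying": 3}

def encode(name):
    """Posture name -> integer label for training."""
    return POSTURE_LABELS[name]

def decode(label):
    """Integer label -> posture name, for reporting predictions."""
    inverse = {v: k for k, v in POSTURE_LABELS.items()}
    return inverse[label]
```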
Step 4: Multi-Label Classification
With exactly one label per pose, as in the scheme above, this is a multi-class classification problem: each set of joint coordinates is assigned one of the four postures. If a single pose can legitimately carry more than one label at once, for instance a person bending over who is also leaning slightly to one side, the problem becomes a multi-label one, where a classifier may output a combination of postures for the same set of coordinates.
There are various machine learning techniques that can be used for multi-label classification, including decision trees, random forests, and support vector machines (SVM). Scikit-learn provides these and many other algorithms to choose from, making it a versatile tool for this task.
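A minimal training sketch with one of these algorithms, a random forest, looks like the following. The feature matrix here is synthetic random data standing in for the 75 spine-base-relative coordinates loaded from the CSV file; the hyperparameters are illustrative defaults, not tuned values.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the CSV data: 200 poses x 75 joint coordinates.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 75))
y = rng.integers(0, 4, size=200)  # labels 0..3 for the four postures

# Hold out a quarter of the poses for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
pred = clf.predict(X_test)
```

On random data the accuracy is meaningless; with real joint data you would inspect `clf.score(X_test, y_test)` or a confusion matrix. For true multi-label targets, scikit-learn's `MultiOutputClassifier` wrapper can fit one classifier per label.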
TL;DR
Collect data with joint coordinates as features. Shift the origin to the Spine Base to make the coordinates invariant to the Kinect's position. Collect data from multiple individuals for each posture to ensure robustness. Label each posture appropriately. Then train a scikit-learn classifier on the collected data. With these steps, you can effectively capture, preprocess, and classify skeletal poses using Kinect joint tracking and machine learning. This process not only improves the accuracy of pose detection but also opens up numerous possibilities for applications in various fields.
References
Documentation: Kinect, PyKinect2. Code examples: scikit-learn.