Many diverse applications treat virtual environments as an efficient form of human-computer interaction, and many of them require the interaction to feel natural. Existing interaction methods based on the mouse, keyboard and pen are not sufficient to support natural interaction with the computer. Hence, researchers have long been interested in exploring new and more natural forms of human-computer interaction. Hand-gesture-based methods opened the gateway to such forms. In the beginning, glove-based and sensor-based tracking methods were introduced. These methods were inefficient, had many limitations and were uncomfortable to use. Thus, new methods were sought for a more natural form of human-computer interaction. Direct use of the hand through graphical interfaces has proved to be a better form of natural human-computer interaction.
Computer vision technology and human-computer interaction research have helped in designing and implementing many natural human-computer interaction systems. These include interaction through gestures, speech, facial expressions, etc. A hand gesture recognition system is thus a human-computer interaction system that is capable of recognizing various gestures of a human hand. A gesture can be static or dynamic. Different methods have been proposed and implemented for recognizing hand gestures. Some methods make use of external devices such as gloves, and some make use of skin color to segment the hand features. This research work makes use of the Hidden Markov Model (HMM) along with morphological operations to recognize hand gestures for a natural HCI system. Years of research on Hidden Markov Models have shown that they perform better for pattern recognition than alternatives such as artificial neural networks and fuzzy logic. Thus, this work uses a simple integration of digital image processing techniques and an HMM-based methodology for hand gesture recognition. The methodology is given after the theoretical description below. It first presents the overall system workflow diagram and then explains each step of the diagram and the process undertaken in it. Finally, the experimentation and the results are described.
A gesture is a non-verbal method of communication in which bodily actions are used to convey a message. Gestures are categorized as static or dynamic. In a static gesture, the body part remains in a certain pose or position, which makes the gesture easier to compute. In a dynamic gesture, the body part moves through a sequence of postures, which makes the gesture more complex to compute. As communication evolves, new methods of communication are sought (Chen, Fu, and Huang, 2003). One of the emerging methods is acquiring the information in a message through gesture recognition systems. Existing methods of gesture recognition focus on data gloves, color markers, skin color, movement of the hand or another body part, etc.
Gesture recognition has opened the gates for new methods of human-computer interaction that are even more powerful than graphical user interfaces based on the mouse and keyboard. Gesture recognition can enable a human to interact with the machine without external mechanical devices. It is broadly classified into glove-based gesture recognition and vision-based gesture recognition. Glove-based gesture recognition has the drawback that it reduces naturalness, as it needs accessories in the form of devices to interact with machines. Vision-based gesture recognition instead uses features from a visual image of a body part such as the hand and compares them with the features extracted from the web camera image (Ionescu et al., 2005).
The Gaussian blur filter is a variant of the blur filter. Applied to a selection, the filter blurs only that part of the image, which may result in color leaking from the unblurred area into the blurred area. The Gaussian filter is used to blur an image by removing noise and detail from it. The one-dimensional Gaussian function is defined as:
$$G(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{x^2}{2\sigma^2}}$$
In the above equation, σ is the standard deviation of the distribution. The mean of the distribution is 0. Images require the two-dimensional Gaussian function (Mitra and Acharya, 2007), which is the product of two one-dimensional Gaussian functions, one for each direction, and is thus given by the following equation:
$$G(x, y) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2 + y^2}{2\sigma^2}}$$
Thus, for images, the Gaussian filter uses the two-dimensional distribution function as a point-spread function, and the blurring is achieved by convolving the image with this two-dimensional Gaussian function.
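The convolution described above can be sketched in plain C++ without any image library. This is only an illustrative sketch: the names Image, gaussianKernel and gaussianBlur, the choice of a square kernel of a given radius, and the border handling (repeating the nearest edge pixel) are assumptions, not details taken from the system described in this work.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

using Image = std::vector<std::vector<double>>;  // grayscale, row-major

// Sample the 2-D Gaussian on a (2*radius+1) x (2*radius+1) grid.
// The 1/(2*pi*sigma^2) factor is omitted because the weights are
// renormalized to sum to 1 after sampling.
Image gaussianKernel(int radius, double sigma) {
    assert(radius >= 0 && sigma > 0.0);
    const int size = 2 * radius + 1;
    Image kernel(size, std::vector<double>(size, 0.0));
    double sum = 0.0;
    for (int y = -radius; y <= radius; ++y) {
        for (int x = -radius; x <= radius; ++x) {
            const double w = std::exp(-(x * x + y * y) / (2.0 * sigma * sigma));
            kernel[y + radius][x + radius] = w;
            sum += w;
        }
    }
    for (auto& row : kernel)
        for (double& w : row) w /= sum;
    return kernel;
}

// Convolve the image with the kernel. Pixels outside the image are
// replaced by the nearest border pixel (an assumed border policy).
Image gaussianBlur(const Image& img, int radius, double sigma) {
    const Image kernel = gaussianKernel(radius, sigma);
    const int h = static_cast<int>(img.size());
    const int w = h > 0 ? static_cast<int>(img[0].size()) : 0;
    Image out(h, std::vector<double>(w, 0.0));
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            double acc = 0.0;
            for (int dy = -radius; dy <= radius; ++dy) {
                for (int dx = -radius; dx <= radius; ++dx) {
                    const int ny = std::min(std::max(y + dy, 0), h - 1);
                    const int nx = std::min(std::max(x + dx, 0), w - 1);
                    acc += kernel[dy + radius][dx + radius] * img[ny][nx];
                }
            }
            out[y][x] = acc;
        }
    }
    return out;
}
```

Because the kernel weights sum to 1, a uniformly colored region keeps its value after blurring, while isolated noisy pixels are averaged with their neighbors. A larger σ spreads the weights further from the center and gives a stronger blur.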
Figure 1: a. Image with noise b. Image after filtering (Ng and Ranganath, 2002)
The RGB model contains three primary colors: red, green and blue. It corresponds to the color model used by cathode ray tube displays. These are called the primary colors, and their combination can produce any other desired color.
The RGB color model makes use of a Cartesian coordinate system (Sarkar, Sanyal, and Majumder, 2013). The RGB color model and the Cartesian system are given in the figure below.
The diagonal from (0,0,0), which represents black, to (1,1,1), which represents white, defines the gray scale.
Figure 2: RGB Model
HSV Color Model
The HSV color model represents the hue, saturation and value of an image. The HSV coordinate system is a hexcone, as given in the figure below. The values of the coordinates represent the intensity of a color (Yang and Ahuja, 2001).
The hue value ranges from 0 to 1, starting at 0 for red, passing through yellow, green, cyan, blue and magenta, and returning to red at 1. The saturation value also varies from 0 to 1, from unsaturated shades (shades of grey) at 0 to fully saturated shades (containing no white component) at 1. The brightness value varies from 0 to 1 and corresponds to the light intensity of a color, with 0 the darkest and 1 the brightest.
Figure 3: a. HSV coordinate system b. HSV Color Model
In the figure below, the hue of the point P is measured as the “angle between the line connecting triangle center and RED point and triangle center and P point” (Zhu et al., 2000).
Figure 4: RGB to HSV
For a given point P, the saturation is defined as the distance between the point P and the triangle center, and the intensity of the point P is given as the height of the line which is perpendicular to the triangle and passes through its center (Anzai, 2012). Thus, from the above description, the RGB to HSV conversion formula is given below.
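As an illustration, the following is a minimal sketch of the widely used hexcone RGB-to-HSV conversion, with every channel scaled to the range 0 to 1 to match the value ranges described earlier. It is an assumption that this formulation corresponds to the one intended here; the triangle-based construction described above may differ in detail. The names Hsv and rgbToHsv are illustrative only.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Hsv {
    double h;  // hue in [0, 1): 0 = red, 1/3 = green, 2/3 = blue
    double s;  // saturation in [0, 1]: 0 = grey, 1 = fully saturated
    double v;  // value (brightness) in [0, 1]
};

// Convert one RGB pixel (each channel in [0, 1]) to HSV.
Hsv rgbToHsv(double r, double g, double b) {
    assert(r >= 0.0 && r <= 1.0 && g >= 0.0 && g <= 1.0 && b >= 0.0 && b <= 1.0);
    const double maxc = std::max({r, g, b});
    const double minc = std::min({r, g, b});
    const double delta = maxc - minc;

    Hsv out{0.0, 0.0, maxc};              // value is the largest channel
    if (maxc > 0.0) out.s = delta / maxc; // saturation is relative chroma

    if (delta > 0.0) {                    // hue is undefined for greys; keep 0
        double sector;                    // position on the hexcone, 0..6
        if (maxc == r)      sector = std::fmod((g - b) / delta, 6.0);
        else if (maxc == g) sector = (b - r) / delta + 2.0;
        else                sector = (r - g) / delta + 4.0;
        if (sector < 0.0) sector += 6.0;
        out.h = sector / 6.0;
    }
    return out;
}
```

For example, rgbToHsv(1, 0, 0) gives hue 0 with full saturation and value, while a grey pixel such as (0.5, 0.5, 0.5) has saturation 0. Note that libraries may scale the channels differently, so thresholds chosen with this sketch are not directly transferable.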
Morphological operations are non-linear operations applied to the shape features of an image. These operations rely on the relative ordering of pixel values rather than on their numerical values. They make use of a structuring element, which is a small matrix that is placed at all possible locations of the input image (Chen, Georganas, and Petriu, 2007). The structuring element is combined with the input binary image using set operations such as intersection and union. The morphological operations include erosion, dilation, boundary detection, opening, etc.
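A minimal sketch of erosion, dilation and the derived opening and closing operations on a binary image is given below. It assumes a square structuring element of side 2*radius+1 and ignores positions that fall outside the image; both choices, and all function names, are illustrative assumptions rather than details of the system described in this work.

```cpp
#include <cassert>
#include <vector>

using Binary = std::vector<std::vector<int>>;  // 1 = foreground, 0 = background

// Slide a square structuring element over the image.
// Erosion keeps a pixel only if every covered pixel is foreground;
// dilation sets a pixel if any covered pixel is foreground.
Binary morph(const Binary& img, int radius, bool isErosion) {
    assert(radius >= 0);
    const int h = static_cast<int>(img.size());
    const int w = h > 0 ? static_cast<int>(img[0].size()) : 0;
    Binary out(h, std::vector<int>(w, 0));
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            bool value = isErosion;  // neutral element: true for AND, false for OR
            for (int dy = -radius; dy <= radius; ++dy) {
                for (int dx = -radius; dx <= radius; ++dx) {
                    const int ny = y + dy, nx = x + dx;
                    if (ny < 0 || ny >= h || nx < 0 || nx >= w) continue;
                    const bool on = img[ny][nx] != 0;
                    value = isErosion ? (value && on) : (value || on);
                }
            }
            out[y][x] = value ? 1 : 0;
        }
    }
    return out;
}

Binary erode(const Binary& img, int radius = 1)  { return morph(img, radius, true); }
Binary dilate(const Binary& img, int radius = 1) { return morph(img, radius, false); }

// Opening removes small foreground specks; closing fills small holes.
Binary opening(const Binary& img, int radius = 1) { return dilate(erode(img, radius), radius); }
Binary closing(const Binary& img, int radius = 1) { return erode(dilate(img, radius), radius); }
```

Opening an image removes isolated foreground pixels that are smaller than the structuring element, and closing fills holes of similar size inside a region, which is why the two are often applied one after the other to clean a segmented hand mask.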
Canny edge detection is an edge detection algorithm that uses a multi-stage computation to detect the edges of an image (Lee and Kim, 1999). It was developed by John F. Canny in 1986. The Canny edge detection process is given below:
- First, apply a Gaussian filter to blur the image.
- Compute the intensity gradients of the input image.
- Apply non-maximum suppression to discard spurious points that hinder edge detection.
- Apply a double threshold to the image to determine potential edges.
- Finally, track the edges by hysteresis, suppressing the edges that are weak and not connected to strong edges.
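The last two steps of the list above, the double threshold and the suppression of weak edges that are not connected to strong ones, can be sketched as follows. The sketch assumes that a gradient magnitude map has already been computed and thinned by the earlier steps; the function name hysteresis, the 8-neighbour connectivity and the parameter names low and high are assumptions made for illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Double threshold plus edge tracking by hysteresis.
// Pixels with magnitude >= high are strong edges. Pixels with
// magnitude >= low are kept only if they are connected (8-neighbourhood)
// to a strong edge, directly or through other weak pixels.
std::vector<std::vector<int>> hysteresis(
        const std::vector<std::vector<double>>& magnitude,
        double low, double high) {
    assert(low <= high);
    const int h = static_cast<int>(magnitude.size());
    const int w = h > 0 ? static_cast<int>(magnitude[0].size()) : 0;
    std::vector<std::vector<int>> edges(h, std::vector<int>(w, 0));
    std::vector<std::pair<int, int>> stack;

    // Seed the search with all strong pixels.
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            if (magnitude[y][x] >= high) {
                edges[y][x] = 1;
                stack.emplace_back(y, x);
            }

    // Grow each strong edge into neighbouring weak pixels.
    while (!stack.empty()) {
        const std::pair<int, int> p = stack.back();
        stack.pop_back();
        for (int dy = -1; dy <= 1; ++dy) {
            for (int dx = -1; dx <= 1; ++dx) {
                const int ny = p.first + dy, nx = p.second + dx;
                if (ny < 0 || ny >= h || nx < 0 || nx >= w) continue;
                if (edges[ny][nx] == 0 && magnitude[ny][nx] >= low) {
                    edges[ny][nx] = 1;
                    stack.emplace_back(ny, nx);
                }
            }
        }
    }
    return edges;
}
```

Using two thresholds instead of one lets faint parts of a genuine contour survive as long as they touch a clearly visible part, while isolated faint responses caused by noise are discarded.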
Blob detection is used to detect specific regions of an image that differ from the surrounding image in properties such as brightness or color. Blob detectors are of two types: differential methods, which are based on derivatives of the function with respect to position, and methods based on the local maxima and minima of the function (Murthy and Jadon, 2009).
Consider a person in a room who has three coins and tosses them in a sequence known only to him. The outcomes of the tosses form the observation sequence of the whole event. A person standing outside the room sees the outcomes but does not know which coin produced each of them. If the third coin is biased toward tails and all coins are tossed with equal probability, one can expect more tails than heads. If, however, the probability of moving from the state of the first or second coin to the state of the third coin is zero and the tossing starts with the first or second coin, the third coin is never tossed and its bias no longer affects the outcome. The observed sequence thus depends on the transition probabilities between the coin states and on the initial state (Ramamoorthy et al., 2003). The example therefore involves three sets: the set of individual coin biases, the set of transition probabilities and the set of initial probabilities. These sets form the basis of the Hidden Markov Model.
A Hidden Markov Model is thus defined as a collection of finite states connected by transitions. Each state has two sets of probabilities: a transition probability, and either a discrete or a continuous output probability. The output probability density functions define the conditional probability of producing a given output, either a symbol or a continuous random vector, given the state.
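In the commonly used notation for discrete HMMs, the three sets from the coin example can be written compactly as follows. The symbols below (λ, A, B, π, the states S_i, the state variable q_t, and the observation symbols v_k) are standard textbook notation and are an assumption here, since this text does not name them.

$$\lambda = (A, B, \pi)$$
$$A = \{a_{ij}\}, \quad a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i), \quad \sum_{j=1}^{N} a_{ij} = 1$$
$$B = \{b_j(k)\}, \quad b_j(k) = P(o_t = v_k \mid q_t = S_j), \quad \sum_{k=1}^{M} b_j(k) = 1$$
$$\pi = \{\pi_i\}, \quad \pi_i = P(q_1 = S_i), \quad \sum_{i=1}^{N} \pi_i = 1$$

For the three-coin example, N = 3 (one state per coin) and M = 2 (heads or tails). The matrix B holds the individual coin biases, A holds the probabilities of switching between coins, and π holds the probability of starting with each coin.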
The methodology for the hand gesture recognition system is based on the following figure:
Figure 5: Methodology
The gesture recognition system is capable of identifying meaningful gestures in real time from color image sequences using an HMM. The steps of the methodology are explained below:
Step 1: Acquire image from the video
A video is acquired by the system using a webcam of good resolution. The video is broken into an image sequence at a given interval to give images of the hand.
Step 2: Preprocessing of the image
Since the image acquired from the webcam may be noisy, some preprocessing is essential to make it suitable for hand detection. This includes segmentation, morphological operations, filtering and de-noising of the image.
First, the image is converted from the RGB color model into HSV. Thresholding the HSV values then generates a binary image (Suk, Sin, and Lee, 2008).
In the segmentation process, the foreground and the background of the image are separated by selecting an adequate gray-level threshold. This segmentation separates the hand from the background. Existing research has shown that the Otsu segmentation algorithm gives good results on hand gestures, and hence this algorithm is used for segmentation of the image.
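Otsu's method selects the gray-level threshold that maximizes the variance between the two resulting classes. The following is a minimal sketch for 8-bit gray levels; the function name otsuThreshold and the convention that pixels less than or equal to the threshold form one class are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Return the threshold t in [0, 255] that maximizes the between-class
// variance when pixels <= t form one class and pixels > t the other.
int otsuThreshold(const std::vector<unsigned char>& pixels) {
    assert(!pixels.empty());
    std::array<long long, 256> hist{};  // zero-initialized histogram
    for (unsigned char p : pixels) ++hist[p];

    const double total = static_cast<double>(pixels.size());
    double sumAll = 0.0;
    for (int i = 0; i < 256; ++i) sumAll += static_cast<double>(i) * hist[i];

    double sumLow = 0.0;     // sum of gray levels in the lower class
    double weightLow = 0.0;  // number of pixels in the lower class
    double bestVariance = -1.0;
    int best = 0;
    for (int t = 0; t < 256; ++t) {
        weightLow += static_cast<double>(hist[t]);
        sumLow += static_cast<double>(t) * hist[t];
        if (weightLow == 0.0) continue;  // lower class still empty
        const double weightHigh = total - weightLow;
        if (weightHigh == 0.0) break;    // upper class empty from here on
        const double meanLow = sumLow / weightLow;
        const double meanHigh = (sumAll - sumLow) / weightHigh;
        const double diff = meanLow - meanHigh;
        const double between = weightLow * weightHigh * diff * diff;
        if (between > bestVariance) {
            bestVariance = between;
            best = t;
        }
    }
    return best;
}
```

For an image whose histogram has two well-separated peaks, for instance a hand against a darker background, the returned threshold falls between the peaks, so comparing each pixel with it yields the binary foreground and background mask.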
The image segmented with the Otsu algorithm still has some background pixels with value 1 (background noise) and some hand pixels with value 0 (gesture noise). Therefore, morphological filtering consisting of dilation and erosion is used to obtain a smooth, closed and filtered image (Yeasin and Chaudhuri, 2000).
To locate the hand gesture accurately in the image, blob detection is carried out. This helps in identifying the centroid point, boundary area and bounding box of the hand gesture.
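One simple way to obtain the centroid, area and bounding box from a binary mask is connected-component labeling, keeping the largest component as the hand. This is a substitute technique chosen for illustration, not necessarily the blob detector used in this work; the struct Blob, the function largestBlob and the use of 4-connectivity are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

struct Blob {
    int area = 0;                            // number of foreground pixels
    double centroidX = 0.0, centroidY = 0.0; // mean pixel coordinates
    int minX = 0, minY = 0, maxX = 0, maxY = 0;  // bounding box (inclusive)
};

// Find the largest 4-connected foreground region of a binary image
// (non-zero = foreground). Returns area 0 if there is no foreground.
Blob largestBlob(const std::vector<std::vector<int>>& img) {
    const int h = static_cast<int>(img.size());
    const int w = h > 0 ? static_cast<int>(img[0].size()) : 0;
    for (const auto& row : img) assert(static_cast<int>(row.size()) == w);

    std::vector<std::vector<char>> visited(h, std::vector<char>(w, 0));
    const int stepY[4] = {-1, 1, 0, 0};
    const int stepX[4] = {0, 0, -1, 1};
    Blob best;

    for (int sy = 0; sy < h; ++sy) {
        for (int sx = 0; sx < w; ++sx) {
            if (img[sy][sx] == 0 || visited[sy][sx]) continue;
            // Flood-fill one component starting from (sx, sy).
            Blob cur;
            cur.minX = cur.maxX = sx;
            cur.minY = cur.maxY = sy;
            long long sumX = 0, sumY = 0;
            std::vector<std::pair<int, int>> stack;
            stack.emplace_back(sy, sx);
            visited[sy][sx] = 1;
            while (!stack.empty()) {
                const int y = stack.back().first;
                const int x = stack.back().second;
                stack.pop_back();
                ++cur.area;
                sumX += x;
                sumY += y;
                cur.minX = std::min(cur.minX, x);
                cur.maxX = std::max(cur.maxX, x);
                cur.minY = std::min(cur.minY, y);
                cur.maxY = std::max(cur.maxY, y);
                for (int k = 0; k < 4; ++k) {
                    const int ny = y + stepY[k], nx = x + stepX[k];
                    if (ny < 0 || ny >= h || nx < 0 || nx >= w) continue;
                    if (img[ny][nx] != 0 && !visited[ny][nx]) {
                        visited[ny][nx] = 1;
                        stack.emplace_back(ny, nx);
                    }
                }
            }
            cur.centroidX = static_cast<double>(sumX) / cur.area;
            cur.centroidY = static_cast<double>(sumY) / cur.area;
            if (cur.area > best.area) best = cur;
        }
    }
    return best;
}
```

Keeping only the largest component also discards small residual noise regions that survive the morphological filtering.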
The Canny edge detection algorithm is then applied to smooth the image, eliminate any remaining noise and find the edges of the hand gesture. The result is a black-and-white image in which white lines represent the boundary, or edge, of the hand gesture.
Step 3: Feature extraction
For each binary image of the hand, a Fourier Descriptor (FD) is formed which represents the boundary of the hand. The boundary is represented as the complex sequence $b_k = x_k + j y_k$, where $x_k$ and $y_k$ are the coordinates of the k-th boundary pixel. This sequence is resampled to a fixed length. Thus, for each image a discrete vector is formed that represents features of the image such as centroid, orientation, etc. This discrete vector is given to the HMM as input.
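A Fourier descriptor can be sketched as the discrete Fourier transform of the complex boundary sequence, followed by a normalization step. The normalization used below (dropping the zero-frequency term, dividing by the magnitude of the first coefficient and keeping magnitudes only) is one common choice and an assumption here, as are the names Complex and fourierDescriptor. The sketch also assumes the boundary has already been resampled to a fixed length.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <vector>

using Complex = std::complex<double>;

// Compute `count` normalized Fourier descriptor magnitudes of a closed
// boundary given as complex points b_k = x_k + j*y_k.
// - Coefficient 0 is skipped, which removes the effect of translation.
// - Magnitudes are divided by |F(1)|, which removes the effect of scale.
// - Only magnitudes are kept, which removes rotation and start point.
// Requires |F(1)| > 0, which holds for typical counter-clockwise contours.
std::vector<double> fourierDescriptor(const std::vector<Complex>& boundary, int count) {
    const int n = static_cast<int>(boundary.size());
    assert(n >= 2 && count >= 1 && count < n);
    const double pi = std::acos(-1.0);

    std::vector<double> magnitude(count + 1, 0.0);
    for (int u = 1; u <= count; ++u) {
        Complex sum(0.0, 0.0);
        for (int k = 0; k < n; ++k) {
            const double angle = -2.0 * pi * u * k / n;
            sum += boundary[k] * Complex(std::cos(angle), std::sin(angle));
        }
        magnitude[u] = std::abs(sum);
    }
    assert(magnitude[1] > 0.0);

    std::vector<double> descriptor(count);
    for (int u = 1; u <= count; ++u)
        descriptor[u - 1] = magnitude[u] / magnitude[1];
    return descriptor;
}
```

With this normalization, the same hand shape yields approximately the same descriptor regardless of where it appears in the frame, how large it is or how it is rotated, which is helpful before the vector is quantized for the HMM.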
Step 4: HMM-based gesture recognition
The final step is the classification of the image into the set of already defined hand gestures. The HMM is made up of five stages and is evaluated with the forward algorithm. The discrete input vector is quantized (Zhu, Xu and Kriegman, 2002). The HMM is trained using the Baum–Welch re-estimation formulas. The training identifies the initial probability vector, the transition probability matrix and the observation probability matrix. In training, the discrete vectors are used to construct a gesture database that contains isolated hand gestures.
In this work, the system is designed to recognize gestures by zero-codeword detection, where each gesture ending with a line segment is assigned a zero codeword. After the training process, the forward algorithm is applied. This algorithm computes the probability of the input discrete vector sequence for the HMM topology. This yields the gesture path corresponding to the maximal likelihood among the database gestures. The gesture with the maximum likelihood is selected as the recognized hand gesture (Sarkar, Sanyal and Majumder, 2013).
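The recognition step can be sketched as follows: the forward algorithm evaluates the likelihood of the quantized observation sequence under each trained gesture model, and the model with the highest likelihood is chosen. The struct Hmm and the functions forwardProbability and classify are illustrative names, and the sketch assumes the model parameters have already been trained. It works with plain probabilities, so long sequences would need scaling or log-probabilities to avoid underflow.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Hmm {
    std::vector<double> initial;                  // pi[i]
    std::vector<std::vector<double>> transition;  // A[i][j]
    std::vector<std::vector<double>> emission;    // B[i][symbol]
};

// Forward algorithm: probability of the observation sequence given the model.
double forwardProbability(const Hmm& model, const std::vector<int>& obs) {
    assert(!obs.empty());
    const std::size_t n = model.initial.size();
    assert(model.transition.size() == n && model.emission.size() == n);

    // Initialization: alpha_1(i) = pi_i * b_i(o_1)
    std::vector<double> alpha(n);
    for (std::size_t i = 0; i < n; ++i) {
        assert(obs[0] >= 0 && static_cast<std::size_t>(obs[0]) < model.emission[i].size());
        alpha[i] = model.initial[i] * model.emission[i][obs[0]];
    }

    // Induction: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(o_{t+1})
    for (std::size_t t = 1; t < obs.size(); ++t) {
        std::vector<double> next(n, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            assert(obs[t] >= 0 && static_cast<std::size_t>(obs[t]) < model.emission[j].size());
            double sum = 0.0;
            for (std::size_t i = 0; i < n; ++i)
                sum += alpha[i] * model.transition[i][j];
            next[j] = sum * model.emission[j][obs[t]];
        }
        alpha = next;
    }

    // Termination: sum over all final states.
    double total = 0.0;
    for (double a : alpha) total += a;
    return total;
}

// Return the index of the gesture model with the highest likelihood.
int classify(const std::vector<Hmm>& models, const std::vector<int>& obs) {
    assert(!models.empty());
    int best = 0;
    double bestProbability = forwardProbability(models[0], obs);
    for (std::size_t m = 1; m < models.size(); ++m) {
        const double p = forwardProbability(models[m], obs);
        if (p > bestProbability) {
            bestProbability = p;
            best = static_cast<int>(m);
        }
    }
    return best;
}
```

In this arrangement each gesture in the database has its own model, and the index returned by classify identifies the recognized gesture.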
Using the above methodology, the hand gesture recognition system is implemented in C++ on the Visual Studio platform. OpenCV libraries are used to provide the digital image processing functions in C++.
First, the hand movements are captured via a video camera or the web camera of the laptop. This is done using an OpenCV VideoCapture object, declared as VideoCapture cap. Once a video is captured, it is divided into snapshots to obtain images of the hand. A good image is selected and converted from the existing RGB format into HSV using the COLOR_BGR2HSV conversion code (Mitra and Acharya, 2007). Using image segmentation, the foreground and background are separated. The skin color is determined using the HSV values, and the image is converted to binary form for segmentation. The edges of the hand are determined using the edge detection algorithm, and feature extraction then converts the image into a vector. After that, the HMM is used to generate close templates to identify gestures. The above algorithm is executed in the C++ based environment.
The following figures display the results of hand gesture recognition using the designed system.
Figure 6: HSV Control
Figure 7: a. Original image b. HSV image
Figure 8: a. Edge detection output b. Threshold image
Research on gesture-based recognition systems has increased in recent years, and new techniques and methods are continually being developed to enhance the accuracy of the overall system. This work implements a hand gesture recognition system for a given video of hand gesture movement. It uses image segmentation to separate the foreground and background of the image and then carries out morphological filtering to remove noise and enhance the image (Chen, Georganas, and Petriu, 2007). The edges of the hand gesture are then detected, and the boundary is transformed into a discrete vector. This vector is given to the HMM, which classifies the hand gesture based on the previously created training database. The program was able to recognize hand gestures, and the accuracy of the system is estimated to be about 92%.
Alsheakhali, M., Skaik, A., Aldahdouh, M. and Alhelou, M., 2011. Hand Gesture Recognition System. Information & Communication Systems, 132.
Anzai, Y., 2012. Pattern Recognition & Machine Learning. Elsevier.
Chen, F.S., Fu, C.M. and Huang, C.L., 2003. Hand gesture recognition using a real-time tracking method and hidden Markov models. Image and vision computing, 21(8), pp.745-758.
Chen, Q., Georganas, N.D. and Petriu, E.M., 2007, May. Real-time vision-based hand gesture recognition using haar-like features. In Instrumentation and Measurement Technology Conference Proceedings, 2007. IMTC 2007. IEEE (pp. 1-6). IEEE.
Ionescu, B., Coquin, D., Lambert, P. and Buzuloiu, V., 2005. Dynamic hand gesture recognition using the skeleton of the hand. EURASIP Journal on Advances in Signal Processing, 2005(13), pp.1-9.
Lee, H.K. and Kim, J.H., 1999. An HMM-based threshold model approach for gesture recognition. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 21(10), pp.961-973.
Mitra, S. and Acharya, T., 2007. Gesture recognition: A survey. Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on, 37(3), pp.311-324.
Murthy, G.R.S. and Jadon, R.S., 2009. A review of vision based hand gestures recognition. International Journal of Information Technology and Knowledge Management, 2(2), pp.405-410.
Ng, C.W. and Ranganath, S., 2002. Real-time gesture recognition system and application. Image and Vision computing, 20(13), pp.993-1007.
Ramamoorthy, A., Vaswani, N., Chaudhury, S. and Banerjee, S., 2003. Recognition of dynamic hand gestures. Pattern Recognition, 36(9), pp.2069-2081.
Sarkar, A.R., Sanyal, G. and Majumder, S., 2013. Hand gesture recognition systems: a survey. International Journal of Computer Applications, 71(15).
Suk, H.I., Sin, B.K. and Lee, S.W., 2008, September. Recognizing hand gestures using dynamic bayesian network. In Automatic Face & Gesture Recognition, 2008. FG'08. 8th IEEE International Conference on (pp. 1-6). IEEE.
Yang, M.H. and Ahuja, N., 2001. Recognizing hand gestures using motion trajectories. In Face Detection and Gesture Recognition for Human-Computer Interaction (pp. 53-81). Springer US.
Yeasin, M. and Chaudhuri, S., 2000. Visual understanding of dynamic hand gestures. Pattern Recognition, 33(11), pp.1805-1817.
Zhu, Y., Ren, H., Xu, G. and Lin, X., 2000. Toward real-time human-computer interaction with continuous dynamic hand gestures. In Automatic Face and Gesture Recognition, 2000. Proceedings. Fourth IEEE International Conference on (pp. 544-549). IEEE.
Zhu, Y., Xu, G. and Kriegman, D.J., 2002. A real-time approach to the spotting, representation, and recognition of hand gestures for human–computer interaction. Computer Vision and Image Understanding, 85(3), pp.189-208.