Seminar Report On
PERCEPTUAL INTELLIGENT SYSTEMS

Guided By: Ms. Bindu S. Moni
Submitted By: N. M. Jophi, S1 MCA, Roll No. 29
A.J.C.E, Dept. of Comp. Science & Engg.

CONTENTS

1. Introduction
2. Perception
   - Filters that make up perception
3. Perceptual User Interfaces
4. Information Flow in Perceptual User Interfaces
5. Perceptual Intelligence
6. Perceptual Intelligent Systems
7. Gesture Recognition Systems
   - Challenges of Gesture Recognition
8. Speech Recognition Systems
   - Performance of Speech Recognition Systems
9. Nouse Perceptual Vision Interface
   - Tools available
10. Conclusion

Introduction

Inanimate things are coming to life: the simple objects that surround us are gaining sensors, computational power, and actuators. Consequently, desks and doors, TVs and telephones, cars and trains, eyeglasses and shoes, and even the shirts on our backs are changing from static, inanimate objects into adaptive, reactive systems that can be more friendly, useful, and efficient. These new systems could also turn out to be even more difficult to use than current ones; it depends on how we design the interface between the world of humans and the world of this new generation of machines. To change inanimate objects into smart, active helpmates, they need perceptual intelligence.

The main problem with today's systems is that they are both deaf and blind. They mostly experience the world around them through a slow serial line to a keyboard and mouse. Even multimedia computers, which can handle signals like sound and image, do so only as transport devices that know nothing of the signals' content. Computers need to share our perceptual environment before they can be really helpful. They need to be situated in the same world that we are; they need to know much more than just the text of our words. Here lies the importance of perceptual intelligence: if systems can learn to perceive, they can act in a smart way. Perceptual intelligence is actually a learned skill.

Perception

Perception is the end result of a thought that begins its journey with the senses. We see, hear, physically feel, smell or taste an event. After the event is experienced, it must pass through various filters before our brains decipher what exactly has happened and how we feel about it. Even though this process can seem instantaneous, it always happens. The filters that make up perception are as follows:

- What we know about the subject or event. I saw an orange and knew it was edible.
- What our previous experience (and/or knowledge) with the subject or event was. The last time I ate an orange I peeled it first (knowledge to peel an orange before eating it) and it was sweet. Our previous experience forms our expectations.
- Our current emotional state. How we are feeling at the time of the event affects how we will feel after the event. I was in a bad mood when I ate the orange, and it angered me that it was sour and not sweet (my expectation).

In the end, my intellectual and emotional perception of eating an orange was an unpleasant experience. How strong that experience was determines how I will feel the next time I eat an orange. For example, if I got violently sick after eating an orange, the next time I see an orange I probably won't want to eat it.
If I had a pleasant experience eating an orange, the next time I see an orange I'll likely want to eat it.

Even though emotions seemingly occur as a direct result of an experience, they are actually the result of a complicated process. This process involves interpreting action and thought and then assigning meaning to it. The mind attaches meaning with prejudice as the information goes through the perceptual filters mentioned above. Our perceptual filters also determine truth and logic along with meaning, though they don't always do this accurately. Only when we become aware that a bad feeling could be an indication of a misunderstanding (an error in perception) can we begin to make adjustments to our filters and change the emotional outcome.

When left alone and untrained, the mind chooses emotions and reactions based on a "survival" program which does not take into account that we are civilized beings; it is concerned only with survival. A good portion of this program is faulty because the filters have created distortions, deletions and generalizations which alter perception - for example, jumping to a conclusion about "all" or "none" of something based on one experience. The unconscious tends to think in absolutes and supports "one-time" learning from experience (this is the survival aspect of learning).

Perceptual User Interfaces

A perceptual interface is one that allows a computer user to interact with the computer without having to use the normal keyboard and mouse. These interfaces are realised by giving the computer the capability of interpreting the user's movements or voice commands. Perceptual interfaces are concerned with extending human-computer interaction to use all modalities of human perception; current research efforts focus on including vision, audition, and touch in the process. The goal of perceptual reality is to create virtual and augmented versions of the world that are, to the human, perceptually identical to the real world. The goal of creating perceptual user interfaces is to allow humans to interact naturally with computers, appliances and devices using voice, sounds, gestures, and touch.

Perceptual user interfaces (PUIs) are characterised by interaction techniques that combine an understanding of natural human capabilities with computer I/O devices and machine perception and reasoning. They seek to make the user interface more natural and compelling by taking advantage of the ways in which people naturally interact with each other and with the world, both verbally and nonverbally. Devices and sensors should be transparent and passive if possible, and machines should perceive relevant human communication channels as well as generate output that is naturally understood. This is expected to require integration, at multiple levels, of technologies such as speech and sound recognition and generation, computer vision, graphical animation and visualization, language understanding, touch-based sensing and feedback, learning, user modelling, and dialogue management.
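To make the notion of combining input modalities a little more concrete, the following minimal Python sketch shows one possible way (purely an illustration, not something described in this report) of representing events coming from different perception channels and routing them through a single dispatcher. Every class and field name here is an assumption chosen only for the example.

```python
# Illustrative sketch only: a toy representation of perceptual input events and
# a dispatcher that routes them to handlers, regardless of which modality
# (speech, gesture, gaze, ...) produced them. Names are assumptions.
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List
import time

@dataclass
class PerceptualEvent:
    modality: str                 # e.g. "speech", "gesture", "gaze"
    payload: Any                  # recognised text, gesture label, gaze point...
    confidence: float             # recogniser confidence in [0, 1]
    timestamp: float = field(default_factory=time.time)

class PerceptualDispatcher:
    """Routes events from any modality to the handlers registered for it."""

    def __init__(self) -> None:
        self.handlers: Dict[str, List[Callable[[PerceptualEvent], None]]] = {}

    def register(self, modality: str,
                 handler: Callable[[PerceptualEvent], None]) -> None:
        self.handlers.setdefault(modality, []).append(handler)

    def dispatch(self, event: PerceptualEvent) -> None:
        for handler in self.handlers.get(event.modality, []):
            handler(event)

# Usage: a speech recogniser and a gesture recogniser would both feed the same
# dispatcher, so the application reacts uniformly to either channel.
dispatcher = PerceptualDispatcher()
dispatcher.register("speech", lambda e: print("heard:", e.payload))
dispatcher.register("gesture", lambda e: print("saw gesture:", e.payload))
dispatcher.dispatch(PerceptualEvent("speech", "open mail", 0.92))
dispatcher.dispatch(PerceptualEvent("gesture", "point-left", 0.80))
```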
Information Flow in Perceptual User Interfaces

PUIs integrate perceptive, multimodal, and multimedia interfaces to bring our human capabilities to bear on creating more natural and intuitive interfaces.

A perceptive user interface is one that adds human-like perceptual capabilities to the computer, for example making the computer aware of what the user is saying or what the user's face, body and hands are doing. These interfaces provide input to the computer while leveraging human communication and motor skills.

A multimodal user interface is closely related, emphasizing human communication skills. We use multiple modalities when we engage in face-to-face communication, which leads to more effective communication. Most work on multimodal UIs has focused on computer input (for example, using speech together with pen-based gestures). Multimodal output uses different modalities, like visual display, audio and tactile feedback, to engage human perceptual, cognitive and communication skills in understanding what is being presented. In multimodal UIs, the various modalities are sometimes used independently, sometimes simultaneously, and sometimes tightly coupled.

A multimedia UI uses perceptual and cognitive skills to interpret information presented to the user. Text, graphics, audio and video are the typical media used.

PUIs will enhance the use of computers as tools or appliances, directly enhancing GUI-based applications, for example by taking into account gestures, speech and eye gaze. Perhaps more importantly, these technologies will enable the broad use of computers as assistants, or agents, that interact in more human-like ways. Perceptual interfaces will enable multiple styles of interaction, such as speech only, speech and gesture, text and touch, vision, and synthetic sound, each of which may be appropriate in different circumstances, whether desktop applications, hands-free mobile use, or embedded household systems.

Perceptual Intelligence

Perceptual intelligence is the knowledge and understanding that everything we experience (especially thoughts and feelings) is defined by our perception. Perceptual intelligence means paying attention to people and the surrounding situation in the same way another person would, thus allowing these new devices to learn to adapt their behaviour to suit us, rather than our adapting to them as we do today.

In the language of cognitive science, perceptual intelligence is the ability to deal with the frame problem: the ability to classify the current situation so that it is possible to know which variables are important and thus take appropriate action. Once a computer has the perceptual ability to know who, what, when, where, and why, the probabilistic rules derived by statistical learning methods are normally sufficient for the computer to determine a good course of action.

The key to perceptual intelligence is making machines aware of their environment and, in particular, sensitive to the people who interact with them. They should know who we are, see our expressions and gestures, and hear the tone and emphasis of our voice.

Perceptual Intelligent Systems

We have developed computer systems that can follow people's actions, recognizing their faces, gestures, and expressions. Some of these systems are:

- Gesture Recognition System
- Speech Recognition System
- Nouse Perceptual Vision Interface

Gesture Recognition System

Gesture recognition deals with the goal of interpreting human gestures via mathematical algorithms. Gestures can originate from any bodily motion or state but commonly originate from the face or hand. Current focuses in the field include emotion recognition from the face and hand gesture recognition. Many approaches have been made using cameras and computer vision algorithms to interpret sign language.

Gesture recognition can be seen as a way for computers to begin to understand human body language, thus building a richer bridge between machines and humans than primitive text user interfaces or even GUIs (graphical user interfaces), which still limit the majority of input to keyboard and mouse. Gesture recognition enables humans to interface with the machine (HMI) and interact naturally without any mechanical devices. Using gesture recognition, it is possible to point a finger at the computer screen so that the cursor moves accordingly, as the sketch below illustrates. This could potentially make conventional input devices such as the mouse, keyboard and even touch-screens redundant. Gesture recognition can be conducted with techniques from computer vision and image processing.

Often the term gesture interaction is used to refer to inking or mouse gesture interaction, which is computer interaction through the drawing of symbols with a pointing device cursor. Strictly speaking, the term mouse strokes should be used instead of mouse gestures, since drawing a symbol with the cursor is written communication: making a mark to represent a symbol.
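As a rough illustration of the finger-pointing idea above, here is a minimal Python sketch, not the gesture recognition system discussed in this report, that assumes OpenCV 4, PyAutoGUI, a single webcam, and a crude fixed HSV skin-colour range. It segments a hand-like region, treats the topmost contour point as the fingertip, and moves the system cursor accordingly.

```python
# Minimal fingertip-to-cursor sketch (assumptions: OpenCV 4.x, default webcam,
# rough HSV skin-colour range). Real gesture recognisers are far more robust.
import cv2
import numpy as np
import pyautogui

pyautogui.FAILSAFE = False                 # avoid corner fail-safe in this demo
screen_w, screen_h = pyautogui.size()
cap = cv2.VideoCapture(0)                  # assumed default webcam

lower_skin = np.array([0, 48, 80], dtype=np.uint8)    # illustrative HSV range
upper_skin = np.array([20, 255, 255], dtype=np.uint8)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, lower_skin, upper_skin)
    mask = cv2.medianBlur(mask, 7)                     # suppress image noise
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        hand = max(contours, key=cv2.contourArea)
        if cv2.contourArea(hand) > 3000:               # ignore small blobs
            # Topmost contour point serves as a crude fingertip estimate.
            x, y = hand[hand[:, :, 1].argmin()][0]
            h, w = frame.shape[:2]
            pyautogui.moveTo(screen_w * x / w, screen_h * y / h)
    cv2.imshow("mask", mask)
    if cv2.waitKey(1) & 0xFF == 27:                    # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```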
Challenges of Gesture Recognition

There are many challenges associated with the accuracy and usefulness of gesture recognition software. For image-based gesture recognition there are limitations on the equipment used and on image noise. Images or video may not be captured under consistent lighting or in the same location. Items in the background or distinctive features of the users may make recognition more difficult. The variety of implementations for image-based gesture recognition may also raise issues for the viability of the technology for general use; for example, recognition using stereo cameras or depth-detecting cameras is not currently commonplace, and video or web cameras can give less accurate results because of their limited resolution.

Speech Recognition System

Speech recognition converts spoken words to machine-readable input (for example, to a string of character codes). The term voice recognition is also sometimes used to refer to speech recognition, but more precisely it refers to speaker recognition, which attempts to identify the person speaking, as opposed to what is being said.

Speech recognition applications include voice dialling (e.g., "Call home"), call routing (e.g., "I would like to make a collect call"), domotic appliance control, content-based spoken audio search (e.g., finding a podcast where particular words were spoken), simple data entry (e.g., entering a credit card number), preparation of structured documents (e.g., a radiology report), speech-to-text processing (e.g., word processors or email), and use in aircraft cockpits (usually termed direct voice input).

Performance of Speech Recognition Systems

The performance of speech recognition systems is usually specified in terms of accuracy and speed. Most speech recognition users would tend to agree that dictation machines can achieve very high performance in controlled conditions. There is some confusion, however, over the interchangeability of the terms "speech recognition" and "dictation".

Commercially available speaker-dependent dictation systems usually require only a short period of training (sometimes also called "enrolment") and may successfully capture continuous speech with a large vocabulary at a normal pace with very high accuracy. Most commercial companies claim that recognition software can achieve between 98% and 99% accuracy if operated under optimal conditions. "Optimal conditions" usually assume that users:

- have speech characteristics which match the training data,
- can achieve proper speaker adaptation, and
- work in a low-noise environment (e.g. a quiet office or laboratory space).

This explains why some users, especially those whose speech is heavily accented, might achieve recognition rates much lower than expected. Speech recognition in video has also become a popular search technology used by several video search companies.

Limited-vocabulary systems, requiring no training, can recognize a small number of words (for instance, the ten digits) as spoken by most speakers. Such systems are popular for routing incoming phone calls to their destinations in large organizations.
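The accuracy figures quoted above are most commonly expressed through the word error rate (WER): the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the recogniser's output, divided by the number of reference words. The short Python sketch below computes it with dynamic programming; the example sentences are invented purely for illustration.

```python
# Word error rate (WER): a standard accuracy measure for speech recognisers.
# WER = (substitutions + deletions + insertions) / number of reference words,
# computed here as a word-level edit distance. Example sentences are invented.

def word_error_rate(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words.
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution / match
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

if __name__ == "__main__":
    ref = "i would like to make a collect call"
    hyp = "i would like to make collect call"         # one deletion
    print(f"WER: {word_error_rate(ref, hyp):.2%}")    # -> 12.50%
```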
Nouse Perceptual Vision Interface

Nouse PVI is a perceptual vision interface program that offers a complete solution for working with a computer hands-free under the Microsoft Windows operating system. Using a camera connected to the computer, the program analyzes the facial motion of the user so that it can be used instead of a mouse and a keyboard. As such, Nouse PVI allows a user to perform the three basic computer-control actions:

- Cursor control: includes cursor positioning, cursor moving, and object dragging, which are normally performed using mouse motion.
- Clicking: includes right-button click, left-button click, double-click, and holding the button down, which are normally performed using the mouse buttons.
- Key/letter entry: includes typing of English letters, switching from capital to small letters and to functional keys, and entering basic MS Windows functional keys as well as Nouse functional keys, which would normally be performed using a keyboard.

The program is equipped with the following tools:

- Nousor (Nouse Cursor): the video-feedback-providing cursor that is used to point and to provide the feeling of "touch" with the computer.
- Nouse Click: a nose-operated mechanism to simulate the various types of clicks.
- Nouse Codes: a configurable Nouse tool that allows entering computer commands and operating the program using head-motion codes.
- Nouse Editor: provides an easy way of typing and storing messages hands-free using face motion. Typed messages are automatically stored in the clipboard (as with Ctrl+A, Ctrl+C).
- Nouse Board: an on-screen keyboard specially designed for face-motion-based typing that automatically maps to the user's facial motion range.
- Nouse Typer: a configurable Nouse tool that allows typing letters by drawing them inside the cursor (instead of using the Nouse Board).
- Nouse Chalk: a configurable Nouse tool that allows writing letters as with chalk on a piece of paper. Written letters are automatically saved to the hard drive as images that can be opened and emailed.

And the following features:

- Automatic focusing on the user's nose and motion-range calibration.
- Lock-On Area and Glue/Unglue mechanisms that allow the user's motion range to be mapped onto an arbitrary Windows application.

Figure: The appearance of the Nouse Board: letters are grouped in fours to suit the four directions of "clicking" motion.
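To give a feel for face-motion cursor control, here is a minimal Python sketch in the spirit of Nouse, but not the actual Nouse PVI implementation: it detects the user's face with an OpenCV Haar cascade and maps the face centre (used here as a crude stand-in for the nose) to the screen cursor with PyAutoGUI. The camera index, the cascade choice, and the direct position mapping are simplifying assumptions; the real program tracks the nose itself and calibrates the user's motion range.

```python
# Illustrative face-motion cursor control (assumptions: OpenCV with bundled
# Haar cascades, PyAutoGUI, default webcam). Not the Nouse PVI source code.
import cv2
import pyautogui

pyautogui.FAILSAFE = False                     # avoid corner fail-safe in demo
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
screen_w, screen_h = pyautogui.size()
cap = cv2.VideoCapture(0)                      # assumed default webcam

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=5)
    if len(faces) > 0:
        x, y, w, h = faces[0]
        # Face centre stands in for the nose position in this sketch.
        cx, cy = x + w // 2, y + h // 2
        fh, fw = frame.shape[:2]
        pyautogui.moveTo(screen_w * cx / fw, screen_h * cy / fh)
    cv2.imshow("preview", frame)
    if cv2.waitKey(1) & 0xFF == 27:            # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```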
Many universities have research centres that focus on perceptual intelligence. MIT, for example, has developed two experimental test beds: smart rooms and smart clothes.

Conclusion

It is now possible to track people's motion, identify them by voice and facial appearance, and recognize their actions in real time using only modest computational resources. By using this perceptual information, we have been able to build smart rooms and smart clothes that can recognize people, understand their speech, allow them to control information displays without a mouse or keyboard, communicate by facial and hand gesture, and interact in a more personalized, adaptive manner.

Our overall goal is to make computers seem as natural to interact with as another person. Sometimes this means that there should be no visible interface at all; the system should simply recognize what is going on and do the right thing. At other times, it means that the system should engage in a dialogue with the person. We want systems that are truly human-centred and natural to interact with; this requires not just perception, but also a significant understanding of the semantics of the everyday world and the reasoning capabilities to use this understanding flexibly.