Artificial intelligence and computer vision form a rapidly growing research area that develops tools and techniques for modelling, analysing, mining, understanding, and visualising data. Computer vision is a field that comprises methods for acquiring, processing, analysing, and understanding images and, more generally, high-dimensional data from the real world in order to produce numerical or symbolic information.
The image data can take many forms, such as views from multiple cameras, video sequences, or multi-dimensional data from a medical scanner; a simple example is a barcode scanner's ability to read the bundle of lines in a UPC. More broadly, computer vision is an interdisciplinary field that deals with how computers can be made to gain high-level understanding from digital images or videos; it seeks to automate tasks that the human visual system can perform.
Computer vision is the broad parent field for any automation involving visual content. Within this parent idea, there are a few specific tasks, such as:
- Object Classification: We train a model on a dataset of specific objects, and the model classifies new objects as belonging to one or more of the training classes.
- Object Identification: The model recognizes a specific instance of an object, for example, detecting two faces in an image and tagging one as person 1 and the other as person 2.
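As a toy sketch of the classification idea (not a real vision model), a nearest-centroid classifier can assign a new object to the training class whose average feature vector is closest; the feature values and class names below are made up for illustration:

```python
import numpy as np

# Hypothetical training data: each row is a feature vector
# (say, average red, green, blue values) for a labelled object.
train_features = {
    "apple":  np.array([[200, 30, 30], [190, 40, 35]]),
    "banana": np.array([[220, 210, 60], [230, 220, 70]]),
}

# "Training": compute one centroid (mean feature vector) per class.
centroids = {label: feats.mean(axis=0) for label, feats in train_features.items()}

def classify(feature_vector):
    """Assign a new object to the class with the nearest centroid."""
    return min(centroids,
               key=lambda label: np.linalg.norm(centroids[label] - feature_vector))

print(classify(np.array([210, 35, 32])))  # a reddish object
```

Real systems learn far richer features than raw color averages, but the train-then-classify loop is the same shape.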
There are also other methods of analysis, namely:
- Video Motion Analysis: Uses computer vision to estimate the velocity of objects in a video, or of the camera itself.
- Image Segmentation: Algorithms partition an image into multiple regions or sets of pixels.
- Scene Reconstruction: Creates a 3D model of a scene from input images or video.
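A minimal sketch of image segmentation, assuming a grayscale image stored as a NumPy array: simple intensity thresholding partitions the pixels into foreground (object) and background regions. The tiny image and threshold are invented for the example:

```python
import numpy as np

# A tiny synthetic 4x4 grayscale "image": a bright object on a dark background.
image = np.array([
    [ 10,  12,  11,  10],
    [ 12, 240, 245,  11],
    [ 10, 250, 248,  12],
    [ 11,  10,  12,  10],
])

# Threshold segmentation: pixels brighter than the threshold belong to the object.
threshold = 128
mask = image > threshold           # boolean mask partitioning the pixels

foreground_pixels = int(mask.sum())
print(foreground_pixels)           # → 4 pixels assigned to the object
```

Practical segmenters (watershed, graph cuts, neural networks) are far more sophisticated, but all produce this kind of per-pixel partition.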
A traditional application of computer vision is handwriting recognition for digitizing handwritten content.
Working of Computer Vision
Machines interpret images very simply: as a series of pixels, each with its own set of color values. Think of an image as a giant grid of squares, or pixels. Each pixel in a grayscale image can be described by a single number, normally from 0 to 255, and that grid of numbers is what software actually works with when you input an image. When we add color, things get more complex: machines usually read color as a series of three values, red, green, and blue (RGB), each on that same 0 to 255 scale.
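The pixel-grid idea can be made concrete with NumPy: a color image is just a height x width x 3 array of 8-bit values. The 2x2 image below is made up for illustration:

```python
import numpy as np

# A 2x2 RGB image: each pixel is three 8-bit values (red, green, blue).
image = np.array([
    [[255,   0,   0], [  0, 255,   0]],   # red pixel, green pixel
    [[  0,   0, 255], [255, 255, 255]],   # blue pixel, white pixel
], dtype=np.uint8)

print(image.shape)                # → (2, 2, 3): height, width, color channels
print(image[0, 0])                # → [255 0 0]: the top-left pixel's RGB values
print(image.min(), image.max())   # all values stay within the 0-255 range
```

Libraries such as Pillow or OpenCV load image files into exactly this kind of array.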
For some perspective on how computationally costly this is, consider the following:
- Each color value is stored in 8 bits.
- 8 bits x 3 colors per pixel = 24 bits/pixel.
- A normal-sized 1024 x 768 image x 24 bits per pixel = almost 19 million bits, or about 2.36 MB.
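The arithmetic above can be checked directly (taking 1 MB as 1,000,000 bytes):

```python
bits_per_channel = 8
channels = 3
bits_per_pixel = bits_per_channel * channels   # 24 bits per pixel

width, height = 1024, 768
total_bits = width * height * bits_per_pixel   # → 18874368, almost 19 million bits
total_megabytes = total_bits / 8 / 1_000_000   # bits -> bytes -> megabytes

print(total_bits)
print(round(total_megabytes, 2))               # → 2.36 MB
```

This is for a single uncompressed frame; video multiplies the cost by the frame rate, which is why compression and efficient representations matter so much in vision pipelines.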