
Tuesday, October 10, 2017

OpenCV C++ Visual Studio 2017 configuration

In this article, we download the required software, configure Visual Studio for OpenCV, and build a hello-world OpenCV app in C++ with Visual Studio 2017.

First, download the required software:

a) Download OpenCV 3.3 (the latest release at the time of writing)
b) Install Visual Studio 2017 with the C++ packages

Then, let's set up the global environment variables for the OpenCV build:

a) Go to My Computer -> right-click -> Properties
b) Click "Advanced system settings" -> click "Environment Variables"
c) Click "New" under the System variables grid
d) Enter "Variable name:" = OPENCV_DIR and "Variable value:" = C:\Program Files\opencv\build (or your own OpenCV build path)
e) Select the variable 'Path' in the System variables grid
f) Click Edit and append ";%OPENCV_DIR%\x64\vc14\bin" (the ";" separates it from the earlier paths)

Last, we should configure the OpenCV project properties in Visual Studio 2017:

a) Create a new C++ console project in Visual Studio 2017
b) Open the project properties and make sure the platform is x64, since the libraries used below are the x64 builds
c) Go to Configuration Properties -> C/C++ -> General -> Additional Include Directories -> set "C:\Program Files\opencv\build\include"
d) Go to Configuration Properties -> Linker -> General -> Additional Library Directories -> set "C:\Program Files\opencv\build\x64\vc14\lib"
e) Go to Configuration Properties -> Linker -> Input -> Additional Dependencies -> add "opencv_world330d.lib" for the Debug configuration and "opencv_world330.lib" for the Release configuration (the "d" suffix marks the debug build; do not add both to the same configuration)

Woot woot, that's it. You are all set to develop amazing image processing apps.

Here is the test code:

#include "stdafx.h"
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <iostream>

using namespace cv;
using namespace std;

int main()
{
    Mat image;
    image = imread("c:\\mahesh\\pics 110.jpg", IMREAD_COLOR); // Read the file
    if (image.empty()) // imread returns an empty Mat when the file cannot be read
    {
        cout << "Could not open or find the image" << endl;
        return -1;
    }
    namedWindow("Display window", WINDOW_AUTOSIZE); // Create a window for display.
    imshow("Display window", image); // Show our image inside it.

    waitKey(0); // Wait for a keystroke in the window
    return 0;
}

Monday, August 28, 2017

Digital Image Processing in a minute!

Digital Image Processing

What is DIP?

DIP means processing digital images with a digital computer, using math and signal processing algorithms. A digital image (JPG, PNG, GIF, or a video frame) is nothing but a two-dimensional array of pixels: basically a grid in which each cell of a color image holds RGB values. Images can be generated from the electromagnetic spectrum, such as visible light, X-rays, and gamma rays, and also from other waves such as acoustic waves (ultrasound).
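To make the "grid of RGB values" idea concrete, here is a minimal C++ sketch. The names `Rgb`, `Image`, and `toGray` are made up for illustration and are not part of OpenCV. The grayscale weights are the common BT.601 luma weights, which is one possible choice among several.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One pixel: red, green, and blue intensities, each 0..255.
struct Rgb { std::uint8_t r, g, b; };

// Hypothetical image type: a width x height grid stored row by row.
struct Image {
    int width;
    int height;
    std::vector<Rgb> pixels; // pixel (x, y) lives at index y * width + x

    Image(int w, int h)
        : width(w), height(h),
          pixels(static_cast<std::size_t>(w) * static_cast<std::size_t>(h), Rgb{0, 0, 0}) {}

    Rgb& at(int x, int y) {
        return pixels[static_cast<std::size_t>(y) * static_cast<std::size_t>(width)
                      + static_cast<std::size_t>(x)];
    }
};

// Convert one RGB pixel to a single gray level (BT.601 weights, rounded).
std::uint8_t toGray(const Rgb& p) {
    return static_cast<std::uint8_t>(0.299 * p.r + 0.587 * p.g + 0.114 * p.b + 0.5);
}
```

For example, `Image img(4, 3);` creates a black 4x3 image, and `img.at(2, 1) = Rgb{255, 255, 255};` turns one pixel white.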

Why DIP?

Simple answer: automation, and to make sense of the information in images.

a)   You want to instantly find photos of your baby or your most cherished moments among hundreds of photos on your laptop.

b)   The FBI wants to track a fugitive criminal; they can automatically scan his photo against CCTV footage across airports, streets …

c)    If robots are to help humans with automation, they have to learn to recognize humans, objects, animals, etc.

d)   You want to board an airplane by just showing your face, with no boarding pass. DIP scans your face, processes it, and authenticates you.

e)   Instagram and Snapchat are making money because of DIP: filters.

f)    Robots can quickly take the inventory count for you.

The list goes on…

Processes inside DIP:

Inside image processing, a series of processes takes place: you can transform images from grayscale to color, apply filters, enhance the image colors, delete a part of an image, etc. They are as follows:

Image Grabbing or acquisition

Image Preprocessing

Image Enhancement

Image Compression

Image Segmentation

Image Representation and feature extraction

Image Recognition and interpretation.

Image compression is a technique to compress and decompress images and videos. It may be lossy (some detail is discarded) or lossless (the original is recovered exactly).
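As an illustration of the lossless case, here is a minimal sketch of run-length encoding, one of the simplest lossless schemes. It is not the method used by JPG or PNG, and the names `Run`, `rleEncode`, and `rleDecode` are made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// One run: (pixel value, how many times it repeats, at most 255).
using Run = std::pair<std::uint8_t, std::uint8_t>;

// Collapse consecutive equal values into (value, count) runs.
std::vector<Run> rleEncode(const std::vector<std::uint8_t>& data) {
    std::vector<Run> runs;
    for (std::uint8_t v : data) {
        if (!runs.empty() && runs.back().first == v && runs.back().second < 255) {
            ++runs.back().second;                          // extend the current run
        } else {
            runs.emplace_back(v, static_cast<std::uint8_t>(1)); // start a new run
        }
    }
    return runs;
}

// Expand the runs back into the original sequence, exactly.
std::vector<std::uint8_t> rleDecode(const std::vector<Run>& runs) {
    std::vector<std::uint8_t> data;
    for (const Run& r : runs) {
        data.insert(data.end(), static_cast<std::size_t>(r.second), r.first);
    }
    return data;
}
```

Decoding the encoded runs gives back the input byte for byte, which is what "lossless" means. The scheme only saves space when the image has long stretches of identical pixels.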

Image segmentation is the process of subdividing the given image into its constituent parts or objects. Identifying the edges in the image is one common way to find the boundaries between those parts.
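A minimal sketch of one of the simplest segmentation techniques, global thresholding, which splits a grayscale image into foreground and background. This is a different technique from edge detection, and the function name `thresholdSegment` and the threshold value are assumptions chosen for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Label each gray pixel: 1 (foreground) if brighter than the threshold, else 0.
std::vector<std::uint8_t> thresholdSegment(const std::vector<std::uint8_t>& gray,
                                           std::uint8_t threshold) {
    std::vector<std::uint8_t> mask(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i) {
        mask[i] = gray[i] > threshold ? 1 : 0;
    }
    return mask;
}
```

With a threshold of 128, a bright object on a dark background ends up as a region of 1s in the mask. Real images usually need a threshold chosen from the data rather than a fixed number.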

Image restoration is a process that attempts to reconstruct or recover a degraded image by using a priori knowledge of the degradation phenomenon.

Conclusion:

Image processing is part and parcel of modern human life. It started slowly with image transfer from London to New York in the 1920s; later, with the advancement of computers, the field of image processing began growing rapidly.

Image processing can solve a lot of problems, improve human living standards, automate routine tasks, etc.

References:

Digital Image Processing (3rd Edition) by Rafael C. Gonzalez, Richard E. Woods

Sunday, June 18, 2017

Hello World in OpenCV

Wednesday, May 10, 2017

Computer Vision API - Microsoft Cognitive Services

The MS Computer Vision API is a set of powerful REST APIs (web services) that take an image as input, process it, and return the image analysis as JSON, a structured format. For example, the API can check for adult content, recognize text/characters (OCR), recognize faces, etc.

The MS Computer Vision API is based on hundreds of image processing algorithms, which internally use a lot of complex math to process and analyze the images.

Microsoft bundled its image processing APIs into the Computer Vision API and hooked it up with its cloud platform. So basically you have to pay for every single API call. You might be wondering how and in what scenario you can consume it.

Here is a typical scenario where you can use Computer Vision API:

You have a camera, an IoT device connected to the internet, set up in front of your home, apartment, or office. As soon as someone arrives and knocks on the door, the camera takes a photo of the visitor and sends it to the MS Azure cloud. There you make a call to the MS Computer Vision API to process the photo and get the result; from that result you decide whether or not to open the door. You can use the same approach in any scenario where you want to process an image.

Following are just a few of the things you can do on images with the CV APIs: tagging, categorizing, identifying, OCR, face recognition, generating thumbnails, and a lot more. It can also process video (video is an illusion: when you display a set of frames/images, say 24 or more in a second, the eye processes them as a movie).

Simple Client program to invoke computer vision api:
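A minimal sketch of the request side of such a client in C++. It assumes the v1.0 "analyze" operation, a regional host such as westus.api.cognitive.microsoft.com, and an image that is reachable by URL; the helper name `buildAnalyzeRequest` is made up. It only builds the HTTP request text. Sending it over HTTPS needs a client library of your choice, and the exact path and features should be checked against the current API reference.

```cpp
#include <string>

// Hypothetical helper: builds the raw HTTP/1.1 request for an "analyze" call.
// Assumes imageUrl contains no characters that need JSON escaping.
std::string buildAnalyzeRequest(const std::string& host,
                                const std::string& subscriptionKey,
                                const std::string& imageUrl) {
    // The image is passed by URL in a small JSON body.
    const std::string body = "{\"url\":\"" + imageUrl + "\"}";

    std::string request;
    request += "POST /vision/v1.0/analyze?visualFeatures=Categories,Tags,Description HTTP/1.1\r\n";
    request += "Host: " + host + "\r\n";
    request += "Content-Type: application/json\r\n";
    // Every call is authenticated (and billed) through the subscription key.
    request += "Ocp-Apim-Subscription-Key: " + subscriptionKey + "\r\n";
    request += "Content-Length: " + std::to_string(body.size()) + "\r\n";
    request += "\r\n";
    request += body;
    return request;
}
```

The service answers with a JSON document describing the image, which the caller then parses to make its decision, for example whether to open the door in the scenario above.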

Ref:

https://azure.microsoft.com/en-us/services/cognitive-services/

Monday, December 26, 2016

CPU v GPU (Deep learning \ AI)

Why bother about CPU v GPU? Because it shows how parallel processors are revolutionizing the Deep Learning \ AI world.

CPU or Central Processing Unit: an electronic microprocessor designed to process tasks/programs sequentially, as described by the von Neumann architecture (1945). Von Neumann was working on the Manhattan Project at Los Alamos National Laboratory and wanted to devise a machine that could do a lot of calculations. So von Neumann designed the CPU architecture; obviously he was influenced by the great Alan Turing, the British mathematician.

The reason I am giving the background of the CPU is to show the era in which its architecture was designed.

GPU or Graphics Processing Unit: many of us don’t know that a GPU exists in almost every modern computer, mobile phone, or other device with a display. It’s the workhorse that does all the graphics-related activities, like calculating vectors\polygons and everything else involved in the graphics pipeline. In short, doing the math required to display video games, movies, and images is the responsibility of the GPU.

Initially the CPU did all the graphics processing, but the growing size of movies, video games, and images led to the rise of GPUs. The CPU is still the center of a computer, but it outsources the graphics processing work to the GPU. On parallel workloads, a GPU can do many times more calculations per second than a CPU because of its enormous number of cores.

CPU v GPU

“A CPU consists of a few cores optimized for sequential serial processing while a GPU has a massively parallel architecture consisting of thousands of smaller, more efficient cores designed for handling multiple tasks simultaneously” NVidia

CPUs and GPUs have fundamentally different design philosophies. We cannot compare apples to oranges; each excels in its own domain. Here are a few basic differences between them:

CPU	GPU
Multiple cores	Hundreds or thousands of cores
General-purpose processors that handle all kinds of activities	Special-purpose processors that handle polygon-based 3D graphics
Runs the serial portions of the code	Runs the parallel portions of the code
Higher operating frequency, higher number of registers	Lower operating frequency, lower number of registers
Better at file handling, branching, serial tasks	Better at math computation such as polygons, vector math, transformations
	In general targeted at gaming
Sequential	Parallel
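The sequential versus parallel split can be sketched on an ordinary CPU with C++ threads. This is only an analogy for what a GPU does with thousands of cores, not GPU code, and the function names `sumSequential` and `sumParallel` are made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sequential: one core walks the whole array.
long long sumSequential(const std::vector<int>& data) {
    return std::accumulate(data.begin(), data.end(), 0LL);
}

// Parallel: split the array into chunks, sum each chunk on its own thread,
// then combine the partial sums.
long long sumParallel(const std::vector<int>& data, unsigned workerCount) {
    if (workerCount == 0) workerCount = 1;
    std::vector<long long> partial(workerCount, 0);
    std::vector<std::thread> workers;
    const std::size_t chunk = (data.size() + workerCount - 1) / workerCount;

    for (unsigned w = 0; w < workerCount; ++w) {
        const std::size_t begin = std::min(data.size(), static_cast<std::size_t>(w) * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        // Each worker writes only to its own slot, so no locking is needed.
        workers.emplace_back([&data, &partial, w, begin, end] {
            partial[w] = std::accumulate(data.begin() + static_cast<std::ptrdiff_t>(begin),
                                         data.begin() + static_cast<std::ptrdiff_t>(end), 0LL);
        });
    }
    for (std::thread& t : workers) t.join();
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Both functions return the same total. The parallel version only pays off when the array is large enough to outweigh the cost of starting the threads, which is the same trade-off behind handing work to a GPU.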

Deep learning:

With the rapid advancement of GPU processing power, researchers and businesses started to use GPUs in Deep learning \ AI.

“Deep learning is the fastest-growing field in artificial intelligence, helping computers make sense of infinite amounts of data in the form of images, sound, and text” - nvidia

Deep learning is nothing but trying to solve complex problems by creating neural networks that work loosely like the human brain. A network learns from the data it processes and tries to evolve or get better as it does more and more operations. Basically, it tries to become like a human brain.
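To give a feel for "learning from the data", here is a minimal sketch of a single artificial neuron (a perceptron) that learns the logical AND function. It is only the smallest building block, not a deep network, and the names `Sample` and `Perceptron` as well as the learning rate are assumptions chosen for illustration.

```cpp
#include <array>
#include <vector>

// One training example: two inputs and the expected output (0 or 1).
struct Sample {
    std::array<double, 2> x;
    int label;
};

// Hypothetical single neuron: weighted sum of the inputs plus a bias.
struct Perceptron {
    std::array<double, 2> w{{0.0, 0.0}};
    double bias = 0.0;

    // Fire (1) if the weighted sum is positive, otherwise stay silent (0).
    int predict(const std::array<double, 2>& x) const {
        return (w[0] * x[0] + w[1] * x[1] + bias) > 0.0 ? 1 : 0;
    }

    // Perceptron rule: nudge the weights toward the correct answer after each mistake.
    void train(const std::vector<Sample>& samples, int epochs, double learningRate) {
        for (int e = 0; e < epochs; ++e) {
            for (const Sample& s : samples) {
                const int error = s.label - predict(s.x);
                w[0] += learningRate * error * s.x[0];
                w[1] += learningRate * error * s.x[1];
                bias += learningRate * error;
            }
        }
    }
};
```

Training it on the four rows of the AND truth table for a few passes is enough for `predict` to answer all four correctly. Deep networks stack many such units in layers and use more elaborate training rules.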

Today’s advanced deep neural networks use algorithms, big data, and the computational power of the GPU to change this dynamic. Machines are now able to learn at a speed, accuracy, and scale that are driving true artificial intelligence.

Deep learning is used in the research community and in industry to help solve many complex problems like genetic simulation, molecular biology simulation etc.

Parallel Processors Framework

The following APIs help programmers manage parallelism and data delivery on massively parallel processors (GPUs):

OpenCL, originally by Apple and now maintained by the Khronos Group

DirectCompute by Microsoft

CUDA by NVidia

By using any of the above frameworks, you can write programs to analyze or solve problems such as complex math calculations, molecular simulation, genetics, or even atom-level simulations.

Patterns everywhere