
Compiling OpenCV with CUDA support


Alright, so you have the NVIDIA CUDA Toolkit and cuDNN library installed on your GPU-enabled system.

What next?

Let’s get OpenCV installed with CUDA support as well.

While OpenCV itself doesn’t play a critical role in deep learning, it is used by other deep learning libraries such as Caffe, specifically in “utility” programs (such as building a dataset of images). Simply put, having OpenCV installed makes it easier to write code to facilitate the procedure of pre-processing images prior to feeding them into deep neural networks.
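For example, a typical pre-processing step might look something like the following minimal sketch (the image filename here is hypothetical):

import cv2

# load the image from disk as a NumPy array in BGR channel order
image = cv2.imread("example.jpg")

# resize to the fixed spatial dimensions a network might expect
resized = cv2.resize(image, (227, 227))

# convert from OpenCV's BGR ordering to the RGB ordering most
# deep learning libraries expect
rgb = cv2.cvtColor(resized, cv2.COLOR_BGR2RGB)

print(rgb.shape)  # (227, 227, 3)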

Because of this, we should install OpenCV into the same environment as our deep learning libraries to, at the very least, make our lives easier.

Furthermore, in a GPU-enabled CUDA environment, there are a number of compile-time optimizations we can make to OpenCV, allowing it to take advantage of the GPU for faster computation (but mainly for C++ applications, not so much for Python, at least at the present time).

I’ll be making the assumption that you’ll be installing OpenCV into the same environment as last week’s blog post — in this case, I’ll be continuing my example of using the Ubuntu 14.04 g2.2xlarge instance on Amazon EC2.

Truth be told, I’ve already covered installing OpenCV on Ubuntu in many previous blog posts, but I’ll explain the process here as well. Overall, the instructions are near identical, but with a few important changes inside the cmake command, allowing us to compile OpenCV with CUDA support.

By the time you finish reading this blog post, you’ll have OpenCV with CUDA support compiled and installed in your deep learning development environment.

Installing OpenCV with CUDA support

Before we can compile OpenCV with CUDA support, we first need to install some prerequisites:

$ sudo apt-get install libjpeg8-dev libtiff5-dev libjasper-dev libpng12-dev
$ sudo apt-get install libgtk2.0-dev
$ sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev
$ sudo apt-get install libatlas-base-dev gfortran
$ sudo apt-get install libhdf5-serial-dev
$ sudo apt-get install python2.7-dev

If you’re a follower of the PyImageSearch blog, then you’ll also know that I’m a big fan of using pip, virtualenv, and virtualenvwrapper to create sequestered, independent Python virtual environments for each of our projects. You can install the virtual environment packages using the commands listed below (or you can skip this step if you already have Python virtual environments set up on your machine):
$ wget https://bootstrap.pypa.io/get-pip.py
$ sudo python get-pip.py
$ sudo pip install virtualenv virtualenvwrapper
$ sudo rm -rf get-pip.py ~/.cache/pip

If this is your first time using Python virtual environments, I would suggest reading the first half of this blog post to familiarize yourself with them. The RealPython.com blog also has an excellent article on Python virtual environments for the uninitiated.

Next, let’s update our ~/.bashrc file. Open this file using your favorite command line text editor (such as nano, vi, or emacs):
$ nano ~/.bashrc

Then, scroll down to the bottom of the file, append the following lines, and save + exit the editor:

# virtualenv and virtualenvwrapper
export WORKON_HOME=$HOME/.virtualenvs
source /usr/local/bin/virtualenvwrapper.sh

At this point, we can create our cv virtual environment:
$ source ~/.bashrc
$ mkvirtualenv cv
$ pip install numpy

Note: Again, you’ll want to read the first half of this blog post to better understand Python virtual environments if this is your first time using them. I also explain them more thoroughly and how to properly use them in other OpenCV installation guides on this website.
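If you’re ever unsure whether your shell is inside a virtual environment, Python itself can tell you. Here is a quick sketch (a diagnostic aside, not part of the install):

import sys

# inside a virtualenv, sys.prefix points at the environment directory
# (e.g., ~/.virtualenvs/cv) rather than the system prefix
print(sys.prefix)

# classic virtualenv also sets sys.real_prefix to the original system
# prefix, so its presence is a reliable "am I in a virtualenv?" test
print(hasattr(sys, "real_prefix"))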

Now, let’s download and unpack OpenCV. If you’re using the default Amazon EC2 g2.2xlarge instance, then I highly suggest that you download the OpenCV sources and do your compiling on /mnt.

The default g2.2xlarge instance has only ~8GB of space, which once you factor in the system files, NVIDIA drivers, etc., is not enough room to compile OpenCV from source:

Figure 1: The default disk size for the g2.2xlarge instance is only 8GB, which doesn’t leave enough space to compile OpenCV from source.

However, the /mnt volume has 64GB of space, which is more than enough for our compile:
Figure 2: However, if we use the ‘/mnt’ volume instead, we have 64GB — far more than what is required to compile OpenCV.

If you are indeed on an Amazon EC2 instance, be sure to change directory to /mnt and create a directory specifically for your OpenCV compile prior to downloading the source:
$ cd /mnt
$ sudo mkdir opencv_compile
$ sudo chown -R ubuntu opencv_compile
$ cd opencv_compile

The above commands will create a new directory named opencv_compile in the /mnt volume, followed by giving the ubuntu user permission to modify it at their will.

Note: The /mnt volume is what Amazon calls “ephemeral storage”. Any data put on this volume will be lost when your system is stopped/rebooted. You don’t want to use /mnt to store long-term data, but it’s perfectly fine to use /mnt to compile OpenCV. Once OpenCV is compiled, it will be installed to the system drive — your OpenCV installation will not disappear between reboots.

For this tutorial, I’ll be using OpenCV 3.1. But you could also use OpenCV 2.4.X or OpenCV 3.0. Use the following commands to download the source:

$ wget -O opencv.zip https://github.com/Itseez/opencv/archive/3.1.0.zip
$ wget -O opencv_contrib.zip https://github.com/Itseez/opencv_contrib/archive/3.1.0.zip
$ unzip opencv.zip
$ unzip opencv_contrib.zip

In case the URLs of the .zip archives are cut off in your browser, refer to the full URLs in the wget commands above.

We are now ready to use cmake to configure our build. Take special care when running this command, as I’m introducing some configuration variables you may not be familiar with:
$ cd opencv-3.1.0
$ mkdir build
$ cd build
$ cmake -D CMAKE_BUILD_TYPE=RELEASE \
    -D CMAKE_INSTALL_PREFIX=/usr/local \
    -D WITH_CUDA=ON \
    -D ENABLE_FAST_MATH=1 \
    -D CUDA_FAST_MATH=1 \
    -D WITH_CUBLAS=1 \
    -D INSTALL_PYTHON_EXAMPLES=ON \
    -D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib-3.1.0/modules \
    -D BUILD_EXAMPLES=ON ..

To start, take note of the WITH_CUDA=ON flag. Technically, this flag will be set to ON by default since CMake is smart enough to detect that the CUDA Toolkit is installed. But, just in case, we’ll manually set the variable to WITH_CUDA=ON to ensure CUDA support is compiled.

From there, we add in a few more optimizations, mainly around using cuBLAS, an implementation of the BLAS (Basic Linear Algebra Subprograms) library in the CUDA runtime.

We also indicate that we want to utilize the “fast math” optimizations, a series of extremely fast mathematical routines that are optimized for speed (they are written in Assembly) — and essentially perform little-to-no error checking. Again, the FastMath libraries are geared towards pure speed and nothing else.

After running cmake, take a look at the “NVIDIA CUDA” section — it should look similar to mine, which I have included below:
Figure 3: Examining the output of CMake to ensure OpenCV will be compiled with CUDA support.

Notice how CUDA support is going to be compiled using both cuBLAS and “fast math” optimizations.

Provided that your own CMake command exited without error, you can now compile and install OpenCV:

$ make -j8
$ sudo make install
$ sudo ldconfig

If all goes well, the make command should run successfully:
Figure 4: OpenCV with CUDA support has successfully compiled.

Again, assuming your compile finished without error, OpenCV should now be installed in /usr/local/lib/python2.7/site-packages. You can verify this using the ls command:
$ ls -l /usr/local/lib/python2.7/site-packages
total 2092
-rw-r--r-- 1 root staff 2138812 Jun  2 14:11 cv2.so

Note: You’ll want to find and make note of where your cv2.so file is on your system! Whenever we create a virtual environment (which we’ll be doing lots of to explore various deep learning libraries), you’ll want to sym-link the cv2.so file into the site-packages directory of your Python virtual environment so you have access to OpenCV.

The last step is to sym-link the cv2.so file (our Python bindings) into the cv virtual environment:
$ cd ~/.virtualenvs/cv/lib/python2.7/site-packages/
$ ln -s /usr/local/lib/python2.7/site-packages/cv2.so cv2.so

To verify our installation, open up a new terminal, access the cv virtual environment using the workon command, fire up a Python shell, and then import OpenCV:
$ cd ~
$ workon cv
$ python
>>> import cv2
>>> cv2.__version__
'3.1.0'
>>>
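If you want to double-check that the CUDA flags actually made it into your build, cv2.getBuildInformation() returns the same configuration report that CMake printed. A small sketch:

import cv2

# print only the CUDA-related lines of the build configuration
for line in cv2.getBuildInformation().split("\n"):
    if "CUDA" in line:
        print(line.strip())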

Finally, now that OpenCV is installed, let’s perform a bit of cleanup and remove the source files used for installation:

$ cd /mnt
$ sudo rm -rf opencv_compile

Again, I can’t stress this point enough — you need to get comfortable working with Python virtual environments, the site-packages directory, and how to use symbolic links. I cover each of these topics in other tutorials on this blog.

Summary

In today’s blog post, I detailed how to install OpenCV into our deep learning environment with CUDA support. While OpenCV itself isn’t directly used for deep learning, other deep learning libraries (for example, Caffe) indirectly use OpenCV.

Furthermore, by installing OpenCV with CUDA support, we can take advantage of the GPU for further optimized operations (at least from within C++ applications — there isn’t much support for Python + OpenCV + GPU, yet).

Next week, I’ll detail how to install the Keras Python package for deep learning and Convolutional Neural Networks — from there, the real fun will start!

The post Compiling OpenCV with CUDA support appeared first on PyImageSearch.


Ubuntu 16.04: How to install OpenCV


Over the past two years running the PyImageSearch blog, I’ve authored two tutorials detailing the required steps to install OpenCV (with Python bindings) on Ubuntu. You can find the two tutorials here:

However, with support for Ubuntu 14.04 winding down and Ubuntu 16.04 set as the next LTS (with support until April 2021), I thought it would be appropriate to create a new, updated Ubuntu + OpenCV install tutorial.

Inside this tutorial, I will document, demonstrate, and provide detailed steps to install OpenCV 3 on Ubuntu 16.04 with either Python 2.7 or Python 3.5 bindings.

Furthermore, this document has been fully updated from my previous Ubuntu 14.04 tutorials to use the latest, updated packages from the apt-get repository.

To learn how to install OpenCV on your Ubuntu 16.04 system, keep reading.

Note: Don’t care about Python bindings and simply want OpenCV installed on your system (likely for C++ coding)? No worries, this tutorial will still work for you. Follow along with the instructions and perform the steps — by the end of this article you’ll have OpenCV installed on your system. From there, just ignore the Python bindings and proceed as usual.

Ubuntu 16.04: How to install OpenCV

Before we get into this tutorial, I want to mention that Ubuntu 16.04 actually ships out-of-the-box with both Python 2.7 and Python 3.5 installed. The actual versions (as of 24 October 2016) are:

  • Python 2.7.12 (used by default when you type python in your terminal).
  • Python 3.5.2 (can be accessed via the python3 command).

Again, it’s worth repeating that Python 2.7 is still the default Python version used by Ubuntu. There are plans to migrate to Python 3 and use Python 3 by default; however, as far as I can tell, we are still a long way from that actually becoming a reality.

In either case, this tutorial will support both Python 2.7 and Python 3. I’ve highlighted the steps that require you to make a decision regarding which version of Python you would like to use. Make sure you are consistent with your decision, otherwise you will inevitably run into compile, linking, and import errors.

Regarding which Python version you should use…I’m not getting into that argument. I’ll simply say that you should use whichever version of Python you are comfortable with and use on a daily basis. Keep in mind that Python 3 is the future — but also keep in mind that porting Python 2.7 code to Python 3 isn’t terribly challenging either once you understand the differences between the Python versions. And as far as OpenCV goes, OpenCV 3 doesn’t care which version of Python you’re using: the bindings will work just the same.

All that said, let’s get started installing OpenCV with Python bindings on Ubuntu 16.04.

Step #1: Install OpenCV dependencies on Ubuntu 16.04

Most (in fact, all) steps in this tutorial will be accomplished by using your terminal. To start, open up your command line and update the apt-get package manager to refresh and upgrade any pre-installed packages/libraries:
$ sudo apt-get update
$ sudo apt-get upgrade

Next, let’s install some developer tools:

$ sudo apt-get install build-essential cmake pkg-config

The pkg-config package is (very likely) already installed on your system, but be sure to include it in the above apt-get command just in case. The cmake program is used to automatically configure our OpenCV build.

OpenCV is an image processing and computer vision library. Therefore, OpenCV needs to be able to load various image file formats from disk such as JPEG, PNG, TIFF, etc. In order to load these images from disk, OpenCV calls other image I/O libraries that facilitate the loading and decoding process. We install the necessary ones below:

$ sudo apt-get install libjpeg8-dev libtiff5-dev libjasper-dev libpng12-dev

Okay, so now we have libraries to load images from disk — but what about video? Use the following commands to install packages used to process video streams and access frames from cameras:

$ sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev
$ sudo apt-get install libxvidcore-dev libx264-dev

OpenCV ships out-of-the-box with a very limited set of GUI tools. These GUI tools allow us to display an image to our screen (cv2.imshow), wait for/record keypresses (cv2.waitKey), track mouse events (cv2.setMouseCallback), and create simple GUI elements such as sliders and trackbars. Again, you shouldn’t expect to be building full-fledged GUI applications with OpenCV — these are just simple tools that allow you to debug your code and build very simple applications.
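To make those concrete, here is a tiny highgui sketch that wires all three together (the image filename is hypothetical):

import cv2

def on_mouse(event, x, y, flags, param):
    # report the (x, y) coordinates of any left click
    if event == cv2.EVENT_LBUTTONDOWN:
        print("clicked at ({}, {})".format(x, y))

image = cv2.imread("example.jpg")
cv2.namedWindow("window")
cv2.setMouseCallback("window", on_mouse)
cv2.imshow("window", image)
cv2.waitKey(0)  # block until a key is pressed
cv2.destroyAllWindows()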

Internally, the name of the module that handles OpenCV GUI operations is highgui. The highgui module relies on the GTK library, which you should install using the following command:
$ sudo apt-get install libgtk-3-dev

Next, we install libraries that are used to optimize various functionalities inside OpenCV, such as matrix operations:

$ sudo apt-get install libatlas-base-dev gfortran

We’ll wrap up Step #1 by installing the Python development headers and libraries for both Python 2.7 and Python 3.5 (that way you have both):

$ sudo apt-get install python2.7-dev python3.5-dev

Note: If you do not install the Python development headers and static library, you’ll run into issues during Step #4 where we run cmake to configure our build. If these headers are not installed, then the cmake command will be unable to automatically determine the proper values of the Python interpreter and Python libraries. In short, the output of this section will look “empty” and you will not be able to build the Python bindings. When you get to Step #4, take the time to compare your output of the command to mine.

Step #2: Download the OpenCV source

At the time of this article’s publication, the most recent version of OpenCV is 3.1.0, which we download as a .zip and unarchive using the following commands:
$ cd ~
$ wget -O opencv.zip https://github.com/Itseez/opencv/archive/3.1.0.zip
$ unzip opencv.zip

When new versions of OpenCV are released, you can check the official OpenCV GitHub and download the latest release by changing the version number of the .zip.

However, we’re not done downloading source code yet — we also need the opencv_contrib repository as well:

$ wget -O opencv_contrib.zip https://github.com/Itseez/opencv_contrib/archive/3.1.0.zip
$ unzip opencv_contrib.zip

Why are we bothering to download the contrib repo as well?

Well, we want the full install of OpenCV 3 to have access to features (no pun intended) such as SIFT and SURF. In OpenCV 2.4, SIFT and SURF were included in the default installation of OpenCV. However, with the release of OpenCV 3+, these packages have been moved to contrib, which houses either (1) modules that are currently in development or (2) modules that are marked as “non-free” (i.e., patented). You can learn more about the reasoning behind the SIFT/SURF restructuring in this blog post.
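Once the compile further below finishes, a quick way to confirm the contrib modules made it in is to instantiate SIFT, which only exists when opencv_contrib was included. A minimal sketch (the image filename is hypothetical):

import cv2

image = cv2.imread("example.jpg", cv2.IMREAD_GRAYSCALE)

# cv2.xfeatures2d is only available when opencv_contrib was compiled in
sift = cv2.xfeatures2d.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(image, None)
print("detected {} keypoints".format(len(keypoints)))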

Note: You might need to expand the commands above using the “<=>” button during your copy and paste. The .zip in 3.1.0.zip may be cut off in smaller browser windows. The full URLs of both the opencv and opencv_contrib archives appear in the wget commands above.

I also want to mention that both your opencv and opencv_contrib versions should be the same (in this case, 3.1.0). If the version numbers do not match up, you could very easily run into compile-time errors (or worse, runtime errors that are near impossible to debug).

Step #3: Setup your Python environment — Python 2.7 or Python 3

We are now ready to start configuring our Python development environment for the build. The first step is to install pip, a Python package manager:
$ cd ~
$ wget https://bootstrap.pypa.io/get-pip.py
$ sudo python get-pip.py

I’ve mentioned this in every single OpenCV + Python install tutorial I’ve ever done, but I’ll say it again here today: I’m a huge fan of both virtualenv and virtualenvwrapper. These Python packages allow you to create separate, independent Python environments for each project that you are working on.

In short, using these packages allows you to solve the “Project X depends on version 1.x, but Project Y needs 4.x” dilemma. A fantastic side effect of using Python virtual environments is that you can keep your system Python neat, tidy, and free from clutter.

While you can certainly install OpenCV with Python bindings without Python virtual environments, I highly recommend you use them as other PyImageSearch tutorials leverage Python virtual environments. I’ll also be assuming that you have both virtualenv and virtualenvwrapper installed throughout the remainder of this guide.

If you would like a full, detailed explanation on why Python virtual environments are a best practice, you should absolutely give this excellent blog post on RealPython a read. I also provide some commentary on why I personally prefer Python virtual environments in the first half of this tutorial.

Again, let me reiterate that it’s standard practice in the Python community to be leveraging virtual environments of some sort, so I suggest you do the same:

$ sudo pip install virtualenv virtualenvwrapper
$ sudo rm -rf ~/get-pip.py ~/.cache/pip

Once we have virtualenv and virtualenvwrapper installed, we need to update our ~/.bashrc file to include the following lines at the bottom of the file:
# virtualenv and virtualenvwrapper
export WORKON_HOME=$HOME/.virtualenvs
source /usr/local/bin/virtualenvwrapper.sh

The ~/.bashrc file is simply a shell script that Bash runs whenever you launch a new terminal. You normally use this file to set various configurations. In this case, we are setting an environment variable called WORKON_HOME to point to the directory where our Python virtual environments live. We then load any necessary configurations from virtualenvwrapper.

To update your ~/.bashrc file simply use a standard text editor. I would recommend using nano, vim, or emacs. You can also use graphical editors, but if you’re just getting started, nano is likely the easiest to operate.

A simpler solution is to use the echo command with output redirection and avoid editors entirely:
$ echo -e "\n# virtualenv and virtualenvwrapper" >> ~/.bashrc
$ echo "export WORKON_HOME=$HOME/.virtualenvs" >> ~/.bashrc
$ echo "source /usr/local/bin/virtualenvwrapper.sh" >> ~/.bashrc

After editing our ~/.bashrc file, we need to reload the changes:
$ source ~/.bashrc

Note: Calling source on .bashrc only has to be done once for our current shell session. Anytime we open up a new terminal, the contents of .bashrc will be automatically executed (including our updates).

Now that we have installed virtualenv and virtualenvwrapper, the next step is to actually create the Python virtual environment — we do this using the mkvirtualenv command.

But before executing this command, you need to make a choice: Do you want to use Python 2.7 or Python 3?

The outcome of your choice will determine which command you run in the following section.

Creating your Python virtual environment

If you decide to use Python 2.7, use the following command to create a Python 2.7 virtual environment:

$ mkvirtualenv cv -p python2

Otherwise, use this command to create a Python 3 virtual environment:

$ mkvirtualenv cv -p python3

Regardless of which Python command you decide to use, the end result is that we have created a Python virtual environment named cv (short for “computer vision”).

You can name this virtual environment whatever you like (and create as many Python virtual environments as you want), but for the time being, I would suggest sticking with the cv name as that is what I’ll be using throughout the rest of this tutorial.

Verifying that you are in the “cv” virtual environment

If you ever reboot your Ubuntu system; log out and log back in; or open up a new terminal, you’ll need to use the workon command to re-access your cv virtual environment. An example of the workon command follows:
$ workon cv

To validate that you are in the cv virtual environment, simply examine your command line — if you see the text (cv) preceding your prompt, then you are in the cv virtual environment:
Figure 1: Make sure you see the “(cv)” text on your prompt, indicating that you are in the cv virtual environment.

Otherwise, if you do not see the (cv) text, then you are not in the cv virtual environment:
Figure 2: If you do not see the “(cv)” text on your prompt, then you are not in the cv virtual environment and need to run the “workon” command to resolve this issue.

To access the cv virtual environment simply use the workon command mentioned above.

Install NumPy into your Python virtual environment

The final step before we compile OpenCV is to install NumPy, a Python package used for numerical processing. To install NumPy, ensure you are in the cv virtual environment (otherwise NumPy will be installed into the system version of Python rather than the cv environment). From there execute the following command:
$ pip install numpy
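Why NumPy before OpenCV? The Python bindings represent every image as a NumPy ndarray, so NumPy must be in the environment both when the bindings are compiled and whenever they are used. A tiny illustration:

import numpy as np

# a 100x100 black 3-channel "image", exactly the structure that
# cv2.imread() returns
image = np.zeros((100, 100, 3), dtype="uint8")
print(image.shape)  # (100, 100, 3)
print(image.dtype)  # uint8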

Step #4: Configuring and compiling OpenCV on Ubuntu 16.04

At this point, all of our necessary prerequisites have been installed — we are now ready to compile OpenCV!

But before we do that, double-check that you are in the cv virtual environment by examining your prompt (you should see the (cv) text preceding it), and if not, use the workon command:
$ workon cv

After ensuring you are in the cv virtual environment, we can set up and configure our build using CMake:
$ cd ~/opencv-3.1.0/
$ mkdir build
$ cd build
$ cmake -D CMAKE_BUILD_TYPE=RELEASE \
    -D CMAKE_INSTALL_PREFIX=/usr/local \
    -D INSTALL_PYTHON_EXAMPLES=ON \
    -D INSTALL_C_EXAMPLES=OFF \
    -D OPENCV_EXTRA_MODULES_PATH=~/opencv_contrib-3.1.0/modules \
    -D PYTHON_EXECUTABLE=~/.virtualenvs/cv/bin/python \
    -D BUILD_EXAMPLES=ON ..

The above commands change directory to ~/opencv-3.1.0, which, if you have been following this tutorial, is where you downloaded and unarchived the .zip files.

Note: If you are getting an error related to stdlib.h: No such file or directory during either the cmake or make phase of this tutorial, you’ll also need to include the following option to CMake: -D ENABLE_PRECOMPILED_HEADERS=OFF. In this case I would suggest deleting your build directory, re-creating it, and then re-running CMake with the above option included. This will resolve the stdlib.h error. Thank you to Carter Cherry and Marcin for pointing this out in the comments section!

Inside this directory we create a sub-directory named build and change into it. The build directory is where the actual compile is going to take place.

Finally, we execute cmake to configure our build.

Before we move on to the actual compilation of OpenCV, make sure you examine the output of CMake!

To do this, scroll down to the sections titled Python 2 and Python 3.

If you are compiling OpenCV on Ubuntu 16.04 with Python 2.7 support, make sure the Python 2 section includes valid paths to the Interpreter, Libraries, numpy, and packages path. Your output should be similar to mine below:
Figure 3: Ensuring that Python 2.7 will be used when compiling OpenCV 3 for Ubuntu 16.04.

Examining this output, you can see that:

  1. The Interpreter points to the Python 2.7 binary in the cv virtual environment.
  2. Libraries points to the Python 2.7 library (which we installed during the final step of Step #1).
  3. The numpy value points to our NumPy installation in the cv virtual environment.
  4. And finally, the packages path points to lib/python2.7/site-packages. When combined with the CMAKE_INSTALL_PREFIX, this means that after compiling OpenCV, we’ll find our cv2.so bindings in /usr/local/lib/python2.7/site-packages/ (the sketch after this list shows how to verify this from Python).
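Once the install is finished, you can confirm from Python exactly which cv2 bindings are being loaded — handy for spotting a missing or mis-targeted sym-link. A minimal sketch:

import cv2

# the file the interpreter actually imported; after the sym-link step
# later in this tutorial, this should live under your cv virtual
# environment's site-packages directory
print(cv2.__file__)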

Similarly, if you’re compiling OpenCV on Ubuntu 16.04 with Python 3 support, you’ll want to ensure your Python 3 section looks similar to mine below:
Figure 4: Validating that Python 3 will be used when compiling OpenCV 3 for Ubuntu 16.04.

Again, notice how my Interpreter, Libraries, numpy, and packages path have all been correctly set.

If you do not see the cv virtual environment in these variables’ paths, it’s almost certainly because you were NOT in the cv virtual environment prior to running CMake!

If that is indeed the case, simply access the cv virtual environment by calling workon cv and re-run the CMake command mentioned above.

Assuming your CMake command exited without any errors, you can now compile OpenCV:

$ make -j4

The -j switch controls the number of processes to be used when compiling OpenCV — you’ll want to set this value to the number of processors/cores on your machine. In my case, I have a quad-core processor, so I set -j4.
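If you’re unsure how many cores your machine has, Python can tell you; this is one easy way to pick the value for the -j switch:

import multiprocessing

# e.g., prints 4 on a quad-core machine, suggesting "make -j4"
print(multiprocessing.cpu_count())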

Using multiple processes allows OpenCV to compile faster; however, there are times where race conditions are hit and the compile bombs out. While you can’t really tell if this is the case without a lot of previous experience compiling OpenCV, if you do run into an error, my first suggestion would be to run make clean to flush the build, followed by compiling using only a single core:
$ make clean
$ make

Below you can find a screenshot of a successful OpenCV + Python compile on Ubuntu 16.04:

Figure 5: Successfully compiling OpenCV 3 for Ubuntu 16.04.

The last step is to actually install OpenCV 3 on Ubuntu 16.04:

$ sudo make install
$ sudo ldconfig

Step #5: Finish your OpenCV install

You’re coming down the home stretch, just a few more steps to go and your Ubuntu 16.04 system will be all setup with OpenCV 3.

For Python 2.7:

After running sudo make install, your Python 2.7 bindings for OpenCV 3 should now be located in /usr/local/lib/python2.7/site-packages/. You can verify this using the ls command:
$ ls -l /usr/local/lib/python2.7/site-packages/
total 1972
-rw-r--r-- 1 root staff 2016608 Sep 15 09:11 cv2.so

Note: In some cases, you may find that OpenCV was installed in /usr/local/lib/python2.7/dist-packages rather than /usr/local/lib/python2.7/site-packages (note dist-packages versus site-packages). If your cv2.so bindings are not in the site-packages directory, be sure to check dist-packages.

The final step is to sym-link our OpenCV cv2.so bindings into our cv virtual environment for Python 2.7:
$ cd ~/.virtualenvs/cv/lib/python2.7/site-packages/
$ ln -s /usr/local/lib/python2.7/site-packages/cv2.so cv2.so

For Python 3.5:

After running sudo make install, your OpenCV + Python 3 bindings should be located in /usr/local/lib/python3.5/site-packages/. Again, you can verify this using the ls command:
$ ls -l /usr/local/lib/python3.5/site-packages/
total 1972
-rw-r--r-- 1 root staff 2016816 Sep 13 17:24 cv2.cpython-35m-x86_64-linux-gnu.so

I’ve been puzzled regarding this behavior ever since OpenCV 3 was released, but for some reason, when compiling OpenCV with Python 3 support, the output cv2.so filename is different. The actual filename might vary for you, but it should look something similar to cv2.cpython-35m-x86_64-linux-gnu.so.

Again, I have no idea exactly why this happens, but it’s a very easy fix. All we need to do is rename the file:

$ cd /usr/local/lib/python3.5/site-packages/
$ sudo mv cv2.cpython-35m-x86_64-linux-gnu.so cv2.so
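For what it’s worth, the long filename almost certainly comes from Python itself: since PEP 3149, Python 3 builds native extension modules with an ABI-tagged suffix. You can inspect the suffix your interpreter expects with a short sketch (Python 3 only):

import sysconfig

# the ABI-tagged suffix Python 3 appends to native extension modules
print(sysconfig.get_config_var("EXT_SUFFIX"))
# e.g., '.cpython-35m-x86_64-linux-gnu.so'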

After renaming cv2.cpython-35m-x86_64-linux-gnu.so to simply cv2.so, we can sym-link our OpenCV bindings into the cv virtual environment for Python 3.5:
$ cd ~/.virtualenvs/cv/lib/python3.5/site-packages/
$ ln -s /usr/local/lib/python3.5/site-packages/cv2.so cv2.so

Step #6: Testing your OpenCV install

Congratulations, you now have OpenCV 3 installed on your Ubuntu 16.04 system!

To verify that your installation is working:

  1. Open up a new terminal.
  2. Execute the workon command to access the cv Python virtual environment.
  3. Attempt to import the Python + OpenCV bindings.

I have demonstrated how to perform these steps below:

$ cd ~
$ workon cv
$ python
Python 3.5.2 (default, Jul  5 2016, 12:43:10) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> cv2.__version__
'3.1.0'
>>>

As you can see, I can import my OpenCV bindings into my Python 3.5 shell.

Below follows a screenshot of me utilizing the same steps outlined in this tutorial and importing OpenCV bindings into a Python 2.7 shell:

Figure 6: Ensuring that I can successfully import my Python + OpenCV bindings on Ubuntu 16.04.

Thus, regardless of which Python version you decide to use, simply follow the steps detailed in this tutorial and you’ll be able to install OpenCV + Python on your Ubuntu 16.04 system.

Once OpenCV has been installed, you can delete both the opencv-3.1.0 and opencv_contrib-3.1.0 directories (along with their associated .zip files):
$ cd ~
$ rm -rf opencv-3.1.0 opencv_contrib-3.1.0 opencv.zip opencv_contrib.zip

But again, be careful when running this command! You’ll want to make sure you have properly installed OpenCV on your system prior to blowing away these directories. Otherwise, you’ll need to restart the entire compile process!

Troubleshooting and FAQ

In this section, I address some of the common questions, problems, and issues that arise when installing OpenCV on Ubuntu 16.04.

Q. When I execute mkvirtualenv or workon, I get a “command not found” error.

A. There are three primary reasons why you would be getting this error message, all of which are related to Step #3:

  1. First, make sure you have installed virtualenv and virtualenvwrapper using the pip package manager. You can verify this by running pip freeze, examining the output, and ensuring that you see both virtualenv and virtualenvwrapper in the list of installed packages.
  2. Your ~/.bashrc file may not be updated correctly. To diagnose this, use a text editor such as nano and view the contents of your ~/.bashrc file. At the bottom of the file, you should see the proper export and source commands present (again, check Step #3 for the commands that should be appended to ~/.bashrc).
  3. After editing your ~/.bashrc file, you may have forgotten to source it and reload its contents. Make sure you run source ~/.bashrc after editing it to ensure the contents are reloaded — this will give you access to the mkvirtualenv and workon commands.

Q. Whenever I open a new terminal, logout, or reboot my Ubuntu system, I cannot execute the mkvirtualenv or workon commands.

A. See reason #2 from the previous question.

Q. When I (1) open up a Python shell that imports OpenCV or (2) execute a Python script that calls OpenCV, I get an error: ImportError: No module named cv2.

A. Unfortunately, the exact cause of this error message is extremely hard to diagnose as there are multiple reasons this could be happening. In general, I recommend the following suggestions to help diagnose and resolve the error:

  1. Make sure you are in the cv virtual environment by using the workon cv command. If this command gives you an error, then see the first question in this FAQ.
  2. Once you’ve ensured your ~/.bashrc file has been updated properly and source’d, try investigating the contents of the site-packages directory in your cv virtual environment. You can find the site-packages directory in ~/.virtualenvs/cv/lib/python2.7/site-packages/ or ~/.virtualenvs/cv/lib/python3.5/site-packages/ depending on your Python version. Make sure that (1) there is a cv2.so file in this site-packages directory and (2) that it’s properly sym-linked to a valid, existing file.
  3. Be sure to check the site-packages (and even dist-packages) directory for the system install of Python located in /usr/local/lib/python2.7/site-packages/ and /usr/local/lib/python3.5/site-packages/, respectively. Ideally, you should have a cv2.so file there.
  4. If all else fails, check the build/lib directory of your OpenCV build. There should be a cv2.so file there (provided that both cmake and make executed without error). If the cv2.so file is present, manually copy it into both the system site-packages directory as well as the site-packages directory for the cv virtual environment. (A small diagnostic sketch follows this list.)
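As promised, here is a small diagnostic sketch for the import error: it walks sys.path and reports any cv2.so it finds, flagging sym-links that point at a file that no longer exists:

import os
import sys

for directory in sys.path:
    candidate = os.path.join(directory, "cv2.so")
    # lexists() is True even for a broken sym-link
    if os.path.lexists(candidate):
        status = "OK" if os.path.exists(candidate) else "BROKEN sym-link"
        print("{}: {}".format(candidate, status))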

So, what’s next?

Congrats! You now have a brand new, fresh install of OpenCV on your Ubuntu 16.04 system — and I’m sure you’re just itching to leverage your install to build some awesome computer vision apps…

…but I’m also willing to bet that you’re just getting started learning computer vision and OpenCV, and probably feeling a bit confused and overwhelmed on exactly where to start.

Personally, I’m a big fan of learning by example, so a good first step would be to have some fun and read this blog post on detecting cats in images/videos. This tutorial is meant to be very hands-on and demonstrate how you can (quickly) build a Python + OpenCV application to detect the presence of cats in images.

And if you’re really interested in leveling-up your computer vision skills, you should definitely check out my book, Practical Python and OpenCV + Case Studies. My book not only covers the basics of computer vision and image processing, but also teaches you how to solve real-world computer vision problems including face detection in images and video streams, object tracking in video, and handwriting recognition.


So, let’s put that fresh install of OpenCV 3 on your Ubuntu 16.04 system to good use — just click here to learn more about the real-world projects you can solve using Practical Python and OpenCV.

Summary

In today’s blog post, I demonstrated how to install OpenCV 3 with either Python 2.7 or Python 3 bindings on your Ubuntu 16.04 system.

For more OpenCV install tutorials on other operating systems (such as OSX, Raspbian, etc.), please refer to this page where I provide additional links and resources.

But before you go…

If you’re interested in learning more about OpenCV, computer vision, and image processing, be sure to enter your email address in the form below to be notified when new blog posts are published!

The post Ubuntu 16.04: How to install OpenCV appeared first on PyImageSearch.

macOS: Install OpenCV 3 and Python 2.7


I’ll admit it: Compiling and installing OpenCV 3 on macOS Sierra was a lot more of a challenge than I thought it would be, even for someone who has compiled OpenCV on hundreds of machines over his lifetime.

If you’ve tried to use one of my previous tutorials on installing OpenCV on your freshly updated Mac (Sierra or greater) you likely ran into a few errors, specifically with the QTKit.h header files.

And even if you were able to resolve the QTKit problem, you likely ran into more issues trying to get your CMake command configured just right.

In order to help resolve any issues, problems, or confusion when installing OpenCV with Python bindings on macOS Sierra (or greater) I’ve decided to create two hyper-detailed tutorials:

  1. This first tutorial covers how to install OpenCV 3 with Python 2.7 bindings on macOS.
  2. My second tutorial will come next week where I’ll demonstrate how to install OpenCV 3 with Python 3.5 bindings on macOS.

I decided to break these tutorials into two separate blog posts because they are quite lengthy.

Furthermore, tuning your CMake command to get it exactly right can be a bit of a challenge, especially if you’re new to compiling OpenCV from source, so I wanted to take the time to devise a foolproof method to help readers get OpenCV installed on macOS.

To learn how to install OpenCV with Python 2.7 bindings on your macOS system, keep reading.

macOS: Install OpenCV 3 and Python 2.7

The first part of this blog post details why I am creating a new tutorial for installing OpenCV 3 with Python bindings on the Mac Operating System. In particular, I explain a common error you may have run across — the QTKit.h header issue from the now deprecated QTKit library.

From there, I provide super detailed instructions on how to install OpenCV 3 + Python 2.7 on your macOS Sierra system or greater.

Avoiding the QTKit/QTKit.h file not found error

In the Mac OSX environment the QTKit (QuickTime Kit) Objective-C framework is used for manipulating, reading, and writing media. In OSX version 10.9 (Mavericks) QTKit was deprecated (source).

However, it wasn’t until the release of macOS Sierra that much of QTKit was removed and instead replaced with AVFoundation, the successor to QTKit. AVFoundation is the new framework for working with audiovisual media in iOS and macOS.

This created a big problem when compiling OpenCV on Mac systems — the QTKit headers were not found on the system and were expected to exist.

Thus, if you tried to compile OpenCV on your Mac using my previous tutorials your compile likely bombed out and you ended up with an error message similar to this:

fatal error: 'QTKit/QTKit.h' file not found
#import <QTKit/QTKit.h>
        ^
1 error generated.
make[2]: *** [modules/videoio/CMakeFiles/opencv_videoio.dir/src/cap_qtkit.mm.o] Error 1
make[1]: *** [modules/videoio/CMakeFiles/opencv_videoio.dir/all] Error 2
make: *** [all] Error 2

Even more problematic, neither the tagged release of OpenCV v3.0 nor v3.1 includes a fix for this issue.

That said, the latest commits to the OpenCV GitHub repo do address this issue; however, a new tagged release of v3.2 has yet to be published.

Fortunately, I’m happy to report that by using the latest commit to OpenCV’s GitHub repo we can install OpenCV on macOS Sierra and greater.

The trick is that we need to use the HEAD of the repo as opposed to a tagged release.

Once OpenCV 3.2 is released I’m sure the QTKit to AVFoundation migration will be included, but until then, if you want to install OpenCV 3 on your macOS system running Sierra or later, you’ll need to avoid using tagged releases and instead compile and install the development version of OpenCV 3.

How do I check my Mac Operating System version?

To check your Mac OS version, click the Apple icon at the very top-left corner of your screen in the menu bar and then select “About this Mac”.

A window should then pop up, similar to the one below:

Figure 1: Checking your OS version on Mac. My machine is currently running macOS Sierra (10.12).

If you are running macOS Sierra or greater, you can use this tutorial to help you install OpenCV 3 with Python 2.7 bindings.

If you are using an older version of the Mac Operating System (Mavericks, Yosemite, etc.), please refer to my previous tutorials.

Step #1: Install Xcode

Before we can even think about compiling OpenCV, we first need to install Xcode, a full blown set of software development tools for the Mac Operating System.

Register for an Apple Developer account

Before downloading Xcode you’ll want to register with the Apple Developer Program (it’s free). If you have an existing Apple ID (i.e., what you use to sign in to iTunes with) this is even easier. Simply provide some basic information such as name, address, etc. and you’ll be all set.

From there, the easiest way to download Xcode is via the App Store. Search for “Xcode” in the search bar, select it, and then click the “Get” button:

Figure 2: Selecting Xcode from the Apple App Store.

Xcode will then start to download and install. On my machine the download and install process took approximately 30 minutes.

Accept the Apple Developer license

Assuming this is the first time you’ve installed or used Xcode, you’ll need to accept the developer license (otherwise, you can skip this step). I prefer using the terminal whenever possible. You can use the following command to accept the Apple Developer License:

$ sudo xcodebuild -license

Scroll to the bottom of the license and accept it.

Install Apple Command Line Tools

Finally, we need to install the command line tools. These tools include packages such as make, GCC, clang, etc. This is absolutely a required step, so make sure you install the command line tools:

$ sudo xcode-select --install

After you enter the command above a window will pop up confirming that you want to install the command line tools:

Figure 3: Installing the Apple Command Line Tools on macOS.

Click “Install” and the Apple Command Line Tools will be downloaded and installed on your system. This should take less than 5 minutes.

Step #2: Install Homebrew

We are now ready to install Homebrew, a package manager for macOS. Think of Homebrew as the macOS equivalent of apt-get for Ubuntu and Debian-based systems.

Installing Homebrew is simple. Simply copy and paste the command underneath the “Install Homebrew” section of the Homebrew website (make sure you copy and paste the entire command) into your terminal:

$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

Once Homebrew is installed you should update it to ensure the most recent package definitions are downloaded:

$ brew update

The last step is to update our ~/.bash_profile file. This file may exist on your system already or it may not. In either case, open it with your favorite text editor (I’ll use nano in this case):
$ nano ~/.bash_profile

And insert the following lines at the bottom of the file (if ~/.bash_profile does not exist, the file will be empty, so simply add these lines):
# Homebrew
export PATH=/usr/local/bin:$PATH

The above snippet updates your PATH variable to look for binaries/libraries along the Homebrew path before searching your system path.

After updating the file, save and exit the editor. I have included a screenshot of my ~/.bash_profile below:
Figure 4: Updating my .bash_profile file to include Homebrew.

You should then use the source command to ensure the changes to your ~/.bash_profile file are manually reloaded:
$ source ~/.bash_profile

This command only needs to be executed once. Whenever you login, open up a new terminal, etc., your .bash_profile will automatically be loaded and sourced for you.

Step #3: Setup Homebrew for Python 2.7 and macOS

In general, you do not want to develop against the system Python as your main interpreter. This is considered bad form. The system version of Python should (in an ideal world) serve only one purpose — support system operations.

Instead, you’ll want to install your own version of Python that is independent of the system one. Installing Python via Homebrew is dead simple:

$ brew install python

Note: This tutorial covers how to install OpenCV 3 with Python 2.7 bindings on macOS. Next week I’ll be covering OpenCV 3 with Python 3 bindings — if you want to use Python 3 with OpenCV on macOS, please refer to next week’s blog post.

After the install command finishes we just need to run the following command to complete the Python installation:

$ brew linkapps python

To confirm that we are using the Homebrew version of Python rather than the system version of Python, you should use the which command:
$ which python
/usr/local/bin/python

Important: Be sure to inspect the output of the which command! If you see /usr/local/bin/python, then you are correctly using the Homebrew version of Python.

However, if the output is /usr/bin/python, then you are incorrectly using the system version of Python. If this is the case then you should ensure:

  1. Homebrew installed without error.
  2. The brew install python command completed successfully.
  3. You have properly updated your ~/.bash_profile file and reloaded the changes using source. This basically boils down to making sure your ~/.bash_profile looks like mine above in Figure 4.

Step #4: Install virtualenv, virtualenvwrapper, and NumPy

We are now ready to install three Python packages: virtualenv and virtualenvwrapper, along with NumPy, used for numerical processing.

Installing virtualenv and virtualenvwrapper

The virtualenv and virtualenvwrapper packages allow us to create separate, independent Python environments for each project we are working on. I’ve mentioned Python virtual environments many times before, so I won’t rehash what’s already been said. Instead, if you are unfamiliar with Python virtual environments, how they work, and why we use them, please refer to the first half of this blog post. There is also an excellent tutorial on the RealPython.com blog that takes a deeper dive into Python virtual environments.

To install virtualenv and virtualenvwrapper, just use pip:
$ pip install virtualenv virtualenvwrapper

After these packages have been installed we once again need to update our ~/.bash_profile file:
# Virtualenv/VirtualenvWrapper
source /usr/local/bin/virtualenvwrapper.sh

After updating, your ~/.bash_profile should look similar to mine below:
Figure 5: Update your .bash_profile file to include virtualenv/virtualenvwrapper.

Save and exit your text editor, followed by refreshing your environment using the source command:
$ source ~/.bash_profile

Again, this command only needs to be executed once. Whenever you open up a new terminal the contents of your .bash_profile file will be automatically loaded for you.

Creating your Python virtual environment

Assuming the above commands completed without error, we can now use the mkvirtualenv command to create our Python virtual environment. We’ll name this Python virtual environment cv:
$ mkvirtualenv cv

This command will create a Python environment that is independent from all other Python environments on the system (meaning this environment has its own separate site-packages directory, etc.). This is the virtual environment we will be using when compiling and installing OpenCV.

The mkvirtualenv command only needs to be executed once. If you ever need to access this virtual environment again, just use the workon command:
$ workon cv

To validate that you are in the cv virtual environment, simply examine your command line — if you see the text (cv) preceding the prompt, then you are in the cv virtual environment:
Figure 6: Make sure you see the “(cv)” text on your prompt, indicating that you are in the cv virtual environment.

Otherwise, if you do not see the (cv) text, then you are not in the cv virtual environment:
Figure 7: If you do not see the “(cv)” text on your prompt, then you are not in the cv virtual environment and you need to run the “workon” command to resolve this issue before continuing.

To access the cv virtual environment simply use the workon command mentioned above.

Install NumPy

The last step is to install NumPy, a scientific computing package for Python.

Ensure you are in the cv virtual environment (otherwise NumPy will be installed into the system version of Python rather than the cv environment) and then install NumPy using pip:
$ pip install numpy

Step #5: Install OpenCV prerequisites using Homebrew

OpenCV requires a number of prerequisites, all of which can be installed easily using Homebrew.

Some of these packages are related to tools used to actually build and compile OpenCV while others are used for image I/O operations (i.e., loading various image file formats such as JPEG, PNG, TIFF, etc.)

To install the required prerequisites for OpenCV on macOS, just execute these commands:

$ brew install cmake pkg-config
$ brew install jpeg libpng libtiff openexr
$ brew install eigen tbb

Step #6: Download the OpenCV 3 source from GitHub

As I mentioned at the top of this tutorial, we need to compile OpenCV from the latest commit, not a tagged release. This requires us to download the OpenCV GitHub repo:

$ cd ~
$ git clone https://github.com/opencv/opencv

Along with the opencv_contrib repo:

$ git clone https://github.com/opencv/opencv_contrib

Step #7: Configuring OpenCV 3 and Python 2.7 via CMake on macOS

In this section I detail how to configure your OpenCV 3 + Python 2.7 build on macOS Sierra using CMake.

First, I demonstrate how to set up your build by creating the build directory.

I then provide a CMake build template that you can use. This template requires you to fill in two values — the path to your libpython2.7.dylib file and the path to your Python.h headers.

I will help you find and determine the correct values for these two paths.

Finally, I provide an example of a fully completed CMake command. However, please take note that this command is specific to my machine. Your CMake command may be slightly different due to the paths specified. Please read the rest of this section for more details.

Setting up the build

In order to compile OpenCV 3 with Python 2.7 support for macOS we need to first set up the build. This simply amounts to changing directories into opencv and creating a build directory:
$ cd ~/opencv
$ mkdir build
$ cd build

OpenCV 3 + Python 2.7 CMake template for macOS

In order to make the compile and install process easier, I have constructed the following template OpenCV 3 + Python 2.7 CMake template:

$ cmake -D CMAKE_BUILD_TYPE=RELEASE \
    -D CMAKE_INSTALL_PREFIX=/usr/local \
    -D OPENCV_EXTRA_MODULES_PATH=~/opencv_contrib/modules \
    -D PYTHON2_LIBRARY=YYY \
    -D PYTHON2_INCLUDE_DIR=ZZZ \
    -D PYTHON2_EXECUTABLE=$VIRTUAL_ENV/bin/python \
    -D BUILD_opencv_python2=ON \
    -D BUILD_opencv_python3=OFF \
    -D INSTALL_PYTHON_EXAMPLES=ON \
    -D INSTALL_C_EXAMPLES=OFF \
    -D BUILD_EXAMPLES=ON ..

Looking at this template I want to point out a few things to you:

  1. BUILD_opencv_python2=ON: This indicates that we want to build Python 2.7 bindings for our OpenCV 3 install.
  2. BUILD_opencv_python3=OFF: Since we are compiling Python 2.7 bindings we need to explicitly state that we do not want Python 3 bindings. Failure to include these two switches can cause problems in the CMake configuration process.
  3. PYTHON2_LIBRARY=YYY: This is the first value you need to fill in yourself. You will need to replace YYY with the path to your libpython2.7.dylib file (I will help you find it in the next section).
  4. PYTHON2_INCLUDE_DIR=ZZZ: This is the second value you will need to fill in. You need to replace ZZZ with the path to your Python.h headers (again, I’ll help you determine this path; the sketch after this list also offers a programmatic shortcut).
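As an alternative to the ls + wildcard trick in the next section, you can also ask the interpreter itself where its headers and library live. Run this sketch with the Homebrew Python you intend to compile against; the reported paths may differ slightly from the Cellar paths found via ls:

from distutils import sysconfig

# directory containing Python.h, a candidate for PYTHON2_INCLUDE_DIR
print(sysconfig.get_python_inc())

# directory containing the Python library; append the libpython2.7.dylib
# filename to form a candidate PYTHON2_LIBRARY value
print(sysconfig.get_config_var("LIBDIR"))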

Determining your Python 2.7 library and include directory

Let’s start by configuring your PYTHON2_LIBRARY value. This switch should point to our libpython2.7.dylib file. You can find this file within many nested subdirectories of /usr/local/Cellar/python/. To find the exact path to the libpython2.7.dylib file, just use the ls command along with the wildcard asterisk:
$ ls /usr/local/Cellar/python/2.7.*/Frameworks/Python.framework/Versions/2.7/lib/python2.7/config/libpython2.7.dylib
/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/config/libpython2.7.dylib

Take note of the output of this command — this is the full path to your libpython2.7.dylib file and will replace YYY in the CMake template above.

Next, let’s determine the PYTHON2_INCLUDE_DIR. This path should point to the Python.h headers used to generate our actual OpenCV + Python 2.7 bindings.

Again, we’ll use the same ls and wildcard trick here to determine the proper path:
$ ls -d /usr/local/Cellar/python/2.7.*/Frameworks/Python.framework/Versions/2.7/include/python2.7/
/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/include/python2.7/

The output of the ls -d command is our full path to the Python.h headers. This value will replace ZZZ in the CMake template.

Filling in the CMake template

Now that you’ve determined the PYTHON2_LIBRARY and PYTHON2_INCLUDE_DIR values, you need to update the CMake command with these values.

On my particular machine the full CMake command looks like this:

$ cmake -D CMAKE_BUILD_TYPE=RELEASE \
    -D CMAKE_INSTALL_PREFIX=/usr/local \
    -D OPENCV_EXTRA_MODULES_PATH=~/opencv_contrib/modules \
    -D PYTHON2_LIBRARY=/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/config/libpython2.7.dylib \
    -D PYTHON2_INCLUDE_DIR=/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/include/python2.7/ \
    -D PYTHON2_EXECUTABLE=$VIRTUAL_ENV/bin/python \
    -D BUILD_opencv_python2=ON \
    -D BUILD_opencv_python3=OFF \
    -D INSTALL_PYTHON_EXAMPLES=ON \
    -D INSTALL_C_EXAMPLES=OFF \
    -D BUILD_EXAMPLES=ON ..

However, please do not copy and paste my exact CMake command — make sure you have used the instructions above to properly determine your PYTHON2_LIBRARY and PYTHON2_INCLUDE_DIR values.

Once you’ve filled in these values, execute your cmake command and your OpenCV 3 + Python 2.7 build will be configured.

As an example, take a look at the Python 2 section of the output from my configuration:

Figure 8: Ensuring that Python 2.7 will be used when compiling OpenCV 3 for macOS.

You’ll want to make sure that:

  1. The Interpreter points to the Python binary in your cv virtual environment.
  2. Libraries points to your libpython2.7.dylib file.
  3. The numpy version being utilized is the one you installed in your cv virtual environment.

Step #8: Compile and install OpenCV on macOS

Assuming your cmake command exited without error and your Python 2 section is properly configured, you can now compile OpenCV:
$ make -j4

The -j switch controls the number of parallel processes used to compile OpenCV. We normally set this to the number of available cores/processors on our machine. Since I’m on a quad-core system, I use -j4.
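
If you’re not sure how many cores your machine has, you can ask macOS directly via sysctl (a quick sanity check of my own, not a required step):

$ sysctl -n hw.ncpu
4
$ make -j$(sysctl -n hw.ncpu)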

OpenCV can take a while to compile (30-90 minutes) depending on the speed of your machine. A successful compile will end with a 100% completion:

Figure 9: Successfully compiling OpenCV 3 from source with Python 2.7 bindings on macOS.

Assuming that OpenCV compiled without error, you can now install it on your macOS system:

$ sudo make install

Step #9: Sym-link your OpenCV 3 + Python 2.7 bindings

After running make install you should now see a file named cv2.so in /usr/local/lib/python2.7/site-packages:
$ cd /usr/local/lib/python2.7/site-packages/
$ ls -l cv2.so 
-rwxr-xr-x  1 root  admin  3694564 Nov 15 09:20 cv2.so

The cv2.so file is your actual set of OpenCV 3 + Python 2.7 bindings.

However, we need to sym-link these bindings into our cv virtual environment. This can be accomplished using the following commands:
$ cd ~/.virtualenvs/cv/lib/python2.7/site-packages/
$ ln -s /usr/local/lib/python2.7/site-packages/cv2.so cv2.so
$ cd ~
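
As a quick sanity check (my own suggestion, not a required step), you can confirm the sym-link resolves to the actual bindings; ls -l should show the link pointing at /usr/local/lib/python2.7/site-packages/cv2.so:

$ ls -l ~/.virtualenvs/cv/lib/python2.7/site-packages/cv2.so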

Step #10: Testing your OpenCV install on macOS

To verify that your OpenCV 3 + Python 2.7 installation on macOS is working:

  1. Open up a new terminal.
  2. Execute the workon command to access the cv Python virtual environment.
  3. Attempt to import the Python + OpenCV bindings.

Here are the exact steps to test the install process:

$ workon cv
$ python
Python 2.7.12 (default, Oct 11 2016, 05:20:59) 
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.38)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> cv2.__version__
'3.1.0-dev'
>>>

Note: Take note of the -dev in the cv2.__version__. This indicates that we are using the development version of OpenCV and not a tagged release. Once OpenCV 3.2 is released these instructions can be updated to simply download a .zip of the tagged version rather than having to clone down the entire repositories.

I’ve also included a screenshot below that utilizes these same steps. As you can see, I can access my OpenCV 3 bindings from a Python 2.7 shell on macOS Sierra:

Figure 10: Ensuring that I can successfully import my OpenCV 3 + Python 2.7 bindings on macOS.

Congratulations, you have installed OpenCV 3 with Python 2.7 bindings on your macOS system!

So, what’s next?

Congrats! You now have a brand new, fresh install of OpenCV on your macOS system — and I’m sure you’re just itching to leverage your install to build some awesome computer vision apps…

…but I’m also willing to bet that you’re just getting started learning computer vision and OpenCV, and probably feeling a bit confused and overwhelmed on exactly where to start.

Personally, I’m a big fan of learning by example, so a good first step would be to have some fun and read this blog post on detecting cats in images/videos. This tutorial is meant to be very hands-on and demonstrate how you can (quickly) build a Python + OpenCV application to detect the presence of cats in images.

And if you’re really interested in leveling-up your computer vision skills, you should definitely check out my book, Practical Python and OpenCV + Case Studies. My book not only covers the basics of computer vision and image processing, but also teaches you how to solve real-world computer vision problems including face detection in images and video streams, object tracking in video, and handwriting recognition.

curious_about_cv

So, let’s put that fresh install of OpenCV 3 on your macOS system to good use — just click here to learn more about the real-world projects you can solve using Practical Python and OpenCV.

Summary

In this blog post I demonstrated how to install OpenCV 3 with Python 2.7 bindings on macOS Sierra and above.

Next week I’ll have a second tutorial, this one covering OpenCV 3 with Python 3.5 bindings on macOS.

For more OpenCV install tutorials on other operating systems (such as Ubuntu, Raspbian, etc.), please refer to this page where I provide additional links and resources.

But before you go

If you’re interested in learning more about OpenCV, computer vision, and image processing be sure to enter your email address in the form below to be notified when new blog posts + tutorials are published!

The post macOS: Install OpenCV 3 and Python 2.7 appeared first on PyImageSearch.

macOS: Install OpenCV 3 and Python 3.5


sierra_os_contours_example

Last week I covered how to install OpenCV 3 with Python 2.7 bindings on macOS Sierra and above.

In today’s tutorial we’ll learn how to install OpenCV 3 with Python 3.5 bindings on macOS.

I decided to break these install tutorials into two separate guides to keep them well organized and easy to follow.

To learn how to install OpenCV 3 with Python 3.5 bindings on your macOS system, just keep reading.

macOS: Install OpenCV 3 and Python 3.5

As I mentioned in the introduction to this post, I spent last week covering how to install OpenCV 3 with Python 2.7 bindings on macOS.

Many of the steps in last week’s tutorial and today’s tutorial are very similar (and in some cases identical) so I’ve tried to trim down some of the explanations for each step to reduce redundancy. If you find any step confusing or troublesome I would suggest referring to the OpenCV 3 + Python 2.7 tutorial where I have provided more insight.

The exception to this is “Step #7: Configure OpenCV 3 and Python 3.5 via CMake on macOS” where I provide an extremely thorough walkthrough on how to configure your OpenCV build. You should pay extra special attention to this step to ensure your OpenCV build has been configured correctly.

With all that said, let’s go ahead and install OpenCV 3 with Python 3.5 bindings on macOS.

Step #1: Install Xcode

Before we can compile OpenCV on our system, we first need to install Xcode, Apple’s set of software development tools for the Mac Operating System.

The easiest method to download Xcode is to open up the App Store application on your desktop, search for “Xcode” in the search bar, and then click the “Get” button:

Figure 1: Downloading and installing Xcode on macOS.

Figure 1: Downloading and installing Xcode on macOS.

After installing Xcode you’ll want to open up a terminal and ensure you have accepted the developer license:

$ sudo xcodebuild -license

We also need to install the Apple Command Line Tools. These tools include programs and libraries such as GCC, make, clang, etc. You can use the following command to install the Apple Command Line Tools:

$ sudo xcode-select --install

When executing the above command a confirmation window will pop up asking you to confirm the install:

Figure 2: Installing the Apple Command Line Tools on macOS.

Figure 2: Installing the Apple Command Line Tools on macOS.

Click the “Install” button to continue. The actual installation process should take less than 5 minutes to complete.

Step #2: Install Homebrew

The next step is to install Homebrew, a package manager for macOS. You can think of Homebrew as the macOS equivalent of Ubuntu/Debian-based apt-get.

Installing Homebrew itself is super easy, just copy and paste the entire command below:

$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

Once Homebrew is installed you should make sure the package definitions are up to date by running:

$ brew update

We now need to update our ~/.bash_profile file (or create it if it doesn’t exist already). Open up the file using your favorite text editor (I’m using nano in this case):
$ nano ~/.bash_profile

And then add the following lines to the file:

# Homebrew
export PATH=/usr/local/bin:$PATH

This export command simply updates the PATH variable to look for binaries/libraries along the Homebrew path before the system path is consulted.

I have included a screenshot of what my ~/.bash_profile looks like as reference below:

Figure 3: Updating my .bash_profile file to include Homebrew.

After updating your .bash_profile file, save and exit the editor, then use source to ensure the changes to the .bash_profile are manually reloaded:
$ source ~/.bash_profile

This command only needs to be executed once. Anytime you open up a new terminal your .bash_profile will automatically be source‘d for you.

Step #3: Setup Homebrew for Python 3.5 and macOS

It is considered bad form to develop against the system Python as your main interpreter. The system version of Python should serve only one purpose — support system routines and operations. There is also the fact that macOS does not ship with Python 3 out of the box.

Instead, you should install your own version of Python that is independent from the system install. Using Homebrew, we can install Python 3 using the following command:

$ brew install python3

Note: Make sure you don’t forget the “3” in “python3”. The above command will install Python 3.5 on your system. However, if you leave off the “3” you’ll end up installing Python 2.7.

After the Python 3 install completes we need to create some symbolic links:

$ brew linkapps python3

As a sanity check, it’s important to confirm that you are using the Homebrew version of Python 3 rather than the system version of Python 3. To accomplish this, simply use the which command:
$ which python3
/usr/local/bin/python3

Important: Inspect this output closely. If you see /usr/local/bin/python3 then you are correctly using the Homebrew version of Python. However, if the output is /usr/bin/python3 then you are incorrectly using the system version of Python.

If you find yourself using the system version of Python instead of the Homebrew version you should:

  1. Ensure Homebrew installed without error.
  2. Check that brew install python3 finished successfully.
  3. Verify that you have properly updated your ~/.bash_profile and reloaded the changes using source. This basically boils down to making sure your ~/.bash_profile looks like mine above in Figure 3.

Step #4: Install Python virtual environments and NumPy

We’ve made good progress so far. We’ve installed a non-system version of Python 3 via Homebrew. However, let’s not stop there. Let’s install both virtualenv and virtualenvwrapper so we can create separate, independent Python environments for each project we are working on — this is considered a best practice when developing software in the Python programming language.

I’ve already discussed Python virtual environments ad nauseam in previous blog posts, so if you’re curious about how they work and why we use them, please refer to the first half of this blog post. I also highly recommend reading through this excellent tutorial on the RealPython.com blog that takes a deeper dive into Python virtual environments.

Install virtualenv and virtualenvwrapper

Installing both virtualenv and virtualenvwrapper is a snap using pip:
$ pip install virtualenv virtualenvwrapper

After these packages have been installed we need to update our ~/.bash_profile again:
$ nano ~/.bash_profile

Once opened, append the following lines to the file:

# Virtualenv/VirtualenvWrapper
source /usr/local/bin/virtualenvwrapper.sh

After updating, your ~/.bash_profile should look similar to mine:

Figure 4: Updating your .bash_profile file to include virtualenv/virtualenvwrapper.

After updating your .bash_profile, save it, exit, and then once again source it:
$ source ~/.bash_profile

I’ll reiterate that this command only needs to be executed once. Each time you open up a new terminal window this file will automatically be source‘d for you.

Create your Python 3 virtual environment

We can now use the mkvirtualenv command to create a Python 3 virtual environment named cv:
$ mkvirtualenv cv -p python3

The -p python3 switch ensures that a Python 3 virtual environment is created instead of a Python 2.7 one.

Again, the above command will create a Python environment named cv that is independent from all other Python environments on your system. This environment will have its own site-packages directory, etc., allowing you to avoid library versioning issues across projects.

The mkvirtualenv command only needs to be executed once. To access the cv Python virtual environment after it has been created, just use the workon command:
$ workon cv

To validate that you are in the cv virtual environment, just examine your command line. If you see the text (cv) preceding the prompt, then you are in the cv virtual environment:

Figure 5: Make sure you see the “(cv)” text on your prompt, indicating that you are in the cv virtual environment.

Otherwise, if you do not see the (cv) text, then you are not in the cv virtual environment:

Figure 6: If you do not see the “(cv)” text on your prompt, then you are not in the cv virtual environment and you need to run the “workon” command to resolve this issue before continuing.

If you find yourself in this situation all you need to do is utilize the workon command mentioned above.

Install NumPy

The only Python-based prerequisite that OpenCV needs is NumPy, a scientific computing package.

To install NumPy into our cv virtual environment, ensure you are in the cv environment (otherwise NumPy will be installed into the system version of Python) and then utilize pip to handle the actual installation:
$ pip install numpy
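
As a quick check (my own suggestion, not a required step), you can confirm that NumPy landed in the cv environment rather than the system Python:

$ python -c "import numpy; print(numpy.__version__)"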

Step #5: Install OpenCV prerequisites using Homebrew

OpenCV requires a few prerequisites to be installed before we compile it. These packages are related to either (1) tools used to build and compile, (2) libraries used for image I/O operations (i.e., loading various image file formats from disk such as JPEG, PNG, TIFF, etc.) or (3) optimization libraries.

To install these prerequisites for OpenCV on macOS execute the following commands:

$ brew install cmake pkg-config
$ brew install jpeg libpng libtiff openexr
$ brew install eigen tbb

Step #6: Download the OpenCV 3 source from GitHub

As I detailed in last week’s tutorial, OpenCV 3 on macOS needs to be compiled via the latest commit to GitHub instead of an actual tagged release (i.e., 3.0, 3.1, etc.). This is because the current tagged releases of OpenCV do not provide fixes for the QTKit vs. AVFoundation errors (please see last week’s blog post for a thorough discussion on this).

First, we need to download the OpenCV GitHub repo:

$ cd ~
$ git clone https://github.com/opencv/opencv

Followed by the opencv_contrib repo:

$ git clone https://github.com/opencv/opencv_contrib

Step #7: Configure OpenCV and Python 3.5 via CMake on macOS

This section of the tutorial is the most challenging and the one that you’ll want to pay the most attention to.

First, I’ll demonstrate how to set up your build by creating a build directory.

I then provide a CMake template that you can use to start the process of compiling OpenCV 3 with Python 3.5 bindings on macOS. This template requires you to fill in two values:

  1. The path to your libpython3.5.dylib file.
  2. The path to your Python.h headers for Python 3.5.

I will help you find and determine the correct values for these paths.

Finally, I provide a fully completed CMake command as an example. Please note that this command is specific to my machine. Your CMake command may be slightly different due to the paths specified. Please read the rest of this section for details.

Setting up the build

In order to compile OpenCV with Python 3.5 bindings for macOS we first need to set up the build. This simply amounts to changing directories and creating a build directory:
$ cd ~/opencv
$ mkdir build
$ cd build

OpenCV 3 + Python 3.5 CMake template for macOS

The next part, where we configure our actual build, gets a little tricky. In order to make this process easier I have constructed the following OpenCV 3 + Python 3.5 CMake template:

$ cmake -D CMAKE_BUILD_TYPE=RELEASE \
    -D CMAKE_INSTALL_PREFIX=/usr/local \
    -D OPENCV_EXTRA_MODULES_PATH=~/opencv_contrib/modules \
    -D PYTHON3_LIBRARY=YYY \
    -D PYTHON3_INCLUDE_DIR=ZZZ \
    -D PYTHON3_EXECUTABLE=$VIRTUAL_ENV/bin/python \
    -D BUILD_opencv_python2=OFF \
    -D BUILD_opencv_python3=ON \
    -D INSTALL_PYTHON_EXAMPLES=ON \
    -D INSTALL_C_EXAMPLES=OFF \
    -D BUILD_EXAMPLES=ON ..

Looking at this template I want to point out a few things to you:

  1. BUILD_opencv_python2=OFF: This switch indicates that we do not want to build Python 2.7 bindings. This needs to be explicitly stated in the CMake command. Failure to do this can cause problems when we actually run CMake.
  2. BUILD_opencv_python3=ON: We would like the OpenCV 3 + Python 3.5 bindings to be built. This instruction indicates to CMake that the Python 3.5 bindings should be built rather than the Python 2.7 ones.
  3. PYTHON3_LIBRARY=YYY: This is the first value that you need to fill in yourself. You need to replace YYY with the path to your libpython3.5.dylib file. I will help you find the path to this value in the next section.
  4. PYTHON3_INCLUDE_DIR=ZZZ: This is the second value that you need to fill in. You will need to replace ZZZ with the path to your Python.h headers. Again, I will help you determine this path.

Determining your Python 3.5 library and include directory

We will start by configuring your PYTHON3_LIBRARY value. This switch should point to your libpython3.5.dylib file, which is located within many nested subdirectories of /usr/local/Cellar/python3. To find the exact path to the libpython3.5.dylib file, just use the ls command with a wildcard (auto-tab complete works as well):
$ ls /usr/local/Cellar/python3/3.*/Frameworks/Python.framework/Versions/3.5/lib/python3.5/config-3.5m/libpython3.5.dylib
/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/config-3.5m/libpython3.5.dylib

Take note of the output of this command — this is the full path to your libpython3.5.dylib file and will replace YYY in the CMake template above.

Let’s move along to determining the PYTHON3_INCLUDE_DIR variable. This path should point to the Python.h header files for Python 3.5 used to generate the actual OpenCV 3 + Python 3.5 bindings.

Again, we’ll use the same ls and wildcard trick here to determine the proper path:
$ ls -d /usr/local/Cellar/python3/3.*/Frameworks/Python.framework/Versions/3.5/include/python3.5m/
/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/include/python3.5m/

The output of the ls -d command is our full path to the Python.h headers. This value will replace ZZZ in the CMake template.
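
If you would rather not hunt through the Cellar by hand, Python itself can report these paths via the sysconfig module. This is a hedged alternative of my own; the exact config variable names can vary between builds, so treat the output as a cross-check against the ls results above:

$ python3 -c "import sysconfig; print(sysconfig.get_paths()['include'])"
$ python3 -c "import sysconfig; print(sysconfig.get_config_var('LIBPL'))"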

Filling in the CMake template

Now that we’ve determined the PYTHON3_LIBRARY and PYTHON3_INCLUDE_DIR values, we need to update the CMake command to reflect these paths.

On my machine, the full CMake command to configure my OpenCV 3 + Python 3.5 build looks like:

$ cmake -D CMAKE_BUILD_TYPE=RELEASE \
    -D CMAKE_INSTALL_PREFIX=/usr/local \
    -D OPENCV_EXTRA_MODULES_PATH=~/opencv_contrib/modules \
    -D PYTHON3_LIBRARY=/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/lib/python3.5/config-3.5m/libpython3.5.dylib \
    -D PYTHON3_INCLUDE_DIR=/usr/local/Cellar/python3/3.5.2_3/Frameworks/Python.framework/Versions/3.5/include/python3.5m/ \
    -D PYTHON3_EXECUTABLE=$VIRTUAL_ENV/bin/python \
    -D BUILD_opencv_python2=OFF \
    -D BUILD_opencv_python3=ON \
    -D INSTALL_PYTHON_EXAMPLES=ON \
    -D INSTALL_C_EXAMPLES=OFF \
    -D BUILD_EXAMPLES=ON ..

However, please do not copy and paste my exact CMake command — make sure you have used the instructions above to properly determine your PYTHON3_LIBRARY and PYTHON3_INCLUDE_DIR values.

Once you’ve filled in these values, execute your cmake command and your OpenCV 3 + Python 3.5 build will be configured.

As an example, take a look at the Python 3 section of the output from my configuration:

Figure 7: Ensuring that Python 3.5 will be used when compiling OpenCV 3 for macOS.

In particular, you’ll want to make sure that:

  1. The Interpreter points to the Python binary in your cv virtual environment.
  2. Libraries points to your libpython3.5.dylib file.
  3. The numpy version being utilized is the one you installed in your cv virtual environment.

Step #8: Compile and install OpenCV 3 on macOS

After investigating your cmake command and ensuring it exited without error (and that the Python 3 section was properly configured), you can now compile OpenCV:
$ make -j4

In this case, I am supplying -j4 to compile OpenCV using all four cores on my machine. You can tune this value based on the number of processors/cores you have.

OpenCV can take a while to compile, anywhere from 30-90 minutes, depending on your system specs. I would consider going for a nice long walk while it compiles.

A successful compile will end with a 100% completion:

Figure 8: Successfully compiling OpenCV 3 from source with Python 3.5 bindings on macOS.

Assuming that OpenCV compiled without error, you can now install it on your macOS system:

$ sudo make install

Step #9: Rename and sym-link your OpenCV 3 + Python 3.5 bindings

After running sudo make install your OpenCV 3 + Python 3.5 bindings should be located in /usr/local/lib/python3.5/site-packages. You can verify this by using the ls command:
$ cd /usr/local/lib/python3.5/site-packages/
$ ls -l *.so
-rwxr-xr-x  1 root  admin  3694564 Nov 15 11:28 cv2.cpython-35m-darwin.so

I’ve been perplexed by this behavior ever since OpenCV 3 was released, but for some reason, when compiling OpenCV with Python 3 support enabled, the output cv2.so bindings are named differently. The actual filename will vary a bit depending on your system architecture, but it should look something like cv2.cpython-35m-darwin.so.

Again, I don’t know exactly why this happens, but it’s an easy fix. All we need to do is rename the file to cv2.so:
$ cd /usr/local/lib/python3.5/site-packages/
$ mv cv2.cpython-35m-darwin.so cv2.so
$ cd ~

After renaming cv2.cpython-35m-darwin.so to cv2.so, we then need to sym-link our OpenCV bindings into the cv virtual environment for Python 3.5:
$ cd ~/.virtualenvs/cv/lib/python3.5/site-packages/
$ ln -s /usr/local/lib/python3.5/site-packages/cv2.so cv2.so
$ cd ~

Step #10: Verify your OpenCV 3 install on macOS

To verify that your OpenCV 3 + Python 3.5 installation on macOS is working you should:

  1. Open up a new terminal.
  2. Execute the workon command to access the cv Python virtual environment.
  3. Attempt to import the Python + OpenCV bindings.

Here are the exact steps you can use to test the install:

$ workon cv
$ python
Python 3.5.2 (default, Oct 11 2016, 04:59:56) 
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.38)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> cv2.__version__
'3.1.0-dev'
>>>

Note: Take note of the -dev in the cv2.__version__. This indicates that we are using the development version of OpenCV and not a tagged release. Once OpenCV 3.2 is released these instructions can be updated to simply download a .zip of the tagged version rather than having to clone down the entire repositories.

I’ve also included a screenshot below that utilizes these same steps. As you can see, I can access my OpenCV 3 bindings from a Python 3.5 shell:

Figure 9: Ensuring that I have successfully installed my OpenCV 3 + Python 3.5 bindings on macOS.

Congratulations, you have installed OpenCV with Python 3.5 bindings on your macOS system!

So, what’s next?

Congrats! You now have a brand new, fresh install of OpenCV on your macOS system — and I’m sure you’re just itching to leverage your install to build some awesome computer vision apps…

…but I’m also willing to bet that you’re just getting started learning computer vision and OpenCV, and probably feeling a bit confused and overwhelmed on exactly where to start.

Personally, I’m a big fan of learning by example, so a good first step would be to have some fun and read this blog post on detecting cats in images/videos. This tutorial is meant to be very hands-on and demonstrate how you can (quickly) build a Python + OpenCV application to detect the presence of cats in images.

And if you’re really interested in leveling-up your computer vision skills, you should definitely check out my book, Practical Python and OpenCV + Case Studies. My book not only covers the basics of computer vision and image processing, but also teaches you how to solve real-world computer vision problems including face detection in images and video streams, object tracking in video, and handwriting recognition.

curious_about_cv

So, let’s put that fresh install of OpenCV 3 on your macOS system to good use — just click here to learn more about the real-world projects you can solve using Practical Python and OpenCV.

Summary

In this tutorial you learned how to compile and install OpenCV 3 with Python 3.5 bindings on macOS Sierra.

To accomplish this, we configured and compiled OpenCV 3 by hand using the CMake utility. While this isn’t exactly the most “fun” experience, it does give us complete and total control over the install.

If you’re looking for an easier way to get OpenCV installed on your Mac system be sure to stay tuned for next week’s blog post where I demonstrate how to install OpenCV on macOS using nothing but Homebrew.

To be notified when this blog post goes live, please enter your email address in the form below and I’ll be sure to ping you when the tutorial is published.

The post macOS: Install OpenCV 3 and Python 3.5 appeared first on PyImageSearch.

Install OpenCV 3 on macOS with Homebrew (the easy way)


homebrew_opencv3_header

Over the past few weeks I have demonstrated how to compile OpenCV 3 on macOS with Python (2.7, 3.5) bindings from source.

Compiling OpenCV via source gives you complete and total control over which modules you want to build, how they are built, and where they are installed.

All this control can come at a price though.

The downside is that determining the correct CMake paths to your Python interpreter, libraries, and include directories can be non-trivial, especially for users who are new to OpenCV/Unix systems.

That begs the question…

“Is there an easier way to install OpenCV on macOS? A way that avoids the complicated CMake configuration?”

It turns out, there is — just use Homebrew, what many consider to be “the missing package manager for Mac”.

So, is it really that easy? Can a few simple keystrokes and commands really be used to avoid the hassle and install OpenCV 3 without the headaches?

Well, there’s a little more to it than that…but the process is greatly simplified. You lose a bit of control (as compared to compiling from source), but what you gain is an easier-to-follow path to installing OpenCV on your Mac system.

To discover the easy way to install OpenCV 3 on macOS via Homebrew, just keep reading.

Install OpenCV 3 on macOS with Homebrew (the easy way)

The remainder of this blog post demonstrates how to install OpenCV 3 with both Python 2.7 and Python 3 bindings on macOS via Homebrew. The benefit of using Homebrew is that it greatly simplifies the install process (although it can pose problems of its own if you aren’t careful) to only a small set of commands that need to be run.

If you prefer to compile OpenCV from source with Python bindings on macOS, please refer to these tutorials:

  • macOS: Install OpenCV 3 and Python 2.7
  • macOS: Install OpenCV 3 and Python 3.5

Step #1: Install Xcode

Before we can install OpenCV 3 on macOS via Homebrew, we first need to install Xcode, a set of software development tools for the Mac Operating System.

Download Xcode

The easiest method to download and install Xcode is to use the included App Store application on your macOS system. Simply open up App Store, search for “Xcode” in the search bar, and then click the “Get” button:

Figure 1: Downloading and installing Xcode on macOS.

Figure 1: Downloading and installing Xcode on macOS.

Depending on your internet connection and system speed, the download and install process can take anywhere from 30 to 60 minutes. I would suggest installing Xcode in the background while you are getting some other work done or going for a nice long walk.

Accept the Apple developer license

I’m assuming that you’re working with a fresh install of macOS and Xcode. If so, you’ll need to accept the developer license before continuing. Personally, I think this is easier to do via the terminal. Just open up a terminal and execute the following command:

$ sudo xcodebuild -license

Scroll to the bottom of the license and accept it.

If you have already installed Xcode and previously accepted the Apple developer license, you can skip this step.

Install the Apple Command Line Tools

Now that Xcode is installed and we have accepted the Apple developer license, we can install the Apple Command Line Tools. These tools include packages such as make, GCC, clang, etc. This is a required step, so make sure you install the Apple Command Line Tools via:

$ sudo xcode-select --install

When executing the above command you’ll see a confirmation window pop up asking you to approve the install:

Figure 2: Installing Apple Command Line Tools on macOS.

Figure 2: Installing Apple Command Line Tools on macOS.

Simply click the “Install” button to continue. The actual install process of the Apple Command Line Tools should take less than 5 minutes.

Step #2: Install Homebrew

We are now ready to install Homebrew, a package manager for macOS. You can think of Homebrew as the macOS equivalent of the Ubuntu/Debian-based apt-get.

Installing Homebrew is dead simple — simply copy and paste the command below the “Install Homebrew” section of the Homebrew website (make sure you copy and paste the entire command into your terminal). I have included the command below as reference:

$ ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

After Homebrew installs you should make sure the package definitions (i.e., the instructions used to install a given library/package) are up to date by executing the following command:

$ brew update

Now that Homebrew is successfully installed and updated, we need to update our ~/.bash_profile file so that it searches the Homebrew path for packages/libraries before it searches the system path. Failure to complete this step can lead to confusing errors, import problems, and segfaults when trying to utilize Python and OpenCV, so make sure you update your ~/.bash_profile file correctly!

The ~/.bash_profile file may or may not already exist on your system. In either case, open it with your favorite text editor (I’ll be using nano in this example):
$ nano ~/.bash_profile

And then insert the following lines at the bottom of the file (if ~/.bash_profile does not exist the file will be empty — this is okay, just add the following lines to the file):
# Homebrew
export PATH=/usr/local/bin:$PATH

All this snippet is doing is updating your PATH variable to look for libraries/binaries along the Homebrew path before it searches the system path.

After updating the ~/.bash_profile file, save and exit your text editor.

To make sure you are on the right path, I have included a screenshot of my ~/.bash_profile below so you can compare it to yours:

Figure 3: Updating my .bash_profile file to include Homebrew.

Remember, your ~/.bash_profile may look very different than mine — that’s okay! Just make sure you have included the above Homebrew snippet in your file, then successfully saved and exited the editor.

Finally, we need to manually source the ~/.bash_profile file to ensure the changes have been reloaded:
$ source ~/.bash_profile

The above command only needs to be executed once. Whenever you open up a new terminal, login, etc., your .bash_profile file will be automatically loaded and sourced for you.
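
To confirm the Homebrew path actually takes precedence, you can print your PATH and check that /usr/local/bin appears before /usr/bin (a quick sanity check of my own, not a required step):

$ echo $PATH | tr ':' '\n' | head -n 3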

Step #3: Install Python 2.7 and Python 3 using Homebrew

The next step is to install the Homebrew versions of Python 2.7 and Python 3. It is considered bad form to develop against the system Python as your main interpreter. The system version of Python should serve exactly that — system routines.

Instead, you should install your own version of Python that is independent from the system install. Using Homebrew, we can install both Python 2.7 and Python 3 using the following command:

$ brew install python python3

At the time of this writing the current Python versions installed by Homebrew are Python 2.7.12 and Python 3.5.2.

After the Python 2.7 and Python 3 install completes, we need to create some symbolic links:

$ brew linkapps python
$ brew linkapps python3

As a sanity check, let’s confirm that you are using the Homebrew version of Python rather than the system version of Python. You can accomplish this via the which command:
$ which python
/usr/local/bin/python
$ which python3
/usr/local/bin/python3

Inspect the output of which closely. If you see /usr/local/bin/python and /usr/local/bin/python3 for each of the paths then you are correctly using the Homebrew versions of Python. However, if the output is instead /usr/bin/python and /usr/bin/python3 then you are incorrectly using the system version of Python.

If you find yourself in this situation you should:

  1. Go back to Step #2 and ensure Homebrew installed without error.
  2. Check that brew install python python3 finished successfully.
  3. Verify that you have correctly updated your ~/.bash_profile file and reloaded the changes via source. Your ~/.bash_profile should look similar to mine in Figure 3 above.

Step #4: Install OpenCV 3 with Python bindings on macOS using Homebrew

Now that we have installed the Homebrew versions of Python 2.7 and Python 3 we are now ready to install OpenCV 3.

Tap the “homebrew/science” repo

The first step is to add the homebrew/science repository to the set of packages we are tracking. This allows us to access the formulae to install OpenCV. To accomplish this, just use the following command:

Understanding the “brew install” command

To install OpenCV on our macOS system via Homebrew we are going to use the brew install command. This command accepts the name of a package to install (like Debian/Ubuntu’s apt-get), followed by a set of optional arguments.

The base of our command is brew install opencv3; however, we need to add some additional parameters.

The most important set of parameters are listed below:

  • --with-contrib: This ensures that the opencv_contrib repository is installed, giving us access to additional, critical OpenCV features such as SIFT, SURF, etc.
  • --with-python3: OpenCV 3 + Python 2.7 bindings will be automatically compiled; however, to compile OpenCV 3 + Python 3 bindings we need to explicitly supply the --with-python3 switch.
  • --HEAD: Rather than compiling a tagged OpenCV release (i.e., v3.0, v3.1, etc.), the --HEAD switch instead clones down the bleeding-edge version of OpenCV from GitHub. Why would we bother doing this? Simple. We need to avoid the QTKit error that plagues macOS Sierra systems with the current tagged OpenCV 3 releases (please see the “Avoiding the QTKit/QTKit.h file not found error” section of this blog post for more information).

You can see the full listing of options/switches by running brew info opencv3, the output of which I’ve included below:
$ brew info opencv3
...
--32-bit
	Build 32-bit only
--c++11
	Build using C++11 mode
--with-contrib
	Build "extra" contributed modules
--with-cuda
	Build with CUDA v7.0+ support
--with-examples
	Install C and python examples (sources)
--with-ffmpeg
	Build with ffmpeg support
--with-gphoto2
	Build with gphoto2 support
--with-gstreamer
	Build with gstreamer support
--with-jasper
	Build with jasper support
--with-java
	Build with Java support
--with-libdc1394
	Build with libdc1394 support
--with-opengl
	Build with OpenGL support (must use --with-qt5)
--with-openni
	Build with openni support
--with-openni2
	Build with openni2 support
--with-python3
	Build with python3 support
--with-qt5
	Build the Qt5 backend to HighGUI
--with-quicktime
	Use QuickTime for Video I/O instead of QTKit
--with-static
	Build static libraries
--with-tbb
	Enable parallel code in OpenCV using Intel TBB
--with-vtk
	Build with vtk support
--without-eigen
	Build without eigen support
--without-numpy
	Use a numpy you've installed yourself instead of a Homebrew-packaged numpy
--without-opencl
	Disable GPU code in OpenCV using OpenCL
--without-openexr
	Build without openexr support
--without-python
	Build without Python support
--without-test
	Build without accuracy & performance tests
--HEAD
	Install HEAD version

For those who are curious, the Homebrew formulae (i.e., the actual commands used to install OpenCV 3) can be found here. Use the parameters above and the install script as a reference if you want to add any additional OpenCV 3 features.
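
For example, if you also wanted OpenCV’s Intel TBB parallelism enabled (one of the switches listed above), you could tack on the corresponding flag. I haven’t benchmarked this particular combination myself, so treat it as an illustration of how the switches compose rather than a recommendation:

$ brew install opencv3 --with-contrib --with-python3 --with-tbb --HEAD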

We are now ready to install OpenCV 3 with Python bindings on your macOS system via Homebrew. Depending on the dependencies you do or do not already have installed, along with the speed of your system, this compilation could easily take a couple of hours, so you might want to go for a walk once you kick off the install process.

Installing OpenCV 3 with Python 3 bindings via Homebrew

To start the OpenCV 3 install process, just execute the following command:

$ brew install opencv3 --with-contrib --with-python3 --HEAD

This command will install OpenCV 3 on your macOS system with both Python 2.7 and Python 3 bindings via Homebrew. We’ll also be compiling the latest, bleeding-edge version of OpenCV 3 (to avoid any QTKit errors) along with opencv_contrib support enabled.

As I mentioned, this install process can take some time so consider going for a long walk while OpenCV installs. However, make sure your computer doesn’t go to sleep/shut down while you are gone! If it does, the install process will break and you’ll have to restart it.

Assuming OpenCV 3 installed without a problem, your terminal output should look similar to mine below:

Figure 4: Compiling and installing OpenCV 3 with Python bindings on macOS with Homebrew.

However, we’re not quite done yet.

You’ll notice a little note at the bottom of the install output:

If you need Python to find bindings for this keg-only formula, run:
  echo /usr/local/opt/opencv3/lib/python2.7/site-packages >> /usr/local/lib/python2.7/site-packages/opencv3.pth

This means that our Python 2.7 + OpenCV 3 bindings are now installed in /usr/local/opt/opencv3/lib/python2.7/site-packages, which is the Homebrew path to the OpenCV compile. We can verify this via the ls command:
$ ls -l /usr/local/opt/opencv3/lib/python2.7/site-packages
total 6944
-r--r--r--  1 admin  admin  3552288 Dec 15 09:28 cv2.so

However, we need to get these bindings into /usr/local/lib/python2.7/site-packages/, which is the site-packages directory for Python 2.7. We can do this by executing the following command:
$ echo /usr/local/opt/opencv3/lib/python2.7/site-packages >> /usr/local/lib/python2.7/site-packages/opencv3.pth

The above command creates a .pth file which tells Homebrew’s Python 2.7 install to look for additional packages in /usr/local/opt/opencv3/lib/python2.7/site-packages — in essence, the .pth file can be considered a “glorified sym-link”.
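
You can verify that the .pth file is being picked up by checking sys.path from a Python 2.7 shell (my own sanity check, not a required step). If the Homebrew OpenCV directory shows up, the import should work:

$ python
>>> import sys
>>> [p for p in sys.path if "opencv3" in p]
['/usr/local/opt/opencv3/lib/python2.7/site-packages']
>>> import cv2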

At this point you now have OpenCV 3 + Python 2.7 bindings installed!

However, we’re not quite done yet…there are still a few extra steps we need to take for Python 3.

Handling the Python 3 issue

Remember the --with-python3 option we supplied to brew install opencv3?

Well, this option did work (although it might not seem like it) — we do have Python 3 + OpenCV 3 bindings installed on our system.

Note: A big thank you to Brandon Hurr for pointing this out. For a long time I thought the --with-python3 switch simply wasn’t working.

However, there’s a bit of a problem. If you check the contents of /usr/local/opt/opencv3/lib/python3.5/site-packages/ you’ll see that our cv2.so file has a funny name:
$ ls -l /usr/local/opt/opencv3/lib/python3.5/site-packages/
total 6952
-r--r--r--  1 admin  admin  3556384 Dec 15 09:28 cv2.cpython-35m-darwin.so

I have no idea why the Python 3 + OpenCV 3 bindings are not named cv2.so as they should be, but the same is true across operating systems. You’ll see this same issue on macOS, Ubuntu, and Raspbian.

Luckily, the fix is easy — all you need to do is rename cv2.cpython-35m-darwin.so to cv2.so:
$ cd /usr/local/opt/opencv3/lib/python3.5/site-packages/
$ mv cv2.cpython-35m-darwin.so cv2.so
$ cd ~

From there, we can create another .pth file, this time for the Python 3 + OpenCV 3 install:
$ echo /usr/local/opt/opencv3/lib/python3.5/site-packages >> /usr/local/lib/python3.5/site-packages/opencv3.pth

At this point you now have both Python 2.7 + OpenCV 3 and Python 3 + OpenCV 3 installed on your macOS system via Homebrew.

Verifying that OpenCV 3 has been installed

Here are the commands I use to validate that OpenCV 3 with Python 2.7 bindings are working on my system:

$ python
Python 2.7.12 (default, Oct 11 2016, 05:20:59) 
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.38)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> cv2.__version__
'3.1.0-dev'
>>>

The screenshot below shows how to import the OpenCV 3 bindings into a Python 3 shell as well:

Figure 5: Confirming that OpenCV 3 with Python 3 bindings have been successfully installed on my macOS system via Homebrew.

Congratulations, you have installed OpenCV 3 with Python bindings on your macOS system via Homebrew!

But if you’re a longtime reader of this blog, you know that I use Python virtual environments extensively — and you should too.

Step #5: Setup your Python virtual environment (optional)

You’ll notice that unlike many of my previous OpenCV 3 install tutorials, Homebrew does not make use of Python virtual environments, a best practice when doing Python development.

While Steps #5-#7 are optional, I highly recommend that you do them to ensure your system is configured in the same way as my previous tutorials. You’ll see many tutorials on the PyImageSearch blog leverage Python virtual environments. While they are indeed optional, you’ll find that in the long run they make your life easier.

Installing virtualenv and virtualenvwrapper

The virtualenv and virtualenvwrapper packages allow us to create separate, independent Python virtual environments for each project we are working on. I’ve mentioned Python virtual environments many times before on this blog so I won’t rehash what’s already been said. Instead, if you are unfamiliar with Python virtual environments, how they work, and why we use them, please refer to the first half of this blog post. I also recommend this excellent tutorial on the RealPython.com blog that takes a more in-depth dive into Python virtual environments.

To install both virtualenv and virtualenvwrapper, just use pip:
$ pip install virtualenv virtualenvwrapper

After both packages have successfully installed, you’ll need to update your ~/.bash_profile file again:
$ nano ~/.bash_profile

Append the following lines to the file:

# Virtualenv/VirtualenvWrapper
source /usr/local/bin/virtualenvwrapper.sh

After updating, your ~/.bash_profile should look similar to mine below:

Figure 6: Update your .bash_profile file to include virtualenv/virtualenvwrapper.

Once you have confirmed that your ~/.bash_profile has been updated, you need to refresh your shell by using the source command:
$ source ~/.bash_profile

This command only needs to be executed once. Assuming that your ~/.bash_profile has been updated correctly, it will automatically be loaded and source‘d each time you open a new shell, login, etc.

Create your Python virtual environment

We are now ready to use the mkvirtualenv command to create a Python virtual environment named cv (for “computer vision”).

For Python 2.7 use the following command:

$ mkvirtualenv cv -p python

For Python 3 use this command:

$ mkvirtualenv cv -p python3

The -p switch controls which Python version is used to create your virtual environment. Please note that each virtual environment needs to be uniquely named, so if you want to create two separate virtual environments, one for Python 2.7 and another for Python 3, you’ll want to make sure each environment has a separate name — both cannot be named “cv”.
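
For example, you might keep “cv” for Python 2.7 and pick a distinct name such as py3cv3 for Python 3 (the name I use later in this post):

$ mkvirtualenv cv -p python
$ mkvirtualenv py3cv3 -p python3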

The mkvirtualenv command only needs to be executed once. To access the cv Python virtual environment after you have already created it, just use the workon command:
$ workon cv

To visually validate you are in the cv virtual environment, just examine your command line. If you see the text (cv) preceding the prompt, then you are in the cv virtual environment:

Figure 7: Make sure you see the “(cv)” text on your prompt, indicating that you are in the cv virtual environment.

Otherwise, if you do not see the (cv) text, then you are not in the cv virtual environment:

Figure 8: If you do not see the “(cv)” text on your prompt, then you are not in the cv virtual environment and you need to run the “workon” command to resolve this issue before continuing.

Install NumPy

The only Python prerequisite for OpenCV is NumPy, a scientific computing package.

To install NumPy, first make sure you are in the cv virtual environment and then let pip handle the actual installation:
$ pip install numpy

Step #6: Sym-link the OpenCV 3 bindings (optional)

We are now ready to sym-link the cv2.so bindings into our cv virtual environment. I have included the commands for both Python 2.7 and Python 3, although the process is very similar.

For Python 2.7

To sym-link the cv2.so bindings into your Python 2.7 virtual environment named cv, use these commands:
$ cd ~/.virtualenvs/cv/lib/python2.7/site-packages/
$ ln -s /usr/local/opt/opencv3/lib/python2.7/site-packages/cv2.so cv2.so
$ cd ~

For Python 3:

To sym-link the cv2.so bindings installed via Homebrew into your Python 3 virtual environment (named cv), execute these commands:
$ cd ~/.virtualenvs/cv/lib/python3.5/site-packages/
$ ln -s /usr/local/opt/opencv3/lib/python3.5/site-packages/cv2.so cv2.so
$ cd ~

Repeat as necessary

If you would like to have OpenCV 3 bindings installed for both Python 2.7 and Python 3, then you’ll want to repeat Step #5 and Step #6 for both Python versions. This includes creating a uniquely named Python virtual environment, installing NumPy, and sym-linking in the cv2.so bindings.

Step #7: Test your OpenCV 3 install (optional)

To verify that your OpenCV 3 + Python + virtual environment install on macOS is working properly, you should:

  1. Open up a new terminal window.
  2. Execute the workon command to access the cv Python virtual environment.
  3. Attempt to import your Python + OpenCV 3 bindings on macOS.

Here are the exact commands I used to validate that my Python virtual environment + OpenCV install are working correctly:

$ workon cv
$ python
>>> import cv2
>>> cv2.__version__
'3.1.0-dev'
>>>

Note that the above output demonstrates how to use OpenCV 3 + Python 2.7 with virtual environments.

I also created an OpenCV 3 + Python 3 virtual environment (named py3cv3), installed NumPy, and sym-linked the OpenCV 3 bindings. The output of me accessing the py3cv3 virtual environment and importing OpenCV can be seen below:

Figure 9: Utilizing virtual environments with Python 3 + OpenCV 3 on macOS.

So, what’s next?

Congrats! You now have a brand new, fresh install of OpenCV on your macOS system — and I’m sure you’re just itching to leverage your install to build some awesome computer vision apps…

…but I’m also willing to bet that you’re just getting started learning computer vision and OpenCV, and probably feeling a bit confused and overwhelmed on exactly where to start.

Personally, I’m a big fan of learning by example, so a good first step would be to have some fun and read this blog post on detecting cats in images/videos. This tutorial is meant to be very hands-on and demonstrate how you can (quickly) build a Python + OpenCV application to detect the presence of cats in images.

And if you’re really interested in leveling-up your computer vision skills, you should definitely check out my book, Practical Python and OpenCV + Case Studies. My book not only covers the basics of computer vision and image processing, but also teaches you how to solve real-world computer vision problems including face detection in images and video streams, object tracking in video, and handwriting recognition.

curious_about_cv

So, let’s put that fresh install of OpenCV 3 on your macOS system to good use — just click here to learn more about the real-world projects you can solve using Practical Python and OpenCV.

Summary

In today’s blog post I demonstrated how to install OpenCV 3 with Python 2.7 and Python 3 bindings on your macOS system via Homebrew.

As you can see, utilizing Homebrew is a great method to avoid the tedious process of manually configuring your CMake command to compile OpenCV via source (my full list of OpenCV install tutorials can be found on this page).

The downside is that you lose much of the control that CMake affords you.

Furthermore, while the Homebrew method certainly requires executing fewer commands and avoids potentially frustrating configurations, it’s worth mentioning that you still need to do a bit of work yourself, especially when it comes to the Python 3 bindings.

These steps also compound if you decide to use virtual environments, a best practice when doing Python development.

When it comes to installing OpenCV 3 on your own macOS system I would suggest you:

  1. First try to install OpenCV 3 via source. If you run into considerable trouble and struggle to get OpenCV 3 to compile, use this as an opportunity to teach yourself more about Unix environments. More often than not, OpenCV 3 failing to compile is due to an incorrect CMake parameter that can be correctly determined with a little more knowledge of Unix systems, paths, and libraries.
  2. Use Homebrew as a fallback. I would recommend using the Homebrew method to install OpenCV 3 as your fallback option. You lose a bit of control when installing OpenCV 3 via Homebrew, and worse, if any sym-links break during a major operating system upgrade you’ll struggle to resolve them. Don’t get me wrong: I love Homebrew and think it’s a great tool — but make sure you use it wisely.

Anyway, I hope you enjoyed this blog post! And I hope it helps you get OpenCV 3 installed on your macOS system.

If you’re interested in learning more about OpenCV, computer vision, and image processing, be sure to enter your email address in the form below to be notified when new blog posts + tutorials are published!

The post Install OpenCV 3 on macOS with Homebrew (the easy way) appeared first on PyImageSearch.

Rotate images (correctly) with OpenCV and Python


opencv_rotated_header

Let me tell you an embarrassing story of how I wasted three weeks of research time during graduate school six years ago.

It was the end of my second semester of coursework.

I had taken all of my exams early and all my projects for the semester had been submitted.

Since my school obligations were essentially nil, I started experimenting with (automatically) identifying prescription pills in images, something I know a thing or two about (but back then I was just getting started with my research).

At the time, my research goal was to find and identify methods to reliably quantify pills in a rotation invariant manner. Regardless of how the pill was rotated, I wanted the output feature vector to be (approximately) the same (the feature vectors will never be completely identical in a real-world application due to lighting conditions, camera sensors, floating point errors, etc.).

After the first week I was making fantastic progress.

I was able to extract features from my dataset of pills, index them, and then identify my test set of pills regardless of how they were oriented…

…however, there was a problem:

My method was only working with round, circular pills — I was getting completely nonsensical results for oblong pills.

How could that be?

I racked my brain for the explanation.

Was there a flaw in the logic of my feature extraction algorithm?

Was I not matching the features correctly?

Or was it something else entirely…like a problem with my image preprocessing.

While I might have been ashamed to admit this as a graduate student, the problem was the latter:

I goofed up.

It turns out that during the image preprocessing phase, I was rotating my images incorrectly.

Since round pills are approximately square in their aspect ratio, the rotation bug wasn’t a problem for them. Here you can see a round pill being rotated a full 360 degrees without an issue:

Figure 1: Rotating a circular pill doesn’t reveal any obvious problems.

But oblong pills would be “cut off” in the rotation process, like this:

Figure 2: However, rotating oblong pills using OpenCV’s standard cv2.getRotationMatrix2D and cv2.warpAffine functions caused me some problems that weren’t immediately obvious.

In essence, I was only quantifying part of the rotated, oblong pills; hence my strange results.

I spent three weeks and part of my Christmas vacation banging my head against the wall trying to diagnose the bug — only to feel quite embarrassed when I realized it was due to me being negligent with the

cv2.warpAffine
  function.

You see, the size of the output image needs to be adjusted, otherwise, the corners of my image would be cut off.

How did I accomplish this and squash the bug for good?

To learn how to rotate images with OpenCV such that the entire image is included and none of the image is cut off, just keep reading.

Looking for the source code to this post?
Jump right to the downloads section.

Rotate images (correctly) with OpenCV and Python

In the remainder of this blog post I’ll discuss common issues that you may run into when rotating images with OpenCV and Python.

Specifically, we’ll be examining the problem of what happens when the corners of an image are “cut off” during the rotation process.

To make sure we all understand this rotation issue with OpenCV and Python I will:

  • Start with a simple example demonstrating the rotation problem.
  • Provide a rotation function that ensures images are not cut off in the rotation process.
  • Discuss how I resolved my pill identification issue using this method.

A simple rotation problem with OpenCV

Let’s get this blog post started with an example script.

Open up a new file, name it

rotate_simple.py
 , and insert the following code:
# import the necessary packages
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to the image file")
args = vars(ap.parse_args())

Lines 2-5 start by importing our required Python packages.

If you don’t already have imutils (my series of OpenCV convenience functions) installed, you’ll want to install it now:

$ pip install imutils

If you already have

imutils
  installed, make sure you have upgraded to the latest version:
$ pip install --upgrade imutils

From there, Lines 8-10 parse our command line arguments. We only need a single switch here,

--image
 , which is the path to where our image resides on disk.

Let’s move on to actually rotating our image:

# import the necessary packages
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to the image file")
args = vars(ap.parse_args())

# load the image from disk
image = cv2.imread(args["image"])

# loop over the rotation angles
for angle in np.arange(0, 360, 15):
	rotated = imutils.rotate(image, angle)
	cv2.imshow("Rotated (Problematic)", rotated)
	cv2.waitKey(0)

# loop over the rotation angles again, this time ensuring
# no part of the image is cut off
for angle in np.arange(0, 360, 15):
	rotated = imutils.rotate_bound(image, angle)
	cv2.imshow("Rotated (Correct)", rotated)
	cv2.waitKey(0)

Line 14 loads the image we want to rotate from disk.

We then loop over various angles in the range [0, 360) in 15-degree increments (Line 17).

For each of these angles we call

imutils.rotate
 , which rotates our
image
  the specified number of
angle
  degrees about the center of the image. We then display the rotated image to our screen.

Lines 24-27 perform an identical process, but this time we call

imutils.rotate_bound
  (I’ll provide the implementation of this function in the next section).

As the name of this method suggests, we are going to ensure the entire image is bound inside the window and none is cut off.

To see this script in action, be sure to download the source code using the “Downloads” section of this blog post, followed by executing the command below:

$ python rotate_simple.py --image images/saratoga.jpg

The output of using the

imutils.rotate
  function on a non-square image can be seen below:
Figure 3: An example of corners being cut off when rotating an image using OpenCV and Python.

As you can see, the image is “cut off” when it’s rotated — the entire image is not kept in the field of view.

But if we use

imutils.rotate_bound
  we can resolve this issue:
Figure 4: We can ensure the entire image is kept in the field of view by modifying the matrix returned by cv2.getRotationMatrix2D.

Awesome, we fixed the problem!

So does this mean that we should always use

.rotate_bound
  over the
.rotate
  method?

What makes it so special?

And what’s going on under the hood?

I’ll answer these questions in the next section.

Implementing a rotation function that doesn’t cut off your images

Let me start off by saying there is nothing wrong with the

cv2.getRotationMatrix2D
  and
cv2.warpAffine
  functions that are used to rotate images inside OpenCV.

In reality, these functions give us more freedom than perhaps we are comfortable with (sort of like comparing manual memory management with C versus automatic garbage collection with Java).

The

cv2.getRotationMatrix2D
  function doesn’t care if we would like the entire rotated image to be kept.

It doesn’t care if the image is cut off.

And it won’t help you if you shoot yourself in the foot when using this function (I found this out the hard way and it took 3 weeks to stop the bleeding).

Instead, what you need to do is understand what the rotation matrix is and how it’s constructed.

You see, when you rotate an image with OpenCV you call

cv2.getRotationMatrix2D
  which returns a matrix M that looks something like this:
Figure 5: The structure of the matrix M returned by cv2.getRotationMatrix2D.

This matrix looks scary, but I promise you: it’s not.

To understand it, let’s assume we want to rotate our image \theta degrees about some center (c_{x}, c_{y}) coordinates at some scale (i.e., smaller or larger).

We can then plug in values for \alpha and \beta:

\alpha = scale \cdot \cos\theta and \beta = scale \cdot \sin\theta
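
For reference, the matrix in Figure 5 is the one documented for cv2.getRotationMatrix2D:

M = \begin{bmatrix} \alpha & \beta & (1 - \alpha) c_{x} - \beta c_{y} \\ -\beta & \alpha & \beta c_{x} + (1 - \alpha) c_{y} \end{bmatrix}

The third column is the translation component: it shifts the rotated pixels so the rotation happens about (c_{x}, c_{y}) rather than the origin.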

That’s all fine and good for simple rotation — but it doesn’t take into account what happens if an image is cut off along the borders. How do we remedy this?

The answer is inside the

rotate_bound
  function in convenience.py of imutils:
# author:    Adrian Rosebrock
# website:   http://www.pyimagesearch.com

# import the necessary packages
import numpy as np
import cv2
import sys

# import any special Python 2.7 packages
if sys.version_info.major == 2:
    from urllib import urlopen

# import any special Python 3 packages
elif sys.version_info.major == 3:
    from urllib.request import urlopen

def translate(image, x, y):
    # define the translation matrix and perform the translation
    M = np.float32([[1, 0, x], [0, 1, y]])
    shifted = cv2.warpAffine(image, M, (image.shape[1], image.shape[0]))

    # return the translated image
    return shifted

def rotate(image, angle, center=None, scale=1.0):
    # grab the dimensions of the image
    (h, w) = image.shape[:2]

    # if the center is None, initialize it as the center of
    # the image
    if center is None:
        center = (w // 2, h // 2)

    # perform the rotation
    M = cv2.getRotationMatrix2D(center, angle, scale)
    rotated = cv2.warpAffine(image, M, (w, h))

    # return the rotated image
    return rotated

def rotate_bound(image, angle):
    # grab the dimensions of the image and then determine the
    # center
    (h, w) = image.shape[:2]
    (cX, cY) = (w // 2, h // 2)

    # grab the rotation matrix (applying the negative of the
    # angle to rotate clockwise), then grab the sine and cosine
    # (i.e., the rotation components of the matrix)
    M = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
    cos = np.abs(M[0, 0])
    sin = np.abs(M[0, 1])

    # compute the new bounding dimensions of the image
    nW = int((h * sin) + (w * cos))
    nH = int((h * cos) + (w * sin))

    # adjust the rotation matrix to take into account translation
    M[0, 2] += (nW / 2) - cX
    M[1, 2] += (nH / 2) - cY

    # perform the actual rotation and return the image
    return cv2.warpAffine(image, M, (nW, nH))

On Line 41 we define our

rotate_bound
  function.

This method accepts an input

image
  and an
angle
  to rotate it by.

We assume we’ll be rotating our image about its center (x, y)-coordinates, so we determine these values on Lines 44 and 45.

Given these coordinates, we can call

cv2.getRotationMatrix2D
  to obtain our rotation matrix M (Line 50).

However, to adjust for any image border cut off issues, we need to apply some manual calculations of our own.

We start by grabbing the cosine and sine values from our rotation matrix M (Lines 51 and 52).

This enables us to compute the new width and height of the rotated image, ensuring no part of the image is cut off.
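
As a quick sanity check, take a hypothetical 400×200 pixel image (w = 400, h = 200) rotated 30 degrees, so \cos\theta \approx 0.866 and \sin\theta = 0.5:

nW = (200 \times 0.5) + (400 \times 0.866) \approx 446

nH = (200 \times 0.866) + (400 \times 0.5) \approx 373

The output canvas grows to roughly 446×373 pixels, leaving room for the corners that would otherwise be clipped.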

Once we know the new width and height, we can adjust for translation on Lines 59 and 60 by modifying our rotation matrix once again.

Finally,

cv2.warpAffine
  is called on Line 63 to rotate the actual image using OpenCV while ensuring none of the image is cut off.

For some other interesting solutions (some better than others) to the rotation cut off problem when using OpenCV, be sure to refer to this StackOverflow thread and this one too.

Fixing the rotated image “cut off” problem with OpenCV and Python

Let’s get back to my original problem of rotating oblong pills and how I used

.rotate_bound
  to solve the issue (although back then I had not created the
imutils
  Python package — it was simply a utility function in a helper file).

We’ll be using the following pill as our example image:

Figure 6: The example oblong pill we will be rotating with OpenCV.

To start, open up a new file and name it

rotate_pills.py
 . Then, insert the following code:
# import the necessary packages
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to the image file")
args = vars(ap.parse_args())

Lines 2-5 import our required Python packages. Again, make sure you have installed and/or upgraded the imutils Python package before continuing.

We then parse our command line arguments on Lines 8-11. Just like in the example at the beginning of the blog post, we only need one switch:

--image
 , the path to our input image.

Next, we load our pill image from disk and preprocess it by converting it to grayscale, blurring it, and detecting edges:

# import the necessary packages
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to the image file")
args = vars(ap.parse_args())

# load the image from disk, convert it to grayscale, blur it,
# and apply edge detection to reveal the outline of the pill
image = cv2.imread(args["image"])
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (3, 3), 0)
edged = cv2.Canny(gray, 20, 100)

After executing these preprocessing functions our pill image now looks like this:

Figure 7: Detecting edges in the pill.

The outline of the pill is clearly visible, so let’s apply contour detection to find the outline of the pill:

# import the necessary packages
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to the image file")
args = vars(ap.parse_args())

# load the image from disk, convert it to grayscale, blur it,
# and apply edge detection to reveal the outline of the pill
image = cv2.imread(args["image"])
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (3, 3), 0)
edged = cv2.Canny(gray, 20, 100)

# find contours in the edge map
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]
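
A quick aside for readers on newer library versions: cv2.findContours returns a 2-tuple in OpenCV 2.4 (and again in OpenCV 4+), but a 3-tuple in OpenCV 3; that version difference is exactly what the is_cv2() check above handles. Recent imutils releases also include a grab_contours helper that hides this difference, so assuming you have such a release installed, a version-agnostic alternative would look like this:

# version-agnostic contour unpacking (assumes a recent imutils release)
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)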

We are now ready to extract the pill ROI from the image:

# import the necessary packages
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to the image file")
args = vars(ap.parse_args())

# load the image from disk, convert it to grayscale, blur it,
# and apply edge detection to reveal the outline of the pill
image = cv2.imread(args["image"])
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (3, 3), 0)
edged = cv2.Canny(gray, 20, 100)

# find contours in the edge map
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]

# ensure at least one contour was found
if len(cnts) > 0:
	# grab the largest contour, then draw a mask for the pill
	c = max(cnts, key=cv2.contourArea)
	mask = np.zeros(gray.shape, dtype="uint8")
	cv2.drawContours(mask, [c], -1, 255, -1)

	# compute the bounding box of the pill, then extract the ROI,
	# and apply the mask
	(x, y, w, h) = cv2.boundingRect(c)
	imageROI = image[y:y + h, x:x + w]
	maskROI = mask[y:y + h, x:x + w]
	imageROI = cv2.bitwise_and(imageROI, imageROI,
		mask=maskROI)

First, we ensure that at least one contour was found in the edge map (Line 26).

Provided we have at least one contour, we construct a

mask
  for the largest contour region on Lines 29 and 30.

Our

mask
  looks like this:
Figure 8: The mask representing the entire pill region in the image.

Given the contour region, we can compute the (x, y)-coordinates of the bounding box of the region (Line 34).

Using both the bounding box and

mask
 , we can extract the actual pill region ROI (Lines 35-38).

Now, let’s go ahead and apply both the

imutils.rotate
  and
imutils.rotate_bound
  functions to the
imageROI
 , just like we did in the simple examples above:
# import the necessary packages
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to the image file")
args = vars(ap.parse_args())

# load the image from disk, convert it to grayscale, blur it,
# and apply edge detection to reveal the outline of the pill
image = cv2.imread(args["image"])
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (3, 3), 0)
edged = cv2.Canny(gray, 20, 100)

# find contours in the edge map
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]

# ensure at least one contour was found
if len(cnts) > 0:
	# grab the largest contour, then draw a mask for the pill
	c = max(cnts, key=cv2.contourArea)
	mask = np.zeros(gray.shape, dtype="uint8")
	cv2.drawContours(mask, [c], -1, 255, -1)

	# compute the bounding box of the pill, then extract the ROI,
	# and apply the mask
	(x, y, w, h) = cv2.boundingRect(c)
	imageROI = image[y:y + h, x:x + w]
	maskROI = mask[y:y + h, x:x + w]
	imageROI = cv2.bitwise_and(imageROI, imageROI,
		mask=maskROI)

	# loop over the rotation angles
	for angle in np.arange(0, 360, 15):
		rotated = imutils.rotate(imageROI, angle)
		cv2.imshow("Rotated (Problematic)", rotated)
		cv2.waitKey(0)

	# loop over the rotation angles again, this time ensure the
	# entire pill is still within the ROI after rotation
	for angle in np.arange(0, 360, 15):
		rotated = imutils.rotate_bound(imageROI, angle)
		cv2.imshow("Rotated (Correct)", rotated)
		cv2.waitKey(0)

After downloading the source code to this tutorial using the “Downloads” section below, you can execute the following command to see the output:

$ python rotate_pills.py --image images/pill_01.png

The output of

imutils.rotate
  will look like:
Figure 9: Incorrectly rotating an image with OpenCV causes parts of the image to be cut off.

Notice how the pill is cut off during the rotation process — we need to explicitly compute the new dimensions of the rotated image to ensure the borders are not cut off.

By using

imutils.rotate_bound
 , we can ensure that no part of the image is cut off when using OpenCV:
Figure 10: By modifying OpenCV’s rotation matrix we can resolve the issue and ensure the entire image is visible.

Using this function I was finally able to finish my research for the winter break — but not before I felt quite embarrassed about my rookie mistake.

Summary

In today’s blog post I discussed how image borders can be cut off when rotating images with OpenCV and

cv2.warpAffine
 .

The fact that image borders can be cut off is not a bug in OpenCV — in fact, it’s how

cv2.getRotationMatrix2D
  and
cv2.warpAffine
  are designed.

While it may seem frustrating and cumbersome to compute new image dimensions to ensure you don’t lose your borders, it’s actually a blessing in disguise.

OpenCV gives us so much control that we can modify our rotation matrix to make it do exactly what we want.

Of course, this requires us to know how our rotation matrix M is formed and what each of its components represents (discussed earlier in this tutorial). Provided we understand this, the math falls out naturally.

To learn more about image processing and computer vision, be sure to take a look at the PyImageSearch Gurus course where I discuss these topics in more detail.

Otherwise, I encourage you to enter your email address in the form below to be notified when future blog posts are published.

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 11-page Resource Guide on Computer Vision and Image Search Engines, including exclusive techniques that I don’t post on this blog! Sound good? If so, enter your email address and I’ll send you the code immediately!

The post Rotate images (correctly) with OpenCV and Python appeared first on PyImageSearch.

Faster video file FPS with cv2.VideoCapture and OpenCV



Have you ever worked with a video file via OpenCV’s

cv2.VideoCapture
  function and found that reading frames just felt slow and sluggish?

I’ve been there — and I know exactly how it feels.

Your entire video processing pipeline crawls along, unable to process more than one or two frames per second — even though you aren’t doing any type of computationally expensive image processing operations.

Why is that?

Why, at times, does it seem like an eternity for

cv2.VideoCapture
  and the associated
.read
  method to poll another frame from your video file?

The answer is almost always video compression and frame decoding.

Depending on your video file type, the codecs you have installed, and not to mention, the physical hardware of your machine, much of your video processing pipeline can actually be consumed by reading and decoding the next frame in the video file.

That’s just computationally wasteful — and there is a better way.

In the remainder of today’s blog post, I’ll demonstrate how to use threading and a queue data structure to improve your video file FPS rate by over 52%!

Looking for the source code to this post?
Jump right to the downloads section.

Faster video file FPS with cv2.VideoCapture and OpenCV

When working with video files and OpenCV you are likely using the

cv2.VideoCapture
  function.

First, you instantiate your

cv2.VideoCapture
  object by passing in the path to your input video file.

Then you start a loop, calling the

.read
  method of
cv2.VideoCapture
  to poll the next frame from the video file so you can process it in your pipeline.

The problem (and the reason why this method can feel slow and sluggish) is that you’re both reading and decoding the frame in your main processing thread!

As I’ve mentioned in previous posts, the

.read
  method is a blocking operation — the main thread of your Python + OpenCV application is entirely blocked (i.e., stalled) until the frame is read from the video file, decoded, and returned to the calling function.

By moving these blocking I/O operations to a separate thread and maintaining a queue of decoded frames we can actually improve our FPS processing rate by over 52%!

This increase in frame processing rate (and therefore our overall video processing pipeline) comes from dramatically reducing latency — we don’t have to wait for the

.read
  method to finish reading and decoding a frame; instead, there is always a pre-decoded frame ready for us to process.

To accomplish this latency decrease our goal will be to move the reading and decoding of video file frames to an entirely separate thread of the program, freeing up our main thread to handle the actual image processing.

But before we can appreciate the faster, threaded method to video frame processing, we first need to set a benchmark/baseline with the slower, non-threaded version.

The slow, naive method to reading video frames with OpenCV

The goal of this section is to obtain a baseline on our video frame processing throughput rate using OpenCV and Python.

To start, open up a new file, name it

read_frames_slow.py
 , and insert the following code:
# import the necessary packages
from imutils.video import FPS
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", required=True,
	help="path to input video file")
args = vars(ap.parse_args())

# open a pointer to the video stream and start the FPS timer
stream = cv2.VideoCapture(args["video"])
fps = FPS().start()

Lines 2-6 import our required Python packages. We’ll be using my imutils library, a series of convenience functions to make image and video processing operations easier with OpenCV and Python.

If you don’t already have

imutils
  installed or if you are using a previous version, you can install/upgrade
imutils
  by using the following command:
$ pip install --upgrade imutils

Lines 9-12 then parse our command line arguments. We only need a single switch for this script,

--video
 , which is the path to our input video file.

Line 15 opens a pointer to the

--video
  file using the
cv2.VideoCapture
  class while Line 16 starts a timer that we can use to measure FPS, or more specifically, the throughput rate of our video processing pipeline.
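
Conceptually, the FPS class is nothing more than a frame counter bracketed by two timestamps. Here is a minimal sketch of the idea (a simplified stand-in, not necessarily the exact imutils implementation):

# a minimal sketch of an FPS/throughput counter
import datetime

class SimpleFPS:
	def __init__(self):
		# start/end timestamps plus a count of processed frames
		self._start = None
		self._end = None
		self._numFrames = 0

	def start(self):
		self._start = datetime.datetime.now()
		return self

	def stop(self):
		self._end = datetime.datetime.now()

	def update(self):
		# call this once per processed frame
		self._numFrames += 1

	def elapsed(self):
		return (self._end - self._start).total_seconds()

	def fps(self):
		# frames processed per second of wall-clock time
		return self._numFrames / self.elapsed()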

With

cv2.VideoCapture
  instantiated, we can start reading frames from the video file and processing them one-by-one:
# import the necessary packages
from imutils.video import FPS
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", required=True,
	help="path to input video file")
args = vars(ap.parse_args())

# open a pointer to the video stream and start the FPS timer
stream = cv2.VideoCapture(args["video"])
fps = FPS().start()

# loop over frames from the video file stream
while True:
	# grab the frame from the threaded video file stream
	(grabbed, frame) = stream.read()

	# if the frame was not grabbed, then we have reached the end
	# of the stream
	if not grabbed:
		break

	# resize the frame and convert it to grayscale (while still
	# retaining 3 channels)
	frame = imutils.resize(frame, width=450)
	frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
	frame = np.dstack([frame, frame, frame])

	# display a piece of text to the frame (so we can benchmark
	# fairly against the fast method)
	cv2.putText(frame, "Slow Method", (10, 30),
		cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)	

	# show the frame and update the FPS counter
	cv2.imshow("Frame", frame)
	cv2.waitKey(1)
	fps.update()

On Line 19 we start looping over the frames of our video file.

A call to the

.read
  method on Line 21 returns a 2-tuple containing:
  1. grabbed
     : A boolean indicating if the frame was successfully read or not.
  2. frame
     : The actual video frame itself.

If

grabbed
  is
False
  then we know we have reached the end of the video file and can break from the loop (Lines 25 and 26).

Otherwise, we perform some basic image processing tasks, including:

  1. Resizing the frame to have a width of 450 pixels.
  2. Converting the frame to grayscale.
  3. Drawing the text on the frame via the
    cv2.putText
      method. We do this because we’ll be using the
    cv2.putText
      function to display our queue size in the fast, threaded example below and want to have a fair, comparable pipeline.

Lines 40-42 display the frame to our screen and update our FPS counter.

The final code block handles computing the approximate FPS/frame rate throughput of our pipeline, releasing the video stream pointer, and closing any open windows:

# import the necessary packages
from imutils.video import FPS
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", required=True,
	help="path to input video file")
args = vars(ap.parse_args())

# open a pointer to the video stream and start the FPS timer
stream = cv2.VideoCapture(args["video"])
fps = FPS().start()

# loop over frames from the video file stream
while True:
	# grab the frame from the threaded video file stream
	(grabbed, frame) = stream.read()

	# if the frame was not grabbed, then we have reached the end
	# of the stream
	if not grabbed:
		break

	# resize the frame and convert it to grayscale (while still
	# retaining 3 channels)
	frame = imutils.resize(frame, width=450)
	frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
	frame = np.dstack([frame, frame, frame])

	# display a piece of text to the frame (so we can benchmark
	# fairly against the fast method)
	cv2.putText(frame, "Slow Method", (10, 30),
		cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)	

	# show the frame and update the FPS counter
	cv2.imshow("Frame", frame)
	cv2.waitKey(1)
	fps.update()

# stop the timer and display FPS information
fps.stop()
print("[INFO] elasped time: {:.2f}".format(fps.elapsed()))
print("[INFO] approx. FPS: {:.2f}".format(fps.fps()))

# do a bit of cleanup
stream.release()
cv2.destroyAllWindows()

To execute this script, be sure to download the source code + example video to this blog post using the “Downloads” section at the bottom of the tutorial.

For this example we’ll be using the first 31 seconds of the Jurassic Park trailer (the .mp4 file is included in the code download):

Let’s go ahead and obtain a baseline for frame processing throughput on this example video:

$ python read_frames_slow.py --video videos/jurassic_park_intro.mp4

Figure 1: The slow, naive method to read frames from a video file using Python and OpenCV.

As you can see, processing each individual frame of the 31-second video clip takes approximately 47 seconds with an FPS processing rate of 20.21.

These results imply that it’s actually taking longer to read and decode the individual frames than the actual length of the video clip!

To see how we can speedup our frame processing throughput, take a look at the technique I describe in the next section.

Using threading to buffer frames with OpenCV

To improve the FPS processing rate of frames read from video files with OpenCV we are going to utilize threading and the queue data structure:

Figure 2: An example of the queue data structure. New data is enqueued to the back of the list while older data is dequeued from the front of the list. (source: Wikipedia)

Since the

.read
  method of
cv2.VideoCapture
  is a blocking I/O operation we can obtain a significant speedup simply by creating a separate thread from our main Python script that is solely responsible for reading frames from the video file and maintaining a queue.

Since Python’s Queue data structure is thread safe, much of the hard work is done for us already — we just need to put all the pieces together.
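
To see the pattern in miniature, here is a minimal, hypothetical producer/consumer sketch (Python 3 imports assumed); FileVideoStream follows essentially this pattern, with the producer thread reading and decoding frames:

# a minimal producer/consumer sketch using a thread-safe Queue
from queue import Queue
from threading import Thread

q = Queue(maxsize=8)

def producer():
	# enqueue work items; put() blocks whenever the queue is full
	for i in range(32):
		q.put(i)
	q.put(None)  # sentinel signaling "no more items"

t = Thread(target=producer, daemon=True)
t.start()

# consume on the main thread; get() blocks until an item is ready
while True:
	item = q.get()
	if item is None:
		break
	print("processing", item)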

I’ve already implemented the FileVideoStream class in imutils but we’re going to review the code so you can understand what’s going on under the hood:

# import the necessary packages
from threading import Thread
import sys
import cv2

# import the Queue class from Python 3
if sys.version_info >= (3, 0):
	from queue import Queue

# otherwise, import the Queue class for Python 2.7
else:
	from Queue import Queue

Lines 2-4 handle importing our required Python packages. The

Thread
  class is used to create and start threads in the Python programming language.

We need to take special care when importing the

Queue
  data structure as the name of the queue package is different based on which Python version you are using (Lines 7-12).

We can now define the constructor to

FileVideoStream
 :
# import the necessary packages
from threading import Thread
import sys
import cv2

# import the Queue class from Python 3
if sys.version_info >= (3, 0):
	from queue import Queue

# otherwise, import the Queue class for Python 2.7
else:
	from Queue import Queue

class FileVideoStream:
	def __init__(self, path, queueSize=128):
		# initialize the file video stream along with the boolean
		# used to indicate if the thread should be stopped or not
		self.stream = cv2.VideoCapture(path)
		self.stopped = False

		# initialize the queue used to store frames read from
		# the video file
		self.Q = Queue(maxsize=queueSize)

Our constructor takes a single required argument followed by an optional one:

  • path
     : The path to our input video file.
  • queueSize
     : The maximum number of frames to store in the queue. This value defaults to 128 frames, but depending on (1) the frame dimensions of your video and (2) the amount of memory you can spare, you may want to raise/lower this value.

Line 18 instantiates our

cv2.VideoCapture
  object by passing in the video
path
 .

We then initialize a boolean to indicate if the threading process should be stopped (Line 19) along with our actual

Queue
  data structure (Line 23).

To kick off the thread, we’ll next define the

start
  method:
# import the necessary packages
from threading import Thread
import sys
import cv2

# import the Queue class from Python 3
if sys.version_info >= (3, 0):
	from queue import Queue

# otherwise, import the Queue class for Python 2.7
else:
	from Queue import Queue

class FileVideoStream:
	def __init__(self, path, queueSize=128):
		# initialize the file video stream along with the boolean
		# used to indicate if the thread should be stopped or not
		self.stream = cv2.VideoCapture(path)
		self.stopped = False

		# initialize the queue used to store frames read from
		# the video file
		self.Q = Queue(maxsize=queueSize)

	def start(self):
		# start a thread to read frames from the file video stream
		t = Thread(target=self.update, args=())
		t.daemon = True
		t.start()
		return self

This method simply starts a thread separate from the main thread. This thread will call the

.update
  method (which we’ll define in the next code block).

The

update
  method is responsible for reading and decoding frames from the video file, along with maintaining the actual queue data structure:
# import the necessary packages
from threading import Thread
import sys
import cv2

# import the Queue class from Python 3
if sys.version_info >= (3, 0):
	from queue import Queue

# otherwise, import the Queue class for Python 2.7
else:
	from Queue import Queue

class FileVideoStream:
	def __init__(self, path, queueSize=128):
		# initialize the file video stream along with the boolean
		# used to indicate if the thread should be stopped or not
		self.stream = cv2.VideoCapture(path)
		self.stopped = False

		# initialize the queue used to store frames read from
		# the video file
		self.Q = Queue(maxsize=queueSize)

	def start(self):
		# start a thread to read frames from the file video stream
		t = Thread(target=self.update, args=())
		t.daemon = True
		t.start()
		return self

	def update(self):
		# keep looping infinitely
		while True:
			# if the thread indicator variable is set, stop the
			# thread
			if self.stopped:
				return

			# otherwise, ensure the queue has room in it
			if not self.Q.full():
				# read the next frame from the file
				(grabbed, frame) = self.stream.read()

				# if the `grabbed` boolean is `False`, then we have
				# reached the end of the video file
				if not grabbed:
					self.stop()
					return

				# add the frame to the queue
				self.Q.put(frame)

On the surface, this code is very similar to our example in the slow, naive method detailed above.

The key takeaway here is that this code is actually running in a separate thread — this is where our actual FPS processing rate increase comes from.

On Line 34 we start looping over the frames in the video file.

If the

stopped
  indicator is set, we exit the thread (Lines 37 and 38).

If our queue is not full we read the next frame from the video stream, check to see if we have reached the end of the video file, and then update the queue (Lines 41-52).

The

read
  method will handle returning the next frame in the queue:
# import the necessary packages
from threading import Thread
import sys
import cv2

# import the Queue class from Python 3
if sys.version_info >= (3, 0):
	from queue import Queue

# otherwise, import the Queue class for Python 2.7
else:
	from Queue import Queue

class FileVideoStream:
	def __init__(self, path, queueSize=128):
		# initialize the file video stream along with the boolean
		# used to indicate if the thread should be stopped or not
		self.stream = cv2.VideoCapture(path)
		self.stopped = False

		# initialize the queue used to store frames read from
		# the video file
		self.Q = Queue(maxsize=queueSize)

	def start(self):
		# start a thread to read frames from the file video stream
		t = Thread(target=self.update, args=())
		t.daemon = True
		t.start()
		return self

	def update(self):
		# keep looping infinitely
		while True:
			# if the thread indicator variable is set, stop the
			# thread
			if self.stopped:
				return

			# otherwise, ensure the queue has room in it
			if not self.Q.full():
				# read the next frame from the file
				(grabbed, frame) = self.stream.read()

				# if the `grabbed` boolean is `False`, then we have
				# reached the end of the video file
				if not grabbed:
					self.stop()
					return

				# add the frame to the queue
				self.Q.put(frame)

	def read(self):
		# return next frame in the queue
		return self.Q.get()

We’ll create a convenience function named

more
  that will return
True
  if there are still more frames in the queue (and
False
  otherwise):
# import the necessary packages
from threading import Thread
import sys
import cv2

# import the Queue class from Python 3
if sys.version_info >= (3, 0):
	from queue import Queue

# otherwise, import the Queue class for Python 2.7
else:
	from Queue import Queue

class FileVideoStream:
	def __init__(self, path, queueSize=128):
		# initialize the file video stream along with the boolean
		# used to indicate if the thread should be stopped or not
		self.stream = cv2.VideoCapture(path)
		self.stopped = False

		# initialize the queue used to store frames read from
		# the video file
		self.Q = Queue(maxsize=queueSize)

	def start(self):
		# start a thread to read frames from the file video stream
		t = Thread(target=self.update, args=())
		t.daemon = True
		t.start()
		return self

	def update(self):
		# keep looping infinitely
		while True:
			# if the thread indicator variable is set, stop the
			# thread
			if self.stopped:
				return

			# otherwise, ensure the queue has room in it
			if not self.Q.full():
				# read the next frame from the file
				(grabbed, frame) = self.stream.read()

				# if the `grabbed` boolean is `False`, then we have
				# reached the end of the video file
				if not grabbed:
					self.stop()
					return

				# add the frame to the queue
				self.Q.put(frame)

	def read(self):
		# return next frame in the queue
		return self.Q.get()

	def more(self):
		# return True if there are still frames in the queue
		return self.Q.qsize() > 0

And finally, the

stop
  method will be called if we want to stop the thread prematurely (i.e., before we have reached the end of the video file):
# import the necessary packages
from threading import Thread
import sys
import cv2

# import the Queue class from Python 3
if sys.version_info >= (3, 0):
	from queue import Queue

# otherwise, import the Queue class for Python 2.7
else:
	from Queue import Queue

class FileVideoStream:
	def __init__(self, path, queueSize=128):
		# initialize the file video stream along with the boolean
		# used to indicate if the thread should be stopped or not
		self.stream = cv2.VideoCapture(path)
		self.stopped = False

		# initialize the queue used to store frames read from
		# the video file
		self.Q = Queue(maxsize=queueSize)

	def start(self):
		# start a thread to read frames from the file video stream
		t = Thread(target=self.update, args=())
		t.daemon = True
		t.start()
		return self

	def update(self):
		# keep looping infinitely
		while True:
			# if the thread indicator variable is set, stop the
			# thread
			if self.stopped:
				return

			# otherwise, ensure the queue has room in it
			if not self.Q.full():
				# read the next frame from the file
				(grabbed, frame) = self.stream.read()

				# if the `grabbed` boolean is `False`, then we have
				# reached the end of the video file
				if not grabbed:
					self.stop()
					return

				# add the frame to the queue
				self.Q.put(frame)

	def read(self):
		# return next frame in the queue
		return self.Q.get()

	def more(self):
		# return True if there are still frames in the queue
		return self.Q.qsize() > 0

	def stop(self):
		# indicate that the thread should be stopped
		self.stopped = True

The faster, threaded method to reading video frames with OpenCV

Now that we have defined our

FileVideoStream
  class we can put all the pieces together and enjoy a faster, threaded video file read with OpenCV.

Open up a new file, name it

read_frames_fast.py
 , and insert the following code:
# import the necessary packages
from imutils.video import FileVideoStream
from imutils.video import FPS
import numpy as np
import argparse
import imutils
import time
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", required=True,
	help="path to input video file")
args = vars(ap.parse_args())

# start the file video stream thread and allow the buffer to
# start to fill
print("[INFO] starting video file thread...")
fvs = FileVideoStream(args["video"]).start()
time.sleep(1.0)

# start the FPS timer
fps = FPS().start()

Lines 2-8 import our required Python packages. Notice how we are using the

FileVideoStream
  class from the
imutils
  library to facilitate faster frame reads with OpenCV.

Lines 11-14 parse our command line arguments. Just like the previous example, we only need a single switch,

--video
 , the path to our input video file.

We then instantiate the

FileVideoStream
  object and start the frame reading thread (Line 19).

Line 23 then starts the FPS timer.

Our next section handles reading frames from the

FileVideoStream
 , processing them, and displaying them to our screen:
# import the necessary packages
from imutils.video import FileVideoStream
from imutils.video import FPS
import numpy as np
import argparse
import imutils
import time
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", required=True,
	help="path to input video file")
args = vars(ap.parse_args())

# start the file video stream thread and allow the buffer to
# start to fill
print("[INFO] starting video file thread...")
fvs = FileVideoStream(args["video"]).start()
time.sleep(1.0)

# start the FPS timer
fps = FPS().start()

# loop over frames from the video file stream
while fvs.more():
	# grab the frame from the threaded video file stream, resize
	# it, and convert it to grayscale (while still retaining 3
	# channels)
	frame = fvs.read()
	frame = imutils.resize(frame, width=450)
	frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
	frame = np.dstack([frame, frame, frame])

	# display the size of the queue on the frame
	cv2.putText(frame, "Queue Size: {}".format(fvs.Q.qsize()),
		(10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)	

	# show the frame and update the FPS counter
	cv2.imshow("Frame", frame)
	cv2.waitKey(1)
	fps.update()

We start a

while
  loop on Line 26 that will keep grabbing frames from the
FileVideoStream
  queue until the queue is empty. (One caveat: if the main loop ever outpaces the producer thread, the queue can momentarily empty and the loop will exit early; the one-second sleep on Line 20 gives the buffer a head start to help avoid this.)

For each of these frames we’ll apply the same image processing operations, including: resizing, conversion to grayscale, and displaying text on the frame (in this case, our text will be the number of frames in the queue).

The processed frame is displayed to our screen on Lines 40-42.

The last code block computes our FPS throughput rate and performs a bit of cleanup:

# import the necessary packages
from imutils.video import FileVideoStream
from imutils.video import FPS
import numpy as np
import argparse
import imutils
import time
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", required=True,
	help="path to input video file")
args = vars(ap.parse_args())

# start the file video stream thread and allow the buffer to
# start to fill
print("[INFO] starting video file thread...")
fvs = FileVideoStream(args["video"]).start()
time.sleep(1.0)

# start the FPS timer
fps = FPS().start()

# loop over frames from the video file stream
while fvs.more():
	# grab the frame from the threaded video file stream, resize
	# it, and convert it to grayscale (while still retaining 3
	# channels)
	frame = fvs.read()
	frame = imutils.resize(frame, width=450)
	frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
	frame = np.dstack([frame, frame, frame])

	# display the size of the queue on the frame
	cv2.putText(frame, "Queue Size: {}".format(fvs.Q.qsize()),
		(10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)	

	# show the frame and update the FPS counter
	cv2.imshow("Frame", frame)
	cv2.waitKey(1)
	fps.update()

# stop the timer and display FPS information
fps.stop()
print("[INFO] elasped time: {:.2f}".format(fps.elapsed()))
print("[INFO] approx. FPS: {:.2f}".format(fps.fps()))

# do a bit of cleanup
cv2.destroyAllWindows()
fvs.stop()

To see the results of the

read_frames_fast.py
  script, make sure you download the source code + example video using the “Downloads” section at the bottom of this tutorial.

From there, execute the following command:

$ python read_frames_fast.py --video videos/jurassic_park_intro.mp4

Figure 3: Utilizing threading with cv2.VideoCapture and OpenCV leads to higher FPS and a larger throughput rate.

As we can see from the results we were able to process the entire 31 second video clip in 31.09 seconds — that’s an improvement of 34% from the slow, naive method!

The actual frame throughput processing rate is much faster, clocking in at 30.75 frames per second, an improvement of 52.15%.

Threading can dramatically improve the speed of your video processing pipeline — use it whenever you can.

What about built-in webcams, USB cameras, and the Raspberry Pi? What do I do then?

This post has focused on using threading to improve the frame processing rate of video files.

If you’re instead interested in speeding up the FPS of your built-in webcam, USB camera, or Raspberry Pi camera module, please refer to these blog posts:

Summary

In today’s tutorial I demonstrated how to use threading and a queue data structure to improve the FPS throughput rate of your video processing pipeline.

By placing the call to

.read
  of a
cv2.VideoCapture
  object in a thread separate from the main Python script we can avoid blocking I/O operations that would otherwise dramatically slow down our pipeline.

Finally, I provided an example comparing threading with no threading. The results show that by using threading we can improve our processing pipeline by up to 52%.

However, keep in mind that the more steps (i.e., function calls) you make inside your

while
  loop, the more computation needs to be done — therefore, your actual frames per second rate will drop, but you’ll still be processing faster than the non-threaded version.

To be notified when future blog posts are published, be sure to enter your email address in the form below!

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 11-page Resource Guide on Computer Vision and Image Search Engines, including exclusive techniques that I don’t post on this blog! Sound good? If so, enter your email address and I’ll send you the code immediately!

The post Faster video file FPS with cv2.VideoCapture and OpenCV appeared first on PyImageSearch.

Recognizing digits with OpenCV and Python


Today’s tutorial is inspired by a post I saw a few weeks back on /r/computervision asking how to recognize digits in an image containing a thermostat identical to the one at the top of this post.

As Reddit users were quick to point out, utilizing computer vision to recognize digits on a thermostat tends to overcomplicate the problem — a simple data logging thermometer would give much more reliable results with a fraction of the effort.

On the other hand, applying computer vision to projects such as these is really good practice.

Whether you are just getting started with computer vision/OpenCV, or you’re already writing computer vision code on a daily basis, taking the time to hone your skills on mini-projects is paramount to mastering your trade — in fact, I find it so important that I do exercises like this one twice a month.

Every other Friday afternoon I block off two hours on my calendar and practice my basic image processing and computer vision skills on computer vision/OpenCV questions I’ve found on Reddit or StackOverflow.

Doing this exercise helps me keep my skills sharp — it also has the added benefit of making great blog post content.

In the remainder of today’s blog post, I’ll demonstrate how to recognize digits in images using OpenCV and Python.

Looking for the source code to this post?
Jump right to the downloads section.

Recognizing digits with OpenCV and Python

In the first part of this tutorial, we’ll discuss what a seven-segment display is and how we can apply computer vision and image processing operations to recognize these types of digits (no machine learning required!)

From there I’ll provide actual Python and OpenCV code that can be used to recognize these digits in images.

The seven-segment display

You’re likely already familiar with a seven-segment display, even if you don’t recognize the particular term.

A great example of such a display is your classic digital alarm clock:

Figure 1: A classic digital alarm clock that contains four seven-segment displays to represent the time of day.

Each digit on the alarm clock is represented by a seven-segment component just like the one below:

Figure 2: An example of a single seven-segment display. Each segment can be turned “on” or “off” to represent a particular digit (source: Wikipedia).

Seven-segment displays can take on a total of 128 possible states:

Figure 3: A seven-segment display is capable of 128 possible states (source: Wikipedia).

Luckily for us, we are only interested in ten of them — the digits zero to nine:

Figure 4: For the task of digit recognition we only need to recognize ten of these states.

Our goal is to write OpenCV and Python code to recognize each of these ten digit states in an image.

Planning the OpenCV digit recognizer

Just like in the original post on /r/computervision, we’ll be using the thermostat image as input:

Figure 5: Our example input image. Our goal is to recognize the digits on the thermostat using OpenCV and Python.

Whenever I am trying to recognize/identify object(s) in an image I first take a few minutes to assess the problem. Given that my end goal is to recognize the digits on the LCD display I know I need to:

  • Step #1: Localize the LCD on the thermostat. This can be done using edge detection since there is enough contrast between the plastic shell and the LCD.
  • Step #2: Extract the LCD. Given an input edge map I can find contours and look for outlines with a rectangular shape — the largest rectangular region should correspond to the LCD. A perspective transform will give me a nice extraction of the LCD.
  • Step #3: Extract the digit regions. Once I have the LCD itself I can focus on extracting the digits. Since there seems to be contrast between the digit regions and the background of the LCD I’m confident that thresholding and morphological operations can accomplish this.
  • Step #4: Identify the digits. Recognizing the actual digits with OpenCV will involve dividing the digit ROI into seven segments. From there I can apply pixel counting on the thresholded image to determine if a given segment is “on” or “off” (a short sketch of this test follows the list).
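
Here is a minimal, hypothetical sketch of that pixel-counting test. Both the segROI argument (a thresholded, single-segment region) and the 50% fill threshold are illustrative assumptions, not the exact values used in the final implementation:

# hypothetical helper: decide whether one (thresholded) segment is lit
import cv2

def segment_is_on(segROI, min_fill=0.5):
	# count the lit (non-zero) pixels inside the segment region
	total = cv2.countNonZero(segROI)
	area = segROI.shape[0] * segROI.shape[1]
	# treat the segment as "on" if enough of its area is lit
	return total / float(area) > min_fill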

To see how we can accomplish this four-step process to digit recognition with OpenCV and Python, keep reading.

Recognizing digits with computer vision and OpenCV

Let’s go ahead and get this example started.

Open up a new file, name it

recognize_digits.py
 , and insert the following code:
# import the necessary packages
from imutils.perspective import four_point_transform
from imutils import contours
import imutils
import cv2

# define the dictionary of digit segments so we can identify
# each digit on the thermostat
DIGITS_LOOKUP = {
	(1, 1, 1, 0, 1, 1, 1): 0,
	(0, 0, 1, 0, 0, 1, 0): 1,
	(1, 0, 1, 1, 1, 1, 0): 2,
	(1, 0, 1, 1, 0, 1, 1): 3,
	(0, 1, 1, 1, 0, 1, 0): 4,
	(1, 1, 0, 1, 0, 1, 1): 5,
	(1, 1, 0, 1, 1, 1, 1): 6,
	(1, 0, 1, 0, 0, 1, 0): 7,
	(1, 1, 1, 1, 1, 1, 1): 8,
	(1, 1, 1, 1, 0, 1, 1): 9
}

Lines 2-5 import our required Python packages. We’ll be using imutils, my series of convenience functions to make working with OpenCV + Python easier. If you don’t already have

imutils
  installed, you should take a second now to install the package on your system using
pip
 :
$ pip install imutils

Lines 9-20 define a Python dictionary named

DIGITS_LOOKUP
 . Inspired by the approach of /u/Jonno_FTW in the Reddit thread, we can easily define this lookup table where:
  1. The key to the table is the seven-segment array. A one in the array indicates that the given segment is on and a zero indicates that the segment is off.
  2. The value is the actual numerical digit itself: 0-9.

Once we identify the segments in the thermostat display we can pass the array into our

DIGITS_LOOKUP
  table and obtain the digit value.

For reference, this dictionary uses the same segment ordering as in Figure 2 above.
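
Since each key is just a plain Python tuple, the lookup itself is an ordinary dictionary access. For example (using a hypothetical, hand-built pattern for illustration):

# a one means the segment is lit, a zero means it is dark
pattern = (0, 0, 1, 0, 0, 1, 0)
digit = DIGITS_LOOKUP[pattern]
print(digit)  # prints: 1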

Let’s continue with our example:

# import the necessary packages
from imutils.perspective import four_point_transform
from imutils import contours
import imutils
import cv2

# define the dictionary of digit segments so we can identify
# each digit on the thermostat
DIGITS_LOOKUP = {
	(1, 1, 1, 0, 1, 1, 1): 0,
	(0, 0, 1, 0, 0, 1, 0): 1,
	(1, 0, 1, 1, 1, 1, 0): 2,
	(1, 0, 1, 1, 0, 1, 1): 3,
	(0, 1, 1, 1, 0, 1, 0): 4,
	(1, 1, 0, 1, 0, 1, 1): 5,
	(1, 1, 0, 1, 1, 1, 1): 6,
	(1, 0, 1, 0, 0, 1, 0): 7,
	(1, 1, 1, 1, 1, 1, 1): 8,
	(1, 1, 1, 1, 0, 1, 1): 9
}

# load the example image
image = cv2.imread("example.jpg")

# pre-process the image by resizing it, converting it to
# grayscale, blurring it, and computing an edge map
image = imutils.resize(image, height=500)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 50, 200, 255)

Line 23 loads our image from disk.

We then pre-process the image on Lines 27-30 by:

  • Resizing it.
  • Converting the image to grayscale.
  • Applying Gaussian blurring with a 5×5 kernel to reduce high-frequency noise.
  • Computing the edge map via the Canny edge detector.

After applying these pre-processing steps our edge map looks like this:

Figure 6: Applying image processing steps to compute the edge map of our input image.

Notice how the outlines of the LCD are clearly visible — this accomplishes Step #1.

We can now move on to Step #2, extracting the LCD itself:

# import the necessary packages
from imutils.perspective import four_point_transform
from imutils import contours
import imutils
import cv2

# define the dictionary of digit segments so we can identify
# each digit on the thermostat
DIGITS_LOOKUP = {
	(1, 1, 1, 0, 1, 1, 1): 0,
	(0, 0, 1, 0, 0, 1, 0): 1,
	(1, 0, 1, 1, 1, 1, 0): 2,
	(1, 0, 1, 1, 0, 1, 1): 3,
	(0, 1, 1, 1, 0, 1, 0): 4,
	(1, 1, 0, 1, 0, 1, 1): 5,
	(1, 1, 0, 1, 1, 1, 1): 6,
	(1, 0, 1, 0, 0, 1, 0): 7,
	(1, 1, 1, 1, 1, 1, 1): 8,
	(1, 1, 1, 1, 0, 1, 1): 9
}

# load the example image
image = cv2.imread("example.jpg")

# pre-process the image by resizing it, converting it to
# grayscale, blurring it, and computing an edge map
image = imutils.resize(image, height=500)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 50, 200, 255)

# find contours in the edge map, then sort them by their
# size in descending order
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
displayCnt = None

# loop over the contours
for c in cnts:
	# approximate the contour
	peri = cv2.arcLength(c, True)
	approx = cv2.approxPolyDP(c, 0.02 * peri, True)

	# if the contour has four vertices, then we have found
	# the thermostat display
	if len(approx) == 4:
		displayCnt = approx
		break

In order to find the LCD region, we need to extract the contours (i.e., outlines) of the regions in the edge map (Lines 35 and 36).

We then sort the contours by their area, ensuring that contours with a larger area are placed at the front of the list (Line 37).

Given our sorted contours list, we loop over them individually on Line 41 and apply contour approximation.

If our approximated contour has four vertices then we assume we have found the thermostat display (Lines 48-50). This is a reasonable assumption since the largest rectangular region in our input image should be the LCD itself.

After obtaining the four vertices we can extract the LCD via a four point perspective transform:

# import the necessary packages
from imutils.perspective import four_point_transform
from imutils import contours
import imutils
import cv2

# define the dictionary of digit segments so we can identify
# each digit on the thermostat
DIGITS_LOOKUP = {
	(1, 1, 1, 0, 1, 1, 1): 0,
	(0, 0, 1, 0, 0, 1, 0): 1,
	(1, 0, 1, 1, 1, 1, 0): 2,
	(1, 0, 1, 1, 0, 1, 1): 3,
	(0, 1, 1, 1, 0, 1, 0): 4,
	(1, 1, 0, 1, 0, 1, 1): 5,
	(1, 1, 0, 1, 1, 1, 1): 6,
	(1, 0, 1, 0, 0, 1, 0): 7,
	(1, 1, 1, 1, 1, 1, 1): 8,
	(1, 1, 1, 1, 0, 1, 1): 9
}

# load the example image
image = cv2.imread("example.jpg")

# pre-process the image by resizing it, converting it to
# grayscale, blurring it, and computing an edge map
image = imutils.resize(image, height=500)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 50, 200, 255)

# find contours in the edge map, then sort them by their
# size in descending order
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
displayCnt = None

# loop over the contours
for c in cnts:
	# approximate the contour
	peri = cv2.arcLength(c, True)
	approx = cv2.approxPolyDP(c, 0.02 * peri, True)

	# if the contour has four vertices, then we have found
	# the thermostat display
	if len(approx) == 4:
		displayCnt = approx
		break

# extract the thermostat display, apply a perspective transform
# to it
warped = four_point_transform(gray, displayCnt.reshape(4, 2))
output = four_point_transform(image, displayCnt.reshape(4, 2))

Applying this perspective transform gives us a top-down, birds-eye-view of the LCD:

Figure 7: Applying a perspective transform to our image to obtain the LCD region.

Obtaining this view of the LCD satisfies Step #2 — we are now ready to extract the digits from the LCD:

# threshold the warped image, then apply a series of morphological
# operations to clean up the thresholded image
thresh = cv2.threshold(warped, 0, 255,
	cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (1, 5))
thresh = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)

To obtain the digits themselves we need to threshold the warped image (Lines 59 and 60) to reveal the dark regions (i.e., digits) against the lighter background (i.e., the background of the LCD display):
Figure 8: Thresholding the LCD allows us to segment the dark regions (digits/symbols) from the lighter background (the LCD display itself).

We then apply a series of morphological operations to clean up the thresholded image (Lines 61 and 62):

Figure 9: Applying a series of morphological operations cleans up our thresholded LCD and will allow us to segment out each of the digits.

Now that we have a nice segmented image we once again need to apply contour filtering, only this time we are looking for the actual digits:

# find contours in the thresholded image, then initialize the
# digit contours lists
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]
digitCnts = []

# loop over the digit area candidates
for c in cnts:
	# compute the bounding box of the contour
	(x, y, w, h) = cv2.boundingRect(c)

	# if the contour is sufficiently large, it must be a digit
	if w >= 15 and (h >= 30 and h <= 40):
		digitCnts.append(c)

To accomplish this we find contours in our thresholded image (Lines 66 and 67). We also initialize the digitCnts list on Line 69 — this list will store the contours of the digits themselves.

Line 72 starts looping over each of the contours.

For each contour, we compute the bounding box (Line 74), ensure the width and height are of an acceptable size, and if so, update the digitCnts list (Lines 77 and 78).

Note: Determining the appropriate width and height constraints requires a few rounds of trial and error. I would suggest looping over each of the contours, drawing them individually, and inspecting their dimensions. Doing this process ensures you can find commonalities across digit contour properties.
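
Along those lines, here is a minimal debugging sketch (my own addition, assuming the cnts list and output image from the script above are in scope) that draws each candidate contour one at a time and prints its dimensions:

# debugging aid (not part of the original script): inspect each
# candidate contour's bounding box dimensions one at a time
for (i, c) in enumerate(cnts):
	(x, y, w, h) = cv2.boundingRect(c)
	print("contour #{}: w={}, h={}".format(i + 1, w, h))

	# draw the current candidate so we can eyeball its size
	clone = output.copy()
	cv2.rectangle(clone, (x, y), (x + w, y + h), (0, 255, 0), 1)
	cv2.imshow("Candidate", clone)
	cv2.waitKey(0)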

If we were to loop over the contours inside digitCnts and draw the bounding box on our image, the result would look like this:
Figure 10: Drawing the bounding box of each of the digits on the LCD.

Sure enough, we have found the digits on the LCD!

The final step is to actually identify each of the digits:

# sort the contours from left-to-right, then initialize the
# actual digits themselves
digitCnts = contours.sort_contours(digitCnts,
	method="left-to-right")[0]
digits = []

Here we are simply sorting our digit contours from left-to-right based on their (x, y)-coordinates.

This sorting step is necessary as there are no guarantees that the contours are already sorted from left-to-right (the same direction in which we would read the digits).
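
Under the hood, a left-to-right sort simply orders the contours by the x-coordinate of their bounding boxes — a rough equivalent of the sort_contours call above would be:

# roughly equivalent to contours.sort_contours(digitCnts,
# method="left-to-right"): sort by the bounding box x-coordinate
digitCnts = sorted(digitCnts, key=lambda c: cv2.boundingRect(c)[0])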

Next comes the actual digit recognition process:

# loop over each of the digits
for c in digitCnts:
	# extract the digit ROI
	(x, y, w, h) = cv2.boundingRect(c)
	roi = thresh[y:y + h, x:x + w]

	# compute the width and height of each of the 7 segments
	# we are going to examine
	(roiH, roiW) = roi.shape
	(dW, dH) = (int(roiW * 0.25), int(roiH * 0.15))
	dHC = int(roiH * 0.05)

	# define the set of 7 segments
	segments = [
		((0, 0), (w, dH)),	# top
		((0, 0), (dW, h // 2)),	# top-left
		((w - dW, 0), (w, h // 2)),	# top-right
		((0, (h // 2) - dHC) , (w, (h // 2) + dHC)), # center
		((0, h // 2), (dW, h)),	# bottom-left
		((w - dW, h // 2), (w, h)),	# bottom-right
		((0, h - dH), (w, h))	# bottom
	]
	on = [0] * len(segments)

We start looping over each of the digit contours on Line 87.

For each of these regions, we compute the bounding box and extract the digit ROI (Lines 89 and 90).

I have included a GIF animation of each of these digit ROIs below:

Figure 11: Extracting each individual digit ROI by computing the bounding box and applying NumPy array slicing.

Given the digit ROI we now need to localize and extract the seven segments of the digit display.

Lines 94-96 compute the approximate width and height of each segment based on the ROI dimensions.

We then define a list of (x, y)-coordinates that correspond to the seven segments on Lines 99-107. This list follows the same order of segments as Figure 2 above.

Here is an example GIF animation that draws a green box over the current segment being investigated:

Figure 12: An example of drawing the segment ROI for each of the seven segments of the digit.

Finally, Line 108 initializes our on list — a value of one inside this list indicates that a given segment is turned “on” while a value of zero indicates the segment is “off”.
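
As a concrete example, the digit “3” lights every segment except the top-left and bottom-left ones, so in the (top, top-left, top-right, center, bottom-left, bottom-right, bottom) ordering used above we would expect:

# the segment pattern for the digit "3"
on = [1, 0, 1, 1, 0, 1, 1]
print(DIGITS_LOOKUP[tuple(on)])
# prints: 3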

Given the (x, y)-coordinates of the seven display segments, identifying whether a segment is on or off is fairly easy:

	# loop over the segments
	for (i, ((xA, yA), (xB, yB))) in enumerate(segments):
		# extract the segment ROI, count the total number of
		# thresholded pixels in the segment, and then compute
		# the area of the segment
		segROI = roi[yA:yB, xA:xB]
		total = cv2.countNonZero(segROI)
		area = (xB - xA) * (yB - yA)

		# if the total number of non-zero pixels is greater than
		# 50% of the area, mark the segment as "on"
		if total / float(area) > 0.5:
			on[i] = 1

	# lookup the digit and draw it on the image
	digit = DIGITS_LOOKUP[tuple(on)]
	digits.append(digit)
	cv2.rectangle(output, (x, y), (x + w, y + h), (0, 255, 0), 1)
	cv2.putText(output, str(digit), (x - 10, y - 10),
		cv2.FONT_HERSHEY_SIMPLEX, 0.65, (0, 255, 0), 2)

We start looping over the (x, y)-coordinates of each segment on Line 111.

We extract the segment ROI on Line 115, followed by computing the number of non-zero pixels on Line 116 (i.e., the number of pixels in the segment that are “on”).

If the ratio of non-zero pixels to the total area of the segment is greater than 50% then we can assume the segment is “on” and update our on list accordingly (Lines 121 and 122).

After looping over the seven segments we can pass the on list to DIGITS_LOOKUP to obtain the digit itself.

We then draw a bounding box around the digit and display the digit on the output image.
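
One caveat worth noting: if thresholding or the segment test misfires on a noisy image, tuple(on) may not exist in DIGITS_LOOKUP and the lookup will raise a KeyError. A defensive variant (my own tweak, not part of the original script) would skip unreadable digits inside the same loop instead of crashing:

	# defensive variant of the lookup: skip segment patterns that
	# don't match any known digit rather than raising a KeyError
	digit = DIGITS_LOOKUP.get(tuple(on))

	if digit is None:
		continue

	digits.append(digit)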

Finally, our last code block prints the digit to our screen and displays the output image:

# display the digits
print(u"{}{}.{} \u00b0C".format(*digits))
cv2.imshow("Input", image)
cv2.imshow("Output", output)
cv2.waitKey(0)

Notice how we have been able to correctly recognize the digits on the LCD screen using Python and OpenCV:

Figure 13: Correctly recognizing digits in images with OpenCV and Python.

Summary

In today’s blog post I demonstrated how to utilize OpenCV and Python to recognize digits in images.

This approach is specifically intended for seven-segment displays (i.e., the digit displays you would typically see on a digital alarm clock).

By extracting each of the seven segments and applying basic thresholding and morphological operations we can determine which segments are “on” and which are “off”.

From there, we can look up the on/off segments in a Python dictionary data structure to quickly determine the actual digit — no machine learning required!

As I mentioned at the top of this blog post, applying computer vision to recognizing digits in a thermostat image tends to overcomplicate the problem itself — utilizing a data logging thermometer would be more reliable and require substantially less effort.

However, in the case that (1) you do not have access to a data logging sensor or (2) you simply want to hone and practice your computer vision/OpenCV skills, it’s often helpful to see a solution such as this one that demonstrates how the project can be solved.

I hope you enjoyed today’s post!

The post Recognizing digits with OpenCV and Python appeared first on PyImageSearch.


Text skew correction with OpenCV and Python

Today’s tutorial is a Python implementation of my favorite blog post by Félix Abecassis on the process of text skew correction (i.e., “deskewing text”) using OpenCV and image processing functions.

Given an image containing a rotated block of text at an unknown angle, we need to correct the text skew by:

  1. Detecting the block of text in the image.
  2. Computing the angle of the rotated text.
  3. Rotating the image to correct for the skew.

We typically apply text skew correction algorithms in the field of automatic document analysis, but the process itself can be applied to other domains as well.

To learn more about text skew correction, just keep reading.

Text skew correction with OpenCV and Python

The remainder of this blog post will demonstrate how to deskew text using basic image processing operations with Python and OpenCV.

We’ll start by creating a simple dataset that we can use to evaluate our text skew corrector.

We’ll then write Python and OpenCV code to automatically detect and correct the text skew angle in our images.

Creating a simple dataset

Similar to Félix’s example, I have prepared a small dataset of four images that have been rotated by a given number of degrees:

Figure 1: Our four example images that we’ll be applying text skew correction to with OpenCV and Python.

The text block itself is from Chapter 11 of my book, Practical Python and OpenCV, where I’m discussing contours and how to utilize them for image processing and computer vision.

The filenames of the four files follow:

$ ls images/
neg_28.png	neg_4.png	pos_24.png	pos_41.png

The first part of the filename specifies whether our image has been rotated counter-clockwise (negative) or clockwise (positive).

The second component of the filename is the actual number of degrees the image has been rotated by.

The goal of our text skew correction algorithm will be to correctly determine the direction and angle of the rotation, then correct for it.
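
If you would like to build a similar dataset yourself, rotating a straight text image by a known angle is all it takes. Here is a minimal sketch using imutils.rotate_bound (assuming you have an unrotated input named text.png; for positive angles rotate_bound rotates clockwise and keeps the entire image in view):

# generate a synthetic skewed example from a straight text image
import cv2
import imutils

image = cv2.imread("text.png")
rotated = imutils.rotate_bound(image, 24)
cv2.imwrite("pos_24.png", rotated)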

To see how our text skew correction algorithm is implemented with OpenCV and Python, be sure to read the next section.

Deskewing text with OpenCV and Python

To get started, open up a new file and name it correct_skew.py.

From there, insert the following code:

# import the necessary packages
import numpy as np
import argparse
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to input image file")
args = vars(ap.parse_args())

# load the image from disk
image = cv2.imread(args["image"])

Lines 2-4 import our required Python packages. We’ll be using OpenCV via our cv2 bindings, so if you don’t already have OpenCV installed on your system, please refer to my list of OpenCV install tutorials to help you get your system set up and configured.

We then parse our command line arguments on Lines 7-10. We only need a single argument here, --image, which is the path to our input image.

The image is then loaded from disk on Line 13.

Our next step is to isolate the text in the image:

# convert the image to grayscale and flip the foreground
# and background to ensure foreground is now "white" and
# the background is "black"
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.bitwise_not(gray)

# threshold the image, setting all foreground pixels to
# 255 and all background pixels to 0
thresh = cv2.threshold(gray, 0, 255,
	cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]

Our input images contain text that is dark on a light background; however, to apply our text skew correction process, we first need to invert the image (i.e., the text is now light on a dark background — we need the inverse).

When applying computer vision and image processing operations, it’s common for the foreground to be represented as light while the background (the part of the image we are not interested in) is dark.

A thresholding operation (Lines 23 and 24) is then applied to binarize the image:

Figure 2: Applying a thresholding operation to binarize our image. Our text is now white on a black background.

Given this thresholded image, we can now compute the minimum rotated bounding box that contains the text regions:

# grab the (x, y) coordinates of all pixel values that
# are greater than zero, then use these coordinates to
# compute a rotated bounding box that contains all
# coordinates
coords = np.column_stack(np.where(thresh > 0))
angle = cv2.minAreaRect(coords)[-1]

# the `cv2.minAreaRect` function returns values in the
# range [-90, 0); as the rectangle rotates clockwise the
# returned angle trends to 0 -- in this special case we
# need to add 90 degrees to the angle
if angle < -45:
	angle = -(90 + angle)

# otherwise, just take the inverse of the angle to make
# it positive
else:
	angle = -angle

Line 30 finds all (x, y)-coordinates in the thresh image that are part of the foreground.

We pass these coordinates into cv2.minAreaRect which then computes the minimum rotated rectangle that contains the entire text region.

The cv2.minAreaRect function returns angle values in the range [-90, 0). As the rectangle is rotated clockwise the angle value increases towards zero. When zero is reached, the angle is set back to -90 degrees again and the process continues.

Note: For more information on cv2.minAreaRect, please see this excellent explanation by Adam Goodwin.

Lines 37 and 38 handle the case where the angle is less than -45 degrees, in which case we add 90 degrees to the angle and take the inverse.

Otherwise, Lines 42 and 43 simply take the inverse of the angle.
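
To make the conversion concrete, here is the same logic pulled out into a standalone function, along with two worked values that match the angles reported in the example results later in this post (a sketch for intuition only):

def correct_angle(angle):
	# cv2.minAreaRect reports angles in [-90, 0); convert them into
	# the signed correction we feed into cv2.getRotationMatrix2D
	if angle < -45:
		return -(90 + angle)

	return -angle

print(correct_angle(-85.914))  # -4.086 (a clockwise correction)
print(correct_angle(-23.974))  # 23.974 (a counter-clockwise correction)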

Now that we have determined the text skew angle, we need to apply an affine transformation to correct for the skew:

# rotate the image to deskew it
(h, w) = image.shape[:2]
center = (w // 2, h // 2)
M = cv2.getRotationMatrix2D(center, angle, 1.0)
rotated = cv2.warpAffine(image, M, (w, h),
	flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_REPLICATE)

Lines 46 and 47 determine the center (x, y)-coordinate of the image. We pass the center coordinates and rotation angle into cv2.getRotationMatrix2D (Line 48). This rotation matrix M is then used to perform the actual transformation on Lines 49 and 50.

Finally, we display the results to our screen:

# draw the correction angle on the image so we can validate it
cv2.putText(rotated, "Angle: {:.2f} degrees".format(angle),
	(10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)

# show the output image
print("[INFO] angle: {:.3f}".format(angle))
cv2.imshow("Input", image)
cv2.imshow("Rotated", rotated)
cv2.waitKey(0)

Line 53 draws the angle on our image so we can verify that the output image matches the rotation angle (you would obviously want to remove this line in a document processing pipeline).

Lines 57-60 handle displaying the output image.

Skew correction results

To grab the code + example images used inside this blog post, be sure to use the “Downloads” section at the bottom of this post.

From there, execute the following command to correct the skew for our neg_4.png image:
$ python correct_skew.py --image images/neg_4.png 
[INFO] angle: -4.086

Figure 3: Applying skew correction using OpenCV and Python.

Here we can see that the input image has a counter-clockwise skew of 4 degrees. Applying our skew correction with OpenCV detects this 4 degree skew and corrects for it.

Here is another example, this time with a counter-clockwise skew of 28 degrees:

$ python correct_skew.py --image images/neg_28.png 
[INFO] angle: -28.009

Figure 4: Deskewing images using OpenCV and Python.

Again, our skew correction algorithm is able to correct the input image.

This time, let’s try a clockwise skew:

$ python correct_skew.py --image images/pos_24.png 
[INFO] angle: 23.974

Figure 5: Correcting for skew in text regions with computer vision.

And finally a more extreme clockwise skew of 41 degrees:

$ python correct_skew.py --image images/pos_41.png 
[INFO] angle: 41.037

Figure 6: Deskewing text with OpenCV.

Regardless of skew angle, our algorithm is able to correct for skew in images using OpenCV and Python.

Interested in learning more about computer vision and OpenCV?

If you’re interested in learning more about the fundamentals of computer vision and image processing, be sure to take a look at my book, Practical Python and OpenCV:

Inside the book you’ll learn the basics of computer vision and OpenCV, working your way up to more advanced topics such as face detection, object tracking in video, and handwriting recognition, all with lots of examples, code, and detailed walkthroughs.

Summary

In today’s blog post I provided a Python implementation of Félix Abecassis’ approach to skew correction.

The algorithm itself is quite straightforward, relying on only basic image processing techniques such as thresholding, computing the minimum area rotated rectangle, and then applying an affine transformation to correct the skew.

We would commonly use this type of text skew correction in an automatic document analysis pipeline where our goal is to digitize a set of documents, correct for text skew, and then apply OCR to convert the text in the image to machine-encoded text.

I hope you enjoyed today’s tutorial!

The post Text skew correction with OpenCV and Python appeared first on PyImageSearch.

Resolving macOS, OpenCV, and Homebrew install errors

As you undoubtedly know, configuring and installing OpenCV on your macOS machine can be a bit of a pain.

To help you and other PyImageSearch readers get OpenCV installed faster (and with less headaches), I put together a tutorial on using Homebrew to install OpenCV.

Using Homebrew allows you to skip manually configuring your build and compiling OpenCV from source.

Instead, you simply use what are called brew formulas which define how a given package should be automatically configured and installed, similar to how a package manager can intelligently install libraries and software on your system.

However, a bit of a problem arose a few weeks ago when it was discovered that there were some errors in the most recent Homebrew formula used to build and install OpenCV on macOS.

This formula caused two types of errors when building OpenCV on macOS via Homebrew:

  • Error #1: A report that both Python 2 and Python 3 wrappers could not be built (this is not true, you can build both Python 2.7 and Python 3 bindings in the same Homebrew command).
  • Error #2: A missing downloader.cmake file.

Myself, as well as PyImageSearch readers Andreas Linnarsson, Francis, and Patrick (see the comments section of the Homebrew OpenCV install post for the gory details) dove into the problem and tackled it head on.

Today I’m going to share our findings in hopes that it helps you and other PyImageSearch readers install OpenCV via Homebrew on your macOS machines.

In an ideal world these instructions will eventually become out of date as the Homebrew formula used to configure and install OpenCV is updated to correct these errors.

To learn more about resolving Homebrew errors when installing OpenCV, just keep reading.

Resolving macOS, OpenCV, and Homebrew install errors

In the remainder of this blog post I’ll discuss common errors you may run into when installing OpenCV via Homebrew on your macOS system.

I’ll also provide extra bonus suggestions regarding checking your Python version to help you debug these errors further.

Error #1: opencv3: Does not support building both Python 2 and 3 wrappers

Assuming you followed my original Homebrew + OpenCV install post, you may have run into the following error when trying to install OpenCV:

$ brew install opencv3 --with-contrib --with-python3 --HEAD
...
Error: opencv3: Does not support building both Python 2 and 3 wrappers

This error was introduced by the following commit. I find the error frustrating for two reasons:

  1. There is no need to make this check…
  2. …because Homebrew can be used to compile OpenCV twice: once for Python 2.7 and then again for Python 3.

To start, OpenCV 3 can be built with Python 2.7 and Python 3 bindings. It just requires two separate compiles.

The first compile handles building OpenCV 3 + Python 2.7 bindings while the second compile generates the OpenCV 3 + Python 3 bindings. Doing this installs OpenCV 3 properly while generating the correct cv2.so bindings for each respective Python version.

There are two ways to resolve this error, as discussed in this StackOverflow thread.

The first method is arguably simpler, but doesn’t address the real problem. Here we just update the brew install opencv3 command to indicate that we want to build OpenCV 3 without Python 3 bindings:
$ brew install opencv3 --with-contrib

Notice how we have left out the --with-python3 switch. In this case, Homebrew automatically builds Python 2.7 bindings for OpenCV 3 (there is no --with-python2 switch; it’s automatically assumed).

Similarly, if we wanted to build OpenCV 3 with Python 3 bindings, we would update the brew install opencv3 command to be:
$ brew install opencv3 --with-contrib --with-python3 --without-python

Here we supply --with-python3 to indicate we would like OpenCV 3 + Python 3 bindings to be generated, but to skip generating the OpenCV 3 + Python 2.7 bindings using the --without-python switch.

This method works; however, I find it both frustrating and confusing. To start, the --without-python switch is extremely ambiguous.

If I were to supply a switch named --without-python to an install command I would assume that it would build NO Python bindings whatsoever, regardless of Python version. However, that’s not the case. Instead, --without-python really means no Python 2.7 bindings.

These switches are confusing both to OpenCV install veterans such as myself and to novices who are just trying to get their development environment configured correctly for the first time.

In my opinion, a better solution (until a fix is fully released, of course) is to edit the OpenCV 3 install formula itself.

To edit the OpenCV 3 Homebrew install formula, execute the following command:

$ brew edit opencv3

And then find the following configuration block:

if build.with?("python3") && build.with?("python")
  # Opencv3 Does not support building both Python 2 and 3 versions
  odie "opencv3: Does not support building both Python 2 and 3 wrappers"
end

As you can see from my screenshot below, this configuration is on Lines 187-190 (however, these lines will change as the OpenCV 3 Homebrew formula is updated).

Figure 1: Finding the Homebrew + OpenCV 3 formula that needs to be edited.

Once you’ve found this section, comment these four lines out:

#if build.with?("python3") && build.with?("python")
#  # Opencv3 Does not support building both Python 2 and 3 versions
#  odie "opencv3: Does not support building both Python 2 and 3 wrappers"
#end

I’ve provided a screenshot demonstrating commenting these lines out as well:

Figure 2: Updating the Homebrew + OpenCV 3 install formula to resolve the error.

After you’ve commented the lines out, save and exit the editor to update the OpenCV 3 Homebrew install formula.

From there you should be able to successfully install OpenCV 3 via Homebrew using the following command:

$ brew install opencv3 --with-contrib --with-python3

Figure 3: Successfully compiling OpenCV 3 with Python 2.7 and Python 3 bindings on macOS via Homebrew.

Note: If you receive an error message related to downloader.cmake, make sure you proceed to the next section.

After OpenCV 3 has finished installing, go back to the original tutorial, and follow the instructions starting with the “Handling the Python 3 issue” section.

From there, you will have OpenCV 3 installed with both Python 2.7 and Python 3 bindings:

Figure 4: Importing the cv2 library into a Python 2.7 and Python 3 shell.

Again, keep in mind that two separate compiles were done in order to generate these bindings. The first compile generated the OpenCV 3 + Python 2.7 bindings while the second compile created the OpenCV 3 + Python 3 bindings.

Error #2: No such file or directory 3rdparty/ippicv/downloader.cmake

The second error you may encounter when installing OpenCV 3 via Homebrew is related to the downloader.cmake file. This error only occurs when you supply the --HEAD switch to the brew install opencv3 command.

The reason for this error is that the 3rdparty/ippicv/downloader.cmake file was removed from the repo; however, the Homebrew install formula has not been updated to reflect this (source).

Therefore, the easiest way to get around this error is to simply omit the --HEAD switch.

For example, if your previous OpenCV 3 + Homebrew install command was:

$ brew install opencv3 --with-contrib --with-python3 --HEAD

Simply update it to be:

$ brew install opencv3 --with-contrib --with-python3

Provided you’ve followed the instructions from the “Error #1” section above, Homebrew should now install OpenCV 3 with Python 2.7 and Python 3 bindings. You’ll now want to go back to the original Homebrew + OpenCV tutorial, and follow the instructions starting with the “Handling the Python 3 issue” section.

BONUS: Check your Python version and update paths accordingly

If you’re new to Unix environments and the command line (or if this is the first time you’ve worked with Python + OpenCV together), a common mistake I see novices make is forgetting to check their Python version number.

You can check your version of Python 2.7 using the following command:

$ python --version
Python 2.7.13

Similarly, this command will give you your Python 3 version:

$ python3 --version
Python 3.6.1

Why is this so important?

The original Homebrew + OpenCV install tutorial was written for Python 2.7 and Python 3.5. However, Python versions update. Python 3.6 has been officially released and is being used on many machines. In fact, if you were to install Python 3 via Homebrew (at the time of this writing), Python 3.6 would be installed.

This is important because you need to check your file paths.

For example, if I were to tell you to check the site-packages directory of your Python 3 install and provide an example command of:
$ ls /usr/local/opt/opencv3/lib/python3.5/site-packages/

You should first check your Python 3 version. The command executed above assumes Python 3.5. However, if after running python3 --version you find you are using Python 3.6, you would need to update your path to be:
$ ls /usr/local/opt/opencv3/lib/python3.6/site-packages/

Notice how python3.5 was changed to python3.6.

Forgetting to check and validate file paths is a common mistake that I see novices make when installing and configuring OpenCV with Python bindings.

Do not blindly copy and paste commands in your terminal. Instead, take the time to understand what they are doing so you can adapt the instructions to your own development environment.
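
If you would rather not guess at the path, a couple of lines of Python will report the site-packages directory for whichever interpreter runs them — a small sketch (invoke it with the same python or python3 binary you plan to use):

# print the site-packages directory for the interpreter running
# this snippet (works on both Python 2.7 and Python 3)
from distutils.sysconfig import get_python_lib
print(get_python_lib())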

In general, the instructions to install OpenCV + Python on a system do not change — but Python and OpenCV versions do change, therefore some file paths will change slightly. Normally all this amounts to changing one or two characters in a file path.

Summary

In today’s blog post we reviewed two common error messages you may encounter when installing OpenCV 3 via Homebrew:

  • Error #1: A report that both Python 2 and Python 3 wrappers could not be built.
  • Error #2: A missing
    downloader.cmake
      file.

I then provided solutions to each of these errors thanks to the help of PyImageSearch readers Andreas Linnarsson, Francis, and Patrick.

I hope these instructions help you avoid these common errors when installing OpenCV 3 via Homebrew on your macOS machine!

The post Resolving macOS, OpenCV, and Homebrew install errors appeared first on PyImageSearch.

Deep Learning with OpenCV

Two weeks ago OpenCV 3.3 was officially released, bringing with it a highly improved deep learning (dnn) module. This module now supports a number of deep learning frameworks, including Caffe, TensorFlow, and Torch/PyTorch.

Furthermore, this API for using pre-trained deep learning models is compatible with both the C++ API and the Python bindings, making it dead simple to:

  1. Load a model from disk.
  2. Pre-process an input image.
  3. Pass the image through the network and obtain the output classifications.

While we cannot train deep learning models using OpenCV (nor should we), this does allow us to take our models trained using dedicated deep learning libraries/tools and then efficiently use them directly inside our OpenCV scripts.

In the remainder of this blog post I’ll demonstrate the fundamentals of how to take a pre-trained deep learning network on the ImageNet dataset and apply it to input images.

To learn more about deep learning with OpenCV, just keep reading.

Deep Learning with OpenCV

In the first part of this post, we’ll discuss the OpenCV 3.3 release and the overhauled dnn module.

We’ll then write a Python script that will use OpenCV and GoogLeNet (pre-trained on ImageNet) to classify images.

Finally, we’ll explore the results of our classifications.

Deep Learning inside OpenCV 3.3

The dnn module of OpenCV has been part of the opencv_contrib repository since version v3.1. Now in OpenCV 3.3 it is included in the main repository.

Why should you care?

Deep Learning is a fast growing domain of Machine Learning and if you’re working in the field of computer vision/image processing already (or getting up to speed), it’s a crucial area to explore.

With OpenCV 3.3, we can utilize pre-trained networks with popular deep learning frameworks. The fact that they are pre-trained implies that we don’t need to spend many hours training the network — rather we can complete a forward pass and utilize the output to make a decision within our application.

OpenCV is not (and does not intend to be) a tool for training networks — there are already great frameworks available for that purpose. Since a network (such as a CNN) can be used as a classifier, it makes logical sense that OpenCV has a Deep Learning module that we can leverage easily within the OpenCV ecosystem.

Popular network architectures compatible with OpenCV 3.3 include:

  • GoogLeNet (used in this blog post)
  • AlexNet
  • SqueezeNet
  • VGGNet (and associated flavors)
  • ResNet

The release notes for this module are available on the OpenCV repository page.

Aleksandr Rybnikov, the main contributor for this module, has ambitious plans for this module so be sure to stay on the lookout and read his release notes (in Russian, so make sure you have Google Translation enabled in your browser if Russian is not your native language).

It’s my opinion that the dnn module will have a big impact on the OpenCV community, so let’s get the word out.

Configure your machine with OpenCV 3.3

Installing OpenCV 3.3 is on par with installing other versions. The same install tutorials can be utilized — just make sure you download and use the correct release.

Simply follow these instructions for macOS or Ubuntu while making sure to use the opencv and opencv_contrib releases for OpenCV 3.3. If you opt for the macOS + Homebrew install instructions, be sure to use the --HEAD switch (among the others mentioned) to get the bleeding edge version of OpenCV.

If you’re using virtual environments (highly recommended), you can easily install OpenCV 3.3 alongside a previous version. Just create a brand new virtual environment (and name it appropriately) as you follow the tutorial corresponding to your system.

OpenCV deep learning functions and frameworks

OpenCV 3.3 supports the Caffe, TensorFlow, and Torch/PyTorch frameworks.

Keras is currently not supported (since Keras is actually a wrapper around backends such as TensorFlow and Theano), although I imagine it’s only a matter of time until Keras is directly supported given the popularity of the deep learning library.

Using OpenCV 3.3 we can load images from disk using the following functions inside dnn:
  • cv2.dnn.blobFromImage
  • cv2.dnn.blobFromImages

We can directly import models from various frameworks via the “create” methods:

  • cv2.dnn.createCaffeImporter
  • cv2.dnn.createTensorFlowImporter
  • cv2.dnn.createTorchImporter

Although I think it’s easier to simply use the “read” methods and load a serialized model from disk directly:

  • cv2.dnn.readNetFromCaffe
  • cv2.dnn.readNetFromTensorFlow
  • cv2.dnn.readNetFromTorch
  • cv2.dnn.readTorchBlob

Once we have loaded a model from disk, the .forward method is used to forward-propagate our image and obtain the actual classification.
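
Putting those pieces together, the typical flow looks something like the sketch below (the prototxt, caffemodel, and image filenames are placeholders for whichever serialized Caffe model and input you are using):

# import the necessary packages
import cv2

# load a serialized Caffe model from disk, convert an input image
# to a blob, and forward-propagate the blob through the network
net = cv2.dnn.readNetFromCaffe("deploy.prototxt", "model.caffemodel")
image = cv2.imread("example.jpg")
blob = cv2.dnn.blobFromImage(image, 1, (224, 224), (104, 117, 123))
net.setInput(blob)
preds = net.forward()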

To learn how all these OpenCV deep learning pieces fit together, let’s move on to the next section.

Classifying images using deep learning and OpenCV

In this section, we’ll be creating a Python script that can be used to classify input images using OpenCV and GoogLeNet (pre-trained on ImageNet) using the Caffe framework.

The GoogLeNet architecture (now known as “Inception” after the novel micro-architecture) was introduced by Szegedy et al. in their 2014 paper, Going deeper with convolutions.

Other architectures are also supported with OpenCV 3.3 including AlexNet, ResNet, and SqueezeNet — we’ll be examining these architectures for deep learning with OpenCV in a future blog post.

In the meantime, let’s learn how we can load a pre-trained Caffe model and use it to classify an image using OpenCV.

To begin, open up a new file, name it deep_learning_with_opencv.py, and insert the following code:
# import the necessary packages
import numpy as np
import argparse
import time
import cv2

On Lines 2-5 we import our necessary packages.

Then we parse command line arguments:

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to input image")
ap.add_argument("-p", "--prototxt", required=True,
	help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
	help="path to Caffe pre-trained model")
ap.add_argument("-l", "--labels", required=True,
	help="path to ImageNet labels (i.e., syn-sets)")
args = vars(ap.parse_args())

On Line 8 we create an argument parser followed by establishing four required command line arguments (Lines 9-16):

  • --image
     : The path to the input image.
  • --prototxt
     : The path to the Caffe “deploy” prototxt file.
  • --model
     : The pre-trained Caffe model (i.e., the network weights themselves).
  • --labels
     : The path to ImageNet labels (i.e., “syn-sets”).

Now that we’ve established our arguments, we parse them and store them in a variable,

args
 , for easy access later.

Let’s load the input image and class labels:

# load the input image from disk
image = cv2.imread(args["image"])

# load the class labels from disk
rows = open(args["labels"]).read().strip().split("\n")
classes = [r[r.find(" ") + 1:].split(",")[0] for r in rows]

On Line 20, we load the

image
  from disk via
cv2.imread
 .

Let’s take a closer look at the class label data which we load on Lines 23 and 24:

n01440764 tench, Tinca tinca
n01443537 goldfish, Carassius auratus
n01484850 great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias
n01491361 tiger shark, Galeocerdo cuvieri
n01494475 hammerhead, hammerhead shark
n01496331 electric ray, crampfish, numbfish, torpedo
n01498041 stingray
...

As you can see, we have a unique identifier followed by a space, some class labels, and a new-line. Parsing this file line-by-line is straightforward and efficient using Python.

First, we load the class label

rows
  from disk into a list. To do this we strip whitespace from the beginning and end of each line while using the new-line (‘
\n
 ‘) as the row delimiter (Line 23). The result is a list of IDs and labels:
['n01440764 tench, Tinca tinca', 'n01443537 goldfish, Carassius auratus',
'n01484850 great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias',
'n01491361 tiger shark, Galeocerdo cuvieri',
'n01494475 hammerhead, hammerhead shark',
'n01496331 electric ray, crampfish, numbfish, torpedo',
'n01498041 stingray', ...]

Second, we use list comprehension to extract the relevant class labels from

rows
  by looking for the space (‘ ‘) after the ID, followed by delimiting class labels with a comma (‘
,
 ‘). The result is simply a list of class labels:
['tench', 'goldfish', 'great white shark', 'tiger shark',
'hammerhead', 'electric ray', 'stingray', ...]

Now that we’ve taken care of the labels, let’s dig into the

dnn
  module of OpenCV 3.3:
# our CNN requires fixed spatial dimensions for our input image(s)
# so we need to ensure it is resized to 224x224 pixels while
# performing mean subtraction (104, 117, 123) to normalize the input;
# after executing this command our "blob" now has the shape:
# (1, 3, 224, 224)
blob = cv2.dnn.blobFromImage(image, 1, (224, 224), (104, 117, 123))

Taking note of the comment in the block above, we use

cv2.dnn.blobFromImage
  to perform mean subtraction to normalize the input image, which results in a known blob shape (Line 31).
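As a quick aside, the arguments to cv2.dnn.blobFromImage are, in order: the input image, a scale factor (1 here, i.e., no scaling), the target spatial size, and the per-channel mean to subtract. If you want to verify the resulting blob yourself, a one-line sanity check (assuming the code above has already run) will do:

# the blob is in NCHW order, so this should print (1, 3, 224, 224)
print(blob.shape)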

We then load our model from disk:

# load our serialized model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])

Since we’ve opted to use Caffe, we utilize

cv2.dnn.readNetFromCaffe
  to load our Caffe model definition
prototxt
  and pre-trained 
model
  from disk (Line 35).

If you are familiar with Caffe, you’ll recognize the

prototxt
  file as a plain text configuration which follows a JSON-like structure — I recommend that you open
bvlc_googlenet.prototxt
  from the “Downloads” section in a text editor to inspect it.

Note: If you are unfamiliar with configuring Caffe CNNs, then this is a great time to consider the PyImageSearch Gurus course — inside the course you’ll get an in depth look at using deep nets for computer vision and image classification.

Now let’s complete a forward pass through the network with

blob
  as the input:
# set the blob as input to the network and perform a forward-pass to
# obtain our output classification
net.setInput(blob)
start = time.time()
preds = net.forward()
end = time.time()
print("[INFO] classification took {:.5} seconds".format(end - start))

It is important to note at this step that we aren’t training a CNN — rather, we are making use of a pre-trained network. Therefore we are just passing the blob through the network (i.e., forward propagation) to obtain the result (no back-propagation).

First, we specify

blob
  as our input (Line 39). Second, we make a
start
  timestamp (Line 40), followed by passing our input image through the network and storing the predictions. Finally, we set an
end
  timestamp (Line 42) so we can calculate the difference and print the elapsed time (Line 43).

Let’s finish up by determining the top five predictions for our input image:

# sort the indexes of the probabilities in descending order (higher
# probability first) and grab the top-5 predictions
idxs = np.argsort(preds[0])[::-1][:5]

Using NumPy, we can easily sort and extract the top five predictions on Line 47.
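If the indexing idiom on that line looks dense, here is a toy example with made-up probabilities showing how np.argsort plus [::-1] yields indexes sorted by descending probability:

import numpy as np

# five hypothetical class probabilities
preds = np.array([0.05, 0.60, 0.10, 0.20, 0.03])

# argsort returns ascending indexes; [::-1] reverses to descending,
# and [:5] keeps the top five
idxs = np.argsort(preds)[::-1][:5]
print(idxs)  # [1 3 2 0 4]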

Next, we will display the top five class predictions:

# loop over the top-5 predictions and display them
for (i, idx) in enumerate(idxs):
	# draw the top prediction on the input image
	if i == 0:
		text = "Label: {}, {:.2f}%".format(classes[idx],
			preds[0][idx] * 100)
		cv2.putText(image, text, (5, 25),  cv2.FONT_HERSHEY_SIMPLEX,
			0.7, (0, 0, 255), 2)

	# display the predicted label + associated probability to the
	# console	
	print("[INFO] {}. label: {}, probability: {:.5}".format(i + 1,
		classes[idx], preds[0][idx]))

# display the output image
cv2.imshow("Image", image)
cv2.waitKey(0)

The idea for this loop is to (1) draw the top prediction label on the image itself and (2) print the associated class label probabilities to the terminal.

Lastly, we display the image to the screen (Line 64) and wait for the user to press a key before exiting (Line 65).

Deep learning and OpenCV classification results

Now that we have implemented our Python script to utilize deep learning with OpenCV, let’s go ahead and apply it to a few example images.

Make sure you use the “Downloads” section of this blog post to download the source code + pre-trained GoogLeNet architecture + example images.

From there, open up a terminal and execute the following command:

$ python deep_learning_with_opencv.py --image images/jemma.png \
	--prototxt bvlc_googlenet.prototxt \
	--model bvlc_googlenet.caffemodel --labels synset_words.txt
[INFO] loading model...
[INFO] classification took 0.075035 seconds
[INFO] 1. label: beagle, probability: 0.81137
[INFO] 2. label: Labrador retriever, probability: 0.031416
[INFO] 3. label: bluetick, probability: 0.023929
[INFO] 4. label: EntleBucher, probability: 0.017507
[INFO] 5. label: Greater Swiss Mountain dog, probability: 0.01444

Figure 1: Using OpenCV and deep learning to predict the class label for an input image.

In the above example, we have Jemma, the family beagle. Using OpenCV and GoogLeNet we have correctly classified this image as “beagle”.

Furthermore, inspecting the top-5 results we can see that the other top predictions are also relevant, all of which are dogs with physical appearances similar to beagles.

Taking a look at the timing we also see that the forward pass took < 1 second, even though we are using our CPU.

Keep in mind that the forward pass is substantially faster than the backward pass as we do not need to compute the gradient and backpropagate through the network.

Let’s classify another image using OpenCV and deep learning:

$ python deep_learning_with_opencv.py --image images/traffic_light.png \
	--prototxt bvlc_googlenet.prototxt \
	--model bvlc_googlenet.caffemodel --labels synset_words.txt
[INFO] loading model...
[INFO] classification took 0.080521 seconds
[INFO] 1. label: traffic light, probability: 1.0
[INFO] 2. label: pole, probability: 4.9961e-07
[INFO] 3. label: spotlight, probability: 3.4974e-08
[INFO] 4. label: street sign, probability: 3.3623e-08
[INFO] 5. label: loudspeaker, probability: 2.0235e-08

Figure 2: OpenCV and deep learning is used to correctly label this image as “traffic light”.

OpenCV and GoogLeNet correctly label this image as “traffic light” with 100% certainty.

In this example we have a “bald eagle”:

$ python deep_learning_with_opencv.py --image images/eagle.png \
	--prototxt bvlc_googlenet.prototxt \
	--model bvlc_googlenet.caffemodel --labels synset_words.txt
[INFO] loading model...
[INFO] classification took 0.087207 seconds
[INFO] 1. label: bald eagle, probability: 0.96768
[INFO] 2. label: kite, probability: 0.031964
[INFO] 3. label: vulture, probability: 0.00023595
[INFO] 4. label: albatross, probability: 6.3653e-05
[INFO] 5. label: black grouse, probability: 1.6147e-05

Figure 3: The “deep neural network” (dnn) module inside OpenCV 3.3 can be used to classify images using pre-trained models.

We are once again able to correctly classify the input image.

Our final example is a “vending machine”:

$ python deep_learning_with_opencv.py --image images/vending_machine.png \
	--prototxt bvlc_googlenet.prototxt \
	--model bvlc_googlenet.caffemodel --labels synset_words.txt
[INFO] loading model...
[INFO] classification took 0.099602 seconds
[INFO] 1. label: vending machine, probability: 0.99269
[INFO] 2. label: cash machine, probability: 0.0023691
[INFO] 3. label: pay-phone, probability: 0.00097005
[INFO] 4. label: ashcan, probability: 0.00092097
[INFO] 5. label: mailbox, probability: 0.00061188

Figure 4: Since our GoogLeNet model is pre-trained on ImageNet, we can classify each of the 1,000 labels inside the dataset using OpenCV + deep learning.

OpenCV + deep learning once again correctly classifies the image.

Summary

In today’s blog post we learned how to use OpenCV for deep learning.

With the release of OpenCV 3.3 the deep neural network (

dnn
 ) library has been substantially overhauled, allowing us to load pre-trained networks via the Caffe, TensorFlow, and Torch/PyTorch frameworks and then use them to classify input images.

I imagine Keras support will also be coming soon, given how popular the framework is. This will likely be a non-trivial implementation as Keras itself can support multiple numeric computation backends.

Over the next few weeks we’ll:

  1. Take a deeper dive into the
    dnn
      module and how it can be used inside our Python + OpenCV scripts.
  2. Learn how to modify Caffe
    .prototxt
      files to be compatible with OpenCV.
  3. Discover how we can apply deep learning using OpenCV to the Raspberry Pi.

This is a can’t-miss series of blog posts, so before you go, make sure you enter your email address in the form below to be notified when these posts go live!

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 11-page Resource Guide on Computer Vision and Image Search Engines, including exclusive techniques that I don’t post on this blog! Sound good? If so, enter your email address and I’ll send you the code immediately!

The post Deep Learning with OpenCV appeared first on PyImageSearch.

Raspbian Stretch: Install OpenCV 3 + Python on your Raspberry Pi


It’s been over two years since the release of Raspbian Jessie. As of August 17th, 2017, the Raspberry Pi foundation has officially released the successor to Raspbian Jessie — Raspbian Stretch.

Just as I have done in previous blog posts, I’ll be demonstrating how to install OpenCV 3 with Python bindings on Raspbian Stretch.

If you are looking for previous installation instructions for different platforms, please consult this list:

Otherwise, let’s proceed with getting OpenCV 3 with Python bindings installed on Raspbian Stretch!

The quick start video tutorial

If this is your first time installing OpenCV or you are just getting started with Linux, I highly suggest that you watch the video below and follow along as I guide you step-by-step on how to install OpenCV 3 on your Raspberry Pi running Raspbian Stretch:

Otherwise, if you feel comfortable using the command line or if you have previous experience with Linux environments, feel free to use the text-based version of this guide below.

Assumptions

In this tutorial, I am going to assume that you already own a Raspberry Pi 3 with Raspbian Stretch installed.

If you don’t already have the Raspbian Stretch OS, you’ll need to upgrade your OS to take advantage of Raspbian Stretch’s new features.

To upgrade your Raspberry Pi 3 to Raspbian Stretch, you may download it here and follow these upgrade instructions (or these for the NOOBS route which is recommended for beginners). The former instructions take approximately 10 minutes to download via a torrent client and about 10 minutes to flash the SD card at which point you can power up and proceed to the next section.

Note: If you are upgrading your Raspberry Pi 3 from Raspbian Jessie to Raspbian Stretch, there is the potential for problems. Proceed at your own risk, and consult the Raspberry Pi forums for help.

Important: It is my recommendation that you proceed with a fresh install of Raspbian Stretch! Upgrading from Raspbian Jessie is not recommended.

Assuming that your OS is up to date, you’ll need one of the following for the remainder of this post:

  • Physical access to your Raspberry Pi 3 so that you can open up a terminal and execute commands
  • Remote access via SSH or VNC.

I’ll be doing the majority of this tutorial via SSH, but as long as you have access to a terminal, you can easily follow along.

Can’t SSH? If you see your Pi on your network, but can’t ssh to it, you may need to enable SSH. This can easily be done via the Raspberry Pi desktop preferences menu (you’ll need an HDMI cable and a keyboard/mouse) or running

sudo service ssh start
  from the command line of your Pi.

After you’ve changed the setting and rebooted, you can test SSH directly on the Pi with the localhost address. Open a terminal and type

ssh pi@127.0.0.1
  to see if it is working.

Keyboard layout giving you problems? Change your keyboard layout by going to the Raspberry Pi desktop preferences menu. I use the standard US Keyboard layout, but you’ll want to select the one appropriate for your keyboard (any Dvorak users out there?).

Installing OpenCV 3 on a Raspberry Pi 3 running Raspbian Stretch

If you’ve ever installed OpenCV on a Raspberry Pi (or any other platform before), you know that the process can be quite time consuming with many dependencies and pre-requisites that have to be installed. The goal of this tutorial is to thus guide you step-by-step through the compile and installation process.

In order to make the installation process go more smoothly, I’ve included timings for each step so you know when to take a break, grab a cup of coffee, and check up on email while the Pi compiles OpenCV.

Let’s go ahead and get started installing OpenCV 3 on your Raspberry Pi 3 running Raspbian Stretch.

Step #1: Expand filesystem

Are you using a brand new install of Raspbian Stretch?

If so, the first thing you should do is expand your filesystem to include all available space on your micro-SD card:

$ sudo raspi-config

And then select the “Advanced Options” menu item:

Figure 1: Select the “Advanced Options” item from the “raspi-config” menu.

Followed by selecting “Expand filesystem”:

Figure 2: Expanding the filesystem on your Raspberry Pi 3.

Once prompted, you should select the first option, “A1. Expand File System”, hit Enter on your keyboard, arrow down to the “<Finish>” button, and then reboot your Pi — you may be prompted to reboot, but if you aren’t you can execute:

$ sudo reboot

After rebooting, your file system should have been expanded to include all available space on your micro-SD card. You can verify that the disk has been expanded by executing

df -h
and examining the output:
$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/root        30G  4.2G   24G  15% /
devtmpfs        434M     0  434M   0% /dev
tmpfs           438M     0  438M   0% /dev/shm
tmpfs           438M   12M  427M   3% /run
tmpfs           5.0M  4.0K  5.0M   1% /run/lock
tmpfs           438M     0  438M   0% /sys/fs/cgroup
/dev/mmcblk0p1   42M   21M   21M  51% /boot
tmpfs            88M     0   88M   0% /run/user/1000

As you can see, my Raspbian filesystem has been expanded to include all 32GB of the micro-SD card.

However, even with my filesystem expanded, I have already used 15% of my 32GB card.

If you are using an 8GB card you may be using close to 50% of the available space, so one simple thing to do is to delete both LibreOffice and Wolfram engine to free up some space on your Pi:

$ sudo apt-get purge wolfram-engine
$ sudo apt-get purge libreoffice*
$ sudo apt-get clean
$ sudo apt-get autoremove

After removing the Wolfram Engine and LibreOffice, you can reclaim almost 1GB!

Step #2: Install dependencies

This isn’t the first time I’ve discussed how to install OpenCV on the Raspberry Pi, so I’ll keep these instructions on the briefer side, allowing you to work through the installation process quickly. I’ve also included the amount of time it takes to execute each command (some depend on your Internet speed) so you can plan your OpenCV + Raspberry Pi 3 install accordingly (OpenCV itself takes approximately 4 hours to compile — more on this later).

The first step is to update and upgrade any existing packages:

$ sudo apt-get update && sudo apt-get upgrade

Timing: 2m 14s

We then need to install some developer tools, including CMake, which helps us configure the OpenCV build process:

$ sudo apt-get install build-essential cmake pkg-config

Timing: 19s

Next, we need to install some image I/O packages that allow us to load various image file formats from disk. Examples of such file formats include JPEG, PNG, TIFF, etc.:

$ sudo apt-get install libjpeg-dev libtiff5-dev libjasper-dev libpng12-dev

Timing: 21s

Just as we need image I/O packages, we also need video I/O packages. These libraries allow us to read various video file formats from disk as well as work directly with video streams:

$ sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev
$ sudo apt-get install libxvidcore-dev libx264-dev

Timing: 32s

The OpenCV library comes with a sub-module named

highgui
which is used to display images to our screen and build basic GUIs. In order to compile the
highgui
module, we need to install the GTK development library:
$ sudo apt-get install libgtk2.0-dev libgtk-3-dev

Timing: 1m 36s

Many operations inside of OpenCV (namely matrix operations) can be optimized further by installing a few extra dependencies:

$ sudo apt-get install libatlas-base-dev gfortran

Timing: 23s

These optimization libraries are especially important for resource constrained devices such as the Raspberry Pi.

Lastly, let’s install both the Python 2.7 and Python 3 header files so we can compile OpenCV with Python bindings:

$ sudo apt-get install python2.7-dev python3-dev

Timing: 45s

If you’re working with a fresh install of the OS, it is possible that these versions of Python are already at the newest version (you’ll see a terminal message stating this).

If you skip this step, you may notice an error related to the

Python.h
header file not being found when running
make
to compile OpenCV.

Step #3: Download the OpenCV source code

Now that we have our dependencies installed, let’s grab the

3.3.0
archive of OpenCV from the official OpenCV repository. This version includes the
dnn
  module which we discussed in a previous post where we did Deep Learning with OpenCV (Note: As future versions of openCV are released, you can replace
3.3.0
with the latest version number):
$ cd ~
$ wget -O opencv.zip https://github.com/Itseez/opencv/archive/3.3.0.zip
$ unzip opencv.zip

Timing: 41s

We’ll want the full install of OpenCV 3 (to have access to features such as SIFT and SURF, for instance), so we also need to grab the opencv_contrib repository as well:

$ wget -O opencv_contrib.zip https://github.com/Itseez/opencv_contrib/archive/3.3.0.zip
$ unzip opencv_contrib.zip

Timing: 37s

The

.zip
in the
3.3.0.zip
may appear to be cut off in some browsers during your copy and paste. The full URL of the opencv_contrib 3.3.0 archive is:

https://github.com/Itseez/opencv_contrib/archive/3.3.0.zip

Note: Make sure your

opencv
and
opencv_contrib
versions are the same (in this case,
3.3.0
). If the version numbers do not match up, then you’ll likely run into either compile-time or runtime errors.

Step #4: Python 2.7 or Python 3?

Before we can start compiling OpenCV on our Raspberry Pi 3, we first need to install

pip
, a Python package manager:
$ wget https://bootstrap.pypa.io/get-pip.py
$ sudo python get-pip.py
$ sudo python3 get-pip.py

Timing: 33s

You may get a message that pip is already up to date when issuing these commands, but it is best not to skip this step.

If you’re a longtime PyImageSearch reader, then you’ll know that I’m a huge fan of both virtualenv and virtualenvwrapper. Installing these packages is not a requirement and you can absolutely get OpenCV installed without them, but that said, I highly recommend you install them as other existing PyImageSearch tutorials (as well as future tutorials) also leverage Python virtual environments. I’ll also be assuming that you have both

virtualenv
and
virtualenvwrapper
installed throughout the remainder of this guide.

So, given that, what’s the point of using

virtualenv
and
virtualenvwrapper
?

First, it’s important to understand that a virtual environment is a special tool used to keep the dependencies required by different projects in separate places by creating isolated, independent Python environments for each of them.

In short, it solves the “Project X depends on version 1.x, but Project Y needs 4.x” dilemma. It also keeps your global

site-packages
neat, tidy, and free from clutter.

If you would like a full explanation on why Python virtual environments are good practice, absolutely give this excellent blog post on RealPython a read.

It’s standard practice in the Python community to be using virtual environments of some sort, so I highly recommend that you do the same:

$ sudo pip install virtualenv virtualenvwrapper
$ sudo rm -rf ~/.cache/pip

Timing: 35s

Now that both

virtualenv
and
virtualenvwrapper
have been installed, we need to update our
~/.profile
file to include the following lines at the bottom of the file:
# virtualenv and virtualenvwrapper
export WORKON_HOME=$HOME/.virtualenvs
source /usr/local/bin/virtualenvwrapper.sh

In previous tutorials, I’ve recommended using your favorite terminal-based text editor such as

vim
,
emacs
, or
nano
to update the
~/.profile
file. If you’re comfortable with these editors, go ahead and update the file to reflect the changes mentioned above.

Otherwise, you should simply use

cat
and output redirection to handle updating
~/.profile
:
$ echo -e "\n# virtualenv and virtualenvwrapper" >> ~/.profile
$ echo "export WORKON_HOME=$HOME/.virtualenvs" >> ~/.profile
$ echo "source /usr/local/bin/virtualenvwrapper.sh" >> ~/.profile

Now that we have our

~/.profile
updated, we need to reload it to make sure the changes take effect. You can force a reload of your
~/.profile
file by:
  1. Logging out and then logging back in.
  2. Closing a terminal instance and opening up a new one.
  3. Or my personal favorite, just use the
    source
    command:

$ source ~/.profile

Note: I recommend running the

source ~/.profile
command each time you open up a new terminal to ensure your system variables have been set up correctly.

Creating your Python virtual environment

Next, let’s create the Python virtual environment that we’ll use for computer vision development:

$ mkvirtualenv cv -p python2

This command will create a new Python virtual environment named

cv
using Python 2.7.

If you instead want to use Python 3, you’ll want to use this command instead:

$ mkvirtualenv cv -p python3

Timing: 24s

Again, I can’t stress this point enough: the

cv
Python virtual environment is entirely independent and sequestered from the default Python version included in the download of Raspbian Stretch. Any Python packages in the global
site-packages
directory will not be available to the
cv
virtual environment. Similarly, any Python packages installed in
site-packages
of
cv
will not be available to the global install of Python. Keep this in mind when you’re working in your Python virtual environment and it will help avoid a lot of confusion and headaches.
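If you ever want to double-check which interpreter you are actually using, a quick sketch from the Python shell makes the isolation visible:

# inside the cv environment this should point under ~/.virtualenvs/cv/
import sys
print(sys.executable)
print(sys.prefix)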

How to check if you’re in the “cv” virtual environment

If you ever reboot your Raspberry Pi; log out and log back in; or open up a new terminal, you’ll need to use the

workon
command to re-access the
cv
virtual environment. In previous blog posts, I’ve seen readers use the
mkvirtualenv
command — this is entirely unneeded! The
mkvirtualenv
command is meant to be executed only once: to actually create the virtual environment.

After that, you can use

workon
and you’ll be dropped down into your virtual environment:
$ source ~/.profile
$ workon cv

To validate and ensure you are in the

cv
virtual environment, examine your command line — if you see the text
(cv)
preceding your prompt, then you are in the
cv
virtual environment:

Figure 3: Make sure you see the “(cv)” text on your prompt, indicating that you are in the cv virtual environment.

Otherwise, if you do not see the

(cv)
text, then you are not in the
cv
virtual environment:

Figure 4: If you do not see the “(cv)” text on your prompt, then you are not in the cv virtual environment and need to run “source” and “workon” to resolve this issue.

To fix this, simply execute the

source
and
workon
commands mentioned above.

Installing NumPy on your Raspberry Pi

Assuming you’ve made it this far, you should now be in the

cv
virtual environment (which you should stay in for the rest of this tutorial). Our only Python dependency is NumPy, a Python package used for numerical processing:
$ pip install numpy

Timing: 11m 12s

Be sure to grab a cup of coffee or go for a nice walk, the NumPy installation can take a bit of time.

Note: A question I’ve often seen is “Help, my NumPy installation has hung and it’s not installing!” Actually, it is installing, it just takes time to pull down the sources and compile. You can verify that NumPy is compiling and installing by running

top
 . Here you’ll see that your CPU cycles are being used compiling NumPy. Be patient. The Raspberry Pi isn’t as fast as your laptop/desktop.

Step #5: Compile and Install OpenCV

We are now ready to compile and install OpenCV! Double-check that you are in the

cv
virtual environment by examining your prompt (you should see the
(cv)
text preceding it), and if not, simply execute
workon
:
$ workon cv

Once you have ensured you are in the

cv
virtual environment, we can setup our build using CMake:
$ cd ~/opencv-3.3.0/
$ mkdir build
$ cd build
$ cmake -D CMAKE_BUILD_TYPE=RELEASE \
    -D CMAKE_INSTALL_PREFIX=/usr/local \
    -D INSTALL_PYTHON_EXAMPLES=ON \
    -D OPENCV_EXTRA_MODULES_PATH=~/opencv_contrib-3.3.0/modules \
    -D BUILD_EXAMPLES=ON ..

Timing: 2m 56s

Now, before we move on to the actual compilation step, make sure you examine the output of CMake!

Start by scrolling down to the sections titled

Python 2
and
Python 3
.

If you are compiling OpenCV 3 for Python 2.7, then make sure your

Python 2
section includes valid paths to the
Interpreter
,
Libraries
,
numpy
and
packages path
, similar to my screenshot below:

Figure 5: Checking that Python 2.7 will be used when compiling OpenCV 3 for Raspbian Stretch on the Raspberry Pi 3.

Notice how the

Interpreter
points to our
python2.7
binary located in the
cv
virtual environment. The
numpy
variable also points to the NumPy installation in the
cv
environment.

Similarly, if you’re compiling OpenCV for Python 3, make sure the

Python 3
section looks like the figure below:

Figure 6: Checking that Python 3 will be used when compiling OpenCV 3 for Raspbian Stretch on the Raspberry Pi 3.

Again, the

Interpreter
points to our
python3.5
binary located in the
cv
virtual environment while
numpy
points to our NumPy install.

In either case, if you do not see the

cv
virtual environment in these variable paths, it’s almost certainly because you were NOT in the
cv
virtual environment prior to running CMake!

If this is the case, access the

cv
virtual environment using
workon cv
and re-run the
cmake
command outlined above.

Finally, we are now ready to compile OpenCV:

$ make

Timing: 4h 0m

I was not able to successfully compile with 2 or 4 cores (the Pi 3 has 4 cores), which would significantly cut down on compile time (previous posts here on PyImageSearch demonstrate that OpenCV can be compiled in a little over 60 minutes).

I recommend you not compile with

make -j2
  or
make -j4
  as your compile will likely freeze up at 90%. I have not been able to diagnose whether this is an issue with (1) the Pi overheating, (2) a race condition with the compile, (3) the Raspbian Stretch OS, or some combination of all three. Based on all my experimentations thus far I think it’s an issue introduced with the Raspbian Stretch OS.

If you do have a compile error using

-j2
  or
-j4
, I suggest starting the compilation over again and using only one core:
$ make clean
$ make

Once OpenCV 3 has finished compiling, your output should look similar to mine below:

Figure 7: Our OpenCV 3 compile on Raspbian Stretch has completed successfully.

From there, all you need to do is install OpenCV 3 on your Raspberry Pi 3:

$ sudo make install
$ sudo ldconfig

Timing: 52s

Step #6: Finish installing OpenCV on your Pi

We’re almost done — just a few more steps to go and you’ll be ready to use your Raspberry Pi 3 with OpenCV 3 on Raspbian Stretch.

For Python 2.7:

Provided your Step #5 finished without error, OpenCV should now be installed in

/usr/local/lib/python2.7/site-packages
. You can verify this using the
ls
command:
$ ls -l /usr/local/lib/python2.7/site-packages/
total 1852
-rw-r--r-- 1 root staff 1895772 Mar 20 20:00 cv2.so

Note: In some cases, OpenCV can be installed in

/usr/local/lib/python2.7/dist-packages
(note the
dist-packages
rather than
site-packages
 ). If you do not find the
cv2.so
bindings in
site-packages
, be sure to check
dist-packages
 .

Our final step is to sym-link the OpenCV bindings into our

cv
virtual environment for Python 2.7:
$ cd ~/.virtualenvs/cv/lib/python2.7/site-packages/
$ ln -s /usr/local/lib/python2.7/site-packages/cv2.so cv2.so

For Python 3:

After running

make install
, your OpenCV + Python bindings should be installed in
/usr/local/lib/python3.5/site-packages
. Again, you can verify this with the
ls
command:
$ ls -l /usr/local/lib/python3.5/site-packages/
total 1852
-rw-r--r-- 1 root staff 1895932 Mar 20 21:51 cv2.cpython-35m-arm-linux-gnueabihf.so

I honestly don’t know why; perhaps it’s a bug in the CMake script, but when compiling OpenCV 3 bindings for Python 3+, the output

.so
file is named
cv2.cpython-35m-arm-linux-gnueabihf.so
(or some variant of) rather than simply
cv2.so
(like in the Python 2.7 bindings).

Again, I’m not sure exactly why this happens, but it’s an easy fix. All we need to do is rename the file:

$ cd /usr/local/lib/python3.5/site-packages/
$ sudo mv cv2.cpython-35m-arm-linux-gnueabihf.so cv2.so

After renaming to

cv2.so
, we can sym-link our OpenCV bindings into the
cv
virtual environment for Python 3.5:
$ cd ~/.virtualenvs/cv/lib/python3.5/site-packages/
$ ln -s /usr/local/lib/python3.5/site-packages/cv2.so cv2.so

Step #7: Testing your OpenCV 3 install

Congratulations, you now have OpenCV 3 installed on your Raspberry Pi 3 running Raspbian Stretch!

But before we pop the champagne and get drunk on our victory, let’s first verify that your OpenCV installation is working properly.

Open up a new terminal, execute the

source
and
workon
commands, and then finally attempt to import the Python + OpenCV bindings:
$ source ~/.profile 
$ workon cv
$ python
>>> import cv2
>>> cv2.__version__
'3.3.0'
>>>

As you can see from the screenshot of my own terminal, OpenCV 3 has been successfully installed on my Raspberry Pi 3 + Python 3.5 environment:

Figure 8: Confirming OpenCV 3 has been successfully installed on my Raspberry Pi 3 running Raspbian Stretch.

Once OpenCV has been installed, you can remove both the

opencv-3.3.0
and
opencv_contrib-3.3.0
directories to free up a bunch of space on your disk:
$ rm -rf opencv-3.3.0 opencv_contrib-3.3.0

However, be cautious with this command! Make sure OpenCV has been properly installed on your system before blowing away these directories. A mistake here could cost you hours in compile time.

Troubleshooting and FAQ

Q. When I try to execute

mkvirtualenv
and
workon
, I get a “command not found error”.

A. There are three reasons why this could be happening, all of them related to Step #4:

  1. Make certain that you have installed
    virtualenv
    and
    virtualenvwrapper
    via
    pip
    . You can check this by running
    pip freeze
    and then examining the output, ensuring you see occurrences of both
    virtualenv
    and
    virtualenvwrapper
    .
  2. You might not have updated your
    ~/.profile
    correctly. Use a text editor such as
    nano
    to view your
    ~/.profile
    file and ensure that the proper
    export
    and
    source
    commands are present (again, check Step #4 for the contents that should be appended to
    ~/.profile
    ).
  3. You did not
    source
    your
    ~/.profile
    after editing it, rebooting, opening a new terminal, etc. Any time you open a new terminal and want to use a virtual environment, make sure you execute
    source ~/.profile
    to load the contents — this will give you access to the
    mkvirtualenv
    and
    workon
    commands.

Q. After I open a new terminal, logout, or reboot my Pi, I cannot execute

mkvirtualenv
or
workon
.

A. See reason #3 from the previous question.

Q. When I (1) open up a Python shell that imports OpenCV or (2) execute a Python script that calls OpenCV, I get an error:

ImportError: No module named cv2
.

A. Unfortunately, this error is extremely hard to diagnose, mainly because there are multiple issues that could be causing the problem. To start, make sure you are in the

cv
virtual environment by using
workon cv
. If the
workon
command fails, then see the first question in this FAQ. If you’re still getting an error, investigate the contents of the
site-packages
directory for your
cv
virtual environment. You can find the
site-packages
directory in
~/.virtualenvs/cv/lib/python2.7/site-packages/
or
~/.virtualenvs/cv/lib/python3.5/site-packages/
(depending on which Python version you used for the install). Make sure that your sym-link to the
cv2.so
file is valid and points to an existing file.
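You can also test the sym-link from Python itself. Here is a small sketch (adjust the python3.5 path to python2.7 if that is the version you installed for):

# check that the cv2.so sym-link exists and resolves to a real file
import os

link = os.path.expanduser(
	"~/.virtualenvs/cv/lib/python3.5/site-packages/cv2.so")
print(os.path.islink(link))                    # should be True
print(os.path.exists(os.path.realpath(link)))  # should be True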

Q. I’m running into other errors.

A. Feel free to leave a comment and I’ll try to provide guidance; however, please understand that without physical access to your Pi it can often be hard to diagnose compile/install errors. If you’re in a rush to get OpenCV up and running on your Raspberry Pi be sure to take a look at the Quickstart Bundle and Hardcopy Bundle of my book, Practical Python and OpenCV. Both of these bundles include a Raspbian .img file with OpenCV pre-configured and pre-installed. Simply download the .img file, flash it to your Raspberry Pi, and boot! This method is by far the easiest, hassle free method to getting started with OpenCV on your Raspberry Pi.

So, what’s next?

Congrats! You have a brand new, fresh install of OpenCV on your Raspberry Pi — and I’m sure you’re just itching to leverage your Raspberry Pi to build some awesome computer vision apps.

But I’m also willing to bet that you’re just getting started learning computer vision and OpenCV, and you’re probably feeling a bit confused and overwhelmed on where exactly to start.

Personally, I’m a big fan of learning by example, so a good first step would be to read this blog post on accessing your Raspberry Pi Camera with the picamera module. This tutorial details the exact steps you need to take to (1) capture photos from the camera module and (2) access the raw video stream.

And if you’re really interested in leveling-up your computer vision skills, you should definitely check out my book, Practical Python and OpenCV + Case Studies. My book not only covers the basics of computer vision and image processing, but also teaches you how to solve real world computer vision problems including face detection in images and video streams, object tracking in video, and handwriting recognition.

raspberry_pi_in_post

All code examples covered in the book are guaranteed to run on the Raspberry Pi 2 and Pi 3 as well! Most programs will also run on the B+ and Zero models, but might be a bit slow due to the limited computing power of the B+ and Zero.

So let’s put your fresh install of OpenCV on your Raspberry Pi to good use — just click here to learn more about the real-world projects you can solve using your Raspberry Pi + Practical Python and OpenCV.

Summary

In this blog post, we learned how to upgrade your Raspberry Pi 3‘s OS to Raspbian Stretch and to install OpenCV 3 with either Python 2.7 or Python 3 bindings.

If you are running a different version of Raspbian (such as Raspbian Wheezy) or want to install a different version of OpenCV (such as OpenCV 2.4), please consult the following tutorials:

Are you looking for a project to work on with your new install of OpenCV on Raspbian Stretch? Readers have been big fans of this post on Home surveillance and motion detection with the Raspberry Pi, Python, OpenCV, and Dropbox.

But before you go…

I tend to utilize the Raspberry Pi quite a bit on this blog, so if you’re interested in learning more about the Raspberry Pi + computer vision, enter your email address in the form below to be notified when these posts go live!

The post Raspbian Stretch: Install OpenCV 3 + Python on your Raspberry Pi appeared first on PyImageSearch.

Object detection with deep learning and OpenCV


A couple weeks ago we learned how to classify images using deep learning and OpenCV 3.3’s deep neural network (

dnn
 ) module.

While this original blog post demonstrated how we can categorize an image into one of ImageNet’s 1,000 separate class labels, it could not tell us where an object resides in an image.

In order to obtain the bounding box (x, y)-coordinates for an object in an image, we need to instead apply object detection.

Object detection can not only tell us what is in an image but also where the object is as well.

In the remainder of today’s blog post we’ll discuss how to apply object detection using deep learning and OpenCV.

Looking for the source code to this post?
Jump right to the downloads section.

Object detection with deep learning and OpenCV

In the first part of today’s post on object detection using deep learning we’ll discuss Single Shot Detectors and MobileNets.

When combined together these methods can be used for super fast, real-time object detection on resource constrained devices (including the Raspberry Pi, smartphones, etc.)

From there we’ll discover how to use OpenCV’s

dnn
  module to load a pre-trained object detection network.

This will enable us to pass input images through the network and obtain the output bounding box (x, y)-coordinates of each object in the image.

Finally we’ll look at the results of applying the MobileNet Single Shot Detector to example input images.

In a future blog post we’ll extend our script to work with real-time video streams as well.

Single Shot Detectors for object detection

Figure 1: Examples of object detection using Single Shot Detectors (SSD) from Liu et al.

When it comes to deep learning-based object detection there are three primary object detection methods that you’ll likely encounter:

Faster R-CNNs are likely the most “heard of” method for object detection using deep learning; however, the technique can be difficult to understand (especially for beginners in deep learning), hard to implement, and challenging to train.

Furthermore, even with the “faster” implementation R-CNNs (where the “R” stands for “Region Proposal”) the algorithm can be quite slow, on the order of 7 FPS.

If we are looking for pure speed then we tend to use YOLO as this algorithm is much faster, capable of processing 40-90 FPS on a Titan X GPU. The super fast variant of YOLO can even get up to 155 FPS.

The problem with YOLO is that it leaves much accuracy to be desired.

SSDs, originally developed by Google, are a balance between the two. The algorithm is more straightforward (and I would argue better explained in the original seminal paper) than Faster R-CNNs.

We can also enjoy a much faster FPS throughput than Girshick et al. at 22-46 FPS depending on which variant of the network we use. SSDs also tend to be more accurate than YOLO. To learn more about SSDs, please refer to Liu et al.

MobileNets: Efficient (deep) neural networks

Figure 2: (Left) Standard convolutional layer with batch normalization and ReLU. (Right) Depthwise separable convolution with depthwise and pointwise layers followed by batch normalization and ReLU (figure and caption from Liu et al.).

When building object detection networks we normally use an existing network architecture, such as VGG or ResNet, and then use it inside the object detection pipeline. The problem is that these network architectures can be very large in the order of 200-500MB.

Network architectures such as these are unsuitable for resource constrained devices due to their sheer size and resulting number of computations.

Instead, we can use MobileNets (Howard et al., 2017), another paper by Google researchers. We call these networks “MobileNets” because they are designed for resource constrained devices such as your smartphone. MobileNets differ from traditional CNNs through the usage of depthwise separable convolution (Figure 2 above).

The general idea behind depthwise separable convolution is to split convolution into two stages:

  1. A 3×3 depthwise convolution.
  2. Followed by a 1×1 pointwise convolution.

This allows us to actually reduce the number of parameters in our network.
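As a rough back-of-the-envelope illustration (the channel counts below are made up for the example, not taken from the MobileNet paper), compare the parameter counts of a standard 3×3 convolution and its depthwise separable counterpart:

# illustrative parameter-count comparison for a single convolutional layer
k, m, n = 3, 64, 128                       # kernel size, in/out channels
standard = k * k * m * n                   # standard 3x3 convolution
separable = k * k * m + m * n              # 3x3 depthwise + 1x1 pointwise
print(standard, separable)                 # 73728 8768 (roughly 8x fewer)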

The problem is that we sacrifice accuracy — MobileNets are normally not as accurate as their larger counterparts…

…but they are much more resource efficient.

For more details on MobileNets please see Howard et al.

Combining MobileNets and Single Shot Detectors for fast, efficient deep-learning based object detection

If we combine both the MobileNet architecture and the Single Shot Detector (SSD) framework, we arrive at a fast, efficient deep learning-based method to object detection.

The model we’ll be using in this blog post is a Caffe version of the original TensorFlow implementation by Howard et al. and was trained by chuanqi305 (see GitHub).

The MobileNet SSD was first trained on the COCO dataset (Common Objects in Context) and was then fine-tuned on PASCAL VOC reaching 72.7% mAP (mean average precision).

We can therefore detect 20 objects in images (+1 for the background class), including airplanes, bicycles, birds, boats, bottles, buses, cars, cats, chairs, cows, dining tables, dogs, horses, motorbikes, people, potted plants, sheep, sofas, trains, and tv monitors.

Deep learning-based object detection with OpenCV

In this section we will use the MobileNet SSD + deep neural network (

dnn
 ) module in OpenCV to build our object detector.

I would suggest using the “Downloads” code at the bottom of this blog post to download the source code + trained network + example images so you can test them on your machine.

Let’s go ahead and get started building our deep learning object detector using OpenCV.

Open up a new file, name it

deep_learning_object_detection.py
 , and insert the following code:
# import the necessary packages
import numpy as np
import argparse
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to input image")
ap.add_argument("-p", "--prototxt", required=True,
	help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
	help="path to Caffe pre-trained model")
ap.add_argument("-c", "--confidence", type=float, default=0.2,
	help="minimum probability to filter weak detections")
args = vars(ap.parse_args())

On Lines 2-4 we import packages required for this script — the

dnn
  module is included in
cv2
, again, making the assumption that you’re using OpenCV 3.3.

Then, we parse our command line arguments (Lines 7-16):

  • --image
     : The path to the input image.
  • --prototxt
     : The path to the Caffe prototxt file.
  • --model
     : The path to the pre-trained model.
  • --confidence
     : The minimum probability threshold to filter weak detections. The default is 20%.

Again, example files for the first three arguments are included in the “Downloads” section of this blog post. I urge you to start there while also supplying some query images of your own.

Next, let’s initialize class labels and bounding box colors:

# initialize the list of class labels MobileNet SSD was trained to
# detect, then generate a set of bounding box colors for each class
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
	"bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
	"dog", "horse", "motorbike", "person", "pottedplant", "sheep",
	"sofa", "train", "tvmonitor"]
COLORS = np.random.uniform(0, 255, size=(len(CLASSES), 3))

Lines 20-23 build a list called

CLASSES
  containing our labels. This is followed by a list,
COLORS
  which contains corresponding random colors for bounding boxes (Line 24).

Now we need to load our model:

# load our serialized model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])

The above lines are self-explanatory, we simply print a message and load our

model
  (Lines 27 and 28).

Next, we will load our query image and prepare our

blob
 , which we will feed-forward through the network:
# load the input image and construct an input blob for the image
# by resizing to a fixed 300x300 pixels and then normalizing it
# (note: normalization is done via the authors of the MobileNet SSD
# implementation)
image = cv2.imread(args["image"])
(h, w) = image.shape[:2]
blob = cv2.dnn.blobFromImage(image, 0.007843, (300, 300), 127.5)

Taking note of the comment in this block, we load our

image
  (Line 34), extract the height and width (Line 35), and calculate a 300 by 300 pixel
blob
  from our image (Line 36).
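It is worth noting that the scale factor 0.007843 is simply 1/127.5, so after the 127.5 mean subtraction the pixel values land roughly in the range [-1, 1]. A quick check:

# 1 / 127.5 == 0.00784313..., mapping [0, 255] pixels to roughly [-1, 1]
scale = 1 / 127.5
print((0 - 127.5) * scale, (255 - 127.5) * scale)  # -1.0 1.0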

Now we’re ready to do the heavy lifting — we’ll pass this blob through the neural network:

# pass the blob through the network and obtain the detections and
# predictions
print("[INFO] computing object detections...")
net.setInput(blob)
detections = net.forward()

On Lines 41 and 42 we set the input to the network and compute the forward pass for the input, storing the result as

detections
 . Computing the forward pass and associated detections could take a while depending on your model and input size, but for this example it will be relatively quick on most CPUs.

Let’s loop through our

detections
  and determine what and where the objects are in the image:
# loop over the detections
for i in np.arange(0, detections.shape[2]):
	# extract the confidence (i.e., probability) associated with the
	# prediction
	confidence = detections[0, 0, i, 2]

	# filter out weak detections by ensuring the `confidence` is
	# greater than the minimum confidence
	if confidence > args["confidence"]:
		# extract the index of the class label from the `detections`,
		# then compute the (x, y)-coordinates of the bounding box for
		# the object
		idx = int(detections[0, 0, i, 1])
		box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
		(startX, startY, endX, endY) = box.astype("int")

		# display the prediction
		label = "{}: {:.2f}%".format(CLASSES[idx], confidence * 100)
		print("[INFO] {}".format(label))
		cv2.rectangle(image, (startX, startY), (endX, endY),
			COLORS[idx], 2)
		y = startY - 15 if startY - 15 > 15 else startY + 15
		cv2.putText(image, label, (startX, y),
			cv2.FONT_HERSHEY_SIMPLEX, 0.5, COLORS[idx], 2)

We start by looping over our detections, keeping in mind that multiple objects can be detected in a single image. We also apply a check to the confidence (i.e., probability) associated with each detection. If the confidence is high enough (i.e., above the threshold), then we’ll display the prediction in the terminal as well as draw the prediction on the image with text and a colored bounding box. Let’s break it down line-by-line:

Looping through our

detections
 , first we extract the
confidence
  value (Line 48).

If the

confidence
  is above our minimum threshold (Line 52), we extract the class label index (Line 56) and compute the bounding box around the detected object (Line 57).

Then, we extract the (x, y)-coordinates of the box (Line 58) which we will use shortly for drawing a rectangle and displaying text.
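For reference, based on how the detections array is indexed above, each row appears to follow the standard SSD output layout of [batch_id, class_id, confidence, startX, startY, endX, endY], with the box coordinates normalized to [0, 1] (hence the multiplication by w and h). Assuming the script above has run, a quick print confirms the shape:

# detections is a 4-D array of shape (1, 1, N, 7); inspect the first row
print(detections.shape)
print(detections[0, 0, 0])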

Next, we build a text

label
  containing the
CLASS
  name and the
confidence
  (Line 61).

Using the label, we print it to the terminal (Line 62), followed by drawing a colored rectangle around the object using our previously extracted (x, y)-coordinates (Lines 63 and 64).

In general, we want the label to be displayed above the rectangle, but if there isn’t room, we’ll display it just below the top of the rectangle (Line 65).

Finally, we overlay the colored text onto the

image
  using the y-value that we just calculated (Lines 66 and 67).

The only remaining step is to display the result:

# show the output image
cv2.imshow("Output", image)
cv2.waitKey(0)

We display the resulting output image to the screen until a key is pressed (Lines 70 and 71).

OpenCV and deep learning object detection results

To download the code + pre-trained network + example images, be sure to use the “Downloads” section at the bottom of this blog post.

From there, unzip the archive and execute the following command:

$ python deep_learning_object_detection.py \
	--prototxt MobileNetSSD_deploy.prototxt.txt \
	--model MobileNetSSD_deploy.caffemodel --image images/example_01.jpg 
[INFO] loading model...
[INFO] computing object detections...
[INFO] car: 99.78%
[INFO] car: 99.25%

Figure 3: Two Toyotas on the highway recognized with near-100% confidence using OpenCV, deep learning, and object detection.

Our first result shows cars recognized and detected with near-100% confidence.

In this example we detect an airplane using deep learning-based object detection:

$ python deep_learning_object_detection.py \
	--prototxt MobileNetSSD_deploy.prototxt.txt \
	--model MobileNetSSD_deploy.caffemodel --image images/example_02.jpg 
[INFO] loading model...
[INFO] computing object detections...
[INFO] aeroplane: 98.42%

Figure 4: An airplane successfully detected with high confidence via Python, OpenCV, and deep learning.

The ability of deep learning to detect and localize obscured objects is demonstrated in the following image, where we see a horse (and its rider) jumping a fence flanked by two potted plants:

$ python deep_learning_object_detection.py \
	--prototxt MobileNetSSD_deploy.prototxt.txt \
	--model MobileNetSSD_deploy.caffemodel --image images/example_03.jpg
[INFO] loading model...
[INFO] computing object detections...
[INFO] horse: 96.67%
[INFO] person: 92.58%
[INFO] pottedplant: 96.87%
[INFO] pottedplant: 34.42%

Figure 5: A person riding a horse and two potted plants are successfully identified despite a lot of objects in the image via deep learning-based object detection.

In this example we can see a beer bottle is detected with an impressive 100% confidence:

$ python deep_learning_object_detection.py --prototxt MobileNetSSD_deploy.prototxt.txt \
	--model MobileNetSSD_deploy.caffemodel --image images/example_04.jpg 
[INFO] loading model...
[INFO] computing object detections...
[INFO] bottle: 100.00%

Figure 6: Deep learning + OpenCV are able to correctly detect a beer bottle in an input image.

Followed by another horse image which also contains a dog, car, and person:

$ python deep_learning_object_detection.py \
	--prototxt MobileNetSSD_deploy.prototxt.txt \
	--model MobileNetSSD_deploy.caffemodel --image images/example_05.jpg 
[INFO] loading model...
[INFO] computing object detections...
[INFO] car: 99.87%
[INFO] dog: 94.88%
[INFO] horse: 99.97%
[INFO] person: 99.88%

Figure 7: Several objects in this image including a car, dog, horse, and person are all recognized.

Finally, a picture of me and Jemma, the family beagle:

$ python deep_learning_object_detection.py \
	--prototxt MobileNetSSD_deploy.prototxt.txt \
	--model MobileNetSSD_deploy.caffemodel --image images/example_06.jpg 
[INFO] loading model...
[INFO] computing object detections...
[INFO] dog: 95.88%
[INFO] person: 99.95%

Figure 8: Me and the family beagle are correctly recognized as a “person” and a “dog” via deep learning, object detection, and OpenCV. The TV monitor is not recognized.

Unfortunately the TV monitor isn’t recognized in this image which is likely due to (1) me blocking it and (2) poor contrast around the TV. That being said, we have demonstrated excellent object detection results using OpenCV’s

dnn
  module.

Summary

In today’s blog post we learned how to perform object detection using deep learning and OpenCV.

Specifically, we used both MobileNets + Single Shot Detectors along with OpenCV 3.3’s brand new (totally overhauled)

dnn
  module to detect objects in images.

As a computer vision and deep learning community we owe a lot to the contributions of Aleksandr Rybnikov, the main contributor to the

dnn
  module for making deep learning so accessible from within the OpenCV library. You can find Aleksandr’s original OpenCV example script here — I have modified it for the purposes of this blog post.

In a future blog post I’ll be demonstrating how we can modify today’s tutorial to work with real-time video streams, thus enabling us to perform deep learning-based object detection on videos. We’ll be sure to leverage efficient frame I/O to increase the FPS throughput of our pipeline as well.

To be notified when future blog posts (such as the real-time object detection tutorial) are published here on PyImageSearch, simply enter your email address in the form below.

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 11-page Resource Guide on Computer Vision and Image Search Engines, including exclusive techniques that I don’t post on this blog! Sound good? If so, enter your email address and I’ll send you the code immediately!

The post Object detection with deep learning and OpenCV appeared first on PyImageSearch.

Real-time object detection with deep learning and OpenCV


Today’s blog post was inspired by PyImageSearch reader, Emmanuel. Emmanuel emailed me after last week’s tutorial on object detection with deep learning + OpenCV and asked:

“Hi Adrian,

I really enjoyed last week’s blog post on object detection with deep learning and OpenCV, thanks for putting it together and for making deep learning with OpenCV so accessible.

I want to apply the same technique to real-time video.

What is the best way to do this?

How can I achieve the most efficiency?

If you could do a tutorial on real-time object detection with deep learning and OpenCV I would really appreciate it.”

Great question, thanks for asking Emmanuel.

Luckily, extending our previous tutorial on object detection with deep learning and OpenCV to real-time video streams is fairly straightforward — we simply need to combine some efficient, boilerplate code for real-time video access and then add in our object detection.

By the end of this tutorial you’ll be able to apply deep learning-based object detection to real-time video streams using OpenCV and Python — to learn how, just keep reading.

Looking for the source code to this post?
Jump right to the downloads section.

Real-time object detection with deep learning and OpenCV

Today’s blog post is broken into two parts.

In the first part we’ll learn how to extend last week’s tutorial to apply real-time object detection using deep learning and OpenCV to work with video streams and video files. This will be accomplished using the highly efficient

VideoStream
  class discussed in this tutorial.

From there, we’ll apply our deep learning + object detection code to actual video streams and measure the FPS processing rate.

Object detection in video with deep learning and OpenCV

To build our deep learning-based real-time object detector with OpenCV we’ll need to (1) access our webcam/video stream in an efficient manner and (2) apply object detection to each frame.

To see how this is done, open up a new file, name it 

real_time_object_detection.py
  and insert the following code:
# import the necessary packages
from imutils.video import VideoStream
from imutils.video import FPS
import numpy as np
import argparse
import imutils
import time
import cv2

We begin by importing packages on Lines 2-8. For this tutorial, you will need imutils and OpenCV 3.3.

To get your system set up, simply install OpenCV using the relevant instructions for your system (while ensuring you’re following any Python virtualenv commands).

Note: Make sure to download and install the opencv and opencv-contrib releases for OpenCV 3.3. This will ensure that the deep neural network (

dnn
) module is installed. You must have OpenCV 3.3 (or newer) to run the code in this tutorial.
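If you’re unsure which version of OpenCV you have, a quick sanity check from the Python shell (a minimal sketch) looks like this:

# quick sanity check: confirm OpenCV 3.3+ (and therefore the dnn module)
import cv2
print(cv2.__version__)        # should report 3.3.0 or newer
print(hasattr(cv2, "dnn"))    # True if the dnn module is available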

Next, we’ll parse our command line arguments:

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-p", "--prototxt", required=True,
	help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
	help="path to Caffe pre-trained model")
ap.add_argument("-c", "--confidence", type=float, default=0.2,
	help="minimum probability to filter weak detections")
args = vars(ap.parse_args())

Compared to last week, we don’t need the image argument since we’re working with streams and videos — other than that the following arguments remain the same:

  • --prototxt
     : The path to the Caffe prototxt file.
  • --model
     : The path to the pre-trained model.
  • --confidence
     : The minimum probability threshold to filter weak detections. The default is 20%.

We then initialize a class list and a color set:

# initialize the list of class labels MobileNet SSD was trained to
# detect, then generate a set of bounding box colors for each class
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat",
	"bottle", "bus", "car", "cat", "chair", "cow", "diningtable",
	"dog", "horse", "motorbike", "person", "pottedplant", "sheep",
	"sofa", "train", "tvmonitor"]
COLORS = np.random.uniform(0, 255, size=(len(CLASSES), 3))

On Lines 22-26 we initialize

CLASS
  labels and corresponding random
COLORS
 . For more information on these classes (and how the network was trained), please refer to last week’s blog post.

Now, let’s load our model and set up our video stream:

# load our serialized model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])

# initialize the video stream, allow the camera sensor to warm up,
# and initialize the FPS counter
print("[INFO] starting video stream...")
vs = VideoStream(src=0).start()
time.sleep(2.0)
fps = FPS().start()

We load our serialized model, providing the references to our prototxt and model files on Line 30 — notice how easy this is in OpenCV 3.3.

Next let’s initialize our video stream (this can be from a video file or a camera). First we start the

VideoStream
  (Line 35), then we wait for the camera to warm up (Line 36), and finally we start the frames per second counter (Line 37). The
VideoStream
  and
FPS
  classes are part of my
imutils
  package.
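As an aside, if you’d rather process a video file than a live camera stream, imutils also includes a threaded file reader. Here’s a minimal sketch (the filename is a hypothetical placeholder):

# read frames from a video file instead of a webcam using imutils'
# threaded FileVideoStream class ("input.mp4" is a hypothetical filename)
from imutils.video import FileVideoStream

vs = FileVideoStream("input.mp4").start()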

Now, let’s loop over each and every frame (for speed purposes, you could skip frames):

# loop over the frames from the video stream
while True:
	# grab the frame from the threaded video stream and resize it
	# to have a maximum width of 400 pixels
	frame = vs.read()
	frame = imutils.resize(frame, width=400)

	# grab the frame dimensions and convert it to a blob
	(h, w) = frame.shape[:2]
	blob = cv2.dnn.blobFromImage(frame, 0.007843, (300, 300), 127.5)

	# pass the blob through the network and obtain the detections and
	# predictions
	net.setInput(blob)
	detections = net.forward()

First, we read a

frame
  (Line 43) from the stream, followed by resizing it (Line 44).

Since we will need the width and height later, we grab these now on Line 47. This is followed by converting the

frame
  to a
blob
  with the
dnn
  module (Line 48).
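Under the hood, this blob conversion amounts to resizing, mean subtraction, scaling, and reordering the channels. Here’s a rough manual equivalent, reusing the frame from the loop above (a sketch, assuming the default BGR channel ordering with no red/blue swap):

# a rough manual equivalent of cv2.dnn.blobFromImage(frame, 0.007843,
# (300, 300), 127.5): resize, subtract the mean, scale, reorder to NCHW
resized = cv2.resize(frame, (300, 300))
normalized = (resized.astype("float32") - 127.5) * 0.007843
blob = np.transpose(normalized, (2, 0, 1))[np.newaxis, ...]  # shape: (1, 3, 300, 300)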

Now for the heavy lifting: we set the

blob
  as the input to our neural network (Line 52) and feed the input through the
net
  (Line 53) which gives us our
detections
 .
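For this MobileNet SSD model, the detections array is 4-D with shape (1, 1, N, 7): each of the N rows holds [image_id, class_id, confidence, startX, startY, endX, endY], with the box coordinates normalized to the range [0, 1]. That layout is exactly what the parsing loop below relies on:

# inspect the raw SSD output; each row contains [image_id, class_id,
# confidence, startX, startY, endX, endY], boxes normalized to [0, 1]
print(detections.shape)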

At this point, we have detected objects in the input frame. It is now time to look at confidence values and determine if we should draw a box + label surrounding the object — you’ll recognize this code block from last week:

	# loop over the detections
	for i in np.arange(0, detections.shape[2]):
		# extract the confidence (i.e., probability) associated with
		# the prediction
		confidence = detections[0, 0, i, 2]

		# filter out weak detections by ensuring the `confidence` is
		# greater than the minimum confidence
		if confidence > args["confidence"]:
			# extract the index of the class label from the
			# `detections`, then compute the (x, y)-coordinates of
			# the bounding box for the object
			idx = int(detections[0, 0, i, 1])
			box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
			(startX, startY, endX, endY) = box.astype("int")

			# draw the prediction on the frame
			label = "{}: {:.2f}%".format(CLASSES[idx],
				confidence * 100)
			cv2.rectangle(frame, (startX, startY), (endX, endY),
				COLORS[idx], 2)
			y = startY - 15 if startY - 15 > 15 else startY + 15
			cv2.putText(frame, label, (startX, y),
				cv2.FONT_HERSHEY_SIMPLEX, 0.5, COLORS[idx], 2)

We start by looping over our

detections
 , keeping in mind that multiple objects can be detected in a single image. We also apply a check to the confidence (i.e., probability) associated with each detection. If the confidence is high enough (i.e. above the threshold), then we’ll display the prediction in the terminal as well as draw the prediction on the image with text and a colored bounding box. Let’s break it down line-by-line:

Looping through our

detections
 , first we extract the
confidence
  value (Line 59).

If the

confidence
  is above our minimum threshold (Line 63), we extract the class label index (Line 67) and compute the bounding box coordinates around the detected object (Line 68).

Then, we extract the (x, y)-coordinates of the box (Line 69) which we will use shortly for drawing a rectangle and displaying text.

We build a text

label
  containing the
CLASS
  name and the
confidence
  (Lines 72 and 73).

Let’s also draw a colored rectangle around the object using our class color and previously extracted (x, y)-coordinates (Lines 74 and 75).

In general, we want the label to be displayed above the rectangle, but if there isn’t room, we’ll display it just below the top of the rectangle (Line 76).

Finally, we overlay the colored text onto the

frame
  using the y-value that we just calculated (Lines 77 and 78).

The remaining steps in the frame capture loop involve (1) displaying the frame, (2) checking for a quit key, and (3) updating our frames per second counter:

	# show the output frame
	cv2.imshow("Frame", frame)
	key = cv2.waitKey(1) & 0xFF

	# if the `q` key was pressed, break from the loop
	if key == ord("q"):
		break

	# update the FPS counter
	fps.update()

The above code block is pretty self-explanatory — first we display the frame (Line 81). Then we capture a key press (Line 82) while checking if the ‘q’ key (for “quit”) is pressed, at which point we break out of the frame capture loop (Lines 85 and 86).

Finally we update our fps counter (Line 89).

If we break out of the loop (‘q’ key press or end of the video stream), we have some housekeeping to take care of:

# stop the timer and display FPS information
fps.stop()
print("[INFO] elapsed time: {:.2f}".format(fps.elapsed()))
print("[INFO] approx. FPS: {:.2f}".format(fps.fps()))

# do a bit of cleanup
cv2.destroyAllWindows()
vs.stop()

When we’ve exited the loop, we stop the

fps
  counter (Line 92) and print information about the frames per second to our terminal (Lines 93 and 94).

We close the open window (Line 97) followed by stopping the video stream (Line 98).

If you’ve made it this far, you’re probably ready to give it a try with your webcam — to see how it’s done, let’s move on to the next section.

Real-time deep learning object detection results

To see our real-time deep-learning based object detector in action, make sure you use the “Downloads” section of this guide to download the example code + pre-trained Convolutional Neural Network.

From there, open up a terminal and execute the following command:

$ python real_time_object_detection.py \
	--prototxt MobileNetSSD_deploy.prototxt.txt \
	--model MobileNetSSD_deploy.caffemodel
[INFO] loading model...
[INFO] starting video stream...
[INFO] elapsed time: 55.07
[INFO] approx. FPS: 6.54

Provided that OpenCV can access your webcam you should see the output video frame with any detected objects. I have included sample results of applying deep learning object detection to an example video below:

Figure 1: A short clip of real-time object detection with deep learning and OpenCV + Python.

Notice how our deep learning object detector can detect not only myself (a person), but also the sofa I am sitting on and the chair next to me — all in real-time!

The full video can be found below:

Summary

In today’s blog post we learned how to perform real-time object detection using deep learning + OpenCV + video streams.

We accomplished this by combining two separate tutorials:

  1. Object detection with deep learning and OpenCV
  2. Efficient, threaded video streams with OpenCV

The end result is a deep learning-based object detector that can process approximately 6-8 FPS (depending on the speed of your system, of course).

Further speed improvements can be obtained by:

  1. Applying skip frames (see the sketch after this list).
  2. Swapping different variations of MobileNet (that are faster, but less accurate).
  3. Potentially using the quantized variation of SqueezeNet (I haven’t tested this, but I imagine it would be faster due to its smaller network footprint).
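To make the skip-frames idea concrete, here is a minimal sketch. It reuses the vs, net, and drawing code from the script above, and the SKIP_FRAMES value is a hypothetical starting point you’d tune for your hardware:

# run the (expensive) forward pass only on every N-th frame and reuse
# the previous detections in between -- a simple skip-frames scheme
SKIP_FRAMES = 5
total = 0
detections = None

while True:
	frame = vs.read()
	frame = imutils.resize(frame, width=400)

	# only run detection every SKIP_FRAMES frames
	if total % SKIP_FRAMES == 0:
		(h, w) = frame.shape[:2]
		blob = cv2.dnn.blobFromImage(frame, 0.007843, (300, 300), 127.5)
		net.setInput(blob)
		detections = net.forward()

	# ...draw `detections` on the frame exactly as before...
	total += 1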

In future blog posts we’ll be discussing deep learning object detection methods in more detail.

In the meantime, be sure to take a look at my book, Deep Learning for Computer Vision with Python, where I’ll be reviewing object detection frameworks such as Faster R-CNNs and Single Shot Detectors!

If you’re interested in studying deep learning for computer vision and image classification tasks, you just can’t beat this book — click here to learn more.

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 11-page Resource Guide on Computer Vision and Image Search Engines, including exclusive techniques that I don’t post on this blog! Sound good? If so, enter your email address and I’ll send you the code immediately!

The post Real-time object detection with deep learning and OpenCV appeared first on PyImageSearch.

Install OpenCV 3 on macOS with Homebrew (the easy way)



Over the past few weeks I have demonstrated how to compile OpenCV 3 on macOS with Python (2.7, 3.5) bindings from source.

Compiling OpenCV via source gives you complete and total control over which modules you want to build, how they are built, and where they are installed.

All this control can come at a price though.

The downside is that determining the correct CMake paths to your Python interpreter, libraries, and include directories can be non-trivial, especially for users who are new to OpenCV/Unix systems.

That begs the question…

“Is there an easier way to install OpenCV on macOS? A way that avoids the complicated CMake configuration?”

It turns out, there is — just use Homebrew, what many consider to be “the missing package manager for Mac”.

So, is it really that easy? Can a few simple keystrokes and commands really be used to avoid the hassle and install OpenCV 3 without the headaches?

Well, there’s a little more to it than that…but the process is greatly simplified. You lose a bit of control (as compared to compiling from source), but what you gain is an easier to follow path to installing OpenCV on your Mac system.

To discover the easy way to install OpenCV 3 on macOS via Homebrew, just keep reading.

Install OpenCV 3 on macOS with Homebrew (the easy way)

The remainder of this blog post demonstrates how to install OpenCV 3 with both Python 2.7 and Python 3 bindings on macOS via Homebrew. The benefit of using Homebrew is that it greatly simplifies the install process (although it can pose problems of its own if you aren’t careful) to only a small set of commands that need to be run.

If you prefer to compile OpenCV from source with Python bindings on macOS, please refer to my earlier from-source installation tutorials here on PyImageSearch.

Step #1: Install Xcode

Before we can install OpenCV 3 on macOS via Homebrew, we first need to install Xcode, a set of software development tools for the Mac Operating System.

Download Xcode

The easiest method to download and install Xcode is to use the included App Store application on your macOS system. Simply open up App Store, search for “Xcode” in the search bar, and then click the “Get” button:

Figure 1: Downloading and installing Xcode on macOS.

Depending on your internet connection and system speed, the download and install process can take anywhere from 30 to 60 minutes. I would suggest installing Xcode in the background while you are getting some other work done or going for a nice long walk.

Accept the Apple developer license

I’m assuming that you’re working with a fresh install of macOS and Xcode. If so, you’ll need to accept the developer license before continuing. Personally, I think this is easier to do via the terminal. Just open up a terminal and execute the following command:

$ sudo xcodebuild -license

Scroll to the bottom of the license and accept it.

If you have already installed Xcode and previously accepted the Apple developer license, you can skip this step.

Install the Apple Command Line Tools

Now that Xcode is installed and we have accepted the Apple developer license, we can install the Apple Command Line Tools. These tools include packages such as make, GCC, clang, etc. This is a required step, so make sure you install the Apple Command Line Tools via:

$ sudo xcode-select --install

When executing the above command you’ll see a confirmation window pop up asking you to approve the install:

Figure 2: Installing Apple Command Line Tools on macOS.

Simply click the “Install” button to continue. The actual install process of the Apple Command Line Tools should take less than 5 minutes.

If you haven’t already, be sure you have accepted the Xcode license using the following command:

$ sudo xcodebuild -license

Step #2: Install Homebrew

We are now ready to install Homebrew, a package manager for macOS. You can think of Homebrew as the macOS equivalent of the Ubuntu/Debian-based apt-get.

Installing Homebrew is dead simple — simply copy and paste the command found below the “Install Homebrew” section of the Homebrew website (make sure you copy and paste the entire command into your terminal). I have included the command below for reference:

$ ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

After Homebrew installs you should make sure the package definitions (i.e., the instructions used to install a given library/package) are up to date by executing the following command:

$ brew update

Now that Homebrew is successfully installed and updated, we need to update our

~/.bash_profile
  file so that it searches the Homebrew path for packages/libraries before it searches the system path. Failure to complete this step can lead to confusing errors, import problems, and segfaults when trying to utilize Python and OpenCV, so make sure you update your
~/.bash_profile
  file correctly!

The

~/.bash_profile
  file may or may not already exist on your system. In either case, open it with your favorite text editor (I’ll be using
nano
  in this example):
$ nano ~/.bash_profile

And then insert the following lines at the bottom of the file (if

~/.bash_profile
  does not exist the file will be empty — this is okay, just add the following lines to the file):
# Homebrew
export PATH=/usr/local/bin:$PATH

All this snippet is doing is updating your

PATH
  variable to look for libraries/binaries along the Homebrew path before it searches the system path.

After updating the

~/.bash_profile
  file, save and exit your text editor.

To make sure you are on the right path, I have included a screenshot of my

~/.bash_profile
  below so you can compare it to yours:
Figure 3: Updating my .bash_profile file to include Homebrew.

Remember, your

~/.bash_profile
  may look very different than mine — that’s okay! Just make sure you have included the above Homebrew snippet in your file, then save your changes and exit the editor.

Finally, we need to manually

source
  the
~/.bash_profile
  file to ensure the changes have been reloaded:
$ source ~/.bash_profile

The above command only needs to be executed once. Whenever you open up a new terminal, login, etc., your

.bash_profile
  file will be automatically loaded and sourced for you.
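As a quick check that the change is active, you can print your PATH and confirm /usr/local/bin appears first (the remainder of your output will differ from the illustrative line below):

$ echo $PATH
/usr/local/bin:/usr/bin:/bin:...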

Step #3: Install Python 2.7 and Python 3 using Homebrew

The next step is to install the Homebrew versions of Python 2.7 and Python 3. It is considered bad form to develop against the system Python as your main interpreter. The system version of Python should serve exactly that — system routines.

Instead, you should install your own version of Python that is independent from the system install. Using Homebrew, we can install both Python 2.7 and Python 3 using the following command:

$ brew install python python3

At the time of this writing the current Python versions installed by Homebrew are Python 2.7.12 and Python 3.5.2.

After the Python 2.7 and Python 3 install completes, we need to create some symbolic links:

$ brew linkapps python
$ brew linkapps python3

As a sanity check, let’s confirm that you are using the Homebrew version of Python rather than the system version of Python. You can accomplish this via the

which
  command:
$ which python
/usr/local/bin/python
$ which python3
/usr/local/bin/python3

Inspect the output of

which
  closely. If you see
/usr/local/bin/python
  and
/usr/local/bin/python3
  for each of the paths then you are correctly using the Homebrew versions of Python. However, if the output is instead
/usr/bin/python
  and
/usr/bin/python3
  then you are incorrectly using the system version of Python.

If you find yourself in this situation you should:

  1. Go back to Step #2 and ensure Homebrew installed without error.
  2. Check that
    brew install python python3
      finished successfully.
  3. Verify that you have correctly updated your
    ~/.bash_profile
      file and reloaded the changes via
    source
     . Your
    ~/.bash_profile
      should look similar to mine in Figure 3 above.

Check your Python versions

After installing Python 2.7 and Python 3, you’ll want to check your Python version numbers using the following commands:

$ python --version
Python 2.7.10
$ python3 --version
Python 3.5.0

In particular, pay attention to both the major and minor version numbers. For the first command, my major Python version is 2 and the minor version is 7. Similarly, for the second command my major Python version is 3 and the minor version is 5.

The reason I bring this up is because file paths can and will change based on your particular Python version numbers. The instructions detailed in this tutorial will successfully install OpenCV via Homebrew on your macOS machine provided you pay attention to your Python version numbers.

For example, if I were to tell you to check the

site-packages
  directory of your Python 3 install and provided an example command of:
$ ls /usr/local/opt/opencv3/lib/python3.5/site-packages/

You should first check your Python 3 version. If the

python3 --version
  command above reported 3.6, then you would need to update your path to be:
$ ls /usr/local/opt/opencv3/lib/python3.6/site-packages/

Notice how

python3.5
  was changed to
python3.6
 .

Forgetting to check and validate file paths is a common mistake I see readers make when trying to install OpenCV on their macOS machines for the first time. Do not blindly copy and paste commands and file paths. Instead, take the time to validate your file paths based on your Python version numbers. Doing so will ensure your commands are correctly constructed and will help you immensely when installing OpenCV for the first time.

Step #4: Install OpenCV 3 with Python bindings on macOS using Homebrew

Now that we have installed the Homebrew versions of Python 2.7 and Python 3 we are now ready to install OpenCV 3.

Tap the “homebrew/science” repo

The first step is to add the

homebrew/science
  repository to the set of packages we are tracking. This allows us to access the formulae to install OpenCV. To accomplish this, just use the following command:
$ brew tap homebrew/science

Understanding the “brew install” command

To install OpenCV on our macOS system via Homebrew we are going to use the

brew install
  command. This command accepts the name of a package to install (like Debian/Ubuntu’s apt-get), followed by a set of optional arguments.

The base of our command is:

brew install opencv3
 ; however, we need to add some additional parameters.

The most important set of parameters are listed below:

  • --with-contrib
     : This ensures that the opencv_contrib repository is installed, giving us access to additional, critical OpenCV features such as SIFT, SURF, etc.
  • --with-python3
     : OpenCV 3 + Python 2.7 bindings will be automatically compiled; however, to compile OpenCV 3 + Python 3 bindings we need to explicitly supply the
    --with-python3
      switch.
  • --HEAD
     : Rather than compiling a tagged OpenCV release (i.e., v3.0, v3.1, etc.) the
    --HEAD
      switch instead clones down the bleeding-edge version of OpenCV from GitHub. Why would we bother doing this? Simple. We need to avoid the QTKit error that plagues macOS Sierra systems with the current tagged OpenCV 3 releases (please see the “Avoiding the QTKit/QTKit.h file not found error” section of this blog post for more information)

You can see the full listing of options/switches by running

brew info opencv3
 , the output of which I’ve included below:
$ brew info opencv3
...
--32-bit
	Build 32-bit only
--c++11
	Build using C++11 mode
--with-contrib
	Build "extra" contributed modules
--with-cuda
	Build with CUDA v7.0+ support
--with-examples
	Install C and python examples (sources)
--with-ffmpeg
	Build with ffmpeg support
--with-gphoto2
	Build with gphoto2 support
--with-gstreamer
	Build with gstreamer support
--with-jasper
	Build with jasper support
--with-java
	Build with Java support
--with-libdc1394
	Build with libdc1394 support
--with-opengl
	Build with OpenGL support (must use --with-qt5)
--with-openni
	Build with openni support
--with-openni2
	Build with openni2 support
--with-python3
	Build with python3 support
--with-qt5
	Build the Qt5 backend to HighGUI
--with-quicktime
	Use QuickTime for Video I/O instead of QTKit
--with-static
	Build static libraries
--with-tbb
	Enable parallel code in OpenCV using Intel TBB
--with-vtk
	Build with vtk support
--without-eigen
	Build without eigen support
--without-numpy
	Use a numpy you've installed yourself instead of a Homebrew-packaged numpy
--without-opencl
	Disable GPU code in OpenCV using OpenCL
--without-openexr
	Build without openexr support
--without-python
	Build without Python support
--without-test
	Build without accuracy & performance tests
--HEAD
	Install HEAD version

For those who are curious, the Homebrew formulae (i.e., the actual commands used to install OpenCV 3) can be found here. Use the parameters above and the install script as a reference if you want to add any additional OpenCV 3 features.

We are now ready to install OpenCV 3 with Python bindings on your macOS system via Homebrew. Depending on the dependencies you do or do not already have installed, along with the speed of your system, this compilation could easily take a couple of hours, so you might want to go for a walk once you kick-off the install process.

Installing OpenCV 3 with Python 3 bindings via Homebrew

To start the OpenCV 3 install process, just execute the following command:

$ brew install opencv3 --with-contrib --with-python3 --HEAD

This command will install OpenCV 3 on your macOS system with both Python 2.7 and Python 3 bindings via Homebrew. We’ll also be compiling the latest, bleeding edge version of OpenCV 3 (to avoid any QTKit errors) along with

opencv_contrib
  support enabled.

Update — 15 May 2017:

There was recently an update to the Homebrew formula used to install OpenCV on your macOS machine that may cause two types of errors.

Ideally the Homebrew formula will be updated in the future to prevent these errors, but in the meantime, if you encounter either of the errors below:

  • opencv3: Does not support building both Python 2 and 3 wrappers
  • No such file or directory 3rdparty/ippicv/downloader.cmake

Then be sure to refer to this updated blog post where I provide solutions to both of the errors.


As I mentioned, this install process can take some time so consider going for a long walk while OpenCV installs. However, make sure your computer doesn’t go to sleep/shut down while you are gone! If it does, the install process will break and you’ll have to restart it.

Assuming OpenCV 3 installed without a problem, your terminal output should look similar to mine below:

Figure 4: Compiling and installing OpenCV 3 with Python bindings on macOS with Homebrew.

However, we’re not quite done yet.

You’ll notice a little note at the bottom of the install output:

If you need Python to find bindings for this keg-only formula, run:
  echo /usr/local/opt/opencv3/lib/python2.7/site-packages >> /usr/local/lib/python2.7/site-packages/opencv3.pth

This means that our Python 2.7 + OpenCV 3 bindings are now installed in

/usr/local/opt/opencv3/lib/python2.7/site-packages
 , which is the Homebrew path to the OpenCV install. We can verify this via the
ls
  command:
$ ls -l /usr/local/opt/opencv3/lib/python2.7/site-packages
total 6944
-r--r--r--  1 admin  admin  3552288 Dec 15 09:28 cv2.so

However, we need to get these bindings into

/usr/local/lib/python2.7/site-packages/
 , which is the
site-packages
  directory for Python 2.7. We can do this by executing the following command:
$ echo /usr/local/opt/opencv3/lib/python2.7/site-packages >> /usr/local/lib/python2.7/site-packages/opencv3.pth

The above command creates a

.pth
  file which tells Homebrew’s Python 2.7 install to look for additional packages in
/usr/local/opt/opencv3/lib/python2.7/site-packages
 — in essence, the
.pth
  file can be considered a “glorified sym-link”.
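A quick way to verify the .pth file is working is to check that the directory it lists shows up on sys.path (a sketch; the keg path may differ slightly on your machine):

$ python
>>> import sys
>>> "/usr/local/opt/opencv3/lib/python2.7/site-packages" in sys.path
True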

At this point you now have OpenCV 3 + Python 2.7 bindings installed!

However, we’re not quite done yet… there are still a few extra steps we need to take for Python 3.

Handling the Python 3 issue

Remember the

--with-python3
  option we supplied to
brew install opencv3
 ?

Well, this option did work (although it might not seem like it) — we do have Python 3 + OpenCV 3 bindings installed on our system.

Note: A big thank you to Brandon Hurr for pointing this out. For a long time I thought the

--with-python3
  switch simply wasn’t working.

However, there’s a bit of a problem. If you check the contents of

/usr/local/opt/opencv3/lib/python3.5/site-packages/
  you’ll see that our
cv2.so
  file has a funny name:
$ ls -l /usr/local/opt/opencv3/lib/python3.5/site-packages/
total 6952
-r--r--r--  1 admin  admin  3556384 Dec 15 09:28 cv2.cpython-35m-darwin.so

I have no idea why the Python 3 + OpenCV 3 bindings are not named

cv2.so
  as they should be, but the same is true across operating systems. You’ll see this same issue on macOS, Ubuntu, and Raspbian.

Luckily, the fix is easy — all you need to do is rename

cv2.cpython-35m-darwin.so
  to
cv2.so
 :
$ cd /usr/local/opt/opencv3/lib/python3.5/site-packages/
$ mv cv2.cpython-35m-darwin.so cv2.so
$ cd ~

From there, we can create another

.pth
  file, this time for the Python 3 + OpenCV 3 install:
$ echo /usr/local/opt/opencv3/lib/python3.5/site-packages >> /usr/local/lib/python3.5/site-packages/opencv3.pth

At this point you now have both Python 2.7 + OpenCV 3 and Python 3 + OpenCV 3 installed on your macOS system via Homebrew.

Verifying that OpenCV 3 has been installed

Here are the commands I use to validate that OpenCV 3 with Python 2.7 bindings are working on my system:

$ python
Python 2.7.12 (default, Oct 11 2016, 05:20:59) 
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.38)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> cv2.__version__
'3.1.0-dev'
>>>

The screenshot below shows how to import the OpenCV 3 bindings into a Python 3 shell as well:

Figure 5: Confirming that OpenCV 3 with Python 3 bindings have been successfully installed on my macOS system via Homebrew.

Congratulations, you have installed OpenCV 3 with Python bindings on your macOS system via Homebrew!

But if you’re a longtime reader of this blog, you know that I use Python virtual environments extensively — and you should too.

Step #5: Setup your Python virtual environment (optional)

You’ll notice that unlike many of my previous OpenCV 3 install tutorials, Homebrew does not make use of Python virtual environments, a best practice when doing Python development.

While Steps #5-#7 are optional, I highly recommend that you do them to ensure your system is configured in the same way as my previous tutorials. You’ll see many tutorials on the PyImageSearch blog leverage Python virtual environments. While they are indeed optional, you’ll find that in the long run they make your life easier.

Installing virtualenv and virtualenvwrapper

The virtualenv and virtualenvwrapper packages allow us to create separate, independent Python virtual environments for each project we are working on. I’ve mentioned Python virtual environments many times before on this blog so I won’t rehash what’s already been said. Instead, if you are unfamiliar with Python virtual environments, how they work, and why we use them, please refer to the first half of this blog post. I also recommend this excellent tutorial on the RealPython.com blog that takes a more in-depth dive into Python virtual environments.

To install both

virtualenv
  and
virtualenvwrapper
 , just use
pip
 :
$ pip install virtualenv virtualenvwrapper

After both packages have successfully installed, you’ll need to update your

~/.bash_profile
  file again:
$ nano ~/.bash_profile

Append the following lines to the file:

# Virtualenv/VirtualenvWrapper
source /usr/local/bin/virtualenvwrapper.sh

After updating, your

~/.bash_profile
  should look similar to mine below:
Figure 6: Update your .bash_profile file to include virtualenv/virtualenvwrapper.

Once you have confirmed that your

~/.bash_profile
  has been created, you need to refresh your shell by using the
source
  command:
$ source ~/.bash_profile

This command only needs to be executed once. Assuming that your

~/.bash_profile
  has been updated correctly, it will automatically be loaded and
source
 ‘d each time you open a new shell, login, etc.

Create your Python virtual environment

We are now ready to use the

mkvirtualenv
  command to create a Python virtual environment named
cv
  (for “computer vision”).

For Python 2.7 use the following command:

$ mkvirtualenv cv -p python

For Python 3 use this command:

$ mkvirtualenv cv -p python3

The

-p
  switch controls which Python version is used to create your virtual environment. Please note that each virtual environment needs to be uniquely named so if you want to create two separate virtual environments, one for Python 2.7 and another for Python 3, you’ll want to make sure that each environment has a separate name — both cannot be named “cv”.
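For example, to have both versions side by side you could create two uniquely named environments (the py3cv3 name mirrors the one I use later in this post):

$ mkvirtualenv cv -p python
$ mkvirtualenv py3cv3 -p python3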

The

mkvirtualenv
  command only needs to be executed once. To access the
cv
  Python virtual environment after you have already created it, just use the
workon
  command:
$ workon cv

To visually validate you are in the

cv
  virtual environment, just examine your command line. If you see the text
(cv)
  preceding the prompt, then you are in the
cv
  virtual environment:
Figure 7: Make sure you see the “(cv)” text on your prompt, indicating that you are in the cv virtual environment.

Otherwise, if you do not see the

cv
  text, then you are not in the
cv
  virtual environment:
Figure 8: If you do not see the “(cv)” text on your prompt, then you are not in the cv virtual environment and you need to run the “workon” command to resolve this issue before continuing.

Install NumPy

The only Python prerequisite for OpenCV is NumPy, a scientific computing package.

To install NumPy, first make sure you are in the

cv
  virtual environment and then let
pip
  handle the actual installation:
$ pip install numpy

Step #6: Sym-link the OpenCV 3 bindings (optional)

We are now ready to sym-link in the

cv2.so
  bindings into our
cv
  virtual environment. I have included the commands for both Python 2.7 and Python 3, although the process is very similar.

For Python 2.7

To sym-link the

cv2.so
  bindings into your Python 2.7 virtual environment named
cv
 , use these commands:
$ cd ~/.virtualenvs/cv/lib/python2.7/site-packages/
$ ln -s /usr/local/opt/opencv3/lib/python2.7/site-packages/cv2.so cv2.so
$ cd ~

For Python 3:

To sym-link the

cv2.so
  bindings installed via Homebrew to your Python 3 virtual environment (named
cv
 ), execute these commands:
$ cd ~/.virtualenvs/cv/lib/python3.5/site-packages/
$ ln -s /usr/local/opt/opencv3/lib/python3.5/site-packages/cv2.so cv2.so
$ cd ~

Repeat as necessary

If you would like to have OpenCV 3 bindings installed for both Python 2.7 and Python 3, then you’ll want to repeat Step #5 and Step #6 for both Python versions. This includes creating a uniquely named Python virtual environment, installing NumPy, and sym-linking in the

cv2.so
  bindings.

Step #7: Test your OpenCV 3 install (optional)

To verify that your OpenCV 3 + Python + virtual environment install on macOS is working properly, you should:

  1. Open up a new terminal window.
  2. Execute the
    workon
      command to access the
    cv
      Python virtual environment.
  3. Attempt to import your Python + OpenCV 3 bindings on macOS.

Here are the exact commands I used to validate that my Python virtual environment + OpenCV install are working correctly:

$ workon cv
$ python
>>> import cv2
>>> cv2.__version__
'3.1.0-dev'
>>>

Note that the above output demonstrates how to use OpenCV 3 + Python 2.7 with virtual environments.

I also created an OpenCV 3 + Python 3 virtual environment as well (named

py3cv3
 ), installed NumPy, and sym-linked the OpenCV 3 bindings. The output of me accessing the
py3cv3
  virtual environment and importing OpenCV can be seen below:
Figure 9: Utilizing virtual environments with Python 3 + OpenCV 3 on macOS.

So, what’s next?

Congrats! You now have a brand new, fresh install of OpenCV on your macOS system — and I’m sure you’re just itching to leverage your install to build some awesome computer vision apps…

…but I’m also willing to bet that you’re just getting started learning computer vision and OpenCV, and probably feeling a bit confused and overwhelmed on exactly where to start.

Personally, I’m a big fan of learning by example, so a good first step would be to have some fun and read this blog post on detecting cats in images/videos. This tutorial is meant to be very hands-on and demonstrate how you can (quickly) build a Python + OpenCV application to detect the presence of cats in images.

And if you’re really interested in leveling-up your computer vision skills, you should definitely check out my book, Practical Python and OpenCV + Case Studies. My book not only covers the basics of computer vision and image processing, but also teaches you how to solve real-world computer vision problems including face detection in images and video streams, object tracking in video, and handwriting recognition.


So, let’s put that fresh install of OpenCV 3 on your macOS system to good use — just click here to learn more about the real-world projects you can solve using Practical Python and OpenCV.

Summary

In today’s blog post I demonstrated how to install OpenCV 3 with Python 2.7 and Python 3 bindings on your macOS system via Homebrew.

As you can see, utilizing Homebrew is a great method to avoid the tedious process of manually configuring your CMake command to compile OpenCV via source (my full list of OpenCV install tutorials can be found on this page).

The downside is that you lose much of the control that CMake affords you.

Furthermore, while the Homebrew method certainly requires executing less commands and avoids potentially frustrating configurations, it’s still worth mentioning that you still need to do a bit of work yourself, especially when it comes to the Python 3 bindings.

These steps also compound if you decide to use virtual environments, a best practice when doing Python development.

When it comes to installing OpenCV 3 on your own macOS system I would suggest you:

  1. First try to install OpenCV 3 via source. If you run into considerable trouble and struggle to get OpenCV 3 to compile, use this as an opportunity to teach yourself more about Unix environments. More times than not, OpenCV 3 failing to compile is due to an incorrect CMake parameter that can be correctly determined with a little more knowledge over Unix systems, paths, and libraries.
  2. Use Homebrew as a fallback. I would recommend using the Homebrew method to install OpenCV 3 as your fallback option. You lose a bit of control when installing OpenCV 3 via Homebrew, and worse, if any sym-links break during a major operating system upgrade you’ll struggle to resolve them. Don’t get me wrong: I love Homebrew and think it’s a great tool — but make sure you use it wisely.

Anyway, I hope you enjoyed this blog post! And I hope it helps you get OpenCV 3 installed on your macOS system.

If you’re interested in learning more about OpenCV, computer vision, and image processing, be sure to enter your email address in the form below to be notified when new blog posts + tutorials are published!

The post Install OpenCV 3 on macOS with Homebrew (the easy way) appeared first on PyImageSearch.


Rotate images (correctly) with OpenCV and Python



Let me tell you an embarrassing story of how I wasted three weeks of research time during graduate school six years ago.

It was the end of my second semester of coursework.

I had taken all of my exams early and all my projects for the semester had been submitted.

Since my school obligations were essentially nil, I started experimenting with (automatically) identifying prescription pills in images, something I know a thing or two about (but back then I was just getting started with my research).

At the time, my research goal was to find and identify methods to reliably quantify pills in a rotation invariant manner. Regardless of how the pill was rotated, I wanted the output feature vector to be (approximately) the same (the feature vectors will never be completely identical in a real-world application due to lighting conditions, camera sensors, floating point errors, etc.).

After the first week I was making fantastic progress.

I was able to extract features from my dataset of pills, index them, and then identify my test set of pills regardless of how they were oriented…

…however, there was a problem:

My method was only working with round, circular pills — I was getting completely nonsensical results for oblong pills.

How could that be?

I racked my brain for the explanation.

Was there a flaw in the logic of my feature extraction algorithm?

Was I not matching the features correctly?

Or was it something else entirely… like a problem with my image preprocessing?

While I might have been ashamed to admit this as a graduate student, the problem was the latter:

I goofed up.

It turns out that during the image preprocessing phase, I was rotating my images incorrectly.

Since round pills are approximately square in their aspect ratio, the rotation bug wasn’t a problem for them. Here you can see a round pill being rotated a full 360 degrees without an issue:

Figure 1: Rotating a circular pill doesn’t reveal any obvious problems.

But for oblong pills, they would be “cut off” in the rotation process, like this:

Figure 2: However, rotating oblong pills using OpenCV’s standard cv2.getRotationMatrix2D and cv2.warpAffine functions caused me some problems that weren’t immediately obvious.

In essence, I was only quantifying part of the rotated, oblong pills; hence my strange results.

I spent three weeks and part of my Christmas vacation banging my head against the wall trying to diagnose the bug — only to feel quite embarrassed when I realized it was due to me being negligent with the

cv2.warpAffine
  function.

You see, the size of the output image needs to be adjusted; otherwise, the corners of my image would be cut off.

How did I accomplish this and squash the bug for good?

To learn how to rotate images with OpenCV such that the entire image is included and none of the image is cut off, just keep reading.

Looking for the source code to this post?
Jump right to the downloads section.

Rotate images (correctly) with OpenCV and Python

In the remainder of this blog post I’ll discuss common issues that you may run into when rotating images with OpenCV and Python.

Specifically, we’ll be examining the problem of what happens when the corners of an image are “cut off” during the rotation process.

To make sure we all understand this rotation issue with OpenCV and Python I will:

  • Start with a simple example demonstrating the rotation problem.
  • Provide a rotation function that ensures images are not cut off in the rotation process.
  • Discuss how I resolved my pill identification issue using this method.

A simple rotation problem with OpenCV

Let’s get this blog post started with an example script.

Open up a new file, name it

rotate_simple.py
 , and insert the following code:
# import the necessary packages
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to the image file")
args = vars(ap.parse_args())

Lines 2-5 start by importing our required Python packages.

If you don’t already have imutils, my series of OpenCV convenience functions installed, you’ll want to do that now:

$ pip install imutils

If you already have

imutils
  installed, make sure you have upgraded to the latest version:
$ pip install --upgrade imutils

From there, Lines 8-10 parse our command line arguments. We only need a single switch here,

--image
 , which is the path to where our image resides on disk.

Let’s move on to actually rotating our image:

# import the necessary packages
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to the image file")
args = vars(ap.parse_args())

# load the image from disk
image = cv2.imread(args["image"])

# loop over the rotation angles
for angle in np.arange(0, 360, 15):
	rotated = imutils.rotate(image, angle)
	cv2.imshow("Rotated (Problematic)", rotated)
	cv2.waitKey(0)

# loop over the rotation angles again, this time ensuring
# no part of the image is cut off
for angle in np.arange(0, 360, 15):
	rotated = imutils.rotate_bound(image, angle)
	cv2.imshow("Rotated (Correct)", rotated)
	cv2.waitKey(0)

Line 14 loads the image we want to rotate from disk.

We then loop over various angles in the range [0, 360] in 15 degree increments (Line 17).

For each of these angles we call

imutils.rotate
 , which rotates our
image
  the specified number of
angle
  degrees about the center of the image. We then display the rotated image to our screen.

Lines 24-27 perform an identical process, but this time we call

imutils.rotate_bound
  (I’ll provide the implementation of this function in the next section).

As the name of this method suggests, we are going to ensure the entire image is bound inside the window and none is cut off.

To see this script in action, be sure to download the source code using the “Downloads” section of this blog post, followed by executing the command below:

$ python rotate_simple.py --image images/saratoga.jpg

The output of using the

imutils.rotate
  function on a non-square image can be seen below:
Figure 3: An example of corners being cut off when rotating an image using OpenCV and Python.

As you can see, the image is “cut off” when it’s rotated — the entire image is not kept in the field of view.

But if we use

imutils.rotate_bound
  we can resolve this issue:
Figure 4: We can ensure the entire image is kept in the field of view by modifying the matrix returned by cv2.getRotationMatrix2D.

Awesome, we fixed the problem!

So does this mean that we should always use

.rotate_bound
  over the
.rotate
  method?

What makes it so special?

And what’s going on under the hood?

I’ll answer these questions in the next section.

Implementing a rotation function that doesn’t cut off your images

Let me start off by saying there is nothing wrong with the

cv2.getRotationMatrix2D
  and
cv2.warpAffine
  functions that are used to rotate images inside OpenCV.

In reality, these functions give us more freedom than perhaps we are comfortable with (sort of like comparing manual memory management with C versus automatic garbage collection with Java).

The

cv2.getRotationMatrix2D
  function doesn’t care if we would like the entire rotated image to be kept.

It doesn’t care if the image is cut off.

And it won’t help you if you shoot yourself in the foot when using this function (I found this out the hard way and it took 3 weeks to stop the bleeding).

Instead, what you need to do is understand what the rotation matrix is and how it’s constructed.

You see, when you rotate an image with OpenCV you call

cv2.getRotationMatrix2D
  which returns a matrix M that looks something like this:
Figure 5: The structure of the matrix M returned by cv2.getRotationMatrix2D.

This matrix looks scary, but I promise you: it’s not.

To understand it, let’s assume we want to rotate our image \theta degrees about some center (c_{x}, c_{y}) coordinates at some scale (i.e., smaller or larger).

We can then plug in values for \alpha and \beta:

\alpha = scale * cos \theta and \beta = scale * sin \theta
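Written out in full, the matrix M returned by cv2.getRotationMatrix2D (the same matrix shown in Figure 5, as given in the OpenCV documentation) is:

M = \begin{bmatrix} \alpha & \beta & (1 - \alpha) \cdot c_{x} - \beta \cdot c_{y} \\ -\beta & \alpha & \beta \cdot c_{x} + (1 - \alpha) \cdot c_{y} \end{bmatrix}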

That’s all fine and good for simple rotation — but it doesn’t take into account what happens if an image is cut off along the borders. How do we remedy this?

The answer is inside the

rotate_bound
  function in convenience.py of imutils:
# author:    Adrian Rosebrock
# website:   https://www.pyimagesearch.com

# import the necessary packages
import numpy as np
import cv2
import sys

# import any special Python 2.7 packages
if sys.version_info.major == 2:
    from urllib import urlopen

# import any special Python 3 packages
elif sys.version_info.major == 3:
    from urllib.request import urlopen

def translate(image, x, y):
    # define the translation matrix and perform the translation
    M = np.float32([[1, 0, x], [0, 1, y]])
    shifted = cv2.warpAffine(image, M, (image.shape[1], image.shape[0]))

    # return the translated image
    return shifted

def rotate(image, angle, center=None, scale=1.0):
    # grab the dimensions of the image
    (h, w) = image.shape[:2]

    # if the center is None, initialize it as the center of
    # the image
    if center is None:
        center = (w // 2, h // 2)

    # perform the rotation
    M = cv2.getRotationMatrix2D(center, angle, scale)
    rotated = cv2.warpAffine(image, M, (w, h))

    # return the rotated image
    return rotated

def rotate_bound(image, angle):
    # grab the dimensions of the image and then determine the
    # center
    (h, w) = image.shape[:2]
    (cX, cY) = (w // 2, h // 2)

    # grab the rotation matrix (applying the negative of the
    # angle to rotate clockwise), then grab the sine and cosine
    # (i.e., the rotation components of the matrix)
    M = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
    cos = np.abs(M[0, 0])
    sin = np.abs(M[0, 1])

    # compute the new bounding dimensions of the image
    nW = int((h * sin) + (w * cos))
    nH = int((h * cos) + (w * sin))

    # adjust the rotation matrix to take into account translation
    M[0, 2] += (nW / 2) - cX
    M[1, 2] += (nH / 2) - cY

    # perform the actual rotation and return the image
    return cv2.warpAffine(image, M, (nW, nH))

On Line 41 we define our

rotate_bound
  function.

This method accepts an input

image
  and an
angle
  to rotate it by.

We assume we’ll be rotating our image about its center (x, y)-coordinates, so we determine these values on Lines 44 and 45.

Given these coordinates, we can call

cv2.getRotationMatrix2D
  to obtain our rotation matrix M (Line 50).

However, to adjust for any image border cut off issues, we need to apply some manual calculations of our own.

We start by grabbing the cosine and sine values from our rotation matrix M (Lines 51 and 52).

This enables us to compute the new width and height of the rotated image, ensuring no part of the image is cut off.
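As a quick sanity check on those formulas, consider rotating a 400 pixel wide, 200 pixel tall image by 90 degrees: sin \theta = 1 and cos \theta = 0, so nW = (200 × 1) + (400 × 0) = 200 and nH = (200 × 0) + (400 × 1) = 400. The width and height swap, exactly as we’d expect for a 90 degree rotation.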

Once we know the new width and height, we can adjust for translation on Lines 59 and 60 by modifying our rotation matrix once again.

Finally,

cv2.warpAffine
  is called on Line 63 to rotate the actual image using OpenCV while ensuring none of the image is cut off.

For some other interesting solutions (some better than others) to the rotation cut off problem when using OpenCV, be sure to refer to this StackOverflow thread and this one too.

Fixing the rotated image “cut off” problem with OpenCV and Python

Let’s get back to my original problem of rotating oblong pills and how I used

.rotate_bound
  to solve the issue (although back then I had not created the
imutils
  Python package — it was simply a utility function in a helper file).

We’ll be using the following pill as our example image:

Figure 6: The example oblong pill we will be rotating with OpenCV.

To start, open up a new file and name it

rotate_pills.py
 . Then, insert the following code:
# import the necessary packages
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to the image file")
args = vars(ap.parse_args())

Lines 2-5 import our required Python packages. Again, make sure you have installed and/or upgraded the imutils Python package before continuing.

We then parse our command line arguments on Lines 8-11. Just like in the example at the beginning of the blog post, we only need one switch:

--image
 , the path to our input image.

Next, we load our pill image from disk and preprocess it by converting it to grayscale, blurring it, and detecting edges:

# import the necessary packages
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to the image file")
args = vars(ap.parse_args())

# load the image from disk, convert it to grayscale, blur it,
# and apply edge detection to reveal the outline of the pill
image = cv2.imread(args["image"])
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (3, 3), 0)
edged = cv2.Canny(gray, 20, 100)

After executing these preprocessing functions our pill image now looks like this:

Figure 7: Detecting edges in the pill.

The outline of the pill is clearly visible, so let’s apply contour detection to find the outline of the pill:

# import the necessary packages
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to the image file")
args = vars(ap.parse_args())

# load the image from disk, convert it to grayscale, blur it,
# and apply edge detection to reveal the outline of the pill
image = cv2.imread(args["image"])
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (3, 3), 0)
edged = cv2.Canny(gray, 20, 100)

# find contours in the edge map
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]

We are now ready to extract the pill ROI from the image:

# import the necessary packages
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to the image file")
args = vars(ap.parse_args())

# load the image from disk, convert it to grayscale, blur it,
# and apply edge detection to reveal the outline of the pill
image = cv2.imread(args["image"])
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (3, 3), 0)
edged = cv2.Canny(gray, 20, 100)

# find contours in the edge map
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]

# ensure at least one contour was found
if len(cnts) > 0:
	# grab the largest contour, then draw a mask for the pill
	c = max(cnts, key=cv2.contourArea)
	mask = np.zeros(gray.shape, dtype="uint8")
	cv2.drawContours(mask, [c], -1, 255, -1)

	# compute the bounding box of the pill, then extract the ROI,
	# and apply the mask
	(x, y, w, h) = cv2.boundingRect(c)
	imageROI = image[y:y + h, x:x + w]
	maskROI = mask[y:y + h, x:x + w]
	imageROI = cv2.bitwise_and(imageROI, imageROI,
		mask=maskROI)

First, we ensure that at least one contour was found in the edge map (Line 26).

Provided we have at least one contour, we construct a

mask
  for the largest contour region on Lines 29 and 30.

Our

mask
  looks like this:
Figure 8: The mask representing the entire pill region in the image.

Given the contour region, we can compute the (x, y)-coordinates of the bounding box of the region (Line 34).

Using both the bounding box and mask, we can extract the actual pill region ROI (Lines 35-38).

Now, let’s go ahead and apply both the imutils.rotate and imutils.rotate_bound functions to the imageROI, just like we did in the simple examples above:
# import the necessary packages
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to the image file")
args = vars(ap.parse_args())

# load the image from disk, convert it to grayscale, blur it,
# and apply edge detection to reveal the outline of the pill
image = cv2.imread(args["image"])
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (3, 3), 0)
edged = cv2.Canny(gray, 20, 100)

# find contours in the edge map
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]

# ensure at least one contour was found
if len(cnts) > 0:
	# grab the largest contour, then draw a mask for the pill
	c = max(cnts, key=cv2.contourArea)
	mask = np.zeros(gray.shape, dtype="uint8")
	cv2.drawContours(mask, [c], -1, 255, -1)

	# compute the bounding box of the pill, then extract the ROI,
	# and apply the mask
	(x, y, w, h) = cv2.boundingRect(c)
	imageROI = image[y:y + h, x:x + w]
	maskROI = mask[y:y + h, x:x + w]
	imageROI = cv2.bitwise_and(imageROI, imageROI,
		mask=maskROI)

	# loop over the rotation angles
	for angle in np.arange(0, 360, 15):
		rotated = imutils.rotate(imageROI, angle)
		cv2.imshow("Rotated (Problematic)", rotated)
		cv2.waitKey(0)

	# loop over the rotation angles again, this time ensure the
	# entire pill is still within the ROI after rotation
	for angle in np.arange(0, 360, 15):
		rotated = imutils.rotate_bound(imageROI, angle)
		cv2.imshow("Rotated (Correct)", rotated)
		cv2.waitKey(0)

After downloading the source code to this tutorial using the “Downloads” section below, you can execute the following command to see the output:

$ python rotate_pills.py --image images/pill_01.png

The output of imutils.rotate will look like:
Figure 9: Incorrectly rotating an image with OpenCV causes parts of the image to be cut off.

Notice how the pill is cut off during the rotation process — we need to explicitly compute the new dimensions of the rotated image to ensure the borders are not cut off.
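
To make that fix concrete, here is a minimal sketch of bound-aware rotation; it mirrors the idea behind imutils.rotate_bound, although the helper name here is my own:

# expand the output canvas to fit the rotated image, then shift the
# rotation matrix so the result stays centered
import numpy as np
import cv2

def rotate_bound_sketch(image, angle):
	(h, w) = image.shape[:2]
	(cX, cY) = (w / 2.0, h / 2.0)

	# grab the rotation matrix, then extract its cosine and sine
	M = cv2.getRotationMatrix2D((cX, cY), -angle, 1.0)
	cos = np.abs(M[0, 0])
	sin = np.abs(M[0, 1])

	# compute the new bounding dimensions of the rotated image
	nW = int((h * sin) + (w * cos))
	nH = int((h * cos) + (w * sin))

	# adjust the translation component of the matrix so the rotated
	# image remains centered within the new canvas
	M[0, 2] += (nW / 2) - cX
	M[1, 2] += (nH / 2) - cY

	return cv2.warpAffine(image, M, (nW, nH))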

By using imutils.rotate_bound, we can ensure that no part of the image is cut off when using OpenCV:
Figure 10: By modifying OpenCV’s rotation matrix we can resolve the issue and ensure the entire image is visible.

Using this function I was finally able to finish my research for the winter break — but not before I felt quite embarrassed about my rookie mistake.

Summary

In today’s blog post I discussed how image borders can be cut off when rotating images with OpenCV and cv2.warpAffine.

The fact that image borders can be cut off is not a bug in OpenCV — in fact, it’s how cv2.getRotationMatrix2D and cv2.warpAffine are designed.

While it may seem frustrating and cumbersome to compute new image dimensions to ensure you don’t lose your borders, it’s actually a blessing in disguise.

OpenCV gives us so much control that we can modify our rotation matrix to make it do exactly what we want.

Of course, this requires us to know how our rotation matrix M is formed and what each of its components represents (discussed earlier in this tutorial). Provided we understand this, the math falls out naturally.
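
For reference, the OpenCV documentation defines the matrix returned by cv2.getRotationMatrix2D for a rotation angle $\theta$, scale $s$, and center $(c_x, c_y)$ as:

M = \begin{bmatrix} \alpha & \beta & (1-\alpha)\,c_x - \beta\,c_y \\ -\beta & \alpha & \beta\,c_x + (1-\alpha)\,c_y \end{bmatrix}, \qquad \alpha = s\cos\theta, \quad \beta = s\sin\theta

The left 2×2 block performs the rotation and scaling, while the third column translates the result so the rotation happens about the supplied center rather than the origin; it is exactly this translation column that imutils.rotate_bound shifts (as in the sketch above) to keep the entire rotated image inside the new canvas.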

To learn more about image processing and computer vision, be sure to take a look at the PyImageSearch Gurus course where I discuss these topics in more detail.

Otherwise, I encourage you to enter your email address in the form below to be notified when future blog posts are published.

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 11-page Resource Guide on Computer Vision and Image Search Engines, including exclusive techniques that I don’t post on this blog! Sound good? If so, enter your email address and I’ll send you the code immediately!

The post Rotate images (correctly) with OpenCV and Python appeared first on PyImageSearch.

Faster video file FPS with cv2.VideoCapture and OpenCV


Have you ever worked with a video file via OpenCV’s cv2.VideoCapture function and found that reading frames just felt slow and sluggish?

I’ve been there — and I know exactly how it feels.

Your entire video processing pipeline crawls along, unable to process more than one or two frames per second — even though you aren’t doing any type of computationally expensive image processing operations.

Why is that?

Why, at times, does it seem like an eternity for cv2.VideoCapture and the associated .read method to poll another frame from your video file?

The answer is almost always video compression and frame decoding.

Depending on your video file type, the codecs you have installed, and the physical hardware of your machine, much of your video processing pipeline can actually be consumed by reading and decoding the next frame in the video file.

That’s just computationally wasteful — and there is a better way.

In the remainder of today’s blog post, I’ll demonstrate how to use threading and a queue data structure to improve your video file FPS rate by over 52%!

Looking for the source code to this post?
Jump right to the downloads section.

Faster video file FPS with cv2.VideoCapture and OpenCV

When working with video files and OpenCV you are likely using the cv2.VideoCapture function.

First, you instantiate your cv2.VideoCapture object by passing in the path to your input video file.

Then you start a loop, calling the .read method of cv2.VideoCapture to poll the next frame from the video file so you can process it in your pipeline.

The problem (and the reason why this method can feel slow and sluggish) is that you’re both reading and decoding the frame in your main processing thread!

As I’ve mentioned in previous posts, the .read method is a blocking operation — the main thread of your Python + OpenCV application is entirely blocked (i.e., stalled) until the frame is read from the video file, decoded, and returned to the calling function.

By moving these blocking I/O operations to a separate thread and maintaining a queue of decoded frames we can actually improve our FPS processing rate by over 52%!

This increase in frame processing rate (and therefore our overall video processing pipeline) comes from dramatically reducing latency — we don’t have to wait for the .read method to finish reading and decoding a frame; instead, there is always a pre-decoded frame ready for us to process.

To accomplish this latency decrease our goal will be to move the reading and decoding of video file frames to an entirely separate thread of the program, freeing up our main thread to handle the actual image processing.

But before we can appreciate the faster, threaded method to video frame processing, we first need to set a benchmark/baseline with the slower, non-threaded version.

The slow, naive method to reading video frames with OpenCV

The goal of this section is to obtain a baseline on our video frame processing throughput rate using OpenCV and Python.

To start, open up a new file, name it read_frames_slow.py, and insert the following code:
# import the necessary packages
from imutils.video import FPS
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", required=True,
	help="path to input video file")
args = vars(ap.parse_args())

# open a pointer to the video stream and start the FPS timer
stream = cv2.VideoCapture(args["video"])
fps = FPS().start()

Lines 2-6 import our required Python packages. We’ll be using my imutils library, a series of convenience functions to make image and video processing operations easier with OpenCV and Python.

If you don’t already have imutils installed or if you are using a previous version, you can install/upgrade imutils by using the following command:
$ pip install --upgrade imutils

Lines 9-12 then parse our command line arguments. We only need a single switch for this script, --video, which is the path to our input video file.

Line 15 opens a pointer to the --video file using the cv2.VideoCapture class while Line 16 starts a timer that we can use to measure FPS, or more specifically, the throughput rate of our video processing pipeline.

With cv2.VideoCapture instantiated, we can start reading frames from the video file and processing them one-by-one:
# import the necessary packages
from imutils.video import FPS
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", required=True,
	help="path to input video file")
args = vars(ap.parse_args())

# open a pointer to the video stream and start the FPS timer
stream = cv2.VideoCapture(args["video"])
fps = FPS().start()

# loop over frames from the video file stream
while True:
	# grab the frame from the threaded video file stream
	(grabbed, frame) = stream.read()

	# if the frame was not grabbed, then we have reached the end
	# of the stream
	if not grabbed:
		break

	# resize the frame and convert it to grayscale (while still
	# retaining 3 channels)
	frame = imutils.resize(frame, width=450)
	frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
	frame = np.dstack([frame, frame, frame])

	# display a piece of text to the frame (so we can benchmark
	# fairly against the fast method)
	cv2.putText(frame, "Slow Method", (10, 30),
		cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)	

	# show the frame and update the FPS counter
	cv2.imshow("Frame", frame)
	cv2.waitKey(1)
	fps.update()

On Line 19 we start looping over the frames of our video file.

A call to the .read method on Line 21 returns a 2-tuple containing:

  1. grabbed: A boolean indicating if the frame was successfully read or not.
  2. frame: The actual video frame itself.

If grabbed is False then we know we have reached the end of the video file and can break from the loop (Lines 25 and 26).

Otherwise, we perform some basic image processing tasks, including:

  1. Resizing the frame to have a width of 450 pixels.
  2. Converting the frame to grayscale.
  3. Drawing the text on the frame via the cv2.putText method. We do this because we’ll be using the cv2.putText function to display our queue size in the fast, threaded example below and want to have a fair, comparable pipeline.

Lines 40-42 display the frame to our screen and update our FPS counter.

The final code block handles computing the approximate FPS/frame rate throughput of our pipeline, releasing the video stream pointer, and closing any open windows:

# import the necessary packages
from imutils.video import FPS
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", required=True,
	help="path to input video file")
args = vars(ap.parse_args())

# open a pointer to the video stream and start the FPS timer
stream = cv2.VideoCapture(args["video"])
fps = FPS().start()

# loop over frames from the video file stream
while True:
	# grab the frame from the threaded video file stream
	(grabbed, frame) = stream.read()

	# if the frame was not grabbed, then we have reached the end
	# of the stream
	if not grabbed:
		break

	# resize the frame and convert it to grayscale (while still
	# retaining 3 channels)
	frame = imutils.resize(frame, width=450)
	frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
	frame = np.dstack([frame, frame, frame])

	# display a piece of text to the frame (so we can benchmark
	# fairly against the fast method)
	cv2.putText(frame, "Slow Method", (10, 30),
		cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)	

	# show the frame and update the FPS counter
	cv2.imshow("Frame", frame)
	cv2.waitKey(1)
	fps.update()

# stop the timer and display FPS information
fps.stop()
print("[INFO] elasped time: {:.2f}".format(fps.elapsed()))
print("[INFO] approx. FPS: {:.2f}".format(fps.fps()))

# do a bit of cleanup
stream.release()
cv2.destroyAllWindows()

To execute this script, be sure to download the source code + example video to this blog post using the “Downloads” section at the bottom of the tutorial.

For this example we’ll be using the first 31 seconds of the Jurassic Park trailer (the .mp4 file is included in the code download).

Let’s go ahead and obtain a baseline for frame processing throughput on this example video:

$ python read_frames_slow.py --video videos/jurassic_park_intro.mp4

Figure 1: The slow, naive method to read frames from a video file using Python and OpenCV.

As you can see, processing each individual frame of the 31-second video clip takes approximately 47 seconds with an FPS processing rate of 20.21.

These results imply that it’s actually taking longer to read and decode the individual frames than the actual length of the video clip!

To see how we can speedup our frame processing throughput, take a look at the technique I describe in the next section.

Using threading to buffer frames with OpenCV

To improve the FPS processing rate of frames read from video files with OpenCV we are going to utilize threading and the queue data structure:

Figure 2: An example of the queue data structure. New data is enqueued to the back of the list while older data is dequeued from the front of the list. (source: Wikipedia)

Since the .read method of cv2.VideoCapture is a blocking I/O operation we can obtain a significant speedup simply by creating a separate thread from our main Python script that is solely responsible for reading frames from the video file and maintaining a queue.

Since Python’s Queue data structure is thread safe, much of the hard work is done for us already — we just need to put all the pieces together.
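
To make the pattern concrete before we dive into the class, here is a minimal, self-contained producer/consumer sketch (a hypothetical example of my own, not part of imutils):

# the producer thread fills the queue while the main thread consumes;
# Queue handles all of the locking for us
from threading import Thread
from queue import Queue  # on Python 2.7 this would be `from Queue import Queue`

q = Queue(maxsize=8)

def producer():
	for i in range(32):
		q.put(i)  # blocks automatically whenever the queue is full

t = Thread(target=producer)
t.daemon = True
t.start()

# consume all 32 items in the main thread
for _ in range(32):
	item = q.get()  # blocks until an item is available
	print(item)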

I’ve already implemented the FileVideoStream class in imutils but we’re going to review the code so you can understand what’s going on under the hood:

# import the necessary packages
from threading import Thread
import sys
import cv2

# import the Queue class from Python 3
if sys.version_info >= (3, 0):
	from queue import Queue

# otherwise, import the Queue class for Python 2.7
else:
	from Queue import Queue

Lines 2-4 handle importing our required Python packages. The Thread class is used to create and start threads in the Python programming language.

We need to take special care when importing the Queue data structure as the name of the queue package is different based on which Python version you are using (Lines 7-12).

We can now define the constructor to FileVideoStream:
# import the necessary packages
from threading import Thread
import sys
import cv2

# import the Queue class from Python 3
if sys.version_info >= (3, 0):
	from queue import Queue

# otherwise, import the Queue class for Python 2.7
else:
	from Queue import Queue

class FileVideoStream:
	def __init__(self, path, queueSize=128):
		# initialize the file video stream along with the boolean
		# used to indicate if the thread should be stopped or not
		self.stream = cv2.VideoCapture(path)
		self.stopped = False

		# initialize the queue used to store frames read from
		# the video file
		self.Q = Queue(maxsize=queueSize)

Our constructor takes a single required argument followed by an optional one:

  • path: The path to our input video file.
  • queueSize: The maximum number of frames to store in the queue. This value defaults to 128 frames, but depending on (1) the frame dimensions of your video and (2) the amount of memory you can spare, you may want to raise/lower this value.
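
As a quick back-of-the-envelope check on that trade-off: a 1920×1080 BGR frame stored as unsigned 8-bit integers occupies 1920 × 1080 × 3 bytes ≈ 6.2MB, so a completely full 128-frame queue would hold roughly 800MB of RAM (the queue stores the full-resolution decoded frames; resizing happens only after a frame is read from the queue).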

Line 18 instantiates our cv2.VideoCapture object by passing in the video path.

We then initialize a boolean to indicate if the threading process should be stopped (Line 19) along with our actual Queue data structure (Line 23).

To kick off the thread, we’ll next define the start method:
# import the necessary packages
from threading import Thread
import sys
import cv2

# import the Queue class from Python 3
if sys.version_info >= (3, 0):
	from queue import Queue

# otherwise, import the Queue class for Python 2.7
else:
	from Queue import Queue

class FileVideoStream:
	def __init__(self, path, queueSize=128):
		# initialize the file video stream along with the boolean
		# used to indicate if the thread should be stopped or not
		self.stream = cv2.VideoCapture(path)
		self.stopped = False

		# initialize the queue used to store frames read from
		# the video file
		self.Q = Queue(maxsize=queueSize)

	def start(self):
		# start a thread to read frames from the file video stream
		t = Thread(target=self.update, args=())
		t.daemon = True
		t.start()
		return self

This method simply starts a thread separate from the main thread. This thread will call the .update method (which we’ll define in the next code block).

The update method is responsible for reading and decoding frames from the video file, along with maintaining the actual queue data structure:
# import the necessary packages
from threading import Thread
import sys
import cv2

# import the Queue class from Python 3
if sys.version_info >= (3, 0):
	from queue import Queue

# otherwise, import the Queue class for Python 2.7
else:
	from Queue import Queue

class FileVideoStream:
	def __init__(self, path, queueSize=128):
		# initialize the file video stream along with the boolean
		# used to indicate if the thread should be stopped or not
		self.stream = cv2.VideoCapture(path)
		self.stopped = False

		# initialize the queue used to store frames read from
		# the video file
		self.Q = Queue(maxsize=queueSize)

	def start(self):
		# start a thread to read frames from the file video stream
		t = Thread(target=self.update, args=())
		t.daemon = True
		t.start()
		return self

	def update(self):
		# keep looping infinitely
		while True:
			# if the thread indicator variable is set, stop the
			# thread
			if self.stopped:
				return

			# otherwise, ensure the queue has room in it
			if not self.Q.full():
				# read the next frame from the file
				(grabbed, frame) = self.stream.read()

				# if the `grabbed` boolean is `False`, then we have
				# reached the end of the video file
				if not grabbed:
					self.stop()
					return

				# add the frame to the queue
				self.Q.put(frame)

On the surface, this code is very similar to our example in the slow, naive method detailed above.

The key takeaway here is that this code is actually running in a separate thread — this is where our actual FPS processing rate increase comes from.

On Line 34 we start looping over the frames in the video file.

If the stopped indicator is set, we exit the thread (Lines 37 and 38).

If our queue is not full we read the next frame from the video stream, check to see if we have reached the end of the video file, and then update the queue (Lines 41-52).

The read method will handle returning the next frame in the queue:
# import the necessary packages
from threading import Thread
import sys
import cv2

# import the Queue class from Python 3
if sys.version_info >= (3, 0):
	from queue import Queue

# otherwise, import the Queue class for Python 2.7
else:
	from Queue import Queue

class FileVideoStream:
	def __init__(self, path, queueSize=128):
		# initialize the file video stream along with the boolean
		# used to indicate if the thread should be stopped or not
		self.stream = cv2.VideoCapture(path)
		self.stopped = False

		# initialize the queue used to store frames read from
		# the video file
		self.Q = Queue(maxsize=queueSize)

	def start(self):
		# start a thread to read frames from the file video stream
		t = Thread(target=self.update, args=())
		t.daemon = True
		t.start()
		return self

	def update(self):
		# keep looping infinitely
		while True:
			# if the thread indicator variable is set, stop the
			# thread
			if self.stopped:
				return

			# otherwise, ensure the queue has room in it
			if not self.Q.full():
				# read the next frame from the file
				(grabbed, frame) = self.stream.read()

				# if the `grabbed` boolean is `False`, then we have
				# reached the end of the video file
				if not grabbed:
					self.stop()
					return

				# add the frame to the queue
				self.Q.put(frame)

	def read(self):
		# return next frame in the queue
		return self.Q.get()

We’ll create a convenience function named more that will return True if there are still more frames in the queue (and False otherwise):
# import the necessary packages
from threading import Thread
import sys
import cv2

# import the Queue class from Python 3
if sys.version_info >= (3, 0):
	from queue import Queue

# otherwise, import the Queue class for Python 2.7
else:
	from Queue import Queue

class FileVideoStream:
	def __init__(self, path, queueSize=128):
		# initialize the file video stream along with the boolean
		# used to indicate if the thread should be stopped or not
		self.stream = cv2.VideoCapture(path)
		self.stopped = False

		# initialize the queue used to store frames read from
		# the video file
		self.Q = Queue(maxsize=queueSize)

	def start(self):
		# start a thread to read frames from the file video stream
		t = Thread(target=self.update, args=())
		t.daemon = True
		t.start()
		return self

	def update(self):
		# keep looping infinitely
		while True:
			# if the thread indicator variable is set, stop the
			# thread
			if self.stopped:
				return

			# otherwise, ensure the queue has room in it
			if not self.Q.full():
				# read the next frame from the file
				(grabbed, frame) = self.stream.read()

				# if the `grabbed` boolean is `False`, then we have
				# reached the end of the video file
				if not grabbed:
					self.stop()
					return

				# add the frame to the queue
				self.Q.put(frame)

	def read(self):
		# return next frame in the queue
		return self.Q.get()

	def more(self):
		# return True if there are still frames in the queue
		return self.Q.qsize() > 0

And finally, the stop method will be called if we want to stop the thread prematurely (i.e., before we have reached the end of the video file):
# import the necessary packages
from threading import Thread
import sys
import cv2

# import the Queue class from Python 3
if sys.version_info >= (3, 0):
	from queue import Queue

# otherwise, import the Queue class for Python 2.7
else:
	from Queue import Queue

class FileVideoStream:
	def __init__(self, path, queueSize=128):
		# initialize the file video stream along with the boolean
		# used to indicate if the thread should be stopped or not
		self.stream = cv2.VideoCapture(path)
		self.stopped = False

		# initialize the queue used to store frames read from
		# the video file
		self.Q = Queue(maxsize=queueSize)

	def start(self):
		# start a thread to read frames from the file video stream
		t = Thread(target=self.update, args=())
		t.daemon = True
		t.start()
		return self

	def update(self):
		# keep looping infinitely
		while True:
			# if the thread indicator variable is set, stop the
			# thread
			if self.stopped:
				return

			# otherwise, ensure the queue has room in it
			if not self.Q.full():
				# read the next frame from the file
				(grabbed, frame) = self.stream.read()

				# if the `grabbed` boolean is `False`, then we have
				# reached the end of the video file
				if not grabbed:
					self.stop()
					return

				# add the frame to the queue
				self.Q.put(frame)

	def read(self):
		# return next frame in the queue
		return self.Q.get()

	def more(self):
		# return True if there are still frames in the queue
		return self.Q.qsize() > 0

	def stop(self):
		# indicate that the thread should be stopped
		self.stopped = True

The faster, threaded method to reading video frames with OpenCV

Now that we have defined our FileVideoStream class we can put all the pieces together and enjoy a faster, threaded video file read with OpenCV.

Open up a new file, name it read_frames_fast.py, and insert the following code:
# import the necessary packages
from imutils.video import FileVideoStream
from imutils.video import FPS
import numpy as np
import argparse
import imutils
import time
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", required=True,
	help="path to input video file")
args = vars(ap.parse_args())

# start the file video stream thread and allow the buffer to
# start to fill
print("[INFO] starting video file thread...")
fvs = FileVideoStream(args["video"]).start()
time.sleep(1.0)

# start the FPS timer
fps = FPS().start()

Lines 2-8 import our required Python packages. Notice how we are using the FileVideoStream class from the imutils library to facilitate faster frame reads with OpenCV.

Lines 11-14 parse our command line arguments. Just like the previous example, we only need a single switch, --video, the path to our input video file.

We then instantiate the FileVideoStream object and start the frame reading thread (Line 19).

Line 23 then starts the FPS timer.

Our next section handles reading frames from the FileVideoStream, processing them, and displaying them to our screen:
# import the necessary packages
from imutils.video import FileVideoStream
from imutils.video import FPS
import numpy as np
import argparse
import imutils
import time
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", required=True,
	help="path to input video file")
args = vars(ap.parse_args())

# start the file video stream thread and allow the buffer to
# start to fill
print("[INFO] starting video file thread...")
fvs = FileVideoStream(args["video"]).start()
time.sleep(1.0)

# start the FPS timer
fps = FPS().start()

# loop over frames from the video file stream
while fvs.more():
	# grab the frame from the threaded video file stream, resize
	# it, and convert it to grayscale (while still retaining 3
	# channels)
	frame = fvs.read()
	frame = imutils.resize(frame, width=450)
	frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
	frame = np.dstack([frame, frame, frame])

	# display the size of the queue on the frame
	cv2.putText(frame, "Queue Size: {}".format(fvs.Q.qsize()),
		(10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)	

	# show the frame and update the FPS counter
	cv2.imshow("Frame", frame)
	cv2.waitKey(1)
	fps.update()

We start a while loop on Line 26 that will keep grabbing frames from the FileVideoStream queue until the queue is empty.

For each of these frames we’ll apply the same image processing operations, including: resizing, conversion to grayscale, and displaying text on the frame (in this case, our text will be the number of frames in the queue).

The processed frame is displayed to our screen on Lines 40-42.

The last code block computes our FPS throughput rate and performs a bit of cleanup:

# import the necessary packages
from imutils.video import FileVideoStream
from imutils.video import FPS
import numpy as np
import argparse
import imutils
import time
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", required=True,
	help="path to input video file")
args = vars(ap.parse_args())

# start the file video stream thread and allow the buffer to
# start to fill
print("[INFO] starting video file thread...")
fvs = FileVideoStream(args["video"]).start()
time.sleep(1.0)

# start the FPS timer
fps = FPS().start()

# loop over frames from the video file stream
while fvs.more():
	# grab the frame from the threaded video file stream, resize
	# it, and convert it to grayscale (while still retaining 3
	# channels)
	frame = fvs.read()
	frame = imutils.resize(frame, width=450)
	frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
	frame = np.dstack([frame, frame, frame])

	# display the size of the queue on the frame
	cv2.putText(frame, "Queue Size: {}".format(fvs.Q.qsize()),
		(10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)	

	# show the frame and update the FPS counter
	cv2.imshow("Frame", frame)
	cv2.waitKey(1)
	fps.update()

# stop the timer and display FPS information
fps.stop()
print("[INFO] elasped time: {:.2f}".format(fps.elapsed()))
print("[INFO] approx. FPS: {:.2f}".format(fps.fps()))

# do a bit of cleanup
cv2.destroyAllWindows()
fvs.stop()

To see the results of the read_frames_fast.py script, make sure you download the source code + example video using the “Downloads” section at the bottom of this tutorial.

From there, execute the following command:

$ python read_frames_fast.py --video videos/jurassic_park_intro.mp4

Figure 3: Utilizing threading with cv2.VideoCapture and OpenCV leads to higher FPS and a larger throughput rate.

As we can see from the results we were able to process the entire 31 second video clip in 31.09 seconds — that’s an improvement of 34% from the slow, naive method!

The actual frame throughput processing rate is much faster, clocking in at 30.75 frames per second, an improvement of 52.15%.

Threading can dramatically improve the speed of your video processing pipeline — use it whenever you can.

What about built-in webcams, USB cameras, and the Raspberry Pi? What do I do then?

This post has focused on using threading to improve the frame processing rate of video files.

If you’re instead interested in speeding up the FPS of your built-in webcam, USB camera, or Raspberry Pi camera module, please refer to my earlier blog posts on increasing webcam and Raspberry Pi camera FPS.

Summary

In today’s tutorial I demonstrated how to use threading and a queue data structure to improve the FPS throughput rate of your video processing pipeline.

By placing the call to .read of a cv2.VideoCapture object in a thread separate from the main Python script we can avoid blocking I/O operations that would otherwise dramatically slow down our pipeline.

Finally, I provided an example comparing threading with no threading. The results show that by using threading we can improve our processing pipeline by up to 52%.

However, keep in mind that the more steps (i.e., function calls) you make inside your while loop, the more computation needs to be done — therefore, your actual frames per second rate will drop, but you’ll still be processing faster than the non-threaded version.

To be notified when future blog posts are published, be sure to enter your email address in the form below!

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 11-page Resource Guide on Computer Vision and Image Search Engines, including exclusive techniques that I don’t post on this blog! Sound good? If so, enter your email address and I’ll send you the code immediately!

The post Faster video file FPS with cv2.VideoCapture and OpenCV appeared first on PyImageSearch.

Recognizing digits with OpenCV and Python


Today’s tutorial is inspired by a post I saw a few weeks back on /r/computervision asking how to recognize digits in an image containing a thermostat identical to the one at the top of this post.

As Reddit users were quick to point out, utilizing computer vision to recognize digits on a thermostat tends to overcomplicate the problem — a simple data logging thermometer would give much more reliable results with a fraction of the effort.

On the other hand, applying computer vision to projects such as these is really good practice.

Whether you are just getting started with computer vision/OpenCV, or you’re already writing computer vision code on a daily basis, taking the time to hone your skills on mini-projects is paramount to mastering your trade — in fact, I find it so important that I do exercises like this one twice a month.

Every other Friday afternoon I block off two hours on my calendar and practice my basic image processing and computer vision skills on computer vision/OpenCV questions I’ve found on Reddit or StackOverflow.

Doing this exercise helps me keep my skills sharp — it also has the added benefit of making great blog post content.

In the remainder of today’s blog post, I’ll demonstrate how to recognize digits in images using OpenCV and Python.

Looking for the source code to this post?
Jump right to the downloads section.

Recognizing digits with OpenCV and Python

In the first part of this tutorial, we’ll discuss what a seven-segment display is and how we can apply computer vision and image processing operations to recognize these types of digits (no machine learning required!).

From there I’ll provide actual Python and OpenCV code that can be used to recognize these digits in images.

The seven-segment display

You’re likely already familiar with a seven-segment display, even if you don’t recognize the particular term.

A great example of such a display is your classic digital alarm clock:

Figure 1: A classic digital alarm clock that contains four seven-segment displays to represent the time of day.

Each digit on the alarm clock is represented by a seven-segment component just like the one below:

Figure 2: An example of a single seven-segment display. Each segment can be turned “on” or “off” to represent a particular digit (source: Wikipedia).

Seven-segment displays can take on a total of 128 possible states (each of the seven segments can independently be “on” or “off”, giving 2^7 = 128 combinations):

Figure 3: A seven-segment display is capable of 128 possible states (source: Wikipedia).

Luckily for us, we are only interested in ten of them — the digits zero to nine:

Figure 4: For the task of digit recognition we only need to recognize ten of these states.

Our goal is to write OpenCV and Python code to recognize each of these ten digit states in an image.

Planning the OpenCV digit recognizer

Just like in the original post on /r/computervision, we’ll be using the thermostat image as input:

Figure 5: Our example input image. Our goal is to recognize the digits on the thermostat using OpenCV and Python.

Whenever I am trying to recognize/identify object(s) in an image I first take a few minutes to assess the problem. Given that my end goal is to recognize the digits on the LCD display I know I need to:

  • Step #1: Localize the LCD on the thermostat. This can be done using edge detection since there is enough contrast between the plastic shell and the LCD.
  • Step #2: Extract the LCD. Given an input edge map I can find contours and look for outlines with a rectangular shape — the largest rectangular region should correspond to the LCD. A perspective transform will give me a nice extraction of the LCD.
  • Step #3: Extract the digit regions. Once I have the LCD itself I can focus on extracting the digits. Since there seems to be contrast between the digit regions and the background of the LCD I’m confident that thresholding and morphological operations can accomplish this.
  • Step #4: Identify the digits. Recognizing the actual digits with OpenCV will involve dividing the digit ROI into seven segments. From there I can apply pixel counting on the thresholded image to determine if a given segment is “on” or “off”.

To see how we can accomplish this four-step process to digit recognition with OpenCV and Python, keep reading.

Recognizing digits with computer vision and OpenCV

Let’s go ahead and get this example started.

Open up a new file, name it recognize_digits.py, and insert the following code:
# import the necessary packages
from imutils.perspective import four_point_transform
from imutils import contours
import imutils
import cv2

# define the dictionary of digit segments so we can identify
# each digit on the thermostat
DIGITS_LOOKUP = {
	(1, 1, 1, 0, 1, 1, 1): 0,
	(0, 0, 1, 0, 0, 1, 0): 1,
	(1, 0, 1, 1, 1, 1, 0): 2,
	(1, 0, 1, 1, 0, 1, 1): 3,
	(0, 1, 1, 1, 0, 1, 0): 4,
	(1, 1, 0, 1, 0, 1, 1): 5,
	(1, 1, 0, 1, 1, 1, 1): 6,
	(1, 0, 1, 0, 0, 1, 0): 7,
	(1, 1, 1, 1, 1, 1, 1): 8,
	(1, 1, 1, 1, 0, 1, 1): 9
}

Lines 2-5 import our required Python packages. We’ll be using imutils, my series of convenience functions to make working with OpenCV + Python easier. If you don’t already have imutils installed, you should take a second now to install the package on your system using pip:
$ pip install imutils

Lines 9-20 define a Python dictionary named DIGITS_LOOKUP. Inspired by the approach of /u/Jonno_FTW in the Reddit thread, we can easily define this lookup table where:

  1. The key to the table is the seven-segment array. A one in the array indicates that the given segment is on and a zero indicates that the segment is off.
  2. The value is the actual numerical digit itself: 0-9.

Once we identify the segments in the thermostat display we can pass the array into our DIGITS_LOOKUP table and obtain the digit value.

For reference, this dictionary uses the same segment ordering as in Figure 2 above.
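
For example, the digit zero lights every segment except the center one; looking that state up in the table works like this (note that the keys are tuples, so a list of segment states needs to be converted first):

# segment order: top, top-left, top-right, center, bottom-left,
# bottom-right, bottom; everything but the center is lit for a zero
on = [1, 1, 1, 0, 1, 1, 1]
digit = DIGITS_LOOKUP[tuple(on)]
print(digit)  # prints 0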

Let’s continue with our example:

# import the necessary packages
from imutils.perspective import four_point_transform
from imutils import contours
import imutils
import cv2

# define the dictionary of digit segments so we can identify
# each digit on the thermostat
DIGITS_LOOKUP = {
	(1, 1, 1, 0, 1, 1, 1): 0,
	(0, 0, 1, 0, 0, 1, 0): 1,
	(1, 0, 1, 1, 1, 1, 0): 2,
	(1, 0, 1, 1, 0, 1, 1): 3,
	(0, 1, 1, 1, 0, 1, 0): 4,
	(1, 1, 0, 1, 0, 1, 1): 5,
	(1, 1, 0, 1, 1, 1, 1): 6,
	(1, 0, 1, 0, 0, 1, 0): 7,
	(1, 1, 1, 1, 1, 1, 1): 8,
	(1, 1, 1, 1, 0, 1, 1): 9
}

# load the example image
image = cv2.imread("example.jpg")

# pre-process the image by resizing it, converting it to
# grayscale, blurring it, and computing an edge map
image = imutils.resize(image, height=500)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 50, 200, 255)

Line 23 loads our image from disk.

We then pre-process the image on Lines 27-30 by:

  • Resizing it.
  • Converting the image to grayscale.
  • Applying Gaussian blurring with a 5×5 kernel to reduce high-frequency noise.
  • Computing the edge map via the Canny edge detector.

After applying these pre-processing steps our edge map looks like this:

Figure 6: Applying image processing steps to compute the edge map of our input image.

Notice how the outlines of the LCD are clearly visible — this accomplishes Step #1.

We can now move on to Step #2, extracting the LCD itself:

# import the necessary packages
from imutils.perspective import four_point_transform
from imutils import contours
import imutils
import cv2

# define the dictionary of digit segments so we can identify
# each digit on the thermostat
DIGITS_LOOKUP = {
	(1, 1, 1, 0, 1, 1, 1): 0,
	(0, 0, 1, 0, 0, 1, 0): 1,
	(1, 0, 1, 1, 1, 1, 0): 2,
	(1, 0, 1, 1, 0, 1, 1): 3,
	(0, 1, 1, 1, 0, 1, 0): 4,
	(1, 1, 0, 1, 0, 1, 1): 5,
	(1, 1, 0, 1, 1, 1, 1): 6,
	(1, 0, 1, 0, 0, 1, 0): 7,
	(1, 1, 1, 1, 1, 1, 1): 8,
	(1, 1, 1, 1, 0, 1, 1): 9
}

# load the example image
image = cv2.imread("example.jpg")

# pre-process the image by resizing it, converting it to
# grayscale, blurring it, and computing an edge map
image = imutils.resize(image, height=500)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 50, 200, 255)

# find contours in the edge map, then sort them by their
# size in descending order
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
displayCnt = None

# loop over the contours
for c in cnts:
	# approximate the contour
	peri = cv2.arcLength(c, True)
	approx = cv2.approxPolyDP(c, 0.02 * peri, True)

	# if the contour has four vertices, then we have found
	# the thermostat display
	if len(approx) == 4:
		displayCnt = approx
		break

In order to find the LCD regions, we need to extract the contours (i.e., outlines) of the regions in the edge map (Lines 34 and 35).

We then sort the contours by their area, ensuring that contours with a larger area are placed at the front of the list (Line 37).

Given our sorted contours list, we loop over them individually on Line 41 and apply contour approximation.

If our approximated contour has four vertices then we assume we have found the thermostat display (Lines 48-50). This is a reasonable assumption since the largest rectangular region in our input image should be the LCD itself.

After obtaining the four vertices we can extract the LCD via a four point perspective transform:

# import the necessary packages
from imutils.perspective import four_point_transform
from imutils import contours
import imutils
import cv2

# define the dictionary of digit segments so we can identify
# each digit on the thermostat
DIGITS_LOOKUP = {
	(1, 1, 1, 0, 1, 1, 1): 0,
	(0, 0, 1, 0, 0, 1, 0): 1,
	(1, 0, 1, 1, 1, 1, 0): 2,
	(1, 0, 1, 1, 0, 1, 1): 3,
	(0, 1, 1, 1, 0, 1, 0): 4,
	(1, 1, 0, 1, 0, 1, 1): 5,
	(1, 1, 0, 1, 1, 1, 1): 6,
	(1, 0, 1, 0, 0, 1, 0): 7,
	(1, 1, 1, 1, 1, 1, 1): 8,
	(1, 1, 1, 1, 0, 1, 1): 9
}

# load the example image
image = cv2.imread("example.jpg")

# pre-process the image by resizing it, converting it to
# grayscale, blurring it, and computing an edge map
image = imutils.resize(image, height=500)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 50, 200, 255)

# find contours in the edge map, then sort them by their
# size in descending order
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
displayCnt = None

# loop over the contours
for c in cnts:
	# approximate the contour
	peri = cv2.arcLength(c, True)
	approx = cv2.approxPolyDP(c, 0.02 * peri, True)

	# if the contour has four vertices, then we have found
	# the thermostat display
	if len(approx) == 4:
		displayCnt = approx
		break

# extract the thermostat display, apply a perspective transform
# to it
warped = four_point_transform(gray, displayCnt.reshape(4, 2))
output = four_point_transform(image, displayCnt.reshape(4, 2))

Applying this perspective transform gives us a top-down, birds-eye-view of the LCD:

Figure 7: Applying a perspective transform to our image to obtain the LCD region.

Obtaining this view of the LCD satisfies Step #2 — we are now ready to extract the digits from the LCD:

# import the necessary packages
from imutils.perspective import four_point_transform
from imutils import contours
import imutils
import cv2

# define the dictionary of digit segments so we can identify
# each digit on the thermostat
DIGITS_LOOKUP = {
	(1, 1, 1, 0, 1, 1, 1): 0,
	(0, 0, 1, 0, 0, 1, 0): 1,
	(1, 0, 1, 1, 1, 1, 0): 2,
	(1, 0, 1, 1, 0, 1, 1): 3,
	(0, 1, 1, 1, 0, 1, 0): 4,
	(1, 1, 0, 1, 0, 1, 1): 5,
	(1, 1, 0, 1, 1, 1, 1): 6,
	(1, 0, 1, 0, 0, 1, 0): 7,
	(1, 1, 1, 1, 1, 1, 1): 8,
	(1, 1, 1, 1, 0, 1, 1): 9
}

# load the example image
image = cv2.imread("example.jpg")

# pre-process the image by resizing it, converting it to
# grayscale, blurring it, and computing an edge map
image = imutils.resize(image, height=500)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 50, 200, 255)

# find contours in the edge map, then sort them by their
# size in descending order
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
displayCnt = None

# loop over the contours
for c in cnts:
	# approximate the contour
	peri = cv2.arcLength(c, True)
	approx = cv2.approxPolyDP(c, 0.02 * peri, True)

	# if the contour has four vertices, then we have found
	# the thermostat display
	if len(approx) == 4:
		displayCnt = approx
		break

# extract the thermostat display, apply a perspective transform
# to it
warped = four_point_transform(gray, displayCnt.reshape(4, 2))
output = four_point_transform(image, displayCnt.reshape(4, 2))

# threshold the warped image, then apply a series of morphological
# operations to cleanup the thresholded image
thresh = cv2.threshold(warped, 0, 255,
	cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (1, 5))
thresh = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)

To obtain the digits themselves we need to threshold the warped image (Lines 59 and 60) to reveal the dark regions (i.e., digits) against the lighter background (i.e., the background of the LCD display):
Figure 8: Thresholding the LCD allows us to segment the dark regions (digits/symbols) from the lighter background (the LCD display itself).

We then apply a series of morphological operations to clean up the thresholded image (Lines 61 and 62):

Figure 9: Applying a series of morphological operations cleans up our thresholded LCD and will allow us to segment out each of the digits.

Now that we have a nice segmented image we once again need to apply contour filtering, only this time we are looking for the actual digits:

# import the necessary packages
from imutils.perspective import four_point_transform
from imutils import contours
import imutils
import cv2

# define the dictionary of digit segments so we can identify
# each digit on the thermostat
DIGITS_LOOKUP = {
	(1, 1, 1, 0, 1, 1, 1): 0,
	(0, 0, 1, 0, 0, 1, 0): 1,
	(1, 0, 1, 1, 1, 1, 0): 2,
	(1, 0, 1, 1, 0, 1, 1): 3,
	(0, 1, 1, 1, 0, 1, 0): 4,
	(1, 1, 0, 1, 0, 1, 1): 5,
	(1, 1, 0, 1, 1, 1, 1): 6,
	(1, 0, 1, 0, 0, 1, 0): 7,
	(1, 1, 1, 1, 1, 1, 1): 8,
	(1, 1, 1, 1, 0, 1, 1): 9
}

# load the example image
image = cv2.imread("example.jpg")

# pre-process the image by resizing it, converting it to
# grayscale, blurring it, and computing an edge map
image = imutils.resize(image, height=500)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 50, 200, 255)

# find contours in the edge map, then sort them by their
# size in descending order
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
displayCnt = None

# loop over the contours
for c in cnts:
	# approximate the contour
	peri = cv2.arcLength(c, True)
	approx = cv2.approxPolyDP(c, 0.02 * peri, True)

	# if the contour has four vertices, then we have found
	# the thermostat display
	if len(approx) == 4:
		displayCnt = approx
		break

# extract the thermostat display, apply a perspective transform
# to it
warped = four_point_transform(gray, displayCnt.reshape(4, 2))
output = four_point_transform(image, displayCnt.reshape(4, 2))

# threshold the warped image, then apply a series of morphological
# operations to cleanup the thresholded image
thresh = cv2.threshold(warped, 0, 255,
	cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (1, 5))
thresh = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)

# find contours in the thresholded image, then initialize the
# digit contours lists
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]
digitCnts = []

# loop over the digit area candidates
for c in cnts:
	# compute the bounding box of the contour
	(x, y, w, h) = cv2.boundingRect(c)

	# if the contour is sufficiently large, it must be a digit
	if w >= 15 and (h >= 30 and h <= 40):
		digitCnts.append(c)

To accomplish this we find contours in our thresholded image (Lines 66 and 67). We also initialize the digitCnts list on Line 69 — this list will store the contours of the digits themselves.

Line 72 starts looping over each of the contours.

For each contour, we compute the bounding box (Line 74), ensure the width and height are of an acceptable size, and if so, update the digitCnts list (Lines 77 and 78).

Note: Determining the appropriate width and height constraints requires a few rounds of trial and error. I would suggest looping over each of the contours, drawing them individually, and inspecting their dimensions. Doing this process ensures you can find commonalities across digit contour properties.
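
A minimal sketch of that trial-and-error loop might look like the following (my own debugging snippet, not part of the original script; it assumes the cnts and output variables from the code above):

# draw and inspect each candidate contour one at a time so we can
# eyeball reasonable width/height thresholds
for (i, c) in enumerate(cnts):
	(x, y, w, h) = cv2.boundingRect(c)
	print("contour #{}: w={}, h={}".format(i + 1, w, h))
	clone = output.copy()
	cv2.rectangle(clone, (x, y), (x + w, y + h), (0, 255, 0), 1)
	cv2.imshow("Candidate", clone)
	cv2.waitKey(0)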

If we were to loop over the contours inside digitCnts and draw the bounding box on our image, the result would look like this:
Figure 10: Drawing the bounding box of each of the digits on the LCD.

Sure enough, we have found the digits on the LCD!

The final step is to actually identify each of the digits:

# import the necessary packages
from imutils.perspective import four_point_transform
from imutils import contours
import imutils
import cv2

# define the dictionary of digit segments so we can identify
# each digit on the thermostat
DIGITS_LOOKUP = {
	(1, 1, 1, 0, 1, 1, 1): 0,
	(0, 0, 1, 0, 0, 1, 0): 1,
	(1, 0, 1, 1, 1, 1, 0): 2,
	(1, 0, 1, 1, 0, 1, 1): 3,
	(0, 1, 1, 1, 0, 1, 0): 4,
	(1, 1, 0, 1, 0, 1, 1): 5,
	(1, 1, 0, 1, 1, 1, 1): 6,
	(1, 0, 1, 0, 0, 1, 0): 7,
	(1, 1, 1, 1, 1, 1, 1): 8,
	(1, 1, 1, 1, 0, 1, 1): 9
}

# load the example image
image = cv2.imread("example.jpg")

# pre-process the image by resizing it, converting it to
# grayscale, blurring it, and computing an edge map
image = imutils.resize(image, height=500)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 50, 200, 255)

# find contours in the edge map, then sort them by their
# size in descending order
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
displayCnt = None

# loop over the contours
for c in cnts:
	# approximate the contour
	peri = cv2.arcLength(c, True)
	approx = cv2.approxPolyDP(c, 0.02 * peri, True)

	# if the contour has four vertices, then we have found
	# the thermostat display
	if len(approx) == 4:
		displayCnt = approx
		break

# extract the thermostat display, apply a perspective transform
# to it
warped = four_point_transform(gray, displayCnt.reshape(4, 2))
output = four_point_transform(image, displayCnt.reshape(4, 2))

# threshold the warped image, then apply a series of morphological
# operations to cleanup the thresholded image
thresh = cv2.threshold(warped, 0, 255,
	cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (1, 5))
thresh = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)

# find contours in the thresholded image, then initialize the
# digit contours lists
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]
digitCnts = []

# loop over the digit area candidates
for c in cnts:
	# compute the bounding box of the contour
	(x, y, w, h) = cv2.boundingRect(c)

	# if the contour is sufficiently large, it must be a digit
	if w >= 15 and (h >= 30 and h <= 40):
		digitCnts.append(c)

# sort the contours from left-to-right, then initialize the
# actual digits themselves
digitCnts = contours.sort_contours(digitCnts,
	method="left-to-right")[0]
digits = []

Here we are simply sorting our digit contours from left-to-right based on their (x, y)-coordinates.

This sorting step is necessary as there are no guarantees that the contours are already sorted from left-to-right (the same direction in which we would read the digits).
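
Conceptually, the left-to-right sort just orders contours by the x-coordinate of their bounding boxes; a rough equivalent of what contours.sort_contours does for this case (a simplified sketch, not the library’s actual implementation) is:

# order digit contours by the left edge of their bounding boxes
digitCnts = sorted(digitCnts, key=lambda c: cv2.boundingRect(c)[0])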

Next comes the actual digit recognition process:

# import the necessary packages
from imutils.perspective import four_point_transform
from imutils import contours
import imutils
import cv2

# define the dictionary of digit segments so we can identify
# each digit on the thermostat
DIGITS_LOOKUP = {
	(1, 1, 1, 0, 1, 1, 1): 0,
	(0, 0, 1, 0, 0, 1, 0): 1,
	(1, 0, 1, 1, 1, 1, 0): 2,
	(1, 0, 1, 1, 0, 1, 1): 3,
	(0, 1, 1, 1, 0, 1, 0): 4,
	(1, 1, 0, 1, 0, 1, 1): 5,
	(1, 1, 0, 1, 1, 1, 1): 6,
	(1, 0, 1, 0, 0, 1, 0): 7,
	(1, 1, 1, 1, 1, 1, 1): 8,
	(1, 1, 1, 1, 0, 1, 1): 9
}

# load the example image
image = cv2.imread("example.jpg")

# pre-process the image by resizing it, converting it to
# grayscale, blurring it, and computing an edge map
image = imutils.resize(image, height=500)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 50, 200, 255)

# find contours in the edge map, then sort them by their
# size in descending order
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
displayCnt = None

# loop over the contours
for c in cnts:
	# approximate the contour
	peri = cv2.arcLength(c, True)
	approx = cv2.approxPolyDP(c, 0.02 * peri, True)

	# if the contour has four vertices, then we have found
	# the thermostat display
	if len(approx) == 4:
		displayCnt = approx
		break

# extract the thermostat display, apply a perspective transform
# to it
warped = four_point_transform(gray, displayCnt.reshape(4, 2))
output = four_point_transform(image, displayCnt.reshape(4, 2))

# threshold the warped image, then apply a series of morphological
# operations to cleanup the thresholded image
thresh = cv2.threshold(warped, 0, 255,
	cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (1, 5))
thresh = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)

# find contours in the thresholded image, then initialize the
# digit contours lists
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]
digitCnts = []

# loop over the digit area candidates
for c in cnts:
	# compute the bounding box of the contour
	(x, y, w, h) = cv2.boundingRect(c)

	# if the contour is sufficiently large, it must be a digit
	if w >= 15 and (h >= 30 and h <= 40):
		digitCnts.append(c)

# sort the contours from left-to-right, then initialize the
# actual digits themselves
digitCnts = contours.sort_contours(digitCnts,
	method="left-to-right")[0]
digits = []

# loop over each of the digits
for c in digitCnts:
	# extract the digit ROI
	(x, y, w, h) = cv2.boundingRect(c)
	roi = thresh[y:y + h, x:x + w]

	# compute the width and height of each of the 7 segments
	# we are going to examine
	(roiH, roiW) = roi.shape
	(dW, dH) = (int(roiW * 0.25), int(roiH * 0.15))
	dHC = int(roiH * 0.05)

	# define the set of 7 segments
	segments = [
		((0, 0), (w, dH)),	# top
		((0, 0), (dW, h // 2)),	# top-left
		((w - dW, 0), (w, h // 2)),	# top-right
		((0, (h // 2) - dHC) , (w, (h // 2) + dHC)), # center
		((0, h // 2), (dW, h)),	# bottom-left
		((w - dW, h // 2), (w, h)),	# bottom-right
		((0, h - dH), (w, h))	# bottom
	]
	on = [0] * len(segments)

We start looping over each of the digit contours on Line 87.

For each of these regions, we compute the bounding box and extract the digit ROI (Lines 89 and 90).

I have included a GIF animation of each of these digit ROIs below:

Figure 11: Extracting each individual digit ROI by computing the bounding box and applying NumPy array slicing.

Given the digit ROI, we now need to localize and extract the seven segments of the digit display.

Lines 94-96 compute the approximate width and height of each segment based on the ROI dimensions.

We then define a list of (x, y)-coordinates that correspond to the seven segments on Lines 99-107. This list follows the same order of segments as Figure 2 above.
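
As a quick sanity check (my own addition, reusing the `DIGITS_LOOKUP` table defined earlier), the tuple order is top, top-left, top-right, center, bottom-left, bottom-right, bottom, so the digit zero lights every segment except the center:

# segment order: (top, top-left, top-right, center,
# bottom-left, bottom-right, bottom)
assert DIGITS_LOOKUP[(1, 1, 1, 0, 1, 1, 1)] == 0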

Here is an example GIF animation that draws a green box over the current segment being investigated:

Figure 12: An example of drawing the segment ROI for each of the seven segments of the digit.

Finally, Line 108 initializes our `on` list: a value of one inside this list indicates that a given segment is turned “on” while a value of zero indicates the segment is “off”.

Given the (x, y)-coordinates of the seven display segments, identifying whether a segment is on or off is fairly easy:

# import the necessary packages
from imutils.perspective import four_point_transform
from imutils import contours
import imutils
import cv2

# define the dictionary of digit segments so we can identify
# each digit on the thermostat
DIGITS_LOOKUP = {
	(1, 1, 1, 0, 1, 1, 1): 0,
	(0, 0, 1, 0, 0, 1, 0): 1,
	(1, 0, 1, 1, 1, 1, 0): 2,
	(1, 0, 1, 1, 0, 1, 1): 3,
	(0, 1, 1, 1, 0, 1, 0): 4,
	(1, 1, 0, 1, 0, 1, 1): 5,
	(1, 1, 0, 1, 1, 1, 1): 6,
	(1, 0, 1, 0, 0, 1, 0): 7,
	(1, 1, 1, 1, 1, 1, 1): 8,
	(1, 1, 1, 1, 0, 1, 1): 9
}

# load the example image
image = cv2.imread("example.jpg")

# pre-process the image by resizing it, converting it to
# grayscale, blurring it, and computing an edge map
image = imutils.resize(image, height=500)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 50, 200, 255)

# find contours in the edge map, then sort them by their
# size in descending order
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
displayCnt = None

# loop over the contours
for c in cnts:
	# approximate the contour
	peri = cv2.arcLength(c, True)
	approx = cv2.approxPolyDP(c, 0.02 * peri, True)

	# if the contour has four vertices, then we have found
	# the thermostat display
	if len(approx) == 4:
		displayCnt = approx
		break

# extract the thermostat display, apply a perspective transform
# to it
warped = four_point_transform(gray, displayCnt.reshape(4, 2))
output = four_point_transform(image, displayCnt.reshape(4, 2))

# threshold the warped image, then apply a series of morphological
# operations to cleanup the thresholded image
thresh = cv2.threshold(warped, 0, 255,
	cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (1, 5))
thresh = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)

# find contours in the thresholded image, then initialize the
# digit contours lists
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]
digitCnts = []

# loop over the digit area candidates
for c in cnts:
	# compute the bounding box of the contour
	(x, y, w, h) = cv2.boundingRect(c)

	# if the contour is sufficiently large, it must be a digit
	if w >= 15 and (h >= 30 and h <= 40):
		digitCnts.append(c)

# sort the contours from left-to-right, then initialize the
# actual digits themselves
digitCnts = contours.sort_contours(digitCnts,
	method="left-to-right")[0]
digits = []

# loop over each of the digits
for c in digitCnts:
	# extract the digit ROI
	(x, y, w, h) = cv2.boundingRect(c)
	roi = thresh[y:y + h, x:x + w]

	# compute the width and height of each of the 7 segments
	# we are going to examine
	(roiH, roiW) = roi.shape
	(dW, dH) = (int(roiW * 0.25), int(roiH * 0.15))
	dHC = int(roiH * 0.05)

	# define the set of 7 segments
	segments = [
		((0, 0), (w, dH)),	# top
		((0, 0), (dW, h // 2)),	# top-left
		((w - dW, 0), (w, h // 2)),	# top-right
		((0, (h // 2) - dHC), (w, (h // 2) + dHC)), # center
		((0, h // 2), (dW, h)),	# bottom-left
		((w - dW, h // 2), (w, h)),	# bottom-right
		((0, h - dH), (w, h))	# bottom
	]
	on = [0] * len(segments)

	# loop over the segments
	for (i, ((xA, yA), (xB, yB))) in enumerate(segments):
		# extract the segment ROI, count the total number of
		# thresholded pixels in the segment, and then compute
		# the area of the segment
		segROI = roi[yA:yB, xA:xB]
		total = cv2.countNonZero(segROI)
		area = (xB - xA) * (yB - yA)

		# if the total number of non-zero pixels is greater than
		# 50% of the area, mark the segment as "on"
		if total / float(area) > 0.5:
			on[i] = 1

	# lookup the digit and draw it on the image
	digit = DIGITS_LOOKUP[tuple(on)]
	digits.append(digit)
	cv2.rectangle(output, (x, y), (x + w, y + h), (0, 255, 0), 1)
	cv2.putText(output, str(digit), (x - 10, y - 10),
		cv2.FONT_HERSHEY_SIMPLEX, 0.65, (0, 255, 0), 2)

We start looping over the (x, y)-coordinates of each segment on Line 111.

We extract the segment ROI on Line 115, followed by computing the number of non-zero pixels on Line 116 (i.e., the number of pixels in the segment that are “on”).

If the ratio of non-zero pixels to the total area of the segment is greater than 50%, then we can assume the segment is “on” and update our `on` list accordingly (Lines 121 and 122).
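
To make the 50% test concrete, here is a tiny standalone illustration (made-up pixel values, not part of the script):

import cv2
import numpy as np

# a hypothetical 2x2 segment ROI with three "on" pixels
segROI = np.array([[0, 255], [255, 255]], dtype="uint8")
total = cv2.countNonZero(segROI)      # 3
area = segROI.size                    # 4
print(total / float(area) > 0.5)      # True, so this segment is marked "on"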

After looping over the seven segments, we can pass the `on` list to `DIGITS_LOOKUP` to obtain the digit itself.
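
The lookup itself is just a tuple-keyed dictionary access. As a small illustration (hypothetical values, reusing the `DIGITS_LOOKUP` table from above):

on = [1, 1, 0, 1, 0, 1, 1]            # the segment pattern for the digit 5
digit = DIGITS_LOOKUP.get(tuple(on))  # -> 5; None if the pattern is unrecognized

Note that the script indexes the dictionary directly, which raises a KeyError for any segment pattern that was misread; using .get, as sketched here, is a gentler alternative if you expect noisy input.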

We then draw a bounding box around the digit and display the digit on the `output` image.

Finally, our last code block prints the digit to our screen and displays the output image:

# import the necessary packages
from imutils.perspective import four_point_transform
from imutils import contours
import imutils
import cv2

# define the dictionary of digit segments so we can identify
# each digit on the thermostat
DIGITS_LOOKUP = {
	(1, 1, 1, 0, 1, 1, 1): 0,
	(0, 0, 1, 0, 0, 1, 0): 1,
	(1, 0, 1, 1, 1, 1, 0): 2,
	(1, 0, 1, 1, 0, 1, 1): 3,
	(0, 1, 1, 1, 0, 1, 0): 4,
	(1, 1, 0, 1, 0, 1, 1): 5,
	(1, 1, 0, 1, 1, 1, 1): 6,
	(1, 0, 1, 0, 0, 1, 0): 7,
	(1, 1, 1, 1, 1, 1, 1): 8,
	(1, 1, 1, 1, 0, 1, 1): 9
}

# load the example image
image = cv2.imread("example.jpg")

# pre-process the image by resizing it, converting it to
# grayscale, blurring it, and computing an edge map
image = imutils.resize(image, height=500)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 50, 200, 255)

# find contours in the edge map, then sort them by their
# size in descending order
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
displayCnt = None

# loop over the contours
for c in cnts:
	# approximate the contour
	peri = cv2.arcLength(c, True)
	approx = cv2.approxPolyDP(c, 0.02 * peri, True)

	# if the contour has four vertices, then we have found
	# the thermostat display
	if len(approx) == 4:
		displayCnt = approx
		break

# extract the thermostat display, apply a perspective transform
# to it
warped = four_point_transform(gray, displayCnt.reshape(4, 2))
output = four_point_transform(image, displayCnt.reshape(4, 2))

# threshold the warped image, then apply a series of morphological
# operations to cleanup the thresholded image
thresh = cv2.threshold(warped, 0, 255,
	cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (1, 5))
thresh = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)

# find contours in the thresholded image, then initialize the
# digit contours lists
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
	cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if imutils.is_cv2() else cnts[1]
digitCnts = []

# loop over the digit area candidates
for c in cnts:
	# compute the bounding box of the contour
	(x, y, w, h) = cv2.boundingRect(c)

	# if the contour is sufficiently large, it must be a digit
	if w >= 15 and (h >= 30 and h <= 40):
		digitCnts.append(c)

# sort the contours from left-to-right, then initialize the
# actual digits themselves
digitCnts = contours.sort_contours(digitCnts,
	method="left-to-right")[0]
digits = []

# loop over each of the digits
for c in digitCnts:
	# extract the digit ROI
	(x, y, w, h) = cv2.boundingRect(c)
	roi = thresh[y:y + h, x:x + w]

	# compute the width and height of each of the 7 segments
	# we are going to examine
	(roiH, roiW) = roi.shape
	(dW, dH) = (int(roiW * 0.25), int(roiH * 0.15))
	dHC = int(roiH * 0.05)

	# define the set of 7 segments
	segments = [
		((0, 0), (w, dH)),	# top
		((0, 0), (dW, h // 2)),	# top-left
		((w - dW, 0), (w, h // 2)),	# top-right
		((0, (h // 2) - dHC), (w, (h // 2) + dHC)), # center
		((0, h // 2), (dW, h)),	# bottom-left
		((w - dW, h // 2), (w, h)),	# bottom-right
		((0, h - dH), (w, h))	# bottom
	]
	on = [0] * len(segments)

	# loop over the segments
	for (i, ((xA, yA), (xB, yB))) in enumerate(segments):
		# extract the segment ROI, count the total number of
		# thresholded pixels in the segment, and then compute
		# the area of the segment
		segROI = roi[yA:yB, xA:xB]
		total = cv2.countNonZero(segROI)
		area = (xB - xA) * (yB - yA)

		# if the total number of non-zero pixels is greater than
		# 50% of the area, mark the segment as "on"
		if total / float(area) > 0.5:
			on[i] = 1

	# lookup the digit and draw it on the image
	digit = DIGITS_LOOKUP[tuple(on)]
	digits.append(digit)
	cv2.rectangle(output, (x, y), (x + w, y + h), (0, 255, 0), 1)
	cv2.putText(output, str(digit), (x - 10, y - 10),
		cv2.FONT_HERSHEY_SIMPLEX, 0.65, (0, 255, 0), 2)

# display the digits
print(u"{}{}.{} \u00b0C".format(*digits))
cv2.imshow("Input", image)
cv2.imshow("Output", output)
cv2.waitKey(0)

Notice how we have been able to correctly recognize the digits on the LCD screen using Python and OpenCV:

Figure 13: Correctly recognizing digits in images with OpenCV and Python.

Summary

In today’s blog post I demonstrated how to utilize OpenCV and Python to recognize digits in images.

This approach is specifically intended for seven-segment displays (i.e., the digit displays you would typically see on a digital alarm clock).

By extracting each of the seven segments and applying basic thresholding and morphological operations we can determine which segments are “on” and which are “off”.

From there, we can look up the on/off segments in a Python dictionary data structure to quickly determine the actual digit — no machine learning required!

As I mentioned at the top of this blog post, applying computer vision to recognizing digits in a thermostat image tends to overcomplicate the problem itself — utilizing a data logging thermometer would be more reliable and require substantially less effort.

However, in the case that (1) you do not have access to a data logging sensor or (2) you simply want to hone and practice your computer vision/OpenCV skills, it’s often helpful to see a solution like this one that demonstrates how to approach the problem.

I hope you enjoyed today’s post!

To be notified when future blog posts are published, be sure to enter your email address in the form below!

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 11-page Resource Guide on Computer Vision and Image Search Engines, including exclusive techniques that I don’t post on this blog! Sound good? If so, enter your email address and I’ll send you the code immediately!

The post Recognizing digits with OpenCV and Python appeared first on PyImageSearch.

Text skew correction with OpenCV and Python

Today’s tutorial is a Python implementation of my favorite blog post by Félix Abecassis on the process of text skew correction (i.e., “deskewing text”) using OpenCV and image processing functions.

Given an image containing a rotated block of text at an unknown angle, we need to correct the text skew by:

  1. Detecting the block of text in the image.
  2. Computing the angle of the rotated text.
  3. Rotating the image to correct for the skew.

We typically apply text skew correction algorithms in the field of automatic document analysis, but the process itself can be applied to other domains as well.

To learn more about text skew correction, just keep reading.

Looking for the source code to this post?
Jump right to the downloads section.

Text skew correction with OpenCV and Python

The remainder of this blog post will demonstrate how to deskew text using basic image processing operations with Python and OpenCV.

We’ll start by creating a simple dataset that we can use to evaluate our text skew corrector.

We’ll then write Python and OpenCV code to automatically detect and correct the text skew angle in our images.

Creating a simple dataset

Similar to Félix’s example, I have prepared a small dataset of four images that have been rotated by a given number of degrees:

Figure 1: Our four example images that we’ll be applying text skew correction to with OpenCV and Python.

The text block itself is from Chapter 11 of my book, Practical Python and OpenCV, where I’m discussing contours and how to utilize them for image processing and computer vision.

The filenames of the four files follow:

$ ls images/
neg_28.png	neg_4.png	pos_24.png	pos_41.png

The first part of the filename specifies whether our image has been rotated counter-clockwise (negative) or clockwise (positive).

The second component of the filename is the actual number of degrees the image has been rotated by.

The goal our text skew correction algorithm will be to correctly determine the direction and angle of the rotation, then correct for it.

To see how our text skew correction algorithm is implemented with OpenCV and Python, be sure to read the next section.

Deskewing text with OpenCV and Python

To get started, open up a new file and name it `correct_skew.py`.

From there, insert the following code:

# import the necessary packages
import numpy as np
import argparse
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to input image file")
args = vars(ap.parse_args())

# load the image from disk
image = cv2.imread(args["image"])

Lines 2-4 import our required Python packages. We’ll be using OpenCV via our `cv2` bindings, so if you don’t already have OpenCV installed on your system, please refer to my list of OpenCV install tutorials to help you get your system set up and configured.

We then parse our command line arguments on Lines 7-10. We only need a single argument here, `--image`, which is the path to our input image.

The image is then loaded from disk on Line 13.

Our next step is to isolate the text in the image:

# import the necessary packages
import numpy as np
import argparse
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to input image file")
args = vars(ap.parse_args())

# load the image from disk
image = cv2.imread(args["image"])

# convert the image to grayscale and flip the foreground
# and background to ensure foreground is now "white" and
# the background is "black"
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.bitwise_not(gray)

# threshold the image, setting all foreground pixels to
# 255 and all background pixels to 0
thresh = cv2.threshold(gray, 0, 255,
	cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]

Our input images contain text that is dark on a light background; however, to apply our text skew correction process, we first need to invert the image so the text becomes light on a dark background.

When applying computer vision and image processing operations, it’s common for the foreground to be represented as light while the background (the part of the image we are not interested in) is dark.

A thresholding operation (Lines 23 and 24) is then applied to binarize the image:

Figure 2: Applying a thresholding operation to binarize our image. Our text is now white on a black background.

Given this thresholded image, we can now compute the minimum rotated bounding box that contains the text regions:

# import the necessary packages
import numpy as np
import argparse
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to input image file")
args = vars(ap.parse_args())

# load the image from disk
image = cv2.imread(args["image"])

# convert the image to grayscale and flip the foreground
# and background to ensure foreground is now "white" and
# the background is "black"
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.bitwise_not(gray)

# threshold the image, setting all foreground pixels to
# 255 and all background pixels to 0
thresh = cv2.threshold(gray, 0, 255,
	cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]

# grab the (x, y) coordinates of all pixel values that
# are greater than zero, then use these coordinates to
# compute a rotated bounding box that contains all
# coordinates
coords = np.column_stack(np.where(thresh > 0))
angle = cv2.minAreaRect(coords)[-1]

# the `cv2.minAreaRect` function returns values in the
# range [-90, 0); as the rectangle rotates clockwise the
# returned angle trends to 0 -- in this special case we
# need to add 90 degrees to the angle
if angle < -45:
	angle = -(90 + angle)

# otherwise, just take the inverse of the angle to make
# it positive
else:
	angle = -angle

Line 30 finds all (x, y)-coordinates in the `thresh` image that are part of the foreground.
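
If the format of `coords` is unclear, here is a tiny standalone example (illustrative values only): `np.where` returns the row and column indices of the foreground pixels, and `np.column_stack` pairs them into an (N, 2) array of coordinates:

import numpy as np

mask = np.array([[0, 255], [255, 0]], dtype="uint8")
coords = np.column_stack(np.where(mask > 0))
print(coords)  # [[0 1]
               #  [1 0]] -- one (row, col) pair per foreground pixel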

We pass these coordinates into `cv2.minAreaRect`, which then computes the minimum rotated rectangle that contains the entire text region.

The `cv2.minAreaRect` function returns angle values in the range [-90, 0). As the rectangle is rotated clockwise, the angle value increases towards zero; when zero is reached, the angle wraps back to -90 degrees and the process continues.

Note: For more information on `cv2.minAreaRect`, please see this excellent explanation by Adam Goodwin.

Lines 37 and 38 handle the case where the angle is less than -45 degrees: we add 90 degrees to the angle and then negate it.

Otherwise, Lines 42 and 43 simply negate the angle.
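
To make the two branches concrete, here is the same angle-normalization logic pulled out into a small standalone sketch (my own illustration, not part of correct_skew.py), with example values that mirror the results later in this post:

def normalize_angle(rect_angle):
	# map the [-90, 0) angle returned by cv2.minAreaRect to a
	# signed correction angle for cv2.getRotationMatrix2D
	if rect_angle < -45:
		return -(90 + rect_angle)
	return -rect_angle

print(normalize_angle(-86.0))  # -4.0: corrects a counter-clockwise text skew
print(normalize_angle(-24.0))  # 24.0: corrects a clockwise text skew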

Now that we have determined the text skew angle, we need to apply an affine transformation to correct for the skew:

# import the necessary packages
import numpy as np
import argparse
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to input image file")
args = vars(ap.parse_args())

# load the image from disk
image = cv2.imread(args["image"])

# convert the image to grayscale and flip the foreground
# and background to ensure foreground is now "white" and
# the background is "black"
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.bitwise_not(gray)

# threshold the image, setting all foreground pixels to
# 255 and all background pixels to 0
thresh = cv2.threshold(gray, 0, 255,
	cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]

# grab the (x, y) coordinates of all pixel values that
# are greater than zero, then use these coordinates to
# compute a rotated bounding box that contains all
# coordinates
coords = np.column_stack(np.where(thresh > 0))
angle = cv2.minAreaRect(coords)[-1]

# the `cv2.minAreaRect` function returns values in the
# range [-90, 0); as the rectangle rotates clockwise the
# returned angle trends to 0 -- in this special case we
# need to add 90 degrees to the angle
if angle < -45:
	angle = -(90 + angle)

# otherwise, just take the inverse of the angle to make
# it positive
else:
	angle = -angle

# rotate the image to deskew it
(h, w) = image.shape[:2]
center = (w // 2, h // 2)
M = cv2.getRotationMatrix2D(center, angle, 1.0)
rotated = cv2.warpAffine(image, M, (w, h),
	flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_REPLICATE)

Lines 46 and 47 determine the center (x, y)-coordinate of the image. We pass the `center` coordinates and rotation angle into the `cv2.getRotationMatrix2D` function (Line 48). This rotation matrix `M` is then used to perform the actual transformation on Lines 49 and 50. Keep in mind that OpenCV treats positive angles as counter-clockwise rotations, which is why negating the angle in the previous step produces the correct direction of correction.

Finally, we display the results to our screen:

# import the necessary packages
import numpy as np
import argparse
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
	help="path to input image file")
args = vars(ap.parse_args())

# load the image from disk
image = cv2.imread(args["image"])

# convert the image to grayscale and flip the foreground
# and background to ensure foreground is now "white" and
# the background is "black"
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.bitwise_not(gray)

# threshold the image, setting all foreground pixels to
# 255 and all background pixels to 0
thresh = cv2.threshold(gray, 0, 255,
	cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]

# grab the (x, y) coordinates of all pixel values that
# are greater than zero, then use these coordinates to
# compute a rotated bounding box that contains all
# coordinates
coords = np.column_stack(np.where(thresh > 0))
angle = cv2.minAreaRect(coords)[-1]

# the `cv2.minAreaRect` function returns values in the
# range [-90, 0); as the rectangle rotates clockwise the
# returned angle trends to 0 -- in this special case we
# need to add 90 degrees to the angle
if angle < -45:
	angle = -(90 + angle)

# otherwise, just take the inverse of the angle to make
# it positive
else:
	angle = -angle

# rotate the image to deskew it
(h, w) = image.shape[:2]
center = (w // 2, h // 2)
M = cv2.getRotationMatrix2D(center, angle, 1.0)
rotated = cv2.warpAffine(image, M, (w, h),
	flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_REPLICATE)

# draw the correction angle on the image so we can validate it
cv2.putText(rotated, "Angle: {:.2f} degrees".format(angle),
	(10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)

# show the output image
print("[INFO] angle: {:.3f}".format(angle))
cv2.imshow("Input", image)
cv2.imshow("Rotated", rotated)
cv2.waitKey(0)

Line 53 draws the `angle` on our image so we can verify that the output image matches the rotation angle (you would obviously want to remove this line in a document processing pipeline).

Lines 57-60 handle displaying the output image.

Skew correction results

To grab the code + example images used inside this blog post, be sure to use the “Downloads” section at the bottom of this post.

From there, execute the following command to correct the skew for our `neg_4.png` image:

$ python correct_skew.py --image images/neg_4.png 
[INFO] angle: -4.086

Figure 3: Applying skew correction using OpenCV and Python.

Here we can see that the input image has a counter-clockwise skew of 4 degrees. Applying our skew correction with OpenCV detects this 4 degree skew and corrects for it.

Here is another example, this time with a counter-clockwise skew of 28 degrees:

$ python correct_skew.py --image images/neg_28.png 
[INFO] angle: -28.009

Figure 4: Deskewing images using OpenCV and Python.

Again, our skew correction algorithm is able to correct the input image.

This time, let’s try a clockwise skew:

$ python correct_skew.py --image images/pos_24.png 
[INFO] angle: 23.974

Figure 5: Correcting for skew in text regions with computer vision.

And finally a more extreme clockwise skew of 41 degrees:

$ python correct_skew.py --image images/pos_41.png 
[INFO] angle: 41.037

Figure 6: Deskewing text with OpenCV.

Regardless of skew angle, our algorithm is able to correct for skew in images using OpenCV and Python.

Interested in learning more about computer vision and OpenCV?

If you’re interested in learning more about the fundamentals of computer vision and image processing, be sure to take a look at my book, Practical Python and OpenCV:

Inside the book you’ll learn the basics of computer vision and OpenCV, working your way up to more advanced topics such as face detection, object tracking in video, and handwriting recognition, all with lots of examples, code, and detailed walkthroughs.

If you’re interested in learning more (and how my book can teach you these algorithms in less than a single weekend), just click the button below:

Summary

In today’s blog post I provided a Python implementation of Félix Abecassis’ approach to skew correction.

The algorithm itself is quite straightforward, relying on only basic image processing techniques such as thresholding, computing the minimum area rotated rectangle, and then applying an affine transformation to correct the skew.

We would commonly use this type of text skew correction in an automatic document analysis pipeline where our goal is to digitize a set of documents, correct for text skew, and then apply OCR to convert the text in the image to machine-encoded text.

I hope you enjoyed today’s tutorial!

To be notified when future blog posts are published, be sure to enter your email address in the form below!

Downloads:

If you would like to download the code and images used in this post, please enter your email address in the form below. Not only will you get a .zip of the code, I’ll also send you a FREE 11-page Resource Guide on Computer Vision and Image Search Engines, including exclusive techniques that I don’t post on this blog! Sound good? If so, enter your email address and I’ll send you the code immediately!

The post Text skew correction with OpenCV and Python appeared first on PyImageSearch.

Resolving macOS, OpenCV, and Homebrew install errors

As you undoubtedly know, configuring and installing OpenCV on your macOS machine can be a bit of a pain.

To help you and other PyImageSearch readers get OpenCV installed faster (and with less headaches), I put together a tutorial on using Homebrew to install OpenCV.

Using Homebrew allows you to skip manually configuring your build and compiling OpenCV from source.

Instead, you simply use what are called brew formulas which define how a given package should be automatically configured and installed, similar to how a package manager can intelligently install libraries and software on your system.

However, a bit of a problem arose a few weeks ago when it was discovered that there were some errors in the most recent Homebrew formula used to build and install OpenCV on macOS.

This formula caused two types of errors when building OpenCV on macOS via Homebrew:

  • Error #1: A report that both Python 2 and Python 3 wrappers could not be built (this is not true; you can build both Python 2.7 and Python 3 bindings in the same Homebrew command).
  • Error #2: A missing `downloader.cmake` file.

Myself, as well as PyImageSearch readers Andreas Linnarsson, Francis, and Patrick (see the comments section of the Homebrew OpenCV install post for the gory details) dove into the problem and tackled it head on.

Today I’m going to share our findings in hopes that it helps you and other PyImageSearch readers install OpenCV via Homebrew on your macOS machines.

In an ideal world these instructions will eventually become out of date as the Homebrew formula used to configure and install OpenCV is updated to correct these errors.

To learn more about resolving Homebrew errors when installing OpenCV, just keep reading.

Resolving macOS, OpenCV, and Homebrew install errors

In the remainder of this blog post I’ll discuss common errors you may run into when installing OpenCV via Homebrew on your macOS system.

I’ll also provide extra bonus suggestions regarding checking your Python version to help you debug these errors further.

Error #1: opencv3: Does not support building both Python 2 and 3 wrappers

Assuming you followed my original Homebrew + OpenCV install post, you may have run into the following error when trying to install OpenCV:

$ brew install opencv3 --with-contrib --with-python3 --HEAD
...
Error: opencv3: Does not support building both Python 2 and 3 wrappers

This error was introduced by the following commit. I find the error frustrating for two reasons:

  1. There is no need to make this check…
  2. …because Homebrew can be used to compile OpenCV twice: once for Python 2.7 and then again for Python 3.

To start, OpenCV 3 can be built with Python 2.7 and Python 3 bindings. It just requires two separate compiles.

The first compile handles building OpenCV 3 + Python 2.7 bindings while the second compile generates the OpenCV 3 + Python 3 bindings. Doing this installs OpenCV 3 properly while generating the correct `cv2.so` bindings for each respective Python version.

There are two ways to resolve this error, as discussed in this StackOverflow thread.

The first method is arguably simpler, but doesn’t address the real problem. Here we just update the `brew install opencv3` command to indicate that we want to build OpenCV 3 without Python 3 bindings:

$ brew install opencv3 --with-contrib

Notice how we have left out the `--with-python3` switch. In this case, Homebrew automatically builds Python 2.7 bindings for OpenCV 3 (there is no `--with-python2` switch; it’s automatically assumed).

Similarly, if we wanted to build OpenCV 3 with Python 3 bindings, we would update the `brew install opencv3` command to be:

$ brew install opencv3 --with-contrib --with-python3 --without-python

Here we supply `--with-python3` to indicate we would like OpenCV 3 + Python 3 bindings to be generated, but skip generating the OpenCV 3 + Python 2.7 bindings via the `--without-python` switch.

This method works; however, I find it both frustrating and confusing. To start, the `--without-python` switch is extremely ambiguous.

If I were to supply a switch named `--without-python` to an install command, I would assume it would build no Python bindings whatsoever, regardless of Python version. However, that’s not the case. Instead, `--without-python` really means no Python 2.7 bindings.

These switches are confusing to both OpenCV install veterans such as myself and novices who are just trying to get their development environment configured correctly for the first time.

In my opinion, a better solution (until a fix is fully released, of course) is to edit the OpenCV 3 install formula itself.

To edit the OpenCV 3 Homebrew install formula, execute the following command:

$ brew edit opencv3

And then find the following configuration block:

if build.with?("python3") && build.with?("python")
  # Opencv3 Does not support building both Python 2 and 3 versions
  odie "opencv3: Does not support building both Python 2 and 3 wrappers"
end

As you can see from my screenshot below, this configuration is on Lines 187-190 (however, these lines will change as the OpenCV 3 Homebrew formula is updated).

Figure 1: Finding the Homebrew + OpenCV 3 formula that needs to be edited.

Once you’ve found this section, comment these four lines out:

#if build.with?("python3") && build.with?("python")
#  # Opencv3 Does not support building both Python 2 and 3 versions
#  odie "opencv3: Does not support building both Python 2 and 3 wrappers"
#end

I’ve provided a screenshot demonstrating commenting these lines out as well:

Figure 2: Updating the Homebrew + OpenCV 3 install formula to resolve the error.

After you’ve commented the lines out, save and exit the editor to update the OpenCV 3 Homebrew install formula.

From there you should be able to successfully install OpenCV 3 via Homebrew using the following command:

$ brew install opencv3 --with-contrib --with-python3

Figure 3: Successfully compiling OpenCV 3 with Python 2.7 and Python 3 bindings on macOS via Homebrew.

Note: If you receive an error message related to `downloader.cmake`, make sure you proceed to the next section.

After OpenCV 3 has finished installing, go back to the original tutorial, and follow the instructions starting with the “Handling the Python 3 issue” section.

From there, you will have OpenCV 3 installed with both Python 2.7 and Python 3 bindings:

Figure 4: Importing the cv2 library into a Python 2.7 and Python 3 shell.

Again, keep in mind that two separate compiles were done in order to generate these bindings. The first compile generated the OpenCV 3 + Python 2.7 bindings while the second compile created the OpenCV 3 + Python 3 bindings.

Error #2: No such file or directory 3rdparty/ippicv/downloader.cmake

The second error you may encounter when installing OpenCV 3 via Homebrew is related to the `downloader.cmake` file. This error only occurs when you supply the `--HEAD` switch to the `brew install opencv3` command.

The reason for this error is that the `3rdparty/ippicv/downloader.cmake` file was removed from the repo; however, the Homebrew install formula has not been updated to reflect this (source).

Therefore, the easiest way to get around this error is to simply omit the `--HEAD` switch.

For example, if your previous OpenCV 3 + Homebrew install command was:

$ brew install opencv3 --with-contrib --with-python3 --HEAD

Simply update it to be:

$ brew install opencv3 --with-contrib --with-python3

Provided you’ve followed the instructions from the “Error #1” section above, Homebrew should now install OpenCV 3 with Python 2.7 and Python 3 bindings. You’ll now want to go back to the original Homebrew + OpenCV tutorial, and follow the instructions starting with the “Handling the Python 3 issue” section.

BONUS: Check your Python version and update paths accordingly

If you’re new to Unix environments and the command line (or if this is the first time you’ve worked with Python + OpenCV together), a common mistake I see novices make is forgetting to check their Python version number.

You can check your version of Python 2.7 using the following command:

$ python --version
Python 2.7.13

Similarly, this command will give you your Python 3 version:

$ python3 --version
Python 3.6.1

Why is this so important?

The original Homebrew + OpenCV install tutorial was written for Python 2.7 and Python 3.5. However, Python versions update. Python 3.6 has been officially released and is being used on many machines. In fact, if you were to install Python 3 via Homebrew (at the time of this writing), Python 3.6 would be installed.

This is important because you need to check your file paths.

For example, if I were to tell you to check the `site-packages` directory of your Python 3 install and provide an example command of:

$ ls /usr/local/opt/opencv3/lib/python3.5/site-packages/

You should first check your Python 3 version. The command executed above assumes Python 3.5. However, if after running `python3 --version` you find you are using Python 3.6, you would need to update your path to be:

$ ls /usr/local/opt/opencv3/lib/python3.6/site-packages/

Notice how `python3.5` was changed to `python3.6`.
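
If you would rather not hard-code the version at all, a small sketch like the following (a hypothetical helper, assuming the Homebrew install prefix used above) prints the version-specific path for whichever Python interpreter runs it:

# print the version-specific site-packages path for the opencv3 keg
import sys

py_dir = "python{}.{}".format(*sys.version_info[:2])
print("/usr/local/opt/opencv3/lib/{}/site-packages/".format(py_dir))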

Forgetting to check and validate file paths is a common mistake that I see novices make when installing and configuring OpenCV with Python bindings.

Do not blindly copy and paste commands in your terminal. Instead, take the time to understand what they are doing so you can adapt the instructions to your own development environment.

In general, the instructions to install OpenCV + Python on a system do not change — but Python and OpenCV versions do change, therefore some file paths will change slightly. Normally, all this amounts to changing one or two characters in a file path.

Summary

In today’s blog post we reviewed two common error messages you may encounter when installing OpenCV 3 via Homebrew:

  • Error #1: A report that both Python 2 and Python 3 wrappers could not be built.
  • Error #2: A missing `downloader.cmake` file.

I then provided solutions to each of these errors thanks to the help of PyImageSearch readers Andreas Linnarsson, Francis, and Patrick.

I hope these instructions help you avoid these common errors when installing OpenCV 3 via Homebrew on your macOS machine!

Before you go, be sure to enter your email address in the form below to be notified when future blog posts are published on PyImageSearch!

The post Resolving macOS, OpenCV, and Homebrew install errors appeared first on PyImageSearch.
