COMPUTER VISION


Miguel Araujo @maraujop

http://bit.ly/pycones



DISCLAIMER

Just an amateur


















RED-LIGHT HAL




HARDWARE

CAMERAS


  • Compact cameras
  • DSLR cameras (Reflex)
  • Micro cameras
  • USB cameras (webcams)
  • IP cameras
  • Depth field / 3D cameras

CHOOSING A CAMERA


  • Volume / Weight
  • Sensor size (bigger generally means better image quality and low-light performance)
  • Focal Length
  • Resolution
  • Light conditions
  • Adjustable
  • Price

PHOTOGRAPHY 101


3 Pillars

  • Shutter speed
  • Aperture
  • ISO (Film speed)

ALSO

  • White balance
  • etc
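The three pillars trade off against each other through exposure: each full "stop" doubles or halves the light. A minimal sketch of the standard exposure-value formula (my own helper, not from the talk):

```python
from math import log2

def exposure_value(f_number, shutter_seconds, iso=100):
    """Exposure value: EV = log2(N^2 / t) - log2(ISO / 100).

    Higher EV means less light recorded; changing any of the three
    pillars by one stop shifts EV by exactly 1.
    """
    return log2(f_number ** 2 / shutter_seconds) - log2(iso / 100)

# f/1.0 at 1 s, ISO 100 is the reference point: EV 0
print(exposure_value(1.0, 1.0))   # 0.0
# Halving the shutter time costs one stop
print(exposure_value(1.0, 0.5))   # 1.0
```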

Shutter speed



APERTURE

Depth of field


ISO


LIBGPHOTO2


  • Linux Open Source project
  • Controls digital cameras (DSLRs and compact cameras) over USB.
  • Supports MTP and PTP v1 & v2.


VISION

Compact Cameras

  • Many take 6-15 seconds per capture through libgphoto2.
  • Rarely can stream video in real time.
  • Rarely allow adjusting camera settings on the fly.


VISION

DSLRs

  • Good time response.
  • Very well supported, many features.
  • Many camera parameters adjustable on the fly.


VISION

Micro Cameras

  • Custom drivers
  • Proprietary ports


VISION

Webcams

  • Low resolution
  • Handled through V4L2 
  • Poor performance in bad lighting conditions
  • Not very adjustable

EXTRA

  • Lenses
  • Number of cameras



SOFTWARE

OpenCV


  • Open Source
  • Known and respected
  • C++ powered
  • Python bindings
  • Low-level concepts, hard for newcomers
  • opencv-processing and others

Simplecv


  • Built on top of OpenCV using Python
  • Not a replacement
  • High level concepts and data structures
  • It also stands on the shoulders of other giants: numpy, Orange, scipy...
  • Well, yeah, it uses camelCase
  • simplecv-js

HELLO WORLD

COORDINATES

FEATURE DETECTION


  • Edges
  • Lines
  • Corners
  • Circles
  • Blobs

BLOB


A region of an image in which some properties are constant or vary within a prescribed range of values.

Blue M&Ms are blobs


m_and_ms = Image('m&ms.jpg')
blue_dist = m_and_ms.colorDistance(Color.BLUE)
blue_dist.show()
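Conceptually, colorDistance measures how far each pixel is from the target color. A simplified per-pixel sketch of the idea (not SimpleCV's actual implementation, which I believe also rescales the result to 0-255):

```python
from math import sqrt

def color_distance(pixel, target):
    """Euclidean distance between two (R, G, B) triples.

    Applied per pixel this yields a grayscale image that is dark
    where the pixel is close to the target color.
    """
    return sqrt(sum((a - b) ** 2 for a, b in zip(pixel, target)))

print(color_distance((0, 0, 255), (0, 0, 255)))   # 0.0 -> pure blue
print(color_distance((255, 0, 0), (0, 0, 255)))   # far from blue
```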

BLUE BLOBS



blue_dist = blue_dist.invert()
blobs = blue_dist.findBlobs()
print len(blobs)
>> 122

blobs.draw(Color.RED, width=-1)
blue_dist.show()


Polishing it

findBlobs(minsize, maxsize, threshval, ...)
blue_dist.findBlobs(minsize=200)
blobs = blobs.filter(blobs.area() > 200)
len(blobs)
>> 36

average_area = np.average(blobs.area())
>> 37792.77

blue_dist = blue_dist.scale(0.35)
blobs = blue_dist.findBlobs(threshval=177, minsize=100)
len(blobs)
>> 25
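Under the hood, findBlobs is essentially connected-component labeling on the binarized image. A pure-Python sketch of that idea (illustrative only, not SimpleCV's code):

```python
def find_blobs(mask):
    """Group neighbouring foreground pixels (4-connectivity) into
    blobs, returned as lists of (row, col) coordinates."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                stack, blob = [(r, c)], []
                seen[r][c] = True
                while stack:  # flood fill from this seed pixel
                    y, x = stack.pop()
                    blob.append((y, x))
                    for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                        if 0 <= ny < rows and 0 <= nx < cols \
                                and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                blobs.append(blob)
    return blobs

mask = [[1, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 1]]
print(len(find_blobs(mask)))   # 2 blobs
```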


RULES


  • Dynamic is better than fixed, but harder to achieve.
  • If color is not needed, drop it, at least until it is needed.
  • The smaller the picture, the less information and the faster the processing.
  • Always use the easiest solution, which will usually be the fastest too.
  • Real life vs laboratory conditions.
  • Some things are harder than they look.
  • When working in artificial vision, don't forget about other input sources (time, sound, etc.).

GOLDEN RULE

  • Whatever you can solve in hardware (lighting, optics, camera placement), solve in hardware rather than in software.

COLOR SPACES

RGB / BGR

image.toRGB()

HSV (Hue Saturation Value)

image.toHSV()

YCbCr

image.toYCbCr()
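Why HSV helps: hue separates "what color" from "how bright", so color-based segmentation survives lighting changes better. The conversion itself is in the standard library (colorsys works on floats in [0, 1]; this degree-based wrapper is my own):

```python
import colorsys

def rgb_to_hsv_255(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation %, value %)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(h * 360), round(s * 100), round(v * 100)

print(rgb_to_hsv_255(255, 0, 0))   # pure red  -> hue 0
print(rgb_to_hsv_255(0, 0, 255))   # pure blue -> hue 240
```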

hueDistance



blue_hue_dist = m_and_ms.hueDistance((0,117,245))


IDEAL



blue_hue_dist = m_and_ms.hueDistance(Color.BLUE)
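One subtlety hueDistance must handle: hue is an angle, so it wraps around. A sketch of circular hue distance (using degrees for clarity, rather than SimpleCV's internal scale):

```python
def hue_distance(h1, h2, wrap=360):
    """Distance between two hues on the color wheel: 350 and 10
    degrees are 20 apart, not 340."""
    d = abs(h1 - h2) % wrap
    return min(d, wrap - d)

print(hue_distance(350, 10))   # 20, not 340
print(hue_distance(0, 180))    # 180: opposite side of the wheel
```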


binarize


  • Creates a binary (black/white) image. It has many parameters you can tweak.
  • Uses Otsu's method by default, adjusting the threshold dynamically for better results.

blue_dist.binarize(blocksize=501).show() 
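Otsu's method picks the threshold that best separates the image histogram into two classes, by maximizing the between-class variance. An illustrative implementation (not SimpleCV's):

```python
def otsu_threshold(histogram):
    """Return the Otsu threshold for a list of per-level pixel counts."""
    total = sum(histogram)
    sum_all = sum(i * h for i, h in enumerate(histogram))
    best_t, best_between = 0, -1.0
    w0 = sum0 = 0
    for t, count in enumerate(histogram):
        w0 += count                      # pixels in the "dark" class
        if w0 == 0 or w0 == total:
            continue
        sum0 += t * count
        w1 = total - w0
        mu0, mu1 = sum0 / w0, (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2   # between-class variance
        if between > best_between:
            best_between, best_t = between, t
    return best_t

# Two clearly separated intensity clusters, around levels 1-2 and 6-7
hist = [0, 10, 8, 0, 0, 0, 9, 11]
print(otsu_threshold(hist))   # 2
```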





MATCHING




Detector
Descriptor
Matcher
Filtering or Pruning best matches

DETECTORS

They need to be effective with changes in:

  • Viewpoint 
  • Scale 
  • Blur 
  • Illumination
  • Noise

DETECTORS

Find ROIs


Corners

  • Hessian Affine
  • Harris Affine
  • FAST

Keypoints

  • SIFT
  • SURF
  • MSER
  • ORB (Tracking)
  • BRISK (Tracking)
  • FREAK (Tracking)

Many more

DESCRIPTORS

Speed vs correctness

  • SURF
  • SIFT
  • LAZY
  • ORB
  • BRIEF
  • RIFF
  • etc.

MATCHERS


  • FLANN
  • Brute Force

PRUNING


  • Cross-check
  • Ratio-Test
  • shape overlapping 
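The ratio test (from Lowe's SIFT paper) keeps a match only when the best candidate is clearly better than the runner-up. A sketch, where `matches` is a hypothetical mapping from keypoint id to its two smallest descriptor distances:

```python
def ratio_test(matches, ratio=0.75):
    """Lowe's ratio test: accept a match only if the best distance is
    well below the second-best. 0.75 is a common default; tune per
    detector/descriptor."""
    return [
        kp for kp, (best, second) in matches.items()
        if best < ratio * second
    ]

matches = {
    "a": (10.0, 40.0),   # unambiguous -> kept
    "b": (30.0, 31.0),   # two near-identical candidates -> dropped
}
print(ratio_test(matches))   # ['a']
```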

MATCHING


  • Template or Query image (Choose wisely)
  • Sample or Train image




result_image = sample.drawKeypointMatches(template)


skp, tkp = sample.findKeypointMatches(template)

skp - Keypoints matched in sample
tkp - Keypoints matched in template

FINDKEYPOINTMATCH


  • Detection: Hessian affine
  • Description: SURF
  • Matching: FLANN kNN
  • Filtering: Lowe's ratio test
  • Finds a homography
  • Returns a FeatureSet with one KeypointMatch

 TEMPLATE 

 SAMPLE 

FINDKEYPOINTMATCH



coupons = Image("coupons.jpg")
coupon = Image("coupon.jpg")
match = coupons.findKeypointMatch(coupon)
match.draw(width=10, color=Color.GREEN)
coupons.save("result.jpg")


2nd example





FAILS

 Many OUTLIERS 

CLUSTERING


def find_clusters(keypoints, separator=None):
    features = FeatureSet(keypoints)
    if separator is None:
        separator = np.average(features.area())

    features = features.filter(
        features.area() > separator
    )
    return features.cluster(
        method="hierarchical", 
        properties="position"
    )

BIGGEST CLUSTER



def find_biggest_cluster(clusters):
    biggest_cluster = None
    max_cluster_size = 0
    for cluster in clusters:
        if len(cluster) > max_cluster_size:
            biggest_cluster = cluster
            max_cluster_size = len(cluster)

    return biggest_cluster



NORMAL DISTRIBUTION


from collections import namedtuple
from math import sqrt

Point = namedtuple('Point', 'x y')
def distance_between_points(point_one, point_two):
    return sqrt(
        pow((point_one.x - point_two.x), 2) + \
        pow((point_one.y - point_two.y), 2)
    )

skp_set = FeatureSet(biggest_cluster)
x_avg, y_avg = find_centroid(skp_set)
centroid = Point(x_avg, y_avg)
uno.drawRectangle(
    x_avg, y_avg, 20, 20, width=30, color=Color.RED
)

NORMAL DISTRIBUTION


distances = []
for kp in biggest_cluster:
    distances.append(distance_between_points(kp, centroid))

mu, sigma = cv2.meanStdDev(np.array(distances))
mu = mu[0][0]
sigma = sigma[0][0]

for kp in skp:
    if distance_between_points(kp, centroid) < (mu + 2*sigma):
        uno.drawRectangle(
            kp.x, kp.y, 20, 20, width=30, color=Color.GREEN
        )
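The same mu + 2*sigma filter can be stated with only the standard library, which makes the idea easier to see than the cv2.meanStdDev version (stdlib-only restatement, my own code):

```python
from statistics import mean, pstdev

def filter_outliers(distances, sigmas=2):
    """Keep only distances within mu + sigmas * sigma of the mean,
    rejecting keypoints far from the centroid as outliers."""
    mu, sigma = mean(distances), pstdev(distances)
    cutoff = mu + sigmas * sigma
    return [d for d in distances if d <= cutoff]

dists = [1.0] * 10 + [100.0]    # ten inliers, one far-away outlier
print(filter_outliers(dists))   # the 100.0 is rejected
```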

NORMAL DISTRIBUTION

REAL WORLD EXAMPLE






DETECTION

HAAR

FACE DETECTION
Haar-like features, Viola-Jones (2001)



HAAR

  • Needs to be trained with hundreds/thousands of sample images
  • Scale invariant
  • NOT Rotation invariant
  • Fast and robust
  • Not only for faces
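Part of why Viola-Jones is fast: Haar-like features are sums over rectangles, and with an integral image any rectangular sum costs four lookups regardless of size. An illustrative sketch:

```python
def integral_image(img):
    """Summed-area table: ii[y][x] holds the sum of img over the
    rectangle from (0, 0) to (y-1, x-1)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = (img[y][x] + ii[y][x + 1]
                                + ii[y + 1][x] - ii[y][x])
    return ii

def rect_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom][left:right] in O(1): four lookups."""
    return (ii[bottom][right] - ii[top][right]
            - ii[bottom][left] + ii[top][left])

img = [[1, 2], [3, 4]]
ii = integral_image(img)
print(rect_sum(ii, 0, 0, 2, 2))   # 10, the whole image
print(rect_sum(ii, 1, 0, 2, 2))   # 7, the bottom row
```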

HAAR


friends.listHaarFeatures()
['right_ear.xml', 'right_eye.xml', 'nose.xml', 'face4.xml', 'glasses.xml', ...]    

faces = friends.findHaarFeatures("face.xml")
faces.draw(width=10, color=Color.RED)
friends.save('result.jpg')

1 miss with face.xml

face2.xml




TRACKING

TRACKING

  • Detection != tracking
  • Uses information from previous frames
  • Needs an initial region telling it what to track


SOME Alternatives

  • Optical flow: Lucas-Kanade
  • Descriptors: SURF
  • Probability/Statistics and histograms: Camshift

CAMSHIFT


  • Effective for tracking simple, constant objects with homogeneous colors, like faces.
  • Gary Bradski, 1998.
  • The original implementation has problems with similar-colored objects nearby, crossing trajectories, and lighting changes.

Simple example

from SimpleCV import *

video = VirtualCamera("jack.mp4", 'video')
video_stream = VideoStream(
    "jack_tracking.mp4", framefill=False, codec="mp4v"
)

disp = Display()

track_set = []
current = video.getImage()

while (disp.isNotDone()):
    frame = video.getImage()
    track_set = frame.track(
        'camshift', track_set, current, [100, 100, 50, 50]
    )
    track_set.drawBB()
    current = frame
    frame.save(video_stream)


MORE COMPLEX

Initialization
video_stream = VideoStream(
    "jack_tracking.avi", framefill=False, 
    codec="mp4v"
)
video = VirtualCamera("jack.mp4", 'video')

disp = Display()

detected = False
current = video.getImage().scale(0.6)
tracked_objects = []
last_diff = None

while (disp.isNotDone()):
    frame = video.getImage().scale(0.6)

    # Scene changes
    diff = cv2.absdiff(frame.getNumpyCv2(), current.getNumpyCv2())
    if last_diff and diff.sum() > last_diff * 6:
        detected = False
    last_diff = diff.sum()

    # Detects faces and restarts tracking
    faces = frame.findHaarFeatures('face2.xml')
    if faces and not detected:
        tracked_objects = []
        final_faces = []
        for face in faces:
            if face.area() > 65:
                tracked_objects.append([])
                final_faces.append(face)
                detected = True


    # Restart if tracking grows too much
    if detected:
        for i, track_set in enumerate(tracked_objects):
            track_set = frame.track(
                'camshift', track_set, current,
                final_faces[i].boundingBox()
            )

            # Restart detection and tracking
            if track_set[-1].area > final_faces[i].area() * 3 \
                or not detected:
                    detected = False
                    break

            # Update tracked object and draw it
            tracked_objects[i] = track_set
            track_set.drawBB()

    current = frame
    frame.save(video_stream)




MOG

BACKGROUND SUBTRACTION


  • Separate people and objects that move (foreground) from the fixed environment (background)
  • MOG: Adaptive Gaussian Mixture Model
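MOG keeps a mixture of several Gaussians per pixel; as a much simpler illustration of the same adaptive idea, here is a toy single-mean running-average background model (my own sketch, not MOG itself):

```python
class RunningAverageBackground:
    """Toy background model: exponential running average per pixel.
    Pixels far from their long-term average are flagged foreground."""

    def __init__(self, learning_rate=0.1, threshold=30):
        self.lr = learning_rate
        self.threshold = threshold
        self.background = None

    def apply(self, frame):
        if self.background is None:
            self.background = [row[:] for row in frame]
        mask = []
        for y, row in enumerate(frame):
            mask_row = []
            for x, value in enumerate(row):
                bg = self.background[y][x]
                mask_row.append(abs(value - bg) > self.threshold)
                # Blend the new frame into the model (adaptation)
                self.background[y][x] = (1 - self.lr) * bg + self.lr * value
            mask.append(mask_row)
        return mask

model = RunningAverageBackground()
model.apply([[10, 10], [10, 10]])           # learn the static scene
mask = model.apply([[10, 200], [10, 10]])   # an object appears
print(mask)   # [[False, True], [False, False]]
```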

BACKGROUND SUBTRACTION

mog = MOGSegmentation(
    history=200, nMixtures=5, backgroundRatio=0.3, noiseSigma=16,
    learningRate=0.3
)

video = VirtualCamera('semaforo.mp4', 'video')
video_stream = VideoStream("mog.mp4", framefill=False, codec="mp4v")
disp = Display()

while (disp.isNotDone()):
    frame = video.getImage().scale(0.5)

    mog.addImage(frame)
    # segmentedImage = mog.getSegmentedImage()
    blobs = mog.getSegmentedBlobs()
    if blobs:
        blobs.draw(width=-1)

    frame.save(video_stream)




RED-LIGHT HAL

Red light runners


1- Detect whether the traffic light is red (otherwise assume green), using hysteresis.
2- Project a line that runners must cross.
3- Run MOG and prune the resulting blobs to find cars.
4- While the light is RED, if a car blob intersects the line, it's a runner.
5- Recognize the car so it is counted only once.
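The hysteresis in step 1 debounces the red/green decision so a single noisy frame cannot flip the state. The talk's code implements it with module-level globals; the same idea isolated as a small self-contained class (my own restatement):

```python
class Hysteresis:
    """Only switch state after `frames` consecutive opposite readings."""

    def __init__(self, frames=5):
        self.frames = frames
        self.red = False
        self.opposite_count = 0

    def update(self, red_detected):
        if red_detected != self.red:
            self.opposite_count += 1
            if self.opposite_count >= self.frames:
                self.red = red_detected
                self.opposite_count = 0
        else:
            self.opposite_count = 0
        return self.red

light = Hysteresis(frames=3)
readings = [True, False, True, True, True]   # one noisy 'green' frame
states = [light.update(r) for r in readings]
print(states)   # [False, False, False, False, True]
```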



red_light_bb = [432, 212, 13, 13]
cross_line = Line(
    frame.scale(0.5), ((329, 230), (10, 360))
)

RED = False
number_of_opposite = 0
HISTERESIS_FRAMES = 5




def is_traffic_light_red(frame):
    red_light = frame.crop(*red_light_bb)

    # BLACK (30, 28, 35)
    # RED  (21, 17, 51)
    if red_light.meanColor()[2] > 42:
        return True

    return False


def hysteresis(red_detected=False, green_detected=False):
    global RED, number_of_opposite
    
    if RED and green_detected:
        number_of_opposite += 1
        if number_of_opposite == HISTERESIS_FRAMES:
            RED = False
            number_of_opposite = 0
    elif not RED and red_detected:
        number_of_opposite += 1
        if number_of_opposite == HISTERESIS_FRAMES:
            RED = True
            number_of_opposite = 0
    else:
        number_of_opposite = 0

while (disp.isNotDone()):
    frame = video.getImage()
    small_frame = frame.scale(0.5)
    mog.addImage(small_frame)

    if is_traffic_light_red(frame):
        hysteresis(red_detected=True)
        if RED:
            blobs = mog.getSegmentedBlobs()
            if blobs:
                big_blobs = blobs.filter(blobs.area() > 1000)

                for car in big_blobs:
                    if cross_line.intersects(car.getFullMask()):
                        # RED LIGHT RUNNER
                        small_frame.drawRectangle(
                            *car.boundingBox(), color=Color.RED, width=3
                        )
    else:
        hysteresis(green_detected=True)

    small_frame.save(disp)

FIRST PROTOTYPE

RASPBERRY


  • Raspberry Pi + SimpleCV + Raspicam
  • Autonomous system, ethernet-connected, uploads runner videos online.
  • No night-time support yet.
  • Slower, not real time: frames during green lights are discarded.




THANKS

QUESTIONS?

COMPUTER VISION

By Miguel Araujo Pérez