In-focus Region Detection Using Harmonic Variance of DCT Band-pass Filtering

Detecting in-focus regions in a low depth-of-field (DoF) image has important applications in scene understanding, object-based coding, image quality assessment and depth estimation because such regions may indicate semantically meaningful objects.

Here I show a challenging example,  a spider web. Note that some parts of the defocus regions still demonstrate strong patch variances.

spider-web-macro-photography

In our paper, we show that by simply using DCT band-pass filtering and harmonic variance, we could get very robust in-focus measurement, as shown in the following picture. The idea is simple. Defocus regions only cover some low frequency bands because of lens blur, while in-focus region cover the full spectrum bands. By using DCT band-pass filtering, we could extract those frequency responses. If we just sum those responses all together, the final estimate could be well biased due to very high responses of just some low frequency channels. Harmonic mean solves this problem by penalizing those outliers and could generate very robust outputs.

hv

In short, this algorithm works by
1) apply 63 DCT band-pass filtering (for 8×8 patches) to the input image, excluding the first DC/average filtering
2) compute patch variance for each 63 filtered images. Now at every pixel location (x, y), we have 63 variance measurements
3) compute harmonic mean of these 63 variance as harmonic variance
4) optionally, we could apply sigmoid function to the harmonic variance image to restrict the outputs to be [0, 1].

If you want to estimate the actual noise level of the input image, please refer to our paper for kurtosis analysis.

import numpy as np
import scipy.misc as misc
import scipy.ndimage as ndimage
import scipy.stats as stats
import matplotlib.pyplot as plt

def Sigmoid(x, ratio) :
    return 1 / (1 + np.exp(-ratio * x) )

def Tanh(x, ratio) :
    return 2 * Sigmoid(2 * x, ratio) - 1

def GenerateDCT8x8Base() :

    N = 8
    x = np.array( np.arange(0, N ), dtype=np.float32, ndmin=2 )

    C = np.cos( np.transpose(2*x+1) * x * np.pi / (2.0*N) ) * np.sqrt(2.0/N)
    C[:, 0] = C[:, 0] / np.sqrt(2.0)

    plt.close('all')
    plt.figure()

    filters = []
    for i in range( N ) :
        for j in range( N ) :

            X = np.transpose(np.array(C[:, i], ndmin=2)) * np.transpose( C[:, j])
            im = misc.imresize(X, (128, 128), interp='nearest')
            plt.subplot(N, N, i * N + j + 1)
            plt.imshow(im, cmap=plt.cm.gray)
            plt.axis('off')

            filters.append( X )

    del filters[0] 

    return filters

def ComputeHarmonicVariance(im, kernels):

    size = list(im.shape)
    size.append(len(kernels))

    k_size = kernels[0].shape
    k_avg = np.ones( k_size, np.float32) / (k_size[0] * k_size[1])

    outs = np.empty( size, dtype=np.float32)
    for i in range(len(kernels)) :
        data = ndimage.convolve(im, kernels[i])
        data2 = data * data

        EX2 = ndimage.convolve(data2, k_avg)
        E2X = np.square(ndimage.convolve(data, k_avg))

        outs[:,:,i] = EX2 - E2X

    variance = np.empty( im.shape, dtype=np.float32 )
    for y in range(size[0]):
        for x in range(size[1]):
            data = outs[y, x, :]
            variance[y, x] = stats.hmean(outs[y,x,:]+1e-8)

    return variance            

im = misc.imread('Spider-Web-Macro-Photography.jpg').astype(np.float32)
im = im[:,:,0] # extract Red channel for demonstration
kernels = GenerateDCT8x8Base()
hv = ComputeHarmonicVariance(im, kernels)
plt.figure(), plt.imshow(Tanh(hv, 0.01), cmap=plt.cm.gray), plt.title('hv t 0.01')
Advertisements

Multiply Strings

Given two numbers represented as strings, return multiplication of the numbers as a string.
Note: The numbers can be arbitrarily large and are non-negative.

string stringMultip(string &A, string &B) {

	string res(A.size() + B.size(), '0');
	
	string::reverse_iterator rit1, rit2, rit3, rit4;
	int num_a, num_b, num_c, num_carry, num_res, i;
	
	for(rit1 = A.rbegin(), rit3 = res.rbegin(); rit1 != A.rend(); rit1 ++, rit3 ++) {
		
		num_a = (*rit1) - '0';
		num_carry = 0;
		
		for(rit2 = B.rbegin(), rit4 = rit3; rit2 != B.rend(); rit2 ++, rit4 ++) {
			
			num_b = (*rit2) - '0';
			num_c = (*rit4) - '0';
			
			num_res = num_a * num_b + num_c + num_carry;
			num_carry = num_res / 10;
			(*rit4) = '0' + (num_res % 10);
		}

		// put the carry back to the result
		(*rit4) = '0' + num_carry;
	}

	// remove leading '0's
	for(i = 0; i < res.size(); i ++)
		if(res[i] != '0') 
			break;

	return res.erase(0, i);
}

The following provides a simple test case.

#include <iostream>
#include <string>

using namespace std;

int main ()
{
	string A("123456");
	string B("294987");
	string C("123");
	string D("234");	

	cout << stringMultip(A, B) <<endl;
	cout << stringMultip(C, D) <<endl;
	
	return 0;
}

Binary Tree Postorder Traversal using Stack

The problem:
Given a binary tree, return the postorder traversal of its nodes’ values. For example, given binary tree:
a              1
          /      \
       4           2
         \       /
e           10  3
return [10, 4, 3, 2, 1]

#include <iostream>
#include <stack>
#include <string>

using namespace std;

struct TreeNode {
    int val;
    TreeNode *left;
    TreeNode *right;
    TreeNode(int x): val(x), left(nullptr), right(nullptr) {};
};

int postorderTraversal(TreeNode *root) {

    const TreeNode *p = root;
    stack<const TreeNode*> s;

    if(p!=nullptr) {
        TreeNode x(p->val);
        s.push(&x);

        if(p->right != nullptr) s.push(p->right);
        if(p->left != nullptr) s.push(p->left);
    }

    while(!s.empty()) {

        p = s.top();
        s.pop();

        if(p->left == nullptr && p->right == nullptr)
            printf("%d\n", p->val);
        else {
            TreeNode x(p->val);
            s.push(&x);

            if(p->right != nullptr) s.push(p->right);
            if(p->left != nullptr) s.push(p->left);
        }
    }

    return 0;
}

int main() {

    TreeNode root(1);
    TreeNode node1(2);
    TreeNode node2(3);
    TreeNode node3(4);
    TreeNode node4(10);

    root.left = &node3;
    root.right = &node1;
    node1.left = &node2;
    node3.right = &node4;

    postorderTraversal(&root);

    return 0;
}

The similar code structure could be used for binary tree inorder traversal. The key idea here is to create a temporary node which holds only the value and push it into the stack. Whenever we meet a node without both childern, we print out its value.

A simple implementation of sobel filtering in Python

One can directly use ‘ndimage’ of scipy to compute the sobel filtering of the input image as follows:

dx = ndimage.sobel(im, 0)  # horizontal derivative
dy = ndimage.sobel(im, 1)  # vertical derivative
mag = np.hypot(dx, dy)  # magnitude
mag *= 255.0 / np.max(mag)  # normalize

Or your can write the function by yourself and add more features to it.

def sobel_filter(im, k_size):
    
    im = im.astype(np.float)
    width, height, c = im.shape
    if c > 1:
        img = 0.2126 * im[:,:,0] + 0.7152 * im[:,:,1] + 0.0722 * im[:,:,2]
    else:
        img = im
    
    assert(k_size == 3 or k_size == 5);
    
    if k_size == 3:
        kh = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype = np.float)
        kv = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype = np.float)
    else:
        kh = np.array([[-1, -2, 0, 2, 1], 
                   [-4, -8, 0, 8, 4], 
                   [-6, -12, 0, 12, 6],
                   [-4, -8, 0, 8, 4],
                   [-1, -2, 0, 2, 1]], dtype = np.float)
        kv = np.array([[1, 4, 6, 4, 1], 
                   [2, 8, 12, 8, 2],
                   [0, 0, 0, 0, 0], 
                   [-2, -8, -12, -8, -2],
                   [-1, -4, -6, -4, -1]], dtype = np.float)
    
    gx = signal.convolve2d(img, kh, mode='same', boundary = 'symm', fillvalue=0)
    gy = signal.convolve2d(img, kv, mode='same', boundary = 'symm', fillvalue=0)

    g = np.sqrt(gx * gx + gy * gy)
    g *= 255.0 / np.max(g)
  
    #plt.figure()
    #plt.imshow(g, cmap=plt.cm.gray)      
  
    return g

Longest Valid Parentheses

The Problem:

Given a string containing just the characters ‘(‘ and ‘)’, find the length of the longest valid (well formed) parentheses sub-strings.

#include <iostream>
#include <string>
#include <stack>

using namespace std;

int longest_valid_parentheses(string const& s) {

    int count = 0, max_count = 0;
    stack<char> stk;

    for(auto c : s) {

        if(c == '(')
            stk.push(c);
        else {
            if( stk.empty() ) //|| stk.top() != '(')
                count = 0;
            else {
                count += 2;
                stk.pop();
            }
        }

        max_count = count > max_count ? count : max_count;
    }

    return max_count;
}

You can test the above code using the following test cases @ http://www.cpp.sh/.

int main()
{
  string test1("(()");
  string test2("(()()(");
  string test3(")()()))(((())))");

  cout << "there are " << longest_valid_parentheses(test1) << " valid parentheses in " << test1 << endl;
  cout << "there are " << longest_valid_parentheses(test2) << " valid parentheses in " << test2 << endl;
  cout << "there are " << longest_valid_parentheses(test3) << " valid parentheses in " << test3 << endl;

  return 0;
}

The output would look like as the follows:

there are 2 valid parentheses in (() 
there are 4 valid parentheses in (()()( 
there are 8 valid parentheses in )()()))(((())))

Save Image into PPM ASCII Format in Matlab and Python

Here I wrote a function to save image/matrix into PPM format. I tried to simplify the saving process by using the matlab function ‘dlmwrite’ such that I only need to write the header for this PPM format. However, before streaming the image data, different channels of the input image must be interleaved. I experimented two ways to do this. Both work just fine, and the use of matlab function ‘cat’ (or ‘permute’) make the implementation very simple.

function write_ppm(im, fname, xval)

[height, width, c] = size(im);
assert(c == 3);

fid = fopen( fname, 'w' );

%% write headers
fprintf( fid, 'P3\n' );
fprintf( fid, '%d %d\n', width, height);
fprintf( fid, '%d \n', xval); %maximum values

fclose( fid ); 

%% interleave image channels before streaming
c1 = im(:, :, 1)';
c2 = im(:, :, 2)';
c3 = im(:, :, 3)';
im1 = cat(2, c1(:), c2(:), c3(:));

%% data streaming, could be slow if the image is large
dlmwrite(fname, int32(im1), '-append', 'delimiter', '\n')

The implementation in Python is even simpler. ‘f.write()’ is capable of writing the entire image in one function call with the help of ‘\n’.join().

def write_ppm(fname, im, xval):
    
    height, width, nc = im.shape
    assert nc == 3
    
    f = open(fname, 'w')
    
    f.write('P3\n')
    f.write(str(width)+' '+str(height)+'\n')
    f.write(str(xval)+'\n')
    
    # interleave image channels before streaming    
    c1 = np.reshape(im[:, :, 0], (width*height, 1))
    c2 = np.reshape(im[:, :, 1], (width*height, 1))
    c3 = np.reshape(im[:, :, 2], (width*height, 1))
    
    im1 = np.hstack([c1, c2, c3])
    im2 = im1.reshape(width*height*3)

    f.write('\n'.join(im2.astype('str')))
    f.write('\n')

    f.close()

Phase difference calculation for Phase Detection Autofocus (PDAF)

Phase Detection Autofocus (PDAF) is one of the key advantages of D-SLR cameras over conventional Point-and-Shoot cameras, which usually employ contrast based autofocus system by sweeping through the focal range and stopping at the point where maximum contrast is detected. A good tutorial about how PDAF works can be found @ http://photographylife.com/how-phase-detection-autofocus-works.

In the following, I show how to find the phase difference in Python when two separate distributions are obtained from two separate AF sensors corresponding to one particular AF point. The function ‘correlate’ actually does a brute-force search in the entire phase shift space and returns a phase shift value which corresponds to the maximum correlation value.

import numpy as np
from scipy import signal

def gaussian(x, mu, sig):
    return np.exp(-np.power(x - mu, 2.) / 2 * np.power(sig, 2.))

n_ele = 100
dt = np.linspace(0, 10, n_ele)
g1 = gaussian(dt, 2, 1.3)
g2 = gaussian(dt, 6, 2.2)

#Cross-correlation of two 1-dimensional sequences. This function 
# computes the correlation as generally defined in signal 
# processing texts: z[k] = sum_n a[n] * conj(v[n+k])
# with a and v sequences being zero-padded where necessary 
# and conj being the conjugate
xcorr = signal.correlate(g1, g2)

# The peak of the cross-correlation gives the shift between the 
# two signals The xcorr array goes from -nsamples to nsamples
dt2 = np.linspace(-dt[-1], dt[-1], 2 * n_ele - 1)
print 'phase shift: ', dt2[xcorr.argmax()]

Interested readers are encouraged to read more about phase difference calculation @ http://stackoverflow.com/questions/6157791/find-phase-difference-between-two-inharmonic-waves

To accelerate the calculation, people usually transform this problem into Fourier domain to determine the phase correlation between two PDAF sensors, which is also a general technique for translation, rotation, and scale-invariant image registration. For example, given the following two images, with only translational movement between them,

trans_t2trans_t1

one can compute the phase correlation between the two input images as:

import numpy as np
from scipy import misc
import matplotlib.pyplot as plt

def rgb2gray(rgb):
    return np.dot(rgb[...,:3], [0.299, 0.587, 0.144])

im1 = misc.imread('trans_t2.png')
im2 = misc.imread('trans_t1.png')

img1 = rgb2gray(im1).astype(np.float)
img2 = rgb2gray(im2).astype(np.float)

f1 = np.fft.fftshift(np.fft.fft2(img1))
f2 = np.fft.fftshift(np.fft.fft2(img2))

tmp1 = np.multiply(f1, np.conjugate(f2))
tmp2 = np.abs(np.multiply(f1, f2))
tmp3 = np.abs(np.fft.ifft2(tmp1 / tmp2))

translation = np.unravel_index(np.argmax(tmp3), tmp3.shape)
print translation

For more details, please refer to :
Reddy, B. S. and Chatterji, B. N., An FFT-Based Technique for Translation, Rotation, and Scale-Invariant Image Registration, IEEE Transactions on Image Processing, Vol. 5, No. 8, August 1996.