This project is currently just a proposal.
School: Indian Institute of Information Technology, SriCity
Primary language (We have mentors who speak multiple languages): English
Typical work hours (We have mentors in various time zones): 8AM-5PM IST
Previous GSoC participation: https://summerofcode.withgoogle.com/archive/2016/projects/6295262146330624/
About your project
Project name: Stereo Vision support for BeagleBone using BeagleCV
The aim of the project is to add stereo vision support to the Beaglebone Black/Blue using BeagleCV. This consists of developing the BeagleCV library and creating custom OpenGLES2 shaders that use the onboard SGX530 3D accelerator for faster computation. The APIs developed for these shaders will let other users write their own computer vision algorithms and run them faster. I will use these APIs to implement a stereo vision algorithm. Finally, if time permits, this project will be added to the Beaglebone Blue APIs.
I will use kernel 4.4.56-bone17. This version has the prebuilt kernel modules omaplfb, tilcdc and pvrsrvkm, which are essential for running the SGX530. The following is the complete set of project goals, with their challenges, which I plan to deliver by the end of the tenure:
Creating shaders for utilizing SGX530: The Beaglebone Blue/Black has an inbuilt PowerVR SGX530 3D accelerator which is capable of performing image processing. By the end of this project, I will create basic shaders using the OpenGLES2 graphics library and the GLSL ES 1.00 shading language. Fragment shaders (FS) will be used to modify the pixel values, and vertex shaders (VS) will be used for transformations and for computing the indices into the image (stored as a sampler2D type). Some of the challenges in implementing image processing algorithms are as follows:
Texture compression: By default I will use only grayscale images for image processing. The algorithms perform best when the images are small. It would therefore be apt to convert images to grayscale and store them in the POWERVR Texture Compression (PVRTC) format, with a compression ratio of 8:1, for better performance.
Floating point precision control: Using a floating-point unit has a large impact on performance. For example, a 5x5 Gaussian filter on an image took 3074.8 ms on a CPU and 207.1 ms on a CPU with a fixed-point implementation, but only 48.90 ms with a parallel implementation on the SGX530.
OpenGLES SL has 3 precision modifiers:
highp: single precision, 32 bit floating point value
mediump: half-precision floating point value (16 bit)
lowp: 10 bit fixed point format, values in the range -2 to 2 with a precision of 1/256
Choosing a lower precision increases performance but may introduce artifacts. I will use lowp precision to represent colors in the range 0.0 to 1.0 to improve the performance of the GPU.
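To get a feel for the size of the artifacts lowp can introduce, the following Python sketch models lowp as a fixed-point value with a step of 1/256, clamped to the range -2 to 2. The helper name quantize_lowp and the round-to-nearest behavior are assumptions made for illustration; the actual rounding done by the SGX530 may differ.

```python
# Assumed model of lowp: fixed point with a step of 2^-8, clamped to [-2, 2).
LOWP_STEP = 1.0 / 256.0
LOWP_MIN = -2.0
LOWP_MAX = 2.0 - LOWP_STEP


def quantize_lowp(value):
    """Clamp a float to the assumed lowp range and round it to the nearest step."""
    clamped = min(max(value, LOWP_MIN), LOWP_MAX)
    return round(clamped / LOWP_STEP) * LOWP_STEP


if __name__ == "__main__":
    # Print each colour intensity, its quantized value and the absolute error.
    for intensity in (0.0, 0.1, 0.5, 1.0):
        q = quantize_lowp(intensity)
        print(intensity, q, abs(q - intensity))
```

Under this model, a color in the range 0.0 to 1.0 is off by at most half a step (1/512), which is why lowp is usually adequate for pixel intensities but not for texture coordinates.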
Load sharing between VS and FS: For image processing algorithms, the number of vertices processed is much lower than the total number of fragments, which run into the millions. Per-vertex work is therefore significantly cheaper overall than per-fragment work, so it is generally recommended to perform calculations per vertex where possible. For example, in filtering, the straightforward approach is to precompute the neighboring texture coordinates in the vertex shader. By moving these calculations to the vertex shader and using the resulting texture coordinates directly, the fragment shader avoids a dependent texture read.
Branching and loop unrolling: Each iteration of a loop costs extra instructions for the increment and compare operations. I will try to eliminate loops wherever possible, either by optimized unrolling or by using vector operations in the shader, to achieve higher performance. Where a loop cannot be unrolled, a constant loop count will be maintained so that dynamic branching is reduced. Similarly, for branching, I will try to branch on a constant known value, as branching on a value computed inside the shader results in significantly lower performance.
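The reason the neighbor coordinates can be moved to the vertex stage is that a fixed texel offset survives linear interpolation unchanged. The Python sketch below checks this numerically for one row of a full-screen quad. The function names are hypothetical, and plain linear interpolation of the varyings is assumed (which holds for a screen-aligned quad with no perspective).

```python
def lerp(a, b, t):
    """Linear interpolation, as assumed for varyings across a screen-aligned quad."""
    return a + (b - a) * t


def vertex_stage(u, texel):
    """Hypothetical vertex-stage output: left, centre and right sample coordinates."""
    return (u - texel, u, u + texel)


def rasterise(varyings_a, varyings_b, t):
    """Interpolate every varying between two vertices at parameter t in [0, 1]."""
    return tuple(lerp(a, b, t) for a, b in zip(varyings_a, varyings_b))


def fragment_stage_naive(u, texel):
    """What each fragment would otherwise have to compute for itself."""
    return (u - texel, u, u + texel)


if __name__ == "__main__":
    texel = 1.0 / 8.0  # assumed 8-texel-wide image
    v0 = vertex_stage(0.0, texel)
    v1 = vertex_stage(1.0, texel)
    for t in (0.0, 0.25, 0.5, 1.0):
        print(t, rasterise(v0, v1, t), fragment_stage_naive(lerp(0.0, 1.0, t), texel))
```

Both columns of the printed output agree, so the offsets only need to be computed at the four quad vertices instead of once per fragment.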
Porting libcvd as BeagleCV: One of the most important deliverables of this project is the BeagleCV library. BeagleCV is a minimized fork of libcvd (a real-time vision library). The sequential execution in the library will be replaced with the optimized shaders. These shaders will be accessible through APIs which will be developed along with them. This will allow other users to write their own CV algorithms and also enable faster computation. I will implement stereo matching and a generic feature extractor using these APIs by the end of the project.
Stereo vision implementation: I will implement a stereo matching algorithm to obtain a disparity map from stereo images. Block Matching (BM) is the most widely used stereo matching algorithm in the embedded community because its computational characteristics favour parallel implementation. The cost function in this local block matching is NCC (Normalized Cross Correlation). To improve the accuracy of the disparity map, an LR-RL consistency check can be included, which will be implemented in a separate shader.
Documentation and examples: I will provide extensive and accurate documentation for whatever I build in this project. Functional documentation for BeagleCV will be written in doxygen. Code documentation will take the form of comments in the source files. I will also provide examples that utilize the GPU and create appropriate documentation of how to use the SGX530 on the Beaglebone Blue/Black.
(Future work) Adding BeagleCV support for Beaglebone Blue APIs: Once BeagleCV is implemented and v2 is released, I will add support for it to the Beaglebone Blue API repository. This would enable users to implement sensor fusion algorithms that help in robotic localization, tracking, detection and navigation.
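As a CPU-side reference for the approach described above, the following Python sketch performs block matching with an NCC cost followed by a left-right consistency check. It is only an illustration under stated assumptions: it uses the zero-mean variant of NCC, works on rectified images stored as lists of rows, and all function names (zncc, block_match, lr_check, make_shifted_pair) are hypothetical rather than part of BeagleCV or libcvd.

```python
import math
import random


def zncc(ref, other, rx, ox, y, size):
    """Zero-mean NCC between size x size windows with top-left corners (rx, y) and (ox, y)."""
    a = [ref[y + j][rx + i] for j in range(size) for i in range(size)]
    b = [other[y + j][ox + i] for j in range(size) for i in range(size)]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    den = math.sqrt(sum((p - mean_a) ** 2 for p in a) * sum((q - mean_b) ** 2 for q in b))
    return num / den if den > 0 else 0.0


def block_match(ref, other, size, max_disp, direction):
    """Best disparity per window; direction=-1 if ref is the left image, +1 if it is the right."""
    h, w = len(ref), len(ref[0])
    last = w - size  # last valid window column
    disp = []
    for y in range(h - size + 1):
        row = []
        for x in range(last + 1):
            best_d, best_score = 0, -2.0
            for d in range(max_disp + 1):
                ox = x + direction * d
                if ox < 0 or ox > last:
                    break  # candidate window would leave the image
                score = zncc(ref, other, x, ox, y, size)
                if score > best_score:
                    best_d, best_score = d, score
            row.append(best_d)
        disp.append(row)
    return disp


def lr_check(disp_left, disp_right, tol=0):
    """Keep a left disparity only if the right map agrees at the matched column, else -1."""
    checked = []
    for y, row in enumerate(disp_left):
        # block_match guarantees x - d >= 0 for the left map.
        checked.append([d if abs(d - disp_right[y][x - d]) <= tol else -1
                        for x, d in enumerate(row)])
    return checked


def make_shifted_pair(width, height, shift, seed=0):
    """Synthetic test pair: the right image is the left image shifted by `shift` columns."""
    rng = random.Random(seed)
    left = [[rng.randint(0, 255) for _ in range(width)] for _ in range(height)]
    right = [[left[y][x + shift] if x + shift < width else rng.randint(0, 255)
              for x in range(width)] for y in range(height)]
    return left, right


if __name__ == "__main__":
    left, right = make_shifted_pair(12, 5, 2, seed=1)
    d_left = block_match(left, right, 3, 4, -1)
    d_right = block_match(right, left, 3, 4, +1)
    for row in lr_check(d_left, d_right):
        print(row)
```

Each window is scored independently of every other window, which is the property that makes the algorithm a good fit for one-fragment-per-pixel execution on the GPU.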
Google Summer of Code stretches over a period of 12 weeks with the Phase-1, Phase-2 and final evaluations in the 4th, 8th and the 12th week respectively. Following are the timelines and milestones which I want to strictly follow throughout the project tenure:
May 30 - June 13 Week-1,2
Aim: Implementing OpenGLES2 shaders for Block Matching (BM) algorithm
Description: Utility functions for a test shader will be implemented and APIs will be framed. Part 1 of the shader implementations. The following shader functions will be written in these weeks:
sampler2D scale_sample(sampler2D image, vec4 top, vec4 bottom, vec2 scale): Scale the image
@top: coordinates of the top boundary
@bottom: coordinates of the bottom boundary
@scale: scaling parameters in the x and y directions
Return the scaled image

sampler2D translate_sample(sampler2D image, vec4 top, vec4 bottom, vec2 trans): Translate the image
@top: coordinates of the top boundary
@bottom: coordinates of the bottom boundary
@trans: translation coordinates, namely x and y
Return the translated image

sampler2D rotate_sample(sampler2D image, vec4 top, vec4 bottom, float angle): Rotate the image
@top: coordinates of the top boundary
@bottom: coordinates of the bottom boundary
@angle: rotation angle (in radians)
Return the rotated image

sampler2D rectify_sample(sampler2D image1, sampler2D image2, mat3 camera1, mat3 camera2, mat4 ext): Apply the rectification operation
@camera1: 3x3 camera matrix for the left image
@camera2: 3x3 camera matrix for the right image
@ext: 4x4 extrinsic matrix for rotation and translation

sampler2D convolve_sample(sampler2D image, mat2/mat3/mat4 kernel): Convolve an image with a given 2x2, 3x3 or 4x4 filter
@kernel: kernel matrix specified by the user
Return the convolved image
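A CPU reference is useful for unit-testing the shader output. The sketch below is one possible reference for convolve_sample, written in Python. The name convolve_reference is hypothetical, and three behaviors are assumptions not fixed by the specification above: the kernel is applied without flipping (as a shader sampling its taps in order would), reads outside the image are clamped to the edge, and the anchor of an even-sized kernel is taken as size // 2.

```python
def clamp(value, lo, hi):
    """Restrict value to the closed interval [lo, hi]."""
    return max(lo, min(value, hi))


def convolve_reference(image, kernel):
    """Filter `image` (list of rows) with a square `kernel` of size 2, 3 or 4.

    Assumptions: no kernel flip, clamp-to-edge borders, anchor at size // 2.
    """
    h, w = len(image), len(image[0])
    k = len(kernel)
    anchor = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j in range(k):
                for i in range(k):
                    sy = clamp(y + j - anchor, 0, h - 1)
                    sx = clamp(x + i - anchor, 0, w - 1)
                    acc += kernel[j][i] * image[sy][sx]
            out[y][x] = acc
    return out


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    box = [[1.0 / 9.0] * 3 for _ in range(3)]
    for row in convolve_reference(image, box):
        print(row)
```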
June 14 - June 27 Week-3,4 *Phase-1*
Aim: Cleaning, minimizing and testing bare BeagleCV
Description: Shaders (Part 2). The following are some of the shaders which will be implemented, along with APIs enabling users to utilize these shader functions.
float trace/det_matrix(mat4 matrix): Find the trace/det of the matrix
@matrix: 4x4 (3x3 or 2x2) matrix
Return the trace/det of the matrix

float compute_ncc(sampler2D image1, sampler2D image2, vec2 start1, vec2 start2, int size): Correlation computation (NCC) with a support window
@start1: left corner coordinates of the fragment on the first image image1
@start2: left corner coordinates of the fragment on the second image image2
@size: size of the support window

float compute_autocorr(sampler2D image1, vec2 start1, vec2 start2, int size): Autocorrelation computation with a support window
@start1: left corner coordinates of the first fragment on the first image image1
@start2: left corner coordinates of the second fragment on the first image image1
@size: size of the support window

sampler2D convolve_sample(sampler2D image, mat2/mat3/mat4 kernel): Convolve an image with a given 2x2, 3x3 or 4x4 filter
@kernel: kernel matrix specified by the user
Return the convolved image

mat4 inverse_matrix(mat4 matrix): Find the inverse of the matrix
@matrix: 4x4 (3x3 or 2x2) matrix
Return the inverse of the matrix
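For checking the matrix helpers against known values, a small Python reference can be written with cofactor expansion, which is also how these functions could be expressed in a shader without loops of variable length. The sketch below is an assumption-laden illustration: it reuses the function names from the list above for readability, supports only 2x2 to 4x4 matrices stored as lists of rows, and is not the planned GLSL implementation.

```python
def minor(m, row, col):
    """Return matrix m with the given row and column removed."""
    return [[v for c, v in enumerate(r) if c != col]
            for i, r in enumerate(m) if i != row]


def trace_matrix(m):
    """Sum of the diagonal entries."""
    return sum(m[i][i] for i in range(len(m)))


def det_matrix(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** c * m[0][c] * det_matrix(minor(m, 0, c))
               for c in range(len(m)))


def inverse_matrix(m):
    """Inverse as adjugate / determinant (intended for 2x2 to 4x4 matrices)."""
    d = det_matrix(m)
    if d == 0:
        raise ValueError("matrix is singular")
    n = len(m)
    # Entry (r, c) of the inverse is the cofactor of entry (c, r) divided by det.
    return [[(-1) ** (r + c) * det_matrix(minor(m, c, r)) / d for c in range(n)]
            for r in range(n)]


if __name__ == "__main__":
    m = [[4, 7], [2, 6]]
    print(trace_matrix(m), det_matrix(m), inverse_matrix(m))
```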
June 28 - July 11 Week-5,6
Aim: Implementing BM using OpenGLES2 shaders
Description: The algorithm will be implemented using the shaders created in the previous weeks. Based on the performance of the algorithm, vertex shaders will be added to minimize the cost of computation. The implementation details can be found here.
July 12 - July 18 Week 7
Aim: Testing stereo matching algorithm and checking performance
Description: Generating the final disparity map given 2 stereo images. Evaluating the algorithm using the popular Tsukuba stereo dataset. Checking the accuracy of the algorithm, fixing bugs and documenting the approach.
July 19 - August 1 Week-8,9
Aim: Implementing basic OpenGLES2 shaders
Description: Part 3 of the shader implementations. The following are the fragment shaders which will be coded:
sampler2D gradx_sample(sampler2D image, float weight): Compute weighted gradients in the x direction for the image
@weight: for a weighted gradient, otherwise set weight=1
Return the computed gradient in the x direction

sampler2D grady_sample(sampler2D image, float weight): Compute weighted gradients in the y direction for the image
@weight: for a weighted gradient, otherwise set weight=1
Return the computed gradient in the y direction

sampler2D threshold_sample(sampler2D image, float high, float low): Threshold the image based on higher and lower thresholds
@high: high threshold specified by the intensity value
@low: low threshold specified by the intensity value
Return the thresholded image
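The following Python sketch gives one possible CPU reference for these three shaders. The specification above does not fix the gradient stencil or the exact threshold semantics, so the choices here are assumptions: a central difference with clamp-to-edge borders for the gradients, and a band threshold that outputs 1.0 for intensities between low and high (inclusive) and 0.0 otherwise. The *_reference names are hypothetical.

```python
def gradx_reference(image, weight=1.0):
    """Weighted central difference along x, with clamp-to-edge borders (assumed stencil)."""
    h, w = len(image), len(image[0])
    return [[weight * (image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]) / 2.0
             for x in range(w)] for y in range(h)]


def grady_reference(image, weight=1.0):
    """Weighted central difference along y, with clamp-to-edge borders (assumed stencil)."""
    h, w = len(image), len(image[0])
    return [[weight * (image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]) / 2.0
             for x in range(w)] for y in range(h)]


def threshold_reference(image, high, low):
    """Band threshold (assumed semantics): 1.0 where low <= value <= high, else 0.0."""
    return [[1.0 if low <= v <= high else 0.0 for v in row] for row in image]


if __name__ == "__main__":
    ramp = [[0, 1, 2], [0, 1, 2], [0, 1, 2]]  # intensity increases along x only
    print(gradx_reference(ramp))
    print(grady_reference(ramp))
    print(threshold_reference(ramp, high=1.5, low=0.5))
```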
August 2 - August 15 Week-10,11
Aim: Implementing example algorithm using OpenGLES2 shaders
Description: A feature extractor (SURF if time permits or else Harris corner) will be implemented as an example program to make use of the APIs coded earlier. Unit tests will be run and any bugs pertaining to this source code will be fixed. The implementation details can be found in this paper.
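For the Harris corner fallback, the corner response combines the gradient, trace and determinant functions planned in the earlier weeks. The Python sketch below computes the standard Harris response R = det(M) - k * trace(M)^2 for one window, where M is the structure tensor summed over the window. The function name, the unweighted (box) window and the value k = 0.04 are assumptions for illustration; a Gaussian-weighted window is also common.

```python
def harris_response(grad_x, grad_y, x, y, size, k=0.04):
    """Harris corner response for the size x size window with top-left corner (x, y).

    grad_x and grad_y are gradient images stored as lists of rows.
    k = 0.04 is an assumed, commonly used sensitivity constant.
    """
    sxx = sxy = syy = 0.0
    for j in range(y, y + size):
        for i in range(x, x + size):
            gx, gy = grad_x[j][i], grad_y[j][i]
            sxx += gx * gx
            sxy += gx * gy
            syy += gy * gy
    det = sxx * syy - sxy * sxy   # determinant of the structure tensor
    trace = sxx + syy             # trace of the structure tensor
    return det - k * trace * trace


if __name__ == "__main__":
    # Gradients in both directions within the window: corner-like, positive response.
    print(harris_response([[1, 0], [0, 0]], [[0, 1], [0, 0]], 0, 0, 2))
    # Gradient in x only: edge-like, negative response.
    print(harris_response([[1, 1], [1, 1]], [[0, 0], [0, 0]], 0, 0, 2))
```

A positive response indicates a corner, a negative response an edge, and a response near zero a flat region, so the threshold shader can be applied to the response map to select candidate corners.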
August 16 - August 22 Week-12
Aim: Performance evaluation and documentation of the SURF algorithm
Description: Reserve week for testing the SURF algorithm. I will also check the performance of the examples with respect to CPU, GPU and CPU+GPU support. All the functionalities will be properly documented.
August 23 - August 29 Week-13
Aim: FINAL EVALUATION !!
Description: Checking and fixing bugs. Refining the previous documentation so that it is easier to understand. Checking the final implementation and doing a full run-through again. Final commit to the beaglecv repository and releasing v2.
August 29 onwards: Future work
Aim: Adding BeagleCV with Beaglebone Blue APIs
Description: Once BeagleCV is stable, it will be added to the Beaglebone Blue APIs to create support for vision along with other sensors. I will also maintain the BeagleCV library by adding more algorithms and fixing any bugs pertaining to Beaglebone.
Experience and approach
I am a fourth-year undergraduate student studying in India. Besides having a keen interest in Robotics, Computer vision and Machine learning, I also like hacking on embedded boards, especially to make agile robots. I would like to work on an open-source project this summer because it is interesting, and contributing to the project is fun and exciting. I have not worked much on open source before, but I have some idea of how things work in the open-source community, which I find fascinating.
Accurate and Augmented Localization and Mapping for Indoor Quadcopters: In this project, a state-estimation system for quadcopters operating in indoor environments was developed that enables the quadcopter to localize itself on a globally scaled map reconstructed by the system. To estimate the pose and the global map, we use ORB-SLAM fused with onboard metric sensors, along with a 2D LIDAR mounted on the quadcopter which helps in robust tracking and scale estimation.
Enhancing Visual SLAM using IMU and Sonar: Increased the accuracy and robustness of ORB-SLAM by integrating an Extended Kalman Filter (EKF) that fuses the IMU and sonar measurements. The scale of the map is estimated by a closed-form Maximum Likelihood approach.
Semi-Autonomous Quadcopter for Person Following: Developed an IBVS based robotic system, implemented on Parrot AR Drone, which is capable of following a person or any moving object and simultaneously measuring the localized coordinates of the quadcopter, on a scaled map.
API Support for Beaglebone Blue: Created easy-to-use APIs for Beaglebone Blue. With these APIs, applications can be directly ported onto the board. This project was a collaboration of Beagleboard.org with the University of California, San Diego as part of Google Summer of Code 2016.
Intelligent Parking system: This module is a part of an ADS (Autonomous Driving System) used for accurate autonomous parking. The Beaglebone Black in the robot finds the set point by matching features using SURF descriptors on the template image and directs the output to the actuators (motors) connected to the PRU (Programmable Real-time Unit).
If I get stuck on my project and cannot reach my mentor, I will google the error and research it myself. I personally feel that there is nothing in this world which is not present on the internet. I will also try to get help from the other developers present on the IRC.
kiran4399: As a robotics researcher, I personally feel that students would benefit greatly from BeagleCV, its functionalities and applications: rather than using just the data from low-level sensors, they could apply many high-level concepts like visual tracking, localization, detection, pose estimation etc. By adding BeagleCV to the BeagleBone APIs, it would be very easy to implement sensor-fusion algorithms.
***Mentors!!! please add your Quote here.***
I plan all my work properly and sketch out a routine so that the planned work gets completed within the given time. I always set priorities and keep priority management above time management. My policy is: “Hard-work beats talent when talent doesn't work hard !!”. I strongly feel that striving to know something is the best way to learn it. I can assure you that I will work around 50-55 hours a week, with no other object of interest competing for my time. I also hope to gain a lot of learning experience throughout the program and come closer to the open-source world.