Automated Vertebra Numbering and Plane Prescription along the Spine Using a Multi Model Atlas
Manually numbering vertebrae and prescribing planes on spinal MRI images is a tedious and time-consuming task. This paper describes a technology for automatic annotation, numbering, and oblique axial plane prescription along the vertebral column. A robust solution was developed for the cervical, thoracic, and lumbar regions. [Link] [Abstract Link]
Improving Search Engine Relevance using Relevance Feedback from User Gestures
The ultimate aim of any search engine is to return results that satisfy the user’s information need. Search results can be improved if the user tells the system which documents were useful; this is called relevance feedback. In this project, we used the user’s head movements as a form of implicit relevance feedback. The main advantage of head movements is that users do not have to mark relevant results explicitly: the system relies on the user’s involuntary gestures, which makes the process less tedious.
The project was divided into two parts: (i) gesture recognition and (ii) relevance feedback. Gesture recognition was achieved using optical flow and feature tracking. Relevance feedback was used for automatic query reformulation. The system worked well both in recognizing gestures and in providing relevant results based on the feedback. Users found the system helpful but not fully unobtrusive. The study suggests that more relevant search results can be obtained by using the user’s gestures. [Report]
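The description does not say which reformulation method was used. As an illustration only, the sketch below uses the classic Rocchio update, a standard relevance feedback technique swapped in here as an assumption. The function names, the weights (alpha, beta, gamma), and the idea that a recognized gesture labels a result as relevant or non-relevant are all hypothetical.

```python
from collections import Counter


def rocchio(query, relevant_docs, nonrelevant_docs,
            alpha=1.0, beta=0.75, gamma=0.15):
    """Reweight query terms with the Rocchio update (assumed method).

    query and each document are lists of terms. Documents labeled
    relevant pull their terms into the query; non-relevant ones push
    their terms out.
    """
    weights = Counter()
    for term, count in Counter(query).items():
        weights[term] += alpha * count
    for docs, coeff in ((relevant_docs, beta), (nonrelevant_docs, -gamma)):
        if not docs:
            continue
        for doc in docs:
            for term, tf in Counter(doc).items():
                # Average the term frequency over the documents in the group.
                weights[term] += coeff * tf / len(docs)
    # Terms that end up with a non-positive weight are dropped.
    return {term: w for term, w in weights.items() if w > 0}


def expand_query(query, relevant_docs, nonrelevant_docs, extra_terms=2):
    """Append the highest-weighted new terms to the original query."""
    weights = rocchio(query, relevant_docs, nonrelevant_docs)
    candidates = sorted((t for t in weights if t not in query),
                        key=lambda t: (-weights[t], t))
    return list(query) + candidates[:extra_terms]
```

In a gesture-driven setup, the two document lists would be filled by the gesture recognizer rather than by explicit clicks; everything downstream of that labeling stays the same.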
Transfer Learning a VGG-16 for CIFAR-10 Dataset
Trained a CNN to classify the CIFAR-10 dataset using a VGG16 pretrained on ImageNet as the base. The approach is to transfer the first three blocks (top layers) of the VGG16 network, add FC layers on top of them, and train the resulting network on CIFAR-10. Two approaches were trained for 50 epochs: 1. keeping the base model’s layers fixed, and 2. training end-to-end. The first approach reached a validation accuracy of 95.06%; the second reached 97.41%. [Github] [CNN Architectures - Review] [Adversarial Examples - Review]
Supplier – Retailer Shrinkage Management, LH Ventures
Working with LH Ventures on a software development project in a team of 5. The focus is on learning how to manage software development, acquiring skills like project tracking, designing architecture, managing risk, quality and configuration. [Hosted Link]
3D Object Reconstruction from Hand-Object Interactions
Implemented the 2015 ICCV paper of the same title by Tzionas et al. for the Computer Vision course project. A symmetric, texture-less, feature-less 3D object was reconstructed from 2D images of the object being rotated by a hand. The point cloud of the hand was used to register the object point cloud. [Poster] [Report]
Object Detection using CIFAR-10 database
Implemented and evaluated three machine learning algorithms to detect objects in the CIFAR-10 database: SVMs, neural networks, and logistic regression. The best accuracy, 59%, was achieved by an SVM with an RBF kernel.
Video Tampering Detection using Digital Image Watermarking
The problem of video authentication has been explored widely by researchers across the globe, so many algorithms are available on this topic. The two main approaches are digital signatures and digital watermarking. Digital watermarking extracts visual features and embeds them into the underlying frame of the video, whereas digital signature schemes often add some information to the frame that is visible to the user. In this project, we proposed a robust digital watermarking algorithm that counters the global attacks used to alter the content of a video: frame addition, removal, and shuffling, and object addition and removal. As part of the project, two existing algorithms were implemented as baselines: one entropy-based, and the other based on measures of decay in the wavelet transform of the frames. The solution we proposed used both inter-frame and intra-frame image properties. We used properties such as the edges within a frame and the Hamming distance between consecutive frames to form a vector, which is watermarked into the image in the form of pre-defined pixel-value manipulations. This vector is recovered from the watermark and matched against the vector recomputed from the video to check for tampering.
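The inter-frame part of this check can be sketched as follows. This is a minimal illustration, not the project's algorithm: the binarization rule (thresholding at the frame mean), the function names, and the tolerance parameter are all assumptions, and the watermark embedding and recovery step is left out, with the recovered vector simply passed in as `stored_signature`.

```python
def frame_bits(frame):
    """Binarize a grayscale frame (a list of rows) against its mean intensity."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def inter_frame_signature(frames):
    """Hamming distance between each pair of consecutive frames."""
    bits = [frame_bits(f) for f in frames]
    return [hamming(bits[i], bits[i + 1]) for i in range(len(bits) - 1)]


def is_tampered(stored_signature, frames, tolerance=0):
    """Compare the stored vector with the one recomputed from the video."""
    current = inter_frame_signature(frames)
    if len(current) != len(stored_signature):
        # Frame addition or removal changes the number of frame pairs.
        return True
    return any(abs(a - b) > tolerance
               for a, b in zip(stored_signature, current))
```

Frame shuffling changes which frames are adjacent, so it tends to alter the distance vector even though the frame count is unchanged. Detecting object-level edits would rely on the intra-frame properties, which this sketch does not cover.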
The face is the most easily observable characteristic feature of a human. The main challenge in face mosaicing lies in the variation between face images in pose, illumination, and expression, of which pose is the hardest to deal with. The project builds on two papers. In A Multi-Resolution Spline with Application To Image Mosaics, the authors define a multiresolution technique for combining two or more images into a larger image mosaic. In A Mosaicing Scheme for Pose-Invariant Face Recognition, two side-profile images are aligned with the frontal image to produce a composite face image. The latter paper was the basis of our implementation for the course project. [Report]
Real Time Facial Expression Recognition in Video using Active Shape Models and Support Vector Machines
Facial expression constitutes 55 percent of the effect of a communicated message and is hence a major modality in human communication. The literature establishes six basic emotions: joy, sadness, disgust, anger, surprise and fear. Most other emotions can be mapped as combinations of these. Many different approaches to recognising them exist in the literature. This project implemented the 2003 ICMI paper Real Time Facial Expression Recognition in Video using Support Vector Machines by Michel et al., which recognises facial expressions in real-time video by applying a support vector machine to features extracted from consecutive frames of a face making an expression. The paper used a proprietary tracker to detect the facial features; here, the Active Shape Models library Stasm was used instead to find and track them. [Report]
Information Seeking Support System for Low Literacy Users
A query optimizing module to work on top of existing search engines to help low-literacy users.
TwiTraffic - Traffic Density Estimation with updates on Twitter
As more and more vehicles crowd the roads, traffic jams are becoming an everyday norm. To help users reach their destinations without getting stuck in jams, we propose an application that helps them make an informed choice of route and possibly avoid crowded roads. Assuming many users, each user’s location and speed are aggregated at a data center to determine where traffic is stuck, and the updates are broadcast on Twitter. Using these updates, another user who is about to set out can alter the route at his or her convenience. [Github]
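The aggregation step could look roughly like the sketch below. It is an illustration under stated assumptions: the report format (a road segment name paired with a speed in km/h), the threshold values, and the function names are hypothetical, and the step that posts to Twitter is reduced to formatting the message text.

```python
from collections import defaultdict


def congested_segments(reports, speed_threshold_kmh=15.0, min_reports=3):
    """Return {segment: mean speed} for segments that look jammed.

    reports is an iterable of (segment_name, speed_kmh) pairs, one per
    user update. A segment counts as congested only if enough users
    reported from it and their average speed is below the threshold.
    """
    speeds = defaultdict(list)
    for segment, speed in reports:
        speeds[segment].append(speed)

    jams = {}
    for segment, values in speeds.items():
        if len(values) < min_reports:
            continue  # too few reports to draw a conclusion
        mean_speed = sum(values) / len(values)
        if mean_speed < speed_threshold_kmh:
            jams[segment] = mean_speed
    return jams


def format_update(segment, mean_speed):
    """Build the text of a traffic update for one congested segment."""
    return (f"Traffic alert: {segment} is slow, "
            f"average speed {mean_speed:.0f} km/h")
```

Requiring a minimum number of reports keeps a single parked or slow-moving user from triggering an alert for an otherwise clear road.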
PhishBook is a cross-network user-profiling study in which the social networking site Facebook was mined to collect publicly available information about subjects, identify their “group of friends”, and gain insights that increase the yield of a phishing attack. This involved identifying the susceptibility of a subject towards people of the same network, the same gender, and with mutual friends on Facebook, by creating fake as well as duplicate identities and thereby duping the subjects into granting access to their hidden information. This information was then used to structure the phishing attacks and to profile the users based on their vulnerabilities. Overall, it was found that people tend to fall for social phishing attacks from within their “group of friends”, and that the closer two nodes are in the social graph, the higher the probability of falling for such attacks. Gender was found to have a significant impact on the male segment only. This project was selected for IIITD Research Showcase 2011. [Report]
Assessment Management System for Medical Council of India
Visual Cryptography – Study and Implementation
Visual cryptography is a kind of cryptography used to encrypt printed text, handwritten notes, and pictures such that decryption can be done only by the human visual system. This property makes decryption unattainable even with a brute-force attack, since a human must constantly check whether the decoded image is valid. Visual cryptography is derived from the basic theory of secret sharing and extends the same sharing scheme to images in such a way that no single share reveals information about the original image. It finds applications in sharing multimedia secrets over a network and in thresholding access to a bank account: for example, out of 6 participants, only n of them (2 <= n <= 6) combining their shares of the key can access the account. [Report]
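The simplest instance of this idea is the two-share scheme in which each secret pixel expands into two subpixels. The sketch below illustrates that textbook construction and is not the code from the report. The function names are hypothetical, images are plain nested lists with 1 for black and 0 for white or transparent, and `decode` is a programmatic stand-in for what the eye does when the two transparencies are overlaid.

```python
import random

# Each share pixel is one of these two subpixel patterns (1 = black).
PATTERNS = ((0, 1), (1, 0))


def make_shares(secret, rng=None):
    """Split a binary image into two shares with 2x horizontal expansion."""
    rng = rng or random.Random()
    share1, share2 = [], []
    for row in secret:
        row1, row2 = [], []
        for pixel in row:
            pattern = rng.choice(PATTERNS)
            complement = tuple(1 - s for s in pattern)
            row1.extend(pattern)
            # Black pixel: complementary patterns, so stacking is fully black.
            # White pixel: identical patterns, so stacking stays half black.
            row2.extend(complement if pixel else pattern)
        share1.append(row1)
        share2.append(row2)
    return share1, share2


def stack(share1, share2):
    """Overlay two transparencies: a subpixel is black if either one is."""
    return [[a | b for a, b in zip(r1, r2)]
            for r1, r2 in zip(share1, share2)]


def decode(stacked):
    """Read a pixel as black only when both of its subpixels are black."""
    return [[1 if row[i] + row[i + 1] == 2 else 0
             for i in range(0, len(row), 2)]
            for row in stacked]
```

Every pixel in either share is a random half-black pattern regardless of the secret, which is why a single share reveals nothing about the original image.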