Vineet Gupta
Email: vineet@ural.wustl.edu
6255 Delmar Blvd, Apt 3E · St. Louis, MO 63130, USA · 314-369-5520
Objective
Seeking a summer internship in which to apply and further hone my technical skills and to gain experience in a professional working environment.
Education
Master of Science, Computer Science (2006 – Expected 2008) Washington University in St. Louis, St. Louis, MO
Bachelor of Engineering, Computer Science & Engineering with 1st division and distinction (2002 – 2006) Madhav Institute of Technology & Science, Gwalior, India
Research Interests
Computational Biology, Natural Language Processing, Machine Learning.
Systems Experience and Proficiency
Programming Languages
C/C++, Java, Visual Basic 6
Python, PHP
SQL
Applications
GATE (General Architecture for Text Engineering)
MATLAB
Operating Systems
Linux, Windows
Current Activity
Working in the Stormo Lab, Department of Genetics, Washington University School of Medicine, under Dr. Gary Stormo. Currently implementing an algorithm for finding transcription factor binding sites in DNA sequences.
Projects
Text summarization using coreference resolution – The technique is based on a single summarization strategy: outputting the noun phrases that represent the most important entities in a text. The most important entities are those corresponding to the longest noun-phrase coreference chains. To summarize a text, all coreference chains are computed and ranked by length, with the longest chain receiving the highest rank. The code was written in Java, and the GATE API was used for NLP-related tasks.
Spam filter based on a Bayesian approach – A simple spam filter based on a Bayesian statistical model. It learns from both spam and legitimate mail, resulting in a robust, adaptive, and efficient filter that returns very few false positives. The code was written in Python.
Motif finding – A motif is a set of DNA bases that represents a functionally conserved region. This project involved finding motifs in a DNA segment based on microarray experiment data. Decision trees and support vector machines were used to build the model, and cross-validation was performed in the later phases to check its accuracy. The code was written in C++.
College alumni website – I worked on the alumni website of my college, which was part of the official institute website. The code was written in PHP, with MySQL as the database server. The website drew a strong response: almost 500 alumni registered during the first three months. Website at http://www.mitsgwl.ac.in/alumni
Papers
A search strategy based on keyword generation using coreference resolution – This paper presents a search strategy that automatically generates keywords from a supplied document and searches for documents with similar content. The strategy takes a document as input rather than a simple string of keywords. It then uses coreference resolution to generate the keywords that best describe the document’s content and searches for them. This lets the user concentrate on the information being sought rather than on choosing keywords that describe it. Paper available at http://www.cse.wustl.edu/~vg3/papers/gupta06.pdf
References
Available upon request.