Application or Semester Project Ideas

From Cmsc734_11

Project Ideas (please add ones you would like teammates for)

Develop an error-detecting program for multivariate (tabular) data.

Develop cause-effect analyses or anomaly detection for temporal data (numeric or categorical event data).

Develop text analysis software that compares two or more corpora/documents, based on keywords, keyphrases, or topics. A simple version of this is to compare two lists of keywords, keyphrases, or rows of data, and show the similarities and dissimilarities.
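
As a starting point for the simple version above, here is a minimal sketch in Python that compares two keyword lists with set operations. The function name `compare_keyword_lists`, the case-insensitive matching, and the Jaccard overlap score are illustrative assumptions, not part of the project description.

```python
def compare_keyword_lists(list_a, list_b):
    """Return shared and unique keywords, ignoring case and surrounding whitespace."""
    # Normalize each keyword so "Graph " and "graph" count as the same term
    # (an assumption; a real project might prefer stemming or exact matching).
    set_a = {k.strip().lower() for k in list_a if k.strip()}
    set_b = {k.strip().lower() for k in list_b if k.strip()}
    union = set_a | set_b
    return {
        "shared": sorted(set_a & set_b),   # similarities
        "only_a": sorted(set_a - set_b),   # dissimilarities: in A but not B
        "only_b": sorted(set_b - set_a),   # dissimilarities: in B but not A
        # Jaccard index: size of the intersection divided by size of the union.
        "jaccard": len(set_a & set_b) / len(union) if union else 0.0,
    }


if __name__ == "__main__":
    result = compare_keyword_lists(
        ["Network", "graph", "visualization"],
        ["graph", "timeline", "Visualization "],
    )
    for key, value in result.items():
        print(key, value)
```

The three lists map directly onto a side-by-side display (shared terms in the middle, unique terms on either side), which is where the visualization work for the project would begin.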

Study the fundraising of UMd vs. other universities and find ways to improve donations from alumni.

Study crime patterns from the UMd logs to help police reduce crime.

Consider adding features to NodeXL, which is open source and written in C# .NET.

 Potential project sponsor: Marc Smith 
 - Compare two networks
 - Triad Census (undirected and directed graphs)
 - Integrate NLP features
 - Write an importer for a new data source
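
To give a feel for the triad census item, here is a minimal sketch for the undirected case: every set of three nodes is classified by how many of its three possible edges are present (0, 1, 2, or 3). It is written in Python for brevity rather than C#, it does not use any NodeXL API, and the function name `undirected_triad_census` is an illustrative assumption. The directed census distinguishes 16 triad types and is not covered here.

```python
from itertools import combinations


def undirected_triad_census(nodes, edges):
    """Count node triples by how many of their three possible edges exist (0-3)."""
    # Store edges as frozensets so (a, b) and (b, a) are the same edge;
    # self-loops are ignored.
    edge_set = {frozenset(e) for e in edges if e[0] != e[1]}
    census = {0: 0, 1: 0, 2: 0, 3: 0}
    # Brute force over all triples: O(n^3), fine for small graphs only.
    for a, b, c in combinations(sorted(set(nodes)), 3):
        present = sum(
            frozenset(pair) in edge_set for pair in ((a, b), (a, c), (b, c))
        )
        census[present] += 1
    return census


if __name__ == "__main__":
    nodes = ["A", "B", "C", "D"]
    edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
    print(undirected_triad_census(nodes, edges))
```

For the small example graph, the triple A-B-C is a closed triangle, A-B-D has one edge, and the two triples containing both C and D have two edges each. A feature for large networks would need a faster algorithm than this brute-force loop.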

Develop the idea of searching networks for subgraphs (a former student project that could be improved).

Study Twitter hashtags over time to see the evolution of interests, especially convergence on hashtag terms.

Study YouTube evolution for topics: number of videos, number of views.

Guest Speakers with Project Ideas:

Michael L. Pack, Director, University of Maryland Center for Advanced Transportation Technology, J. Kim Engineering Bldg., Suite 3144, College Park, MD 20742. Work: 301-405-0722. Fax: 301-403-4591. He has worked with course teams in the past and hired at least a half dozen students for his projects.

Sigfried Gold, who collaborates with us on medical event temporal patterns, will be auditing the course and is eager to work with a project team. He writes: I've been working on some new designs for managing and navigating medical observation metadata (attribute and classification information describing drugs, diagnostic events, procedures, and lab tests). RxNAV:, having decent tools for navigating medical terminologies and metadata is essential for exploring big medical databases.