MSc thesis proposal in computer science
Thesis Proposal
Dept. of Computer Science, University of Manitoba (Thursday, April 01, )
1. Why are thesis proposals necessary?
The purpose of having thesis proposals is threefold. First, it is to ensure that you are
In the MSc program, students do not have advisory committees as they do in the PhD program. In essence, the GSC acts as.
Clustering of Software Artifacts
Large-scale software development leads to huge amounts of information. Software artifacts of different abstraction levels are created, and the result is spread over document management systems, repositories, databases, web sites, wikis, servers, computer disks, CDs, USB sticks and so on. Most of the stored information is probably managed informally: code snippets that might be good to save, guides, sketches, calculations, tables, screenshots, easter eggs and mails.
When hundreds or thousands of developers work on such information, and the information also comes in different versions, chaos is unavoidable. Being able to navigate this mess is critical for developing software efficiently.
Normalized Compression Distance (NCD) is a function developed to calculate the distance between pieces of information. It has been used to successfully cluster things as diverse as genomes, Russian authors, composers of classical music and spoken languages.
It is based on Kolmogorov complexity from information theory.
This complexity measure cannot be computed, but it can be approximated using compression methods.
Task
Develop a tool that recursively traverses a file system and visualizes the result. Test the tool on a directory on your own computer and evaluate the result. Try clustering the articles in the Reuters Corpus to evaluate the tool on natural language text.
Try the tool on a bigger project from the open source domain.
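As a starting point for the task, the distance computation could be sketched as follows. This is a minimal sketch and not part of the proposal: it assumes zlib as the compressor that stands in for Kolmogorov complexity, and it uses the usual definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length. The function names and the 1 MiB read limit are hypothetical choices made for illustration; clustering and visualization are left out.

```python
import os
import zlib
from itertools import combinations


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, a computable stand-in for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def read_files(root: str, max_bytes: int = 1 << 20) -> dict:
    """Recursively collect the first max_bytes of every readable, non-empty file under root."""
    contents = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    data = handle.read(max_bytes)
            except OSError:
                continue  # unreadable file: skip it
            if data:
                contents[path] = data
    return contents


def pairwise_ncd(contents: dict) -> dict:
    """Map each unordered pair of paths to the NCD of their contents."""
    return {
        (a, b): ncd(contents[a], contents[b])
        for a, b in combinations(sorted(contents), 2)
    }


if __name__ == "__main__":
    import sys

    root = sys.argv[1] if len(sys.argv) > 1 else "."
    distances = pairwise_ncd(read_files(root))
    # Print the ten closest pairs of files.
    for (a, b), d in sorted(distances.items(), key=lambda item: item[1])[:10]:
        print(f"{d:.3f}  {a}  {b}")
```

The resulting distance table could then be passed to any hierarchical clustering routine. Note that zlib only looks back over a 32 KiB window, so for large files a compressor with a longer window (for example bz2 or lzma from the Python standard library) may give more meaningful distances.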
Further information is available in Swedish.
An important part of the project is the actual implementation of the algorithms.
An implementation is expected to produce some interactive visual output, which might be used for learning purposes. Some comments on the complexity of the different methods are most welcome. A brief discussion of the methods can be found at:
Mikhail Barash
Comparison of RNA secondary structure prediction approaches based on different families of formal grammars
RNA secondary structure prediction is one of the key questions in computational biology.
The problem appears to be computationally hard, and several approaches to tackling it have therefore been studied. One such approach uses methods of formal language theory, namely various kinds of grammar models. The sequence is treated as a string, which is parsed according to a certain grammar, and the properties of the sequence are inferred from the resulting parse tree.
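To make the idea of parsing a sequence concrete, here is a minimal sketch that is not taken from the proposal. It uses the classic Nussinov base-pair maximization recurrence, which can be read as a search for the best derivation under a small context-free grammar whose rules either leave a base unpaired or emit a matching pair around a nested substructure. The function names, the inclusion of G-U wobble pairs, and the minimum hairpin loop length of 3 are assumptions made for illustration; the stochastic grammar models named in the proposal would score derivations with probabilities instead of pair counts.

```python
# Allowed base pairs: Watson-Crick pairs plus G-U wobble pairs (an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def nussinov_table(seq: str, min_loop: int = 3) -> list:
    """best[i][j] holds the maximum number of nested pairs in seq[i..j]."""
    n = len(seq)
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j stays unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with base k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best


def fold(seq: str, min_loop: int = 3) -> str:
    """Return one optimal structure in dot-bracket notation."""
    best = nussinov_table(seq, min_loop)
    structure = ["."] * len(seq)
    stack = [(0, len(seq) - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue  # too short to contain a pair
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))  # base j unpaired
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS:
                left = best[i][k - 1] if k > i else 0
                if left + 1 + best[k + 1][j - 1] == best[i][j]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return "".join(structure)


if __name__ == "__main__":
    print(fold("GGGAAAUCC"))
```

The traceback plays the role of the parse tree: each matched pair of brackets corresponds to one application of the pairing rule. This recurrence only produces nested structures, so pseudoknots are out of its reach.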
The student will have to give an introduction to the formulation of the problem and explain its relevance from the point of view of theoretical computer science. Then the models, such as context-free grammars, tree-adjoining grammars, multi-component grammars, cover grammars, dependency grammars and some others, along with their stochastic versions, are to be introduced. After this, the student will have to discuss the different issues that arise in RNA secondary structure prediction, such as the different kinds of loops and hairpins.
The programming part of the project involves considering different grammar models and implementing them.
An analysis and comparison of the obtained results should conclude the work. The introductory material can be found in: Mitchison, Biological sequence analysis: Probabilistic models of proteins and nucleic acids, Cambridge University Press.
In short, the problem is to predict the three-dimensional folding of a given protein. Although our understanding of this folding is still limited, many models exist for this problem, combining various hypotheses and insights into the biochemistry of protein sequences with elegant algorithms on strings.