Theses
If you have questions about a specific topic, please contact the respective supervisor. Further topics can usually be found by contacting our group members directly. We are open to new ideas: send your own topic proposals to one of our members or to a mailing list (for projects, theses, and internships).
Topics for theses
The following open topics are currently offered for Bachelor, Master and Diploma theses.
Database Topics
- A Spark-based framework for discovering closed patterns
Supervisor: | Sadeq Darrab |
Abstract: | Frequent pattern mining is a subfield of data mining that identifies patterns that frequently co-occur in a dataset. There are several methods for enumerating the complete set of frequent patterns. However, these methods generate a very large number of patterns (including redundant ones), which makes downstream analysis expensive. In addition, they are designed for small datasets and cannot handle big data. In this thesis, we investigate a framework for mining condensed representations of interesting patterns from big data using Apache Spark. |
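To illustrate what a condensed representation means here, the following minimal sketch computes closed frequent itemsets on a toy dataset. An itemset is closed if no proper superset has the same support. The brute-force enumeration, the function names, and the example transactions are assumptions made for illustration only; this is not the Spark-based framework the thesis is meant to develop, and it would not scale to big data.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of all itemsets with support >= min_support."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = support(candidate, transactions)
            if count >= min_support:
                result[candidate] = count
    return result


def closed_itemsets(frequent):
    """Keep only itemsets that have no proper superset with equal support."""
    closed = {}
    for itemset, count in frequent.items():
        has_equal_superset = any(
            itemset < other and count == other_count
            for other, other_count in frequent.items()
        )
        if not has_equal_superset:
            closed[itemset] = count
    return closed


# Hypothetical toy dataset: four transactions over the items a, b, c.
transactions = [frozenset("abc"), frozenset("abc"), frozenset("ab"), frozenset("c")]
frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
```

On this toy input, the seven frequent itemsets reduce to three closed ones ({c}, {a, b}, and {a, b, c}), and the support of every other frequent itemset can be recovered from them.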
- NVM-optimized Bepsilon-tree
Supervisor: | Sajad Karim |
Abstract: | Non-volatile memory (NVM) is a new class in the traditional storage hierarchy. The technologies in this class share the characteristics of primary and secondary storage: they provide access latency close to that of DRAM, are addressable at cache-line granularity, offer much higher capacity than DRAM, and are non-volatile. NVM is often referred to as a disruptive memory technology because it invalidates the traditional programming paradigm. In the traditional model, data structures are generally categorized as either memory-resident or storage-resident, whereas NVM-bound data structures cover both aspects and the associated intricacies. There has been considerable research on leveraging the characteristics of NVM. Of particular relevance to this topic, several index structure designs typical of key-value storage engines have been presented (e.g. wB+-Tree, NV-Tree, FP-Tree, LB+-Tree) [4, 5, 6, 7]. However, this literature does not consider the heterogeneity of the modern storage landscape. For example, all of these works present NVM-DRAM optimized B-trees and do not consider block devices such as SSDs and HDDs. Furthermore, to the best of our knowledge, no research has been done on optimizing the Bepsilon-tree [1, 2, 3] for NVM, even though its scan performance is comparable to that of other B-tree variants while its inserts and deletes are an order of magnitude faster. |
Goals and results: | The goal is to implement an NVM-optimized Bepsilon-tree. This includes reviewing the recent literature and proposing data structures for the internal and leaf nodes of the Bepsilon-tree that leverage the characteristics of NVM. Our server is equipped with an NVM module from Intel® (Intel® Optane™ DC Persistent Memory Modules), so the proposed layouts should take the characteristics of this module into account [8]. For example, the read and write latencies of the module are asymmetric: reads are faster than writes. Lastly, the proposed design has to be evaluated against typical DRAM-based and disk-based Bepsilon-trees.
[1] Rudolf Bayer and Edward McCreight. 1970. Organization and Maintenance of Large Ordered Indices. In Proceedings of the ACM SIGFIDET (now SIGMOD) Workshop on Data Description, Access and Control (Houston, Texas). Association for Computing Machinery, New York, NY, USA, 107–141.
[2] Gerth et al. 2003. Lower bounds for external memory dictionaries. In SODA, Vol. 3. 546–554.
[3] Michael A. Bender, Martin Farach-Colton, William Jannen, Rob Johnson, Bradley C. Kuszmaul, Donald E. Porter, Jun Yuan, and Yang Zhan. 2015. An Introduction to Bε-trees and Write-Optimization. ;login: 40, 5.
[4] Shimin Chen and Qin Jin. 2015. Persistent B+-trees in non-volatile main memory. Proc. VLDB Endow. 8, 7 (February 2015), 786–797.
[5] FPTree: A Hybrid SCM-DRAM Persistent and Concurrent B-Tree for Storage Class Memory.
[6] Jihang Liu, Shimin Chen, and Lujun Wang. 2020. LB+Trees: optimizing persistent index performance on 3DXPoint memory. Proc. VLDB Endow. 13, 7 (March 2020), 1078–1090.
[7] Y. Zhou, T. Sheng, and J. Wan. 2021. HBTree: an Efficient Index Structure Based on Hybrid DRAM-NVM. In 2021 IEEE 10th Non-Volatile Memory Systems and Applications Symposium (NVMSA), Beijing, China, 1–6. doi: 10.1109/NVMSA53655.2021.9628870.
[8] Lessons learned from the early performance evaluation of Intel Optane DC persistent memory in DBMS. doi: 10.1145/3399666.3399898 |
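For readers unfamiliar with the structure, the following minimal sketch shows the buffering idea that makes Bepsilon-tree inserts cheap: an internal node collects pending inserts in a buffer and pushes them to its children in batches. All class names, the `buffer_capacity` parameter, and the `flushes` counter are hypothetical. The sketch is heavily simplified: it has a single internal node, flushes the whole buffer at once rather than only to the fullest child, and omits deletes, node splits, and anything NVM-specific, so it should not be read as the design the thesis is expected to produce.

```python
import bisect


class Leaf:
    """Leaf node holding the actual key-value pairs."""

    def __init__(self):
        self.data = {}


class Internal:
    """Internal node with pivots, children, and a buffer of pending inserts."""

    def __init__(self, pivots, children, buffer_capacity):
        self.pivots = pivots            # sorted; child i holds keys < pivots[i]
        self.children = children        # len(pivots) + 1 leaves
        self.buffer = {}                # key -> pending value
        self.buffer_capacity = buffer_capacity
        self.flushes = 0                # number of batched flushes so far

    def child_for(self, key):
        """Route a key to the child responsible for it."""
        return self.children[bisect.bisect_right(self.pivots, key)]

    def insert(self, key, value):
        """Buffer the insert; flush only when the buffer overflows."""
        self.buffer[key] = value
        if len(self.buffer) > self.buffer_capacity:
            self.flush()

    def flush(self):
        """Apply all pending inserts to the children in one batch."""
        for key, value in self.buffer.items():
            self.child_for(key).data[key] = value
        self.buffer.clear()
        self.flushes += 1

    def get(self, key):
        """Check the buffer first, since it holds the newest values."""
        if key in self.buffer:
            return self.buffer[key]
        return self.child_for(key).data.get(key)


# Hypothetical usage: one pivot, two leaves, buffer of three messages.
left, right = Leaf(), Leaf()
root = Internal(pivots=[10], children=[left, right], buffer_capacity=3)
for key in (1, 12, 5):
    root.insert(key, f"v{key}")
pending_lookup = root.get(12)   # answered from the buffer, leaves untouched
root.insert(20, "v20")          # fourth message overflows the buffer
```

In this sketch, four inserts cause only one batched write to the leaves. Given the asymmetric read and write latencies mentioned above, how and where such buffers are laid out in NVM is one of the design questions the topic leaves open.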
Software Engineering Topics
- Semi-automatic approaches to support systematic literature reviews (Master)
Supervisor: | Yusra Shakeel |
Abstract: | A Systematic Literature Review (SLR) is a research methodology that aims to gather and evaluate all available evidence on a specific research topic. The process consists of three phases: Planning, Conducting and Reporting. Although SLRs have gained immense popularity among evidence-based researchers, conducting the entire process manually can be very time-consuming. Hence, software engineering researchers are currently proposing semi-automatic approaches to support the different phases of an SLR. In this thesis, you analyze the current state of research related to the reporting phase of the SLR process. Based on the findings, you develop an approach to support researchers with the steps involved in reporting the results of an SLR. |
Goals and results: |
|
- Automate quality assessment of studies to support literature analysis (Bachelor/Master)
Supervisor: | Yusra Shakeel |
Abstract: | The number of empirical studies reported in software engineering has increased significantly over the past years. However, there are some problems associated with them; for example, the approach used to conduct the study is not clear or the conclusions are incomplete. This makes it difficult for evidence-based researchers to conduct an effective and valid literature analysis. To overcome this problem, a methodology for assessing the quality and validity of empirical studies is important. Manually assessing the quality of empirical studies is quite challenging; hence, we propose a semi-automatic approach. In this thesis, you improve an existing prototype for assessing the quality of articles. The aim is to identify the most promising studies relevant to answering a research question. |
Goals and results: |
|
- Advanced Topics in Feature-Model Analysis (Bachelor/Master)
Supervisor: | Elias Kuiter |
Abstract: | This is an overview of thesis topics and software projects concerned with feature-model analysis. To work on one of these topics or projects, you must have participated in our lecture on software product lines. |
Slides: |
Scientific team projects
For scientific team projects, we offer a lecture:
In the first lecture, several topics are presented for students to work on during the semester.
Software Projects
Currently, we offer the following topics for software projects:
- Advanced Topics in Feature-Model Analysis
Supervisor: | Elias Kuiter |
Abstract: | This is an overview of thesis topics and software projects concerned with feature-model analysis. To work on one of these topics or projects, you must have participated in our lecture on software product lines. |
Slides: |
Templates
For thesis and presentation templates, have a look at the German version of this page.