If you have questions about the topics offered here, please contact the respective supervisors. Further topics can usually be requested directly from the members of the working group. Depending on the type of work, your own topic proposals can be sent to the following mailing lists:

  • Project: dbse-project{ät}
  • Thesis: dbse-thesis{ät}
  • Internship: dbse-internship{ät}

Thesis Topics

We currently offer the following topics for Bachelor's, Master's, and Diplom theses.

Database Topics:
  • Elf: An efficient index structure for multi-column selection predicates
Supervisor:   David Broneske
Abstract: As analytical queries become more and more complex, the number of selection predicates evaluated per query rises as well, and these predicates involve several different columns. Our idea is to address this new requirement with Elf, a multi-dimensional index and storage structure. Elf is able to exploit the relation between data of several columns to accelerate multi-column selection predicate evaluation. Elf features cache sensitivity, an optimized storage layout, fixed search paths, and slight data compression. However, there are still many points that have to be researched. These include, but are not limited to, efficient insert/update/delete algorithms and a mechanism for merging two Elfs. Furthermore, we are interested in how far SIMD can be used to accelerate multi-column predicate evaluation in Elf.
Goals and results: 
  • Implementation of efficient build, update, or search algorithms for Elf
  • Critical evaluation against state-of-the-art competitors
  • Join-Order Optimization
Supervisor:   Andreas Meister
Abstract: Within database management systems, users provide queries via SQL. The efficiency of these declarative queries is highly dependent on the order of join operators. Within join-order optimization, databases try to determine efficient join orders. Based on almost 40 years of research, plenty of different, mainly sequential approaches are available, such as genetic algorithms, top-down enumeration, or dynamic programming. Within this context, several thesis topics are available, such as the comparison or parallelization of existing approaches.
Goals and results: 
  • Implementation or parallelization of existing approaches
  • Evaluation of implementation
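
As a rough illustration of the dynamic-programming family named in the abstract, the following sketch enumerates left-deep join orders bottom-up over subsets of relations. It is not taken from the group's framework: the function name, the input format, and the cost model (sum of intermediate result sizes, with cross products allowed at selectivity 1.0) are assumptions made for this example only.

```python
from itertools import combinations


def best_left_deep_order(cards, sel):
    """Return (cost, order) of the cheapest left-deep join order.

    cards: {relation name: cardinality}                (hypothetical input format)
    sel:   {frozenset({a, b}): join selectivity}       (missing pairs count as 1.0)
    Cost model (an assumption): sum of all intermediate result sizes.
    """
    rels = sorted(cards)

    def result_size(subset):
        # Estimated size of joining all relations in `subset`.
        size = 1.0
        for r in subset:
            size *= cards[r]
        for a, b in combinations(sorted(subset), 2):
            size *= sel.get(frozenset((a, b)), 1.0)
        return size

    # best[subset] = (cost, join order); single relations cost nothing.
    best = {frozenset([r]): (0.0, (r,)) for r in rels}
    for k in range(2, len(rels) + 1):
        for combo in combinations(rels, k):
            subset = frozenset(combo)
            size = result_size(subset)
            candidates = []
            for r in combo:
                # Join relation r last onto the best plan for the rest.
                sub_cost, sub_order = best[subset - {r}]
                candidates.append((sub_cost + size, sub_order + (r,)))
            best[subset] = min(candidates)
    return best[frozenset(rels)]
```

For example, calling `best_left_deep_order({"A": 1000, "B": 10, "C": 100}, {frozenset(("A", "B")): 0.01, frozenset(("B", "C")): 0.5})` joins A and B first, because their intermediate result is the smallest. The loop over subsets of equal size has no dependencies between its iterations, which is the kind of structure a parallel variant could exploit.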
  • Parallel Sorting
Supervisor:   Andreas Meister
Abstract: Sorting algorithms are basic building blocks for plenty of different complex optimization problems. In the past, sequential algorithms provided enough performance for practical usage. Nowadays, these sequential algorithms cannot benefit from the parallel hardware architecture of current systems. Hence, parallel sorting algorithms were proposed to adapt sorting to current hardware. Based on a literature review, suitable parallel sorting algorithms should be identified. Furthermore, selected algorithms should be implemented and evaluated against existing algorithms (e.g., Boost Compute).
Goals and results: 
  • Overview of existing parallel sorting algorithms
  • Implementation and evaluation of selected sorting algorithms
  • Parallel Multi Reduction
Supervisor:   Andreas Meister
Abstract: Aggregations (e.g., sum, max, min, count) are common operations within database systems. Although the parallel aggregation of one group is well studied, how to efficiently aggregate multiple groups still seems to be an open question. For example, one group at a time could be evaluated in parallel, or all groups could be evaluated at the same time. In this thesis, an overview of existing parallel aggregation/reduction algorithms should be created. Furthermore, selected parallel algorithms should be implemented and evaluated against existing algorithms (e.g., Boost Compute).
Goals and results: 
  • Overview of existing parallel aggregation/reduction algorithms
  • Implementation and evaluation of selected aggregation/reduction algorithms
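
To make the "all groups at the same time" strategy from the abstract more concrete, here is a minimal sketch: the input is split into chunks, each worker computes partial sums for every group it sees, and the partial results are merged at the end. The function names and the (key, value) input format are assumptions for illustration. Python threads are used only to show the structure; they are not expected to yield real speedups for this kind of work.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partial_sums(chunk):
    """Aggregate one chunk: sum of values per group key."""
    acc = defaultdict(int)
    for key, value in chunk:
        acc[key] += value
    return acc


def multi_reduce_sum(pairs, workers=4):
    """Group-wise sum over (key, value) pairs using chunk-local partial results."""
    # Ceiling division so that at most `workers` chunks are created.
    chunk_size = max(1, -(-len(pairs) // workers))
    chunks = [pairs[i:i + chunk_size] for i in range(0, len(pairs), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sums, chunks))
    # Merge step: combine the per-chunk results of all groups.
    total = defaultdict(int)
    for part in partials:
        for key, value in part.items():
            total[key] += value
    return dict(total)
```

The alternative strategy, reducing one group at a time in parallel, would instead first partition the values by key and then run a parallel reduction per group.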
  • Statistical analysis of run-time experiments (Bachelor)
Supervisor:   Andreas Meister
Abstract: Experiments can be affected by several internal and external factors. Additional analyses, e.g., significance or error analysis, help to generalize measurements and to improve the experiments and the quality of the results. In this thesis, the state of the art of analysis for run-time experiments should be reviewed. Suitable analyses should be realized with existing tool suites (e.g., R).
Goals and results: 
  • Overview of state of the art of analysis of run-time experiments
  • Prototypical implementations of analysis with existing frameworks
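
One simple example of such an error analysis is a confidence interval for the mean run time of repeated measurements. The sketch below uses Python's standard library rather than R, and it relies on a normal approximation, which is an assumption that only holds reasonably well for larger sample sizes (a t-distribution would be the usual choice for few runs). The function name is made up for this example.

```python
import math
import statistics


def mean_confidence_interval(samples, confidence=0.95):
    """Return (mean, lower, upper) for the mean of `samples`.

    Uses a normal approximation (an assumption of this sketch).
    """
    if len(samples) < 2:
        raise ValueError("need at least two measurements")
    mean = statistics.mean(samples)
    # Standard error of the mean, based on the sample standard deviation.
    sem = statistics.stdev(samples) / math.sqrt(len(samples))
    # Two-sided quantile of the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, mean - z * sem, mean + z * sem
```

If the intervals of two algorithm variants overlap strongly, a measured difference in run time should be treated with caution.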
  • An automation tool to specify and generate benchmark datasets
Supervisor:   Marcus Pinnecke
Abstract: Data management spans a wide range of tasks, needs, and constraints for database management systems and tools. Since system development is about making the right design choice in trade-off situations, today's database system market offers a multitude of tools, each with its own niche. To find the best-matching solution for cross-niche tasks (such as mixed OLTP and OLAP workload processing), benchmarking is an essential method for industry and research. As there is no database system that "fits all", there is also no benchmark specification that "fits all". Tailoring a benchmark is cumbersome for a third party if this was not intended in the benchmark specification, and it contradicts the idea of standardization and transparency. Clearly, this limits the evolution of benchmarks following current trends in data management, and might open the door for non-standard custom-built benchmarks that are hard to share with the community. A promising solution to this challenge is a flexible benchmark automation tool for database systems that provides a machine-readable specification language and automates the generation of data and meta information in industry-standard output formats. The purpose of this thesis is to lay the foundations of such a benchmark automation tool, CONTRABASS. The vision of CONTRABASS is as follows: using CONTRABASS, an arbitrary benchmark specification document can be formulated such that CONTRABASS knows how to generate the data and workflow. An important feature of CONTRABASS is mixing and tweaking specifications at the specification language level for tailoring benchmarks without the need to rewrite the data generation tool or to bind to the system under evaluation. Further, evaluations based on CONTRABASS benchmarks will be transparent, repeatable, and easily shareable since they only depend on the statements formulated in the specification language.
Details, goals and results:  In previous work, we surveyed important state-of-the-art database system benchmarks w.r.t. transactional and analytic processing (e.g., the TPC suite), a combination of both processing types (e.g., CH-Benchmark), and several custom-made "micro"-benchmarks. This thesis is intended as a "proof-of-concept" towards an automation tool to specify and generate benchmark datasets.
This includes:
  • Generalization of specification parameters of a set of important database benchmarks (e.g., TPC-C, CH-Benchmark, Custom-Work) based on preliminary work
  • Conceptual work regarding an extensible framework (using several design patterns) that covers the most important generic specification parameters
  • Conceptual work regarding a data generation module for this framework that is capable of generating benchmark workloads depending on the specification parameters
  • Prototypic implementation and evaluation of this framework as "proof-of-concept"
Although a parser is needed in the long term to accept a formal language that covers the specification, this parser is not a required part of this thesis. Instead, we expect an architectural prototype that could be extended with further functionality (such as the parser) in future work. The system to be developed has to provide a set of access points at the system-internal level that are used to specify a certain benchmark. In addition, the data generator accepts these specifications and generates data according to them. The data is passed to a freely definable output implementation (e.g., a CSV printer). A student profile that matches this thesis well shows a strong interest in building flexible systems using industry standards (e.g., several design patterns), in simplifying otherwise cumbersome processes, and in contributing this solution to both the research and the open-source community. We are willing to shift the weight of several tasks depending on whether this is a bachelor's or a master's thesis, which means the focus could be more on the implementation part or more on the conceptual part, depending on the type of thesis and the student's skills and interests. However, interested students must be strongly self-motivated and able to work autonomously on several parts of the project. It is up to the student to choose a platform, a (mix of) programming language(s), and possible libraries to implement the prototype, provided she/he can justify the choice.
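
The interplay described above (internal access points for the specification, a data generator, and a pluggable output such as a CSV printer) could look roughly like the following sketch. Everything in it is hypothetical: the dictionary-based specification format, the column types "serial" and "int", and the function names are invented for illustration and are not part of CONTRABASS.

```python
import csv
import io
import random


def generate_rows(spec, seed=0):
    """Yield rows according to a (hypothetical) in-memory benchmark specification."""
    rng = random.Random(seed)  # fixed seed keeps generated data repeatable
    for row_id in range(spec["rows"]):
        row = []
        for column in spec["columns"]:
            if column["type"] == "serial":
                row.append(row_id)
            elif column["type"] == "int":
                row.append(rng.randint(column["min"], column["max"]))
            else:
                raise ValueError(f"unknown column type: {column['type']}")
        yield row


def write_csv(spec, rows, out):
    """One possible output implementation: print header and rows as CSV."""
    writer = csv.writer(out)
    writer.writerow([column["name"] for column in spec["columns"]])
    writer.writerows(rows)


# Usage: the specification is built through plain data structures,
# standing in for the system-internal access points mentioned above.
spec = {
    "rows": 3,
    "columns": [
        {"name": "id", "type": "serial"},
        {"name": "quantity", "type": "int", "min": 1, "max": 5},
    ],
}
buffer = io.StringIO()
write_csv(spec, generate_rows(spec), buffer)
```

Because the generator only yields rows, the CSV printer could be swapped for any other sink without touching the generation logic.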
  • Robust Parallel Prefix sum
Supervisor:   Andreas Meister
Abstract: Prefix sums are basic building blocks for plenty of different complex optimization problems. In the past, sequential algorithms provided enough performance for practical usage. Nowadays, these sequential algorithms cannot benefit from the parallel hardware architecture of current systems. Hence, parallel algorithms to calculate the prefix sum were proposed to adapt existing algorithms to current hardware. However, existing algorithms always impose certain constraints, e.g., that the number of entries must be a power of two. Based on a literature review, suitable parallel execution strategies and constraints for the prefix sum calculation should be identified. Furthermore, selected variants should be adapted to work without the given constraints. Since these adaptations will influence performance, a comparison between the different variants should be conducted.
Goals and results: 
  • Overview of existing prefix sum calculation strategies
  • Implementation of robust prefix sum calculation strategy
  • Evaluation of implemented variants
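
One common way to lift the power-of-two constraint is to pad the input with zeros up to the next power of two. The sketch below simulates a work-efficient up-sweep/down-sweep exclusive scan sequentially and applies this padding. It is meant as an illustration of the idea only: the function name is made up, and whether padding or another adaptation performs best is exactly what the thesis would have to evaluate.

```python
def exclusive_scan(values):
    """Exclusive prefix sum via up-sweep/down-sweep, for any input length."""
    n = len(values)
    if n == 0:
        return []
    # Pad with zeros to the next power of two (the adaptation shown here).
    size = 1
    while size < n:
        size *= 2
    data = list(values) + [0] * (size - n)

    # Up-sweep: build partial sums in a tree-like fashion.
    # The inner loop iterations are independent and could run in parallel.
    step = 1
    while step < size:
        for i in range(2 * step - 1, size, 2 * step):
            data[i] += data[i - step]
        step *= 2

    # Down-sweep: distribute the partial sums back down the tree.
    data[size - 1] = 0
    step = size // 2
    while step >= 1:
        for i in range(2 * step - 1, size, 2 * step):
            left = data[i - step]
            data[i - step] = data[i]
            data[i] += left
        step //= 2

    # Drop the padded entries again.
    return data[:n]
```

For the five-element input `[3, 1, 7, 0, 4]`, the function pads to eight entries internally and returns `[0, 3, 4, 11, 11]`.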
  • Managing control logic of GPU kernels
Supervisor:   Andreas Meister
Abstract: When complex algorithms are executed, control logic is involved to ensure correctness. Due to their architecture, this poses a challenge for GPUs. Hence, the question is whether control logic should be included within the GPU kernels or whether (most of) the control logic should be managed by the host system (CPU). In this thesis, the influence of control logic on the performance of GPU kernels should be evaluated for a suitable algorithm.
Goals and results: 
  • Identification of suitable algorithm for evaluation
  • Implementation of different algorithm variants considering the control flow
  • Hints to the CPU's Branch Prediction Unit and Hardware Prefetcher for Vectorized Processing (Bachelor/Master)
Supervisor:   Marcus Pinnecke
Abstract: GCC/Clang provide two built-in functionalities to give hints to the data cache and the branch prediction unit. We use these hints to express the intention to minimize data cache misses and to maximize the correct prediction of conditional branches (if...else constructs) inside our push-based vectorized execution engine. The task is to verify whether and which applications of these data cache and branch prediction hints actually bring benefits, which lower performance, and where further application might be reasonable. For this task, a cleaned code branch with everything ready to run and to measure performance is available and can be used out of the box. For measuring, we provide the data of two columns of the TPC-H lineitem table. For quick development, we recommend the TPC-H dataset at scale factor 1 (SF1). For measuring, scale factor 10 (SF10) is required so that the data actually exceeds the data cache capacity of the CPU. Both datasets are available.
Goals and results: 
  • Evaluation of the impact of these hints
  • Statements on the code "as-is" + recommendations for improvements (supported by evaluation) for "modified code"
  • Basic C++ skills are required (for self-studying purposes of existing code).
  • Efficient memory layout of complex data types for parallel optimization (Bachelor)
Supervisor:   Andreas Meister
Abstract: Complex data types contain multiple simple data types, such as integers, floats, etc. For most tasks, not only a single item is considered, but a collection of multiple items. Hence, from a logical view, a two-dimensional memory space is created. Unfortunately, memory is only one-dimensional. Therefore, the logical view of the memory must be mapped to the physical view. For this mapping, two approaches exist. First, the complete struct (all items of one struct) can be stored in neighboring memory (array of structs). Second, the same item of all structs can be stored in neighboring memory (struct of arrays). In this thesis, both approaches should be evaluated for an optimization task in query optimization.
Goals and results: 
  • Adaption of existing optimization approach
  • Evaluation of the two approaches for the mapping to memory
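
The two mappings from the logical two-dimensional view (struct index, field index) to a one-dimensional memory offset can be written down directly. The following sketch uses a flat Python list as a stand-in for linear memory; the function names are illustrative and all fields are assumed to occupy one slot each.

```python
def aos_offset(item, field, num_fields):
    """Array of structs: all fields of one struct are neighbors."""
    return item * num_fields + field


def soa_offset(item, field, num_items):
    """Struct of arrays: the same field of all structs is stored contiguously."""
    return field * num_items + item


def to_layout(records, layout):
    """Map a list of equally sized tuples to a flat list in the given layout."""
    if layout not in ("aos", "soa"):
        raise ValueError("layout must be 'aos' or 'soa'")
    num_items = len(records)
    num_fields = len(records[0]) if records else 0
    flat = [None] * (num_items * num_fields)
    for i, record in enumerate(records):
        for f, value in enumerate(record):
            if layout == "aos":
                flat[aos_offset(i, f, num_fields)] = value
            else:
                flat[soa_offset(i, f, num_items)] = value
    return flat
```

For the records `[(1, 1.5), (2, 2.5), (3, 3.5)]`, the array-of-structs layout yields `[1, 1.5, 2, 2.5, 3, 3.5]`, whereas the struct-of-arrays layout yields `[1, 2, 3, 1.5, 2.5, 3.5]`. A scan over a single field touches contiguous memory only in the second layout.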
  • Evaluation of Dependency-aware parallel enumeration for join-order optimization
Supervisor:   Andreas Meister
Abstract: The dynamic programming (DP) approach for join-order optimization is one state of the art approach for optimizing queries in relational databases. The DPE approach was proposed to extend sequential DP approaches for a parallel evaluation. In this thesis, the existing implementation should be extended based on a selected set of further variants. The implemented variants should be evaluated against existing variants.
Goals and results: 
  • Implementation of selected DPE variants in C++
  • Evaluation of the implemented variants
  • DSL-based query compilation in Spark SQL
Supervisor:   Bala Gurumurthy
Abstract: Spark SQL translates analytical operations into relational database operations. It uses its own code generator to create the runtime code for executing the user-given query. This thesis explores a hierarchy-based code compilation strategy in which an intermediate Domain Specific Language (DSL) is used to model the different DBMS operations and to compile the given query into executable bytecode.
Goals and results: 
  • DSL with Selection and Aggregation operations
  • Code compiler using DSL
  • Code compilation rules
Notes or requirements:
  • Understanding of Spark and SparkSQL
  • Scala Programming
  • Analytical report on primitives-based execution of TPC-H queries
Supervisor:   Bala Gurumurthy
Abstract: TPC-H provides compute-intensive queries as a benchmark for comparing different data processing mechanisms. In this topic, we explore the different aspects of primitives that can be tuned for efficient processing of all these queries and provide an extensive analytical report on the impact of the tuning opportunities on these queries.
Goals and results: 
  • General primitive based execution of TPC-H queries
  • Variants of the primitives
  • Analytical report on the primitive-based execution
Notes or requirements:
  • Programming in C++ & OpenCL and Python/R (for charting)
Software Engineering Topics:
  • Feature detection from textual requirements
Supervisor:   Yang Li
Abstract: Feature-oriented software development (FOSD) is a paradigm for the construction, customization, and synthesis of large-scale software systems. The key idea of FOSD is to emphasize the similarities of a family of software systems for a given application domain (e.g., database systems, banking software, text processing systems) with the goal of reusing software artifacts among the family members. Features distinguish different members of the family. The feature model, an important model for capturing domain requirements, is now widely accepted in mainstream domain engineering. However, constructing a feature model from the requirements or textual descriptions of products is often tedious and ineffective. To tackle this problem, Natural Language Processing (NLP) techniques can be used to detect features in domain analysis. In spite of the diversity of existing techniques, the challenge is to achieve high accuracy: automatically finding a high number of relevant elements (high recall) while maintaining a low number of false-positive results (high precision). Since there is no optimal technique and each approach is applicable in a particular context, it is an open issue how to increase the accuracy of specific approaches for feature detection from textual requirements.
Goals and results: 
  • Implementing an improved algorithm to detect features based on NLP Techniques
  • Evaluation against state of the art algorithms in this area
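
The two accuracy measures named in the abstract can be computed as follows when a manually created reference set of features is available. This is a small sketch with invented feature names; the function name and the convention of returning 0.0 for empty sets are assumptions of this example.

```python
def precision_recall(detected, relevant):
    """Return (precision, recall) of detected features against a reference set."""
    detected, relevant = set(detected), set(relevant)
    true_positives = len(detected & relevant)
    # Precision: share of detected features that are actually relevant.
    precision = true_positives / len(detected) if detected else 0.0
    # Recall: share of relevant features that were detected.
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall
```

For instance, if an approach detects the features login, search, export, and theme, while the reference set contains login, search, and payment, two of four detected features are correct (precision 0.5) and two of three relevant features were found (recall 2/3).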
  • Variability-Encoding for Abstract State Machines
Supervisor:   Fabian Benduhn
Abstract: In Feature-Oriented Software Development (FOSD), individual products of a product line can be automatically generated for a given feature selection, either by composing feature modules or by extracting the relevant parts of an annotated code base. Due to the potentially massive number of products, it is not feasible to analyse each product separately. Variability encoding is a technique in which compile-time variability is transformed into runtime variability, i.e., a meta-product is created that simulates the variable behaviour of the complete product line. This meta-product can be analysed efficiently to draw conclusions about all products. In our research, we have developed techniques for feature-oriented development of formal specifications based on Abstract State Machines. The goal of this thesis is to develop a concept for variability encoding of Abstract State Machines, implement a prototype, and evaluate it by performing a case study.
Goals and results: 
  • Develop and implement a concept for variability encoding for Abstract State Machines
  • Implement a prototype, and evaluate it by performing a case study
  • Lightweight, Variability-Aware Change Impact Analysis
Supervisor:   Sandro Schulze
Abstract: Change Impact Analysis (CIA) has been proposed as a powerful means to identify the impact of source code changes, i.e., which parts of a software system may be influenced by changes. To this end, data- and control-flow dependencies are employed. For variable software systems, such a technique has to take variability (in terms of features) into account to answer questions such as "Which feature(s) are impacted by a change to feature X?" So far, no solution exists for common variability mechanisms such as the C preprocessor. In this MSc thesis, the task is to implement a lightweight CIA based on the tool srcml, which provides an abstract program representation by means of XML annotations. Based on this representation, the necessary information should be extracted and used for computing a set of impacted statements, given a particular change. The technique should be evaluated using mid- and large-scale open-source systems.
Goals and results: 
  • Concept for CIA, including envisioned workflow and tools to be used
  • Implementation of variability-aware CIA for the C preprocessor
  • A critical evaluation of the implemented technique
  • Analyzing the Birth, Life, and Death of Bug Reports
Supervisor:   Dr.-Ing. Sandro Schulze
Context: Nowadays, open-source systems (OSS) play a pivotal role in software development, as they are used even in commercial software. To cope with the increasing demand for software quality, not only version control systems (e.g., Git) and continuous integration (CI) are commonly used. Bug tracking systems are also maintained to eliminate as many failures as possible, which are reported by a large number of different stakeholders (developers, users, testers). Over time, these bug databases may become confusing or contain too many bug reports, leaving many of them open.
Task: In this thesis, the student has to investigate the reasons why bug reports remain open (or are closed, respectively). In particular, the student conducts an empirical analysis of a large number of bug reports from Mozilla and develops a technique that allows reasoning about the birth, life, and death of bug reports (i.e., why they remain open or get closed). To this end, it might be necessary to dig into machine learning or NLP techniques, but in general, the student has considerable freedom in how to solve the task.
Goals and results: 
  • Concept for analyzing bug databases, including techniques for reasoning about (and prediction of) bug reports
  • Implementation of this concept (for mining the bug reports, 3rd party libraries may be used)
  • A critical evaluation of the implemented technique with existing bug databases
Notes or requirements:
  • Good programming skills
  • Quick grasp of subject matter, strong work ethic, ability to work on your own initiative (with guidance from the supervisor)
  • Background in machine learning or data mining is a plus, but not required (can be acquired during the MSc thesis)
  • You should be eager, creative, and open-minded in searching for smart solutions (that may not be easy to find)
  • Semi-automatic approaches to support systematic literature reviews (Master)
Supervisor:   Yusra Shakeel
Abstract: A Systematic Literature Review (SLR) is a research methodology that aims to gather and evaluate all the available evidence regarding a specific research topic. The process is composed of three phases: Planning, Conducting, and Reporting. Although SLRs have gained immense popularity among evidence-based researchers, conducting the entire process manually can be very time-consuming. Hence, software engineering researchers are currently proposing semi-automatic approaches to support different phases of an SLR. In this thesis, you analyze the current state of research related to the reporting phase of the SLR process. Based on the findings, you develop an approach to support researchers with the steps involved in reporting the results of an SLR.
Goals and results: 
  • Determine the current state of the art of approaches for reporting an SLR
  • Propose and evaluate your concept to semi-automate the steps involved in this phase
  • Automate quality assessment of studies to support literature analysis (Bachelor/Master)
Supervisor:   Yusra Shakeel
Abstract: The number of empirical studies reported in software engineering has significantly increased over the past years. However, there are some problems associated with them, for example, the approach used to conduct the study is not clear or the conclusions are incomplete. This makes it difficult for evidence-based researchers to conduct an effective and valid literature analysis. To overcome this problem, a methodology to assess the quality and validity of empirical studies is important. Since manually assessing the quality of empirical studies is quite challenging, we propose a semi-automatic approach. In this thesis, you improve the existing prototype for assessing the quality of articles. The aim is to provide the most promising studies relevant to answering a research question.
Goals and results: 
  • Extend existing prototype to assess quality of empirical studies
  • Evaluate the proposed approach
  • Temporal Topic Modeling on Microblog Streaming Data (Master)
Supervisor:   Sabine Wehnert
Abstract: Access to streaming data from microblogs offers the opportunity to analyze the constantly changing public discourse. It is a current research problem to determine methods for capturing new and unexpected topics and for modeling the development of a topic over time. This is interesting not only for news reporters, but also for any user who needs to know the most important concepts or keywords from continuous updates in multiple data sources. Moreover, in a streaming setting for finding news stories, it is cumbersome to assess the quality of the system, because it is unknown which topics will be discussed in the media afterwards. This poses the challenge of finding a suitable method for determining the quality of the results. The aim of this thesis is to develop a semi-supervised or unsupervised approach for temporal topic modeling given an existing streaming architecture, and furthermore to evaluate the model.
Goals and results: 
  • Collect state-of-the-art approaches for temporal topic modeling
  • Implement a prototype integrated into an existing streaming pipeline and evaluate it


Student Assistant Jobs and Open Positions

There are currently no open positions.

Scientific Team Projects

We offer a dedicated course for scientific team projects: 

At the beginning of this course, various topics are presented that can be worked on during the semester.


We also offer a dedicated course for software projects: 

At the beginning of this course, various topics are presented that can be worked on within the course.

In addition, the following topics are available for a software project.

  • Automatic generation of visualizations (Bachelor)
Contact person:  Andreas Meister
Description: Visualizing measurement results is important for analyzing them. In this project, the functionality of an existing evaluation framework is to be extended so that suitable visualizations are derived from the measurement data. The task is to determine suitable visualizations and to implement a prototype that generates them automatically.
Goals and results: 
  • Determination of suitable visualizations for measurement results
  • Implementation of an automatic generation of the visualizations
  • Determining the resource consumption of UNIX processes (Bachelor)
Contact person:  Andreas Meister
Description: Comparisons of different algorithm variants are very often restricted to run time. In real systems, however, processes do not run alone but in parallel with other processes. Hence, it is important not only that processes are fast, but also that they use the available resources (e.g., main memory) efficiently. In this project, a concept for measuring resource consumption is to be developed.
Goals and results: 
  • Concept for measuring resource consumption
  • Evaluation of the implemented concept
  • Quality of implementations (Bachelor)
Contact person:  Andreas Meister
Description: Implementing complex algorithms is error-prone. Various concepts can be applied to ensure the correctness and quality of an implementation, e.g., unit tests, continuous integration, etc. In this project, suitable methods for assuring software quality are to be identified, and selected methods are to be integrated into an existing evaluation framework.
Goals and results: 
  • Identification of methods for improving implementation quality
  • Integration of selected methods into an existing framework
  • Database Processing Engine (Bachelor)
Contact person:  Andreas Meister
Description: In this software project, an existing framework for join-order optimization is to be extended with a processing engine. This requires translating an existing query graph into a concrete execution plan and executing it. At least the basic operators have to be implemented. To ensure correctness, the data of a suitable benchmark (e.g., IMDb) as well as unit tests should be used.
Goals and results: 
  • Processing engine for the existing optimization framework
  • Improvements for SQLValidator (Bachelor)
Contact person:  David Broneske
Description: In this software project, the existing tool SQLValidator is to be extended with further functionality. The functionality to be implemented is to be agreed upon with the supervisor and can be extended or restricted as needed. Possible tasks are:
  • User statistics on completed exercises
  • User account management
  • Support for multiple student cohorts
  • Duplication of exercises
  • Correctness check of exercises when they are created
  • Submission of ER exercises
Goals and results: 
  • Implementation of further functions in SQLValidator
  • Data quality in the data center: the next level is within reach (Bachelor)
Contact person:  David Broneske, Marcus Pöhls
Description: Implementation of an application for improving the data quality of CPU data of a data center. First, the data quality is analyzed and then improved with the help of the APIs of Intel and AMD. The data on the hardware infrastructure will be provided for the project. A concrete example: for a machine without information on the number of CPU cores, the gap can be closed via the processor model (e.g., Intel Xeon Processor E5-2650 v3). The software project also includes duplicate detection, i.e., for machines that occur multiple times in the data set, possibly with differing CPU data, the "best" (most plausible) data record is used.
Goals and results: 
  • Research and implementation of algorithms for data quality analysis
  • API integration with Intel/AMD
  • Finding and cleaning up duplicates
  • Implementation of statistical analyses (Bachelor)
Contact person:  Andreas Meister
Description: In this software project, an existing framework for join-order optimization is to be extended with a statistical analysis of the results. The goal of the project is to create an analysis of covariance based on a linear model. In addition, a power analysis for estimating the required number of experiments should be implemented. The implementation is to be done in C++ and R. Afterwards, the quality of the implementation should be assessed and analyzed.
Goals and results: 
  • Implementation of an analysis of covariance based on a linear model
  • Implementation of a power analysis
  • Assessment of the implemented analyses
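
To give an idea of what a power analysis for the required number of experiments computes, the following sketch estimates the number of runs per variant needed to detect a given standardized effect size when comparing the means of two variants. It uses the textbook normal-approximation formula n = 2 * ((z_(1-alpha/2) + z_power) / d)^2 and is written in Python for brevity, whereas the project itself targets C++ and R. The function name and default values are assumptions of this example.

```python
import math
from statistics import NormalDist


def required_runs_per_variant(effect_size, alpha=0.05, power=0.8):
    """Approximate runs per variant for a two-sided comparison of two means.

    effect_size: expected difference of means divided by the standard deviation.
    Normal approximation; an exact t-test based analysis gives slightly more runs.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance quantile
    z_power = NormalDist().inv_cdf(power)          # power quantile
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)
```

Smaller expected differences between two variants require considerably more runs, since the effect size enters the formula quadratically.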





Last Modification: 10.04.2018 - Contact Person:
