|Lecturer:||Fabian Benduhn, Gunter Saake|
|Credits:||6 Credit Points|
|Module:||Course for the Module "Schlüssel- und Methodenkompetenzen" of Master programs; DKE (Applications); DE (Fachliche Spezialisierung)|
|Language:||The full course will be held in English. Papers and presentations have to be in English as well.|
Schedule and Deadlines updated on 24.04.17
Remember to start working on the papers early - deadlines are strict! The first submitted draft of the paper must be complete (including all chapters) so that you can receive useful feedback.
In this master-level course, we expect everyone to be familiar with the basic rules of scientific referencing and citation, and we will not accept plagiarism of any kind in your work. In particular, copying sentences from third-party work, fully or in part, or "rephrasing" sentences in order to hide plagiarism, will be treated as plagiarism.
Student Conference, 02.08.2017, Room 301
9:00-9:05 Opening Session
9:05-9:30 Muhammad Umar Ashraf, Stream and Real-Time Processing of Big Data
9:30-9:55 Aqeel Syed Shamsi, Multi-Query Execution in the Many-Core Age
9:55-10:20 Syed Ejaz Haider, Influence of different types of fuels on the length of premixed flame using Aspen Plus
10:25-10:50 Sameh Manaa, A Survey on Multimodel Databases
10:50-11:15 Saheb Ghosh, Mutation testing in Software Product Line
11:15-11:40 Aditya Sai Ram Nemali, Comparing Performance with Apache Spark and Hadoop
11:40-11:50 Closing Session
The course is intended for graduate students who intend to pursue an academic career, primarily Master students and PhD students (in the first year of their PhD). We especially recommend this course for Master students who are considering continuing in a PhD position or who want to practice academic writing in English for their thesis.
All participants should be interested in academic research and in practicing academic writing.
Although PhD students will not receive a grade or certificate (Schein), we encourage them to participate in this course to practice academic writing and to prepare a paper for their own research project (to be sent to another conference or workshop or to be published as a technical report).
The participating students will simulate a scientific conference to acquire skills required for ...
- ... writing academic papers
- ... presenting scientific results
- ... participating in a conference
- ... reviewing academic papers of others
- ... organizing a conference
- ... using web-based paper submission and review systems
In summary, you will have to write a paper (with two chances to improve it after the initial version), write reviews (two review rounds with 3 reviews each), and present your work (a short initial presentation, a practice presentation, and a final presentation).
The course consists of a lecture that introduces topics such as academic writing, research ethics, organizing a conference, and presentation. The main focus will be on writing your own academic paper, reviewing papers by others, and presenting your results in front of the group.
- Every participant writes an academic paper, typically giving an overview of the current state and future challenges of a selected research area from software engineering or database systems. We are open to a wide range of topics, which can align with other research projects (e.g., part of PhD research or preparation for a Diploma or Master's thesis).
- Every participant presents their topic (and relevant literature) in a short presentation to receive initial feedback.
- After about six weeks, every participant submits a first version of his/her paper, which is subsequently reviewed by at least 3 other participants.
- Another two or three days later, an improved version of the paper based on the reviewers' comments is submitted and subsequently reviewed again.
- The paper can be improved again. A final version is submitted at the end of the semester.
- All papers are presented at a conference. This conference will be a full-day event near the end of the semester (July or August; the date will be fixed in the first lectures). Before the conference, every participant will practice their presentation and get feedback from the others for possible improvements.
Although there are relatively few lectures, we expect participants to focus on reading, writing, reviewing, and presenting, and we recommend that you reserve at least one day per week (6 CP = 180h = about 12 hours per week).
The course will be graded based on the draft versions of the paper, the final version, the reviews, and the presentation.
(The specific dates may change.)
Lectures and Presentations (updated on 24.04.17; due to several requests, we uploaded some of last year's slides - however, the actual content of the lectures may differ!)
- 2017-04-06 Lecture: Introduction and topic selection
- 2017-04-13 No Lecture
- 2017-04-20 No Lecture
- 2017-04-27 No Lecture
- 2017-05-04 Lecture: Academic writing I (structure, getting started)
- 2017-05-11 Short Presentations of topic and relevant literature (5 min, strict)
- 2017-05-18 Lecture: Academic writing II (style basics, references)
- 2017-05-25 No Lecture (Christi Himmelfahrt)
- 2017-06-01 No Lecture
- 2017-06-08 Lecture: Publication process (conferences, journals, how to select a venue, how to write a review, ...)
- 2017-06-15 Lecture: Academic writing III (clarity, cohesion, patterns, typical problems)
- 2017-06-22 Lecture: Research ethics (safeguarding good scientific practice, scientific misconduct, plagiarism, ...)
- 2017-06-29 No Lecture
- 2017-07-06 No Lecture
- 2017-07-27 Lecture: Scientific presentations + practice presentations
- 2017-08-02 Final presentations in a full-day event, room 301, 9-13
Deadlines (updated on 22.06.17)
All deadlines are strict; we cannot accept any late submission or review.
- 2017-04-12: Send your title/topic
- 2017-04-19: Send your slides for the topic presentation (if you use PPT, also send me a PDF version)
- 2017-05-07: Submission of abstract due, via email
- 2017-05-24: Submission of first draft due, online submission via EasyChair (if you have no account, please register first on the linked page)
- 2017-06-10: First review due
- 2017-06-24: Submission of second draft due, online submission via EasyChair
- 2017-07-02: Second review due
- 2017-07-09: Send your slides for the practice presentation
- 2017-07-26: Submission of final version due, online submission via EasyChair
Format Instructions: ACM SIG Templates (choose option 1 or 2, as you like)
The paper must be between 4 and 8 pages long. PhD students who plan to submit their paper to a real workshop or conference may also use other templates. In this case, the paper should still have a similar length when formatted with the ACM template.
Topics of interest
You can write about your own research topic or results from a bachelor/master thesis. However, if you do not bring your own topic, you can choose one of the following topics in software engineering or databases (mainly for Master students):
- Database Operation Tuning (David Broneske)
A current trend in database systems is to tune algorithms at a very fine granularity. Current code optimizations are the subject of controversial discussion, but it remains unclear when they are actually applicable. Consequently, discuss the applicability of a subset of available code optimizations to selected database algorithms.
- Bogdan Raducanu, Peter Boncz, Marcin Zukowski: Micro Adaptivity in Vectorwise
- Jingren Zhou, Kenneth A. Ross: Implementing Database Operations Using SIMD Instructions
- John L. Hennessy, David A. Patterson: Computer Architecture -- A Quantitative Approach
- Database Operations on modern Processing Devices (David Broneske)
Tuning database operations to the underlying hardware is a hot topic with the increasing usage of co-processors. There are numerous publications involving different algorithms and processing devices. Create a survey of database operations on different processing devices.
- Naga K. Govindaraju, Brandon Lloyd, Wei Wang, Ming Lin, Dinesh Manocha: Fast Computation of Database Operations Using Graphics Processors
- Rene Müller, Jens Teubner, Gustavo Alonso: Data Processing on FPGAs
- Thomas Willhalm, Yazan Boshmaf, Hasso Plattner, Nicolae Popovici, Alexander Zeier, Jan Schaffner: SIMD-Scan: Ultra Fast in-Memory Table Scan using on-Chip Vector Processing Units
- On the Verification of Distributed Data Systems (Gabriel Campero Durand)
In order to scale horizontally, modern data systems have eschewed strong ACID guarantees in favor of eventual consistency and the BASE model. However, while non-relational distributed data products can claim to ensure Consistency and Availability, or Availability and Partition Tolerance, these claims are not always accurate. Fault-injection testing is being used by the industry to verify these claims (among them, the Jepsen test suite). To an extent, such tests can be expected to become a part of the routine operational evaluations performed by DBA teams. We propose a survey on these cutting-edge testing technologies, and a close look at some cases that they evaluate.
- Caitie McCaffrey: The Verification of a Distributed System
- Peter Bailis and Kyle Kingsbury: The Network is Reliable
- Dataversity (whitepaper): ACID vs. BASE: The Shifting pH of Database Transaction Processing
- Multi-model Databases (Gabriel Campero Durand)
A multi-model database (MMDB) can be roughly defined as one which simultaneously supports different data models. These systems can be further divided between extensions to single-databases and multi-database systems (including integrated, federated, polystore and multistore cases). For end-users MMDBs are valuable to simplify operations, reduce data duplication and the cost of integrating data. We propose a short comparative survey of research devoted to MMDBs, a brief categorization of existing proposals and a listing of challenges in the field.
- Jiaheng Lu and Irena Holubova: Multi-model Data Management: What’s New and What’s Next? (see http://udbms.cs.helsinki.fi/?tutorials)
- Jennie Duggan and others: The BigDAWG Polystore System
- Francesca Bugiotti and others: Flexible Hybrid Stores: Constraint-Based Rewriting to the Rescue
- Variability Analysis of Textual Requirements (Yang Li)
Textual requirements are the initial artifact and provide a variety of information, including variation points. Extracting variability information from textual requirements also makes it easy to obtain traceability links to all other artifacts, e.g., source code or test cases. Discuss and compare different approaches to analyze and extract variability information from textual requirements in Software Product Lines. How automated is each approach (e.g., automatic, semi-automatic)?
- Kun Chen, Wei Zhang, Haiyan Zhao, Hong Mei: An Approach to Constructing Feature Models Based on Requirements Clustering
- Weston, Nathan and Chitchyan, Ruzanna and Rashid, Awais: A Framework for Constructing Semantically Composable Feature Models from Natural Language Requirements
- Davril, Jean-Marc and Delfosse, Edouard and Hariri, Negar and Acher, Mathieu and Cleland-Huang, Jane and Heymans, Patrick: Feature Model Extraction from Large Collections of Informal Product Descriptions
- Big Data Processing Frameworks (Xiao Chen)
Along with increasing data volume, many big data processing frameworks have emerged in recent years. Although they share the common goal of processing big data efficiently and scalably, each of them emphasizes certain application scenarios or types of data. Discuss and compare different big data processing frameworks. How do they differ, and what are their strengths and weaknesses?
- Jeffrey Dean and Sanjay Ghemawat: MapReduce: simplified data processing on large clusters
- Matei Zaharia, Mosharaf Chowdhury, Michael J. Franklin, Scott Shenker and Ion Stoica: Spark: Cluster Computing with Working Sets
- Paris Carbone, Stephan Ewen, Seif Haridi and others: Apache Flink™: Stream and Batch Processing in a Single Engine
- Dynamic Programming on GPUs (Andreas Meister)
Dynamic programming provides optimal results for optimization problems using an exhaustive search. GPUs can accelerate dynamic programming by exploiting their highly parallel architecture. Discuss and compare different application scenarios of dynamic programming on GPUs. What are possible requirements and limitations of these application scenarios?
- Kazufumi Nishida, Yasuaki Ito, Koji Nakano: Accelerating the dynamic programming for the matrix chain product on the GPU
- Kazufumi Nishida, Koji Nakano, Yasuaki Ito: Accelerating the dynamic programming for the optimal polygon triangulation on the GPU
- Vincent Boyer, Didier El Baz, Moussa Elkihel: Dense dynamic programming on multi GPU
- Multi Query Execution in the Many-Core Age (Marcus Pinnecke)
One of the outstanding properties of database systems is multi-user support: multiple users can submit queries to the database system at the same time. Multi-query execution deals with the challenges of this feature. Discuss and compare different approaches to manage multi-query execution in database systems. What are the challenges, and how does each approach address them?
- Iraklis Psaroudakis, Manos Athanassoulis, Anastasia Ailamaki: Sharing Data and Work Across Concurrent Analytical Queries. PVLDB 6(9): 637-648 (2013)
- Iraklis Psaroudakis, Tobias Scheuer, Norman May, Anastasia Ailamaki: Task Scheduling for Highly Concurrent Analytical and Transactional Main-Memory Workloads. ADMS@VLDB 2013: 36-45
- Anastasia Ailamaki, Erietta Liarou, Pinar Tözün, Danica Porobic, Iraklis Psaroudakis: How to stop under-utilization and love multicores. ICDE 2015: 1530-1533
- Florian Wolf, Iraklis Psaroudakis, Norman May, Anastasia Ailamaki, Kai-Uwe Sattler: Extending database task schedulers for multi-threaded application code. SSDBM 2015: 25:1-25:12
- Sampling in Software Product Lines (Mustafa Al-Hajjaji)
Since it is not possible to generate all valid products of a product line, several approaches have been proposed to restrict the number of products by generating only a subset of them. These algorithms aim to generate a minimum number of products while achieving a certain degree of coverage. Discuss and compare different sampling algorithms that have been proposed to sample product lines.
- Christopher Henard, Mike Papadakis, Gilles Perrouin, Jacques Klein, Patrick Heymans, and Yves Le Traon. Bypassing the Combinatorial Explosion: Using Similarity to Generate and Prioritize T-Wise Test Configurations for Software Product Lines.
- Mustafa Al-Hajjaji, Sebastian Krieter, Thomas Thüm, Malte Lochau, and Gunter Saake. IncLing: Efficient Product-Line Testing Using Incremental Pairwise Sampling.
- Martin Fagereng Johansen, Øystein Haugen, and Franck Fleurey. An Algorithm for Generating T-Wise Covering Arrays from Large Feature Models.
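To make the notion of "coverage" in this topic more concrete, the following minimal sketch shows a naive greedy pairwise sampler. It is not one of the algorithms from the papers listed above; the feature names, the constraint, and all function names are hypothetical, and enumerating all valid configurations only works for tiny feature models.

```python
from itertools import combinations, product

FEATURES = ["A", "B", "C"]  # hypothetical optional features


def is_valid(config):
    # Hypothetical constraint: feature B requires feature A.
    return not config["B"] or config["A"]


def pairs(config):
    """All pairwise feature-value interactions covered by one configuration."""
    return {((f, config[f]), (g, config[g])) for f, g in combinations(FEATURES, 2)}


def greedy_pairwise_sample(configs):
    """Pick configurations until every achievable pair is covered."""
    uncovered = set().union(*(pairs(c) for c in configs))
    sample = []
    while uncovered:
        # Greedy step: take the configuration covering most uncovered pairs.
        best = max(configs, key=lambda c: len(pairs(c) & uncovered))
        sample.append(best)
        uncovered -= pairs(best)
    return sample


# Enumerate all valid configurations (feasible only for very small models).
all_valid = [
    config
    for config in (dict(zip(FEATURES, values))
                   for values in product([False, True], repeat=len(FEATURES)))
    if is_valid(config)
]
sample = greedy_pairwise_sample(all_valid)
print(len(sample), "of", len(all_valid), "valid products selected")
```

The sample covers every pairwise interaction that occurs in some valid product, usually with fewer products than the full set. Practical algorithms avoid the full enumeration used here.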
- Mutation Testing in Software Product Lines (Mustafa Al-Hajjaji)
Mutation is a well-known technique in the context of software code and formal modeling. Program mutation consists of introducing small modifications into program code such that these simple syntactic changes, called mutations, represent typical mistakes that programmers often make. Discuss different approaches of using mutation in SPL testing. How have these approaches been used to generate products and to assess SPL testing techniques? What are the challenges of applying mutation testing in product lines?
- Paolo Arcaini, Angelo Gargantini, Paolo Vavassori: Generating Tests for Detecting Faults in Feature Models
- Christopher Henard, Mike Papadakis, Gilles Perrouin, Jacques Klein, and Yves Le Traon: Assessing Software Product Line Testing via Model-based Mutation: An Application to Similarity Testing
- Mustafa Al-Hajjaji, Fabian Benduhn, Thomas Thüm, Thomas Leich, and Gunter Saake. Mutation Operators for Preprocessor-Based Variability.
- H. Lackner and M. Schmidt. Towards the Assessment of Software Product Line Tests: A Mutation System for Variable Systems.
- D. Reuling, J. Bürdek, S. Rotärmel, M. Lochau, and U. Kelter. Fault-Based Product-Line Testing: Effective Sample Generation Based on Feature-Diagram Mutation.
- Mustafa Al-Hajjaji, Jacob Krüger, Fabian Benduhn, Thomas Leich, and Gunter Saake. Efficient Mutation Testing in Configurable Systems.
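As a small illustration of program mutation as described in this topic, the sketch below applies a single mutation (replacing `>=` by `>`) to a hypothetical function and checks whether two hypothetical test suites detect ("kill") the mutant. All names and values are made up for illustration and are not taken from the listed papers.

```python
def price_with_discount(price, quantity):
    """Original program: 10% discount for orders of 10 items or more."""
    if quantity >= 10:
        return price * quantity * 0.9
    return price * quantity


def mutant(price, quantity):
    """Mutant: '>=' replaced by '>' (a typical off-by-one mistake)."""
    if quantity > 10:
        return price * quantity * 0.9
    return price * quantity


def passes(program, tests):
    """True if the program produces the expected result for every test."""
    return all(abs(program(*args) - expected) < 1e-9 for args, expected in tests)


# Each test is ((price, quantity), expected_total).
weak_tests = [((2.0, 1), 2.0), ((2.0, 20), 36.0)]
strong_tests = weak_tests + [((2.0, 10), 18.0)]  # adds the boundary case


def kills(tests):
    """A suite kills the mutant if the original passes and the mutant fails."""
    return passes(price_with_discount, tests) and not passes(mutant, tests)


print("weak suite kills mutant:", kills(weak_tests))
print("strong suite kills mutant:", kills(strong_tests))
```

A test suite that does not exercise the boundary value cannot distinguish the mutant from the original, which is the kind of weakness mutation testing is meant to reveal.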
- A Survey of Visualization Mechanisms to Support the Extended Product Line Configuration Process (Juliana Alves Pereira)
Product line configuration practices have been employed by industries as a mass customization process. In this context, application engineers widely use extended feature models as the most accepted formalism for selecting features that are in accordance with stakeholders' requirements. Extended feature models describe functional and non-functional product requirements and their interdependencies. In industrial scenarios, application engineers often deal with large extended feature models with complex relationships. Consequently, the management of the configuration space becomes challenging. The goal of this survey is to analyze the available interactive visualization mechanisms that support application engineers in managing the challenges of configuring extended feature models. How does each approach support the product line configuration process?
- Jabier Martinez, Tewfik Ziadi, Raul Mazo, Tegawendé F. Bissyandé, Jacques Klein and Yves Le Traon. Feature Relations Graphs: A Visualisation Paradigm for Feature Constraints in Software Product Lines
- Juliana Alves Pereira, Sebastian Krieter, Jens Meinicke, Reimar Schröter, Gunter Saake, and Thomas Leich. FeatureIDE: Scalable Product Configuration of Variable Systems.
- Juliana Alves Pereira, Pawel Matuszyk, Sebastian Krieter, Myra Spiliopoulou, and Gunter Saake. A Feature-Based Personalized Recommender System for Product-Line Configuration
- Evidence of Problems Caused by the C Preprocessor (Wolfram Fenske)
The C preprocessor, CPP, has often been criticized as a source of incomprehensible code and subtle bugs. This critique has been voiced both by industry professionals and by academics. The question is, which scientific evidence (empirical or otherwise) exists to support it? Find and discuss relevant research on the topic! Some initial pointers are given below. One possible way to complete this list is to use "snowballing": First, read through the references of these papers and check the ones with promising titles. Second, check literature that cites the papers you already have. Third, repeat until nothing new turns up.
- Flávio Medeiros et al.: The Love/Hate Relationship with the C Preprocessor: An Interview Study. ECOOP 2016
- Iago Abal et al.: 42 Variability Bugs in the Linux Kernel: A Qualitative Analysis. ASE 2014
- Sandro Schulze et al.: Does the Discipline of Preprocessor Annotations Matter?: A Controlled Experiment. GPCE 2014
- Michael D. Ernst et al.: An Empirical Analysis of C Preprocessor Use. IEEE Transactions on Software Engineering 2002.
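The snowballing procedure described in this topic can be sketched as a simple worklist loop. This is only an illustration: the citation data below is a hypothetical toy graph with placeholder IDs, and in practice the two lookups (references of a paper, papers citing it) and the relevance check are done by hand or with a literature database.

```python
from collections import deque


def snowball(seeds, references, cited_by, is_relevant=lambda paper: True):
    """Collect papers reachable from the seeds by backward and forward snowballing."""
    found = set(seeds)
    queue = deque(seeds)
    while queue:  # repeat until nothing new turns up
        paper = queue.popleft()
        # Backward step: papers this one cites. Forward step: papers citing it.
        candidates = references.get(paper, []) + cited_by.get(paper, [])
        for candidate in candidates:
            if candidate not in found and is_relevant(candidate):
                found.add(candidate)
                queue.append(candidate)
    return found


# Hypothetical toy citation graph (placeholder IDs, not real papers).
references = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
cited_by = {"A": ["E"], "B": ["A"], "C": ["A"], "D": ["B"], "E": []}

print(sorted(snowball(["A"], references, cited_by)))
```

The `is_relevant` parameter stands in for the manual step of checking whether a title looks promising; papers rejected there are not followed further.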
- Change-Proneness in Software (Wolfram Fenske)
A popular proverb advises to "Never change a running team!" In software development, however, changes are a fact of life. Existing code is constantly modified, for instance, to add new functionality or modify existing behavior. What is the main motivation to investigate change-proneness in the academic literature? Which techniques have been used to study it? What are important findings? Survey relevant literature and try to answer some or all of these questions! If you come up with other interesting questions on the topic, go ahead and focus on those!
- Daniele Romano et al.: Analyzing the Impact of Antipatterns on Change-Proneness Using Fine-Grained Source Code Changes. WCRE 2012
- Marco D'Ambros et al.: On the Relationship Between Change Coupling and Software Defects. WCRE 2009
- Parastoo Mohagheghi et al.: An Empirical Study of Software Change: Origin, Acceptance Rate, and Functionality vs. Quality Attributes. International Symposium on Empirical Software Engineering 2004
- F. George Wilkie et al.: Coupling Measures and Change Ripples in C++ Application Software. Journal of Systems and Software 2000
- Variability Challenges in Cyber-Physical Systems (Fabian Benduhn)
Recently, researchers have started to investigate the idea of using software product line engineering to manage the variability of cyber-physical systems. Give an overview of research challenges that arise from variability in cyber-physical systems! Try to provide a taxonomy that considers different dimensions of variability and the role they play in the development process. What is specific about the challenges presented in literature regarding cyber-physical systems?
- Cyber-physical system product line engineering: comprehensive domain analysis and experience report, Yue et al., 2016
- Cyber-Physical Systems Product Lines: Variability Analysis and Challenges, Arrieta et al., 2016
- Structuring variability in the context of embedded systems during software engineering, Heuer and Pohl, 2014
- Modeling Runtime-Variability in Cyber-Physical Systems (Fabian Benduhn)
Modern cyber-physical systems such as wireless sensor networks or smart homes require adaptations and reconfigurations at runtime to cope with changing requirements and environments. Give an overview of how variability models can be used to handle runtime variability in such systems. What kinds of variability models are used and how do they support runtime variability?
- Autonomic Computing through Reuse of Variability Models at Runtime: The Case of Smart Homes, Cetina et al., 2009
- Smart factory product lines: a configuration perspective on smart production ecosystems
- An overview of Dynamic Software Product Line architectures and techniques: Observations from research and industry, Capilla et al., 2014
- Runtime variability for dynamic reconfiguration in wireless sensor network product lines, Ortiz et al., 2012