Problem Description

When a user participates in an online discussion, he or she interacts with the other users by formulating replies to their comments.

Until now, such replies have been based primarily on the user’s own knowledge and experience. The user can, of course, research the topic further and formulate a well-thought-out answer.

However, such an approach requires a lot of time, and only very few users would be willing to make the effort. Furthermore, the discussion progresses in the meantime, so the user’s answer is probably out of date by the time it is posted.

At this point, it would be an enormous help if the user were presented with a selection of comments that other users have already written on the subject of the comment. The user could then quickly and easily build on the fact that other people have already thought about the topic. He or she would see different perspectives and could incorporate them into a reply that is well thought out and takes several points of view into account.

This is where we want to start with our research and provide the user with new tools to revolutionize online discussions. To achieve this, we develop a complete ecosystem that provides everything needed to present the user with the relevant comments.

On the one hand, we need a backend with a knowledge base that stores comments on the various topics discussed in the comment sections. Next, a web scraper crawls the big news agencies on a regular basis to keep the knowledge base up to date. Finally, the backend needs a sophisticated model that decides which comments are presented to the user, according to the comment he or she is interested in.

On the other hand, we need a suitable plugin that represents the front end of the ecosystem. This is the part the user interacts with and therefore, we need to make sure that the results of the backend are presented in an appropriate manner to the user. Otherwise, the results of the backend are of no use.

However, in order to present the user with comments that are a good complement in terms of content, it is necessary to have a database that is always up to date and has comments on current topics.

This is where this work starts: it will crawl various German-language news sites and store the comments on current articles in a meaningful format.


  1. Development of a strategy to identify the articles and comments to be crawled
  2. Development of a format to store the comments and articles
  3. Processing and storing of articles and comments
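
As an illustration of what a storage format for articles and comments could look like, here is a minimal sketch in Python. It is not a prescribed design: the class names, the field names (url, title, source, comment_id, parent_id and so on) and the choice of one JSON object per line are all assumptions made for this example.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Comment:
    # All field names are hypothetical; a real schema depends on the sites crawled.
    comment_id: str
    author: str
    text: str
    parent_id: Optional[str] = None  # None marks a top-level comment


@dataclass
class Article:
    url: str
    title: str
    source: str
    comments: List[Comment] = field(default_factory=list)


def to_json_line(article: Article) -> str:
    """Serialize one article with its comments as a single JSON line."""
    # ensure_ascii=False keeps German umlauts readable in the stored file.
    return json.dumps(asdict(article), ensure_ascii=False)


def from_json_line(line: str) -> Article:
    """Rebuild an Article (including its comments) from a JSON line."""
    raw = json.loads(line)
    comments = [Comment(**c) for c in raw.pop("comments")]
    return Article(comments=comments, **raw)
```

Storing a parent_id with each comment keeps the reply structure of a thread, which a later model might need in order to tell direct replies from top-level comments. Whether a flat file, a document store or a relational database is the better fit is left open here.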

Interest in social media and comment sections, natural language processing, web scraping, programming, and online argumentation


Jan Steimann