Data Science Chair

    Tobias Koopmann, M.Sc.

    Chair of Data Science (Informatik X)
    University of Würzburg
    Am Hubland
    97074 Würzburg

    Email: koopmann@informatik.uni-wuerzburg.de

    Phone: (+49 931)  31 - 89363

    Office: Room 0.007 (Building M4)

    Fingerprint:  CBC7 EACE BBF3 8CF8 EB3F 3478 A5FB FAA7 8F34 AD4E

    Projects and Research Interests

    I joined the DMIR group for my PhD studies after receiving my master's degree in Computer Science at the University of Würzburg in early 2019. At first, I worked on click trail analysis and the prediction of human behavior on the web, based on Wikipedia click trails.

    Currently, I am employed on the REGIO project. My part in the project is to find and analyse locally successful research cooperations based on co-author networks.

    Current Thesis/Practica Topics

    Hidden Topic Modelling of Bibliometric Data

    In recent work, Gong et al. [1] developed a method that extracts so-called hidden topics from texts of varying length. The idea is to match documents to their summaries.

    We want to apply this approach to texts written by authors. This lets us extract the hidden topics for each author. Based on these topics, we can investigate the following questions:

    How accurate is an author's self-assessment? To answer this, we can compare the keywords used on their papers with the extracted hidden topics.

    Can we define and cluster the research domains of authors? Based on the topic representation, we can cluster authors according to their research domain. We might also be able to recommend cooperations based on similar research topics.
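
    Both clustering and recommendation need a way to compare the topic representations of two authors. The following is a minimal sketch of one possible choice, cosine similarity, and not the method of Gong et al. It assumes each author is already described by a vector of hidden-topic weights; the author names, the vectors, and the helper most_similar_authors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar_authors(author, topic_vectors, top_k=3):
    """Rank all other authors by topic similarity to `author` (hypothetical helper)."""
    target = topic_vectors[author]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in topic_vectors.items()
        if other != author
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy data: one weight per hidden topic for each author.
topic_vectors = {
    "author_a": [0.9, 0.1, 0.0],
    "author_b": [0.8, 0.2, 0.1],
    "author_c": [0.0, 0.1, 0.9],
}
ranking = most_similar_authors("author_a", topic_vectors, top_k=2)
```

    The same pairwise similarities could serve as input to a standard clustering algorithm; which algorithm suits the data best is left open here.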

    [1] Gong, H., Sakakini, T., Bhat, S. & Xiong, J. (2018). Document Similarity for Texts of Varying Lengths via Hidden Topics. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 2341–2351), July, Melbourne, Australia: Association for Computational Linguistics.

    Recommending Co-Authorship Based on BERT4Rec

    Recently, a lot of work based on the Transformer and BERT architectures has been published. One example is the work by Sun et al. [1], which adapts the BERT architecture to recommendation. They train the BERT architecture end-to-end on sequences of item IDs.

    Our idea is to combine two BERT architectures. First, we want to create a latent representation for each author. The approach would be to apply a language model to all texts by an author. This yields a vector representing all the scientific work of that author.
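
    How the texts of one author are combined into a single vector is still open. As a rough sketch, one simple option is to average the per-document embeddings (mean pooling). This is an assumption made for illustration, not a fixed design decision; the embeddings are taken as given, and the function author_vector and the toy data are hypothetical.

```python
def author_vector(document_embeddings):
    """Average per-document embedding vectors into one author vector."""
    if not document_embeddings:
        raise ValueError("need at least one document embedding")
    dim = len(document_embeddings[0])
    if any(len(vec) != dim for vec in document_embeddings):
        raise ValueError("all embeddings must have the same dimension")
    count = len(document_embeddings)
    # Component-wise mean over all documents of the author.
    return [sum(vec[i] for vec in document_embeddings) / count for i in range(dim)]


# Hypothetical embeddings of three papers by the same author.
papers = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
vector = author_vector(papers)
```

    Mean pooling treats every paper equally; weighting papers by recency or author position would be a possible variation.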

    These representations can then be used as input for our own BERT4Rec. In place of sequences of clicked items, we generate random (or weighted random) walks on the co-author graph. The actual recommendation task here would be to recommend a suitable cooperation for an author in the graph.
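
    The walk generation can be sketched as follows. This is a minimal illustration under stated assumptions: the co-author graph is a dictionary mapping each author to their co-authors, edge weights are the number of joint papers, and the walk length and number of walks per author are free parameters. The function names and the toy graph are hypothetical.

```python
import random


def random_walk(graph, start, length, rng, weighted=True):
    """Walk up to `length` nodes through `graph`, starting at `start`.

    `graph` maps each author to a dict {co_author: number_of_joint_papers}.
    """
    walk = [start]
    while len(walk) < length:
        neighbours = graph.get(walk[-1], {})
        if not neighbours:
            break  # author without co-authors: stop early
        candidates = list(neighbours)
        # With weighted=False every co-author is equally likely.
        weights = [neighbours[c] for c in candidates] if weighted else None
        walk.append(rng.choices(candidates, weights=weights, k=1)[0])
    return walk


def generate_walks(graph, walks_per_author, length, seed=0):
    """Start `walks_per_author` walks at every author in the graph."""
    rng = random.Random(seed)
    return [
        random_walk(graph, author, length, rng)
        for author in graph
        for _ in range(walks_per_author)
    ]


# Hypothetical toy co-author graph; author "d" has no co-authors.
co_authors = {
    "a": {"b": 3, "c": 1},
    "b": {"a": 3},
    "c": {"a": 1},
    "d": {},
}
walks = generate_walks(co_authors, walks_per_author=2, length=4)
```

    Each walk is a sequence of author IDs and can take the place of a sequence of item IDs in the training data.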

    [1] Sun, F., Liu, J., Wu, J., Pei, C., Lin, X., Ou, W. & Jiang, P. (2019). BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer. Proceedings of the 28th ACM International Conference on Information and Knowledge Management (pp. 1441–1450), New York, NY, USA: Association for Computing Machinery. ISBN: 9781450369763



    • Koopmann, T., Dallmann, A., Hettinger, L., Niebler, T., Hotho, A. (2019) “On the Right Track! Analysing and Predicting Navigation Success in Wikipedia”, in Proceedings of the 30th ACM Conference on Hypertext and Social Media, HT '19, ACM: Hof, Germany, 143–152, available: http://doi.acm.org/10.1145/3342220.3343650.