Koina (https://koina.wilhelmlab.org) is a platform designed to facilitate the broader adoption of machine learning (ML) in the proteomics community. The project aims to address the accessibility and interoperability issues that slow the uptake of ML models, even when those models demonstrably improve data analysis.
The core concept of Koina is a decentralized, online-accessible repository of ML models focused on predicting peptide properties relevant to proteomics research. Koina hosts a diverse collection of models that differ in modeling approach, training data, and application limits, providing a comprehensive resource for the proteomics community. This approach is intended to "democratize" access to these models, so that researchers without extensive computational resources or ML expertise can benefit from them.
Koina's design emphasizes reproducibility and transparency, key principles of scientific research. By explicitly encoding model dependencies and implementing version control, Koina ensures that predictions are reproducible and traceable, bolstering confidence in the results obtained using these models.
The development of client packages for Python and R further lowers the barrier to entry for utilizing Koina's models, enabling users to easily integrate them into their existing analysis pipelines. The ongoing integration of Koina with prominent proteomics software, such as Skyline, EncyclopeDIA, Oktoberfest, and FragPipe, highlights the project's commitment to accessibility and practical application.
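As a rough illustration of what such a client might send on a user's behalf, the sketch below assembles a JSON request body for a single model. It assumes that the server accepts KServe-v2-style inference requests at `/v2/models/<model>/infer` and that the model `Prosit_2019_intensity` expects inputs named `peptide_sequences`, `precursor_charges`, and `collision_energies`; these names, the endpoint layout, and the helper functions are assumptions made for this example and should be checked against the Koina documentation. The code only builds the request and does not contact the server.

```python
import json


def build_infer_request(peptides, charges, collision_energies, request_id="0"):
    """Build a KServe-v2-style JSON body for a batch of peptides.

    Hypothetical helper: the input names and datatypes below are assumptions
    for this sketch, not a confirmed description of any Koina model.
    """
    n = len(peptides)
    if not (n == len(charges) == len(collision_energies)):
        raise ValueError("all inputs must have the same length")
    return {
        "id": request_id,
        "inputs": [
            {"name": "peptide_sequences", "shape": [n, 1],
             "datatype": "BYTES", "data": list(peptides)},
            {"name": "precursor_charges", "shape": [n, 1],
             "datatype": "INT32", "data": [int(c) for c in charges]},
            {"name": "collision_energies", "shape": [n, 1],
             "datatype": "FP32", "data": [float(e) for e in collision_energies]},
        ],
    }


def infer_url(model_name, base_url="https://koina.wilhelmlab.org"):
    """Return the assumed inference endpoint for a given model name."""
    return f"{base_url}/v2/models/{model_name}/infer"


# Two example peptides with precursor charge and collision energy.
body = build_infer_request(["AAAAAKAK", "PEPTIDEK"], [2, 3], [25, 30])
print(infer_url("Prosit_2019_intensity"))
print(json.dumps(body, indent=2))
```

The resulting dictionary could then be sent with any HTTP client as a POST request; the Python and R client packages are meant to hide this step behind a simpler interface.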
Koina represents a significant step towards bridging the gap between ML development and its practical application in proteomics research. Its focus on accessibility, interoperability, and reproducibility lays the groundwork for broader adoption of ML, giving researchers more powerful and efficient tools for data analysis. While the initial focus is on proteomics, the core principles of Koina are applicable to other domains, which suggests potential for broader impact across the life sciences.
Publications
2024
Beltrao, Pedro; Van Den Bossche, Tim; Gabriels, Ralf; Holstein, Tanja; Kockmann, Tobias; Nameni, Alireza; Panse, Christian; Schlapbach, Ralph; Lautenbacher, Ludwig; Mattanovich, Matthias; Nesvizhskii, Alexey; Van Puyvelde, Bart; Scheid, Jonas; Schwämmle, Veit; Strauss, Maximilian; Susmelj, Anna Klimovskaia; The, Matthew; Webel, Henry; Wilhelm, Mathias; Winkelhardt, Dirk; Wolski, Witold E.; Xi, Muyao: Proceedings of the EuBIC-MS developers meeting 2023. Journal of Proteomics 305, 2024, 105246
Gabriel, Wassim; Giurcoiu, Victor; Lautenbacher, Ludwig; Wilhelm, Mathias: Predicting fragment intensities and retention time of iTRAQ‐ and TMTPro‐labeled peptides with Prosit‐TMT. PROTEOMICS, 2022, 2100257