International Colloquium Tribology 23/1
expert verlag Tübingen, 2022
23rd International Colloquium Tribology - January 2022

Tribological Experiments in the Age of Big Data

Nikolay T. Garabedian, Paul J. Schreiber, Christian Greiner
Institute for Applied Materials (IAM), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Corresponding author: christian.greiner@kit.edu

1. Introduction

Among the many reasons for implementing robust data management of scientific results, two stand out: i) the increasing demands put forward by public funding agencies, and ii) the potential for accelerated scientific discovery by integrating machine learning (ML) into tribology research. Owing to higher computing and storage capabilities, ML has evolved to rely heavily on large pools of data that serve to train predictive or analytical algorithms. Such an approach sits at the core of the "big data" idea. Establishing a database that can be treated as "big data in tribology" presents multiple challenges. First, in an experimental lab setting, the exact tribological properties of a particular system are often unique to the specific test itself; covering a broad range of testing conditions therefore requires a large number of tests. Second, the field of tribology is inherently multidisciplinary, a consequence of the diversity of scientific questions that can be asked when analyzing an entire system of bodies in motion. These two challenges can be addressed by constructing protocols for the fully digital collection and manipulation of data and metadata. The principles of producing findable, accessible, interoperable, and reusable (FAIR) [1] data were suggested as a set of strategies which, when followed, enable the exchange and efficient utilization of scientific results.
Since tribological properties are a function of entire systems, making data interoperable requires that sufficient detail is attached to any shared data. These details usually need to include not just the features confined to the time frame of the experiment (e.g., base and counter bodies, tribometer, environment), but must extend to the processes that produced the specimens of interest. This information is essential for the interpretability, traceability, and repeatability of the scientific outcomes. Satisfying the FAIR principles requires a certain degree of standardization. This does not necessarily call for coordinating global standards, but rather for constructing local standards that are mutually interoperable. Building such standards via ontologies provides a scalable route to knowledge formalization. This has already been recognized in the field, and a general tribology ontology (called tribAIn) was proposed [2]. Additionally, ontological schemas allow the systematic input of details about objects from different angles. Such an approach therefore seems like a viable strategy for capturing the myriad aspects in which a tribological system can be analyzed. However, merely having standards defined is not enough for the digital transformation of tribological experiments. It is also essential that data is automatically recorded within the constructed schema, in order to minimize the effect of human intervention, and that the new methods for data storage do not slow down existing scientific workflows. Thus, a viable solution could be the integration of virtual lab environments which ease the communication chain between equipment, data storage, and researchers. One such environment is Kadi4Mat [3], which among other benefits features electronic lab notebook (ELN) functionality. Connecting tribological equipment and ELNs, however, is only possible by involving both software developers and tribologists.
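The idea of attaching system-level details to shared measurements can be illustrated with a minimal sketch. All field names below are hypothetical and chosen for illustration; they are not a published schema from this project.

```python
from dataclasses import dataclass, field, asdict

# Illustrative sketch: bundling a measurement with the metadata needed for
# reuse outside the originating lab (system description plus specimen history).
# Field names are assumptions, not an official schema.

@dataclass
class TribologyMetadata:
    base_body: str        # e.g., material and finish of the disk
    counter_body: str     # e.g., material and geometry of the pin
    tribometer: str       # instrument identifier
    environment: dict     # temperature, humidity, lubricant, ...
    specimen_history: list = field(default_factory=list)  # processes before the test

def package_measurement(raw_data, metadata: TribologyMetadata) -> dict:
    """Combine raw data and metadata into one shareable record."""
    return {"data": raw_data, "metadata": asdict(metadata)}

record = package_measurement(
    raw_data=[0.12, 0.11, 0.10],  # e.g., friction coefficient over time
    metadata=TribologyMetadata(
        base_body="steel disk, polished",
        counter_body="sapphire pin",
        tribometer="pin-on-disk tribometer",
        environment={"T_degC": 23, "RH_percent": 45, "lubricant": "oil"},
        specimen_history=["grinding", "polishing", "ultrasonic cleaning"],
    ),
)
```

The point of the sketch is that the process history travels with the data, which is what makes the record interpretable and repeatable later.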
This paper describes the reimplementation of an experiment that was part of a past publication by the lab. By striving to digitize each step of the process, we had the chance to develop the tools that can serve as the basis for expanding the system to the rest of the experimental protocols in the lab.

Figure 1: A simplified overview of the route taken to digitize one exemplary showcase pin-on-disk tribological experiment. The results are the proposed "FAIR Data Package" and the set of tools for its assembly, described below.

2. Methods

This project commenced by assembling a team of ten doctoral and postdoctoral researchers who collaboratively listed all objects, processes, and relevant input parameters that constitute typical laboratory experiments. To support this collaboration, a local "TriboWiki" was installed using an instance of MediaWiki, chosen for its well-known and intuitive environment. Once the lists of terms were established, a "showcase" experiment was selected to narrow down the scope of the project. It is based on a previous publication [4] and represents a typical lubricated pin-on-disk experiment on a commercial tribometer. The motivation for this particular experiment came from the availability of the same materials, machines, and lab protocols. The terms describing the lifecycle of this experiment were organized within an ontology (called the TriboDataFAIR Ontology), which was developed using the Protégé software; the upper SUMO and EXPO ontologies were used, with tribAIn to a limited extent. The three major technical solutions developed during this project aimed at easing tribologists' interaction with Kadi4Mat, ensuring that data was recorded in line with the FAIR principles while not requiring extended training or time.
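A class hierarchy like the one curated in Protégé can be turned into a flat list of terms for downstream tooling. The following sketch is purely illustrative (it is not the actual implementation, and the class names are invented): it walks a nested class structure and emits one description line per class.

```python
# Hypothetical excerpt of an ontological class hierarchy, represented as
# nested dicts (class name -> subclasses). Names are illustrative only.
ontology = {
    "TribologicalExperiment": {
        "SpecimenPreparation": {"Polishing": {}, "Cleaning": {}},
        "TribometerTest": {"PinOnDisk": {}},
    }
}

def flatten(classes: dict, path=()) -> list:
    """Depth-first walk producing 'Parent / Child / ...' description lines."""
    lines = []
    for name, children in classes.items():
        full = path + (name,)
        lines.append(" / ".join(full))
        lines.extend(flatten(children, full))
    return lines

template = flatten(ontology)
# Each entry could become one pre-assembled field in an ELN template.
```

Flattening like this is one simple way to hand ontology-derived terms to a notebook system without requiring users to navigate the ontology itself.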
The first solution (called SurfTheOWL) is a program that converts the ontological class structure into a hierarchical list of descriptions for each procedure and instrument used in the showcase experiment; it is operated within a web interface and offers a machine-readable export of the list, which can be imported directly into an ELN as a pre-assembled template. The second solution is a guided user interface mounted on a tablet computer; this software collects details about analogue processes (those without computer control, such as polishing and cleaning) and aims to replace paper lab journals with flexible yet systematic event logging. The third solution is a generic LabVIEW code, which attaches to the final parts of tribometer programs. This implementation establishes a connection with Kadi4Mat, creates a record within it, and then uploads the collected data, metadata, and descriptions.

3. Results

The major outcome of this project (shown schematically in Figure 1) is the so-called "FAIR Data Package" of a tribological experiment. The main feature of the package is the availability of descriptions, data, and metadata in both a machine- and human-operable structure. The package contains the necessary information for conducting inquiries about data provenance (e.g., converting raw into processed data), or for exploring the relations between machines, operators, and processes. The documentation associated with the meaning of each entry in the package is contained within the TriboDataFAIR Ontology, which is publicly hosted on GitHub; the various records within the FAIR package are referenced to the specific version of the ontology through GitHub commit hashes. All of these features make the data collected during this project interoperable and reusable. The entire FAIR data package is uploaded to Zenodo, where it becomes findable and accessible.
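Pinning each record to an ontology version via a commit hash can be sketched as a small manifest-building step. The layout and field names below are assumptions made for illustration; they do not reproduce the published package format, and the identifiers and commit hash are placeholders.

```python
import hashlib

# Hedged sketch of assembling a "FAIR Data Package" manifest: each file is
# referenced by a checksum, and the vocabulary used to describe the record is
# fixed to one ontology version via its GitHub commit hash.
def build_manifest(record_id: str, files: dict, ontology_commit: str) -> dict:
    return {
        "record": record_id,
        "ontology": {
            "name": "TriboDataFAIR-Ontology",
            "commit": ontology_commit,  # pins the meaning of every entry
        },
        "files": {
            name: hashlib.sha256(content).hexdigest()
            for name, content in files.items()
        },
    }

manifest = build_manifest(
    record_id="pin-on-disk-showcase-001",   # hypothetical identifier
    files={"friction.csv": b"t,mu\n0,0.12\n"},
    ontology_commit="0123abc",              # placeholder commit hash
)
```

Because the commit hash is immutable, anyone who later downloads the package from a repository such as Zenodo can resolve every term against exactly the ontology version it was recorded with.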
4. Conclusions and Outlook

By going through an entire chain of sub-projects needed for the production of FAIR tribological data, we developed a possible approach and a set of tools for the future integration of such a system into experimentalists' daily lab routine. Not surprisingly, completing such a project required three core teams to work simultaneously: a large group of domain experts in tribology, a team of virtual research environment developers, and a coordinating body of project managers. Looking into the future, once similar systems are proposed by other labs, there will be a need for software solutions (again developed by tribologists and computer scientists together) that enable the automatic remapping of knowledge graphs given the specificities of each individual research group.

References

[1] M. D. Wilkinson et al., "Comment: The FAIR Guiding Principles for scientific data management and stewardship," Sci. Data, vol. 3, no. 1, p. 160018, Dec. 2016.
[2] P. Kügler, M. Marian, B. Schleich, S. Tremmel, and S. Wartzack, "tribAIn - Towards an explicit specification of shared tribological understanding," Appl. Sci., vol. 10, no. 13, 2020.
[3] N. Brandt et al., "Kadi4Mat: A research data infrastructure for materials science," Data Sci. J., vol. 20, no. 1, pp. 1-14, Feb. 2021.
[4] A. Codrignani, B. Frohnapfel, F. Magagnato, P. Schreiber, J. Schneider, and P. Gumbsch, "Numerical and experimental investigation of texture shape and position in the macroscopic contact," Tribol. Int., vol. 122, pp. 46-57, Jun. 2018.