Notice: This is not the latest version of this item. The latest version can be found at: https://fordatis.fraunhofer.de/handle/fordatis/385.2
Full metadata record
DC Field | Value | Language
dc.contributor.author | Durmaz, Ali Riza | -
dc.contributor.author | Thomas, Akhil | -
dc.contributor.author | Mishra, Lokesh | -
dc.contributor.author | Niranjan Murthy, Rachana | -
dc.contributor.author | Straub, Thomas | -
dc.date.accessioned | 2024-03-18T11:19:29Z | -
dc.date.available | 2024-03-18T11:19:29Z | -
dc.date.issued | 2024 | -
dc.identifier.uri | https://fordatis.fraunhofer.de/handle/fordatis/385 | -
dc.identifier.uri | http://dx.doi.org/10.24406/fordatis/329 | -
dc.description.abstract | This repository contains named-entity recognition (NER) datasets for four materials science and engineering (MSE) publications and utility functions to handle the data. The scholarly articles used as a basis cover crystallographic defects, microstructure, and mechanical properties, in particular fatigue. Each annotation corresponds to a class in a materials science domain ontology called the materials mechanics ontology. This should prospectively enable linking materials knowledge and data to facilitate training neurosymbolic machine learning models. Two dataset variants are published: (1) coarse-granular named-entity recognition (CG-NER), where the annotated concepts are high-level ontological classes, and (2) fine-granular named-entity recognition (FG-NER), where the annotated concepts are low-level ontological classes. Aside from the link to the ontology, a characteristic of the dataset is its high degree of annotation. Namely, the fine-granular dataset uses 179 distinct ontological classes, and 27% of all its tokens are annotated. | en
dc.description.sponsorship | The authors express their gratitude to the German Federal Ministry of Education and Research (BMBF) for the funding in the scope of the iBain project (13XP5118B) as part of MaterialDigital. | en
dc.language.iso | en | en
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | en
dc.subject | Named entity recognition | en
dc.subject | NER | en
dc.subject | Materials science and engineering | en
dc.subject | Ontology | en
dc.subject | CONLL | en
dc.subject | TSV | en
dc.subject.ddc | DDC::000 Informatik, Informationswissenschaft, allgemeine Werke | en
dc.title | MaterioMiner - An ontology-based text mining dataset for extraction of process-structure-property entities | en
dc.type | Tabular Data | en
dc.contributor.funder | Bundesministerium für Bildung und Forschung BMBF (Deutschland) | en
dc.description.technicalinformation | Please follow the readme.md file. | en
fordatis.group | Werkstoffe, Bauteile | en
fordatis.institute | IWM Fraunhofer-Institut für Werkstoffmechanik | en
fordatis.rawdata | false | en
fordatis.sponsorship.projectid | 13XP5118B | en
fordatis.sponsorship.projectname | Intelligent data-guided process design for fatigue-resistant steel components using the example of bainitic microstructure | en
fordatis.sponsorship.projectacronym | iBain | en
fordatis.sponsorship.ResearchFrameworkProgramm | MaterialDigital | en
Appears in Collections: Fraunhofer-Institut für Werkstoffmechanik IWM

Files in This Item:
File | Description | Size | Format
materio-miner-1.0.0.zip | Zip file corresponding to release 1.0.0 of the MaterioMiner dataset repository | 570.05 kB | ZIP

Version History
Version | Item | Date | Summary
2 | fordatis/385.2 | 2024-08-13 16:28:07.74 | A new version of the ontology is linked which adds descriptions for all classes, normalizes the rdfs:label, and adds OWL property characteristics for a few object properties. The data subsets' names are changed to "fine-grained" and "coarse-grained".
1 | fordatis/385 | 2024-03-18 12:19:29.0 |

This item is licensed under a Creative Commons License.