- Title
An intelligent mining method for multi-source heterogeneous Silk Road cultural heritage data.
- Authors
杨寒淋; 周娅鹃; 赵 丰; 徐 蓉; 安薇竹; 翁正秋; 宁灵舰; 金 宇
- Abstract
Silk, one of the most important inventions of ancient China, carries rich cultural, technological, and social connotations. The Silk Road, originally used to transport silk, opened the first large-scale trade exchange between East and West in world history. Since the Silk Road was successfully inscribed on the World Heritage List in 2014, its historical concept and practical significance have been further explored and expanded. To further promote the value of Silk Road cultural heritage and build a bridge for mutual learning between civilizations, an in-depth analysis of Silk Road cultural heritage data is needed. However, the current data come from different sources (e.g., different countries, languages, and platforms) and in different modalities (e.g., structured data in databases alongside documents, reports, XML, and other unstructured data). This multi-source, heterogeneous character makes deep processing of the massive, multi-dimensional data difficult. To achieve deep and efficient integration of Silk Road cultural heritage data, an intelligent mining method for multi-source heterogeneous data is studied. We first collect information about the Silk Roads through a vertical search of Internet data. For multi-source heterogeneous Silk Road data, the method collects coarse-grained information across the entire Internet with human-machine collaboration to ensure wide coverage. It supports massive data storage and a full-text retrieval system that responds in milliseconds over millions of documents. Translation software is integrated into the capture process, also with human-machine collaboration, so that information in multiple languages can be collected.
Then we use a support vector machine (SVM) to complete text classification automatically, quickly, and accurately. Specifically, we use TF-IDF to extract the words that best express the subject and content of a text, scan the full text, and extract key elements. The text is represented as a sequence, the sentence with the highest weight is output as the abstract, and the SVM classifies the text content. The data are then cleaned by de-duplication and denoising using text clustering techniques. For redundancy removal, a similarity calculation method based on text clustering is proposed, which filters redundant data against a critical threshold. For denoising, outlier analysis is used to eliminate noisy data. Finally, influential events are selected to form the Annual Report Cultural Heritage on the Silk Roads for public release around the world. In this paper, a data acquisition system with high coverage and high efficiency is constructed; redundant and noisy data are removed by a multi-dimensional fusion data cleaning method; and automatic indexing, automatic abstracting, and data classification methods are designed for multi-source heterogeneous Silk Road heritage data. It is found that studying Silk Road cultural heritage data with artificial intelligence data mining technology can effectively ensure the comprehensiveness, multi-dimensionality, and efficiency of the data. The research results aim to publicize the value of Silk Road heritage, stimulate public attention to and interest in the Silk Roads, and provide experience and reference for the analysis and mining of Silk Road cultural heritage data.
- Publication
Journal of Silk, 2023, Vol 60, Issue 1, p9
- ISSN
1001-7003
- Publication type
Article
- DOI
10.3969/j.issn.1001-7003.2023.01.002
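
The abstract describes TF-IDF keyword extraction and a similarity threshold for filtering redundant records, but it does not give the exact formulas. The following is a minimal sketch of those two ideas, not the authors' implementation. It assumes whitespace tokenization, a smoothed IDF, cosine similarity as the similarity measure, and a threshold of 0.8; the function names (`tfidf_vectors`, `top_keywords`, `deduplicate`) are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (smoothed TF-IDF, an assumption)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def top_keywords(vector, k=3):
    """Return the k highest-weighted terms, ties broken alphabetically."""
    ranked = sorted(vector.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def deduplicate(docs, threshold=0.8):
    """Keep a document only if it is below the threshold against every kept one.

    Returns the indices of the retained documents, in original order.
    """
    vectors = tfidf_vectors(docs)
    kept = []
    for i, vec in enumerate(vectors):
        if all(cosine(vec, vectors[j]) < threshold for j in kept):
            kept.append(i)
    return kept


docs = [
    "silk road heritage site report",
    "silk road heritage site report",
    "museum exhibition of ancient textiles",
]
kept_indices = deduplicate(docs)
keywords = top_keywords(tfidf_vectors(docs)[2], k=2)
```

In this sketch the second record is dropped as a duplicate of the first, while the third shares no terms with it and is retained. A production pipeline would need language-aware tokenization for multilingual data and a threshold tuned on labeled examples.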