In recent years, there has been growing interest in releasing data analysis knowledge widely on the web rather than keeping data and analysis skills in-house. A data exchange platform is a type of digital platform on which stakeholders such as data owners, users, and analysts exchange data. However, the datasets handled on such platforms are acquired and stored independently by data providers for their own purposes. They were not designed to be coordinated or combined, and little information is currently available to support a systematic discussion of how to organize and combine them. In this study, we focus on metadata, i.e., summary information about data, and examine the similarity of datasets on a data exchange platform using natural language processing. In our experiments, we use metadata from the data exchange platform Kaggle. To compare datasets, our method uses word2vec and BERT as vectorization methods to convert data descriptions into vectors, and then measures the similarity between each pair of vectors by computing their cosine similarity. The experimental results show that Kaggle has the same characteristics as other data exchange platforms. In addition, the results indicate the usefulness of the natural language processing-based method for extracting similar data pairs.
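
As a minimal illustration of the comparison step described above, the following sketch ranks pairs of dataset descriptions by cosine similarity. It is not the implementation used in this study. The `embed` function is a hypothetical bag-of-words stand-in for the word2vec and BERT vectorizers, chosen so that the example runs without external models. The dataset names and descriptions are invented for illustration and are not taken from Kaggle.

```python
import math
from collections import Counter
from itertools import combinations


def embed(text, vocab):
    """Hypothetical stand-in for word2vec/BERT: a bag-of-words count vector."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocab]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_pairs(descriptions):
    """Return (similarity, name_a, name_b) tuples, most similar pair first."""
    vocab = sorted({w for d in descriptions.values() for w in d.lower().split()})
    vectors = {name: embed(text, vocab) for name, text in descriptions.items()}
    pairs = [
        (cosine_similarity(vectors[a], vectors[b]), a, b)
        for a, b in combinations(sorted(vectors), 2)
    ]
    return sorted(pairs, reverse=True)


# Invented example metadata (assumption: one free-text description per dataset).
descriptions = {
    "housing": "house prices and sales records by city",
    "rentals": "apartment rental prices by city",
    "tweets": "short text messages with sentiment labels",
}

ranked = rank_pairs(descriptions)
for similarity, name_a, name_b in ranked:
    print(f"{name_a} / {name_b}: {similarity:.3f}")
```

Replacing `embed` with averaged word2vec word vectors or a BERT sentence representation would leave the pairing and ranking logic unchanged, since only the vectors differ.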