@techreport{oai:ipsj.ixsq.nii.ac.jp:00206112,
  author = {築地, 毅 and 鈴木, 晴也 and 柴原, 一友 and 藤本, 浩司 and 池田, 龍司 and 尾﨑, 和基 and 森田, 克明 and 松原, 敬信 and
            Tsukiji, Tsuyoshi and Suzuki, Haruya and Shibahara, Kazutomo and Fujimoto, Koji and Ikeda, Ryuji and Ozaki, Kazuki and Morita, Katsuaki and Matsubara, Takanobu},
  title  = {BERTの教師無しデータへの適用},
  issue  = {4},
  month  = jun,
  year   = {2020},
  note   = {本稿では,BERT を利用した教師無しデータへの適用について論ずる.近年ディープラーニングの技術が確立し始めており,特に画像認識分野において,既存の技術では困難だった特徴の自動抽出を実現したことにより,非常に高い精度を上げるようになってきている.自然言語処理においてもディープラーニングの研究は広く行われているが,近年 Google により発表された BERT の功績は大きく,教師あり学習のタスクに対して,既存の成果を大きく上回る成果を上げている.本稿では,教師あり学習の精度を大きく高めた BERT を教師無しデータに適用することで,既存手法の性能向上につながる可能性があるという仮説を主張する.本稿では,特許文書を対象に,教師あり学習を行わずに特許の類似性を図る実験を行った.実験の結果,人手で付与した特許分類フラグに対し 61.9 \%の正解率となり,BERT を活用することで教師データを与えずとも,特許の類似度を表現できることを示した.
            In this paper, we discuss the application of BERT to unsupervised data. In recent years, deep learning technology has begun to mature; in image recognition in particular, automatic feature extraction, which was difficult with earlier techniques, has led to very high accuracy. Deep learning has also been widely studied in natural language processing, and BERT, recently released by Google, has made a major contribution, far exceeding previous results on supervised learning tasks. We hypothesize that applying BERT, which has greatly improved the accuracy of supervised learning, to unsupervised data may improve the performance of existing methods. We conducted an experiment on patent documents that measured patent similarity without supervised learning. The method achieved an accuracy of 61.9\% against manually assigned patent classification flags, showing that BERT can represent patent similarity without labeled training data.}
}