Machine Translation between Japanese and Python Code using Transformer

秋信, 有花; 小原, 百々雅; 縫嶋, 慧深; 倉光, 君郎; Yuka, Akinobu; Momoka, Obara; Emi, Nuijima; Kimio, Kuramitsu

https://ipsj.ixsq.nii.ac.jp/records/213901

Name / File	License	Action
IPSJ-TPRO1405008.pdf (105.6 kB)	Copyright (c) 2021 by the Information Processing Society of Japan
Open access

Item type

Trans(1)

Publication date

2021-11-25

Title

Transformerによる日本語とPythonコード間の機械翻訳

Title (English)

Machine Translation between Japanese and Python Code using Transformer

Language

jpn

Keywords

Subject scheme

Other

Subject

[Presentation Abstract, Unrefereed Presentation Abstract]

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_6501

Resource type

journal article

Author affiliations

Graduate School of Science Division of Mathematical and Physical Sciences, Japan Women's University
Department of Mathematical and Physical Sciences, Japan Women's University
SoftBank Corp.
Department of Mathematical and Physical Sciences, Japan Women's University

Author names

秋信, 有花
小原, 百々雅
縫嶋, 慧深
倉光, 君郎

Author names (English)

Yuka, Akinobu
Momoka, Obara
Emi, Nuijima
Kimio, Kuramitsu

Abstract

Description type

Other

Description

Transformer is a deep learning model for natural language processing. It is known for clearly outperforming earlier models such as RNNs and LSTMs across natural language processing tasks in general. Aiming to support programming through natural language, we built a neural machine translation model between Japanese and Python code using Transformer. Its distinguishing features are the tokenization of Japanese with SentencePiece and the vectorization of code using special tokens. With these techniques, we obtained higher scores on BLEU, a standard evaluation metric for machine translation, than previous work. In this presentation, we report on our translation model between Japanese and Python code and on its translation accuracy. In the experiments, we built and evaluated a range of model variants by varying whether the training data was preprocessed and the size of the parallel corpus. Based on these findings, we summarize the prospects for applying deep learning techniques to source code.
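The abstract names two ingredients that a short example can make concrete: representing code with special tokens, and scoring a translation with BLEU. The sketch below is illustrative only and is not the authors' implementation. The abstract does not say which special tokens were used, so the `<nl>`, `<indent>` and `<dedent>` markers and the helper names `code_to_tokens` and `bleu` are assumptions made for this example. The BLEU function is a plain single-reference, unsmoothed variant, and the SentencePiece step for the Japanese side is not shown.

```python
import io
import math
import tokenize
from collections import Counter
from typing import List

# Hypothetical special tokens for Python layout; the record does not
# specify the tokens actually used by the authors.
SPECIAL = {
    tokenize.NEWLINE: "<nl>",
    tokenize.INDENT: "<indent>",
    tokenize.DEDENT: "<dedent>",
}
SKIPPED = (tokenize.NL, tokenize.COMMENT, tokenize.ENDMARKER)


def code_to_tokens(source: str) -> List[str]:
    """Flatten Python source into a token list, marking layout explicitly."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in SPECIAL:
            tokens.append(SPECIAL[tok.type])
        elif tok.type not in SKIPPED:
            tokens.append(tok.string)
    return tokens


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(reference: List[str], candidate: List[str], max_n: int = 4) -> float:
    """Sentence-level BLEU with one reference and no smoothing."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        overlap = sum((cand & ref).values())  # clipped n-gram matches
        total = sum(cand.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, one empty order zeroes the score
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(candidate) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(sum(log_precisions) / max_n)


reference = code_to_tokens("if x:\n    y = 1\n")
candidate = code_to_tokens("if x:\n    y = 2\n")
print(reference)
print(round(bleu(reference, candidate), 3))
```

Making newlines and indentation explicit keeps block structure recoverable from a flat token sequence, which is one plausible reason to use special tokens when treating code as a translation target. Published BLEU figures are normally computed at corpus level with an established tool, so scores from this sketch are not comparable with the results the abstract refers to.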

Bibliographic record ID

Source identifier type

NCID

Source identifier

AA11464814

Bibliographic information

IPSJ Transactions on Programming (PRO)

Vol. 14, No. 5, p. 50, issued 2021-11-25

ISSN

Source identifier type

ISSN

Source identifier

1882-7802

Publisher

Information Processing Society of Japan

Versions

Ver.1

2025-01-19 16:58:35.798108
