Evaluation of Risk Summarization for Privacy Policies in Japanese with Considering Completeness

Toru Nakamura; Welderufael B. Tesfay; Vanessa Bracamonte; Shinsaku Kiyomoto; Nobuo Suzuki

https://doi.org/10.20729/00208902

Name / File	License	Action
IPSJ-JNL6201041.pdf (1.1 MB)	Copyright (c) 2021 by the Information Processing Society of Japan
Open access

Item type

Journal(1)

Publication date

2021-01-15

Title

Evaluation of Risk Summarization for Privacy Policies in Japanese with Considering Completeness

Original title (Japanese)

日本語のプライバシポリシに対する完全性を考慮したリスク要約手法の評価

Language

jpn

Keywords

Subject scheme

Other

Subject

[Regular paper] privacy protection, privacy policy, machine learning, natural language processing

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_6501

Resource type

journal article

ID registration

10.20729/00208902

ID registration type

JaLC

Author affiliation

Advanced Telecommunications Research Institute International (ATR)

Author affiliation

Goethe University Frankfurt

Author affiliation

KDDI Research

Author affiliation

KDDI Research

Author affiliation

Kindai University / Advanced Telecommunications Research Institute International (ATR)

Author names

Toru Nakamura
Welderufael B. Tesfay
Vanessa Bracamonte
Shinsaku Kiyomoto
Nobuo Suzuki

Abstract

Description type

Other

Description

This study aims to make privacy policies easier to understand by summarizing them with natural language processing and machine learning. We first build an evaluation corpus by collecting Japanese privacy policies and labeling them. The labels cover not only items related to privacy risk but also items related to the completeness of a privacy policy, that is, the property that the policy covers the necessary topics. We then extract features from the text of the policies in this corpus, train machine learning models, and evaluate their accuracy. Specifically, we evaluate label prediction accuracy with Bag-of-Words (BoW), TF-IDF, and Doc2Vec as feature extraction algorithms and with support vector machines and random forests as learning algorithms. On this corpus, the combination of BoW or TF-IDF with a random forest achieved the best F1 value for predicting labels, and no significant difference was found between BoW and TF-IDF.
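The feature extraction and scoring steps named in the abstract can be illustrated with a minimal standard-library sketch. It is not the authors' implementation. The function names `bow_vectors`, `tfidf_vectors`, and `f1_score` are hypothetical, the toy segments are invented, and the text is assumed to be already split into whitespace-separated tokens (Japanese text would first need a morphological analyzer). The idf weighting shown, ln(N / df), is one common variant and may differ from the one used in the paper. The classifier itself (support vector machine or random forest) is left out.

```python
import math
from collections import Counter


def bow_vectors(docs):
    """Return (vocabulary, count vectors) for whitespace-tokenized documents."""
    tokenized = [doc.split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tfidf_vectors(docs):
    """Reweight BoW counts by idf = ln(N / df) (an assumed, common variant)."""
    vocab, counts = bow_vectors(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for vec in counts if vec[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    return vocab, [[c * w for c, w in zip(vec, idf)] for vec in counts]


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for one label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy segments, not taken from the paper's corpus.
docs = [
    "we share data with third parties",
    "we collect location data",
    "contact us by email",
]
vocab, bow = bow_vectors(docs)
_, tfidf = tfidf_vectors(docs)
```

Each row of `bow` or `tfidf` would serve as the input vector for one policy segment, and `f1_score` would then be computed per label on held-out predictions.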

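Bibliographic record ID

Source identifier type

NCID

Source identifier

AN00116647

Bibliographic information

IPSJ Journal (情報処理学会論文誌)

Vol. 62, No. 1, pp. 332-345, issued 2021-01-15

ISSN

Source identifier type

ISSN

Source identifier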

1882-7764
