{"updated":"2025-01-19T15:20:24.347849+00:00","metadata":{"_oai":{"id":"oai:ipsj.ixsq.nii.ac.jp:00217901","sets":["1164:4088:10830:10908"]},"path":["10908"],"owner":"44499","recid":"217901","title":["アラートスコアリングによるITシステムの障害箇所推定方式の提案"],"pubdate":{"attribute_name":"公開日","attribute_value":"2022-05-12"},"_buckets":{"deposit":"e6c75286-4000-40c0-8f68-664e1f2a47ab"},"_deposit":{"id":"217901","pid":{"type":"depid","value":"217901","revision_id":0},"owners":[44499],"status":"published","created_by":44499},"item_title":"アラートスコアリングによるITシステムの障害箇所推定方式の提案","author_link":["565436","565438","565437","565439","565441","565440"],"item_titles":{"attribute_name":"タイトル","attribute_value_mlt":[{"subitem_title":"アラートスコアリングによるITシステムの障害箇所推定方式の提案"},{"subitem_title":"Proposal for a Method of Estimating IT System Failure Locations Using Alert Scoring","subitem_title_language":"en"}]},"item_keyword":{"attribute_name":"キーワード","attribute_value_mlt":[{"subitem_subject":"ICM","subitem_subject_scheme":"Other"}]},"item_type_id":"4","publish_date":"2022-05-12","item_4_text_3":{"attribute_name":"著者所属","attribute_value_mlt":[{"subitem_text_value":"富士通株式会社"},{"subitem_text_value":"富士通株式会社"},{"subitem_text_value":"富士通株式会社"}]},"item_4_text_4":{"attribute_name":"著者所属(英)","attribute_value_mlt":[{"subitem_text_value":"FUJITSU LIMITED","subitem_text_language":"en"},{"subitem_text_value":"FUJITSU LIMITED","subitem_text_language":"en"},{"subitem_text_value":"FUJITSU LIMITED","subitem_text_language":"en"}]},"item_language":{"attribute_name":"言語","attribute_value_mlt":[{"subitem_language":"jpn"}]},"item_publisher":{"attribute_name":"出版者","attribute_value_mlt":[{"subitem_publisher":"情報処理学会","subitem_publisher_language":"ja"}]},"publish_status":"0","weko_shared_id":-1,"item_file_price":{"attribute_name":"Billing 
file","attribute_type":"file","attribute_value_mlt":[{"url":{"url":"https://ipsj.ixsq.nii.ac.jp/record/217901/files/IPSJ-IOT22057019.pdf","label":"IPSJ-IOT22057019.pdf"},"format":"application/pdf","billing":["billing_file"],"filename":"IPSJ-IOT22057019.pdf","filesize":[{"value":"1.7 MB"}],"mimetype":"application/pdf","priceinfo":[{"tax":["include_tax"],"price":"0","billingrole":"43"},{"tax":["include_tax"],"price":"0","billingrole":"44"}],"accessrole":"open_login","version_id":"c7b2843d-0101-48da-af46-c41b73c1d2e8","displaytype":"detail","licensetype":"license_note","license_note":"Copyright (c) 2022 by the Institute of Electronics, Information and Communication Engineers This SIG report is only available to those in membership of the SIG."}]},"item_4_creator_5":{"attribute_name":"著者名","attribute_type":"creator","attribute_value_mlt":[{"creatorNames":[{"creatorName":"近藤, 玲子"}],"nameIdentifiers":[{}]},{"creatorNames":[{"creatorName":"荻原, 一隆"}],"nameIdentifiers":[{}]},{"creatorNames":[{"creatorName":"白石, 崇"}],"nameIdentifiers":[{}]}]},"item_4_creator_6":{"attribute_name":"著者名(英)","attribute_type":"creator","attribute_value_mlt":[{"creatorNames":[{"creatorName":"Reiko, Kondo","creatorNameLang":"en"}],"nameIdentifiers":[{}]},{"creatorNames":[{"creatorName":"Kazutaka, Ogihara","creatorNameLang":"en"}],"nameIdentifiers":[{}]},{"creatorNames":[{"creatorName":"Takashi, Shiraishi","creatorNameLang":"en"}],"nameIdentifiers":[{}]}]},"item_4_source_id_9":{"attribute_name":"書誌レコードID","attribute_value_mlt":[{"subitem_source_identifier":"AA12326962","subitem_source_identifier_type":"NCID"}]},"item_4_textarea_12":{"attribute_name":"Notice","attribute_value_mlt":[{"subitem_textarea_value":"SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc."}]},"item_resource_type":{"attribute_name":"資源タイプ","attribute_value_mlt":[{"resourceuri":"http://purl.org/coar/resource_type/c_18gh","resourcetype":"technical 
report"}]},"item_4_source_id_11":{"attribute_name":"ISSN","attribute_value_mlt":[{"subitem_source_identifier":"2188-8787","subitem_source_identifier_type":"ISSN"}]},"item_4_description_7":{"attribute_name":"論文抄録","attribute_value_mlt":[{"subitem_description":"多数のアプリの組み合わせからなるサービス(マイクロサービス)が複数の仮想マシンや物理マシン上に配置されているシステムでは,各機器が連携して動作するため障害が発生すると広範な機器からアラートが発生し,障害箇所の特定が難しい.さらに障害箇所特定のためアラート発出の閾値設定を行うが,その設定が不適切だった場合に,その機器を障害箇所として特定できない問題もある.また多数の機器があるため複数の異なる障害が短時間の内に発生することも考えられ,異なる障害箇所から発生した複数のアラートを同じ障害として原因調査してしまうこともあり得る.そこで我々は,各機器のアラートから障害箇所である可能性をスコアリングする障害箇所推定方式を提案する.障害箇所推定方式は次の 3 つの特徴を持つ.アプリとインフラの依存関係を統合して分析することで,アプリとインフラを跨いだ広範な機器のアラートから障害箇所を推定できる.また機器間のアラートの伝搬をスコアに反映することで,閾値設定ミス等でアラートが上がらない機器も障害箇所として推定できる.さらに構成情報と機器間の依存関係から関連するアラートをグループ化することで,複数障害の分類への応用も期待できる.本提案技術で運用者はスコアの高い機器から調査を行うことができ,障害復旧時間の短縮が見込める.","subitem_description_type":"Other"}]},"item_4_description_8":{"attribute_name":"論文抄録(英)","attribute_value_mlt":[{"subitem_description":"In a system where services (microservices) consisting of a combination of many applications are deployed on multiple virtual and physical machines, each device works together, so when a failure occurs, alerts are generated from a wide range of devices, making it difficult to identify the location of the failure. In addition, the alert thresholds are set to identify the failure location, but if the settings are inappropriate, the equipment cannot be identified as the failure location. Also, because of the large number of devices, multiple different failures can occur in a short period of time, and it is possible that multiple alerts generated from different failure locations may be investigated as the same failure. Therefore, we propose a failure location estimation method that evaluates possibility of a failure location from the alerts of each device. The failure location estimation method has the following three features. 
By integrating and analyzing the dependencies between applications and infrastructure, the failure location can be estimated from alerts for a wide range of devices spanning applications and infrastructure. In addition, by reflecting the propagation of alerts between devices in the score, even devices that do not raise alerts due to a threshold setting error, etc., can be estimated as fault locations. Furthermore, by grouping related alerts based on configuration information and dependencies between devices, the method is expected to be applicable to the classification of multiple failures. By using the proposed technology, operators can investigate the equipment with the highest score first, which is expected to shorten the time required for fault recovery.","subitem_description_type":"Other"}]},"item_4_biblio_info_10":{"attribute_name":"書誌情報","attribute_value_mlt":[{"bibliographicPageEnd":"6","bibliographic_titles":[{"bibliographic_title":"研究報告インターネットと運用技術(IOT)"}],"bibliographicPageStart":"1","bibliographicIssueDates":{"bibliographicIssueDate":"2022-05-12","bibliographicIssueDateType":"Issued"},"bibliographicIssueNumber":"19","bibliographicVolumeNumber":"2022-IOT-57"}]},"relation_version_is_last":true,"weko_creator_id":"44499"},"created":"2025-01-19T01:18:19.622498+00:00","id":217901,"links":{}}