Item type |
Symposium(1) |
Publication date |
2024-11-30 |
Title |
Building a Semantic Search Platform for Exploring Historical Chinese Corpora |
Title language |
en |
Language |
eng |
Keywords |
|
|
Subject scheme |
Other |
|
Subject |
Historical corpus, Large Language Models, Semantic change |
Resource type |
|
|
Resource type identifier |
http://purl.org/coar/resource_type/c_5794 |
|
Resource type |
conference paper |
Author affiliation |
|
|
|
Graduate Institute of Linguistics, National Taiwan University, Taipei, Taiwan |
Author affiliation |
|
|
|
Graduate Institute of Linguistics, National Taiwan University, Taipei, Taiwan |
Author affiliation |
|
|
|
Graduate Institute of Linguistics, National Taiwan University, Taipei, Taiwan |
Author name |
Kitsunai, Micah
Watty, Deborah
Hsieh, Shu-Kai
|
Author name (English) |
Micah Kitsunai
Deborah Watty
Shu-Kai Hsieh
|
Abstract |
|
|
Description type |
Other |
|
Description |
This work introduces a historical corpus of the Chinese language spanning approximately 3,000 years and proposes a new corpus search system utilizing word embedding techniques and large language models (LLMs). The system adopts a hybrid search method that combines traditional keyword search with vector-based search based on semantic relationships. This approach enables searches for semantically similar words and visualizations of semantic change, which were challenging with conventional corpus search methods. Additionally, based on the collected corpus data, we implemented a feature to visualize changes in word meanings across specific periods and media types. This interface allows for a multifaceted analysis of language evolution, demonstrating a more effective analytical approach than traditional methods. |
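The hybrid search idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`cosine`, `hybrid_search`), the weighting parameter `alpha`, the linear score combination, and the toy documents with 3-dimensional vectors are all assumptions made for illustration. A real system would presumably obtain vectors from a word embedding model or an LLM.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def hybrid_search(query_term, query_vec, docs, alpha=0.5):
    """Rank documents by alpha * keyword match + (1 - alpha) * cosine similarity.

    The linear combination and `alpha` are hypothetical choices, not taken
    from the paper.
    """
    ranked = []
    for doc in docs:
        # Keyword component: exact substring match, scored 1.0 or 0.0.
        keyword_score = 1.0 if query_term in doc["text"] else 0.0
        # Vector component: semantic closeness of query and document vectors.
        vector_score = cosine(query_vec, doc["vec"])
        score = alpha * keyword_score + (1 - alpha) * vector_score
        ranked.append((score, doc["id"]))
    ranked.sort(reverse=True)
    return [doc_id for _, doc_id in ranked]


# Toy data: texts and vectors are invented for illustration only.
DOCS = [
    {"id": "a", "text": "走 馬", "vec": [1.0, 0.0, 0.0]},
    {"id": "b", "text": "行 路", "vec": [0.9, 0.1, 0.0]},
    {"id": "c", "text": "食 飯", "vec": [0.0, 0.0, 1.0]},
]

results = hybrid_search("走", [1.0, 0.0, 0.0], DOCS)
print(results)
```

In this toy setup, document `a` matches on both the keyword and the vector, while `b` is retrieved only through vector similarity, which is the kind of result a keyword-only search would miss.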
Bibliographic information |
Jinmonkon 2024 Proceedings (じんもんこん2024論文集)
Volume 2024,
p. 241-246,
Issue date 2024-11-30
|
Publisher |
|
|
Language |
ja |
|
Publisher |
Information Processing Society of Japan (情報処理学会) |