Semi-supervised bibliographic element segmentation with latent permutations
http://hdl.handle.net/10069/26677
Name / File | License | Action |
---|---|---|
LNCS7008_60.pdf (491.6 kB) | | |
Item type | Conference Paper(1) | |||||
---|---|---|---|---|---|---|
Date published | 2011-12-06 | |||||
Title | ||||||
Title | Semi-supervised bibliographic element segmentation with latent permutations | |||||
Language | ||||||
Language | eng | |||||
Resource type | ||||||
Resource type identifier | http://purl.org/coar/resource_type/c_5794 | |||||
Resource type | conference paper | |||||
Authors | Masada, Tomonari; Takasu, Atsuhiro; Shibata, Yuichiro; Oguri, Kiyoshi | |||||
Abstract | ||||||
Description type | Abstract | |||||
Description | This paper proposes a semi-supervised bibliographic element segmentation. Our input data is a large scale set of bibliographic references each given as an unsegmented sequence of word tokens. Our problem is to segment each reference into bibliographic elements, e.g. authors, title, journal, pages, etc. We solve this problem with an LDA-like topic model by assigning each word token to a topic so that the word tokens assigned to the same topic refer to the same bibliographic element. Topic assignments should satisfy contiguity constraint, i.e., the constraint that the word tokens assigned to the same topic should be contiguous. Therefore, we proposed a topic model in our preceding work [8] based on the topic model devised by Chen et al. [3]. Our model extends LDA and realizes unsupervised topic assignments satisfying contiguity constraint. The main contribution of this paper is the proposal of a semi-supervised learning for our proposed model. We assume that at most one third of word tokens are already labeled. In addition, we assume that a few percent of the labels may be incorrect. The experiment showed that our semi-supervised learning improved the unsupervised learning by a large margin and achieved an over 90% segmentation accuracy. | |||||
Description | ||||||
Description type | Other | |||||
Description | 13th International Conference on Asia-Pacific Digital Libraries, ICADL 2011; Beijing; 24 October 2011 through 27 October 2011 | |||||
Bibliographic information | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 7008, pp. 60-69, issued 2011 | |||||
ISSN | ||||||
Source identifier type | ISSN | |||||
Source identifier | 03029743 | |||||
Bibliographic record ID | ||||||
Source identifier type | NCID | |||||
Source identifier | AA0071599X | |||||
Rights | ||||||
Rights information | © 2011 Springer-Verlag. | |||||
Rights | ||||||
Rights information | The original publication is available at www.springerlink.com | |||||
Author version flag | ||||||
Version type | AM | |||||
Version type resource | http://purl.org/coar/version/c_ab4af688f83e57aa | |||||
Publisher | ||||||
Publisher | Springer Verlag | |||||
Citation | ||||||
Description type | Other | |||||
Description | Lecture Notes in Computer Science, 7008, pp.60-69; 2011 | |||||