Neural Network-Based Model for Japanese Predicate Argument Structure Analysis
Tomohide Shibata, Daisuke Kawahara and Sadao Kurohashi (Kyoto University, Japan)

Background & Overview

Predicate-Argument Structure (PAS) analysis is the task of identifying "who does what to whom" in a sentence. Japanese PAS analysis is considered one of the most difficult basic tasks due to two phenomena:
1. Case disappearance: when the topic marker "は" is used, case markings disappear.
2. Argument omission: arguments are often omitted.

Example (dependency parsing, case analysis, and zero anaphora resolution):
  ジョン は パン を 買って 食べた。
  John-TOP bread-ACC bought-and ate
  (The omitted nominative argument φ-NOM of 食べた "ate" must be resolved to John.)

Case markers: が (ga) → nominative (NOM), を (wo) → accusative (ACC), に (ni) → dative (DAT).

SOTA: joint identification of all the arguments [Ouchi+15]. Scores for edges are calculated as the dot product of a sparse, high-dimensional feature vector with a model parameter, so a hand-crafted feature template is needed.

Our proposed model adopts Ouchi's model as its base and is realized by an NN-based two-stage method:
1. Learn selectional preferences in an unsupervised manner from a large raw corpus.
2. For an input sentence, score the likelihood that a predicate takes an element as an argument in an NN framework.

Base Model [Ouchi+15]

[Figure: a bipartite graph connects candidate arguments a1..a5 (ジョン "John", パン "bread", ..., plus a NULL node covering omitted arguments such as φ-ACC) to predicates p1, p2 (買う "buy", 食べる "eat") via NOM/ACC/DAT edges; each edge receives a local score and each pair of edges a global score.]

No external knowledge is used in the base model, even though selectional preferences are the most important clue.

Proposed Model

1. Argument Prediction Model
- PASs are first extracted from an automatically parsed raw Web corpus.
- Selectional preferences are learned from the extracted PASs with an NN, e.g., p(ACC = bread | predicate = eat).

2. NN-Based Score Calculation
- Local and global scores are calculated in an NN framework.
- Predicate/argument embeddings can capture the similar behavior of near-synonyms.
- All combinations of features in the input layer can be considered.

[Figure: the local score score_l(x, e) is computed by a feed-forward net (weights W_l^1, w_l) over the predicate embedding v_p, argument embedding v_a, case, other-argument features, and predicate features; the global score score_g(x, e_i, e_j) is computed analogously (weights W_g^1, w_g) over the embeddings v_pi, v_pj, v_ai, v_aj, cases case_i, case_j, and other features of two edges.]

Experimental Results

Evaluation set: Kyoto University Web Document Leads Corpus (5,000 Web documents). Gold morphologies, dependencies and named entities were used. To handle the "author" and "reader" as referents, two special nodes, [author] and [reader], were added to the graph of the base model. 10M Web sentences were used for training the argument prediction model.

Example with author/reader zero anaphora:
  これまで 生産してきた 商品 の 中から、 画像 で いくつか 紹介していきます。
  so-far produced goods-GEN among image-INS some introduce
  ([author]-NOM, 商品 "goods"-ACC, and [reader]-DAT are zero arguments of 紹介していきます "introduce".)

[Chart: F-measures for case analysis (89.3, 86.0, 76.5) and zero anaphora resolution (49.7, 53.4, 42.1); the mapping of bars to compared systems is not recoverable from the extracted text.]

Future Work
- Inter-sentential zero anaphora resolution
- Incorporate coreference resolution into our model
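The two-stage idea above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the vocabularies, embedding size `EMB`, hidden size `HID`, and all weights are toy assumptions (randomly initialized, untrained), and the corpus-based training of stage 1 is omitted. It only shows the shape of the computation: a softmax over candidate arguments for the argument prediction model, and feed-forward nets over concatenated embeddings for the local and global scores.

```python
# Toy sketch of the two-stage NN scoring (NOT the authors' code).
# All sizes, vocabularies and weights below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
EMB = 8    # embedding dimension (assumption)
HID = 16   # hidden layer size (assumption)

predicates = {"eat": 0, "buy": 1}
arguments  = {"bread": 0, "John": 1, "NULL": 2}   # NULL = omitted argument
cases      = {"NOM": 0, "ACC": 1, "DAT": 2}

pred_emb = rng.normal(size=(len(predicates), EMB))
arg_emb  = rng.normal(size=(len(arguments), EMB))
case_emb = rng.normal(size=(len(cases), EMB))

def relu(x):
    return np.maximum(0.0, x)

# Stage 1: argument prediction model, e.g. p(ACC = bread | predicate = eat).
# In the paper this is trained on PASs extracted from a parsed raw Web
# corpus; here the weights are just random.
W_in  = rng.normal(size=(HID, 2 * EMB))
W_out = rng.normal(size=(len(arguments), HID))

def argument_probs(pred, case):
    """Distribution over candidate arguments for a (predicate, case) slot."""
    h = relu(W_in @ np.concatenate([pred_emb[predicates[pred]],
                                    case_emb[cases[case]]]))
    logits = W_out @ h
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Stage 2: local score for one edge (predicate, argument, case) -- a
# feed-forward net over concatenated embeddings, cf. score_l(x, e).
W_l = rng.normal(size=(HID, 3 * EMB))
w_l = rng.normal(size=HID)

def local_score(pred, arg, case):
    x = np.concatenate([pred_emb[predicates[pred]],
                        arg_emb[arguments[arg]],
                        case_emb[cases[case]]])
    return float(w_l @ relu(W_l @ x))

# Global score over a pair of edges e_i, e_j, cf. score_g(x, e_i, e_j).
W_g = rng.normal(size=(HID, 6 * EMB))
w_g = rng.normal(size=HID)

def global_score(pi, ai, ci, pj, aj, cj):
    x = np.concatenate([pred_emb[predicates[pi]], arg_emb[arguments[ai]],
                        case_emb[cases[ci]],
                        pred_emb[predicates[pj]], arg_emb[arguments[aj]],
                        case_emb[cases[cj]]])
    return float(w_g @ relu(W_g @ x))
```

Because near-synonymous predicates and arguments receive nearby embeddings after training, the same networks generalize across them, which is the advantage over the sparse feature-template scoring of the base model.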