From the ACE generation (ERG) log files in a translation pipeline:
25841 EP 'ja:_koto_n_nom' is not covered
25008 EP 'ja:neg_x' is not covered
24090 EP 'ja:coord_c' is not covered
21305 EP 'ja:_te_p_adjunct' is not covered
14923 EP 'ja:unspec_adj' is not covered
14923 EP 'ja:degree' is not covered
12529 EP 'ja:_you_n' is not covered
11217 EP 'ja:adversative' is not covered
7539 EP 'ja:_ni_p' is not covered
7482 EP 'ja:udef_q' is not covered
7340 EP 'ja:vv' is not covered
7122 EP 'ja:_suru_v_soc' is not covered
6587 EP 'ja:_kudasaru_v_aux' is not covered
5926 EP 'ja:_no_p' is not covered
4323 EP 'ja:_comma_d' is not covered
4054 EP 'ja:unknown_v' is not covered
3286 EP 'ja:_tokoro_n_2' is not covered
3164 EP 'ja:_ga_d' is not covered
3121 EP 'ja:_hou_n_7' is not covered
3115 EP 'ja:_はやる_v_unk' is not covered
3084 EP 'ja:_sha_a_4' is not covered
3076 EP 'ja:discourse_x' is not covered
3075 EP 'ja:_mo_d' is not covered
2934 EP 'ja:_chuu_n' is not covered
2779 EP 'ja:plus' is not covered
2309 EP 'ja:_made_p' is not covered
2267 EP 'ja:_mato_n' is not covered
2199 EP 'ja:_tame_n_5' is not covered
2190 EP 'ja:dofw' is not covered
This is a partial list. On the left are the occurrence counts. It's not surprising that Jacy predicates are not covered by the ERG, but when they are very frequent it means that JaEn should perhaps have a hand-built rule to catch the cases when the automatically extracted rules fail to transfer something. In some cases, there is such a rule, but it has become outdated. For instance, neg_x is not covered because JaEn's rule still targets neg_v. Similarly, JaEn targets coord instead of coord_c.
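A tally like the one above can be reproduced with a few lines of Python. This is only a sketch: it assumes each warning in the log contains the text `EP '<predicate>' is not covered` exactly as shown in the list, and the function name `tally_uncovered` is made up for illustration, not part of ACE or JaEn.

```python
import re
from collections import Counter

# Assumed message shape, taken from the list above:
#   EP 'ja:neg_x' is not covered
NOT_COVERED = re.compile(r"EP '([^']+)' is not covered")


def tally_uncovered(lines):
    """Count how often each predicate is reported as not covered."""
    counts = Counter()
    for line in lines:
        match = NOT_COVERED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Small in-memory example; real input would be the lines of the log files.
sample = [
    "EP 'ja:neg_x' is not covered",
    "EP 'ja:coord_c' is not covered",
    "EP 'ja:neg_x' is not covered",
    "some unrelated log line",
]
for pred, n in tally_uncovered(sample).most_common():
    print(f"{n} EP '{pred}' is not covered")
```

Sorting by frequency puts the predicates most worth a hand-built rule at the top.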
And here are some of those that aren't covered on the ERG side:
30754 EP 'def_q' is not covered
16051 EP 'implicit_q' is not covered
5386 EP '_good_a_at-for' is not covered
4053 EP 'of_rel_noun_mark' is not covered
3168 EP '_house_n_1' is not covered
2879 EP '_so_c' is not covered
2266 EP 'time_n' is not covered
1540 EP 'place_n' is not covered
1269 EP 'abstr_deg' is not covered
889 EP 'def_implicit_q' is not covered
848 EP '_soon_p' is not covered
794 EP '_home_p' is not covered
654 EP '_late_p' is not covered
555 EP '_here_a_1' is not covered
537 EP 'manner' is not covered
517 EP '_yesterday_a_1' is not covered
502 EP '_tomorrow_a_1' is not covered
435 EP '_bear_v_2' is not covered
383 EP '_there_a_1' is not covered
354 EP 'thing' is not covered
300 EP '_as_p_comp' is not covered
297 EP '_grandmother_n_1' is not covered
264 EP '_of_x_subord' is not covered
259 EP '_i_n_num' is not covered
240 EP 'numbered_hour' is not covered
188 EP 'pron' is not covered
There are some other reasons for these, but generally it's also because the hand-built JaEn rules are out of date. The def_q and implicit_q ones are because the modified SEM-I for the ERG missed them.
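To keep the two lists apart when triaging, the predicates can be split on the `ja:` prefix. A rough sketch follows, assuming, as in the lists above, that untransferred Jacy predicates always carry the `ja:` prefix and ERG-side predicates never do; `split_by_side` is a hypothetical helper name, and the counts are copied from the lists above.

```python
def split_by_side(counts, prefix="ja:"):
    """Split predicate counts into (Jacy-side, ERG-side) dicts by prefix."""
    jacy = {p: n for p, n in counts.items() if p.startswith(prefix)}
    erg = {p: n for p, n in counts.items() if not p.startswith(prefix)}
    return jacy, erg


# A few counts copied from the lists above.
counts = {
    "ja:neg_x": 25008,
    "ja:coord_c": 24090,
    "def_q": 30754,
    "implicit_q": 16051,
}
jacy, erg = split_by_side(counts)
print("Jacy side:", sorted(jacy, key=jacy.get, reverse=True))
print("ERG side:", sorted(erg, key=erg.get, reverse=True))
```

The Jacy-side group points at predicates that were never transferred, while the ERG-side group points at transfer output the ERG does not recognize.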
_koto_n_nom can perhaps just be dropped, or be added to the auto-include set for my extractor (gets included in an extracted transfer rule even if it didn't exist in the predicate alignment, as long as it is incorporated into the rest of the MRS fragment).
unspec_adj and degree have the same count because they always co-occur. There should be a general rule or two written for these. Maybe: