#4 Enhancing Structural Mapping with LLM-derived Abstractions for Analogical Reasoning in Narratives

Detailed Summary (EN)

Read-like-fullpaper digest

This paper tackles analogical reasoning in narratives. Analogy plays a central role in complex domains and tasks: legal argumentation, business decision-making, causal explanation, creative problem-solving [3], and even non-traditional forms of reasoning, such as interpreting poetic metaphors [3]. In narratives, analogy maps the functional and causal organization of events from one story onto another (i.e., structural similarity), rather than matching lexical overlap or topical similarity (i.e., surface similarity) [5]. In the paper's example, once events are extracted and abstracted according to their meaning in the corresponding story, it becomes apparent that ST 1 is a disanalogy (as shown by the mismatch between Rejection and Reward) and ST 2 is an analogy to SB (as all abstractions map logically between these two narratives).

The core proposal is to decompose narratives into units and convert those units into abstractions that capture their roles and general meaning, thus facilitating a structural mapping. The mapping targets the functional and causal organization of events across stories (i.e., structural similarity), rather than lexical overlap or topical similarity (i.e., surface similarity) [5]. Closer error analysis reveals remaining challenges in abstracting at the right level and in incorporating implicit causality, along with an emerging categorization of analogical patterns in narratives.

The empirical case is built around experiments showing that abstractions consistently improve model performance, yielding results that are competitive with or better than end-to-end LLM baselines. For the mapping itself, YARN considers all possible pairs of abstracted units across the two stories as candidate local mappings and scores each pair using a similarity function.

The central reported finding concerns YARN's mapping step: YARN considers all possible pairs of abstracted units across the two stories as candidate local mappings and scores each pair using a similarity function.

The paper also makes it clear that, although Large Language Models (LLMs) have recently been reported to exhibit emergent analogy-solving behavior [16], follow-up analyses highlighted substantial limitations, suggesting that these capabilities probably stem from data contamination or surface pattern matching rather than from analogical reasoning abilities reminiscent of those of humans [17, 18]. Intersecting these three challenges is the observation that the relationship between LLMs and structural mapping frameworks, both in terms of complementarity and integration, ha... However, cognitive engines for structural mapping still assume pre-extracted entities and are thus not directly applicable to unstructured data such as narratives. Overall, the paper is most convincing where its proposed method is directly supported by the reported comparisons, but the scope of the claim should still be read in light of the evaluation setup and stated limitations.

Final takeaway

Main takeaway: YARN considers all possible pairs of abstracted units across the two stories as candidate local mappings and scores each pair using a similarity function.
Important caution: Large Language Models (LLMs) have recently been reported to exhibit emergent analogy-solving behavior [16], but follow-up analyses highlighted substantial limitations, suggesting that these capabilities probably stem from data contamination or surface pattern matching rather than from analogical reasoning abilities reminiscent of those of humans [17, 18].

Problem definition

Analogy plays a central role in complex domains and tasks: legal argumentation, business decision-making, causal explanation, creative problem-solving [3], and even non-traditional forms of reasoning, such as interpreting poetic metaphors [3].
In narratives, analogy maps the functional and causal organization of events from one story onto another (i.e., structural similarity), rather than matching lexical overlap or topical similarity (i.e., surface similarity) [5].
In the paper's example, once events are extracted and abstracted according to their meaning in the corresponding story, it becomes apparent that ST 1 is a disanalogy (as shown by the mismatch between Rejection and Reward) and ST 2 is an analogy to SB (as all abstractions map logically between these two narratives).
The inability of the Structure-Mapping Engine (SME) and related engines to operate on unstructured data has been recognized, leading to methods that complement SME by mapping entities using automatically extracted relations [12] or with emotion...

Core idea & method

Analogy plays a central role in complex domains and tasks: legal argumentation, business decision-making, causal explanation, creative problem-solving [3], and even non-traditional forms of reasoning, such as interpreting poetic metaphors [3].
The goal is to map the functional and causal organization of events from one story onto another (i.e., structural similarity), rather than to match lexical overlap or topical similarity (i.e., surface similarity) [5].
The narratives are decomposed into units, which are then converted into abstractions that capture their roles and general meaning, thus facilitating a structural mapping.
The authors define and operationalize four levels of abstraction that capture both the general meaning of units and their roles in the story, grounded in prior work on framing.
The experiments show that abstractions consistently improve model performance, yielding results that are competitive with or better than end-to-end LLM baselines.
Closer error analysis reveals remaining challenges in abstracting at the right level and in incorporating implicit causality, along with an emerging categorization of analogical patterns in narratives.
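
The decompose-then-abstract step described above can be pictured with a small sketch. Everything here is an assumption made for illustration: the digest does not say how units are delimited, so sentence splitting is used; `toy_abstract` is a hypothetical keyword lookup standing in for an LLM call; the sample story is made up; and the four abstraction levels are not modeled because the digest does not name them. Only the labels Rejection and Reward come from the text.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class AbstractedUnit:
    text: str         # the original narrative unit
    abstraction: str  # a short label for the unit's role / general meaning


def decompose(story: str) -> List[str]:
    """Split a story into units.

    Sentence splitting is an assumption of this sketch, not the paper's rule.
    """
    parts = re.split(r"(?<=[.!?])\s+", story.strip())
    return [p for p in parts if p]


def abstract_story(
    story: str, abstract_fn: Callable[[str], str]
) -> List[AbstractedUnit]:
    """Apply a caller-supplied abstraction function to every unit."""
    return [AbstractedUnit(u, abstract_fn(u)) for u in decompose(story)]


def toy_abstract(unit: str) -> str:
    """Hypothetical keyword lookup standing in for an LLM-derived abstraction."""
    lowered = unit.lower()
    if "refus" in lowered:
        return "Rejection"
    if "prize" in lowered:
        return "Reward"
    return "Other"


# Made-up two-sentence story used only to exercise the sketch.
story = "The knight asked for shelter. The innkeeper refused him."
units = abstract_story(story, toy_abstract)
for u in units:
    print(f"{u.abstraction}: {u.text}")
```

Passing the abstraction function in as a parameter keeps the decomposition logic independent of whichever model or prompt produces the labels.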

Actual findings

YARN considers all possible pairs of abstracted units across the two stories as candidate local mappings and scores each pair using a similarity function.
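
As a rough illustration of that pairwise scoring step, the sketch below enumerates every (base unit, target unit) pair and scores it. The digest does not specify the similarity function, so `token_jaccard` is a placeholder assumption, and the function names and the label "Request" are hypothetical. This is not YARN's implementation.

```python
from itertools import product
from typing import Callable, List, Tuple


def token_jaccard(a: str, b: str) -> float:
    """Placeholder similarity (an assumption): Jaccard overlap of lowercase tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def score_candidate_mappings(
    base: List[str],
    target: List[str],
    sim: Callable[[str, str], float] = token_jaccard,
) -> List[Tuple[str, str, float]]:
    """Score every cross-story pair of abstractions as a candidate local mapping."""
    scored = [(b, t, sim(b, t)) for b, t in product(base, target)]
    # Highest-scoring candidates first; ties keep their enumeration order.
    return sorted(scored, key=lambda item: item[2], reverse=True)


# Toy abstractions; Rejection vs. Reward mirrors the mismatch mentioned in the digest.
base = ["Request", "Rejection"]
target = ["Request", "Reward"]
candidates = score_candidate_mappings(base, target)
for b, t, s in candidates:
    print(f"{b} -> {t}: {s:.2f}")
```

With two units per story there are four candidates. Under this placeholder similarity only the Request/Request pair scores above zero, so the Rejection/Reward pair surfaces as a mismatch. An embedding-based similarity could be passed through the `sim` parameter without changing the enumeration.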

How the conclusion was reached

Step 1 - Proposed approach: The narratives are decomposed into units, which are then converted into abstractions that capture their roles and general meaning, thus facilitating a structural mapping.
Step 2 - Evaluation setup or comparison basis: The method is compared against end-to-end LLM baselines; the experiments show that abstractions consistently improve model performance, yielding competitive or better results.
Step 3 - Main reported evidence: YARN considers all possible pairs of abstracted units across the two stories as candidate local mappings and scores each pair using a similarity function.
Step 4 - Claim boundary / limitation: Large Language Models (LLMs) have recently been reported to exhibit emergent analogy-solving behavior [16], but follow-up analyses highlighted substantial limitations, suggesting that these capabilities probably stem from data contamination or surface pattern matching rather than from analogical reasoning abilities reminiscent of those of humans [17, 18].

Experimental setup & results

YARN considers all possible pairs of abstracted units across the two stories as candidate local mappings and scores each pair using a similarity function.

Limitations & risks

The authors identify a gap between cognitive frameworks such as Structure-Mapping Theory (SMT) and LLMs for analogical reasoning in narratives, which they summarize as three challenges.
Large Language Models (LLMs) have recently been reported to exhibit emergent analogy-solving behavior [16], but follow-up analyses highlighted substantial limitations, suggesting that these capabilities probably stem from data contamination or surface pattern matching rather than from analogical reasoning abilities reminiscent of those of humans [17, 18].
Cognitive engines for structural mapping, in turn, still assume pre-extracted entities and are thus not directly applicable to unstructured data such as narratives.
Intersecting these three challenges is the observation that the relationship between LLMs and structural mapping frameworks, both in terms of complementarity and integration, ha...

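Detailed Summary (translated from KO)

Read-like-fullpaper digest

This paper tackles analogical reasoning in narratives. Analogy plays a central role in complex domains and tasks: legal argumentation, business decision-making, causal explanation, creative problem-solving [3], and even non-traditional forms of reasoning, such as interpreting poetic metaphors [3]. In narratives, analogy maps the functional and causal organization of events from one story onto another (i.e., structural similarity), rather than matching lexical overlap or topical similarity (i.e., surface similarity) [5]. Once events are extracted and abstracted according to their meaning in the corresponding story, it becomes apparent that ST 1 is a disanalogy (as shown by the mismatch between Rejection and Reward) and ST 2 is an analogy to SB (as all abstractions map logically between these two narratives). The core proposal is to decompose narratives into units and convert them into abstractions that capture their roles and general meaning, thus facilitating a structural mapping. Closer error analysis reveals remaining challenges in abstracting at the right level and in incorporating implicit causality, along with an emerging categorization of analogical patterns in narratives. The empirical case is built around experiments showing that abstractions consistently improve model performance, yielding results that are competitive with or better than end-to-end LLM baselines. YARN considers all possible pairs of abstracted units across the two stories as candidate local mappings and scores each pair using a similarity function. The paper also makes it clear that, although Large Language Models (LLMs) have recently been reported to exhibit emergent analogy-solving behavior [16], follow-up analyses highlighted substantial limitations, suggesting that these capabilities probably stem from data contamination or surface pattern matching rather than from human-like analogical reasoning [17, 18]. Intersecting these three challenges is the observation that the relationship between LLMs and structural mapping frameworks, both in terms of complementarity and integration, ha... However, cognitive engines for structural mapping still assume pre-extracted entities and are thus not directly applicable to unstructured data such as narratives. Overall, the paper is most convincing where its proposed method is directly supported by the reported comparisons, but the scope of the claim should still be read in light of the evaluation setup and stated limitations.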

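Final takeaway

Main takeaway: YARN considers all possible pairs of abstracted units across the two stories as candidate local mappings and scores each pair using a similarity function.
Important caution: Large Language Models (LLMs) have recently been reported to exhibit emergent analogy-solving behavior [16], but follow-up analyses highlighted substantial limitations, suggesting that these capabilities probably stem from data contamination or surface pattern matching rather than from analogical reasoning abilities reminiscent of those of humans [17, 18].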

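Problem definition

Once events are extracted and abstracted according to their meaning in the corresponding story, it becomes apparent that ST 1 is a disanalogy (as shown by the mismatch between Rejection and Reward) and ST 2 is an analogy to SB (as all abstractions map logically between these two narratives).
Analogy plays a central role in complex domains and tasks: legal argumentation, business decision-making, causal explanation, creative problem-solving [3], and even non-traditional forms of reasoning, such as interpreting poetic metaphors [3].
Analogy maps the functional and causal organization of events from one story onto another (i.e., structural similarity), rather than matching lexical overlap or topical similarity (i.e., surface similarity).
The inability of the Structure-Mapping Engine (SME) and related engines to operate on unstructured data has been recognized, leading to methods that complement SME by mapping entities using automatically extracted relations [12] or with emotion...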

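Core idea & method

Analogy plays a central role in complex domains and tasks: legal argumentation, business decision-making, causal explanation, creative problem-solving [3], and even non-traditional forms of reasoning, such as interpreting poetic metaphors [3].
The goal is to map the functional and causal organization of events from one story onto another (i.e., structural similarity), rather than to match lexical overlap or topical similarity (i.e., surface similarity).
Closer error analysis reveals remaining challenges in abstracting at the right level and in incorporating implicit causality, along with an emerging categorization of analogical patterns in narratives.
The narratives are decomposed into units, which are then converted into abstractions that capture their roles and general meaning, thus facilitating a structural mapping.
The authors define and operationalize four levels of abstraction that capture both the general meaning of units and their roles in the story, grounded in prior work on framing.
The experiments show that abstractions consistently improve model performance, yielding results that are competitive with or better than end-to-end LLM baselines.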

Actual findings

YARN considers all possible pairs of abstracted units across the two stories as candidate local mappings and scores each pair using a similarity function.

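How the conclusion was reached

Step 1 - Proposed approach: The narratives are decomposed into units, which are then converted into abstractions that capture their roles and general meaning, thus facilitating a structural mapping.
Step 2 - Evaluation setup or comparison basis: The method is compared against end-to-end LLM baselines; the experiments show that abstractions consistently improve model performance, yielding competitive or better results.
Step 3 - Main reported evidence: YARN considers all possible pairs of abstracted units across the two stories as candidate local mappings and scores each pair using a similarity function.
Step 4 - Claim boundary / limitation: Large Language Models (LLMs) have recently been reported to exhibit emergent analogy-solving behavior [16], but follow-up analyses highlighted substantial limitations, suggesting that these capabilities probably stem from data contamination or surface pattern matching rather than from analogical reasoning abilities reminiscent of those of humans [17, 18].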

Experimental setup & results

YARN considers all possible pairs of abstracted units across the two stories as candidate local mappings and scores each pair using a similarity function.

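Limitations & risks

Large Language Models (LLMs) have recently been reported to exhibit emergent analogy-solving behavior [16], but follow-up analyses highlighted substantial limitations, suggesting that these capabilities probably stem from data contamination or surface pattern matching rather than from analogical reasoning abilities reminiscent of those of humans [17, 18].
Intersecting these three challenges is the observation that the relationship between LLMs and structural mapping frameworks, both in terms of complementarity and integration, ha...
However, cognitive engines for structural mapping still assume pre-extracted entities and are thus not directly applicable to unstructured data such as narratives.
The authors identify a gap between cognitive frameworks such as Structure-Mapping Theory (SMT) and LLMs for analogical reasoning in narratives, which they summarize as three challenges.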