POSIM — A Multi-Agent Simulation Framework for Social Media Public Opinion Evolution And Governance

Platform	Explicit Cognitive Modeling	Validation (M/P/S)	Real-Case Intervention	LLM Multi-Type Agents	Temporal Precision	Modular Design
S3	✗	✗/✓/✓	✗	✗	★★★	★★★
HiSim	✗	✗/✗/✓	✗	✗	★★	★★
GA-S3	✗	✗/✗/✓	✗	✓	★★★	★★
SPARK	✗	✗/✓/✗	✗	✓	★★	★★
FDE-LLM	✗	✗/✗/✓	✗	✗	★★	★★
TrendSim	✗	✓/✗/✗	✗	✓	★★★★	★★★
OASIS	✗	✗/✓/✓	✗	✗	★★★★	★★★★
LMAgent	✗	✗/✗/✓	✗	✓	★★	★★
POSIM (Ours)	✓	✓/✓/✓	✓	✓	★★★★★	★★★★★

Contributions

Key Contributions

Social-BDI Agent Architecture

Embeds LLMs within a layered cognitive framework (Perception → Belief → Desire → Intention → Action), incorporating emotional arousal and cognitive biases. Three cognitive subsystems are each powered by independent LLM calls, communicating through structured intermediate states. The entire behavioral generation process is fully traceable.

Hybrid Time-Event Driven Environment

Hawkes self-exciting point processes jointly model exogenous event shocks and endogenous user interactions, combined with circadian rhythm modulation, reproducing non-stationary activity patterns at minute-level temporal resolution.

Three-Tier Progressive Validation

Drawing on classical V&V methodology: micro-level behavioral mechanism calibration → macro-level emergent phenomenon verification → statistical result consistency alignment, building simulation credibility layer by layer.

Highly Decoupled Modular Architecture

Agents, simulation environment, and strategy evaluation communicate through standard interfaces — swap the cognitive architecture, change the time engine, or plug in new evaluation metrics without touching other modules.
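One way to read "standard interfaces" is as structural types that each module implements. The sketch below uses Python's `typing.Protocol` to show the idea for the time engine. The names `TimeEngine`, `UniformClock`, and `run` are hypothetical and are not taken from the POSIM codebase.

```python
from typing import Protocol


class TimeEngine(Protocol):
    """Hypothetical interface: how many agents to activate at a given step."""

    def activations(self, step: int) -> int: ...


class UniformClock:
    """Trivial engine that activates a constant number of agents per step."""

    def __init__(self, per_step: int) -> None:
        self.per_step = per_step

    def activations(self, step: int) -> int:
        return self.per_step


def run(engine: TimeEngine, steps: int) -> list[int]:
    """The loop depends only on the interface, so engines are interchangeable."""
    return [engine.activations(s) for s in range(steps)]
```

Under this reading, a Hawkes-driven engine would only need to provide the same `activations` method to replace `UniformClock` without changes to the loop.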

Architecture

Framework Overview

POSIM consists of three core components: (1) Social-BDI Agents, (2) Hawkes process-driven simulation environment, and (3) Strategy evaluation module for counterfactual reasoning.

Figure 1. POSIM framework architecture. Left: Social-BDI agent cognitive pipeline. Upper-center: Hawkes process-driven simulation environment and virtual social media platform. Lower-right: Strategy evaluation module (Intervenor-Simulator-Evaluator).

Hawkes Self-Exciting Point Process Time Engine

The conditional intensity function models collective activity as the superposition of a background rate, exogenous event shocks (high intensity, slow decay), and endogenous user interactions (low intensity, fast decay):

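$$\lambda(t) = \underbrace{\mu}_{\text{background}} + \underbrace{\sum_{t_i < t} \alpha_{ext} e^{-\beta_{ext}(t - t_i)}}_{\text{exogenous}} + \underbrace{\sum_{t_j < t} \alpha_{int} e^{-\beta_{int}(t - t_j)}}_{\text{endogenous}}$$

Here $t_i$ are the times of past exogenous events and $t_j$ the times of past user interactions. "High intensity, slow decay" corresponds to $\alpha_{ext} > \alpha_{int}$ and $\beta_{ext} < \beta_{int}$.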
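The intensity can be evaluated directly from the formula. The following is a minimal sketch, assuming that only events strictly before `t` contribute; the function name and argument layout are illustrative and are not POSIM's actual API.

```python
import math


def hawkes_intensity(t, mu, ext_times, alpha_ext, beta_ext,
                     int_times, alpha_int, beta_int):
    """Evaluate lambda(t) = background + exogenous + endogenous terms."""
    # Exogenous shocks: one exponential kernel per past external event.
    exo = sum(alpha_ext * math.exp(-beta_ext * (t - ti))
              for ti in ext_times if ti < t)
    # Endogenous excitation: one exponential kernel per past user interaction.
    endo = sum(alpha_int * math.exp(-beta_int * (t - tj))
               for tj in int_times if tj < t)
    return mu + exo + endo


# Illustrative parameters only: one external event at t=0, one interaction at t=0.5.
rate = hawkes_intensity(1.0, mu=0.5,
                        ext_times=[0.0], alpha_ext=2.0, beta_ext=0.1,
                        int_times=[0.5], alpha_int=0.3, beta_int=5.0)
```

With these made-up values the external event still dominates at `t = 1.0` because its kernel decays slowly, while the contribution of the interaction has already shrunk.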

Cognitive Architecture

Social-BDI Agent Architecture

Extending the classical BDI architecture with an emotional dimension, building agents with explicit cognitive states and auditable multi-stage decision chains.

Perception → Belief → Desire → Intention → Action

Four-Layer Belief System

Layer	Contents	Stability
B^id (Role Identity)	Gender, location, occupation, followers, verification type	Fixed (personality anchor)
B^psy (Psychological)	Conformity, paranoia, catharsis, curiosity-seeking patterns	Highly stable
B^evt (Event Opinion)	Stance and reasoning on event entities	Dynamically evolving
B^emo (Emotional Arousal)	6D emotion vector: happy, sad, angry, fear, surprise, disgust	Real-time fluctuation
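The four layers map naturally onto a small data structure. The sketch below is one possible representation, not the project's actual classes: `RoleIdentity` and `BeliefState` are hypothetical names, and the value types (floats for traits and emotions, strings for stances) are assumptions.

```python
from dataclasses import dataclass, field

EMOTIONS = ("happy", "sad", "angry", "fear", "surprise", "disgust")


@dataclass(frozen=True)
class RoleIdentity:
    """B^id: fixed for the whole run, hence a frozen dataclass."""
    gender: str
    location: str
    occupation: str
    followers: int
    verification: str


@dataclass
class BeliefState:
    identity: RoleIdentity                    # B^id, personality anchor
    psychology: dict[str, float]              # B^psy, e.g. {"conformity": 0.7}
    # B^evt: entity -> current stance, updated as the event unfolds
    event_opinion: dict[str, str] = field(default_factory=dict)
    # B^emo: six-dimensional arousal vector, one value per emotion
    emotion: dict[str, float] = field(
        default_factory=lambda: {e: 0.0 for e in EMOTIONS})
```

Freezing only the identity layer mirrors the stability ordering described above: identity cannot change, while opinions and emotions are plain mutable fields.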

Three-Level Chain-of-Thought Intention System

L1 — What to do & to whom: Select action type (like / repost / comment / original post) and target

L2 — How to express: Plan across 4 orthogonal dimensions: Emotion × Stance × Style × Narrative

L3 — What to say: Generate role-consistent social media text under L1+L2 constraints
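The three levels can be pictured as a plan that is filled in step by step and then handed to the text-generation call as constraints. This is a hedged sketch: `IntentionPlan`, `l3_constraints`, and the example L2 values are hypothetical and only illustrate how L1 and L2 decisions could constrain L3.

```python
from dataclasses import dataclass
from typing import Optional

ACTION_TYPES = ("like", "repost", "comment", "original_post")


@dataclass
class IntentionPlan:
    # L1: what to do and to whom
    action: str
    target: Optional[str]
    # L2: how to express it, along the four dimensions
    emotion: str
    stance: str
    style: str
    narrative: str

    def __post_init__(self) -> None:
        # Reject action types outside the four listed for L1.
        if self.action not in ACTION_TYPES:
            raise ValueError(f"unknown action type: {self.action!r}")


def l3_constraints(plan: IntentionPlan) -> str:
    """Render the L1 and L2 decisions as constraints for the L3 generation call."""
    return (
        f"Action: {plan.action} (target: {plan.target or 'none'})\n"
        f"Emotion: {plan.emotion}; Stance: {plan.stance}; "
        f"Style: {plan.style}; Narrative: {plan.narrative}"
    )
```

Keeping the plan as structured data, rather than free text, is what makes the decision chain inspectable after the fact.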

Four Heterogeneous Agent Types

Agent Type	Role	Behavioral Profile
Citizen	Primary opinion participants	Colloquial, fragmented, emotion-driven. Impulsive expression under high arousal.
KOL	Key intermediary in two-step flow	Independent views, agenda-setting. Significant influence on downstream belief updates.
Media	Information collection & dissemination	Formal, restrained, timely. Information confirmation at critical junctures.
Government	Official stance & public governance	Low frequency, high authority. Post-fermentation statements with turning-point impact.

Datasets

Experimental Datasets

Three representative public opinion events from Sina Weibo spanning social controversy, campus incidents, and food safety. Simulation precision: 10 min/step.

Social Controversy

Luxury Earring (LE)

An actress's earrings identified as ¥2.3M luxury goods, sparking intense public debate on celebrity extravagance.

1,530 Users · 34,218 Posts · ~46h Duration · 276 Sim. Steps

Campus Incident

WHU Library (WL)

Harassment allegation dispute at Wuhan University; court ruling reignited large-scale debate on justice and campus safety.

1,843 Users · 51,647 Posts · ~190h Duration · 1,140 Sim. Steps

Food Safety

Xibei Prepared Food (XF)

Internet celebrity publicly accused a restaurant chain of extensive prepared food use, sparking food safety concerns.

1,987 Users · 14,892 Posts · ~71h Duration · 426 Sim. Steps
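The step counts follow directly from the event durations and the 10 min/step resolution. A short check, using the approximate durations listed for each dataset:

```python
STEP_MINUTES = 10  # simulation precision stated above

# Approximate event durations in hours, as listed for each dataset.
durations_h = {"LE": 46, "WL": 190, "XF": 71}


def sim_steps(hours: int, step_minutes: int = STEP_MINUTES) -> int:
    """Number of simulation steps needed to cover `hours` of real time."""
    return hours * 60 // step_minutes


steps = {name: sim_steps(h) for name, h in durations_h.items()}
print(steps)  # {'LE': 276, 'WL': 1140, 'XF': 426}
```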

Experiments

Results & Analysis

Overall Performance Improvement

POSIM's behavioral, content, and topological metrics outperform the best baseline across three real-world Weibo datasets:

Behavior Layer: +5.0%
Content Layer: +13.0%
Topology Layer: +8.5%
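The three headline figures are consistent with the mean absolute gain, in percentage points, of POSIM's layer average over the strongest baseline on each dataset. That reading is inferred from the Beh.Avg, Cont.Avg, and Topo.Avg columns of the statistical calibration table and is not stated explicitly by the source. The values below are copied from that table in LE, WL, XF order.

```python
# Layer averages for POSIM on LE, WL, XF.
posim = {
    "behavior": [0.821, 0.853, 0.804],
    "content": [0.910, 0.876, 0.926],
    "topology": [0.896, 0.858, 0.698],
}
# Highest layer average among Rule ABM, w/ LLM, and w/ CoT on each dataset.
best_baseline = {
    "behavior": [0.783, 0.800, 0.746],
    "content": [0.774, 0.673, 0.875],
    "topology": [0.763, 0.784, 0.650],
}


def mean_gain_points(ours, base):
    """Average absolute difference across datasets, in percentage points."""
    return 100 * sum(o - b for o, b in zip(ours, base)) / len(ours)


gains = {k: round(mean_gain_points(posim[k], best_baseline[k]), 1) for k in posim}
print(gains)  # {'behavior': 5.0, 'content': 13.0, 'topology': 8.5}
```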

Three-Tier Validation Framework

Micro-Level Mechanism Calibration

Cognitive-behavior chain consistency (0–5), personality stability (0–1), decision robustness (0–1)

Macro-Level Emergence Verification

Opinion lifecycle, multi-agent heterogeneity, emotional polarization, scale-free topology & cascade power-law

Statistical Consistency Alignment

9 quantitative metrics across behavior (3), content (3), and topology (3) layers

Tier 1 · Micro-Level Behavioral Mechanism Validation

500 randomly sampled users, 12 simulation rounds, four methods under identical conditions.

Method	Cognitive-Behavior Chain (0–5) ↑	Personality Stability (0–1) ↑	Decision Robustness (0–1) ↑
Direct-Nothink	1.47 ± 0.50	0.478 ± 0.263	0.629 ± 0.240
Direct-Think	1.75 ± 0.43	0.448 ± 0.269	0.603 ± 0.299
CoT	3.09 ± 0.29	0.516 ± 0.272	0.541 ± 0.356
Social-BDI (Ours)	4.64 ± 0.48	0.661 ± 0.215	0.695 ± 0.213

Key Finding: CoT's decision robustness is actually the lowest (0.541) — without stable state anchoring, input perturbations ripple through the entire reasoning chain. Social-BDI's explicit belief states provide a cognitive anchoring effect, maintaining decision stability.

Tier 2 · Macro-Level Emergent Phenomena

All macro phenomena emerged spontaneously from agent interactions — none were pre-programmed.

**Figure 2. Opinion Lifecycle** Multi-stage lifecycle from outbreak → plateau → resurgence → decline. E₁–E₇ mark exogenous event injection points. Cumulative S-curve closely matches diffusion theory.

**Figure 3. Multi-Agent Behavioral Heterogeneity** (a) Emotional intensity over time; (b) Content length distributions; (c) Multi-dimensional behavioral radar charts.

**Figure 4. Emotional Polarization** PI rises from 0.41 to 0.67 (63% increase, p < 0.001), consistent with echo chamber theory. Escalation/de-escalation ratio: 4.78 (ratchet effect).

**Figure 5. Scale-Free Topology & Cascade Power-Law** Degree distribution: γ = 1.87 (R² = 0.880); cascade CCDF: α = 3.70, reproducing the "most go unnoticed, few go viral" long-tail phenomenon.
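For readers who want to check a power-law exponent on their own simulation output, the sketch below uses the standard continuous maximum-likelihood estimator, $\hat{\alpha} = 1 + n / \sum_i \ln(x_i / x_{min})$. This is a swapped-in technique: the reported R² suggests the source used a regression fit, and the fitting procedure is not described here. The estimator returns the exponent of the density; the slope of the corresponding CCDF is one less, and which convention the reported values follow is not stated. Function names are illustrative.

```python
import math
import random


def powerlaw_alpha_mle(values, x_min):
    """Continuous MLE for the density exponent of a power-law tail."""
    tail = [x for x in values if x >= x_min]
    return 1.0 + len(tail) / sum(math.log(x / x_min) for x in tail)


def sample_powerlaw(alpha, x_min, n, seed=0):
    """Draw n samples with density exponent alpha via inverse-transform sampling."""
    rng = random.Random(seed)
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]


# Synthetic check: the estimate should land close to the true exponent.
samples = sample_powerlaw(alpha=2.5, x_min=1.0, n=20000)
estimate = powerlaw_alpha_mle(samples, x_min=1.0)
```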

Tier 3 · Statistical Consistency Calibration

Statistical Calibration Results

Data	Method	JSD ↓	Act.ρ ↑	RMSE ↓	Beh.Avg ↑	Confr. ↑	|ΔTTR| ↓	|ΔS̄| ↓	Cont.Avg ↑	Net. ↑	Casc. ↑	PL ↑	Topo.Avg ↑
LE	Rule ABM	0.427	0.808	0.158	0.741	—	—	—	—	0.479	0.633	0.543	0.552
	w/ LLM	0.289	0.799	0.162	0.783	0.544	0.185	0.319	0.680	0.565	0.735	0.918	0.739
	w/ CoT	0.394	0.806	0.151	0.754	0.584	0.141	0.123	0.774	0.755	0.777	0.756	0.763
	POSIM	0.193	0.809	0.154	0.821	0.790	0.030	0.029	0.910	0.895	0.830	0.961	0.896
WL	Rule ABM	0.318	0.681	0.126	0.746	—	—	—	—	0.869	0.696	0.642	0.736
	w/ LLM	0.237	0.722	0.119	0.789	0.453	0.172	0.360	0.640	0.528	0.667	0.580	0.592
	w/ CoT	0.229	0.744	0.116	0.800	0.474	0.091	0.365	0.673	0.941	0.740	0.673	0.784
	POSIM	0.073	0.750	0.118	0.853	0.841	0.010	0.203	0.876	0.850	0.758	0.965	0.858
XF	Rule ABM	0.312	0.664	0.187	0.721	—	—	—	—	0.528	0.614	0.279	0.474
	w/ LLM	0.244	0.671	0.190	0.746	0.765	0.117	0.076	0.858	0.774	0.695	0.482	0.650
	w/ CoT	0.293	0.699	0.181	0.742	0.774	0.134	0.014	0.875	0.767	0.696	0.460	0.641
	POSIM	0.148	0.727	0.168	0.804	0.843	0.019	0.046	0.926	0.885	0.696	0.513	0.698
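The three Avg columns appear to be simple means of their layer's metrics, with "lower is better" metrics flipped to 1 − x first. This rule is inferred from the table values rather than stated by the source. It reproduces Beh.Avg and Cont.Avg for the rows checked, and Topo.Avg to within 0.001, presumably because the published components are already rounded.

```python
def behavior_avg(jsd, rho, rmse):
    # JSD and RMSE are "lower is better", so each is flipped to 1 - x.
    return ((1 - jsd) + rho + (1 - rmse)) / 3


def content_avg(confr, d_ttr, d_s):
    # |dTTR| and |dS| are gaps to the real data, so each is flipped as well.
    return (confr + (1 - d_ttr) + (1 - d_s)) / 3


def topology_avg(net, casc, pl):
    # All three topology scores are already "higher is better".
    return (net + casc + pl) / 3


# POSIM row of the LE dataset.
le_behavior = behavior_avg(0.193, 0.809, 0.154)
le_content = content_avg(0.790, 0.030, 0.029)
le_topology = topology_avg(0.895, 0.830, 0.961)
print(f"{le_behavior:.3f} {le_content:.3f} {le_topology:.3f}")
```

The table lists 0.821, 0.910, and 0.896 for the same row.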

Ablation Study

Ablation study on the LE dataset to verify the necessity of each module.

Configuration	JSD ↓	ρ ↑	RMSE ↓	Confr. ↑	|ΔTTR| ↓	|ΔS̄| ↓	Net. ↑	Casc. ↑
Full POSIM	0.193	0.809	0.154	0.790	0.030	0.029	0.895	0.830
w/o Belief	0.258	0.762	0.172	0.706	0.058	0.067	0.861	0.773
w/o Desire	0.267	0.779	0.169	0.682	0.071	0.083	0.853	0.788
w/o Intention	0.237	0.802	0.159	0.728	0.064	0.055	0.858	0.814
w/o Hawkes	0.177	0.235	0.362	0.787	0.207	0.028	0.822	0.754

Ablation Insights

Each component has a clear functional division:

w/o Hawkes: Act.ρ plummets from 0.809 to 0.235; uniform activation destroys temporal dynamics
w/o Belief: Confrontation similarity drops from 0.790 to 0.706; agents lose deep understanding
w/o Desire: Content layer degrades most severely among the cognitive ablations (Confr. falls to 0.682, the lowest); motivation is the core driver
w/o Intention: The lexical diversity gap roughly doubles (|ΔTTR| 0.030 → 0.064), and network and cascade scores fall slightly; the three-level CoT supports diversity and topology

Application

Case Studies

Demonstrating POSIM as a computational experiment platform for cognitive priming and counterfactual strategy evaluation.

Case 1 · Cognitive Priming Experiment

200 agents, 30 simulation steps. Two cognitive priming strategies applied at varying coverage rates (20%–100%).

Rational Cognition (RC): Negative emotion drops from 0.844 → 0.571 (−32.3%). Effect monotonically increases with coverage.

Empathy Priming (EP): Counterintuitive empathy paradox — negative emotion increases (0.878 vs 0.844). Deep understanding of others' suffering amplifies rather than mitigates negative sentiment.

Coverage crossing 60% shows a clear threshold effect — below this point, priming barely propagates through social networks.

Empathy Paradox · Threshold Effect · Non-linear Diffusion

Case 2 · Counterfactual Strategy Evaluation

Five PR strategies compared under identical external events on the Luxury-Earring dataset. The Intervenor-Simulator-Evaluator pipeline enables “what-if” analysis without real-world deployment.

Strategy	Neg. Emotion ↓	Anger ↓	Intensity ↓
Actual Response	0.792	0.791	0.685
Swift Apology	0.749	0.749	0.612
Proactive Transparency	0.773	0.773	0.645
Consumer Dialogue	0.744	0.743	0.598
Strategic Silence	0.831	0.831	0.702

Consumer Dialogue achieves the best results on all three metrics (−6.1% negative emotion vs. the actual response). Strategic Silence is the worst: inaction raises negative emotion by 4.9% and anger by 5.1% relative to the actual response.
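The relative changes quoted for the strategy table can be reproduced from its rows. The helper name `pct_change` is illustrative.

```python
# (negative emotion, anger, intensity) per strategy, copied from the table.
strategies = {
    "Actual Response": (0.792, 0.791, 0.685),
    "Swift Apology": (0.749, 0.749, 0.612),
    "Proactive Transparency": (0.773, 0.773, 0.645),
    "Consumer Dialogue": (0.744, 0.743, 0.598),
    "Strategic Silence": (0.831, 0.831, 0.702),
}


def pct_change(value, baseline):
    """Relative change in percent with respect to the baseline value."""
    return 100 * (value - baseline) / baseline


base_neg = strategies["Actual Response"][0]
neg_change = {name: round(pct_change(vals[0], base_neg), 1)
              for name, vals in strategies.items()}
best = min(strategies, key=lambda name: strategies[name][0])
print(best, neg_change[best])  # Consumer Dialogue -6.1
```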

All strategies exhibit an immediate cooling & gradual rebound pattern: intervention triggers initial sentiment relief, but public opinion naturally restores toward baseline as new events unfold.

Immediate Cooling · Gradual Rebound · What-If Analysis

Chart: Negative Emotion by Strategy. Actual Response 0.792, Swift Apology 0.749, Proactive Transparency 0.773, Consumer Dialogue 0.744 (best strategy), Strategic Silence 0.831. Lower negative emotion indicates better PR effectiveness.

Architecture

Project Structure

posim/
├── posim/                          # Core framework
│   ├── agents/                     # Agent module
│   │   ├── base_agent.py           # Base class (cognitive pipeline)
│   │   ├── citizen_agent.py        # Citizen agent
│   │   ├── kol_agent.py            # KOL agent
│   │   ├── media_agent.py          # Media agent
│   │   ├── government_agent.py     # Government agent
│   │   └── ebdi/                   # Social-BDI cognitive architecture
│   │       ├── belief/             # Belief subsystem
│   │       ├── desire/             # Desire subsystem
│   │       ├── intention/          # Intention subsystem (3-level CoT)
│   │       └── memory/             # Streaming memory store
│   ├── engine/                     # Simulation engine
│   │   ├── simulator.py            # Main loop (async concurrent)
│   │   ├── hawkes_process.py       # Hawkes self-exciting process
│   │   └── time_engine.py          # Time engine (circadian)
│   ├── environment/                # Virtual social media platform
│   │   ├── recommendation.py       # Content recommendation
│   │   ├── social_network.py       # Three-layer social network
│   │   ├── hot_search.py           # Trending topics
│   │   └── event_queue.py          # External event queue
│   ├── evaluation/                 # Evaluation framework
│   ├── llm/                        # LLM resource management
│   ├── prompts/                    # Prompt templates
│   └── config/                     # Configuration
├── scripts/                        # Simulation & evaluation scripts
├── data/                           # Datasets
└── requirements.txt

Citation

Under Development