The CnOpenData A-Share Listed Company Stock Forum Text Database is sourced from East Money Net's Guba (股吧), China's most influential online community for stock investors. It systematically compiles investor discussions from 1992 through the most recent update. The database covers posts, replies, user information, and social relationships, and documents the sentiment swings, information dissemination, and interactive behavior of Chinese retail investors. It serves as an authoritative foundational data resource for observing A-share market sentiment and for text analysis and behavioral finance research.
Data Uniqueness: Multi-Version Strategy for Comprehensive Research Scenarios
- Historical Snapshot Editions (2022 Edition, 2023 Edition) – Preserving Early Data: These editions permanently preserve the complete Guba data, from 1992 to the collection date, as captured under the website's earlier structure. For instance, the 2022 Edition contains significantly more posts (e.g., 15.3 million in 2009 and 49.77 million in 2015) than later versions can recover retrospectively, which makes it an irreplaceable resource for studying the early market ecosystem.
- Annual Rolling Update Edition – Providing the Latest and Most Complete Current Data: While limited in retrospective coverage of early historical data, this edition undergoes annual rolling updates, offering continuous and complete data streams from the starting year to the present. It is the preferred choice for recent event analysis and in-depth high-frequency research.
- Historical Full Merge Edition – The Most Complete and Longest-Spanning Ultimate Data Solution: Vertically integrates historical snapshot editions and rolling updates through primary key deduplication, forming a single dataset with "the longest time span and most comprehensive total data volume." The merged dataset includes 778 million posts and 1.52 billion replies, achieving seamless coverage from 1992 to the present.
- API Real-Time Update Edition – Capturing Market Pulse for Instantaneous Decision-Making and Research: Provides daily incremental data, enabling researchers to trace minute-by-minute and hour-by-hour discussion and sentiment reactions to breaking news, financial reports, and policies. This edition suits time-sensitive uses such as validating high-frequency trading strategies and monitoring public opinion in real time.
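The primary key deduplication behind the Historical Full Merge Edition can be illustrated with a minimal Python sketch. The field name `post_id`, the sample records, and the rule that a later edition overrides an earlier one are all assumptions made for illustration; the database's actual key and conflict rule are not stated here.

```python
def merge_editions(*editions, key="post_id"):
    """Union several editions, keeping one record per primary key.

    Editions are passed from lowest to highest priority: when the same
    key appears in more than one edition, the later edition's record wins.
    (Both the key name and this priority rule are illustrative assumptions.)
    """
    merged = {}
    for edition in editions:
        for record in edition:
            merged[record[key]] = record
    # Return records in key order so the result is deterministic.
    return sorted(merged.values(), key=lambda r: r[key])


# Hypothetical sample records standing in for two editions.
snapshot_2022 = [
    {"post_id": 1, "year": 2009, "title": "early post"},
    {"post_id": 2, "year": 2015, "title": "bull market post"},
]
rolling_update = [
    {"post_id": 2, "year": 2015, "title": "bull market post (edited)"},
    {"post_id": 3, "year": 2024, "title": "recent post"},
]

merged = merge_editions(snapshot_2022, rolling_update)
print([r["post_id"] for r in merged])
```

Post 1 survives only because the snapshot edition kept it, and post 2 appears once rather than twice, which is the behavior the merged edition relies on.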
Data Completeness: Extended Time Series and Massive Scale for Robust Research
- Extremely Long Time Span: All versions contain records starting from 1992, continuously covering nearly all key cycles and major events in China's stock market development up to the latest date. This enables multi-decadal trend comparisons, cycle analyses, and policy effect evaluations.
- Massive Data Volume: Cumulatively integrates approximately 780 million posts and over 1.52 billion replies from more than 26 million users, forming one of the largest Chinese-language financial social media text databases globally. This scale supports complex machine learning model training and large-sample statistical analysis.
- Continuous Annual Sequence: From 2006 onwards, every year contains at least hundreds of thousands of posts and replies, with no gaps, ensuring continuity for time-series analysis.
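Before running a time-series analysis, the continuity of the annual sequence can be checked directly on whichever edition is in use. The sketch below is one simple way to do that; the input is assumed to be a plain list of post years, and the sample values are invented.

```python
from collections import Counter


def missing_years(years, start, end):
    """Return every year in [start, end] that has no records at all."""
    counts = Counter(years)
    return [y for y in range(start, end + 1) if counts[y] == 0]


# Hypothetical extract: the year of each post in a small sample.
post_years = [2006, 2006, 2007, 2009, 2010]
gaps = missing_years(post_years, 2006, 2010)
print(gaps)
```

An empty result means the sample has at least one record in every year of the range.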
Potential Application Scenarios:
- Investor Sentiment and Market Volatility Research: Post and reply content directly reflects fluctuations in retail investor sentiment. Researchers can use it to construct investor sentiment indices and analyze their dynamic relationship with stock prices and trading volumes, providing empirical evidence for behavioral finance.
- Information Diffusion and Market Efficiency Analysis: As a "fermentation pool" for market rumors and policy interpretations, Guba enables research on information diffusion efficiency, public opinion influence, and impact on pricing processes through text propagation path and network structure analysis.
- Corporate Governance and Event Studies: For significant corporate events such as financial report releases, M&A, and executive changes, the popularity and sentiment of Guba discussions can be analyzed alongside market reactions, offering references for corporate information disclosure and investor relations management.
- User Behavior and Social Network Research: Combining user profiles, follower relationships, and interaction data enables in-depth exploration of investor community structures, opinion leader influence, and group decision-making behaviors, extending sociology and communication studies to financial contexts.
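As an illustration of the first scenario, a daily investor sentiment index could be built from post text roughly as follows. This is a toy sketch, not the method of any cited study: the two keyword lists are a made-up lexicon, the input format (date string plus post text) is assumed, and real research would normally use a trained classifier or a much larger dictionary.

```python
from collections import defaultdict

# Hypothetical toy lexicon; a real study would use a far richer one.
BULLISH = {"涨", "买入", "利好"}  # rise, buy, good news
BEARISH = {"跌", "卖出", "利空"}  # fall, sell, bad news


def classify(text):
    """Return +1 (bullish), -1 (bearish), or 0 (neutral) by keyword counts."""
    bull = sum(text.count(word) for word in BULLISH)
    bear = sum(text.count(word) for word in BEARISH)
    return (bull > bear) - (bull < bear)


def daily_sentiment(posts):
    """posts: iterable of (date, text) pairs.

    Returns {date: (bullish - bearish) / (bullish + bearish)}, a value in
    [-1, 1]. Dates with no bullish or bearish posts are omitted.
    """
    tallies = defaultdict(lambda: [0, 0])  # date -> [bullish, bearish]
    for date, text in posts:
        label = classify(text)
        if label > 0:
            tallies[date][0] += 1
        elif label < 0:
            tallies[date][1] += 1
    return {d: (b - s) / (b + s) for d, (b, s) in tallies.items()}


# Invented sample posts.
posts = [
    ("2015-06-12", "利好不断，继续买入"),
    ("2015-06-12", "明天还会涨"),
    ("2015-06-15", "大跌，赶紧卖出"),
    ("2015-06-15", "利空来了"),
    ("2015-06-15", "今天涨停"),
]
index = daily_sentiment(posts)
print(index)
```

The resulting daily series can then be aligned with returns or trading volume for the same stock and date.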
The CnOpenData A-Share Guba Text Database extends the depth and breadth of financial text data resources through its long time span, massive data volume, and carefully designed multi-version system. Whether for historical retrospection, current insights, or future predictions, it gives researchers comprehensive and reliable data support, making it a core resource for exploring behavioral logic in China's capital markets.
Time Period
Up to 2025 (real-time updates)
Field Display
Sample Data
- A-Share Listed Company Guba Post Details Table
- A-Share Listed Company Guba Reply Details Table
- A-Share Listed Company Guba User Details Table
- A-Share Listed Company Guba User-Followers Table
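The User-Followers Table listed above is the natural input for a follower graph. As a rough sketch of how opinion leaders might be identified, the code below ranks users by follower count (in-degree). The pair layout (follower ID, followed ID) and the user IDs are hypothetical, since the table's actual fields are not shown here.

```python
from collections import Counter


def top_followed(edges, n=3):
    """Rank users by number of distinct followers.

    edges: iterable of (follower_id, followed_id) pairs; this column
    layout is an assumption, not the table's documented schema.
    """
    unique_edges = set(edges)  # drop duplicate follow records
    in_degree = Counter(followed for _, followed in unique_edges)
    # Sort by follower count (descending), then by user ID for stable ties.
    return sorted(in_degree.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Invented sample edges, including one duplicate.
edges = [
    ("u1", "u9"), ("u2", "u9"), ("u3", "u9"),
    ("u1", "u7"), ("u2", "u7"),
    ("u1", "u9"),
]
ranking = top_followed(edges, n=2)
print(ranking)
```

Follower count is only the simplest influence measure; the same edge list could also feed a graph library for centrality or community analysis.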
References
- 姜富伟、李梦如、孟令超,2024:《金融稳定沟通与银行系统性风险》,《世界经济》第10期。
- 郑建东、吕晓亮、吕斌、郭峰,2022:《社交媒体平台信息交互与资本市场定价效率——基于股吧论坛亿级大数据的证据》,《数量经济技术经济研究》第11期。
- 尹必超、孔东民、季绵绵,2022:《散户积极主义提高上市公司审计质量吗》,《会计研究》第10期。
- 范小云、王业东、王道平、郭文璇、胡煊翊,2022:《不同来源金融文本信息含量的异质性分析——基于混合式文本情绪测度方法》,《管理世界》第10期。
- 朱孟楠、梁裕珩、吴增明,2020:《互联网信息交互网络与股价崩盘风险:舆论监督还是非理性传染》,《中国工业经济》第10期。
- 孙鲲鹏、王丹、肖星,2020:《互联网信息环境整治与社交媒体的公司治理作用》,《管理世界》第7期。
- 王丹、孙鲲鹏、高皓,2020:《社交媒体上“用嘴投票”对管理层自愿性业绩预告的影响》,《金融研究》第11期。
- 部慧、解峥、李佳鸿、吴俊杰,2018:《基于股评的投资者情绪对股票市场的影响》,《管理科学学报》第4期。
- Yuqin Huang, Feng Li, Tong Li, Tse-Chun Lin, 2024, “Local Information Advantage and Stock Returns: Evidence from Social Media”, Contemporary Accounting Research.
- Chang, Yen-Cheng and Hong, Harrison G. and Tiedens, Larissa and Wang, Na and Zhao, Bin, 2015, “Does Diversity Lead to Diverse Opinions? Evidence from Languages and Stock Markets”, Rock Center for Corporate Governance at Stanford University Working Paper No. 168, Stanford University Graduate School of Business Research Paper No. 13-16.
- Sheridan Titman, Chishen Wei, Bin Zhao, 2021, “Corporate actions and the manipulation of retail investors in China: An analysis of stock splits”, Journal of Financial Economics.
Data Update Frequency
Annual Updates