News Public Opinion Data

This database systematically collects and organizes public news text data from 46 mainstream Chinese financial news websites, covering key fields such as site Chinese name, publication time, section name, primary title, title, subtitle, author, images, and main text. The data is updated in real time, with a cumulative volume exceeding 130 million records by the end of 2025. It gives a comprehensive and timely picture of how online financial information spreads and how its content evolves, providing a large-scale, structured, and timely data foundation for empirical research and applied analysis based on financial texts.
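As a rough illustration of the record structure described above, the listed fields could be modeled as follows. This is a minimal sketch: the English field names (`site_name`, `published_at`, and so on), the timestamp format, and the `parse_record` helper are assumptions made for illustration and are not the database's actual column names or delivery format.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class NewsRecord:
    """One news article; field names are hypothetical and mirror the fields listed above."""
    site_name: str                        # site Chinese name
    published_at: datetime                # publication time
    title: str                            # title
    body: str                             # main text
    section: Optional[str] = None         # section name
    primary_title: Optional[str] = None   # primary title
    subtitle: Optional[str] = None        # subtitle
    author: Optional[str] = None          # author
    images: List[str] = field(default_factory=list)  # image references


def parse_record(row: dict) -> NewsRecord:
    """Build a NewsRecord from a raw row keyed by the hypothetical field names.

    Assumes timestamps look like "2025-01-06 09:30:00"; adjust the format
    string to whatever the delivered data actually uses.
    """
    return NewsRecord(
        site_name=row["site_name"],
        published_at=datetime.strptime(row["published_at"], "%Y-%m-%d %H:%M:%S"),
        title=row["title"],
        body=row.get("body", ""),
        section=row.get("section"),
        primary_title=row.get("primary_title"),
        subtitle=row.get("subtitle"),
        author=row.get("author"),
        images=list(row.get("images") or []),
    )


# Placeholder row for illustration only; not real data from the database.
example = parse_record({
    "site_name": "Example Site",
    "published_at": "2025-01-06 09:30:00",
    "title": "Example headline",
    "body": "Example main text.",
})
```

Optional fields default to `None` (or an empty list for images) because not every article can be expected to carry a subtitle, author, or images.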

Key Features:

  • Coverage of mainstream financial platforms, representing market focus: Data sources include core financial websites with extensive influence among investors and markets, such as East Money (东方财富网), Sina Finance (新浪财经), Hexun (和讯网), and CLS (财联社), effectively capturing the core context of Chinese financial public opinion and market information.
  • Strong real-time capability, supporting immediate response analysis to market dynamics: Compared to electronic newspapers, financial website information is released more rapidly. This database maintains synchronized updates with source sites, enabling research on the immediate dissemination paths of market news, emergencies, and policy releases, as well as their short-term impacts on financial markets.
  • Large-scale, theme-focused data suitable for in-depth mining and modeling: With over 130 million records concentrated in the financial vertical domain, it provides high-quality, large-scale training corpora for developing domain-specific text analysis models (e.g., sentiment analysis, event extraction, topic classification).

Potential Application Scenarios:

  • Financial market microstructure research: Leveraging high-frequency news release data to precisely analyze correlations between news popularity, sentiment tendencies, and price fluctuations/trading volume changes of assets (e.g., stocks, bonds, futures), particularly applicable for event studies and high-frequency data analysis.
  • Financial public opinion monitoring and dissemination analysis: Tracking the same event across different financial websites (e.g., East Money, The Paper, Jiemian) to analyze variations in headline phrasing, publication timing, and content emphasis, thereby examining financial information dissemination networks, public opinion formation processes, and media agenda-setting.
  • Quantitative investment and information factor construction: The extensive text corpus facilitates the development of quantitative factors based on news sentiment, topic popularity, or analyst perspectives, providing a data foundation for algorithmic trading and investment strategy development.
  • Financial text processing technology development and validation: This large-scale, domain-specific, and structurally clear dataset serves as an ideal experimental resource for developing and evaluating natural language processing (NLP) tasks in finance (e.g., financial entity recognition, automatic summarization, relation extraction).
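For the dissemination-analysis scenario above, one simple starting point is to measure how long each site takes to pick up an event after its first report. The sketch below assumes articles have already been grouped by a hypothetical `event_id` (for example via title matching or clustering, which is not shown and is not something the database is stated to provide); the site names and timestamps are toy placeholders.

```python
from collections import defaultdict
from datetime import datetime

# Toy records: (event_id, site, publication time). The event_id grouping is
# an assumption; in practice it would come from a separate matching step.
records = [
    ("evt-1", "Site A", datetime(2025, 1, 6, 9, 30)),
    ("evt-1", "Site B", datetime(2025, 1, 6, 9, 42)),
    ("evt-1", "Site C", datetime(2025, 1, 6, 10, 5)),
    ("evt-2", "Site B", datetime(2025, 1, 6, 14, 0)),
    ("evt-2", "Site A", datetime(2025, 1, 6, 14, 20)),
]


def publication_lags(rows):
    """Return {event_id: {site: minutes after the first report of that event}}."""
    first_seen = defaultdict(dict)
    for event_id, site, ts in rows:
        # Keep only each site's earliest article on the event.
        prev = first_seen[event_id].get(site)
        if prev is None or ts < prev:
            first_seen[event_id][site] = ts

    lags = {}
    for event_id, by_site in first_seen.items():
        origin = min(by_site.values())  # time of the first report on any site
        lags[event_id] = {
            site: (ts - origin).total_seconds() / 60.0
            for site, ts in by_site.items()
        }
    return lags


lags = publication_lags(records)
for event_id, by_site in lags.items():
    for site, minutes in sorted(by_site.items(), key=lambda kv: kv[1]):
        print(f"{event_id}  {site}: +{minutes:.0f} min")
```

The same per-event lags can then be aggregated by site to see which outlets tend to lead or follow, or joined with intraday price data for event-study designs.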

The CnOpenData Chinese Financial News Text Database is continuously compiled from publicly available online sources. With its massive scale, real-time updates, vertical domain coverage, and comprehensive structured information, it provides a robust financial text data infrastructure for academic research, industry analysis, policy evaluation, and technological innovation.


Time Range


Field Display


Sample Data


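Related Literature

  • Jiang Fuwei, Liu Yumin, and Meng Lingchao, 2024: "Large Language Models, Text Sentiment, and Financial Markets" (《大语言模型、文本情绪与金融市场》), Management World (《管理世界》), No. 8.
  • Fan Xiaoyun, Wang Yedong, Wang Daoping, et al., 2022: "Heterogeneity in the Information Content of Financial Texts from Different Sources: Based on a Hybrid Text Sentiment Measurement Method" (《不同来源金融文本信息含量的异质性分析——基于混合式文本情绪测度方法》), Management World (《管理世界》), No. 10.
  • Xu Xuechen and Tian Kan, 2021: "A New Method for Stock Index Prediction Based on Financial Text Sentiment Analysis" (《一种基于金融文本情感分析的股票指数预测新方法》), Journal of Quantitative & Technological Economics (《数量经济技术经济研究》), No. 12.
  • Zhang Zongxin and Wu Zhaoying, 2021: "Media Sentiment Contagion and Analyst Optimism Bias: Empirical Evidence Based on Machine Learning Text Analysis Methods" (《媒体情绪传染与分析师乐观偏差——基于机器学习文本分析方法的经验证据》), Management World (《管理世界》), No. 1.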

Data Update Frequency

Updated in real time