This database collects publicly available news text from the electronic editions of 35 major domestic financial and general-interest newspapers. Its key fields include the source name in Chinese, publication time, section name, primary headline, headline, sub-headline, author, images, and main text, giving a comprehensively structured view of news content. The data are updated continuously in real time, with more than 14.51 million records accumulated by the end of 2025, offering a large-scale, sustained textual resource for observing Chinese financial public opinion, market information dissemination, and media trends.
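As a rough illustration of how one record might be represented in code, the sketch below maps the fields listed above onto a Python dataclass. The attribute names, the choice of required versus optional fields, and the timestamp format are all assumptions made for this example; they are not the database's official column names or formats.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class NewsRecord:
    """One news article. Attribute names are hypothetical, not official column names."""
    site_name_cn: str                        # source (newspaper) name in Chinese
    published_at: datetime                   # publication time
    section: str                             # section name
    headline: str                            # headline
    body: str                                # main text
    primary_headline: Optional[str] = None   # primary headline, assumed optional
    sub_headline: Optional[str] = None       # sub-headline, assumed optional
    author: Optional[str] = None             # author, assumed optional
    images: List[str] = field(default_factory=list)  # image references


def parse_record(row: dict) -> NewsRecord:
    """Build a NewsRecord from a dict keyed by the hypothetical names above.

    Assumes the timestamp is an ISO-style string such as "2025-01-02 09:30:00".
    """
    return NewsRecord(
        site_name_cn=row["site_name_cn"],
        published_at=datetime.fromisoformat(row["published_at"]),
        section=row["section"],
        headline=row["headline"],
        body=row["body"],
        primary_headline=row.get("primary_headline") or None,
        sub_headline=row.get("sub_headline") or None,
        author=row.get("author") or None,
        images=list(row.get("images") or []),
    )
```

Treating the three secondary text fields as optional is a design guess: many articles carry only a single headline, so empty values are normalized to `None` rather than kept as empty strings.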
Data Features:
- Extensive and Representative Sources: Data covers influential financial and comprehensive newspapers such as China Securities Journal, Shanghai Securities News, Securities Times, People's Daily, and Securities Daily, reflecting core voices in mainstream financial discourse.
- Long Time Span Supporting Longitudinal Research: For 20 of the newspapers, the data span more than a decade of continuous coverage, making them suitable for longitudinal studies of macroeconomic trends, market cycles, and policy evolution.
- Real-time Updates Capturing Dynamics: Data are synchronized with newspaper releases, enabling immediate tracking and analysis of market hotspots, public opinion events, and policy announcements. The text also supports modern text analysis methods, including natural language processing, sentiment analysis, and topic modeling.
Potential Applications:
- Financial Market and Public Opinion Analysis: Researchers may analyze market hotspot evolution and investor sentiment fluctuations through headlines and content, while investigating immediate and lagged effects of news on stock prices and trading volumes using publication timestamps.
- Policy Impact and Media Communication Studies: Long-term data facilitates content analysis of media framing and public opinion shifts following economic policy releases, as well as examination of reporting stances and dissemination characteristics across newspapers during major financial events.
- Text Mining and Computational Method Validation: The large-scale, domain-focused database serves as an ideal corpus for training and testing NLP models in financial text classification, entity recognition, and summarization, while supporting empirical validation of computational social science methodologies.
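To make the first application above more concrete, here is a minimal sketch of how headlines and publication dates could be turned into a daily news signal and lined up with a later market outcome. It uses only the Python standard library. The keyword list, the function names, and all sample values are hypothetical, and simple keyword matching stands in for whatever sentiment or topic model a real study would use.

```python
from collections import defaultdict
from datetime import date, timedelta
from typing import Dict, Iterable, List, Tuple

# Hypothetical keyword list; a real study would use a validated lexicon or a trained model.
KEYWORDS = ("降息", "rate cut")


def daily_keyword_share(
    articles: Iterable[Tuple[date, str]], keywords=KEYWORDS
) -> Dict[date, float]:
    """Share of each day's headlines that mention at least one keyword."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for day, headline in articles:
        totals[day] += 1
        if any(k in headline for k in keywords):
            hits[day] += 1
    return {day: hits[day] / totals[day] for day in totals}


def pair_with_lag(
    signal: Dict[date, float], outcome: Dict[date, float], lag_days: int
) -> List[Tuple[float, float]]:
    """Pair the news signal on day t with the outcome on day t + lag_days."""
    pairs = []
    for day in sorted(signal):
        target = day + timedelta(days=lag_days)
        if target in outcome:
            pairs.append((signal[day], outcome[target]))
    return pairs


# Made-up sample data, for illustration only.
articles = [
    (date(2025, 1, 2), "央行宣布降息"),
    (date(2025, 1, 2), "年报披露季开启"),
    (date(2025, 1, 3), "Market digests rate cut"),
]
returns = {date(2025, 1, 3): 0.012, date(2025, 1, 4): -0.004}

share = daily_keyword_share(articles)
pairs = pair_with_lag(share, returns, lag_days=1)
```

The lag here is counted in calendar days for simplicity. An actual event study would align on trading days and would typically feed the resulting pairs into a regression or correlation analysis.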
The CnOpenData Chinese Financial Press News Text Database systematically aggregates mainstream financial news content from public sources in a continuous, comprehensive, and structured manner. Combining macro-level temporal coverage with micro-level textual granularity, it provides a robust data foundation for academic research, industry analysis, and decision-making support.
Time Range
Field Specifications
Sample Data
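Related Literature
- Jiang Fuwei, Liu Yumin, and Meng Lingchao, 2024: "Large Language Models, Textual Sentiment, and Financial Markets," Management World (管理世界), No. 8.
- Fan Xiaoyun, Wang Yedong, Wang Daoping, et al., 2022: "Heterogeneity in the Information Content of Financial Texts from Different Sources: Based on a Hybrid Textual Sentiment Measurement Method," Management World (管理世界), No. 10.
- Xu Xuechen and Tian Kan, 2021: "A New Method for Stock Index Forecasting Based on Financial Text Sentiment Analysis," Journal of Quantitative & Technological Economics (数量经济技术经济研究), No. 12.
- Zhang Zongxin and Wu Zhaoying, 2021: "Media Sentiment Contagion and Analyst Optimism Bias: Empirical Evidence Based on Machine Learning Text Analysis Methods," Management World (管理世界), No. 1.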
Data Update Frequency
Real-time updates