Research on Network Event Popularity Prediction Methods Based on Neural Networks and Transfer Learning

Author：

韩勖越 (韩勖越.)

Indexed by：

Dissertation Database

Abstract：

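With the spread of the Internet and the continuing development of information technology, network events have an important impact in many fields and draw the attention of the news media, netizens, and society as a whole. Studying how the popularity of a network event evolves and predicting that popularity not only helps netizens grasp the event's development as a whole but also provides necessary data support for event management decisions. Research on network event popularity prediction is currently limited: the news reports and user comments about network events carried by online media have received little analysis, and existing prediction methods suffer from weak timeliness and insufficient data sets. This thesis takes news reports and user comments on network events from Baidu News and Sina News as its research objects, analyzes how network event trends evolve, and uses the similarity between network events to predict event popularity from three perspectives: the volume of news reports on an event, the volume of user comments on those news reports, and the volume of user comments on the event. The specific work is as follows:
First, the characteristics of news reports and user comments on network events are analyzed. Crawlers for Baidu News and Sina News were designed and developed, and the characteristics of event news reports and user comments were analyzed on that basis. The experimental results show that different network events have different life cycles and markedly different evolution curves, that the news reports and user comments of most network events are bursty, and that the time difference between the publication of news reports on an event is uncorrelated with the number of user comments.
Next, network event popularity is predicted from the volume of news reports. A transfer learning strategy is introduced: by exploiting the similarity between network events, drawing on the prediction error of event popularity, and taking into account the normalization of popularity peaks across events, two methods, TEELM and TPELM, are proposed for predicting the news popularity of network events. The experimental results show that introducing a transfer learning strategy on top of the extreme learning machine clearly improves prediction accuracy. The model needs few parameter settings, learns quickly, and adapts well to data, so it can predict the news popularity of a network event in real time and addresses the problem of insufficient training data.
Then, network event popularity is predicted from the volume of user comments on event news reports. Comment data are collected for every news report of every network event, and linear regression and MLP neural network prediction models are built for different waiting times. The experimental results show that the performance of the prediction models differs somewhat for news reports published in different time periods, so building models per time period has some practical value. When the waiting time is short, the MLP neural network predicts better than the linear regression model; when the waiting time is long, the two models give similar results.
Finally, network event popularity is predicted from the volume of user comments on the event. For the hourly comment volume after each network event occurs, prediction models are built with RNN, LSTM, and GRU networks, and the effect of the training set's share of the data set on the performance of models built on the comment-volume time series is compared. A network event popularity prediction model combining deep learning with a transfer learning strategy is proposed. The experimental results show that introducing a transfer learning strategy on top of deep learning makes it possible to accurately predict the trend of user comment volume and the position of its peak. When the auxiliary training set is small, the prediction results improve greatly and model performance rises considerably, which gives the approach some value in practical applications.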

Keyword：

Extreme learning machine; Transfer learning; Popularity prediction; Deep learning; Network event

Author Community：

[ 1 ] School of Electronic and Information Engineering, Xi'an Jiaotong University


Translated Abstract

With the popularization of the Internet and the continuous development of information technology, network events have had an important impact in many areas. They have attracted the attention of the news media, netizens, and even society as a whole. Studying the evolution of network event popularity and methods for predicting it not only helps netizens grasp the development of network events from an overall perspective, but also provides necessary data support for event management decisions. At present, research on predicting network event popularity is limited. News articles and news comments related to network events in online media have received little analysis, and existing prediction methods suffer from weak timeliness and insufficient data sets. This thesis takes news articles and news comments related to network events from Baidu News and Sina News as its research subjects and analyzes how network events develop and evolve. Using the similarities between network events, we predict network event popularity from news articles, from comments on news articles, and from comments on network events, respectively. The specific work is as follows:

First, we analyze the development and evolution of network events. We design and develop web crawlers for Baidu News and Sina News, and then analyze the features of news articles and news comments related to network events. The experimental results show that different network events have different life cycles and quite different evolution curves. However, the news articles of most network events are bursty, the release time intervals of most network event news articles are not related to the number of their news comments, and most news comments on network events are also bursty.
Then, based on the news articles of network events, we establish a prediction model of network event popularity. Exploiting the similarities between network events, we propose two methods, named TEELM and TPELM, which use the prediction error and take into account the normalization of peak values across network events. Experimental results show that prediction accuracy improves significantly. With few parameter settings, high learning speed, and good generalization performance, the two algorithms provide an effective means of predicting network event popularity immediately and address the problem of insufficient data sets.
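The idea of normalizing peak values so that events of different magnitude become comparable can be illustrated with a minimal sketch. It is not the TEELM or TPELM algorithm, whose details this record does not give. The function names, the choice of Euclidean distance, and the comparison over the first observed values are all assumptions made for illustration.

```python
import math


def peak_normalize(series):
    """Scale a non-empty count series so that its largest value becomes 1.0."""
    peak = max(series)
    if peak == 0:
        return [0.0] * len(series)
    return [value / peak for value in series]


def shape_distance(series_a, series_b):
    """Euclidean distance between two equally long series after peak normalization."""
    norm_a = peak_normalize(series_a)
    norm_b = peak_normalize(series_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(norm_a, norm_b)))


def most_similar_event(target_prefix, history):
    """Return the name of the historical event whose opening shape is closest.

    history maps a (hypothetical) event name to its full series of news counts.
    Only the first len(target_prefix) values of each series are compared, and
    events shorter than the target prefix are skipped.
    """
    n = len(target_prefix)
    candidates = {
        name: series[:n] for name, series in history.items() if len(series) >= n
    }
    if not candidates:
        raise ValueError("no historical event is long enough to compare")
    return min(
        candidates,
        key=lambda name: shape_distance(target_prefix, candidates[name]),
    )
```

A historical event selected this way could serve as auxiliary training data for a new event that has only a few observations, which is the situation the transfer learning strategy is meant to address.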

Next, based on the number of user comments on news articles related to network events, we establish a prediction model of network event popularity. We collect statistics on the number of user comments on each news article related to a network event, and establish linear regression and MLP neural network prediction models for different observation times. The experimental results show that the performance of the prediction models differs somewhat across observation times, and that establishing separate prediction models for different time periods has some practical value. The MLP neural network predicts clearly better than the linear regression model when the observation period is short, while the two give similar results when the observation period is longer.
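Fitting one regression model per observation time can be sketched as follows. This is an illustration under stated assumptions, not the thesis's actual model: it assumes a single feature (the cumulative comment count at the observation hour) and an ordinary least squares fit, and the function names and data layout are hypothetical.

```python
def fit_simple_linear(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_per_observation_time(articles, observation_hours):
    """Fit one model per observation hour.

    articles is a list of (cumulative_counts, final_count) pairs, where
    cumulative_counts[h - 1] is the comment count h hours after publication.
    Returns a dict mapping each hour to its (slope, intercept) pair.
    """
    finals = [final for _, final in articles]
    models = {}
    for hour in observation_hours:
        early = [cumulative[hour - 1] for cumulative, _ in articles]
        models[hour] = fit_simple_linear(early, finals)
    return models


def predict(model, early_count):
    """Predict the final comment count from the count at the observation hour."""
    slope, intercept = model
    return slope * early_count + intercept
```

With models keyed by observation hour, a caller can pick the model that matches how long an article has been online and compare its error against a nonlinear model such as an MLP trained on the same split.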
Finally, based on the number of user comments on network events, we establish a prediction model of network event popularity. We build deep learning models with RNN, LSTM, and GRU, respectively, for the hourly number of user comments on network events and compare the performance of the predictive models under different training set ratios. A transfer learning strategy is then introduced on top of the original prediction model. The experimental results show that the new method can accurately predict the trend of the hourly number of user comments on network events. When the auxiliary training set is small, the prediction results and the performance of the model improve greatly.

Translated Keyword

[Deep learning, Extreme learning machine, Network event, Popularity prediction, Transfer learning]

Basic Info ：

Degree：Master's

Mentor：彭勤科

Year： 2018

Language： Other

WoS CC Cited Count： 0

30 Days PV： 17

Affiliated Colleges：

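Faculty of Electronic and Information Engineering (formerly School of Electronic and Information Engineering): data not explicitly assigned to a unit within this school/faculty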
