Deyu Zhou, PhD

522 Computer Science Building

Jiulonghu Campus

Southeast University

Office Number: 025 52090861

Email Address: d.zhou AT seu.edu.cn

Publications

2022 | 2021 | 2020 | 2019 | 2018 | 2017 | 2016 | 2015 | 2014 | 2011 | 2010 | 2008

2022

Linhai Zhang, Xuemeng Hu, Boyu Wang, Deyu Zhou, Qian-Wen Zhang, Yunbo Cao. Pre-training and Fine-tuning Neural Topic Model: A Simple yet Effective Approach to Incorporating External Knowledge, In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), 2022.
Recent years have witnessed growing interests in incorporating external knowledge such as pre-trained word embeddings (PWEs) or pre-trained language models (PLMs) into neural topic modeling. However, we found that employing PWEs and PLMs for topic modeling only achieved limited performance improvements but with huge computational overhead. In this paper, we propose a novel strategy to incorporate external knowledge into neural topic modeling where the neural topic model is pre-trained on a large corpus and then fine-tuned on the target dataset. Experiments have been conducted on three datasets and results show that the proposed approach significantly outperforms both current state-of-the-art neural topic models and some topic modeling approaches enhanced with PWEs or PLMs. Moreover, further study shows that the proposed approach greatly reduces the need for the huge size of training data.
@inproceedings{zhang-etal-2022-pre, title = "Pre-training and Fine-tuning Neural Topic Model: A Simple yet Effective Approach to Incorporating External Knowledge", author = "Zhang, Linhai and Hu, Xuemeng and Wang, Boyu and Zhou, Deyu and Zhang, Qian-Wen and Cao, Yunbo", booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)", month = may, year = "2022", address = "Dublin, Ireland", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/2022.acl-long.413", doi = "10.18653/v1/2022.acl-long.413", pages = "5980--5989"}
Tao Wang, Linhai Zhang, Chenchen Ye, Junxi Liu, Deyu Zhou. A Novel Framework Based on Medical Concept Driven Attention for Explainable Medical Code Prediction via External Knowledge, In: Findings of the Association for Computational Linguistics: ACL 2022, 2022.
Medical code prediction from clinical notes aims at automatically associating medical codes with the clinical notes. Rare code problem, the medical codes with low occurrences, is prominent in medical code prediction. Recent studies employ deep neural networks and the external knowledge to tackle it. However, such approaches lack interpretability which is a vital issue in medical application. Moreover, due to the lengthy and noisy clinical notes, such approaches fail to achieve satisfactory results. Therefore, in this paper, we propose a novel framework based on medical concept driven attention to incorporate external knowledge for explainable medical code prediction. In specific, both the clinical notes and Wikipedia documents are aligned into topic space to extract medical concepts using topic modeling. Then, the medical concept-driven attention mechanism is applied to uncover the medical code related concepts which provide explanations for medical code prediction. Experimental results on the benchmark dataset show the superiority of the proposed framework over several state-of-the-art baselines.
@inproceedings{wang-etal-2022-novel, title = "A Novel Framework Based on Medical Concept Driven Attention for Explainable Medical Code Prediction via External Knowledge", author = "Wang, Tao and Zhang, Linhai and Ye, Chenchen and Liu, Junxi and Zhou, Deyu", booktitle = "Findings of the Association for Computational Linguistics: ACL 2022", month = may, year = "2022", address = "Dublin, Ireland", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/2022.findings-acl.110", doi = "10.18653/v1/2022.findings-acl.110", pages = "1407--1416"}
Jiasheng Si, Liu Sun, Deyu Zhou, Jie Ren, Lin Li. Biomedical Argument Mining Based on Sequential Multi-Task Learning, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2022.
Biomedical argument mining aims to automatically identify and extract the argumentative structure in biomedical text. It helps to determine not only what positions people adopt, but also why they hold such opinions, which provides valuable insights into medical decision making. Generally, biomedical argument mining consists of three subtasks: argument component identification, argument component classification and relation identification. Current approaches employ conventional multi-task learning framework for jointly addressing the latter two subtasks, and achieve some successes. However, explicit sequential dependency between these two subtasks is ignored, which is crucial for accurate biomedical argument mining. Moreover, relation identification is conducted solely based on the argument component pair without considering its potentially valuable context. Therefore, in this paper, a novel sequential multi-task learning approach is proposed for biomedical argument mining. Specifically, to model explicit sequential dependency between argument component classification and relation identification, an information transfer strategy is employed to capture the information of argument component types that is transferred to relation identification. Furthermore, graph convolutional network is employed to model dependency relation among the related argument component pairs. The proposed method has been evaluated on a benchmark dataset and the experimental results show that the proposed method outperforms the state-of-the-art methods.
@article{si2022biomedical, title={Biomedical Argument Mining Based on Sequential Multi-Task Learning}, author={Si, Jiasheng and Sun, Liu and Zhou, Deyu and Ren, Jie and Li, Lin}, journal={IEEE/ACM Transactions on Computational Biology and Bioinformatics}, year={2022}, publisher={IEEE}}

2021

Deyu Zhou, Meng Zhang, Linhai Zhang, Yulan He. A Neural Group-wise Sentiment Analysis Model with Data Sparsity Awareness, In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2021), 2021.
Sentiment analysis on user-generated content has achieved notable progress by introducing user information to consider each individual’s preference and language usage. However, most existing approaches ignore the data sparsity problem, where the content of some users is limited and the model fails to capture discriminative features of users. To address this issue, we hypothesize that users could be grouped together based on their rating biases as well as degree of rating consistency and the knowledge learned from groups could be employed to analyze the users with limited data. Therefore, in this paper, a neural group-wise sentiment analysis model with data sparsity awareness is proposed. The user-centred document representations are generated by incorporating a group-based user encoder. Furthermore, a multi-task learning framework is employed to jointly model users’ rating biases and their degree of rating consistency. One task is vanilla population-level sentiment analysis and the other is group-wise sentiment analysis. Experimental results on three real-world datasets show that the proposed approach outperforms some state-of-the-art methods. Moreover, model analysis and case study demonstrate its effectiveness in modeling user rating biases and variances.
@inproceedings{zhou2021neural, title={A Neural Group-wise Sentiment Analysis Model with Data Sparsity Awareness}, author={Zhou, Deyu and Zhang, Meng and Zhang, Linhai and He, Yulan}, booktitle={Proceedings of the AAAI Conference on Artificial Intelligence}, volume={35}, number={16}, pages={14594--14601}, year={2021} }
Linhai Zhang, Deyu Zhou, Yulan He, Zeng Yang. MERL: Multimodal Event Representation Learning in Heterogeneous Embedding Spaces, In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2021), 2021.
Previous work has shown the effectiveness of using event representations for tasks such as script event prediction and stock market prediction. It is however still challenging to learn the subtle semantic differences between events based solely on textual descriptions of events often represented as (subject, predicate, object) triples. As an alternative, images offer a more intuitive way of understanding event semantics. We observe that event described in text and in images show different abstraction levels and therefore should be projected onto heterogeneous embedding spaces, as opposed to what have been done in previous approaches which project signals from different modalities onto a homogeneous space. In this paper, we propose a Multimodal Event Representation Learning framework (MERL) to learn event representations based on both text and image modalities simultaneously. Event textual triples are projected as Gaussian density embeddings by a dual-path Gaussian triple encoder, while event images are projected as point embeddings by a visual event component-aware image encoder. Moreover, a novel score function motivated by statistical hypothesis testing is introduced to coordinate two embedding spaces. Experiments are conducted on various multimodal event-related tasks and results show that MERL outperforms a number of unimodal and multimodal baselines, demonstrating the effectiveness of the proposed framework.
@inproceedings{zhang2021merl, title={MERL: Multimodal event representation learning in heterogeneous embedding spaces}, author={Zhang, Linhai and Zhou, Deyu and He, Yulan and Yang, Zeng}, booktitle={Proceedings of the AAAI Conference on Artificial Intelligence}, volume={35}, number={16}, pages={14420--14427}, year={2021} }
Jiasheng Si, Linsen Guo, Deyu Zhou. Unsupervised latent event representation learning and storyline extraction from news articles based on neural networks, Intelligent Data Analysis, 25(3), 589-603, 2021.
Storyline extraction aims to generate concise summaries of related events unfolding over time from a collection of temporally-ordered news articles. Some existing approaches to storyline extraction are typically built on probabilistic graphical models that jointly model the extraction of events and the storylines from news published in different periods. However, their parameter inference procedures are often complex and require a long time to converge, which hinders their use in practical applications. More recently, a neural network-based approach has been proposed to tackle such limitations. However, event representations of documents, which are important for the quality of the generated storylines, are not learned. In this paper, we propose a novel unsupervised neural network-based approach to extract latent events and link patterns of storylines jointly from documents over time. Specifically, event representations are learned by a stacked autoencoder and clustered for event extraction, then a fusion component is incorporated to link the related events across consecutive periods for storyline extraction. The proposed model has been evaluated on three news corpora and the experimental results show that it outperforms state-of-the-art approaches with significant improvements.
@article{si2021unsupervised, title={Unsupervised latent event representation learning and storyline extraction from news articles based on neural networks}, author={Si, Jiasheng and Guo, Linsen and Zhou, Deyu}, journal={Intelligent Data Analysis}, volume={25}, number={3}, pages={589--603}, year={2021}, publisher={IOS Press} }
Linhai Zhang, Chao Lin, Deyu Zhou, Yulan He, Meng Zhang. A Bayesian end-to-end model with estimated uncertainties for simple question answering over knowledge bases, Computer Speech & Language, 66, 101167, 2021.
Existing methods for question answering over knowledge bases (KBQA) ignore the consideration of the model prediction uncertainties. We argue that estimating such uncertainties is crucial for the reliability and interpretability of KBQA systems. Therefore, we propose a novel end-to-end KBQA model based on Bayesian Neural Network (BNN) to estimate uncertainties arose from both model and data. To our best knowledge, we are the first to consider the uncertainty estimation problem for the KBQA task using BNN. The proposed end-to-end model integrates entity detection and relation prediction into a unified framework, and employs BNN to model entity and relation under the given question semantics, transforming network weights into distributions. Therefore, predictive distributions can be estimated by sampling weights and forward inputs through the network multiple times. Uncertainties can be further quantified by calculating the variances of predictive distributions. The experimental results demonstrate the effectiveness of uncertainties in both the misclassification detection task and cause of error detection task. Furthermore, the proposed model also achieves comparable performance compared to the existing state-of-the-art approaches on SimpleQuestions dataset.
@article{zhang2021bayesian, title={A bayesian end-to-end model with estimated uncertainties for simple question answering over knowledge bases}, author={Zhang, Linhai and Lin, Chao and Zhou, Deyu and He, Yulan and Zhang, Meng}, journal={Computer Speech \& Language}, volume={66}, pages={101167}, year={2021}, publisher={Elsevier} }
Keqin Peng, Chuantao Yin, Wenge Rong, Chenghua Lin, Deyu Zhou, Zhang Xiong. Named entity aware transfer learning for biomedical factoid question answering, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2021.
Biomedical factoid question answering is an important task in biomedical question answering applications. It has attracted much attention because of the reliability of its answers. In question answering systems, better word representations are important, and a proper word embedding can usually improve system performance significantly. With the success of pre-trained models in general natural language processing tasks, pre-trained models have been widely used in the biomedical area as well, and many pre-trained model based approaches have proven effective in biomedical question answering. Besides proper word embeddings, named entities are also important information for biomedical question answering. Inspired by the concept of transfer learning, in this research we developed a mechanism to fine-tune BioBERT with a named entity dataset to improve question answering performance.
@article{peng2021named, title={Named entity aware transfer learning for biomedical factoid question answering}, author={Peng, Keqin and Yin, Chuantao and Rong, Wenge and Lin, Chenghua and Zhou, Deyu and Xiong, Zhang}, journal={IEEE/ACM Transactions on Computational Biology and Bioinformatics}, year={2021}, publisher={IEEE} }
Rui Wang, Deyu Zhou, Yuxuan Xiong, Haiping Huang. Variational Gaussian Topic Model with Invertible Neural Projections, arXiv preprint, arXiv:2105.10095, 2021.
Neural topic models have triggered a surge of interest in extracting topics from text automatically since they avoid the sophisticated derivations in conventional topic models. However, scarce neural topic models incorporate the word relatedness information captured in word embedding into the modeling process. To address this issue, we propose a novel topic modeling approach, called Variational Gaussian Topic Model (VaGTM). Based on the variational auto-encoder, the proposed VaGTM models each topic with a multivariate Gaussian in decoder to incorporate word relatedness. Furthermore, to address the limitation that pre-trained word embeddings of topic-associated words do not follow a multivariate Gaussian, Variational Gaussian Topic Model with Invertible neural Projections (VaGTM-IP) is extended from VaGTM. Three benchmark text corpora are used in experiments to verify the effectiveness of VaGTM and VaGTM-IP. The experimental results show that VaGTM and VaGTM-IP outperform several competitive baselines and obtain more coherent topics.
@article{wang2021variational, title={Variational Gaussian Topic Model with Invertible Neural Projections}, author={Wang, Rui and Zhou, Deyu and Xiong, Yuxuan and Huang, Haiping}, journal={arXiv preprint arXiv:2105.10095}, year={2021} }
Lixing Zhu, Gabriele Pergola, Lin Gui, Deyu Zhou, Yulan He. Topic-driven and knowledge-aware transformer for dialogue emotion detection, In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL/IJCNLP 2021), 2021.
Emotion detection in dialogues is challenging as it often requires the identification of thematic topics underlying a conversation, the relevant commonsense knowledge, and the intricate transition patterns between the affective states. In this paper, we propose a Topic-Driven Knowledge-Aware Transformer to handle the challenges above. We firstly design a topic-augmented language model (LM) with an additional layer specialized for topic detection. The topic-augmented LM is then combined with commonsense statements derived from a knowledge base based on the dialogue contextual information. Finally, a transformer-based encoder-decoder architecture fuses the topical and commonsense information, and performs the emotion label sequence prediction. The model has been experimented on four datasets in dialogue emotion detection, demonstrating its superiority empirically over the existing state-of-the-art approaches. Quantitative and qualitative results show that the model can discover topics which help in distinguishing emotion categories.
@inproceedings{DBLP:conf/acl/ZhuP0ZH20, author = {Lixing Zhu and Gabriele Pergola and Lin Gui and Deyu Zhou and Yulan He}, editor = {Chengqing Zong and Fei Xia and Wenjie Li and Roberto Navigli}, title = {Topic-Driven and Knowledge-Aware Transformer for Dialogue Emotion Detection}, booktitle = {Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, {ACL/IJCNLP} 2021, (Volume 1: Long Papers), Virtual Event, August 1-6, 2021}, pages = {1571--1582}, publisher = {Association for Computational Linguistics}, year = {2021}, url = {https://doi.org/10.18653/v1/2021.acl-long.125}, doi = {10.18653/v1/2021.acl-long.125}, timestamp = {Sat, 09 Apr 2022 12:33:46 +0200}, biburl = {https://dblp.org/rec/conf/acl/ZhuP0ZH20.bib}, bibsource = {dblp computer science bibliography, https://dblp.org} }
Deyu Zhou, Meng Zhang, Yang Yang, Yulan He. Hierarchical state recurrent neural network for social emotion ranking, Computer Speech & Language, 68, 101177, 2021.
Text generation with auxiliary attributes, such as topics or sentiments, has made remarkable progress. However, high-quality labeled data is difficult to obtain for the large-scale corpus. Therefore, this paper focuses on social emotion ranking aiming to identify social emotions with different intensities evoked by online documents, which could be potentially beneficial for further controlled text generation. Existing studies often consider each document as an entirety that fail to capture the inner relationship between sentences in a document. In this paper, we propose a novel hierarchical state recurrent neural network for social emotion ranking. A hierarchy mechanism is employed to capture the key hierarchical semantic structure in a document. Moreover, instead of incrementally reading a sequence of words or sentences as in traditional recurrent neural networks, the proposed approach encodes the hidden states of all words or sentences simultaneously at each recurrent step to capture long-range dependencies precisely. Experimental results show that the proposed approach performs remarkably better than the state-of-the-art social emotion ranking approaches and is useful for controlled text generation.
@article{zhou2021hierarchical, title={Hierarchical state recurrent neural network for social emotion ranking}, author={Zhou, Deyu and Zhang, Meng and Yang, Yang and He, Yulan}, journal={Computer Speech \& Language}, volume={68}, pages={101177}, year={2021}, publisher={Elsevier} }
Yachen Shi, Linhai Zhang, Zan Wang, Xiang Lu, Tao Wang, Deyu Zhou, Zhijun Zhang. Multivariate machine learning analyses in identification of major depressive disorder using resting-state functional connectivity: A multicentral study, ACS Chemical Neuroscience, 12(15), 2878-2886, 2021.
Diagnosis of major depressive disorder (MDD) using resting-state functional connectivity (rs-FC) data faces many challenges, such as the high dimensionality, small samples, and individual difference. To assess the clinical value of rs-FC in MDD and identify the potential rs-FC machine learning (ML) model for the individualized diagnosis of MDD, based on the rs-FC data, a progressive three-step ML analysis was performed, including six different ML algorithms and two dimension reduction methods, to investigate the classification performance of ML model in a multicentral, large sample dataset [1021 MDD patients and 1100 normal controls (NCs)]. Furthermore, the linear least-squares fitted regression model was used to assess the relationships between rs-FC features and the severity of clinical symptoms in MDD patients. Among used ML methods, the rs-FC model constructed by the eXtreme Gradient Boosting (XGBoost) method showed the optimal classification performance for distinguishing MDD patients from NCs at the individual level (accuracy = 0.728, sensitivity = 0.720, specificity = 0.739, area under the curve = 0.831). Meanwhile, identified rs-FCs by the XGBoost model were primarily distributed within and between the default mode network, limbic network, and visual network. More importantly, the 17-item individual Hamilton Depression Scale scores of MDD patients can be accurately predicted using rs-FC features identified by the XGBoost model (adjusted R² = 0.180, root mean squared error = 0.946). The XGBoost model using rs-FCs showed the optimal classification performance between MDD patients and NCs, with the good generalization and neuroscientifical interpretability.
@article{2021Multivariate, title={Multivariate Machine Learning Analyses in Identification of Major Depressive Disorder Using Resting-State Functional Connectivity: A Multicentral Study}, author={Shi, Yachen and Zhang, Linhai and Wang, Zan and Lu, Xiang and Wang, Tao and Zhou, Deyu and Zhang, Zhijun}, journal={ACS Chemical Neuroscience}, volume={12}, number={15}, pages={2878--2886}, year={2021} }
Deyu Zhou, Jiale Yuan, Jiasheng Si. Health issue identification in social media based on multi-task hierarchical neural networks with topic attention, Artificial Intelligence in Medicine, 118, 102119, 2021.
Objective: Health issue identification in social media is to predict whether the writers have a disease based on their posts. Numerous posts and comments are shared on social media by users. Certain posts may reflect writers' health condition, which can be employed for health issue identification. Usually, the health issue identification problem is formulated as a classification task. Methods and material: In this paper, we propose novel multi-task hierarchical neural networks with topic attention for identifying health issues based on posts collected from the social media platforms. Specifically, the model incorporates the hierarchical relationship among the document, sentences, and words via bidirectional gated recurrent units (BiGRUs). The global topic information shared across posts is incorporated with the hidden states of BiGRUs to obtain the topic-enhanced attention weights for words. In addition, tasks of predicting whether the writers suffer from a disease (health issue identification) and predicting the specific domain of the posts (domain category classification) are learned jointly in multi-task mechanism. Results: The proposed method is evaluated on two datasets: dementia issue dataset and depression issue dataset. The proposed approach achieves 98.03% and 88.28% F-1 score on two datasets, outperforming the state-of-the-art approach by 0.73% and 0.4% respectively. Further experimental analysis shows the effectiveness of incorporating both the multi-task learning framework and topic attention mechanism.
@article{zhou2021health, title={Health issue identification in social media based on multi-task hierarchical neural networks with topic attention}, author={Zhou, Deyu and Yuan, Jiale and Si, Jiasheng}, journal={Artificial Intelligence in Medicine}, volume={118}, pages={102119}, year={2021}, publisher={Elsevier} }
Deyu Zhou, Kai Sun, Mingqi Hu, Yulan He. Image generation from text with entity information fusion, Knowledge-Based Systems, 227, 107200, 2021.
Image generation from text is the task of generating new images from a textual unit such as word, phrase, clause and sentence. It has attracted great attention in both the community of natural language processing and computer vision. Current approaches usually employ an end-to-end framework to tackle the problem. However, we find that the entity information, including categories and attributes of the images, is ignored by most approaches. Such information is crucial for guaranteeing semantic alignment and generating images accurately. For two pictures of the same category, the emphasis of the corresponding text description may be different, but the images generated by these two sentences should have some similarities and the generation process can learn from each other. Therefore, we propose two novel end-to-end frameworks to incorporate entity information in the process of image generation. For the first framework, an image representation is generated from entity labels using the variational inference mechanism and then fused with the representation generated from the corresponding sentence. Instead of fusing the images in high-dimensional space, images are inferred and fused in the latent space (the low-dimensional space) in the second framework, where computationally intensive upsampling modules are shared. Moreover, a novel metric (Entity Matching Score) is proposed to measure the degree of consistency of the generated image with its corresponding text description and the effectiveness of the metric has been proved by the generated samples in our experiments. Experimental results show that both the proposed frameworks outperform some state-of-the-art approaches significantly on two benchmark datasets.
@article{zhou2021image, title={Image generation from text with entity information fusion}, author={Zhou, Deyu and Sun, Kai and Hu, Mingqi and He, Yulan}, journal={Knowledge-Based Systems}, volume={227}, pages={107200}, year={2021}, publisher={Elsevier} }
Deyu Zhou, Yanzheng Xiang, Linhai Zhang, Chenchen Ye, Qian-Wen Zhang, Yunbo Cao. A Divide-And-Conquer Approach for Multi-label Multi-hop Relation Detection in Knowledge Base Question Answering, In: Findings of the Association for Computational Linguistics: EMNLP 2021, 2021.
Relation detection in knowledge base question answering aims to identify the path(s) of relations starting from the topic entity node that is linked to the answer node in the knowledge graph. Such a path might consist of multiple relations, which we call multi-hop. Moreover, for a single question, there may exist multiple relation paths to the correct answer, which we call multi-label. However, most existing approaches only detect one single path to obtain the answer without considering other correct paths, which might affect the final performance. Therefore, in this paper, we propose a novel divide-and-conquer approach for multi-label multi-hop relation detection (DC-MLMH) by decomposing it into head relation detection and conditional relation path generation. Specifically, a novel path sampling mechanism is proposed to generate diverse relation paths for the inference stage. A majority-vote policy is employed to detect the final KB answer. Comprehensive experiments were conducted on the FreebaseQA benchmark dataset. Experimental results show that the proposed approach not only outperforms other competitive multi-label baselines, but also has superiority over some state-of-the-art KBQA methods.
@inproceedings{zhou-etal-2021-divide-conquer, title = "A Divide-And-Conquer Approach for Multi-label Multi-hop Relation Detection in Knowledge Base Question Answering", author = "Zhou, Deyu and Xiang, Yanzheng and Zhang, Linhai and Ye, Chenchen and Zhang, Qian-Wen and Cao, Yunbo", booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021", month = nov, year = "2021", address = "Punta Cana, Dominican Republic", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/2021.findings-emnlp.412", doi = "10.18653/v1/2021.findings-emnlp.412", pages = "4798--4808" }
Linhai Zhang, Deyu Zhou, Chao Lin, Yulan He. A Multi-label Multi-hop Relation Detection Model based on Relation-aware Sequence Generation, In: Findings of the Association for Computational Linguistics: EMNLP 2021, 2021.
Multi-hop relation detection in Knowledge Base Question Answering (KBQA) aims at retrieving the relation path starting from the topic entity to the answer node based on a given question, where the relation path may comprise multiple relations. Most of the existing methods treat it as a single-label learning problem while ignoring the fact that for some complex questions, there exist multiple correct relation paths in knowledge bases. Therefore, in this paper, multi-hop relation detection is considered as a multi-label learning problem. However, performing multi-label multi-hop relation detection is challenging since the numbers of both the labels and the hops are unknown. To tackle this challenge, multi-label multi-hop relation detection is formulated as a sequence generation task. A relation-aware sequence relation generation model is proposed to solve the problem in an end-to-end manner. Experimental results show the effectiveness of the proposed method for relation detection and KBQA.
@inproceedings{zhang-etal-2021-multi-label-multi, title = "A Multi-label Multi-hop Relation Detection Model based on Relation-aware Sequence Generation", author = "Zhang, Linhai and Zhou, Deyu and Lin, Chao and He, Yulan", booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021", month = nov, year = "2021", address = "Punta Cana, Dominican Republic", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/2021.findings-emnlp.404", doi = "10.18653/v1/2021.findings-emnlp.404", pages = "4713--4719"}
Chenchen Ye, Linhai Zhang, Yulan He, Deyu Zhou, Jie Wu. Beyond Text: Incorporating Metadata and Label Structure for Multi-Label Document Classification using Heterogeneous Graphs, In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), 2021.
Multi-label document classification, associating one document instance with a set of relevant labels, is attracting more and more research attention. Existing methods explore the incorporation of information beyond text, such as document metadata or label structure. These approaches however either simply utilize the semantic information of metadata or employ the predefined parent-child label hierarchy, ignoring the heterogeneous graphical structures of metadata and labels, which we believe are crucial for accurate multi-label document classification. Therefore, in this paper, we propose a novel neural network based approach for multi-label document classification, in which two heterogeneous graphs are constructed and learned using heterogeneous graph transformers. One is metadata heterogeneous graph, which models various types of metadata and their topological relations. The other is label heterogeneous graph, which is constructed based on both the labels’ hierarchy and their statistical dependencies. Experimental results on two benchmark datasets show the proposed approach outperforms several state-of-the-art baselines.
@inproceedings{ye-etal-2021-beyond, title = "Beyond Text: Incorporating Metadata and Label Structure for Multi-Label Document Classification using Heterogeneous Graphs", author = "Ye, Chenchen and Zhang, Linhai and He, Yulan and Zhou, Deyu and Wu, Jie", booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing", month = nov, year = "2021", address = "Online and Punta Cana, Dominican Republic", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/2021.emnlp-main.253", doi = "10.18653/v1/2021.emnlp-main.253", pages = "3162--3171"}
Deyu Zhou, Jianan Wang, Linhai Zhang, Yulan He. Implicit Sentiment Analysis with Event-centered Text Representation, In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), 2021.
Implicit sentiment analysis, aiming at detecting the sentiment of a sentence without sentiment words, has become an attractive research topic in recent years. In this paper, we focus on event-centric implicit sentiment analysis that utilizes the sentiment-aware event contained in a sentence to infer its sentiment polarity. Most existing methods in implicit sentiment analysis simply view noun phrases or entities in text as events or indirectly model events with sophisticated models. Since events often trigger sentiments in sentences, we argue that this task would benefit from explicit modeling of events and event representation learning. To this end, we represent an event as the combination of its event type and the event triplet <subject, predicate, object>. Based on such event representation, we further propose a novel model with hierarchical tensor-based composition mechanism to detect sentiment in text. In addition, we present a dataset for event-centric implicit sentiment analysis where each sentence is labeled with the event representation described above. Experimental results on our constructed dataset and an existing benchmark dataset show the effectiveness of the proposed approach.
@inproceedings{zhou-etal-2021-implicit, title = "Implicit Sentiment Analysis with Event-centered Text Representation", author = "Zhou, Deyu and Wang, Jianan and Zhang, Linhai and He, Yulan", booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing", month = nov, year = "2021", address = "Online and Punta Cana, Dominican Republic", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/2021.emnlp-main.551", doi = "10.18653/v1/2021.emnlp-main.551", pages = "6884--6893"}
Yachen Shi, Linhai Zhang, Cancan He, Yingying Yin, Ruize Song, Suzhen Chen, Dandan Fan, Deyu Zhou, Yonggui Yuan, Chunming Xie, Zhijun Zhang. Sleep disturbance-related neuroimaging features as potential biomarkers for the diagnosis of major depressive disorder: A multicenter study based on machine learning, Journal of Affective Disorders, 295, 148-155, 2021.
Background: Objective biomarkers are crucial for overcoming the clinical dilemma in major depressive disorder (MDD), and the individualized diagnosis is essential to facilitate the precise medicine for MDD. Methods: Sleep disturbance-related magnetic resonance imaging (MRI) features were identified in the internal dataset (92 MDD patients) using the relevance vector regression algorithm, which was further verified in 460 MDD patients of an independent, multicenter dataset. Subsequently, using these MRI features, the eXtreme Gradient Boosting classification model was constructed in the current multicenter dataset (460 MDD patients and 470 normal controls). Meanwhile, the association between classification outputs and the severity of depressive symptoms was also investigated. Results: In MDD patients, the combination of gray matter density and fractional amplitude of low-frequency fluctuation can accurately predict individual sleep disturbance score that was calculated by the sum of item 4 score, item 5 score, and item 6 score of the 17-Item Hamilton Rating Scale for Depression (HAMD-17) (R2 = 0.158 in the internal dataset; R2 = 0.110 in multicenter dataset). Furthermore, the classification model based on these MRI features distinguished MDD patients from normal controls with 86.3% accuracy (area under the curve = 0.937). Importantly, the classification outputs significantly correlated with HAMD-17 scores in MDD patients. Limitation: Lacking some specialized tools to assess the personal sleep quality, e.g. Pittsburgh Sleep Quality Index. Conclusion: Neuroimaging features can reflect accurately individual sleep disturbance manifestation and serve as potential diagnostic biomarkers of MDD.
@article{shi2021sleep, title={Sleep disturbance-related neuroimaging features as potential biomarkers for the diagnosis of major depressive disorder: A multicenter study based on machine learning}, author={Shi, Yachen and Zhang, Linhai and He, Cancan and Yin, Yingying and Song, Ruize and Chen, Suzhen and Fan, Dandan and Zhou, Deyu and Yuan, Yonggui and Xie, Chunming and others}, journal={Journal of Affective Disorders}, volume={295}, pages={148--155}, year={2021}, publisher={Elsevier} }

2020

Rui Wang, Deyu Zhou, Yulan He. Optimising Topic Coherence with Weighted Polya Urn scheme, In: Neurocomputing, 2020.
Topic models have been widely used to mine hidden topics from documents. However, one limitation of such topic models is that they are prone to generate incoherent topics. To address this limitation, many approaches have been proposed to incorporate the prior knowledge of word semantic relatedness into the topic inference process. One example is the Generalized Polya Urn (GPU) scheme. However, GPU-based topic models often require sophisticated algorithms to acquire domain-specific knowledge from data. Moreover, prior knowledge is incorporated into the topic inference process without considering its impact on the intermediate topic sampling results. In this paper, we propose a novel Weighted Polya Urn scheme and incorporate it into Latent Dirichlet Allocation framework to build the self-enhancement topic model and generate coherent topics. In specific, semantic prior knowledge based on word embedding is employed to measure the semantic coherence of a word to different topics, which is incorporated into the Weighted Polya Urn scheme. Moreover, semantic coherence is updated dynamically based on the semantic similarity between a word and the representative words in different topics. Experiments have been conducted on seven public corpora from different domains to evaluate the effectiveness of the proposed approach. Experimental results show that compared to the state-of-the-art baselines, the proposed approach can generate more coherent topics.
@article{wang2019optimising, title={Optimising Topic Coherence with Weighted P{\'o}lya Urn scheme}, author={Wang, Rui and Zhou, Deyu and He, Yulan}, journal={Neurocomputing}, year={2019}, publisher={Elsevier} }
Chuan Wang, Taomin Zhang, Xuan Liu, Lei Miao, Deyu Zhou, Peng Wang, Yibo Zhang, Qing Jiang, Yezi Hu, Han Yin, Jianfei Sun. Bone Metabolic Biomarkers-Based Diagnosis of Osteoporosis Caused by Diabetes Mellitus using Support Vector Machine, In: Research Square, 2020.
Background: Diabetes has significant effects on bone metabolism. Both type 1 and type 2 diabetes can cause osteoporotic fracture. However, it remains challenging to diagnose osteoporosis in type 2 diabetes by bone mineral density which lacks regular changes. Seen another way, osteoporosis can be ascribed to the imbalance of bone metabolism, which is closely related to diabetes as well. Method: Here, to assist clinicians in diagnosing osteoporosis in type 2 diabetes, an efficient and simple SVM model was established based on different combinations of biochemical indices, including bone turnover markers, calcium and phosphorus, etc. The classification performance was measured using several evaluations. Results: The predicting accuracy rate of final model is above 88%, with feature combination of Sex, Age, BMI, TP1NP and OSTEOC. Conclusion: Experimental results show that the model has come to an anticipant result for early detection and daily monitoring on type 2 diabetic osteoporosis.
@article{sun2020bone, title={Bone Metabolic Biomarkers-Based Diagnosis of Osteoporosis Caused by Diabetes Mellitus using Support Vector Machine}, author={Sun, J and Wang, C and Zhang, T and Liu, X and Miao, L and Zhou, D and Wang, P and Zhang, Y and Jiang, Q and Hu, Y and others}, year={2020} }
Lixing Zhu, Yulan He, Deyu Zhou. Neural opinion dynamics model for the prediction of user-level stance dynamics, Information Processing & Management, 2020.
Social media platforms allow users to express their opinions towards various topics online. Oftentimes, users' opinions are not static, but might be changed over time due to the influences from their neighbors in social networks or updated based on arguments encountered that undermine their beliefs. In this paper, we propose to use a Recurrent Neural Network (RNN) to model each user's posting behaviors on Twitter and incorporate their neighbors' topic-associated context as attention signals using an attention mechanism for user-level stance prediction. Moreover, our proposed model operates in an online setting in that its parameters are continuously updated with the Twitter stream data and can be used to predict user's topic-dependent stance. Detailed evaluation on two Twitter datasets, related to Brexit and US General Election, justifies the superior performance of our neural opinion dynamics model over both static and dynamic alternatives for user-level stance prediction.
@article{zhu2019neural, title={Neural opinion dynamics model for the prediction of user-level stance dynamics}, author={Zhu, Lixing and He, Yulan and Zhou, Deyu}, journal={Information Processing \& Management}, year={2019}, publisher={Elsevier} }

Rui Wang, Xuemeng Hu, Deyu Zhou, Yulan He, Yuxuan Xiong, Chenchen Ye, Haiyang Xu. Neural Topic Modeling with Bidirectional Adversarial Training, In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL2020), 2020.
Recent years have witnessed a surge of interests of using neural topic models for automatic topic extraction from text, since they avoid the complicated mathematical derivations for model inference as in traditional topic models such as Latent Dirichlet Allocation (LDA). However, these models either typically assume improper prior (e.g. Gaussian or Logistic Normal) over latent topic space or could not infer topic distribution for a given document. To address these limitations, we propose a neural topic modeling approach, called Bidirectional Adversarial Topic (BAT) model, which represents the first attempt of applying bidirectional adversarial training for neural topic modeling. The proposed BAT builds a two-way projection between the document-topic distribution and the document-word distribution. It uses a generator to capture the semantic patterns from texts and an encoder for topic inference. Furthermore, to incorporate word relatedness information, the Bidirectional Adversarial Topic model with Gaussian (Gaussian-BAT) is extended from BAT. To verify the effectiveness of BAT and Gaussian-BAT, three benchmark corpora are used in our experiments. The experimental results show that BAT and Gaussian-BAT obtain more coherent topics, outperforming several competitive baselines. Moreover, when performing text clustering based on the extracted topics, our models outperform all the baselines, with more significant improvements achieved by Gaussian-BAT where an increase of near 6% is observed in accuracy.
@article{wang2020neural, title={Neural Topic Modeling with Bidirectional Adversarial Training}, author={Wang, Rui and Hu, Xuemeng and Zhou, Deyu and He, Yulan and Xiong, Yuxuan and Ye, Chenchen and Xu, Haiyang}, journal={arXiv preprint arXiv:2004.12331}, year={2020} }
Lixing Zhu, Yulan He, Deyu Zhou. Neural Temporal Opinion Modelling for Opinion Prediction on Twitter, In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL2020), 2020.
Opinion prediction on Twitter is challenging due to the transient nature of tweet content and neighbourhood context. In this paper, we model users' tweet posting behaviour as a temporal point process to jointly predict the posting time and the stance label of the next tweet given a user's historical tweet sequence and tweets posted by their neighbours. We design a topic-driven attention mechanism to capture the dynamic topic shifts in the neighbourhood context. Experimental results show that the proposed model predicts both the posting time and the stance labels of future tweets more accurately compared to a number of competitive baselines.
@article{zhu2020neural, title={Neural Temporal Opinion Modelling for Opinion Prediction on Twitter}, author={Zhu, Lixing and He, Yulan and Zhou, Deyu}, journal={arXiv preprint arXiv:2005.13486}, year={2020} }
Linsen Guo, Deyu Zhou, Yulan He, Haiyang Xu. Storyline extraction from news articles with dynamic dependency, Intelligent Data Analysis, 24(1), 183-197, 2020.
Storyline generation aims to produce a concise summary of related events unfolding over time from a collection of news articles. It can be cast into an evolutionary clustering problem by separating news articles into different epochs. Existing unsupervised approaches to storyline generation are typically based on probabilistic graphical models. They assume that the storyline distribution at the current epoch depends on the weighted combination of storyline distributions in the latest previous M epochs. The evolutionary parameters of such long-term dependency are typically set by a fixed exponential decay function to capture the intuition that events in more recent epochs have stronger influence to the storyline generation in the current epoch. However, we argue that the amount of relevant historical contextual information should vary for different storylines. Therefore, in this paper, we propose a new Dynamic Dependency Storyline Extraction Model (D2SEM) in which the dependencies among events in different epochs but belonging to the same storyline are dynamically updated to track the time-varying distributions of storylines over time. The proposed model has been evaluated on three news corpora and the experimental results show that it outperforms the state-of-the-art approaches and is able to capture the dependency on historical contextual information dynamically.
@article{guo2020storyline, title={Storyline extraction from news articles with dynamic dependency}, author={Guo, Linsen and Zhou, Deyu and He, Yulan and Xu, Haiyang}, journal={Intelligent Data Analysis}, volume={24}, number={1}, pages={183--197}, year={2020}, publisher={IOS Press} }
Xuemeng Hu, Rui Wang, Deyu Zhou, Yuxuan Xiong. Neural Topic Modeling with Cycle-Consistent Adversarial Training, In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), 2020.
Advances on deep generative models have attracted significant research interest in neural topic modeling. The recently proposed Adversarial-neural Topic Model models topics with an adversarially trained generator network and employs Dirichlet prior to capture the semantic patterns in latent topics. It is effective in discovering coherent topics but unable to infer topic distributions for given documents or utilize available document labels. To overcome such limitations, we propose Topic Modeling with Cycle-consistent Adversarial Training (ToMCAT) and its supervised version sToMCAT. ToMCAT employs a generator network to interpret topics and an encoder network to infer document topics. Adversarial training and cycle-consistent constraints are used to encourage the generator and the encoder to produce realistic samples that coordinate with each other. sToMCAT extends ToMCAT by incorporating document labels into the topic modeling process to help discover more coherent topics. The effectiveness of the proposed models is evaluated on unsupervised/supervised topic modeling and text classification. The experimental results show that our models can produce both coherent and informative topics, outperforming a number of competitive baselines.
@article{hu2020neural, title={Neural topic modeling with cycle-consistent adversarial training}, author={Hu, Xuemeng and Wang, Rui and Zhou, Deyu and Xiong, Yuxuan}, journal={arXiv preprint arXiv:2009.13971}, year={2020} }
Deyu Zhou, Xuemeng Hu, Rui Wang. Neural Topic Modeling by Incorporating Document Relationship Graph, arXiv preprint arXiv:2009.13972, 2020.
Graph Neural Networks (GNNs) that capture the relationships between graph nodes via message passing have been a hot research direction in the natural language processing community. In this paper, we propose Graph Topic Model (GTM), a GNN based neural topic model that represents a corpus as a document relationship graph. Documents and words in the corpus become nodes in the graph and are connected based on document-word co-occurrences. By introducing the graph structure, the relationships between documents are established through their shared words and thus the topical representation of a document is enriched by aggregating information from its neighboring nodes using graph convolution. Extensive experiments on three datasets were conducted and the results demonstrate the effectiveness of the proposed approach.
@article{zhou2020neural, title={Neural topic modeling by incorporating document relationship graph}, author={Zhou, Deyu and Hu, Xuemeng and Wang, Rui}, journal={arXiv preprint arXiv:2009.13972}, year={2020} }

2019

Rui Wang, Deyu Zhou, Yulan He. ATM: Adversarial-neural Topic Model, Information Processing & Management, 56(6), 2019.
Topic models are widely used for thematic structure discovery in text. But traditional topic models often require dedicated inference procedures for specific tasks at hand. Also, they are not designed to generate word-level semantic representations. To address these limitations, we propose a topic modeling approach based on Generative Adversarial Nets (GANs), called Adversarial-neural Topic Model (ATM). The proposed ATM models topics with Dirichlet prior and employs a generator network to capture the semantic patterns among latent topics. Meanwhile, the generator could also produce word-level semantic representations. To illustrate the feasibility of porting ATM to tasks other than topic modeling, we apply ATM for open domain event extraction. Our experimental results on the two public corpora show that ATM generates more coherent topics, outperforming a number of competitive baselines. Moreover, ATM is able to extract meaningful events from news articles.
@article{wang2019atm, title={Atm: Adversarial-neural topic model}, author={Wang, Rui and Zhou, Deyu and He, Yulan}, journal={Information Processing \& Management}, volume={56}, number={6}, pages={102098}, year={2019}, publisher={Elsevier} }

Xinyi Yu, Wenge Rong, Jingshuang Liu, Deyu Zhou, Yuanxin Ouyang, Zhang Xiong. LSTM-Based End-to-End Framework for Biomedical Event Extraction, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2019.
Biomedical event extraction plays an important role in the extraction of biological information from large-scale scientific publications. However, most state-of-the-art systems separate this task into several steps, which leads to cascading errors. In addition, it is complicated to generate features from syntactic and dependency analysis separately. Therefore, in this paper, we propose an end-to-end model based on long short-term memory (LSTM) to optimize biomedical event extraction. Experimental results demonstrate that our approach improves the performance of biomedical event extraction. We achieve average F1-scores of 59.68%, 58.23% and 57.39% on the BioNLP09, BioNLP11 and BioNLP13's Genia event datasets, respectively. The experimental study has shown our proposed model's potential in biomedical event extraction.
@article{yu2019lstm, title={LSTM-Based End-to-End Framework for Biomedical Event Extraction}, author={Yu, Xinyi and Rong, Wenge and Liu, Jingshuang and Zhou, Deyu and Ouyang, Yuanxin and Xiong, Zhang}, journal={IEEE/ACM transactions on computational biology and bioinformatics}, year={2019}, publisher={IEEE} }

Yachen Shi, Xiang Lu, Linhai Zhang, Hao Shu, Lihua Gu, Zan Wang, Lijuan Gao, Jianli Zhu, Haisan Zhang, Deyu Zhou, Zhijun Zhang. Potential value of plasma amyloid-β, total tau, and neurofilament light for identification of early Alzheimer’s disease, ACS Chemical Neuroscience, 10(8), 3479-3485, 2019.
The objective of the study was to explore the potential value of plasma indicators for identifying amnesic mild cognitive impairment (aMCI) and determine whether levels of plasma indicators are related to the performance of cognitive function and brain tissue volumes. In total, 155 participants (68 aMCI patients and 87 healthy controls) were recruited in the present cross-sectional study. The levels of plasma amyloid-β (Aβ) 40, Aβ42, total tau (t-tau), and neurofilament light (NFL) were measured using an ultrasensitive quantitative method. Machine learning algorithms were performed for establishing an optimal model of identifying aMCI. Compared with healthy controls, Aβ40 and Aβ42 levels were lower and NFL levels were higher in plasma of aMCI patients with an exception of t-tau levels. In aMCI patients, the higher plasma Aβ40 levels were correlated with the impaired episodic memory and negative correlations were observed between plasma t-tau levels and global cognitive function and gray matter (GM) volume. In addition, the higher plasma NFL levels were correlated with reduced hippocampus volume and total GM volume of the left inferior and middle temporal gyrus. An integrated model included clinical features, hippocampus volume, and plasma Aβ42 and NFL and had the highest accuracy for detecting aMCI patients (accuracy, 74.2%). We demonstrated that plasma Aβ40, Aβ42, t-tau, and NFL may be useful to identify aMCI and correlate with cognitive decline and brain atrophy. Among these plasma indicators, Aβ42 and NFL are more valuable as key members of a peripheral biomarker panel to detect aMCI.
@article{shi2019potential, title={Potential value of plasma amyloid-$\beta$, total tau, and neurofilament light for identification of early Alzheimer’s disease}, author={Shi, Yachen and Lu, Xiang and Zhang, Linhai and Shu, Hao and Gu, Lihua and Wang, Zan and Gao, Lijuan and Zhu, Jianli and Zhang, Haisan and Zhou, Deyu and others}, journal={ACS Chemical Neuroscience}, volume={10}, number={8}, pages={3479--3485}, year={2019}, publisher={ACS Publications} }

Rui Wang, Deyu Zhou, Mingmin Jiang, Jiasheng Si, Yang Yang. A Survey on Opinion Mining: From Stance to Product Aspect, IEEE Access, 7, 41101-41124, 2019.
With the prevalence of social media and online forum, opinion mining, aiming at analyzing and discovering the latent opinion in user-generated reviews on the Internet, has become a hot research topic. This survey focuses on two important subtasks in this field, stance detection and product aspect mining, both of which can be formalized as the problem of the triple <target, aspect, opinion> extraction. In this paper, we first introduce the general framework of opinion mining and describe the evaluation metrics. Then, the methodologies for stance detection on different sources, such as online forum and social media are discussed. After that, approaches for product aspect mining are categorized into three main groups which are corpus level aspect extraction, corpus level aspect and opinion mining, and document level aspect and opinion mining based on the processing units and tasks. And then we discuss the challenges and possible solutions. Finally, we summarize the evolving trend of the reviewed methodologies and conclude the survey.
@article{wang2019survey, title={A Survey on Opinion Mining: From Stance to Product Aspect}, author={Wang, Rui and Zhou, Deyu and Jiang, Mingmin and Si, Jiasheng and Yang, Yang}, journal={IEEE Access}, volume={7}, pages={41101--41124}, year={2019}, publisher={IEEE} }

Lixing Zhu, Yulan He, Deyu Zhou. Hierarchical viewpoint discovery from tweets using Bayesian modelling, Expert Systems with Applications, 116, 430-438, 2019.
When users express their stances towards a topic in social media, they might elaborate their viewpoints or reasoning. Oftentimes, viewpoints expressed by different users exhibit a hierarchical structure. Therefore, detecting this kind of hierarchical viewpoints offers a better insight to understand the public opinion. In this paper, we propose a novel Bayesian model for hierarchical viewpoint discovery from tweets. Driven by the motivation that a viewpoint expressed in a tweet can be regarded as a path from the root to a leaf of a hierarchical viewpoint tree, the assignment of the relevant viewpoint topics is assumed to follow two nested Chinese restaurant processes. Moreover, opinions in text are often expressed in un-semantically decomposable multi-terms or phrases, such as 'economic recession'. Hence, a hierarchical Pitman-Yor process is employed as a prior for modelling the generation of phrases with arbitrary length. Experimental results on two Twitter corpora demonstrate the effectiveness of the proposed Bayesian model for hierarchical viewpoint discovery.
@article{zhu2019hierarchical, title={Hierarchical viewpoint discovery from tweets using Bayesian modelling}, author={Zhu, Lixing and He, Yulan and Zhou, Deyu}, journal={Expert Systems with Applications}, volume={116}, pages={430--438}, year={2019}, publisher={Elsevier} }
Donglei Tang, Zhikai Zhang, Yulan He, Chao Lin, Deyu Zhou. Hidden topic-emotion transition model for multi-level social emotion detection, Knowledge-Based Systems, 164, 426-435, 2019.
With the fast development of online social platforms, social emotion detection, focusing on predicting readers' emotions evoked by news articles, has been intensively investigated. Considering emotions as latent variables, various probabilistic graphical models have been proposed for emotion detection. However, the bag-of-words assumption prohibits those models from capturing the interrelations between sentences in a document. Moreover, existing models can only detect emotions at either the document-level or the sentence-level. In this paper, we propose an effective Bayesian model, called hidden Topic-Emotion Transition model, by assuming that words in the same sentence share the same emotion and topic and modelling the emotions and topics in successive sentences as a Markov chain. By doing so, not only the document-level emotion but also the sentence-level emotion can be detected simultaneously. Experimental results on the two public corpora show that the proposed model outperforms state-of-the-art approaches on both document-level and sentence-level emotion detection.
@article{tang2019hidden, title={Hidden topic--emotion transition model for multi-level social emotion detection}, author={Tang, Donglei and Zhang, Zhikai and He, Yulan and Lin, Chao and Zhou, Deyu}, journal={Knowledge-Based Systems}, volume={164}, pages={426--435}, year={2019}, publisher={Elsevier} }

Rui Wang, Deyu Zhou, Yulan He. Open Event Extraction from Online Text using a Generative Adversarial Network, In: Conference on Empirical Methods in Natural Language Processing & International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, November 3-7, 2019
To extract the structured representations of open-domain events, Bayesian graphical models have made some progress. However, these approaches typically assume that all words in a document are generated from a single event. While this may be true for short text such as tweets, such an assumption does not generally hold for long text such as news articles. Moreover, Bayesian graphical models often rely on Gibbs sampling for parameter inference which may take long time to converge. To address these limitations, we propose an event extraction model based on Generative Adversarial Nets, called Adversarial-neural Event Model (AEM). AEM models an event with a Dirichlet prior and uses a generator network to capture the patterns underlying latent events. A discriminator is used to distinguish documents reconstructed from the latent events and the original documents. A byproduct of the discriminator is that the features generated by the learned discriminator network allow the visualization of the extracted events. Our model has been evaluated on two Twitter datasets and a news article dataset. Experimental results show that our model outperforms the baseline approaches on all the datasets, with more significant improvements observed on the news article dataset where an increase of 15% is observed in F-measure.
@article{wang2019open, title={Open Event Extraction from Online Text using a Generative Adversarial Network}, author={Wang, Rui and Zhou, Deyu and He, Yulan}, journal={arXiv preprint arXiv:1908.09246}, year={2019} }
Yang Yang, Deyu Zhou, Yulan He, Meng Zhang. Interpretable Relevant Emotion Ranking with Event-Driven Attention, In: Conference on Empirical Methods in Natural Language Processing & International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, November 3-7, 2019
Multiple emotions with different intensities are often evoked by events described in documents. Oftentimes, such event information is hidden and needs to be discovered from texts. Unveiling the hidden event information can help to understand how the emotions are evoked and provide explainable results. However, existing studies often ignore the latent event information. In this paper, we proposed a novel interpretable relevant emotion ranking model with the event information incorporated into a deep learning architecture using the event-driven attentions. Moreover, corpus-level event embeddings and document-level event distributions are introduced respectively to consider the global events in corpus and the document-specific events simultaneously. Experimental results on three real-world corpora show that the proposed approach performs remarkably better than the state-of-the-art emotion detection approaches and multi-label approaches. Moreover, interpretable results can be obtained to shed light on the events which trigger certain emotions.
Mingqi Hu, Deyu Zhou, Yulan He. Variational Conditional GAN for Fine-grained Controllable Image Generation, In: Asian Conference on Machine Learning (ACML), Nagoya, Japan, 2019
In this paper, we propose a novel variational generator framework for conditional GANs to catch semantic details for improving the generation quality and diversity. Traditional generators in conditional GANs simply concatenate the conditional vector with the noise as the input representation, which is directly employed for upsampling operations. However, the hidden condition information is not fully exploited, especially when the input is a class label. Therefore, we introduce a variational inference into the generator to infer the posterior of latent variable only from the conditional input, which helps achieve a variable augmented representation for image generation. Qualitative and quantitative experimental results show that the proposed method outperforms the state-of-the-art approaches and achieves the realistic controllable images.
@article{hu2019variational, title={Variational Conditional GAN for Fine-grained Controllable Image Generation}, author={Hu, Mingqi and Zhou, Deyu and He, Yulan}, journal={arXiv preprint arXiv:1909.09979}, year={2019} }

2018

Deyu Zhou, Zhikai Zhang, Minling Zhang, Yulan He. Weakly Supervised POS Tagging without Disambiguation, ACM Transactions on Asian and Low-Resource Language Information Processing, 17.4:35, 2018.
Weakly supervised part-of-speech (POS) tagging is to learn to predict the POS tag for a given word in context by making use of partially annotated data instead of the fully tagged corpora. Weakly supervised POS tagging would benefit various natural language processing applications in such languages where tagged corpora are mostly unavailable.

In this article, we propose a novel framework for weakly supervised POS tagging based on a dictionary of words with their possible POS tags. In the constrained error-correcting output codes (ECOC)-based approach, a unique L-bit vector is assigned to each POS tag. The set of bit vectors is referred to as a coding matrix with values in {1, -1}. Each column of the coding matrix specifies a dichotomy over the tag space to learn a binary classifier. For each binary classifier, its training data is generated in the following way: each pair of a word and its possible POS tags is considered as a positive training example only if the whole set of its possible tags falls into the positive dichotomy specified by the column coding, and similarly for negative training examples. Given a word in context, its POS tag is predicted by concatenating the predictive outputs of the L binary classifiers and choosing the tag with the closest distance according to some measure. By incorporating the ECOC strategy, the set of all possible tags for each word is treated as an entirety without the need of performing disambiguation. Moreover, instead of manual feature engineering employed in most previous POS tagging approaches, features for training and testing in the proposed framework are automatically generated using neural language modeling. The proposed framework has been evaluated on three corpora for English, Italian, and Malagasy POS tagging, achieving accuracies of 93.21%, 90.9%, and 84.5% respectively, which shows a significant improvement compared to the state-of-the-art approaches.
@article{zhou2018weakly, title={Weakly Supervised POS Tagging without Disambiguation}, author={Zhou, Deyu and Zhang, Zhikai and Zhang, Min-Ling and He, Yulan}, journal={ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP)}, volume={17}, number={4}, pages={35}, year={2018}, publisher={ACM} }
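The constrained ECOC scheme described in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the four-tag coding matrix, the function names, and the choice of Hamming distance as the decoding measure are all assumptions made for illustration, and the binary classifiers themselves are left out (their outputs are passed in directly).

```python
from typing import Dict, List, Optional, Set

# Hypothetical coding matrix: each tag gets a unique L-bit codeword over {+1, -1}.
# Here L = 4; the tags and codewords are made up for illustration.
CODING_MATRIX: Dict[str, List[int]] = {
    "NOUN": [+1, +1, -1, -1],
    "VERB": [+1, -1, +1, -1],
    "ADJ":  [-1, +1, +1, -1],
    "DET":  [-1, -1, -1, +1],
}


def training_label(possible_tags: Set[str], column: int) -> Optional[int]:
    """Label a word for the binary classifier of one column.

    Returns +1 if every candidate tag lies in the positive dichotomy,
    -1 if every candidate tag lies in the negative dichotomy, and None
    if the candidate set straddles both (assumed here to mean the word
    is not used as a training example for this classifier).
    """
    bits = {CODING_MATRIX[tag][column] for tag in possible_tags}
    if bits == {+1}:
        return +1
    if bits == {-1}:
        return -1
    return None


def hamming_distance(a: List[int], b: List[int]) -> int:
    """Number of positions where two codewords disagree."""
    return sum(x != y for x, y in zip(a, b))


def decode(outputs: List[int]) -> str:
    """Pick the tag whose codeword is closest to the L classifier outputs.

    Hamming distance is an assumed choice of distance measure.
    """
    return min(
        CODING_MATRIX,
        key=lambda tag: hamming_distance(CODING_MATRIX[tag], outputs),
    )
```

Under these assumptions, a word whose dictionary entry lists both NOUN and VERB is a positive example for the first column (both codewords start with +1) and is skipped for the second column, so its ambiguity never has to be resolved during training.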

Deyu Zhou, Lei Miao, Yulan He. Position-aware deep multi-task learning for drug-drug interaction extraction, Artificial Intelligence In Medicine, 87, 1-8, 2018.
Objective: A drug-drug interaction (DDI) is a situation in which a drug affects the activity of another drug synergistically or antagonistically when being administered together. The information of DDIs is crucial for healthcare professionals to prevent adverse drug events. Although some known DDIs can be found in purposely-built databases such as DrugBank, most information is still buried in scientific publications. Therefore, automatically extracting DDIs from biomedical texts is sorely needed.

Methods and material: In this paper, we propose a novel position-aware deep multi-task learning approach for extracting DDIs from biomedical texts. In particular, sentences are represented as a sequence of word embeddings and position embeddings. An attention-based bidirectional long short-term memory (BiLSTM) network is used to encode each sentence. The relative position information of words with respect to the target drugs in the text is combined with the hidden states of the BiLSTM to generate the position-aware attention weights. Moreover, the tasks of predicting whether or not two drugs interact with each other and further distinguishing the types of interactions are learned jointly in a multi-task learning framework.

Results: The proposed approach has been evaluated on the DDIExtraction challenge 2013 corpus and the results show that with the position-aware attention only, our proposed approach outperforms the state-of-the-art method by 0.99% for binary DDI classification, and with both position-aware attention and multi-task learning, our approach achieves a micro F-score of 72.99% on interaction type identification, outperforming the state-of-the-art approach by 1.51%, which demonstrates the effectiveness of the proposed approach.
@article{zhou2018position, title={Position-aware deep multi-task learning for drug--drug interaction extraction}, author={Zhou, Deyu and Miao, Lei and He, Yulan}, journal={Artificial intelligence in medicine}, volume={87}, pages={1--8}, year={2018}, publisher={Elsevier} }

Yang Yang, Deyu Zhou and Yulan He. An Interpretable Neural Network with Topical Information for Relevant Emotion Ranking, In: Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31-November 4, 2018.
Text might express or evoke multiple emotions with varying intensities. As such, it is crucial to predict and rank multiple relevant emotions by their intensities. Moreover, as emotions might be evoked by hidden topics, it is important to unveil and incorporate such topical information to understand how the emotions are evoked. We proposed a novel interpretable neural network approach for relevant emotion ranking. Specifically, motivated by transfer learning, the neural network is initialized to make the hidden layer approximate the behavior of topic models. Moreover, a novel error function is defined to optimize the whole neural network for relevant emotion ranking. Experimental results on three real-world corpora show that the proposed approach performs remarkably better than the state-of-the-art emotion detection approaches and multi-label learning methods. Moreover, the extracted emotion-associated topic words indeed represent emotion-evoking events and are in line with our common-sense knowledge.
@inproceedings{yang2018interpretable, title={An interpretable neural network with topical information for relevant emotion ranking}, author={Yang, Yang and Zhou, Deyu and He, Yulan}, booktitle={Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing}, pages={3423--3432}, year={2018} }

Deyu Zhou, Yang Yang and Yulan He. Relevant Emotion Ranking from Text Constrained with Emotion Relationships, In: The North American Chapter of the Association for Computational Linguistics (NAACL 2018), New Orleans, Louisiana, June 1-6, Vol. 1, pp. 561-571, 2018.
Text might contain or invoke multiple emotions with varying intensities. As such, emotion detection, to predict multiple emotions associated with a given text, can be cast into a multi-label classification problem. We would like to go one step further so that a ranked list of relevant emotions is generated where top ranked emotions are more intensely associated with text compared to lower ranked emotions, whereas the rankings of irrelevant emotions are not important. A novel framework of relevant emotion ranking is proposed to tackle the problem. In the framework, the objective loss function is carefully designed so that both emotion prediction and rankings of only relevant emotions can be achieved. Moreover, we observe that some emotions co-occur more often while other emotions rarely coexist. Such information is incorporated into the framework as constraints to improve the accuracy of emotion detection. Experimental results on two real-world corpora show that the proposed framework can effectively deal with emotion detection and performs remarkably better than the state-of-the-art emotion detection approaches and multi-label learning methods.
@inproceedings{zhou2018relevant, title={Relevant emotion ranking from text constrained with emotion relationships}, author={Zhou, Deyu and Yang, Yang and He, Yulan}, booktitle={Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)}, pages={561--571}, year={2018} }

Deyu Zhou, Linsen Guo, Yulan He. Neural Storyline Extraction Model for Storyline Generation from News Articles, In: The North American Chapter of the Association for Computational Linguistics (NAACL 2018), New Orleans, Louisiana, June 1-6, Vol. 1, pp. 1727-1736, 2018.
Storyline generation aims to extract events described in news articles under a certain topic and reveal how those events evolve over time. Most existing approaches first train supervised models to extract events from news articles published in different time periods and then link relevant events into coherent stories. They are domain dependent and cannot deal with unseen event types. To tackle this problem, approaches based on probabilistic graphical models jointly model the generation of events and storylines without annotated data. However, the parameter inference procedure is too complex and models often require a long time to converge. In this paper, we propose a novel neural network based approach to extract structured representations and evolution patterns of storylines without using annotated data. In this model, the title and main body of a news article are assumed to share a similar storyline distribution. Moreover, similar documents described in neighboring time periods are assumed to share similar storyline distributions. Based on these assumptions, structured representations and evolution patterns of storylines can be extracted. The proposed model has been evaluated on three news corpora and the experimental results show that it outperforms state-of-the-art approaches in accuracy and efficiency.
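@inproceedings{zhou2018neural, title={Neural Storyline Extraction Model for Storyline Generation from News Articles}, author={Zhou, Deyu and Guo, Linsen and He, Yulan}, booktitle={Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)}, pages={1727--1736}, year={2018}, organization={Association for Computational Linguistics} }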

2017

Deyu Zhou, Xuan Zhang, Yulan He. Event extraction from Twitter using Non-Parametric Bayesian Mixture Model with Word Embeddings, In: Proceedings of the 2017 Conference on The European Chapter of the Association for Computational Linguistics (EACL 2017), Valencia, April 3-7, 2017.
In recent years, there has been increasing interest in using unsupervised models to extract structured representations of newsworthy events from Twitter. These models typically assume that tweets involving the same named entities and expressed using similar words are likely to belong to the same event. Hence, they group tweets into clusters based on the co-occurrence patterns of named entities and topical keywords. However, there are two main limitations. First, they require the number of events to be known beforehand, which is not realistic in practical applications. Second, they don't recognise that the same named entity might be referred to by multiple mentions, for example, "Putin" and "The President of Russia" refer to the same person. As a result, tweets using different mentions would be wrongly assigned to different events. To overcome these limitations, we propose a non-parametric Bayesian mixture model with word embeddings for event extraction, in which the number of events can be inferred automatically and the issue of lexical variations for the same named entity can be dealt with properly. Our model has been evaluated on three datasets with sizes ranging between 2,499 and over 60 million tweets. Experimental results show that our model outperforms the baseline approach on all datasets by 5-8% in F-measure.
@inproceedings{zhou2017event, title={Event extraction from Twitter using non-parametric Bayesian mixture model with word embeddings}, author={Zhou, Deyu and Zhang, Xuan and He, Yulan}, booktitle={Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers}, pages={808--817}, year={2017} }

2016

Deyu Zhou, Xuan Zhang, Yin Zhou, Quan Zhao, Xin Geng. Emotion Distribution Learning from Texts, In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP 2016), Austin, Texas, USA, November 1-5, 2016.
The advent of social media and its prosperity enable users to share their opinions and views. Understanding users' emotional states might provide the potential to create new business opportunities. Automatically identifying users' emotional states from their texts and classifying emotions into finite categories such as joy, anger, disgust, etc., can be considered as a text classification problem. However, it introduces a challenging learning scenario where multiple emotions with different intensities are often found in a single sentence. Moreover, some emotions co-occur more often while other emotions rarely coexist. In this paper, we propose a novel approach based on emotion distribution learning in order to address the aforementioned issues. The key idea is to learn a mapping function from sentences to their emotion distributions describing multiple emotions and their respective intensities. Moreover, the relations of emotions are captured based on Plutchik's wheel of emotions and are subsequently incorporated into the learning algorithm in order to improve the accuracy of emotion detection. Experimental results show that the proposed approach can effectively deal with the emotion distribution detection problem and performs remarkably better than both the state-of-the-art emotion detection method and multi-label learning methods.
@inproceedings{deyu2016emotion, title={Emotion distribution learning from texts}, author={Zhou, Deyu and Zhang, Xuan and Zhou, Yin and Zhao, Quan and Geng, Xin}, booktitle={Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing}, pages={638--647}, year={2016} }

Deyu Zhou, Tianmeng Gao, Yulan He. Jointly Event Extraction and Visualization on Twitter via Probabilistic Modelling, In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL 2016), Berlin, Germany, August 7-12, 2016.
Event extraction from texts aims to detect structured information such as what has happened, to whom, where and when. Event extraction and visualization are typically considered as two different tasks. In this paper, we propose a novel approach based on probabilistic modelling to jointly extract and visualize events from tweets where both tasks benefit from each other. We model each event as a joint distribution over named entities, a date, a location and event-related keywords. Moreover, both tweets and event instances are associated with coordinates in the visualization space. The manifold assumption that the intrinsic geometry of tweets is a low-rank, non-linear manifold within the high-dimensional space is incorporated into the learning framework using a regularization. Experimental results show that the proposed approach can effectively deal with both event extraction and visualization and performs remarkably better than both the state-of-the-art event extraction method and a pipeline approach for event extraction and visualization.
@inproceedings{zhou2016jointly, title={Jointly event extraction and visualization on twitter via probabilistic modelling}, author={Zhou, Deyu and Gao, Tianmeng and He, Yulan}, booktitle={Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)}, pages={269--278}, year={2016} }

Deyu Zhou, Haiyang Xu, Xinyu Dai, Yulan He. Unsupervised Storyline Extraction from News Articles, In: Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI 2016), New York, USA, July 9-15, 2016.
Storyline extraction from news streams aims to extract events under a certain news topic and reveal how those events evolve over time. It requires algorithms capable of accurately extracting events from news articles published in different time periods and linking these extracted events into coherent stories. The two tasks are often solved separately, which might suffer from the problem of error propagation. Existing unified approaches often consider events as topics, ignoring their structured representations. In this paper, we propose a non-parametric generative model to extract structured representations and evolution patterns of storylines simultaneously. In the model, each storyline is modelled as a joint distribution over some locations, organizations, persons, keywords and a set of topics. We further combine this model with the Chinese restaurant process so that the number of storylines can be determined automatically without human intervention. Moreover, per-token Metropolis-Hastings sampler based on light latent Dirichlet allocation is employed to reduce sampling complexity. The proposed model has been evaluated on three news corpora and the experimental results show that it outperforms several baseline approaches.
@inproceedings{zhou2016unsupervised, title={Unsupervised Storyline Extraction from News Articles}, author={Zhou, Deyu and Xu, Haiyang and Dai, Xin-Yu and He, Yulan}, booktitle={IJCAI}, pages={3014--3021}, year={2016} }

2015

Deyu Zhou, Dayou Zhong. A semi-supervised learning framework for biomedical event extraction based on hidden topics, Artificial Intelligence In Medicine, Volume 64, No. 1, pp. 51-58, 2015.
Objective: Scientists have devoted decades of effort to understanding the interaction between proteins or RNA production. The information might empower the current knowledge on drug reactions or the development of certain diseases. Nevertheless, due to the lack of explicit structure, literature in life science, one of the most important sources of this information, prevents computer-based systems from accessing it. Therefore, biomedical event extraction, automatically acquiring knowledge of molecular events in research articles, has attracted community-wide efforts recently. Most approaches are based on statistical models, requiring large-scale annotated corpora to precisely estimate models' parameters. However, such corpora are usually difficult to obtain in practice. Therefore, employing un-annotated data based on semi-supervised learning for biomedical event extraction is a feasible solution and attracts more interest.

Methods and Material: In this paper, a semi-supervised learning framework based on hidden topics for biomedical event extraction is presented. In this framework, sentences in the un-annotated corpus are elaborately and automatically assigned event annotations based on their distances to the sentences in the annotated corpus. More specifically, not only the structures of the sentences, but also the hidden topics embedded in the sentences are used for describing the distance. The sentences and newly assigned event annotations, together with the annotated corpus, are employed for training.

Results: Experiments were conducted on the multi-level event extraction corpus, a gold standard corpus. Experimental results show that more than 2.2% improvement on F-score on biomedical event extraction is achieved by the proposed framework when compared to the state-of-the-art approach.

Conclusion: The results suggest that by incorporating un-annotated data, the proposed framework indeed improves the performance of the state-of-the-art event extraction system and the similarity between sentences might be precisely described by hidden topics and structures of the sentences.
@article{zhou2015semi, title={A semi-supervised learning framework for biomedical event extraction based on hidden topics}, author={Zhou, Deyu and Zhong, Dayou}, journal={Artificial intelligence in medicine}, volume={64}, number={1}, pages={51--58}, year={2015}, publisher={Elsevier} }

Deyu Zhou, Haiyang Xu, Yulan He. An Unsupervised Bayesian Modelling Approach for Storyline Detection on News Articles, In: Proceedings of the 2015 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP 2015), Lisboa, Portugal, September 17-21, 2015.
Storyline detection from news articles aims at summarizing events described under a certain news topic and revealing how those events evolve over time. It is a difficult task because it requires first the detection of events from news articles published in different time periods and then the construction of storylines by linking events into coherent news stories. Moreover, each storyline has different hierarchical structures which are dependent across epochs. Existing approaches often ignore the dependency of hierarchical structures in storyline generation. In this paper, we propose an unsupervised Bayesian model, called dynamic storyline detection model, to extract structured representations and evolution patterns of storylines. The proposed model is evaluated on a large scale news corpus. Experimental results show that our proposed model outperforms several baseline approaches.
@inproceedings{zhou2015unsupervised, title={An unsupervised Bayesian modelling approach for storyline detection on news articles}, author={Zhou, Deyu and Xu, Haiyang and He, Yulan}, booktitle={Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing}, pages={1943--1948}, year={2015} }

Deyu Zhou, Liangyu Chen, Yulan He. An Unsupervised Framework of Exploring Events on Twitter: Filtering, Extraction and Categorization, In: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI 2015), Austin, Texas, USA, January 25-30, 2015.
Twitter, as a popular microblogging service, has become a new information channel for users to receive and exchange the most up-to-date information on current events. However, since there is no control on how users can publish messages on Twitter, finding newsworthy events from Twitter becomes a difficult task like "finding a needle in a haystack". In this paper we propose a general unsupervised framework to explore events from tweets, which consists of a pipeline process of filtering, extraction and categorization. To filter out noisy tweets, the filtering step exploits a lexicon-based approach to separate tweets that are event-related from those that are not. Then, based on these event-related tweets, the structured representations of events are extracted and categorized automatically using an unsupervised Bayesian model without the use of any labelled data. Moreover, the categorized events are assigned event type labels without human intervention. The proposed framework has been evaluated on over 60 million tweets which were collected for one month in December 2010. A precision of 70.49% is achieved in event extraction, outperforming a competitive baseline by nearly 6%. Events are also clustered into coherent groups with the automatically assigned event type label.
@inproceedings{zhou2015framework, title={An unsupervised framework of exploring events on twitter: Filtering, extraction and categorization}, author={Zhou, Deyu and Chen, Liangyu and He, Yulan}, booktitle={Twenty-Ninth AAAI Conference on Artificial Intelligence}, year={2015} }

2014

Deyu Zhou, Dayou Zhong, Yulan He. Event Trigger Identification for Biomedical Events Extraction using Domain Knowledge, Bioinformatics, Volume 30, No. 11, pp. 1587-1594, 2014.
Motivation: In molecular biology, molecular events describe observable alterations of biomolecules, such as binding of proteins or RNA production. These events might be responsible for drug reactions or development of certain diseases. As such, biomedical event extraction, the process of automatically detecting description of molecular interactions in research articles, attracted substantial research interest recently. Event trigger identification, detecting the words describing the event types, is a crucial and prerequisite step in the pipeline process of biomedical event extraction. Taking the event types as classes, event trigger identification can be viewed as a classification task. For each word in a sentence, a trained classifier predicts whether the word corresponds to an event type and which event type based on the context features. Therefore, a well-designed feature set with a good level of discrimination and generalization is crucial for the performance of event trigger identification.

Results: In this article, we propose a novel framework for event trigger identification. In particular, we learn biomedical domain knowledge from a large text corpus built from Medline and embed it into word features using neural language modeling. The embedded features are then combined with the syntactic and semantic context features using the multiple kernel learning method. The combined feature set is used for training the event trigger classifier. Experimental results on the golden standard corpus show that 42.5% improvement on F-score is achieved by the proposed framework when compared with the state-of-the-art approach, demonstrating the effectiveness of the proposed framework.

Availability and implementation: The source code for the proposed framework is freely available and can be downloaded at http://palm.seu.edu.cn/zhoudeyu/ETI_Sourcecode.zip .
@article{zhou2014event, title={Event trigger identification for biomedical events extraction using domain knowledge}, author={Zhou, Deyu and Zhong, Dayou and He, Yulan}, journal={Bioinformatics}, volume={30}, number={11}, pages={1587--1594}, year={2014}, publisher={Oxford University Press} }

Deyu Zhou, Dayou Zhong, Yulan He. Biomedical Relation Extraction: From Binary to Complex, Computational and Mathematical Methods in Medicine, Volume 2014, 2014.
Biomedical relation extraction aims to uncover high-quality relations from life science literature with high accuracy and efficiency. Early biomedical relation extraction tasks focused on capturing binary relations, such as protein-protein interactions, which are crucial for virtually every process in a living cell. Information about these interactions provides the foundations for new therapeutic approaches. In recent years, more interests have been shifted to the extraction of complex relations such as biomolecular events. While complex relations go beyond binary relations and involve more than two arguments, they might also take another relation as an argument. In the paper, we conduct a thorough survey on the research in biomedical relation extraction. We first present a general framework for biomedical relation extraction and then discuss the approaches proposed for binary and complex relation extraction with focus on the latter since it is a much more difficult task compared to binary relation extraction. Finally, we discuss challenges that we are facing with complex relation extraction and outline possible solutions and future directions.
@article{zhou2014biomedical, title={Biomedical relation extraction: from binary to complex}, author={Zhou, Deyu and Zhong, Dayou and He, Yulan}, journal={Computational and mathematical methods in medicine}, volume={2014}, year={2014}, publisher={Hindawi} }

Deyu Zhou, Liangyu Chen, Yulan He. A Simple Bayesian Modelling Approach to Event Extraction from Twitter, In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014), Baltimore, Maryland, USA, June 23-25, 2014.
With the proliferation of social media sites, social streams have proven to contain the most up-to-date information on current events. Therefore, it is crucial to extract events from the social streams such as tweets. However, it is not straightforward to adapt the existing event extraction systems since texts in social media are fragmented and noisy. In this paper we propose a simple and yet effective Bayesian model, called Latent Event Model (LEM), to extract structured representation of events from social media. LEM is fully unsupervised and does not require annotated data for training. We evaluate LEM on a Twitter corpus. Experimental results show that the proposed model achieves 83% in F-measure, and outperforms the state-of-the-art baseline by over 7%.
@inproceedings{zhou2014simple, title={A simple bayesian modelling approach to event extraction from twitter}, author={Zhou, Deyu and Chen, Liangyu and He, Yulan}, booktitle={Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)}, pages={700--705}, year={2014} }

2011

Deyu Zhou, Yulan He. Biomedical Events Extraction using the Hidden Vector State Model, Artificial Intelligence In Medicine, Volume 53, No. 3, pp. 205-213, 2011.
Objective: Biomedical events extraction concerns extracting events describing changes in the state of bio-molecules from literature. Compared to the protein-protein interactions (PPIs) extraction task, which often only involves the extraction of binary relations between two proteins, biomedical events extraction is much harder since it needs to deal with complex events consisting of embedded or hierarchical relations among proteins, events, and their textual triggers. In this paper, we propose an information extraction system based on the hidden vector state (HVS) model, called HVS-BioEvent, for biomedical events extraction, and investigate its capability in extracting complex events.

Methods and Material: HVS has been previously employed for the extractions of PPIs. In HVS-BioEvent, we propose an automated way to generate abstract annotations for HVS training and further propose novel machine learning approaches for event trigger word identification, and for biomedical events extraction from the HVS parse results.

Results: Our proposed system achieves an F-score of 49.57% on the corpus used in the BioNLP'09 shared task, which is only 2.38% lower than the best performing system by UTurku in the BioNLP'09 shared task. Nevertheless, HVS-BioEvent outperforms UTurku's system on complex events extraction with 36.57% vs 30.52% being achieved for extracting regulation events, and 40.61% vs 38.99% for negative regulation events.

Conclusions: The results suggest that the HVS model with the hierarchical hidden state structure is indeed more suitable for complex event extraction since it could naturally model embedded structural context in sentences.

Keywords: Hidden vector state model, biomedical events extraction, abstract annotations, semantic parsing.
@article{zhou2011biomedical, title={Biomedical events extraction using the hidden vector state model}, author={Zhou, Deyu and He, Yulan}, journal={Artificial Intelligence in Medicine}, volume={53}, number={3}, pages={205--213}, year={2011}, publisher={Elsevier} }

Yulan He, Deyu Zhou. Self-training from labeled features for sentiment analysis, Information Processing and Management, Volume 47, Issue 4, pp. 606-616, 2011.
Sentiment analysis concerns automatically identifying sentiment or opinion expressed in a given piece of text. Most prior work either uses prior lexical knowledge defined as sentiment polarity of words or views the task as a text classification problem and relies on labeled corpora to train a sentiment classifier. While lexicon-based approaches do not adapt well to different domains, corpus-based approaches require expensive manual annotation effort.

In this paper, we propose a novel framework where an initial classifier is learned by incorporating prior information extracted from an existing sentiment lexicon with preferences on expectations of sentiment labels of those lexicon words being expressed using generalized expectation criteria. Documents classified with high confidence are then used as pseudo-labeled examples for automatic domain-specific feature acquisition. The word-class distributions of such self-learned features are estimated from the pseudo-labeled examples and are used to train another classifier by constraining the model's predictions on unlabeled instances. Experiments on both the movie-review data and the multi-domain sentiment dataset show that our approach attains comparable or better performance than existing weakly-supervised sentiment classification methods despite using no labeled documents.
@article{he2011self, title={Self-training from labeled features for sentiment analysis}, author={He, Yulan and Zhou, Deyu}, journal={Information Processing \& Management}, volume={47}, number={4}, pages={606--616}, year={2011}, publisher={Elsevier} }

Deyu Zhou, Yulan He. A Novel Framework of Training Hidden Markov Support Vector Machines from Lightly-Annotated Data, In: Proceedings of the 20th ACM Conference on Information and Knowledge Management (CIKM 2011), Glasgow, pp. 2025-2028, 2011.
Natural language understanding (NLU) aims to map sentences to their semantic meaning representations. Statistical approaches to NLU normally require fully-annotated training data where each sentence is paired with its word-level semantic annotations. In this paper, we propose a novel learning framework which trains the Hidden Markov Support Vector Machines without the use of expensive fully-annotated data. In particular, our learning approach takes as input a training set of sentences labeled with abstract semantic annotations encoding underlying embedded structural relations and automatically induces derivation rules that map sentences to their semantic meaning representations. The proposed approach has been tested on the DARPA Communicator Data and achieved 93.18% in F-measure, which outperforms the previously proposed approaches of training the hidden vector state model or conditional random fields from unaligned data, with relative error reduction rates of 43.3% and 10.6% being achieved respectively.
@inproceedings{zhou2011novel, title={A novel framework of training hidden markov support vector machines from lightly-annotated data}, author={Zhou, Deyu and He, Yulan}, booktitle={Proceedings of the 20th ACM international conference on Information and knowledge management}, pages={2025--2028}, year={2011}, organization={ACM} }

Deyu Zhou, Yulan He. Learning Conditional Random Fields from Unaligned Data for Natural Language Understanding, In: Proceedings of the 33rd European Conference on Information Retrieval (ECIR 2011), Dublin, Ireland, 283-288.
In this paper, we propose a learning approach to train conditional random fields from unaligned data for natural language understanding where input to model learning are sentences paired with predicate formulae (or abstract semantic annotations) without word-level annotations. The learning approach resembles the expectation maximization algorithm. It has two advantages, one is that only abstract annotations are needed instead of fully word-level annotations, and the other is that the proposed learning framework can be easily extended for training other discriminative models, such as support vector machines, from abstract annotations. The proposed approach has been tested on the DARPA Communicator Data. Experimental results show that it outperforms the hidden vector state (HVS) model, a modified hidden Markov model also trained on abstract annotations. Furthermore, the proposed method has been compared with two other approaches, one is the hybrid framework (HF) combining the HVS model and the support vector hidden Markov model, and the other is discriminative training of the HVS model (DT). The proposed approach gives a relative error reduction rate of 18.7% and 8.3% in F-measure when compared with HF and DT respectively.
@inproceedings{zhou2011learning, title={Learning conditional random fields from unaligned data for natural language understanding}, author={Zhou, Deyu and He, Yulan}, booktitle={European Conference on Information Retrieval}, pages={283--288}, year={2011}, organization={Springer} }

Deyu Zhou, Yulan He. Semantic Parsing for Biomedical Event Extraction, In: Proceedings of the 9th International Conference on Computational Semantics (IWCS 2011), Oxford, UK.
We propose a biomedical event extraction system, HVS-BioEvent, which employs the hidden vector state (HVS) model for semantic parsing. Biomedical events extraction needs to deal with complex events consisting of embedded or hierarchical relations among proteins, events, and their textual triggers. In HVS-BioEvent, we further propose novel machine learning approaches for event trigger word identification, and for biomedical events extraction from the HVS parse results. Our proposed system achieves an F-score of 49.57% on the corpus used in the BioNLP'09 shared task, which is only two points lower than the best performing system by UTurku. Nevertheless, HVS-BioEvent outperforms UTurku on the extraction of complex event types. The results suggest that the HVS model with the hierarchical hidden state structure is indeed more suitable for complex event extraction since it can naturally model embedded structural context in sentences.
@inproceedings{zhou2011semantic, title={Semantic parsing for biomedical event extraction}, author={Zhou, Deyu and He, Yulan}, booktitle={Proceedings of the Ninth International Conference on Computational Semantics}, pages={395--399}, year={2011}, organization={Association for Computational Linguistics} }

2010

Yulan He, Harith Alani, Deyu Zhou. Exploring English Lexicon Knowledge for Chinese Sentiment Analysis, In: Proceedings of CIPS-SIGHAN Joint Conference on Chinese Language Processing (CLP 2010), Beijing, China, 2010.
This paper presents a weakly-supervised method for Chinese sentiment analysis by incorporating lexical prior knowledge obtained from English sentiment lexicons through machine translation. A mechanism is introduced to incorporate the prior information about polarity-bearing words obtained from existing sentiment lexicons into latent Dirichlet allocation (LDA) where sentiment labels are considered as topics. Experiments on Chinese product reviews on mobile phones, digital cameras, MP3 players, and monitors demonstrate the feasibility and effectiveness of the proposed approach and show that the weakly supervised LDA model performs as well as supervised classifiers such as Naive Bayes and Support Vector Machines with an average of 83% accuracy achieved over a total of 5484 review documents. Moreover, the LDA model is able to extract highly domain-salient polarity words from text.
@inproceedings{he2010exploring, title={Exploring english lexicon knowledge for chinese sentiment analysis}, author={He, Yulan and Alani, Harith and Zhou, Deyu}, booktitle={CIPS-SIGHAN joint conference on Chinese language processing}, year={2010} }

2008

Deyu Zhou, Yulan He. Discriminative Training of the Hidden Vector State Model for Semantic Parsing, IEEE Transactions on Knowledge and Data Engineering, Volume 21, Issue 1, pp. 66-77.
In this paper, we discuss how discriminative training can be applied to the hidden vector state (HVS) model in different task domains. The HVS model is a discrete hidden Markov model (HMM) in which each HMM state represents the state of a push-down automaton with a finite stack size. In previous applications, maximum-likelihood estimation (MLE) is used to derive the parameters of the HVS model. However, MLE makes a number of assumptions and unfortunately some of these assumptions do not hold. Discriminative training, without making such assumptions, can improve the performance of the HVS model by discriminating the correct hypothesis from the competing hypotheses. Experiments have been conducted in two domains: the travel domain for the semantic parsing task using the DARPA Communicator data and the Air Travel Information Services (ATIS) data, and the bioinformatics domain for the information extraction task using the GENIA corpus. The results demonstrate modest improvements in the performance of the HVS model using discriminative training. In the travel domain, discriminative training of the HVS model gives a relative error reduction rate of 31 percent in F-measure when compared with MLE on the DARPA Communicator data and 9 percent on the ATIS data. In the bioinformatics domain, a relative error reduction rate of 4 percent in F-measure is achieved on the GENIA corpus.
@article{zhou2008discriminative, title={Discriminative training of the hidden vector state model for semantic parsing}, author={Zhou, Deyu and He, Yulan}, journal={IEEE Transactions on Knowledge and Data Engineering}, volume={21}, number={1}, pages={66--77}, year={2008}, publisher={IEEE} }

Deyu Zhou, Yulan He. Extracting Interactions between Proteins from the Literature, Journal of Biomedical Informatics, Volume 41, No 2, pp. 393-407, 2008.
During the last decade, biomedicine has witnessed a tremendous development. Large amounts of experimental and computational biomedical data have been generated along with new discoveries, which are accompanied by an exponential increase in the number of biomedical publications describing these discoveries. In the meantime, there has been great interest within scientific communities in text mining tools to find knowledge such as protein-protein interactions, which is most relevant and useful for specific analysis tasks. This paper provides an outline of the various information extraction methods in the biomedical domain, especially for discovery of protein-protein interactions. It surveys methodologies involved in plain text analysis and processing, categorizes current work in biomedical information extraction, and provides examples of these methods. Challenges in the field are also presented and possible solutions are discussed.
@article{zhou2008extracting, title={Extracting interactions between proteins from the literature}, author={Zhou, Deyu and He, Yulan}, journal={Journal of biomedical informatics}, volume={41}, number={2}, pages={393--407}, year={2008}, publisher={Elsevier} }

Deyu Zhou, Yulan He. A Hybrid Generative/Discriminative Framework to Train a Semantic Parser from an Un-annotated Corpus, In: Proceedings of the 22nd International Conference on Computational Linguistics (COLING 2008), Manchester, August, 2008.
We propose a hybrid generative/discriminative framework for semantic parsing which combines the hidden vector state (HVS) model and the hidden Markov support vector machines (HM-SVMs). The HVS model is an extension of the basic discrete Markov model in which context is encoded as a stack-oriented state vector. The HM-SVMs combine the advantages of the hidden Markov models and the support vector machines. By employing a modified K-means clustering method, a small set of most representative sentences can be automatically selected from an un-annotated corpus. These sentences together with their abstract annotations are used to train an HVS model which could be subsequently applied on the whole corpus to generate semantic parsing results. The most confident semantic parsing results are selected to generate a fully-annotated corpus which is used to train the HM-SVMs. The proposed framework has been tested on the DARPA Communicator Data. Experimental results show that an improvement over the baseline HVS parser has been observed using the hybrid framework. When compared with the HM-SVMs trained from the fully-annotated corpus, the hybrid framework gave a comparable performance with only a small set of lightly annotated sentences.
@inproceedings{zhou2008hybrid, title={A hybrid generative/discriminative framework to train a semantic parser from an un-annotated corpus}, author={Zhou, Deyu and He, Yulan}, booktitle={Proceedings of the 22nd International Conference on Computational Linguistics-Volume 1}, pages={1113--1120}, year={2008}, organization={Association for Computational Linguistics} }

Yulan He, Keiichi Nakata, Deyu Zhou. Ontology-Based Protein-Protein Interactions Extraction from Literature using the Hidden Vector State Model, In: The workshop 'SADM'08' of the 8th IEEE International Conference on Data Mining (ICDM'08), Pisa, Italy, December 2008.
This paper proposes a novel framework of incorporating protein-protein interactions (PPI) ontology knowledge into PPI extraction from biomedical literature in order to address the emerging challenges of deep natural language understanding. It is built upon the existing work on relation extraction using the Hidden Vector State (HVS) model. The HVS model belongs to the category of statistical learning methods. It can be trained directly from un-annotated data in a constrained way whilst at the same time being able to capture the underlying named entity relationships. However, it is difficult to incorporate background knowledge or non-local information into the HVS model. This paper proposes to represent the HVS model as a conditionally trained undirected graphical model in which non-local features derived from PPI ontology through inference would be easily incorporated. The seamless fusion of ontology inference with statistical learning produces a new paradigm to information extraction.
@inproceedings{he2008ontology, title={Ontology-based protein-protein interactions extraction from literature using the hidden vector state model}, author={He, Yulan and Nakata, Keiichi and Zhou, Deyu}, booktitle={2008 IEEE International Conference on Data Mining Workshops}, pages={736--743}, year={2008}, organization={IEEE} }

Deyu Zhou, Yulan He. Extracting Protein-Protein Interaction based on Discriminative Training of the Hidden Vector State Model, In: Proceedings of the ACL 2008 Workshop on Biomedical Natural Language Processing (BioNLP 2008), Columbus, OH, USA.
Knowledge about gene clusters and protein interactions is important for biological researchers to unveil the mechanism of life. However, a large quantity of this knowledge is often hidden in the literature, such as journal articles, reports, books and so on. Many approaches focusing on extracting information from unstructured text, such as pattern matching, shallow and deep parsing, have been proposed especially for extracting protein-protein interactions (Zhou and He, 2008). A semantic parser based on the Hidden Vector State (HVS) model for extracting protein-protein interactions is presented in (Zhou et al., 2008). The HVS model is an extension of the basic discrete Markov model in which context is encoded as a stack-oriented state vector. Maximum Likelihood estimation (MLE) is used to derive the parameters of the HVS model. In this paper, we propose a discriminative approach based on parse error measure to train the HVS model. To adjust the HVS model to achieve minimum parse error rate, the generalized probabilistic descent (GPD) algorithm (Kuo et al., 2002) is used. Experiments have been conducted on the GENIA corpus. The results demonstrate modest improvements, with the discriminatively trained HVS model outperforming its MLE trained counterpart by 2.5% in F-measure.
@inproceedings{zhou2008extractingppi, title={Extracting protein-protein interaction based on discriminative training of the hidden vector state model}, author={Zhou, Deyu and He, Yulan}, booktitle={Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing}, pages={98--99}, year={2008} }