Important Dates

  • Paper Submission Due: July 23, 2021
  • Notification of acceptance: August 27, 2021
  • Camera-ready due: September 10, 2021
  • Early Registration ends: September 15, 2021
  • Late Registration ends: October 5, 2021
  • On-Site Registration: October 15-16, 2021

All deadlines are 11:59 pm UTC-12h (Anywhere on Earth).

Welcome to ROCLING 2021!

ROCLING 2021 is the 33rd annual Conference on Computational Linguistics and Speech Processing in Taiwan, sponsored by the Association for Computational Linguistics and Chinese Language Processing (ACLCLP). The conference will be held in the Teaching and Research Building of National Central University (NCU) in Taoyuan, Taiwan, on October 15-16, 2021.

ROCLING 2021 will provide an international forum for researchers and industry practitioners to share their new ideas, original research results and practical development experiences from all language and speech research areas, including computational linguistics, information understanding, and signal processing. ROCLING 2021 will feature oral papers, posters, tutorials, special sessions and shared tasks.

The Conference on Computational Linguistics and Speech Processing (ROCLING) was initiated in 1988 by the Association for Computational Linguistics and Chinese Language Processing (ACLCLP) with the major goal of providing a platform for researchers and professionals from around the world to share their experiences related to natural language processing and speech processing.

Call for Papers

ROCLING 2021 invites paper submissions reporting original research results and system development experiences as well as real-world applications. Each submission will be reviewed based on originality, significance, technical soundness, and relevance to the conference. Accepted papers will be presented orally or as poster presentations. Both oral and poster presentations will be published in the ROCLING 2021 conference proceedings and included in the ACL Anthology. A number of papers will be selected and invited for extension into journal versions and publication in a special issue of the International Journal of Computational Linguistics and Chinese Language Processing (IJCLCLP).

Papers can be written and presented in either Chinese or English. Papers should be in PDF format and submitted online through the paper submission system. Submitted papers may consist of 4-8 pages of content, plus unlimited references. Upon acceptance, final versions will be given additional pages of content (up to 9 pages) so that reviewers’ comments can be taken into account. ROCLING 2021 mainly targets two scientific tracks: natural language processing (NLP) and speech processing (Speech). Relevant topics for the conference include, but are not limited to, the following areas (in alphabetical order):

Natural Language Processing
  • Cognitive/Psychological Linguistics
  • Discourse and Pragmatics
  • Dialogue System
  • Information Extraction
  • Information Retrieval
  • Language Generation
  • Machine Translation
  • NLP Applications
  • Phonology, Morphology and Word Segmentation
  • Question Answering
  • Resources and Evaluation
  • Semantics: Lexical, Sentence-Level, Textual Inference
  • Sentiment Analysis
  • Summarization
  • Syntax: Tagging, Chunking and Parsing
  • Others

Speech Processing
  • Speech Perception, Production and Acquisition
  • Phonetics, Phonology and Prosody
  • Analysis of Paralinguistics in Speech and Language
  • Speaker and Language Identification
  • Analysis of Speech and Audio Signals
  • Speech Coding and Enhancement
  • Speech Synthesis and Spoken Language Generation
  • Speech Recognition
  • Spoken Dialog Systems and Analysis of Conversation
  • Spoken Language Processing: Retrieval, Translation, Summarization, Resources and Evaluation
  • Others

Paper submissions must use the official ROCLING 2021 style templates (LaTeX and Word). Submission is electronic, using the EasyChair conference management system. The submission site is available at url (under construction).

As the reviewing will be double-blind, papers must not include authors' names and affiliations. Furthermore, self-references that reveal the author's identity must be avoided. Papers that do not conform to these requirements will be rejected without review. Papers may be accompanied by a resource (software and/or data) described in the paper, but these resources should be anonymized as well.

Keynote Speakers

Vincent Ng

The University of Texas at Dallas

Vincent Ng (Ph.D., Cornell) is a Professor in the Computer Science Department at the University of Texas at Dallas. He is also the director of the Machine Learning and Language Processing Laboratory in the Human Language Technology Research Institute at UT Dallas. He is currently an associate editor of the journal of Artificial Intelligence, an action editor of the Transactions of the Association for Computational Linguistics, and an associate editor of the ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP). His recent projects have focused primarily on developing unsupervised and semi-supervised machine learning techniques for natural language processing, with the goal of reducing the amount of annotated data needed to build NLP applications and process resource-scarce languages.

Jinyu Li

Microsoft Corporation, Redmond, USA

Jinyu Li received his Ph.D. degree from the Georgia Institute of Technology, Atlanta, in 2008. From 2000 to 2003, he was a Researcher in the Intel China Research Center and Research Manager in iFlytek, China. Currently, he is a Partner Applied Scientist and Technical Lead at Microsoft Corporation, Redmond, USA. He leads a team that designs and improves speech modeling algorithms and technologies to ensure industry state-of-the-art speech recognition accuracy for Microsoft. His major research interests cover several topics in speech recognition, including end-to-end modeling, deep learning, and noise robustness. He is the lead author of the book "Robust Automatic Speech Recognition -- A Bridge to Practical Applications", Academic Press, October 2015. He has been a member of the IEEE Speech and Language Processing Technical Committee since 2017. He also served as an associate editor of IEEE/ACM Transactions on Audio, Speech and Language Processing from 2015 to 2020.

Special Session

大腦與語言
Special Session: Brain and Language

徐峻賢 Chun-hsien Hsu
國立中央大學認知神經科學研究所
Institute of Cognitive Neuroscience
National Central University, Taiwan
neurolang@g.ncu.edu.tw

李佳穎 Chia-ying Lee
中央研究院語言學研究所
Institute of Linguistics
Academia Sinica, Taiwan
chiaying@gate.sinica.edu.tw

李佳霖 Chia-lin Lee
國立台灣大學語言學研究所
Graduate Institute of Linguistics
National Taiwan University, Taiwan
chialinlee@ntu.edu.tw

摘要

作為一門跨領域的科學研究項目,神經語言學擅長結合各種領域知識,主要包含腦科學、語言學理論、計算科學、認知心理學,以探討大腦如何處理人類語言。本次座談會希望將相關的研究成果回饋給科學研究社群,以鼓勵更多資訊科學、語言學領域的學者和研究生參與神經語言學的研究。本次座談會提到的語言參數來自各種不同的資源,包含中華民國計算語言學會出版的平衡語料庫、口語資料庫,以及研究者依照研究目的而建置的語料。這些資訊有助於實驗研究,以探索語言發展,以及探索一般正常母語使用者的大腦解讀語言結構的方式。徐峻賢會介紹基本的認知神經科學研究方法,以及使用腦磁圖技術研究構詞理論、口語理解的研究成果,並且分享以深度學習模型探討大腦活動特徵的研究方法。李佳穎將介紹結合行為測量、事件關聯腦電位進行詞彙知識、語言發展的研究成果,以實證研究澄清一般人對於中文的字型、字音結構常有的迷思,並且從語料庫進行語意多樣性的分析說明當前心理詞彙理論的轉變。李佳霖將分享人們處理意義的認知功能和其大腦機制,包含從語言使用的情境提取適當的語意訊息 (比如 “鋼琴” 要指涉音色、形狀、還是操作方式),以及老化造成的改變對於處理意義的認知功能之影響。

Abstract

Neurolinguistics is an interdisciplinary field that incorporates elements of neuroscience, linguistics, computational science, and cognitive psychology to explore how the brain processes human language. By presenting our research results to the scientific community, we hope to encourage future studies by researchers and graduate students in computational science and linguistics. The speech parameters mentioned in this symposium were gathered from distinct sources, including the Sinica Corpus and the COSPRO & Toolkit published by the Association for Computational Linguistics and Chinese Language Processing, as well as customized databases built by researchers for their own research purposes. These data are beneficial to studies that explore language development and language comprehension in native speakers. Dr. Chun-hsien Hsu will introduce some basic research methods in cognitive neuroscience and present research results from using magnetoencephalography (MEG) to study morphosyntactic theories and speech comprehension. Dr. Hsu will also share some approaches that employ deep learning models to study the features of brain responses to language. Dr. Chia-ying Lee will talk about combining behavioral testing and event-related potentials (ERPs) in studies of vocabulary knowledge and language development. Dr. Lee intends to clarify common misconceptions about Chinese orthography and phonetic structures, and to explain the current shift in theories of the mental lexicon through corpus analyses. Dr. Chia-lin Lee will discuss the cognitive functions and brain mechanisms involved in processing meaning, including the retrieval of the appropriate semantic information depending on the context of use (e.g., the concept of “piano” may comprise the sound of a piano, the shape of the instrument, and the different ways to play it), and the effect of aging on general cognitive function and semantic processes.

ROCLING 2021 Shared Task: Dimensional Sentiment Analysis for Chinese Students’ Self-Evaluated Comments

I. Background

Sentiment analysis has emerged as a leading technique to automatically identify affective information within texts. In sentiment analysis, affective states are generally represented using either categorical or dimensional approaches (Calvo and Kim, 2013). The categorical approach represents affective states as several discrete classes (e.g., positive, negative, neutral), while the dimensional approach represents affective states as continuous numerical values on multiple dimensions, such as valence-arousal (VA) space (Russell, 1980), as shown in Fig. 1. Valence represents the degree of pleasant and unpleasant (or positive and negative) feelings, and arousal represents the degree of excitement and calm. Based on this two-dimensional representation, any affective state can be represented as a point in the VA coordinate plane by determining the degrees of valence and arousal of given words (Wei et al., 2011; Malandrakis et al., 2013; Wang et al., 2016; Du and Zhang, 2016; Wu et al., 2017; Yu et al., 2020) or texts (Kim et al., 2010; Paltoglou et al., 2013; Goel et al., 2017; Zhu et al., 2019; Wang et al., 2019; 2020).

In 2016, we hosted a first dimensional sentiment analysis task for Chinese words (Yu et al., 2016b) at the 20th International Conference on Asian Language Processing (IALP 2016). In 2017, we extended this task to include both word- and phrase-level dimensional sentiment analysis (Yu et al., 2017). This year, we explore the sentence-level dimensional sentiment analysis task on students’ self-evaluated comments.

II. Task Description

Structured data such as attendance, homework completion, and in-class participation have been extensively studied to predict students’ learning performance. Unstructured data, such as self-evaluation comments written by students, are also a useful resource because they contain rich emotional information that can help illuminate the emotional states of students (Yu et al., 2018). Dimensional sentiment analysis is an effective technique for recognizing valence-arousal ratings from texts, indicating the degree from most negative to most positive for valence, and from most calm to most excited for arousal.

In this task, participants are asked to provide a real-valued score from 1 to 9 for both the valence and arousal dimensions of each self-evaluation comment. The input format is “sentence_id, sentence”, and the output format is “sentence_id, valence_rating, arousal_rating”. Below are the input/output formats of the example sentences.

Example 1:

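     Input: 1, 今天教了許多以前沒有學過的東西,所以上起課來很新鮮
     (English gloss: "Today's class covered many things I had not learned before, so the lesson felt fresh.")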

     Output: 1, 6.8, 5.2

Example 2:

     Input: 2, 覺得課程進度有點快,內容難以消化
     (English gloss: "I feel the course is moving a bit fast, and the content is hard to digest.")

     Output: 2, 3.0, 4.0
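The input and output line formats can be handled with a few lines of code. The following is a minimal sketch, not an official tool of the shared task: the function names `parse_input_line` and `format_output_line` are hypothetical, and both the clamping of scores to the 1 to 9 range and the one-decimal formatting are assumptions based on the examples above.

```python
from typing import Tuple


def parse_input_line(line: str) -> Tuple[int, str]:
    """Parse one 'sentence_id, sentence' line.

    Only the first comma is treated as a separator, because the
    sentence itself may contain commas.
    """
    sentence_id, sentence = line.split(",", 1)
    return int(sentence_id.strip()), sentence.strip()


def format_output_line(sentence_id: int, valence: float, arousal: float) -> str:
    """Render one 'sentence_id, valence_rating, arousal_rating' line.

    Assumption: ratings are clamped to the task's 1-9 range and
    printed with one decimal place, as in the examples.
    """
    def clamp(score: float) -> float:
        return min(9.0, max(1.0, score))

    return f"{sentence_id}, {clamp(valence):.1f}, {clamp(arousal):.1f}"


if __name__ == "__main__":
    sid, text = parse_input_line("2, 覺得課程進度有點快,內容難以消化")
    print(format_output_line(sid, 3.0, 4.0))
```

Splitting on the first comma only keeps the sentence text intact even when it contains further commas.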

III. Data

Training set

  • CVAW 4.0: including 5,512 single words annotated with valence-arousal ratings (Yu et al., 2016a).
  • CVAP 2.0: including 2,998 multi-word phrases annotated with valence-arousal ratings (Yu et al., 2017).
  • CVAT 2.0: including 2,969 sentences annotated with valence-arousal ratings (Yu et al., 2016a).


The policy of this shared task is an open test. Participating systems are allowed to use other publicly available data for this shared task, but the use of other data should be specified in the final technical report.

IV. Evaluation

The performance is evaluated by examining the difference between machine-predicted ratings and human-annotated ratings (valence and arousal are treated independently). The evaluation metrics include:

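Mean absolute error (MAE):

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| A_i - P_i \right|$$

Pearson correlation coefficient (r):

$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{A_i - \bar{A}}{\sigma_A} \right) \left( \frac{P_i - \bar{P}}{\sigma_P} \right)$$

where $A_i$ denotes the human-annotated ratings, $P_i$ denotes the machine-predicted ratings, $n$ is the number of test samples, $\bar{A}$ and $\bar{P}$ respectively denote the arithmetic means of $A$ and $P$, and $\sigma$ is the standard deviation.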
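The two metrics can be computed with the standard library alone. The sketch below assumes the usual definitions: MAE as the mean of the absolute differences between annotated and predicted ratings, and the Pearson coefficient computed with sample standard deviations (n - 1 denominators). The function names are hypothetical and this is not the organizers' evaluation script; valence and arousal would each be scored by a separate call.

```python
import math
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean of |A_i - P_i| over all test samples."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def pearson_r(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Pearson correlation using sample standard deviations (assumed)."""
    n = len(actual)
    if n != len(predicted) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    # Sample standard deviations with n - 1 in the denominator.
    sd_a = math.sqrt(sum((a - mean_a) ** 2 for a in actual) / (n - 1))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted) / (n - 1))
    if sd_a == 0 or sd_p == 0:
        raise ValueError("correlation is undefined for constant ratings")
    # Average product of the standardized scores.
    return sum(
        ((a - mean_a) / sd_a) * ((p - mean_p) / sd_p)
        for a, p in zip(actual, predicted)
    ) / (n - 1)


if __name__ == "__main__":
    human = [6.8, 3.0, 5.0]
    system = [6.0, 4.0, 5.5]
    print(mean_absolute_error(human, system), pearson_r(human, system))
```

A lower MAE and a higher r both indicate predictions that are closer to the human annotations.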

V. Important Dates

  • Release of training data: May 1, 2021
  • Release of test data: July 21, 2021
  • Testing results submission due: July 23, 2021
  • Release of evaluation results: July 26, 2021
  • System description paper due: August 9, 2021
  • Notification of Acceptance: August 27, 2021
  • Camera-ready deadline: September 10, 2021

References

  • Rafael A. Calvo, and Sunghwan Mac Kim. 2013. Emotions in text: dimensional and categorical models. Computational Intelligence, 29(3):527-543.
  • Munmun De Choudhury, Scott Counts, and Michael Gamon. 2012. Not all moods are created equal! Exploring human emotional states in social media. In Proc. of ICWSM-12, pages 66-73.
  • Steven Du and Xi Zhang. 2016. Aicyber’s system for IALP 2016 shared task: Character-enhanced word vectors and Boosted Neural Networks, in Proc. of IALP-16, pages 161–163.
  • Pranav Goel, Devang Kulshreshtha, Prayas Jain and Kaushal Kumar Shukla. 2017. Prayas at EmoInt 2017: An Ensemble of Deep Neural Architectures for Emotion Intensity Prediction in Tweets, in Proc. of WASSA-17, pages 58–65.
  • Sunghwan Mac Kim, Alessandro Valitutti, and Rafael A. Calvo. 2010. Evaluation of unsupervised emotion models to textual affect recognition. In Proc. of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text, pages 62-70.
  • N. Malandrakis, A. Potamianos, E. Iosif, and S. Narayanan. 2013. Distributional semantic models for affective text analysis. IEEE Transactions on Audio, Speech, and Language Processing, 21(11): 2379-2392.
  • Myriam Munezero, Tuomo Kakkonen, and Calkin S. Montero. 2011. Towards automatic detection of antisocial behavior from texts. In Proc. of the Workshop on Sentiment Analysis where AI meets Psychology (SAAIP) at IJCNLP-11, pages 20-27.
  • Georgios Paltoglou, Mathias Theunis, Arvid Kappas, and Mike Thelwall. 2013. Predicting emotional responses to long informal text. IEEE Trans. Affective Computing, 4(1):106-115.
  • Jie Ren and Jeffrey V. Nickerson. 2014. Online review systems: How emotional language drives sales. In Proc. of AMCIS-14.
  • James A. Russell. 1980. A circumplex model of affect. Journal of Personality and Social Psychology, 39(6):1161.
  • Wen-Li Wei, Chung-Hsien Wu, and Jen-Chun Lin. 2011. A regression approach to affective rating of Chinese words from ANEW. In Proc. of ACII-11, pages 121-131.
  • Liang-Chih Yu, Cheng-Wei Lee, Huan-Yi Pan, Chih-Yueh Chou, Po-Yao Chao, Zhi-Hong Chen, Shu-Fen Tseng, Chien-Lung Chan and K. Robert Lai. 2018. Improving early prediction of academic failure using sentiment analysis on self-evaluated comments, Journal of Computer Assisted Learning, 34(4):358-365.
  • Liang-Chih Yu, Lung-Hao Lee, Shuai Hao, Jin Wang, Yunchao He, Jun Hu, K. Robert Lai, and Xuejie Zhang. 2016a. Building Chinese affective resources in valence-arousal dimensions. In Proc. of NAACL/HLT-16, pages 540-545.
  • Liang-Chih Yu, Lung-Hao Lee, Jin Wang and Kam-Fai Wong. 2017. IJCNLP-2017 Task 2: Dimensional sentiment analysis for Chinese phrases, in Proc. of IJCNLP-17, pages 9-16.
  • Liang-Chih Yu, Lung-Hao Lee and Kam-Fai Wong. 2016b. Overview of the IALP 2016 shared task on dimensional sentiment analysis for Chinese words, in Proc. of IALP-16, pages 156-160.
  • Liang-Chih Yu, Jin Wang, K. Robert Lai and Xuejie Zhang. 2020. Pipelined neural networks for phrase-level sentiment intensity prediction, IEEE Transactions on Affective Computing, 11(3), 447-458.
  • Jin Wang, Liang-Chih Yu, K. Robert Lai and Xuejie Zhang. 2016. Community-based weighted graph model for valence-arousal prediction of affective words, IEEE/ACM Trans. Audio, Speech and Language Processing, 24(11):1957-1968.
  • Jin Wang, Liang-Chih Yu, K. Robert Lai and Xuejie Zhang. 2020. Tree-structured regional CNN-LSTM model for dimensional sentiment analysis, IEEE/ACM Transactions on Audio Speech and Language Processing, 28, 581–591.
  • Chuhan Wu, Fangzhao Wu, Yongfeng Huang, Sixing Wu and Zhigang Yuan. 2017. THU NGN at IJCNLP-2017 Task 2: Dimensional sentiment analysis for Chinese phrases with deep LSTM, in Proc. of IJCNLP-17, pages 42-52.
  • Suyang Zhu, Shoushan Li and Guodong Zhou. 2019. Adversarial attention modeling for multi-dimensional emotion regression, in Proc. of ACL-19, pages 471–480.

Organization

Honorary Chair

Jing-Yang Jou

National Central University

Conference Chairs

Lung-Hao Lee

National Central University

Chia-Hui Chang

National Central University

Kuan-Yu Chen

National Taiwan University of Science and Technology

Program Chairs

Yung-Chun Chang

Taipei Medical University

Yi-Chin Huang

National Pingtung University

Tutorial Chair

Hung-Yi Lee

National Taiwan University

Publication Chair

Jheng-Long Wu

Soochow University

Organized by

National Central University

National Taiwan University of Science and Technology

The Association for Computational Linguistics and Chinese Language Processing

Supported by