Zhang Yongwei (CASS IL, Peking)

From Conceptual Model to Data Model in Multimedia & Multimodal Corpus Integration


Time: December 10, 2013, 11:00


The presentation introduces my research group's work on multimedia & multimodal corpus integration, for which Professor Gu (2006, 2009) has formulated a conceptual model. The current task is to build a data model from that conceptual model and to integrate the different data types. What makes this research demanding is that our corpus, the Spoken Chinese Corpus of Situated Discourse (SCCSD), involves three basic types of data: orthographic transcripts, audio streams, and video streams, all recorded in everyday scenarios rather than in sound-proof studios. In other words, the integration has to cope with heterogeneous data that are sometimes considered "impure" or "noisy". With this in mind, my work plan falls into four parts: (1) to present the conceptual and data models for the different multimedia types; (2) to identify where the problems lie; (3) to introduce the procedure for constructing a multimedia & multimodal corpus; and (4) to show some applications based on the multimedia & multimodal corpus.