ZJU Uncovers Secrets Behind Tang and Song Poems With Big Data

Editor: Yu Liu     Time: 2019-08-28

What kind of sparks fly when data visualization meets the classical poetry of the Tang and Song dynasties?

Recently, two data visualization projects, “Immersing in Song Ci, Singing Tenderly and Seeking Forever” (hereinafter referred to as Song Poems) and “Illustration of Female Tang Poets” (hereinafter referred to as Tang Poems), became a hot topic on WeChat Moments. The projects were created by the CAD&CG State Key Laboratory of Zhejiang University and the Xinhuanet Data and Information Department and took half a year to complete.

The team analyzed 55,000 Tang poems and 21,000 Song poems, and the big data analysis uncovered a great deal of hidden information.

Big data shows Hangzhou was the place the famous poet Su Dongpo visited most

The reporter opened Song Poems in the browser and took a look at this project.

There is plenty of white space, with captions in an ink-wash style, creating a simple and elegant color scheme. The overall style of the page resembles a traditional Chinese landscape painting.

Song Poems took its sample poems from “The Complete Collection of Song Poems”. The team analyzed 21,000 poems, 1,330 poets, and 1,300 words to complete the interpretation. Tang Poems, for its part, performed data analysis on 55,000 Tang poems.

The reporter observed that the web version of Song Poems is composed of migration maps and life maps of the poets. It also includes a word cloud, a sentiment graph and a rhythm graph of Song poems.

The reporter selected the life map of the famous poet Su Shi (Su Dongpo). The illustration showed a line that was flat at the start, steep in the middle and flat at the end, representing the fluctuations in Su Shi’s life.

On the migration map linked with the life map, brown points of different sizes are placed on the map and connected by lines. The size of each point is determined by the number of times Su Shi visited that place. Judging from the chart, Su Shi travelled across almost the entire territory of the Song Dynasty. The biggest point is Hangzhou, which shows that Hangzhou was the place he visited the most.
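The sizing rule described above can be illustrated with a minimal Python sketch. It is not the team's code: the function name `marker_radii`, the `max_radius` parameter and the itinerary are all assumptions made up for illustration, and scaling the radius by the square root of the visit count (so that a point's area is proportional to visits) is just one common design choice.

```python
import math
from collections import Counter

def marker_radii(itinerary, max_radius=30.0):
    """Map each place to a point radius based on how often it was visited.

    The radius grows with the square root of the visit count, so the
    point's area is proportional to the number of visits.
    """
    visits = Counter(itinerary)
    if not visits:
        return {}
    top = max(visits.values())
    return {place: max_radius * math.sqrt(n / top) for place, n in visits.items()}

# Invented itinerary for illustration only; not the project's data.
itinerary = ["Meishan", "Kaifeng", "Hangzhou", "Mizhou",
             "Hangzhou", "Huangzhou", "Hangzhou"]

radii = marker_radii(itinerary)
largest = max(radii, key=radii.get)  # the place drawn with the biggest point
```

With this toy itinerary, the most visited place receives the full `max_radius` and every other place is drawn proportionally smaller.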

The word “east wind” appeared 1,264 times in Song poems

The next section is a word cloud for Song Poems. The word cloud tracks the frequency of words used in the Song poems. More frequent words are given a larger font size, a deeper font color and a more central position in the word cloud. The reporter saw that the word in the middle was “东风” (east wind), which was used 1,264 times. It was followed by “何处” (where), which was used a total of 1,157 times. Ranked third was “人间” (the world), which appeared a total of 1,061 times in the Song poems.
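As a rough illustration of how such a word cloud can be driven by counts, the Python sketch below tallies a fixed vocabulary over a few lines of verse and scales each count linearly into a font size. The helper names `count_terms` and `font_size`, the size range and the three-line corpus are assumptions for this example; the article does not say how the team segmented the text or mapped counts to sizes.

```python
from collections import Counter

def count_terms(poems, vocabulary):
    """Count how often each vocabulary term occurs across all poems."""
    counts = Counter()
    for text in poems:
        for term in vocabulary:
            counts[term] += text.count(term)
    return counts

def font_size(count, max_count, min_size=12, max_size=72):
    """Scale a count linearly into a font size; the top term gets max_size."""
    if max_count == 0:
        return min_size
    return min_size + (max_size - min_size) * count / max_count

# Toy corpus for illustration only; not the project's data.
poems = [
    "东风夜放花千树",
    "小楼昨夜又东风",
    "何处望神州",
]
vocabulary = ["东风", "何处", "人间"]

counts = count_terms(poems, vocabulary)
ranked = counts.most_common()  # terms ordered from most to least frequent
sizes = {term: font_size(n, ranked[0][1]) for term, n in ranked}
```

The same ranking could also drive font color and distance from the center, the other two properties the word cloud varies with frequency.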

“We have studied Song poems before, but it was more of a separate analysis of individual poems. This research allowed us to search for information hidden behind all the poems from the perspective of big data,” CAD&CG State Key Laboratory Design Director Zhang Wei told the reporter.

Song Poems and Tang Poems, produced by the Zhejiang University team and the Xinhuanet Data and Information Department, were presented on the web after half a year of research. Both are rich in information; of the two, Song Poems is the more informative and complex production.

“In the media industry, visual data projects with such mature traditional cultural themes are still very rare,” Zhang Wei said, adding that this is the first attempt by the Zhejiang University Visualization Team in this particular field.

The word “wine” in poetry implies nostalgia and happiness

The team not only analyzed the information on the surface of “The Complete Collection of Song Poems”, but also deeply explored the meaning of its imagery and integrated it into a sentiment graph.

The sentiment graph covers 30 common words such as “moon” and “wine” and 24 prolific poets such as Su Shi and Li Qingzhao. Big data analysis derives the sentiments expressed by these image words and divides them into five categories: joy, anger, sorrow, happiness and memory. The proportion of each emotion carried by a given word is then displayed on a pie chart.

For example, when a poet writes of wine, nearly half the time the mood is nostalgia and contemplation. Lu You’s “红酥手,黄滕酒” and Yan Shu’s “一曲新词酒一杯,去年天气旧亭台” both recall people and times gone by. Another 30% of passages that mention wine are lighthearted, such as “日日深杯酒满,朝朝小圃花开” by Zhu Dunru.

So, how does big data technology understand the underlying sentiments of these poets?

First, the team needed to sort out the typical topoi that represent a particular emotion. Zhang Wei said the team invited Dr. Hu Qiuyan from Zhejiang University School of Literature to verify their analysis in order to be more precise.

Pan Rusheng, who was responsible for data analysis and front-end development, told reporters that they used big data to analyze the context of each word, calculated the probability that the word belongs to a certain sentiment based on the typical topoi, and from that derived the sentiment the poet most likely wanted to express.

To put it simply with an example, Zhang Zai’s poem 题兴龙寺老柏院 reads: 南邻北舍牡丹开,年少寻芳日几回。惟有君家老柏树,春风来似不曾来. Here “柏树” (cypress) expresses a feeling of reminiscence. Linking this with its context, we can conclude that “牡丹” (peony) and “春风” (spring breeze) also convey nostalgia in this poem.
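The idea of inferring a word's sentiment from the typical imagery around it can be sketched in Python as simple co-occurrence voting. This is only an illustration under stated assumptions, not the team's actual model: the seed lexicon `SEEDS`, its sentiment labels and the function name `sentiment_distribution` are hypothetical, and the corpus is just two passages quoted in this article.

```python
from collections import Counter

# Hypothetical seed lexicon: typical images mapped to one sentiment each.
SEEDS = {"柏树": "memory", "花开": "joy"}

# Two passages quoted in this article, used here as a toy corpus.
POEMS = [
    "南邻北舍牡丹开,年少寻芳日几回。惟有君家老柏树,春风来似不曾来",
    "日日深杯酒满,朝朝小圃花开",
]

def sentiment_distribution(word, poems, seeds):
    """Estimate P(sentiment | word) from the seed images that share a poem with it."""
    votes = Counter()
    for text in poems:
        if word not in text:
            continue
        for image, sentiment in seeds.items():
            # Each seed image found in the same poem casts one vote.
            if image != word and image in text:
                votes[sentiment] += 1
    total = sum(votes.values())
    if total == 0:
        return {}  # no evidence for this word in the corpus
    return {sentiment: n / total for sentiment, n in votes.items()}

spring_breeze = sentiment_distribution("春风", POEMS, SEEDS)
most_likely = max(spring_breeze, key=spring_breeze.get)
```

Over a full corpus, the votes for one word would spread across several sentiments, which is the kind of proportion a pie chart can display.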

Visualized data representation makes Tang and Song poems easy to understand

When asked about the difficulty of this research, Zhang Wei first mentioned the choice of chart. In order to find the most appropriate way to present data, the team experimented with multiple charts.

The right chart needs to be visually appealing, convey information intuitively and be smooth to interact with. Finding it was a tough job that took the research team a lot of effort. According to Zhang Wei, the team first tried a mountain peak view to express the rhythm of the words, but overlapping images spoiled the look of the chart and were awkward to place, so the team had to explore an alternative.

“People are visual, so visual forms of popular science can make difficult ancient poetry easy to understand. It takes the dull stereotype out of lecturing and does a great job promoting traditional culture,” said Chen Wei, Deputy Dean of Zhejiang University School of Computing Science.

This research was meant to promote science to the general public, so the poems analyzed were drawn from entry-level Tang and Song poetry. Zhang Wei said: “This product wasn’t made to draw a certain conclusion, but to provide people with an interesting tool for exploring Tang and Song poetry.” More interesting conclusions are therefore left for readers to discover.


(Chief reporter: Wang Zhan; Correspondents: Wang Weijia, Liu Sumeng)


Copyright © 2018 College of Computer Science and Technology, Zhejiang University