
Data-Mining China: Urban Village in Shenzhen

Jul 15, 2017 – Aug 12, 2017
Shenzhen, China
Research Question

The 2017 edition of the Rural China Lab shifted focus from villages in the Chinese countryside to villages within China’s cities. In the southern metropolis of Shenzhen, where the monthlong workshop took place, more than 240 “cheng zhong cun” (城中村), or “urban villages,” dot the city fabric.

The theme of the upcoming 2017-18 Bi-City Biennale of Architecture and Urbanism (Shenzhen/Hong Kong) centers on these former fishing and farming villages, which were engulfed in the rapid urbanization of the Pearl River Delta over the past 35 years. Studio X was invited to be a core contributor to the Biennale’s main exhibition, and this student workshop is the first component of that participation.


As part of their curatorial approach to the 2017 Biennale, Meng Yan and Liu Xiaodu of URBANUS envisioned the creation of a comprehensive archive of urban-village-related research, theory, and documentation. In light of that goal, our workshop asks whether it is possible to use contemporary data-mining tools to collect everything yet written on urban villages and, if so, whether machine-learning tools can be used to parse that data and draw useful conclusions from it. Colloquially speaking, can we, as Meng Yan asked, “close the book” on a generation of urban village writings?

The research tasks, and the students, were divided into three focus groups: Academic Works, Professional Journalism, and Social Media. Using Python scripting, each group tailored custom approaches to its respective data sources and endeavored to collect all available texts (in Chinese and English) that were potentially relevant. The students then had to develop ways of sorting and organizing the information so that it would be both usable to others and readily accessible to computer analysis. In parallel with this effort, they began prototyping machine-learning algorithms to create proofs of concept for ways in which the collected data might be used in the post-workshop stage of development.
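One way to picture the sorting and organizing step is the minimal Python sketch below. It is not the students' actual pipeline: the record fields, the helper names (`TextRecord`, `guess_language`, `build_records`, `to_jsonl`), and the sample items are all assumptions made for illustration. It normalizes whitespace, drops exact duplicate texts by hash, tags each text as Chinese or English with a simple character-range check, and writes one JSON object per line so that the archive is readable by both people and programs.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass
class TextRecord:
    # Hypothetical schema; the workshop's real fields are not documented.
    source_group: str  # "academic", "journalism", or "social_media"
    title: str
    body: str
    language: str      # "zh" or "en"
    digest: str        # SHA-256 of the normalized body, used to drop duplicates


def guess_language(text: str) -> str:
    """Label text "zh" if it contains any CJK Unified Ideograph, else "en"."""
    return "zh" if any("\u4e00" <= ch <= "\u9fff" for ch in text) else "en"


def build_records(raw_items):
    """Turn (group, title, body) tuples into de-duplicated records."""
    seen = set()
    records = []
    for group, title, body in raw_items:
        normalized = " ".join(body.split())  # collapse runs of whitespace
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # same text already collected from another source
        seen.add(digest)
        language = guess_language(title + normalized)
        records.append(TextRecord(group, title, normalized, language, digest))
    return records


def to_jsonl(records) -> str:
    """Serialize records as JSON Lines, keeping Chinese characters readable."""
    return "\n".join(json.dumps(asdict(r), ensure_ascii=False) for r in records)


# Invented sample items, for illustration only.
raw = [
    ("academic", "Urban villages in Shenzhen", "A study of  urban villages."),
    ("journalism", "城中村改造", "深圳的城中村正在改造。"),
    ("journalism", "Duplicate", "A study of urban villages."),
]
records = build_records(raw)
print(to_jsonl(records))
```

A hash of the normalized body only catches exact repeats; near-duplicates such as syndicated articles with small edits would need a fuzzier comparison.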

Output and Findings
As the workshop evolved and reformulated around toolkits and methodologies, the three data groups effectively became two. Although conceptually different, the first two data groups (Academic and Journalism) relied on very similar workflows and sources. With numerous potential portals, collecting an exhaustively large volume of text proved easier than expected. The critical issue that emerged was determining how important any given paper was, so students in these two groups moved quickly on to experiments with relevancy ranking and content analysis, laying a substantial foundation for the next stage of work.
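The source does not say which relevancy-ranking method the students tried. As one common baseline, the sketch below scores each document against a query using TF-IDF (term frequency weighted by smoothed inverse document frequency). The function names, the sample documents, and the query are hypothetical, and the tokenizer handles English only; Chinese text would first need a word segmenter.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into ASCII words (English only)."""
    return re.findall(r"[a-z]+", text.lower())


def rank_by_relevance(documents, query):
    """Return (score, index) pairs, highest TF-IDF score first."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    query_terms = tokenize(query)
    scores = []
    for idx, tokens in enumerate(tokenized):
        counts = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if counts[term] == 0:
                continue
            tf = counts[term] / len(tokens)
            # Smoothed IDF so that no term gets a zero or negative weight.
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
            score += tf * idf
        scores.append((score, idx))

    # Sort by descending score; ties keep the original document order.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return scores


# Invented sample documents, for illustration only.
docs = [
    "Urban village redevelopment in Shenzhen displaces migrant tenants",
    "Shenzhen stock exchange posts record gains",
    "Urban village housing and the urban village economy",
]
ranking = rank_by_relevance(docs, "urban village")
for score, idx in ranking:
    print(f"{score:.3f}  {docs[idx]}")
```

A score like this measures only how closely a text matches the query terms. Judging how important a paper is would likely need further signals, such as citation counts, which this sketch does not attempt.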
The social media group, conversely, faced an ambiguously defined scope and a myriad of disaggregated sources. Each corner of the Chinese social media universe had to be treated as its own problem to be solved, and each produced its own distinctly flavored type of information. Unlike the newspaper articles and white papers, which are inherently text, the multimedia sources here had to be intelligently harvested for their usable text. The group then explored applications of natural language processing as its primary means of sorting and analysis. In trying to draw conclusions about larger patterns, however, the students were stymied by the shifting usage and impermanent nature of social media portals, whose macro-level movements drown out finer trend lines. As the research moves forward, such issues may be addressed by compositing multiple social media portals and/or normalizing for overall usage trends so that specific insights can be gleaned.

Other 2017 workshops

Burning Man
Black Rock City, Nevada
Aug 7, 2017 – Sep 2, 2017
Building Yacoubian, A Social Biopsy of Modernist Architecture
Beirut, Lebanon
Aug 5, 2017 – Aug 19, 2017
Justice in Place: Downtown Regeneration in the Shadow of Urban Renewal in Hudson River Valley, NY
Poughkeepsie, New York
Aug 1, 2017 – Aug 18, 2017
Aging Tokyo in Japan
Tokyo, Japan
Jul 24, 2017 – Aug 4, 2017
Afro-Imaginaries in Harare, Zimbabwe
Harare, Zimbabwe
Jun 26, 2017 – Jul 13, 2017
The Environmentalist Dilemma: Reducing the economic and social costs of a low carbon city in Madrid
Madrid, Spain
Jun 10, 2017 – Jul 9, 2017
Heritage Sites of the Jordan Trail: Documenting and Interpreting 7,000 Years of Urban Living in Jordan
Jordan Trail, Jordan
Jun 13, 2017 – Jun 26, 2017