Summer Schools: CDHSummer2026 (4th Corpus and Digital Humanities Summer School 2026) (China)
Host Institution: Nanjing Normal University, Nanjing, China
Coordinating Institution: Faculty of Arts and Humanities, University of Macau; School of Humanities and Social Science, The Hong Kong University of Science and Technology; Beijing Normal-Hong Kong Baptist University; College of Information Management, Nanjing Agricultural University
Dates: 25-Jul-2026 - 04-Aug-2026
Location: Nanjing, China
Minimum Education Level: Linguistics or history students and teachers at university level
Special Qualifications: Linguistics or history students and teachers at university level with little prior knowledge of computer science.
APPLICATIONS AND SUBMISSION GUIDELINES
SUBMISSION WINDOW: May 5 - May 12, 2026 (Beijing Time)
The official language will be Chinese.
Applicants are required to complete the official application form and submit their Curriculum Vitae (CV), research background, learning objectives, and academic recommendation letters. The organizing committee will conduct a merit-based selection process. Admission results will be announced by June 1, 2026, via email. Admitted participants are required to sign a letter of commitment. Once successfully enrolled, withdrawing from the program or switching tracks is not permitted without valid, exceptional reasons.
PARTICIPANT CAPACITY AND GRADUATION CRITERIA
- CAPACITY: Each track is strictly limited to 40 on-site and 40 online participants. The total program capacity is 240 participants.
- GRADUATION ASSESSMENT: Each participant must complete an independent research project by the end of the program, such as a humanities database, a statistical analysis report, or a prototype LLM tool.
- CERTIFICATION: Participants who successfully pass the final assessment will receive an official Certificate of Completion. Outstanding projects will be awarded an Excellence Certificate.
Focus: Driven by the rapid expansion of large-scale data ecosystems and Large Language Models (LLMs), research across the humanities and social sciences is undergoing a significant transformation. Traditional disciplines, including linguistics, literature, history, and philology, are increasingly adopting computational technologies to develop innovative, data-driven methodologies.
Central to this methodological shift is the development of reliable data infrastructure built upon well-annotated corpora. Furthermore, adapting and deploying artificial intelligence, particularly fine-tuned LLMs, for humanistic inquiry has become an essential skill for the next generation of scholars. To advance interdisciplinary research, empower scholars with both humanistic depth and computational expertise, and foster global academic exchange, Nanjing Normal University is proud to launch this intensive summer program in partnership with our esteemed co-hosts.
This program is dedicated mainly (but not exclusively) to undergraduates, postgraduates, and young researchers specializing in Digital Humanities, Computational Linguistics, Chinese Language and Literature, History, Philology, and related disciplines. Scholars focusing on the intersection of the humanities with large-scale datasets and LLMs are particularly welcome.
Description:
SUMMER SCHOOL ACTIVITIES AND CURRICULUM
The curriculum is designed to provide a deep dive into four core modules: Digital Humanities Theory, Cutting-edge Technologies, Corpora Construction and Standards, and Quantitative Statistical Methods.
1. PARALLEL TRAINING WORKSHOPS
Applicants must choose exactly ONE of the three parallel tracks. Each track consists of eight systematic lectures and hands-on coding practice sessions, supported by four dedicated teaching assistants.
- TRACK A: DATABASE PROGRAMMING WORKSHOP - Instructor: Prof. Bin Li (Nanjing Normal University, China). Oriented toward beginners with no prior programming experience. Built on MySQL and PHP as the core development platform, the workshop uses classical texts, such as the "Complete Tang Poems", to teach data structuring, SQL querying, and interactive website development.
- TRACK B: LINGUISTIC STATISTICAL METHODS WORKSHOP - Instructor: Prof. Wei Shen (Central China Normal University, China). Focusing on quantitative analysis for linguistic and textual corpora, this track covers foundational statistics using SPSS software. Topics include parametric and non-parametric tests, clustering, correlation, chi-square tests, and regression models.
- TRACK C: PYTHON LARGE LANGUAGE MODEL PROGRAMMING WORKSHOP - Instructors: Prof. Dongbo Wang and Prof. Liu Liu (Nanjing Agricultural University, China). Designed for participants with foundational Python knowledge, this advanced track connects ancient texts with artificial intelligence. Using the ancient text LLM “Xunzi” as a case study, it covers prompt engineering, instruction fine-tuning, intelligent agent development, and LLM deployment in humanities contexts.
2. SUPPORTING ACADEMIC AND PRACTICAL EVENTS
- EXPERT LECTURE SERIES: 20 top-tier international and domestic scholars will deliver 20 premium academic lectures on cutting-edge algorithms and humanities research methodologies.
- THEMATIC ROUNDTABLE FORUMS: Specialized panels will host in-depth dialogues on "Opportunities and Challenges for Humanities in the LLM Era" and "The Future Trajectory of Linguistics and Digital Humanities".
- CULTURAL EXCURSION AND SEMINARS: Digital humanities field trips in the historical city of Nanjing, complemented by academic networking seminar sessions.
Linguistic Field(s): Applied Linguistics; Computational Linguistics; Text/Corpus Linguistics
Registration: 05-May-2026 to 12-May-2026
Apply by Email: dhbase2026@126.com
Apply on the web: https://v.wjx.cn/vm/rwUUUJ7.aspx
Registration Instructions: The lectures and courses will be given in Chinese. The courses are free of charge. Participants cover their own travel, accommodation, and catering expenses.