Introduction to Data Science in Python, 21/22 May (online)

Date: Thursday 21st May 9:30am – 12:30pm &
Friday 22nd May 9:30am – 12:30pm (this session will now run over two mornings).

Venue: Please note: This event will now take place online.

This is a training and capacity building event organised by the Consumer Data Research Centre (CDRC) in conjunction with the National Centre for Research Methods (NCRM), both ESRC-funded research projects.

This online course will introduce you to the nascent field of Data Science using Python, an industry-standard programming language. We will cover the key steps involved in solving practical problems with data, from manipulation and processing to visualisation and modelling.

This session will be held on Zoom and run over the course of two mornings – Thursday 21st May at 9:30am – 12:30pm & Friday 22nd May at 9:30am – 12:30pm.

All participants will be emailed further details and guidance about this online session.

These topics will be explored from a “hands-on” perspective using a modern Python stack (e.g. pandas, seaborn, scikit-learn), and examples using real-world spatial data. We will start with an overview of the main ways to access and manipulate data in Python. Then we will move on to visualisation, learning to create figures that allow you to better understand your data. The course will then move into unsupervised learning, with a K-Means example; and then on to supervised learning, covering linear regression and random forests, which will allow us to illustrate the challenge of overfitting and thus motivate cross-validation. We will finish the course with some time for questions and to work on your own data.
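
To give a flavour of the later topics, here is a minimal sketch of how cross-validation can expose overfitting. It is not taken from the course materials: the synthetic data, the model settings and the variable names are all assumptions chosen for illustration, and it presumes numpy and scikit-learn are installed.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic data (an assumption for illustration): the target depends
# linearly on the first feature only, plus Gaussian noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] + rng.normal(scale=1.0, size=200)

models = {
    "linear regression": LinearRegression(),
    "random forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

results = {}
for name, model in models.items():
    # R^2 on the same data the model was fitted to (an optimistic figure).
    train_r2 = model.fit(X, y).score(X, y)
    # Mean R^2 over 5 held-out folds (closer to performance on unseen data).
    cv_r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    results[name] = (train_r2, cv_r2)
    print(f"{name}: train R^2 = {train_r2:.2f}, 5-fold CV R^2 = {cv_r2:.2f}")
```

In a setup like this, the more flexible model would be expected to score noticeably better on the data it was trained on than on the held-out folds. That gap is the symptom of overfitting that cross-validation is designed to reveal.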

Course Tutor

Dr Dani Arribas-Bel is interested in computers, cities, and data. He is a senior lecturer in Geographic Data Science at the Department of Geography and Planning of the University of Liverpool. Prior to his appointment in 2015, Dani held positions at the University of Birmingham, the VU University in Amsterdam, Arizona State University, and Universidad de Zaragoza. He holds honorary positions at the University of Chicago’s Center for Spatial Data Science, the Center for Geospatial Sciences of the University of California Riverside, and the Smart Cities Chair of Universitat de Barcelona. Dani’s research interests combine urban studies, computational methods and new forms of data. His research has been published in journals such as PLOS ONE, Demography, Geographical Analysis, and Environment and Planning (A/B/C), and he is also a member of the development team of PySAL, the Python library for spatial analysis. Dani currently serves as co-editor of the journals “Environment and Planning B – Urban Analytics & City Science” and the “Journal of the Royal Statistical Society Series A – Statistics in Society”, and chairs the Quantitative Methods Research Group of the Royal Geographical Society.

Target Audience

The course is introductory and, as such, it will provide a panoramic overview of several concepts and techniques. Basic statistical notions and some programming experience are not strictly required but will be helpful.


Fees

£70 – UK-registered students
£120 – staff at UK academic institutions and research centres, UK-registered charity and voluntary organisations, and staff in the public sector and government
£300 – all other participants including staff from commercial organisations

Don’t forget, 40% off for SLA members.

Further information

Please note: This session will not be affected by the UCU strike.

All fees include event materials, lunch, morning and afternoon tea. They do not include travel and accommodation costs.

Numbers on the course are limited and allocated on a first come, first served basis. We reserve the right to cancel the course if the minimum number is not met. Full refunds will be given if the course is cancelled.

For further information about the course, please email ke.coordinator@sbs.ox.ac.uk