Macquarie University - Data Journalism
In this course, students are introduced to Python as a powerful tool for data-driven storytelling. Python's versatility allows media practitioners to conduct initial analyses on a wide range of socio-cultural phenomena, transforming raw data into compelling narratives. Throughout this course, students learn the fundamental concepts and techniques of using Python to analyze diverse datasets common in modern journalism. They gain hands-on experience in data collection, cleaning, and basic statistical analysis using Python libraries such as pandas. By the end of the course, students will have the foundational skills to use Python for exploring datasets, identifying trends, and uncovering stories that might otherwise remain hidden in the numbers.
You can run all the notebooks for this course using Colab (link above). If you prefer to work locally, I recommend Jupyter notebooks, which are easily accessible through Anaconda.
Week | Topic | Materials |
---|---|---|
0 | Preparation | notebook |
1 | Introduction to Data Journalism | slides |
2 | Press Freedom, FOI, Open Data, and Ethics + Foundation of Statistics | slides |
3 | Foundation of Statistics + Code Foundations | slides + notebook |
4 | Data Gathering and Cleaning | slides + notebook |
5 | Data Analysis: Exploring your data | slides + notebook |
6 | Combining Data | slides + notebook |
7 | Visualizing Data | slides |
8 | OSINT, Geojournalism, and Maps | slides |
9 | Crowdsourcing, Social Media Data, and Platform Journalism | slides |
10 | Due Diligence and Follow the Money | slides |
11 | AI and Journalism | slides |
Extra | Project Management - Scrum | slides |
The datasets used in this course can be found here.
Students can also use these datasets (https://github.com/mromanello/ADA-DHOxSS/tree/master/data) from the Applied Data Analysis course to test their knowledge.
See the assignments folder.
-
Gray, J., Chambers, L., & Bounegru, L. (2017). The data journalism handbook. O’Reilly. https://doi.org/10.5281/zenodo.1281347
-
D’Ignazio, C., & Klein, L. F. (2020). Data feminism. MIT Press.
-
Boyd, D., & Crawford, K. (2012). Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication & Society, 15(5), 662–679. https://doi.org/10.1080/1369118X.2012.678878
-
Kitchin, R. (2014). The data revolution: Big data, open data, data infrastructures and their consequences (1st ed.). SAGE Publications. https://doi.org/10.4135/9781473909472
-
D’Ignazio, C. (n.d.). Putting data back into context. DataJournalism.com. https://datajournalism.com/read/longreads/putting-data-back-into-context
-
Australian Bureau of Statistics. (n.d.). https://www.abs.gov.au
-
Understanding statistics: Helping you make sense of statistics. (2020, Autumn).
-
Qiu, J. L. (2015). Reflections on big data: ‘Just because it is accessible does not make it ethical.’ Media, Culture & Society, 37(7), 1089–1094. https://doi.org/10.1177/0163443715594104
-
Butt, C. (2019, November 5). Everything you always wanted to know about statistics (but were afraid to ask). Sydney Morning Herald. https://www.smh.com.au/national/everything-you-always-wanted-to-know-about-statistics-but-were-afraid-to-ask-20191105-p537ir.html
-
Gray, J., & Bounegru, L. (Eds.). (2021). The Data Journalism Handbook: Towards a Critical Data Practice. Amsterdam University Press. https://doi.org/10.5117/9789462989511
-
Kennedy, H. (2021). Developing Visualisation Literacy. University of Sheffield. http://seeingdata.org/developing-visualisation-literacy/
-
Forrest, J. (2019). None of us are free if some of us are not: Catherine D’Ignazio on Data Feminism. https://medium.com/nightingale/catherine-dignazio-on-data-feminism-ce3b3c65f04a
-
Krause, H. (2019, January 21). An Introduction to Data Biography. We All Count. https://weallcount.com/2019/01/21/an-introduction-to-the-data-biography/
-
McCandless, D. (2012). Information is beautiful. William Collins.
-
McCandless, D. (2014). Knowledge is beautiful. Harper Design.
-
Rogers, S. (2014, October 16). The five Ws of data journalism. https://simonrogers.net/2014/10/16/the-five-ws-of-data-journalism/
-
Bradshaw, P. (2011, July 7). The inverted pyramid of data journalism. https://onlinejournalismblog.com/2011/07/07/the-inverted-pyramid-of-data-journalism/
-
Bradshaw, P. (2011, July 13). 6 ways of communicating data journalism (The inverted pyramid of data journalism part 2). https://onlinejournalismblog.com/2011/07/13/the-inverted-pyramid-of-data-journalism-part-2-6-ways-of-communicating-data-journalism/
-
Bradshaw, P. (2024, January 4). Updating the Inverted Pyramid of Data Journalism. https://onlinejournalismblog.com/2024/01/04/ive-updated-the-inverted-pyramid-of-data-journalism-and-brought-together-resources-for-every-stage/
-
Coddington, M. (2015). Clarifying Journalism’s Quantitative Turn: A typology for evaluating data journalism, computational journalism, and computer-assisted reporting. Digital Journalism, 3(3), 331–348. https://doi.org/10.1080/21670811.2014.976400
-
Coddington, M. (2019). Defining and Mapping Data Journalism and Computational Journalism: A review of typologies and themes. In B. Franklin & S. A. Eldridge (Eds.), The Routledge Handbook of Developments in Digital Journalism Studies (1st ed., pp. 225–236). Routledge. https://doi.org/10.4324/9781315270449-18
-
Wright, S., & Doyle, K. (2019). The Evolution of Data Journalism: A Case Study of Australia. Journalism Studies (London, England), 20(13), 1811–1827. https://doi.org/10.1080/1461670X.2018.1539343
-
Heravi, B. R., & Lorenz, M. (2020). Data Journalism Practices Globally: Skills, Education, Opportunities, and Values. Journalism and Media, 1(1), 26–40. https://doi.org/10.3390/journalmedia1010003
-
Ojo, A., & Heravi, B. (2018). Patterns in Award Winning Data Storytelling: Story Types, Enabling Tools and Competences. Digital Journalism, 6(6), 693–718. https://doi.org/10.1080/21670811.2017.1403291
-
Borges-Rey, E. (2016). Unravelling Data Journalism: A study of data journalism practice in British newsrooms. Journalism Practice, 10(7), 833–843. https://doi.org/10.1080/17512786.2016.1159921
-
Fink, K., & Anderson, C. W. (2015). Data Journalism in the United States: Beyond the “usual suspects.” Journalism Studies (London, England), 16(4), 467–481. https://doi.org/10.1080/1461670X.2014.939852
-
Baack, S. (2015). Datafication and empowerment: How the open data movement re-articulates notions of democracy, participation, and journalism. Big Data & Society, 2(2). https://doi.org/10.1177/2053951715594634
-
Coatney, C. (2023). Data investigations: A humanitarian turn. In C. Coatney (Ed.), Investigative Journalism in Changing Times (1st ed., Vol. 1, pp. 135–149). Routledge. https://doi.org/10.4324/9781003279808-9
-
Shepherd, E. (2015). Freedom of Information, Right to Access Information, Open Data: Who is at the Table? Round Table (London), 104(6), 715–726. https://doi.org/10.1080/00358533.2015.1112101
-
Ureta, A. L., & Rodríguez, E. M. F. (2021). The potential of investigative data journalism to reshape professional culture and values. A study of bellwether transnational projects. Communication & Society, 34(1), 41–56. https://doi.org/10.15581/003.34.1.41-56
-
RSF. (2024). 2024 World Press Freedom Index – journalism under political pressure. https://rsf.org/en/2024-world-press-freedom-index-journalism-under-political-pressure
-
Gerli, M., Mazzoni, M., & Mincigrucci, R. (2018). Constraints and limitations of investigative journalism in Hungary, Italy, Latvia and Romania. European Journal of Communication (London), 33(1), 22–36. https://doi.org/10.1177/0267323117750672
-
Stubbs, R. (2008). Freedom of Information and Democracy in Australia and Beyond. Australian Journal of Political Science, 43(4), 667–684. https://doi.org/10.1080/10361140802429270
-
Ienca, M., & Vayena, E. (2021). Ethical requirements for responsible research with hacked data. Nature Machine Intelligence, 3(9), 744–748. https://doi.org/10.1038/s42256-021-00389-w
-
Liebes, T., & Blum-Kulka, S. (2004). It Takes Two to Blow the Whistle: Do Journalists Control the Outbreak of Scandal? The American Behavioral Scientist (Beverly Hills), 47(9), 1153–1170. https://doi.org/10.1177/0002764203262341
-
Canning, J. (2014). Statistics for the Humanities. http://statisticsforhumanities.net/book/
-
Karsdorp, F., Kestemont, M., & Riddell, A. (2021). Humanities Data Analysis: Case Studies with Python. https://www.humanitiesdataanalysis.org/index.html
-
Walsh, M. (2021). Introduction to Cultural Analytics & Python. https://melaniewalsh.github.io/Intro-Cultural-Analytics/welcome.html
-
Tong, J., & Zuo, L. (2021). The Inapplicability of Objectivity: Understanding the Work of Data Journalism. Journalism Practice, 15(2), 153–169. https://doi.org/10.1080/17512786.2019.1698974
-
D’Ignazio, C. (n.d.). Putting data back into context. https://datajournalism.com/read/longreads/putting-data-back-into-context
-
Krause, H. (2019, January 21). An Introduction to Data Biography. https://weallcount.com/2019/01/21/an-introduction-to-the-data-biography/
-
Krause, H. (n.d.). Data Biographies: Getting to Know Your Data. https://gijn.org/stories/data-biographies-getting-to-know-your-data/
-
Krause, H. (n.d.). Understanding Data through Data Biographies [Video]. https://youtu.be/yCuRQc4xuhA
-
Bradshaw, P. (n.d.). Python for journalism: scraping, selecting and storing data [Video]. https://www.youtube.com/watch?v=ZESEopsqjeY
-
D’Ignazio, C., & Klein, L. F. (2020). Data feminism. The MIT Press.
-
Newman, D. A. (2014). Missing Data: Five Practical Guidelines. Organizational Research Methods, 17(4), 372–411. https://doi.org/10.1177/1094428114548590
-
Chao, Z. (2024). Insight into Reality: Data Collection and Analysis. In The Power of Data (1st ed., pp. 59–86). Routledge. https://doi.org/10.4324/9781003426141-4
-
Jeppesen, S. (2023). Radical Data Journalism. In E. Siapera, S. Farrell, & G. Souvlis (Eds.), Radical Journalism (1st ed., Vol. 1, pp. 115–134). Routledge. https://doi.org/10.4324/9781003221784-8
-
de-Lima-Santos, M.-F., Schapals, A. K., & Bruns, A. (2021). Out-of-the-box versus in-house tools: How are they affecting data journalism in Australia? Media International Australia Incorporating Culture & Policy, 181(1), 152–166. https://doi.org/10.1177/1329878X20961569
-
de-Lima-Santos, M.-F., & Mesquita, L. (2023). Data Journalism in favela: Made by, for, and about Forgotten and Marginalized Communities. Journalism Practice, 17(1), 108–126. https://doi.org/10.1080/17512786.2021.1922301
-
Kusleika, D. (2021). Data Visualization with Excel Dashboards and Reports. John Wiley & Sons, Incorporated.
-
Ryan, L. (2018). Visual data storytelling with Tableau (1st ed.). Addison-Wesley Professional.
-
Tableau Tutorials. (n.d.).
-
Elega, A. A., de-Lima-Santos, M.-F., & Mesquita, L. (2024). Geojournalism, data journalism and crowdsourcing: The case of Eco-Nai+ in Nigeria. Journalism (London, England), 25(7), 1538–1558. https://doi.org/10.1177/14648849231225324
-
Open-Source Journalism: Innovations and Ethical Challenges. (2024). In Handbook of Digital Journalism. Springer. https://ebookcentral.proquest.com/lib/mqu/reader.action?docID=31318930&ppg=425
-
Salovaara, I. (2016). Participatory Maps: Digital cartographies and the new ecology of journalism. Digital Journalism, 4(7), 827–837. https://doi.org/10.1080/21670811.2016.1173519
-
Garreton, M., Morini, F., Paz Moyano, D., Grün, G. ‐C., Parra, D., & Dörk, M. (2023). Data Stories of Water: Studying the Communicative Role of Data Visualizations within Long‐form Journalism. Computer Graphics Forum, 42(3), 99–110. https://doi.org/10.1111/cgf.14815
-
Napoli, P. M. (2021). The platform beat: Algorithmic watchdogs in the disinformation age. European Journal of Communication (London), 36(4), 376–390. https://doi.org/10.1177/02673231211028359
-
Palomo, B., Teruel, L., & Blanco-Castilla, E. (2019). Data Journalism Projects Based on User-Generated Content. How La Nacion Data Transforms Active Audience into Staff. Digital Journalism, 7(9), 1270–1288. https://doi.org/10.1080/21670811.2019.1626257
-
Zubiaga, A. (2019). Mining social media for newsgathering: A review. Online Social Networks and Media, 13, 100049. https://doi.org/10.1016/j.osnem.2019.100049
-
Zubiaga, A., Heravi, B., An, J., & Kwak, H. (2019). Social media mining for journalism. Online Information Review, 43(1), 2–6. https://doi.org/10.1108/OIR-02-2019-395
-
UNESCO. (n.d.). Guidelines for understanding and implementing journalistic due diligence. https://www.medijskisavjet.me/images/sampledata/dokumenti/Guidelines%20for%20understanding%20and%20implementing%20due%20journalistic%20diligence.pdf
-
Following the Money Trail: Investigative Data Journalism. (2019). In Digital Investigative Journalism. Springer International Publishing AG. https://ebookcentral.proquest.com/lib/mqu/reader.action?docID=5626867&ppg=83
-
Showkat, D., & Baumer, E. P. S. (2021). Where Do Stories Come From? Examining the Exploration Process in Investigative Data Journalism. Proceedings of the ACM on Human-Computer Interaction, 5(CSCW2), 1–31. https://doi.org/10.1145/3479534
-
de-Lima-Santos, M.-F., & Salaverría, R. (2021). From Data Journalism to Artificial Intelligence: Challenges Faced by La Nación in Implementing Computer Vision in News Reporting. Palabra-Clave, 24(3), 1–40. https://doi.org/10.5294/pacla.2021.24.3.7
-
Broussard, M., Diakopoulos, N., Guzman, A. L., Abebe, R., Dupagne, M., & Chuan, C.-H. (2019). Artificial Intelligence and Journalism. Journalism & Mass Communication Quarterly, 96(3), 673–695. https://doi.org/10.1177/1077699019859901
-
de-Lima-Santos, M.-F., & Ceron, W. (2022). Artificial Intelligence in News Media: Current Perceptions and Future Outlook. Journalism and Media, 3(1), 13–26. https://doi.org/10.3390/journalmedia3010002
-
Stray, J. (2019). Making Artificial Intelligence Work for Investigative Journalism. Digital Journalism, 7(8), 1076–1097. https://doi.org/10.1080/21670811.2019.1630289
-
Heaven, W. D. (2024, July 10). What is AI? MIT Technology Review. https://www.technologyreview.com/2024/07/10/1094475/what-is-artificial-intelligence-ai-definitive-guide/
-
McKinsey Insights. (n.d.). What is AI (artificial intelligence)? https://www.mckinsey.com/featured-insights/mckinsey-explainers/what-is-ai#/
-
McKinsey Insights. (n.d.). What is generative AI? https://www.mckinsey.com/featured-insights/mckinsey-explainers/what-is-generative-ai
A good companion for this course is John Canning, Statistics for the Humanities, 2014. Also recommended are Melanie Walsh, Introduction to Cultural Analytics & Python, 2021 and Karsdorp, Kestemont, Riddell, Humanities Data Analysis: Case Studies with Python, 2021.
Mathias-Felipe de-Lima-Santos (Ph.D.) is a Lecturer (aka Assistant Professor) at Macquarie University. He is also a research associate in the Digital Media and Society Observatory (DMSO) at the Federal University of São Paulo (Unifesp), Brazil. Previously, he was a postdoctoral researcher in the Human(e) AI and AI4Media projects at the University of Amsterdam, Netherlands, and a researcher at the University of Navarra, Spain, under the JOLT project, a Marie Skłodowska-Curie European Training Network funded by the European Commission’s Horizon 2020. He was also a Visiting Researcher at the Queensland University of Technology (QUT) in Brisbane, Australia. Mathias-Felipe is co-editor of the book “Journalism, Data, and Technology in Latin America” and the upcoming two-volume book “Fact-Checking in the Global South”, both published by Palgrave Macmillan. He is currently a member of the editorial board of Digital Journalism. His research interests include the changing nature of communications driven by technological innovations, particularly in journalism, media, and online social networks.
Everything in this repository which is not already attributed to someone else is released under CC BY 4.0.