Machine Learning
This module provides an in-depth, practice-oriented introduction to machine learning, with a strong emphasis on real-world data analytics applications. It is designed to develop both conceptual understanding and practical proficiency, beginning with the mathematical foundations essential to the field and progressing through a structured exploration of core learning models and algorithmic paradigms. The curriculum covers a broad range of supervised and unsupervised learning techniques, integrating theoretical insights with hands-on implementation using contemporary software tools and datasets. Practical case studies are used to bridge the gap between theory and application, enabling the construction and evaluation of full machine learning pipelines. Advanced topics such as model interpretability, performance evaluation, and deployment strategies are introduced in the context of current industry and research practices. The module also considers ethical, legal, and societal implications, highlighting the importance of responsible innovation in the development and application of machine learning systems.
Learning Outcomes
- Learn about the key paradigms and algorithms in machine learning.
- Gain an understanding of machine learning-based data analytics using modern programming tools such as Python or R.
- Experience how machine learning and data analytics can be used in real-world applications.
- Acquire the ability to gather and synthesise information from multiple sources to aid in the systematic analysis of complex problems using machine learning tools and algorithms.
Units & Activities
- Unit III: Correlation and Regression
- Unit V: Clustering
- Unit VI: Clustering with Python (see the brief sketch after this list)
- Unit VII: Introduction to Artificial Neural Networks
- Unit VIII: Training an Artificial Neural Network
- Unit IX: Introduction to Convolutional Neural Networks
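To illustrate the kind of hands-on work covered in Unit VI, the short sketch below clusters a synthetic dataset with k-means in scikit-learn. The dataset, the choice of k = 3, and the evaluation metric are illustrative assumptions rather than material taken from the unit itself.

```python
# Minimal k-means clustering sketch (illustrative only; dataset and
# parameters are assumptions, not taken from the unit materials).
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Generate a synthetic dataset with three well-separated groups.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=42)

# Fit k-means with k=3 and assign each point to its nearest centroid.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

# The silhouette score (range -1 to 1) gives a rough sense of cluster quality.
print("Silhouette score:", round(silhouette_score(X, labels), 3))
print("Cluster centres:\n", kmeans.cluster_centers_)
```

In practice the number of clusters is rarely known in advance; comparing silhouette scores across several values of k is one simple way to choose it.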
Collaborative Discussion
Legal and Ethical Views on ANN Applications
As mentioned in my initial post, Large Language Models (LLMs) have rapidly transformed how we approach many aspects of life, including communication, writing, research, and decision-making. Recent progress has been driven by transformer architectures, which have quickly overtaken earlier deep learning approaches such as Convolutional Neural Networks (CNNs) (Sajun, Sankalpa and Zualkernan, 2024). Examples include models such as BERT and GPT-4, which have pushed the boundaries of Natural Language Processing (NLP).
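To make the architectural contrast concrete, the snippet below gives a minimal NumPy sketch of scaled dot-product attention, the core operation of transformer models. The toy dimensions and random inputs are assumptions chosen purely for illustration.

```python
# Minimal scaled dot-product attention sketch (toy sizes; illustration only).
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for a single attention head."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)           # pairwise similarity of queries and keys
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                        # weighted sum of value vectors

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8                       # assumed toy dimensions
Q = rng.normal(size=(seq_len, d_model))
K = rng.normal(size=(seq_len, d_model))
V = rng.normal(size=(seq_len, d_model))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```

Unlike a convolution, which mixes only a fixed local neighbourhood, each output row here is a weighted combination of every position in the sequence, which is what allows transformers to model long-range dependencies.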
GPT-4 is a powerful tool, having passed both the Turing test (Jones and Bergen, 2025) and the bar exam (Katz et al., 2024) with strong performance. However, uncertainty remains a major concern: despite their strengths, LLMs often generate false or nonsensical information, a phenomenon known as hallucination (Liu et al., 2024). To address this, Zhang et al. (2025) applied a retrieval-augmented generation (RAG) method, supplying the model with real examples to fill knowledge gaps and reduce fabricated or confusing code.
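To illustrate the idea in a deliberately simplified way, the sketch below implements only the retrieval half of a RAG-style pipeline: it ranks a tiny document store by TF-IDF cosine similarity and prepends the best matches to the prompt. It is not the method of Zhang et al. (2025); the documents, the query, and the absence of an actual generation call are assumptions made for illustration.

```python
# Simplified retrieval-augmented generation (RAG) sketch.
# The document store, query, and the missing generation step are
# illustrative assumptions, not the method used by Zhang et al. (2025).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "KMeans in scikit-learn requires n_clusters to be set before fitting.",
    "Convolutional layers apply learned filters across spatial dimensions.",
    "Silhouette score measures how well points fit their assigned cluster.",
]

def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query by TF-IDF cosine similarity."""
    vectoriser = TfidfVectorizer().fit(docs + [query])
    doc_vecs = vectoriser.transform(docs)
    query_vec = vectoriser.transform([query])
    scores = cosine_similarity(query_vec, doc_vecs).ravel()
    top = scores.argsort()[::-1][:k]
    return [docs[i] for i in top]

query = "How do I choose the number of clusters for KMeans?"
context = "\n".join(retrieve(query, documents))
prompt = f"Context:\n{context}\n\nQuestion: {query}"
print(prompt)  # a real system would pass this grounded prompt to an LLM
```

Grounding the prompt in retrieved text gives the model verifiable material to draw on, which is why RAG tends to reduce, though not eliminate, hallucinated answers.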
Another issue raised in the initial post, and by my peers, is bias. LLMs have been found to reflect societal biases, propagating issues such as gender and racial stereotypes (Zack et al., 2024). This problem, amplified by widespread use, highlights the urgent need for greater accountability and regulation in model training and in the retroactive correction of errors.
Despite NLP's success, limitations persist. Replicating traits such as empathy and common sense remains difficult, although improvements are underway (Rasool et al., 2025). Ethical challenges also remain: overreliance on LLMs in writing may undermine learning and academic integrity (Perkins et al., 2023), while applications in business raise questions about data privacy (Li et al., 2025).
Moving forward, AI must be implemented responsibly, including establishing a solid legal framework, but the most important factor in ensuring smooth progress is combining human oversight with machine support. Mitigating these risks is essential to harnessing AI's full potential not only in text, but also in vision and other applications.
References
- Jones, C.R. and Bergen, B.K. (2025) ‘Large Language Models Pass the Turing Test’, arXiv (Cornell University). Available at: https://arxiv.org/abs/2503.23674 (Accessed: 7 July 2025).
- Katz, D.M., Bommarito, M.J., Gao, S. and Arredondo, P. (2024) ‘GPT-4 passes the bar exam’, Philosophical Transactions of the Royal Society A 382(2270). Available at: https://doi.org/10.1098/rsta.2023.0254 (Accessed: 8 July 2025).
- Li, W., Lai, Y., Soni, S. and Saha, K. (2025) ‘Emails by LLMs: A Comparison of Language in AI-Generated and Human-Written Emails’, Proceedings of the 17th ACM Web Science Conference, pp. 391–403. Available at: https://dl.acm.org/doi/10.1145/3717867.3717872 (Accessed: 8 July 2025).
- Liu, F., Liu, Y., Shi, L., Huang, H., Wang, R., Yang, Z., Zhang, L., Li, Z. and Ma, Y. (2024) ‘Exploring and Evaluating Hallucinations in LLM-Powered Code Generation’, arXiv (Cornell University). Available at: https://doi.org/10.48550/arXiv.2404.00971 (Accessed: 8 July 2025).
- Perkins, M., Roe, J., Postma, D. and McGaughran, J. (2023) ‘Detection of GPT-4 Generated Text in Higher Education: Combining Academic Judgement and Software to Identify Generative AI Tool Misuse’, Journal of Academic Ethics 22(22). Available at: https://doi.org/10.1007/s10805-023-09492-6 (Accessed: 8 July 2025).
- Rasool, A., Shahzad, M.I., Aslam, H., Chan, V. and Arshad, M.A. (2025) ‘Emotion-Aware Embedding Fusion in Large Language Models (Flan-T5, Llama 2, DeepSeek-R1, and ChatGPT 4) for Intelligent Response Generation’, AI 6(3), p. 56. Available at: https://doi.org/10.3390/ai6030056 (Accessed: 8 July 2025).
- Sajun, A.R., Sankalpa, D. and Zualkernan, I. (2024) ‘A Historical Survey of Advances in Transformer Architectures’, Applied Sciences 14(10), p. 4316. Available at: https://doi.org/10.3390/app14104316 (Accessed: 8 July 2025).
- Zack, T., Lehman, E., Suzgun, M., Rodriguez, J.A., Celi, L.A., Gichoya, J., Jurafsky, D., Szolovits, P., Bates, D.W., Abdulnour, R., Butte, A.J. and Alsentzer, E. (2024) ‘Assessing GPT-4's potential to perpetuate biases in healthcare: a model evaluation study’, The Lancet Digital Health 6(1), pp. e12–e22. Available at: https://www.thelancet.com/journals/landig/article/PIIS2589-7500(23)00225-X/fulltext (Accessed: 8 July 2025).
- Zhang, Z., Wang, C., Wang, Y., Shi, E., Ma, Y., Zhong, W., Chen, J., Mao, M. and Zheng, Z. (2025) ‘LLM Hallucinations in Practical Code Generation: Phenomena, Mechanism, and Mitigation’, Proceedings of the ACM on Software Engineering 2(ISSTA), pp. 481–503. Available at: https://doi.org/10.1145/3728894 (Accessed: 8 July 2025).
Assignments
Development Team Project: Airbnb Business Analysis Using a Data Science Approach
Development Team Project Report (Click to Download)
Summative Assessment: Neural Network Models for Object Recognition Using Multi-Track ML Approaches
Individual Presentation Transcript (Click to Download)
Individual Presentation CIFAR-10 (Click to Download)
CIFAR-10 Code Demonstration (Click to Open)
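For context, the sketch below shows a small Keras convolutional network of the kind often used as a CIFAR-10 baseline. It is an illustrative assumption about a typical starting architecture, not the code or model from the demonstration linked above.

```python
# Illustrative CIFAR-10 baseline CNN (not the submitted demonstration code).
from tensorflow import keras
from tensorflow.keras import layers

# Load CIFAR-10: 60,000 32x32 colour images across 10 object classes.
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixels to [0, 1]

model = keras.Sequential([
    layers.Input(shape=(32, 32, 3)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# A short training run; the epoch count and batch size are arbitrary illustrative choices.
model.fit(x_train, y_train, epochs=5, batch_size=64,
          validation_data=(x_test, y_test))
```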