Data Science, Artificial Intelligence, Data Analysis, Data Visualization, Business Intelligence
-Projects -Competencies -Education -Work Experience -Research Publications
Data Science Projects (Python, TensorFlow, Keras, Scikit-learn)
Text Classification for Hate Speech Detection
Hate Speech Detection Using Emotional Embedding Features
Semi-Supervised Hate Target Detection
Data Visualization Projects (SQL, Power BI, Excel)
Income & Expenditure Dashboard
Myntra Product Catalog Project
Top UK YouTubers in 2024 Project
Data Cleaning & Exploration Projects (SQL)
Covid Vaccination Data Exploration
- Data Analysis: SQL, NumPy, Pandas, Scikit-learn, Excel, Power Automate
- Data Visualization: Power BI, Tableau, Looker, Matplotlib, Seaborn
- Data Science/Machine Learning/AI: TensorFlow/Keras, PyTorch, Azure OpenAI Service
- Data Engineering: Snowflake, Databricks, MS Azure, Google Cloud Platform
- Programming: Python, DAX
- Version Control: GitHub/GitLab
- OS: Linux, Windows
- Ph.D. Artificial Intelligence/Computer Science: School of Computer Science and Engineering, Victoria University of Wellington, Wellington, New Zealand
- M.Eng. Digital Electronic Engineering: Department of Electronic Engineering, University of Nigeria, Nsukka, Enugu, Nigeria
- B.Eng. Electronic Engineering: Department of Electronic Engineering, University of Nigeria, Nsukka, Enugu, Nigeria
Bank of Montreal | Toronto, Ontario, Canada (Aug 2024 – Present)
- Built a transformer-based abstractive text summarization model in Python to provide short descriptions of over 5000 third-party supplier services, shortening each description from an average of 800 characters to fewer than 250 characters and reducing the time spent from over 3 months to 5 minutes.
- Developed ML classification models (Random Forest, Logistic Regression) in Python to categorize suppliers into risk levels based on historical data, financial health, and past performance, enhancing supplier risk mitigation strategies and reducing supplier-related operational disruptions by 25%.
- Led the data validation, quality checks and reconciliation of 5k+ third-party supplier records, consolidating diverse data sources into a single, centralized repository using SQL and Excel, resulting in a 65% reduction in data discrepancies and improved data accuracy.
- Designed visual reports in Power BI on the third-party supplier data to determine systemic concentration risk across supplier, service and global geographical dimensions for contracts supporting critical operations, enabling the procurement team to prioritize risk mitigation efforts and improve supplier diversification strategies.
- Automated multiple repetitive multi-system processes for issue and risk management using Power Automate, saving the procurement team over 300 hours of labour-intensive manual entry.
Ernst & Young LLP | Wellington, New Zealand (Jun 2023 – Mar 2024)
- Built a Gen AI-based automatic newsletter generation application on data from emails and websites using GPT-4 from Azure OpenAI Service, reducing the manual writing effort by 70% and allowing the marketing team to focus on strategy and design.
- Developed a predictive model of customer water usage patterns, with 85% accuracy, to analyze past consumption data and predict future usage behaviours for resource planning, leading to targeted conservation efforts and improved customer satisfaction.
- Designed and implemented key performance metrics for asset and work management operations for a water utility service, resulting in a 60% reduction in contractor scheduling conflicts and decreasing critical equipment stock-outs from an average of 20 occurrences to 3 per quarter.
- Built multiple Power BI dashboards that drove a 70% increase in self-service data analysis adoption, enabling teams to make timely and informed decisions across supply chain, customer & billing, and asset management.
Ernst & Young LLP | Wellington, New Zealand (Jun 2022 – Jun 2023)
- Built a visual dashboard in Power BI to track issue tickets, incidents, SLAs and other tasks for an Identity and Access Management solution, leading to a reduction in response time and improved BAU support.
- Designed features for a reinforcement learning model for optimizing hospital resource allocation using data on ICU beds, ventilators and medical staff that resulted in a 20% reduction in resource wastage and a 15% improvement in patient care efficiency, as measured by bed occupancy rates and staff utilization metrics.
- Developed and executed SQL scripts to extract and analyze data from MySQL and Snowflake databases to create insightful reports, informing stakeholder decision-making.
- Validated an SQL-based hedge fund model to ensure the generated output corresponded with claims made by the financial institution according to rules designated by the governing regulatory body.
- Reduced rework in the product development process by 75% through agile delivery methodologies, rapidly prototyping the specified requirements and fostering efficient collaboration with product owners.
Victoria University of Wellington | Wellington, New Zealand (Apr 2019 – May 2022)
- Constructed novel domain-specific features for a supervised text classification model using a fine-tuned skip-gram word2vec representation and NumPy across SVM, CNN and Bi-LSTM algorithms with Scikit-learn, TensorFlow and Keras. The features captured the dependency between words in long sentences for distinguishing between hate and offensive language with an accuracy of 86%.
- Fine-tuned an LLM (BERT) classification model on domain-specific data for sentiment analysis of tweets discussing elections and politicians, achieving an accuracy of 90%.
- Designed ensemble learning-based semi-supervised models (K-means clustering) in Python to identify the target of hateful speech, achieving a cluster purity score of 88.1% and an Adjusted Rand Index of 0.82, reducing dependence on scarce and biased labelled data, and introducing model explainability.
- Analyzed research data using Excel and Python (pandas) and published the statistically validated results in peer-reviewed conference proceedings and journal articles.
- Created presentations, with results visualized in Matplotlib and Seaborn, to communicate research results to technical and non-technical audiences at 5 international conferences and 2 competitions.
Department of Electronic Engineering, University of Nigeria, Nsukka | Enugu, Nigeria (May 2016 – Jan 2019)
- Conducted analysis and comparison of research findings against benchmark data to calculate statistically significant changes, contributing to 4 electronic engineering research papers for publications and conferences.
- Automated the calculation of course grades and CGPA in Excel for 5 courses and over 400 students across 4 levels, reducing labour hours and improving data quality.
- Digitized 10 years of student grade records and established validated data collection processes for new records in Excel, making data retrieval easy for further analysis.
- Designed exam timetables for 4 levels of departmental courses, ensuring no conflicts with faculty or inter-level courses and enabling students to retake exams from different levels without scheduling conflicts.
- MADUKWE, K. J., GAO, X., & XUE, B. “Token Replacement-based Data Augmentation Methods for Hate Speech Detection”. In World Wide Web: Internet and Web Information Systems Journal (Special Issue on Web Intelligence: Artificial Intelligence in the Connected World), 2022, pp. 1129–1150.
- MADUKWE, K. J., GAO, X., & XUE, B. “What Emotion Is Hate? Incorporating Emotion Information into the Hate Speech Detection Task”. In 18th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2021, Hanoi, Vietnam, pp. 273–286.
- MADUKWE, K. J., GAO, X., & XUE, B. “Dependency-Based Embedding for Distinguishing Between Hate Speech and Offensive Language”. In 2020 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT), 2020, pp. 860–868.
- MADUKWE, K. J., GAO, X., & XUE, B. “In Data We Trust: A Critical Analysis of Hate Speech Detection Datasets”. In Proceedings of the Fourth Workshop on Online Abuse and Harms (Online, 2020), Association for Computational Linguistics (ACL), pp. 150–161.
- MADUKWE, K. J., GAO, X., & XUE, B. “A GA-based Approach to Fine-tuning BERT for Hate Speech Detection”. In 2020 IEEE Symposium Series on Computational Intelligence (SSCI) (2020), pp. 2821–2828.
- MADUKWE, K. J., & GAO, X. “The Thin Line Between Hate and Profanity”. In AI 2019: Advances in Artificial Intelligence (Cham, 2019), Springer International Publishing, pp. 344–356.