[GH Actions] automatic-add-publications-by-author (#442)
Co-authored-by: xhluca <[email protected]>
github-actions[bot] and xhluca authored Dec 12, 2024
1 parent b729673 commit 47fa35e
Showing 73 changed files with 1,928 additions and 0 deletions.
22 changes: 22 additions & 0 deletions _posts/papers/2011-04-01-10.1109-ITNG.2011.67.md
@@ -0,0 +1,22 @@
---
title: A Secure e-Voting Architecture
venue: '2011 Eighth International Conference on Information Technology: New Generations'
names: A. Sodiya, S. Onashoga, David Ifeoluwa Adelani
tags:
- '2011 Eighth International Conference on Information Technology: New Generations'
link: https://doi.org/10.1109/ITNG.2011.67
author: David Adelani
categories: Publications
layout: paper

---

*{{ page.names }}*

**{{ page.venue }}**

{% include display-publication-links.html pub=page %}

## Abstract

The constant development of computer technology has given rise to efficient electronic means of voting. E-voting, however, still faces the problems of non-anonymity, coercion and bribery. In this paper, elliptic curves are combined with the ElGamal cryptosystem to enhance the security of an e-voting architecture. Points drawn from the (x, y) coordinates of an elliptic curve are used in place of a single large integer, together with ElGamal's probabilistic encryption (which produces different ciphertexts for the same plaintext), to ensure anonymity, non-coercion and receipt-freeness. A voter can also revote from another location as a countermeasure to coercion. With the proposed architecture, an e-voting system should be fair.
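To make the construction concrete, here is a minimal sketch of ElGamal encryption over a toy elliptic curve. The curve parameters, key range, and the encoding of a vote as a curve point are illustrative assumptions, not the paper's actual scheme; note how fresh randomness makes the encryption probabilistic.

```python
import random

p, a = 97, 2                      # toy curve y^2 = x^3 + 2x + 3 over F_97 (assumed)
G = (3, 6)                        # base point: 27 + 6 + 3 = 36 = 6^2 (mod 97)

def inv(x):
    return pow(x, p - 2, p)       # modular inverse (Fermat's little theorem)

def neg(P):
    return None if P is None else (P[0], -P[1] % p)

def add(P, Q):                    # point addition; None is the identity O
    if P is None: return Q
    if Q is None: return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0: return None
    if P == Q:
        s = (3 * P[0] ** 2 + a) * inv(2 * P[1]) % p
    else:
        s = (Q[1] - P[1]) * inv(Q[0] - P[0]) % p
    x = (s * s - P[0] - Q[0]) % p
    return (x, (s * (P[0] - x) - P[1]) % p)

def mul(k, P):                    # double-and-add scalar multiplication
    R = None
    while k:
        if k & 1: R = add(R, P)
        P, k = add(P, P), k >> 1
    return R

d = random.randrange(2, 30)                  # private key
pub = mul(d, G)                              # public key dG
M = mul(5, G)                                # a vote encoded as a point (assumed)
k = random.randrange(2, 30)                  # fresh randomness each encryption
C1, C2 = mul(k, G), add(M, mul(k, pub))      # probabilistic ciphertext (C1, C2)
assert add(C2, neg(mul(d, C1))) == M         # decryption: M = C2 - d*C1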
@@ -0,0 +1,22 @@
---
title: A DEVS-Based ANN Training and Prediction Platform
venue: ''
names: David Ifeoluwa Adelani
tags:
- ''
link: https://www.semanticscholar.org/paper/524aa9a70db6cdd57aeed0b7b0e2575231c590cb
author: David Adelani
categories: Publications
layout: paper

---

*{{ page.names }}*

**{{ page.venue }}**

{% include display-publication-links.html pub=page %}

## Abstract

None
23 changes: 23 additions & 0 deletions _posts/papers/2016-06-02-10.1142-S1793962316500057.md
@@ -0,0 +1,23 @@
---
title: Enhancing the reusability and interoperability of artificial neural networks
with DEVS modeling and simulation
venue: Advances in Complex Systems
names: David Ifeoluwa Adelani, M. Traoré
tags:
- Advances in Complex Systems
link: https://doi.org/10.1142/S1793962316500057
author: David Adelani
categories: Publications
layout: paper

---

*{{ page.names }}*

**{{ page.venue }}**

{% include display-publication-links.html pub=page %}

## Abstract

Artificial neural networks (ANNs), a branch of artificial intelligence, have become a very interesting domain since the eighties, when the back-propagation (BP) learning algorithm for multilayer feed-forward architectures was introduced to solve nonlinear problems. They are used extensively to solve complex non-algorithmic problems such as prediction, pattern recognition and clustering. However, in the context of a holistic study, there may be a need to integrate ANNs with other models developed in various paradigms to solve a problem. In this paper, we suggest discrete event system specification (DEVS) be used as a model of computation (MoC) to make ANN models interoperable with other models (since all discrete event models can be expressed in DEVS, and continuous models can be approximated by DEVS). By combining ANN and DEVS, we can model the complex configuration of ANNs and express their internal workings. Therefore, we extend the DEVS-based ANN proposed by Toma et al. [A new DEVS-based generic artificial neural network modeling approach, The 23rd European Modeling and Simulation Symp. (Simulation in Industry), Rome, Italy, 2011] to compare multiple configuration parameters and learning algorithms and also to do prediction. The DEVS models are described using the high-level language for system specification (HiLLS) [Maiga et al., A new approach to modeling dynamic structure systems, The 29th European Modeling and Simulation Symp. (Simulation in Industry), Leicester, United Kingdom, 2015], a graphical modeling language, for clarity. The developed platform is a tool to transform ANN models into DEVS computational models, making them more reusable and more interoperable in the context of larger multi-perspective modeling and simulation (M&S).
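As a rough illustration of what expressing an ANN in DEVS terms looks like, the sketch below wraps a single neuron in a DEVS-style atomic model with the usual time-advance, external/internal transition, and output functions. The interface is a simplified assumption; the paper's platform uses HiLLS-specified DEVS models, not this exact API.

```python
import math

class NeuronDEVS:
    """Atomic model: receives weighted inputs, emits its activation after a delay."""
    def __init__(self, weights, bias, delay=1.0):
        self.weights, self.bias, self.delay = weights, bias, delay
        self.phase, self.output = "passive", None

    def ta(self):                    # time advance: how long until next internal event
        return self.delay if self.phase == "active" else math.inf

    def delta_ext(self, inputs):     # external transition: new inputs arrive
        z = sum(w * x for w, x in zip(self.weights, inputs)) + self.bias
        self.output = 1 / (1 + math.exp(-z))   # sigmoid activation
        self.phase = "active"

    def lambda_out(self):            # output function, fired just before delta_int
        return self.output

    def delta_int(self):             # internal transition: return to passive
        self.phase = "passive"

n = NeuronDEVS([0.5, -0.3], 0.1)
n.delta_ext([1.0, 2.0])
print(n.lambda_out())                # one simulated activation (0.5 here)
n.delta_int()
```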
24 changes: 24 additions & 0 deletions _posts/papers/2019-05-13-1905.05961.md
@@ -0,0 +1,24 @@
---
title: Demographic Inference and Representative Population Estimates from Multilingual
Social Media Data
venue: The Web Conference
names: Zijian Wang, Scott A. Hale, David Ifeoluwa Adelani, Przemyslaw A. Grabowicz,
Timo Hartmann, Fabian Flöck, David Jurgens
tags:
- The Web Conference
link: https://arxiv.org/abs/1905.05961
author: David Adelani
categories: Publications
layout: paper

---

*{{ page.names }}*

**{{ page.venue }}**

{% include display-publication-links.html pub=page %}

## Abstract

Social media provide access to behavioural data at an unprecedented scale and granularity. However, using these data to understand phenomena in a broader population is difficult due to their non-representativeness and the bias of statistical inference tools towards dominant languages and groups. While demographic attribute inference could be used to mitigate such bias, current techniques are almost entirely monolingual and fail to work in a global environment. We address these challenges by combining multilingual demographic inference with post-stratification to create a more representative population sample. To learn demographic attributes, we create a new multimodal deep neural architecture for joint classification of age, gender, and organization-status of social media users that operates in 32 languages. This method substantially outperforms current state of the art while also reducing algorithmic bias. To correct for sampling biases, we propose fully interpretable multilevel regression methods that estimate inclusion probabilities from inferred joint population counts and ground-truth population counts. In a large experiment over multilingual heterogeneous European regions, we show that our demographic inference and bias correction together allow for more accurate estimates of populations and make a significant step towards representative social sensing in downstream applications with multilingual social media.
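A hedged, toy sketch of the bias-correction idea: estimate per-cell inclusion probabilities from inferred joint counts and ground-truth counts, then form a post-stratified estimate. All cell labels and numbers below are invented, and the paper's multilevel regression is more sophisticated than this direct ratio.

```python
inferred = {("F", "18-29"): 500, ("M", "18-29"): 700,
            ("F", "30+"): 300, ("M", "30+"): 400}       # users found on platform
census = {("F", "18-29"): 1000, ("M", "18-29"): 1000,
          ("F", "30+"): 2000, ("M", "30+"): 2000}       # ground-truth counts

# Estimated inclusion probability per demographic cell (its reciprocal is a weight).
inclusion = {c: inferred[c] / census[c] for c in census}

y_bar = {("F", "18-29"): 0.6, ("M", "18-29"): 0.5,      # per-cell means of some
         ("F", "30+"): 0.4, ("M", "30+"): 0.3}          # outcome observed online

naive = sum(inferred[c] * y_bar[c] for c in census) / sum(inferred.values())
post = sum(census[c] * y_bar[c] for c in census) / sum(census.values())
print(f"inclusion={inclusion}")
print(f"naive={naive:.3f}  post-stratified={post:.3f}")  # correction shifts the estimate
```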
23 changes: 23 additions & 0 deletions _posts/papers/2019-07-22-1907.09177.md
@@ -0,0 +1,23 @@
---
title: Generating Sentiment-Preserving Fake Online Reviews Using Neural Language Models
and Their Human- and Machine-based Detection
venue: International Conference on Advanced Information Networking and Applications
names: David Ifeoluwa Adelani, H. Mai, Fuming Fang, H. Nguyen, J. Yamagishi, I. Echizen
tags:
- International Conference on Advanced Information Networking and Applications
link: https://arxiv.org/abs/1907.09177
author: David Adelani
categories: Publications
layout: paper

---

*{{ page.names }}*

**{{ page.venue }}**

{% include display-publication-links.html pub=page %}

## Abstract

None
24 changes: 24 additions & 0 deletions _posts/papers/2019-12-05-1912.02481.md
@@ -0,0 +1,24 @@
---
title: 'Massive vs. Curated Embeddings for Low-Resourced Languages: the Case of Yorùbá
and Twi'
venue: International Conference on Language Resources and Evaluation
names: Jesujoba Oluwadara Alabi, Kwabena Amponsah-Kaakyire, David Ifeoluwa Adelani,
C. España-Bonet
tags:
- International Conference on Language Resources and Evaluation
link: https://arxiv.org/abs/1912.02481
author: David Adelani
categories: Publications
layout: paper

---

*{{ page.names }}*

**{{ page.venue }}**

{% include display-publication-links.html pub=page %}

## Abstract

The success of several architectures at learning semantic representations from unannotated text, and the availability of such text in online multilingual resources such as Wikipedia, have enabled the massive, automatic creation of resources for many languages. The evaluation of such resources is usually done for high-resourced languages, where one has a smorgasbord of tasks and test sets to evaluate on. For low-resourced languages, evaluation is more difficult and normally ignored, with the hope that the impressive capability of deep learning architectures to learn (multilingual) representations in the high-resourced setting holds in the low-resourced setting too. In this paper we focus on two African languages, Yorùbá and Twi, and compare word embeddings obtained this way with word embeddings obtained from curated corpora and language-dependent processing. We analyse the noise in the publicly available corpora, collect high-quality and noisy data for the two languages, and quantify the improvements that depend not only on the amount of data but on its quality too. We also use different architectures that learn word representations both from surface forms and from characters, to further exploit all the available information, which proved to be important for these languages. For the evaluation, we manually translate the wordsim-353 word-pairs dataset from English into Yorùbá and Twi. We extend the analysis to contextual word embeddings and evaluate multilingual BERT on a named entity recognition task. For this, we annotate the Global Voices corpus for Yorùbá with named entities. As output of the work, we provide corpora, embeddings and the test suites for both languages.
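For readers unfamiliar with the wordsim-style evaluation the paper runs on its translated pairs, here is a minimal sketch: Spearman correlation between human similarity judgments and embedding cosine similarities. The embeddings and scores below are dummy stand-ins, not the paper's data.

```python
import numpy as np
from scipy.stats import spearmanr

emb = {"ilé": np.array([0.2, 0.9]),           # toy 2-d "embeddings"
       "ojú": np.array([0.8, 0.1]),
       "omi": np.array([0.3, 0.8])}

pairs = [("ilé", "omi", 7.5), ("ilé", "ojú", 2.0), ("ojú", "omi", 3.0)]

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

human = [s for _, _, s in pairs]              # annotator similarity scores
model = [cos(emb[a], emb[b]) for a, b, _ in pairs]
rho, _ = spearmanr(human, model)              # rank correlation is the metric
print(f"Spearman rho = {rho:.2f}")
```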
@@ -0,0 +1,23 @@
---
title: Improving Yorùbá Diacritic Restoration
venue: arXiv.org
names: Iroro Orife, David Ifeoluwa Adelani, Timi E. Fasubaa, Victor Williamson, W.
Oyewusi, Olamilekan Wahab, Kólá Túbosún
tags:
- arXiv.org
link: https://www.semanticscholar.org/paper/6a39fcaae3eee1277df9f1191b8bab1b11732c25
author: David Adelani
categories: Publications
layout: paper

---

*{{ page.names }}*

**{{ page.venue }}**

{% include display-publication-links.html pub=page %}

## Abstract

None
22 changes: 22 additions & 0 deletions _posts/papers/2020-03-18-2003.08272.md
@@ -0,0 +1,22 @@
---
title: Unsupervised Pidgin Text Generation By Pivoting English Data and Self-Training
venue: arXiv.org
names: Ernie Chang, David Ifeoluwa Adelani, Xiaoyu Shen, Vera Demberg
tags:
- arXiv.org
link: https://arxiv.org/abs/2003.08272
author: David Adelani
categories: Publications
layout: paper

---

*{{ page.names }}*

**{{ page.venue }}**

{% include display-publication-links.html pub=page %}

## Abstract

West African Pidgin English is widely spoken in West Africa, with at least 75 million speakers. Nevertheless, proper machine translation systems and relevant NLP datasets for Pidgin English are virtually absent. In this work, we develop techniques aimed at bridging the gap between Pidgin English and English in the context of natural language generation. As a proof of concept, we explore the proposed techniques in the area of data-to-text generation. By building upon the previously released monolingual Pidgin English text and a parallel English data-to-text corpus, we hope to build a system that can automatically generate Pidgin English descriptions from structured data. We first train a data-to-English text generation system, before employing techniques in unsupervised neural machine translation and self-training to establish the Pidgin-to-English cross-lingual alignment. Human evaluation of the generated Pidgin texts shows that, though still far from practically usable, the pivoting + self-training technique improves both Pidgin text fluency and relevance.
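The control flow of the pivoting + self-training loop can be sketched as follows. The two translate functions are trivial hypothetical stand-ins for the trained data-to-English and English-to-Pidgin models; only the loop structure reflects the technique described above.

```python
def data_to_english(record):               # stand-in for the data-to-text system
    return f"{record['team']} won {record['score']}."

def english_to_pidgin(text):               # stand-in for the unsupervised MT model
    return text.replace("won", "win")

records = [{"team": "Eagles", "score": "2-1"}]
pseudo_parallel = []
for round_ in range(3):                    # self-training rounds
    for r in records:
        en = data_to_english(r)            # pivot step 1: data -> English
        pcm = english_to_pidgin(en)        # pivot step 2: English -> Pidgin
        pseudo_parallel.append((r, pcm))   # synthetic (data, Pidgin) training pairs
    # in the real system, english_to_pidgin would be retrained on
    # pseudo_parallel here before the next round
print(pseudo_parallel[-1])
```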
24 changes: 24 additions & 0 deletions _posts/papers/2020-03-18-2003.08370.md
@@ -0,0 +1,24 @@
---
title: 'Distant Supervision and Noisy Label Learning for Low Resource Named Entity
Recognition: A Study on Hausa and Yorùbá'
venue: AfricaNLP
names: David Ifeoluwa Adelani, Michael A. Hedderich, D. Zhu, Esther van den Berg,
D. Klakow
tags:
- AfricaNLP
link: https://arxiv.org/abs/2003.08370
author: David Adelani
categories: Publications
layout: paper

---

*{{ page.names }}*

**{{ page.venue }}**

{% include display-publication-links.html pub=page %}

## Abstract

The lack of labeled training data has limited the development of natural language processing tools, such as named entity recognition, for many languages spoken in developing countries. Techniques such as distant and weak supervision can be used to create labeled data in a (semi-)automatic way. Additionally, to alleviate some of the negative effects of errors in automatic annotation, noise-handling methods can be integrated. Pretrained word embeddings are another key component of most neural named entity classifiers. With the advent of more complex contextual word embeddings, an interesting trade-off between model size and performance arises. While these techniques have been shown to work well in high-resource settings, we want to study how they perform in low-resource scenarios. In this work, we perform named entity recognition for Hausa and Yorùbá, two languages that are widely spoken in several developing countries. We evaluate different embedding approaches and show that distant supervision can be successfully leveraged in a realistic low-resource scenario where it can more than double a classifier's performance.
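A minimal distant-supervision labeler in the spirit described above projects entity lists (gazetteers) onto raw text to produce noisy NER tags. The gazetteer entries and sentence below are illustrative, not the paper's actual resources.

```python
gazetteer = {"kano": "LOC", "abuja": "LOC", "buhari": "PER"}   # assumed entries

def distant_label(tokens):
    # Every gazetteer match gets its entity tag; everything else gets "O".
    return [(t, gazetteer.get(t.lower(), "O")) for t in tokens]

print(distant_label("Buhari don travel go Kano".split()))
# [('Buhari', 'PER'), ('don', 'O'), ('travel', 'O'), ('go', 'O'), ('Kano', 'LOC')]
```

These automatic labels are noisy (ambiguous or missing entries mislabel tokens), which is exactly what the noise-handling methods mentioned above are meant to absorb.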
23 changes: 23 additions & 0 deletions _posts/papers/2020-03-23-2003.10564.md
@@ -0,0 +1,23 @@
---
title: Improving Yorùbá Diacritic Restoration
venue: ''
names: Iroro Orife, David Ifeoluwa Adelani, Timi E. Fasubaa, Victor Williamson, W.
Oyewusi, Olamilekan Wahab, Kólá Túbosún
tags:
- ''
link: https://arxiv.org/abs/2003.10564
author: David Adelani
categories: Publications
layout: paper

---

*{{ page.names }}*

**{{ page.venue }}**

{% include display-publication-links.html pub=page %}

## Abstract

Yoruba is a widely spoken West African language with a writing system rich in orthographic and tonal diacritics. Diacritics provide morphological information, are crucial for lexical disambiguation and pronunciation, and are vital for any computational speech or natural language processing task. However, diacritic marks are commonly excluded from electronic texts due to limited device and application support, as well as limited education on proper usage. We report on recent efforts at dataset cultivation. By aggregating and improving disparate texts from the web and various personal libraries, we were able to grow our clean Yoruba dataset significantly, from a majority-Biblical corpus with three sources to millions of tokens from over a dozen sources. We evaluate updated diacritic restoration models on a new, general-purpose, public-domain Yoruba evaluation dataset of modern journalistic news text, selected to be multi-purpose and to reflect contemporary usage. All pre-trained models, datasets and source code have been released as an open-source project to advance efforts on Yoruba language technology.
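A toy view of the task's input/output contract: map ASCII-stripped Yoruba tokens back to fully diacritized forms. The paper trains sequence models; this dictionary lookup, with invented entries, only illustrates what "restoration" means and why unmarked forms are ambiguous.

```python
# Undiacritized forms can collapse several distinct words; the restorer must
# pick the right marked form (entries below are assumptions for illustration).
restore_map = {"oko": "ọkọ̀", "obe": "ọbẹ̀"}

def restore(tokens):
    return [restore_map.get(t, t) for t in tokens]

print(" ".join(restore("obe oko".split())))
```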
23 changes: 23 additions & 0 deletions _posts/papers/2020-06-19-2006.10919.md
@@ -0,0 +1,23 @@
---
title: On the effect of normalization layers on Differentially Private training of
deep Neural networks
venue: ''
names: A. Davody, David Ifeoluwa Adelani, Thomas Kleinbauer, D. Klakow
tags:
- ''
link: https://arxiv.org/abs/2006.10919
author: David Adelani
categories: Publications
layout: paper

---

*{{ page.names }}*

**{{ page.venue }}**

{% include display-publication-links.html pub=page %}

## Abstract

Differentially private stochastic gradient descent (DPSGD) is a variation of stochastic gradient descent based on the Differential Privacy (DP) paradigm, which can mitigate privacy threats that arise from the presence of sensitive information in training data. However, one major drawback of training deep neural networks with DPSGD is a reduction in the model's accuracy. In this paper, we study the effect of normalization layers on the performance of DPSGD. We demonstrate that normalization layers significantly impact the utility of deep neural networks with noisy parameters and should be considered essential ingredients of training with DPSGD. In particular, we propose a novel method for integrating batch normalization with DPSGD without incurring an additional privacy loss. With our approach, we are able to train deeper networks and achieve a better utility-privacy trade-off.
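For context, here is a minimal sketch of the core DPSGD step that the normalization layers are combined with: clip each per-example gradient, average, and add calibrated Gaussian noise. Shapes and hyperparameters are illustrative.

```python
import numpy as np

def dpsgd_step(per_example_grads, lr=0.1, clip=1.0, sigma=1.0):
    # Bound each example's influence by clipping its gradient to norm <= clip.
    clipped = [g * min(1.0, clip / max(np.linalg.norm(g), 1e-12))
               for g in per_example_grads]
    mean = np.mean(clipped, axis=0)
    # Gaussian noise with std sigma*clip on the sum, i.e. sigma*clip/B on the mean.
    noise = np.random.normal(0.0, sigma * clip / len(clipped), size=mean.shape)
    return -lr * (mean + noise)                  # noisy, privacy-preserving update

grads = [np.random.randn(4) for _ in range(8)]   # stand-in per-example gradients
print(dpsgd_step(grads))
```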
@@ -0,0 +1,22 @@
---
title: Robust Differentially Private Training of Deep Neural Networks
venue: arXiv.org
names: A. Davody, David Ifeoluwa Adelani, Thomas Kleinbauer, D. Klakow
tags:
- arXiv.org
link: https://www.semanticscholar.org/paper/d6802054111e37b6d6b517fbe1cddf394cef76c7
author: David Adelani
categories: Publications
layout: paper

---

*{{ page.names }}*

**{{ page.venue }}**

{% include display-publication-links.html pub=page %}

## Abstract

Differentially private stochastic gradient descent (DPSGD) is a variation of stochastic gradient descent based on the Differential Privacy (DP) paradigm which can mitigate privacy threats arising from the presence of sensitive information in training data. One major drawback of training deep neural networks with DPSGD is a reduction in the model's accuracy. In this paper, we propose an alternative method for preserving data privacy based on introducing noise through learnable probability distributions, which leads to a significant improvement in the utility of the resulting private models. We also demonstrate that normalization layers have a large beneficial impact on the performance of deep neural networks with noisy parameters. In particular, we show that contrary to general belief, a large amount of random noise can be added to the weights of neural networks without harming the performance, once the networks are augmented with normalization layers. We hypothesize that this robustness is a consequence of the scale invariance property of normalization operators. Building on these observations, we propose a new algorithmic technique for training deep neural networks under very low privacy budgets by sampling weights from Gaussian distributions and utilizing batch or layer normalization techniques to prevent performance degradation. Our method outperforms previous approaches, including DPSGD, by a substantial margin on a comprehensive set of experiments on Computer Vision and Natural Language Processing tasks. In particular, we obtain a 20 percent accuracy improvement over DPSGD on the MNIST and CIFAR10 datasets with DP-privacy budgets of $\varepsilon = 0.05$ and $\varepsilon = 2.0$, respectively. Our code is available online: this https URL.
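The scale-invariance claim above can be checked numerically in a few lines: with a normalization layer after the affine map, rescaling the weights (as multiplicative noise would) leaves the output essentially unchanged. This toy check uses a plain layer norm; the paper's batch/layer normalization setup is richer.

```python
import numpy as np

def layer_norm(z, eps=1e-12):
    return (z - z.mean()) / np.sqrt(z.var() + eps)

rng = np.random.default_rng(0)
W, x = rng.standard_normal((3, 4)), rng.standard_normal(4)

out = layer_norm(W @ x)
out_noisy_scale = layer_norm((5.0 * W) @ x)   # scale weights, e.g. by noise
print(np.allclose(out, out_noisy_scale))      # True: normalization absorbs the scale
```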
22 changes: 22 additions & 0 deletions _posts/papers/2020-08-07-2008.03101.md
@@ -0,0 +1,22 @@
---
title: Privacy Guarantees for De-identifying Text Transformations
venue: Interspeech
names: David Ifeoluwa Adelani, A. Davody, Thomas Kleinbauer, D. Klakow
tags:
- Interspeech
link: https://arxiv.org/abs/2008.03101
author: David Adelani
categories: Publications
layout: paper

---

*{{ page.names }}*

**{{ page.venue }}**

{% include display-publication-links.html pub=page %}

## Abstract

Machine learning approaches to natural language processing tasks benefit from comprehensive collections of real-life user data. At the same time, there is a clear need to protect the privacy of the users whose data is collected and processed. For text collections such as transcripts of voice interactions or patient records, replacing sensitive parts with benign alternatives can provide de-identification. However, how much privacy is actually guaranteed by such text transformations, and are the resulting texts still useful for machine learning? In this paper, we derive formal privacy guarantees for general text transformation-based de-identification methods on the basis of Differential Privacy. We also measure the effect that different ways of masking private information in dialog transcripts have on a subsequent machine learning task. To this end, we formulate different masking strategies and compare their privacy-utility trade-offs. In particular, we compare a simple redact approach with more sophisticated word-by-word replacement using deep learning models on multiple natural language understanding tasks like named entity recognition, intent detection, and dialog act classification. We find that only word-by-word replacement is robust against performance drops in various tasks.
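Two of the masking strategies compared above, in miniature: blanket redaction versus word-by-word replacement of flagged tokens. The sentence, flagged spans, and surrogate vocabulary are invented for illustration; the paper's replacement uses deep learning models rather than a lookup.

```python
import random

sentence = ["call", "Alice", "at", "555-0100"]
private = {1, 3}                                  # token indices flagged as sensitive

# Strategy 1: redact -- replace every sensitive token with one mask symbol.
redacted = [w if i not in private else "<MASK>" for i, w in enumerate(sentence)]

# Strategy 2: word-by-word replacement -- swap in a benign surrogate per token.
surrogates = {"Alice": ["Dana", "Kim"], "555-0100": ["555-0199"]}
replaced = [w if i not in private else random.choice(surrogates[w])
            for i, w in enumerate(sentence)]

print(redacted)
print(replaced)
```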