Mahmoud Al Shboul (aka Mahmoud Al-Ayyoub) received his Ph.D. in Computer Science (CS) from Stony Brook University in 2010. He is a professor at Ajman University, UAE. His research interest is Artificial Intelligence and its applications in various fields. He has published hundreds of manuscripts (book chapters, journal/conference/workshop papers, and patents), several of which have been well received by the community, winning or being nominated for Best Paper Awards and attracting wide readership and citations. This has led to Prof. Al-Ayyoub being consistently featured among the top CS researchers both within his home country of Jordan (e.g., according to the UoJ Ranking of Jordanian Researchers) and globally (e.g., according to Research.com's Ranking of Top Scientists in CS and the Rising Stars of Science Award).
The ownership of user actions in computer and mobile applications is an important concern, especially on shared devices. User identification using physical biometric authentication methods ensures that only the actual user can access the device. However, when different users access a shared device during the same active session, the person who owns the session is held accountable for any actions performed by the others. Thus, user identification based on behavioral characteristics has come into the picture. Human activity recognition from mobile sensor data is gaining interest with the spread of mobile devices and the emergence of the Internet of Things, enabling applications such as elderly health monitoring, athletic evaluation, and context-aware behavior. In this paper, we show how human activity data can be utilized to identify the actual device user. We build deep learning models capable of identifying the users of mobile and wearable devices based on their body movements and daily activities. We use a Long Short-Term Memory (LSTM) classifier to build the user identification model from time-series data produced by mobile motion sensors. The model targets the users who were involved in the training process. We tested our approach on two publicly available human activity datasets that contain daily activities and fall states recorded by accelerometer and gyroscope mobile sensors. The results show that the models can identify the actual mobile device users from their motion data with an accuracy of up to 90%. Further, the results show that the model trained on accelerometer data outperforms the one trained on gyroscope data.
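As an illustrative aside, time-series sensor data is typically segmented into fixed-length, overlapping windows before being fed to an LSTM classifier. The sketch below shows this common preparation step; the window length and overlap are assumed values, not parameters from the paper.

```python
# Illustrative sketch (not the paper's exact pipeline): segmenting raw
# accelerometer readings into fixed-length, overlapping windows, the usual
# way time-series sensor data is prepared for an LSTM classifier.
# Window length and step size here are assumptions for illustration.

def segment(readings, window=128, step=64):
    """Split a list of (x, y, z) samples into overlapping windows."""
    return [readings[i:i + window]
            for i in range(0, len(readings) - window + 1, step)]

# Example: 512 placeholder samples -> windows of 128 samples with 50% overlap.
samples = [(0.0, 0.0, 9.8)] * 512
windows = segment(samples)
print(len(windows), len(windows[0]))  # 7 128
```

Each window would then be labeled with the user who produced it and used as one training example.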
The ability to automatically understand and analyze human language has attracted researchers and practitioners to the Natural Language Processing (NLP) field. Humor detection is an NLP task needed in many areas, including marketing, politics, and news. However, the task is challenging because humor depends on context, emotion, culture, and rhythm. To address this problem, we propose a robust model called BFHumor, a BERT-Flair-based humor detection model that detects humor in news headlines. It is an ensemble of different state-of-the-art pre-trained models utilizing various NLP techniques. We used public humor datasets from the SemEval-2020 workshop to evaluate the proposed model. The model achieved outstanding performance, with a Root Mean Squared Error (RMSE) of 0.51966 and an accuracy of 0.62291. In addition, we extensively investigated the underlying reasons behind the high accuracy of the BFHumor model in humor detection tasks. To that end, we conducted two experiments on the BERT model: one at the vocabulary level and one at the linguistic-capturing level. Our investigation shows that BERT captures surface knowledge in its lower layers, syntactic knowledge in its middle layers, and semantic knowledge in its higher layers.
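The abstract names two generic ingredients of such a system: combining per-model predictions into an ensemble and scoring with RMSE. A minimal sketch of both, with made-up scores standing in for actual BERT/Flair outputs:

```python
import math

# Hedged sketch: simple score averaging and RMSE, the two generic ingredients
# implied by the abstract. The prediction values below are fabricated.

def ensemble_average(predictions_per_model):
    """Average the humor scores each model predicts for every headline."""
    return [sum(scores) / len(scores) for scores in zip(*predictions_per_model)]

def rmse(predicted, actual):
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

bert_scores  = [0.8, 0.2, 0.5]   # hypothetical per-headline scores
flair_scores = [0.6, 0.4, 0.5]
combined = ensemble_average([bert_scores, flair_scores])
print(round(rmse(combined, [1.0, 0.0, 0.5]), 3))  # 0.245
```

Real ensembles often use weighted rather than uniform averaging, but the evaluation step is the same.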
This work is an effort toward building a Neural Speech Recognizer (NSR) system for Quranic recitations that can be used effectively by anyone, regardless of gender and age. Although many recitations are available online, most of them are recorded by professional adult male reciters, which means that an ASR system trained on such datasets would not work well for female or child reciters. We address this gap by adopting a benchmark dataset of audio recordings of Quranic recitations by both genders and different age groups. Using this dataset, we build several speaker-independent NSR systems based on the DeepSpeech model and evaluate them using word error rate (WER). The goal is to show how an NSR system trained and tuned on a dataset from one gender performs on a test set from the other gender. Unfortunately, the number of female recitations in our dataset is rather small, while the number of male recitations is much larger. In the first set of experiments, we avoid the imbalance between the two genders by down-sampling the male part to match the female part. For this small subset of our dataset, the results are telling: the system trained on male recitations gives 0.968 WER when tested on female recitations but 0.406 WER when tested on male recitations. Conversely, training the system on female recitations and testing it on male recitations gives 0.966 WER, while testing it on female recitations gives 0.608 WER.
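For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, insertions, and deletions) divided by the number of reference words. This is the standard definition, sketched here in plain Python, not code from the paper:

```python
# Word error rate: word-level Levenshtein distance over the reference length.

def wer(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("in the name of god", "in name of god"))  # 0.2 (one deletion, 5 words)
```

A WER of 0.968 thus means that nearly every reference word was transcribed incorrectly.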
Emotion analysis is divided into emotion detection, where the system detects whether an emotional state is present, and emotion recognition, where the system identifies the label of the emotion. In this paper, we present a multimodal system for emotion detection and recognition using an Arabic dataset. We evaluated the performance of audio and visual data separately as unimodal systems and then examined the impact of integrating the two information sources into one model. We also examined the effect of gender identification on performance. Our results show that identifying the speaker's gender beforehand increases the performance of emotion recognition, especially for models that rely on audio data. Comparing the audio-based system with the visual-based system shows that each model performs better for specific emotional labels: 70% of the angry labels were predicted correctly using the audio model, versus 63% using the visual model, while the accuracy for the surprise class was 40.6% using the audio model and 56.2% using the visual model. Combining both modalities improves accuracy. Our final results for the multimodal system were 75% for the emotion detection task and 60.11% for the emotion recognition task; these results are among the best achieved in this field and the first to focus on Arabic content. Specifically, the novelty of this work lies in exploiting deep learning and multimodal models for emotion analysis and applying them to a natural audio and video dataset of Arabic speakers.
Satisfaction detection is one of the most common issues that impact the business world. This study proposes an application that detects the satisfaction tone associated with customer happiness in the Big Data that online businesses generate on social media, in particular Facebook and Twitter, using two well-known families of methods: machine learning and deep learning (DL). Since there is a lack of datasets on this topic, we collected our own dataset from social media. We simplified Big Data analytics for business on social media using three of the most common Natural Language Processing pre-processing steps: stemming, normalization, and stop word removal. To evaluate the performance of the classifiers, we calculated the F1-measure, Recall, and Precision. The results showed the superiority of the Random Forest classifier, which achieved the highest F1-measure (99.1%). The best result without applying pre-processing techniques was achieved by the Support Vector Machine, with an F1-measure of 93.4%. We also applied DL techniques together with two feature extraction methods, Word Embedding and Bag of Words, on the dataset. These results showed the superiority of the Deep Neural Network (DNN) algorithm.
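To make the pre-processing and Bag-of-Words steps concrete, here is a toy English sketch; the stop-word list and normalization are stand-ins for the Arabic-specific tools the study actually used.

```python
# Toy illustration of stop-word removal, normalization, and Bag-of-Words.
# The stop-word set and example texts are made up for illustration.

STOP_WORDS = {"the", "is", "a", "and"}

def normalize(text):
    return text.lower().strip()

def bag_of_words(texts):
    vocab = sorted({w for t in texts for w in normalize(t).split()
                    if w not in STOP_WORDS})
    vectors = []
    for t in texts:
        words = [w for w in normalize(t).split() if w not in STOP_WORDS]
        vectors.append([words.count(v) for v in vocab])
    return vocab, vectors

vocab, vectors = bag_of_words(["the service is great", "great fast service"])
print(vocab)    # ['fast', 'great', 'service']
print(vectors)  # [[0, 1, 1], [1, 1, 1]]
```

Each vector can then be fed to a classifier such as Random Forest, SVM, or a DNN.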
Having a system that can take an image of a natural scene and accurately classify the plants in it is of undeniable importance. However, the complexity of natural scene images and the vast diversity of plants in the wild make designing such a classifier a challenging task. Deep Learning (DL) lends itself as a viable solution to such a complex problem. However, advances in DL architectures and software (including DL frameworks) come at a high cost in terms of energy consumption, especially when employing Graphics Processing Units (GPUs). As data expands rapidly, the need to create energy-aware models increases in order to reduce energy consumption and move towards "Greener AI". Since the problem of designing energy-aware architectures for plant classification has not been studied significantly in the literature, our work starts to bridge this gap by focusing not only on the models' performance but also on their energy usage on both CPU and GPU platforms. We consider different state-of-the-art Convolutional Neural Network (CNN) architectures and train them on two famously challenging plant datasets: iNaturalist and Herbarium. Our experiments highlight the trade-off between accuracy and energy consumption. For example, the results show that while GPU-bound models can be about 40% faster in training time than simple models running on a CPU, the latter's energy consumption is only two thirds of the former's. We hope that such findings will encourage the community to reduce its reliance on accuracy alone when comparing architectures and to take other factors into account, such as power consumption and simplicity.
Human activity recognition is a thriving field with applications in several domains. It relies on well-trained artificial intelligence models to provide accurate real-time predictions of various human movements and activities. Human activity recognition utilizes various types of sensors, such as video cameras, fixed motion sensors, and those found in personal smart edge devices, such as accelerometers and gyroscopes. The latter sensors capture motion as time-series data following a specific pattern for each movement. However, the movements of some users may deviate from these patterns, which limits the efficacy of a generic model. This paper proposes a human activity recognition architecture that utilizes deep learning models on time-series data. It applies incremental learning to build personalized models derived from a well-trained base model. The architecture uses edge devices for model prediction and the cloud for model training. Performing the prediction on edge devices reduces the network overhead as well as the load on the cloud. We tested our approach on a publicly available dataset containing samples of daily living activities and fall states. The results show that building a personalized model from a well-trained model significantly improves prediction accuracy. Moreover, deploying a light version of the model on edge devices maintains prediction accuracy and provides response times comparable to the original model on the cloud.
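The personalization idea can be illustrated with a much simpler stand-in than the paper's deep models: start from "generic" per-activity centroids learned over many users, then incrementally nudge them toward a specific user's samples instead of retraining from scratch. This sketch is an analogue of the concept, not the paper's method.

```python
# Simplified analogue of incremental personalization: shift a generic
# per-activity centroid toward a user's own samples. The rate, centroid,
# and sample values are all illustrative assumptions.

def personalize(centroid, user_samples, rate=0.5):
    """Move a per-activity centroid toward each user sample in turn."""
    for sample in user_samples:
        centroid = [c + rate * (s - c) for c, s in zip(centroid, sample)]
    return centroid

generic_walk = [1.0, 1.0]                 # generic "walking" centroid (2 features)
user_data = [[2.0, 2.0], [2.0, 2.0]]      # this user's walking samples
print(personalize(generic_walk, user_data))  # [1.75, 1.75]
```

In the deep-learning setting, the analogous step is fine-tuning the trained model's weights on the new user's data.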
Recently, the COVID-19 pandemic triggered different behaviors in education, especially during the lockdowns imposed to contain the virus outbreak. As a result, educational institutions worldwide turned to online learning platforms to maintain their educational presence. This paper introduces and examines a dataset, E-LearningDJUST, that represents a sample of students' study progress during the pandemic at Jordan University of Science and Technology (JUST). The dataset covers 9,246 students from 11 faculties taking four courses in the spring 2020, summer 2020, and fall 2021 semesters. To the best of our knowledge, it is the first dataset reflecting students' study progress within a Jordanian institute using e-learning system records. One of this work's key findings is a high correlation between e-learning events and the final grades (out of 100). The E-LearningDJUST dataset was used with two robust machine learning models (Random Forest and XGBoost) and one simple deep learning model (a Feed-Forward Neural Network) to predict students' performance. Using RMSE as the primary evaluation criterion, the RMSE values range between 7 and 17. Among the other main findings, applying feature selection with the Random Forest leads to better prediction results for all courses, with RMSE differences ranging between 0 and 0.20. Finally, a comparison study examined students' grades before and after the pandemic to understand how it impacted them. A higher success rate was observed during the pandemic than before, which is expected because the exams were online. However, the proportion of students with high marks remained similar to that of pre-pandemic courses.
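The correlation finding can be checked with the plain Pearson correlation coefficient; the sketch below shows that computation on fabricated numbers (the real event counts and grades are in the dataset, not reproduced here).

```python
import math

# Pearson correlation between per-student e-learning event counts and final
# grades. All values below are fabricated for illustration only.

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

events = [10, 40, 55, 80]   # e-learning events per student (made up)
grades = [50, 70, 75, 95]   # final grades out of 100 (made up)
print(round(pearson(events, grades), 2))  # 0.99
```

A coefficient near 1 indicates the strong positive relationship the abstract reports.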
The release of millions of financial documents, which became known as the 'WikiLeaks' of the financial world (a.k.a. the 'Panama Papers'), has drawn global attention to the highly structured means applied by some of the elite to conceal their financial assets. Consequently, significant allegations of financial corruption were raised. We concentrate on a somewhat overlooked region: the Middle East and North Africa (MENA). This study uses social network analytics to study the information contained in these documents. We examine the major players, trends, and patterns in the MENA region to determine whether they match the region's known economic powers. The analysis reveals that while the constructed network exhibits some typical characteristics, many interesting observations and properties are worth discussing. Specifically, using the extracted network consisting of 62,987 nodes and 84,692 edges, our social network analysis shows that, perhaps surprisingly, the prominent nodes of the network are not necessarily directly correlated with perceived economic influence.
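The simplest measure used to find "major players" in such a network is node degree computed from an edge list. The sketch below illustrates it on toy edges, which are not entries from the Panama Papers network.

```python
from collections import Counter

# Degree computation over an undirected edge list; the entities and edges
# here are invented placeholders, not data from the documents.

def top_by_degree(edges, k=2):
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree.most_common(k)

edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
print(top_by_degree(edges))  # [('A', 3), ('B', 2)]
```

Richer measures (betweenness, PageRank, community detection) follow the same pattern of ranking nodes and comparing the ranking against external expectations.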
Human activity recognition is concerned with detecting different types of human movements and actions using data gathered from various types of sensors. Deep learning approaches, when applied to time-series data, offer promising results over intensive handcrafted feature extraction techniques, which are highly reliant on the quality of predefined domain parameters. In this paper, we investigate the benefits of time-series data augmentation in improving the accuracy of several deep learning models on human activity data gathered from mobile phone accelerometers. More specifically, we compare the performance of Vanilla, Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) neural network models on three open-source datasets. We use two time-series data augmentation techniques and study their impact on the accuracy of the target models. The experiments show that the GRU models achieve the best results in terms of accuracy and training time, followed by the LSTM models. Furthermore, the results show that data augmentation significantly enhances recognition quality.
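The abstract does not name the two augmentation techniques used, so the sketch below shows two common time-series augmentations of the same kind, jittering (adding noise) and magnitude scaling, as illustrative assumptions.

```python
import random

# Two common time-series augmentations, shown as illustrative stand-ins for
# the (unnamed) techniques used in the paper. Both preserve sequence length.

def jitter(series, sigma=0.05, seed=0):
    """Add small Gaussian noise to each sample (seeded for reproducibility)."""
    rng = random.Random(seed)
    return [x + rng.gauss(0, sigma) for x in series]

def scale(series, factor=1.1):
    """Scale the magnitude of the whole sequence."""
    return [x * factor for x in series]

original = [0.1, 0.2, 0.3, 0.4]
print(len(jitter(original)) == len(original))      # True: shape preserved
print([round(x, 2) for x in scale(original)])      # [0.11, 0.22, 0.33, 0.44]
```

Each augmented copy keeps the original label, effectively enlarging the training set without new data collection.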
Bioinformatics is an interdisciplinary field that applies techniques from information technology, mathematics, and statistics to the study of large biological datasets. It involves several computational techniques, such as sequence and structural alignment, data mining, macromolecular geometry, protein structure prediction, and gene finding. Protein structure and sequence analysis are vital to the understanding of cellular processes, which in turn contributes to the development of drugs targeting metabolic pathways. Protein sequence alignment is concerned with identifying the similarities and relationships among different protein structures. In this paper, we target two well-known protein sequence alignment algorithms: the Needleman–Wunsch and Smith–Waterman algorithms. Both are computationally expensive, which hinders their applicability to large datasets. Thus, we propose a hybrid parallel approach that combines the capabilities of multi-core CPUs with the power of contemporary GPUs and significantly speeds up the execution of the target algorithms. The validity of our approach is tested on real protein sequences. Moreover, the scalability of the approach is verified on randomly generated sequences with predefined similarity levels. The results show that the proposed hybrid approach is up to 242 times faster than the sequential approach.
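For reference, the sequential Needleman–Wunsch score computation that such work parallelizes looks like the sketch below; the match/mismatch/gap scores are typical toy values, not the paper's scoring scheme.

```python
# Sequential reference version of the Needleman-Wunsch score matrix.
# Scoring parameters (match=1, mismatch=-1, gap=-1) are the classic toy
# values, not those used in the paper.

def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        score[i][0] = i * gap
    for j in range(cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    return score[-1][-1]

print(needleman_wunsch_score("GATTACA", "GCATGCU"))  # 0
```

The cells of each anti-diagonal of the score matrix depend only on earlier anti-diagonals, which is what makes the computation amenable to CPU/GPU parallelization.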
Computer-aided diagnosis (CAD) systems have been the focus of many research endeavors. We consider the problem of building a CAD system for diagnosing lumbar disk herniation from MRI axial scans. Like other typical image-based CAD systems, the system we consider consists of several stages: image acquisition, region of interest (ROI) extraction and enhancement, feature extraction, and classification. Experimentally, we found that ROI extraction is the hardest stage and that it largely determines the accuracy of the CAD system. In this work, we improve the ROI extraction process by using SIFT features, which are well known for their use in object matching. The experiments conducted to evaluate the SIFT-based ROI extraction approach show its superiority over the existing heuristic approach.
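SIFT extraction itself requires an image library, but the matching step that makes SIFT useful for locating a known structure, Lowe's ratio test over descriptor distances, can be sketched with plain vectors; the descriptors below are made up for illustration.

```python
import math

# Lowe's ratio test: accept a match only when the best candidate descriptor
# is clearly closer than the second best. Descriptors here are tiny made-up
# vectors; real SIFT descriptors are 128-dimensional.

def euclidean(d1, d2):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(d1, d2)))

def ratio_test_match(query, candidates, ratio=0.8):
    """Return the index of an unambiguous best match, or None."""
    order = sorted(range(len(candidates)),
                   key=lambda i: euclidean(query, candidates[i]))
    best, second = order[0], order[1]
    if euclidean(query, candidates[best]) < ratio * euclidean(query, candidates[second]):
        return best
    return None

descriptors = [[0.0, 0.0], [5.0, 5.0], [9.0, 9.0]]
print(ratio_test_match([0.1, 0.1], descriptors))                      # 0 (clear match)
print(ratio_test_match([4.9, 4.9], [[4.0, 4.0], [6.0, 6.0], [9.0, 9.0]]))  # None (ambiguous)
```

Matched keypoints between a template and a new scan can then vote for the location of the ROI.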
The outbreak of coronavirus disease 2019 (COVID-19) drove most higher education systems in many countries to stop face-to-face learning. Accordingly, many universities, including Jordan University of Science and Technology (JUST), switched from face-to-face education to electronic learning from a distance. This paper investigates the impact of the e-learning experience on students during the spring semester of 2020 at JUST. It also explores how to predict students' academic performance using e-learning data. We collected students' datasets from two sources: the Center for E-Learning and Open Educational Resources and the Admission and Registration Unit at the university. Five courses in the spring semester of 2020 were targeted. Four regression machine learning algorithms were used to generate the predictions: Random Forest (RF), Bayesian Ridge (BR), Adaptive Boosting (AdaBoost), and Extreme Gradient Boosting (XGBoost). The results showed that an ensemble of RF and XGBoost yielded the best performance. It is worth mentioning that, among all the e-learning components and events, quiz events had a significant impact on predicting students' academic performance. Moreover, the paper shows that activities between weeks 9 and 12 influenced students' performance during the semester.
This paper is the first step in an effort toward building an automatic speech recognition (ASR) system for Quranic recitations that caters specifically to female reciters. To function properly, ASR systems require a huge amount of training data. Surprisingly, the data readily available for Quranic recitations suffer from major limitations. Specifically, the currently available audio recordings of Quran recitations have massive volume, but they are mostly made by male reciters (who have dedicated most of their lives to perfecting their recitation skills) using professional and expensive equipment. Such proficiency in the training data (along with the fact that the reciters come from a specific demographic group: adult males) will most likely bias the resulting models and limit their ability to process input from other groups, such as non-/semi-professionals, females, or children. This work aims at empirically exploring this shortcoming. To do so, we create a first-of-its-kind (to the best of our knowledge) benchmark dataset called the Quran recitations by females and males (QRFAM) dataset. QRFAM is a relatively big dataset of audio recordings made by male and female reciters from different age groups and proficiency levels. After creating the dataset, we experiment on it by building ASR systems based on one of the most popular open-source ASR models: the celebrated DeepSpeech model from Mozilla. The speaker-independent end-to-end models that we produce are evaluated using word error rate (WER). Despite DeepSpeech's known flexibility and prowess (which show when it is trained and tested on recitations from the same group), the models trained on the recitations of one group could not recognize most of the recitations of the other groups in the testing phase.
This shows that there is still a long way to go to produce an ASR system that can be used by anyone, and that the first step is to build and expand the resources needed, such as QRFAM. We hope our work is a first step in this direction and that it will inspire the community to take more interest in this problem.
Computer-aided diagnosis systems have been the focus of many research endeavours. In addition to being a great asset for any hospital, such systems represent invaluable platforms for educational and research purposes. In this work, we propose a system for the diagnosis of lumbar disk herniation from magnetic resonance imaging (MRI) scans and for training on such diagnosis. The proposed system makes three main novel contributions. First, it utilises the axial MRI spine view of the suspected region instead of the sagittal spine view. Second, instead of simply classifying cases as normal or abnormal, it determines the type of lumbar disk herniation and pinpoints its location. The final contribution is a simulated training environment that can be used to train novice radiologists on the diagnosis of lumbar disk herniation. Our experiments show that the system is quick and accurate, in addition to being very useful for training purposes.
Android applications have recently witnessed pronounced progress, making them one of the fastest growing technological fields. However, such growth does not come without cost, particularly in the form of increased security threats to which the underlying applications and their users usually fall prey. As malware becomes increasingly capable of penetrating these applications and exploiting them in suspicious actions, the need for active research to counter these malicious programs becomes pressing. Some studies are based on dynamic analysis, others on static analysis, and some depend on both. In this paper, we study static, dynamic, and hybrid analyses for identifying malicious applications. We leverage machine learning classifiers to detect malware activities and explain the effectiveness of these classifiers in the classification process. Our results demonstrate the effectiveness of the permissions and action-repetition feature sets and their influential roles in detecting malware in Android applications. Empirically, static, dynamic, and hybrid analyses yield very close accuracy results. Thus, we opt for static analysis due to its lower cost compared to dynamic and hybrid analyses; in other words, the trade-off between accuracy and cost leads us to select static analysis over the other techniques.
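As an illustration of how a permission-based feature set feeds a classifier, an app's requested permissions can be encoded as a binary vector; the permission names below are real Android permissions, but the app profiles are invented.

```python
# Binary permission feature vectors for malware classification. The
# permission list is a tiny illustrative subset; the two app profiles
# are fabricated examples, not real apps.

PERMISSIONS = ["INTERNET", "SEND_SMS", "READ_CONTACTS", "CAMERA"]

def to_feature_vector(requested):
    """1 if the app requests the permission, 0 otherwise."""
    return [1 if p in requested else 0 for p in PERMISSIONS]

benign_app  = {"INTERNET", "CAMERA"}
suspect_app = {"INTERNET", "SEND_SMS", "READ_CONTACTS"}
print(to_feature_vector(benign_app))   # [1, 0, 0, 1]
print(to_feature_vector(suspect_app))  # [1, 1, 1, 0]
```

Such vectors, possibly extended with dynamic features like action-repetition counts, become the rows of the training matrix for the machine learning classifiers.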