📄 CV 🔗LinkedIn 🐦 Twitter 💻 GitHub ✉️ Email

Md Toki Tahmid

Welcome! I am a recent graduate from the Department of Computer Science and Engineering at the Bangladesh University of Engineering and Technology (BUET). My passion lies in leveraging computational techniques to address challenges in biomedical informatics, clinical health, precision medicine, and computational biology.


Research Interests


Journal Publications


TransBind: Precise Detection of DNA-Binding Proteins and Residues Using Language Models and Deep Learning


Md Toki Tahmid, A.K.M. Mehedi Hasan, Md Shamsuzzoha Bayzid
Nature Communications Biology, 2024

Identifying DNA-binding proteins and their binding residues is critical for understanding diverse biological processes, but conventional experimental approaches are slow and costly. Existing machine learning methods, while faster, often lack accuracy and struggle with data imbalance, relying heavily on evolutionary profiles like PSSMs and HMMs derived from multiple sequence alignments (MSAs). These dependencies make them unsuitable for orphan proteins or those that evolve rapidly. To address these challenges, we introduce TransBind, an alignment-free deep learning framework that predicts DNA-binding proteins and residues directly from a single primary sequence, eliminating the need for MSAs. By leveraging features from pre-trained protein language models, TransBind effectively handles the issue of data imbalance and achieves superior performance. Extensive evaluations using diverse experimental datasets and case studies demonstrate that TransBind significantly outperforms state-of-the-art methods in terms of both accuracy and computational efficiency.

A Ubiquitous Method for Predicting Underground Petroleum Deposits Based on Satellite Data


Sarfaraz Newaz, Md Toki Tahmid, Nadia Al-Aboody, ABM Alim Al Islam
Nature Scientific Reports, 2023
Finding new petroleum deposits beneath the earth’s surface is challenging because existing methods are both inaccurate and highly expensive. As a remedy, this paper presents a novel way to predict the locations of petroleum deposits. We focus on a region of the Middle East, specifically Iraq, and conduct a detailed study on predicting the locations of petroleum deposits there. To do so, we develop a new prediction method based on publicly available data sensed by an open satellite mission named Gravity Recovery and Climate Experiment (GRACE). Using GRACE data, we calculate the gravity gradient tensor of the earth over Iraq and its surroundings, and we use this calculated data to predict the locations of prospective petroleum deposits over the region. In the process of our study for making the …

MD-CardioNet: A Multi-Dimensional Deep Neural Network for Cardiovascular Disease Diagnosis from Electrocardiogram


Md Toki Tahmid, Muhammad Ehsanul Kader, Tanvir Mahmud, Shaikh Anowarul Fattah
IEEE Journal of Biomedical and Health Informatics, 2023
Automated classification of cardiovascular diseases from electrocardiogram (ECG) signals using deep learning has gained significant interest due to its wide range of applications. However, existing deep learning approaches often overlook inter-channel shared information or lose time-sequence dependent information when considering 1D and 2D ECG representations, respectively. Moreover, besides considering the spatial dimension, it is necessary to understand the context of the signals from a global feature space. To address these challenges, we propose MD-CardioNet, an efficient deep learning architecture that captures temporal, spatial, and volumetric features from multi-lead ECG signals using multidimensional (1D, 2D, and 3D) convolutions. Sequential feature extractors capture time-dependent information, while a 2D convolution is applied to form an image representation from the multi-channel ECG signal, extracting inter-channel features. Additionally, a volumetric feature extraction network is designed to incorporate intra-channel, inter-channel, and inter-filter global space information. To reduce computational complexity, we introduce a practical knowledge distillation framework that reduces the number of trainable parameters by up to eight times (from 4,304,910 parameters to 94,842 parameters) while maintaining satisfactory performance comparable to other existing approaches. The proposed architecture is evaluated on a large publicly available dataset containing ECG signals from over 10,000 patients, achieving an accuracy of 97.3% in classifying six heartbeat rhythms. Our results surpass the performance of some state-of-the-art approaches. This paper presents a novel deep-learning approach for ECG classification that addresses the limitations of existing methods. The experimental results highlight the robustness and accuracy of MD-CardioNet in cardiovascular disease classification, offering valuable insights for future research in this field.

Escalating Post-Disaster Rescue Missions Through Ad-Hoc Victim Localization Exploiting Wi-Fi Networks


Khondoker Nazmoon Nabi, Md Toki Tahmid, Abdur Rafi, Md. Ehsanul Kader, Md. Asif Haider
Heliyon, 2021
The number of disasters, accidents, and resulting casualties is increasing; however, technological advancement has yet to bring benefits to emergency rescue operations. This contrast is even more prominent in the Global South. The consequences are a huge loss of wealth and resources and, more importantly, the loss of lives. Locating victims of disasters as quickly as possible while speeding up rescue operations can lessen these losses. Traditional approaches for effective victim localization and rescue often require additional infrastructure to be established during the construction period, a practice that is not followed for most industrial and household construction in countries of the Global South such as Bangladesh. In this paper, we conduct a study to better understand the challenges of victim localization in emergency rescue operations and to overcome them using “whatever” resources are available at hand, without needing prior infrastructure facilities or pre-calibration. We design and develop a solution for this purpose and deploy it in several emulated disaster-like scenarios. We analyze and discuss the results obtained from our experiments. Finally, we point out the design implications of an infrastructure-independent and extensive emergency rescue system.

Forecasting COVID-19 Cases: A Comparative Analysis Between Recurrent and Convolutional Neural Networks


Khondoker Nazmoon Nabi, Md Toki Tahmid, Abdur Rafi, Md. Ehsanul Kader, Md. Asif Haider
Results in Physics, 2021
Though many countries have already launched COVID-19 mass vaccination programs to control the disease outbreak quickly, numerous countries worldwide are grappling with unprecedented surges of new COVID-19 cases due to a more contagious and deadly variant of coronavirus. As the number of new cases is skyrocketing, pandemic fatigue and public apathy towards different intervention strategies pose new challenges to government officials in combating the pandemic. Hence, it is indispensable for government officials to understand the future dynamics of COVID-19 accurately in order to develop strategic preparedness and resilient response planning. In light of these circumstances, probable future outbreak scenarios in Brazil, Russia, and the United Kingdom have been sketched in this study with the help of four deep learning models: long short-term memory (LSTM), gated recurrent unit (GRU), convolutional neural network (CNN), and multivariate convolutional neural network (MCNN). In our analysis, the CNN algorithm outperformed the other deep learning models in terms of validation accuracy and forecasting consistency. Our study finds that CNN can provide robust long-term forecasting results in time-series analysis due to its capability for essential feature learning, distortion invariance, and temporal dependence learning. However, the prediction accuracy of the LSTM algorithm was found to be poor, as it tries to discover seasonality and periodic intervals in a time-series dataset, which were absent in our studied countries. Our study highlights the promise of using convolutional neural networks instead of recurrent neural networks when forecasting with very few features and a limited amount of historical data.

Conference Publications


TomoPicker: Annotation-Efficient Particle Picking in Cryo-Electron Tomograms


Mostofa Rafid Uddin, Ajmain Yasar Ahmed, Md Toki Tahmid, Md. Zarif Ul Alam, Zachary Freyberg, Min Xu
NeurIPS 2024 MLSB Workshop, 2024
Particle picking in cryo-electron tomograms (cryo-ET) is crucial for in situ structure detection of macromolecules and protein complexes. The traditional template-matching-based approaches for particle picking suffer from template-specific biases and have low throughput. Given these problems, learning-based solutions are necessary for particle picking. However, the paucity of annotated data for training poses substantial challenges for such learning-based approaches. Moreover, preparing extensively annotated cryo-ET tomograms for particle picking is extremely time-consuming and burdensome. Addressing these challenges, we present TomoPicker, an annotation-efficient particle-picking approach that can effectively pick particles when only a minuscule portion (≈ 0.3−0.5%) of the total particles in a cellular cryo-ET dataset is provided for training. TomoPicker regards particle picking as a voxel classification problem and solves it with two different positive-unlabeled learning approaches. We evaluated our method on a benchmark cryo-ET dataset of eukaryotic cells, where we observed about 30% improvement by TomoPicker against the most recent state-of-the-art annotation efficient learning-based picking approaches.

Analyzing Impacts on Physiological Aspects of Rickshaw Pullers due to Heat Exposure


Maoyejatun Hasana, Masfiqur Rahaman, Tauhidur Rahman, Razin Reaz Abedin, Md Toki Tahmid, Ishika Tarin, Sudipa Saha, Sutapa Dey Tithi, Zarin Tasnim Promi, Kazi Abdun Noor, Md Zahidul Islam Sanjid, Mahir Shahriar Dhrubo, Samira Akter, A. B. M. Alim Al Islam
11th International Conference on Networking, Systems, and Security, 2024
The changing climate is anticipated to result in more frequent and extended heat waves having catastrophic effects on urban areas. Prolonged exposure to extreme heat is believed to adversely affect both physiological and psychological well-being, particularly for outdoor workers. This study analyzed the effects of hot weather conditions on the brainwaves of rickshaw pullers by examining changes in EEG signals before and after driving rickshaws at different air temperatures and relative humidity levels. We selected six rickshaw pullers as participants for six levels of air temperature (°C) and humidity condition (%) in various outdoor areas in Dhaka city, Bangladesh, while driving rickshaws for a duration of 20 to 30 minutes. The acquired EEG data assists in assessing the effects of heat exposure on cerebral activity. Our statistical analysis shows that there is a positive association between heat stress and the alpha band power of brainwaves.

EmbedSimScore: Advancing Protein Similarity Analysis with Structural and Contextual Embeddings


Gourab Saha*, Md Toki Tahmid*, Md. Shamsuzzoha Bayzid (*Equal Contribution)
NeurIPS 2024 SSL Workshop (Self-Supervised Learning - Theory and Practice), 2024
Accurately computing protein similarity is challenging due to the intricate interplay between local substructures and the global structure within protein molecules. Traditional metrics like TM-score often focus on aligning the global structures of the proteins in a rather geometry-based algorithmic way, potentially overlooking critical local-global relations and contextual comparisons. We introduce EmbedSimScore, a novel self-supervised method that generates structural and contextual embeddings by jointly considering both local substructures and global protein structures. Utilizing contrastive language-structure pre-training (CLSP) and structural contrastive learning, EmbedSimScore captures comprehensive features across different scales of protein structure. These embeddings provide a more precise and holistic means of computing protein similarities, resulting in the identification of intrinsic relations among proteins that traditional approaches overlook.

LOCAS: Multi-label mRNA Localization with Supervised Contrastive Learning


Abrar Rahman Abir*, Md Toki Tahmid*, M. Saifur Rahman (*Equal Contribution)
NeurIPS 2024 MLSB Workshop, 2024
Traditional methods for mRNA subcellular localization often fail to account for multiple compartmentalization. Recent multi-label models have improved performance, but still face challenges in capturing complex localization patterns. We introduce LOCAS (Localization with Supervised Contrastive Learning), which integrates an RNA language model to generate initial embeddings, employs supervised contrastive learning (SCL) to identify distinct RNA clusters, and uses a multi-label classification head (ML-Decoder) with cross-attention for accurate predictions. Through extensive ablation studies and multi-label overlapping threshold tuning, LOCAS achieves state-of-the-art performance across all metrics, providing a robust solution for RNA localization tasks.

RNA-DCGen: Dual Constrained RNA Sequence Generation with LLM-Attack


Haz Sameen Shahgir*, Md. Rownok Zahan Ratul*, Md Toki Tahmid*, Khondker Salman Sayeed, Atif Rahman (*Equal Contribution)
NeurIPS 2024 MLSB Workshop, 2024
Designing RNA sequences with specific properties is critical for developing personalized medications and therapeutics. While recent diffusion and flow-matching-based generative models have made strides in conditional sequence design, they face two key limitations: specialization for fixed constraint types, such as tertiary structures, and lack of flexibility in imposing additional conditions beyond the primary property of interest. To address these challenges, we introduce RNA-DCGen, a generalized framework for RNA sequence generation that is adaptable to any structural or functional properties through straightforward finetuning with an RNA language model (RNA-LM). Additionally, RNA-DCGen can enforce conditions on the generated sequences by fixing specific conserved regions. On RNA generation conditioned on RNA distance maps, RNA-DCGen generates sequences with an average R² score of 0.625, compared to random sequences that score only 0.118, over 250 generations as judged by a separate, more capable RNA-LM. When conditioned on RNA secondary structures, RNA-DCGen achieves an average F1 score of 0.4 against a random baseline of 0.006.

Structure Matters: Deciphering Neural Network’s Properties from its Structure


Shashata Sawmya*, Md Toki Tahmid*, Gourab Saha*, Arpita Saha, Nir N. Shavit, Lu Mi (*Equal Contribution)
NeurIPS 2024 Symmetry and Geometry in Neural Representations Workshop, 2024
Neural networks, both biological and artificial, are commonly represented as graphs with connections between neurons, yet there is little understanding of the relationship between their graph structure and computational properties. Neuroscientists are trying to answer this question in biological neural networks or connectomes; however, there is a big opportunity to explore this in the vast domain of artificial neural networks. We present StructureReps, an architecture-agnostic framework for encoding neural networks as graphs using graph representation learning. By capturing key structural properties, StructureReps reveals strong correlations between network structure and task performance across various architectures. Additionally, this framework has potential applications beyond the decoding of neural network properties.

BiRNA-BERT: Adaptive Tokenization for Efficient RNA Language Modeling


Md Toki Tahmid, Haz Sameen Shahgir, Sazan Mahbub, Yue Dong, Md Shamsuzzoha Bayzid
NeurIPS 2024 ENLSP Workshop, Oral/Spotlight, 2024
Recent advancements in Transformer-based models have spurred interest in their use for biological sequence analysis. However, adapting models like BERT is challenging due to sequence length, often requiring truncation for proteomics and genomics tasks. Additionally, advanced tokenization and relative positional encoding techniques for long contexts in NLP are often not directly transferable to DNA/RNA sequences, which require nucleotide or character-level encodings for tasks such as 3D torsion angle prediction. To tackle these challenges, we propose an adaptive dual tokenization scheme for bioinformatics that utilizes both nucleotide-level (NUC) and efficient BPE tokenizations. Building on the dual tokenization, we introduce BiRNA-BERT, a 117M parameter Transformer encoder pretrained with our proposed tokenization on 36 million coding and non-coding RNA sequences. BiRNA-BERT achieves state-of-the-art results in long-sequence downstream tasks and performs comparably to models 6 times larger in short-sequence tasks with 27 times less pre-training compute. In addition, our empirical experiments and ablation studies demonstrate that NUC is often preferable over BPE for bioinformatics tasks, given sufficient VRAM availability. This further highlights the advantage of BiRNA-BERT, which can dynamically adjust its tokenization strategy based on sequence length: it utilizes NUC for shorter sequences and switches to BPE for longer ones, eliminating the need for truncation.
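The length-based switch between NUC and BPE tokenization can be sketched roughly as follows. This is only an illustration, not the BiRNA-BERT implementation: the `MAX_TOKENS` budget, the toy merge table `TOY_MERGES`, and all function names are hypothetical, and the ordered pair merging stands in for a trained BPE vocabulary.

```python
# Hypothetical sketch of length-adaptive dual tokenization.
MAX_TOKENS = 8  # assumed context budget, kept tiny for demonstration

# Toy merge table standing in for a trained BPE vocabulary (assumption).
TOY_MERGES = [("A", "U"), ("G", "C"), ("AU", "GC")]


def nuc_tokenize(seq):
    """Nucleotide-level (NUC) tokenization: one token per base."""
    return list(seq)


def bpe_tokenize(seq, merges=TOY_MERGES):
    """Apply each merge rule in order, fusing adjacent matching token pairs."""
    tokens = list(seq)
    for left, right in merges:
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and tokens[i] == left and tokens[i + 1] == right:
                merged.append(left + right)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens


def adaptive_tokenize(seq, max_tokens=MAX_TOKENS):
    """Use NUC when the sequence fits the budget, otherwise fall back to BPE."""
    if len(seq) <= max_tokens:
        return "NUC", nuc_tokenize(seq)
    return "BPE", bpe_tokenize(seq)
```

With these toy settings, a 4-base sequence stays at nucleotide resolution, while a 12-base sequence is compressed by the merge rules so that it fits the budget without truncation.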

Long-Range Low-Cost Networking for Real-Time Monitoring of Rail Tracks in Developing Countries


Saiful Islam Salim, Uday Kamal, Adnan Quaium, Mainul Hossain, Masfiqur Rahaman, Nazmul Hasan Sakib, Md Toki Tahmid, ABM Alim Al Islam
Proceedings of the 2022 International Conference on Information and Communication Technologies and Development, 2022
Derailments are frequent in several developing countries and result in massive loss of property as well as loss of lives. To prevent derailments, a real-time automated system is needed to detect uprooted or faulty rail blocks. One solution in this context is to sense the vibration of a rail track with an incoming train and transmit the information to the train, notifying it about the condition of the rail track ahead. However, existing studies in this regard are yet to present a pragmatic solution that enables the much-demanded long-distance networking needed to transmit the sensed data. Long-distance network communication between the sensor nodes and the incoming train is unavoidable, as stopping the train after sensing an uprooted or faulty rail block ahead needs considerable response time and distance. Therefore, in this paper, we develop a low-cost, long-range, and highly reliable mobile multi-hop networking scheme to successfully transmit data sensed from rail tracks to an approaching train at a distance of around 2000 m. By considering the effect of the Fresnel zone in our study, we determine the suitable placement of the networking module on the rail track, which leads us to achieve a delivery ratio of more than 99%. We confirm this finding through rigorous experiments over a real testbed scenario enabling mobile multi-hop networking.
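To give a feel for why antenna placement matters over such distances, the sketch below computes the radius of the first Fresnel zone with the standard formula r = sqrt(n * λ * d1 * d2 / (d1 + d2)). The 2.4 GHz carrier frequency and the single 2000 m link are assumptions made for illustration only; the abstract states neither the radio band nor the per-hop distance.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def fresnel_radius(d1_m, d2_m, freq_hz, zone=1):
    """Radius (m) of the n-th Fresnel zone at a point d1 from one end and d2 from the other."""
    wavelength = SPEED_OF_LIGHT / freq_hz
    return math.sqrt(zone * wavelength * d1_m * d2_m / (d1_m + d2_m))


# Assumed example values, not taken from the paper.
ASSUMED_FREQ_HZ = 2.4e9
ASSUMED_LINK_M = 2000.0

# The zone is widest at the midpoint of the link.
midpoint_radius = fresnel_radius(ASSUMED_LINK_M / 2, ASSUMED_LINK_M / 2, ASSUMED_FREQ_HZ)
```

Under these assumed values the first zone is several metres wide at the midpoint, so a module mounted close to the ground on a rail track would have part of that zone obstructed, which is consistent with placement affecting the delivery ratio.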

Artificial Intelligence Based Cybersecurity: Two-Step Suitability Test


Sajjad Waheed Shah Md Istiaque, Md Toki Tahmid, Asif Iqbal Khan, Zaber Al Hassan
IEEE International Conference on Service Operation and Logistics, and Informatics (SOLI), 2021
Enormous network connectivity, big data, the Internet of Things (IoT), digitalization of the world, and the use of social websites and apps have brought enormous institutional and individual security challenges. Conventional security systems often fail to provide cybersecurity to institutions and individuals. Artificial Intelligence (AI) is highly adaptive and smart enough to handle the volatile cybersecurity environment. AI plays a prudent role in access control, user authentication and behavior analysis, and spam, malware, and botnet detection. Machine learning (ML) models are the building blocks of AI. In this research, a novel practical approach is followed to demonstrate the effectiveness of AI in the field of cybersecurity, applying multiple machine learning algorithms. A two-step suitability test is conducted in this study. In the first step, the KDD'99 data set is used to train and test the AI models. In the second step, the trained models are repeatedly tested on a fresh data set, NSL-KDD. Finally, the testing results are compared to establish the authenticity of AI in cybersecurity. A thoroughly practical approach and the reliable outputs of various AI models support AI's suitability for cybersecurity.

Preprints


GraFusionNet: Integrating Node, Edge, and Semantic Features for Enhanced Graph Representations


Md Toki Tahmid, Tanjeem Azwad Zaman, M. Saifur Rahman
bioRxiv, 2024
Understanding complex graph-structured data is a cornerstone of modern research in fields like cheminformatics and bioinformatics, where molecules and biological systems are naturally represented as graphs. However, traditional graph neural networks (GNNs) often fall short by focusing mainly on node features while overlooking the rich information encoded in edges. To bridge this gap, we present GraFusionNet, a framework designed to integrate node, edge, and molecular-level semantic features for enhanced graph classification. By employing a dual-graph autoencoder, GraFusionNet transforms edges into nodes via a line graph conversion, enabling it to capture intricate relationships within the graph structure. Additionally, the incorporation of Chem-BERT embeddings introduces semantic molecular insights, creating a comprehensive feature representation that combines structural and contextual information. Our experiments on benchmark datasets, such as Tox21 and HIV, highlight GraFusionNet's superior performance in tasks like toxicity prediction, significantly surpassing traditional models. By providing a holistic approach to graph data analysis, GraFusionNet sets a new standard in leveraging multi-dimensional features for complex predictive tasks.
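The line graph conversion mentioned above can be illustrated with a small, dependency-free sketch. It is not GraFusionNet's code: the function name `line_graph` and the toy molecule are hypothetical. Each original edge (bond) becomes a node, and two such nodes are linked whenever the original edges share an endpoint (atom).

```python
from itertools import combinations


def line_graph(edges):
    """Return (nodes, links) of the line graph of an undirected graph.

    Each original edge becomes a node; two nodes are linked when the
    corresponding original edges share an endpoint.
    """
    nodes = [tuple(sorted(edge)) for edge in edges]
    links = [(a, b) for a, b in combinations(nodes, 2) if set(a) & set(b)]
    return nodes, links


# Toy heavy-atom skeleton of ethanol (C1-C2-O), used only as an example.
toy_bonds = [("C1", "C2"), ("C2", "O")]
toy_nodes, toy_links = line_graph(toy_bonds)
```

After the conversion, edge features can be treated as ordinary node features, so a standard node-centric encoder can be applied to the bonds as well as to the atoms.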

Advancing Noninvasive Mechanical Ventilation: Simulating Techniques for Improved Respiratory Care


Md Toki Tahmid, Mrinmoy Nandi Bappa
bioRxiv, 2024
Respiratory failure is a critical condition that often requires mechanical ventilation to support or restore normal breathing. While invasive mechanical ventilation (IMV) is commonly used for severe cases, noninvasive mechanical ventilation (NIMV) offers a less intrusive alternative that reduces complications and can be applied in moderate cases. The COVID-19 pandemic highlighted the global shortage of ventilators, particularly in low- and middle-income countries (LMICs), where limited access to life-saving equipment exacerbated the crisis. In response to these challenges, this paper presents a simplified, compartmental-based simulation model for NIMV. This model provides a practical and accessible tool for simulating respiratory system behavior under various ventilation modes, using the analogy between electrical circuits and lung physiology. By simulating key parameters such as airway resistance and lung compliance, the model allows clinicians and researchers to evaluate ventilator performance and optimize treatment strategies. Furthermore, the simulation offers a blueprint for developing cost-effective, easy-to-use NIMV systems that can be deployed in resource-constrained environments. Our contribution seeks to address the ventilator shortage by enabling better design and understanding of noninvasive ventilation, ultimately improving respiratory care for patients with moderate respiratory failure.
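The electrical analogy can be made concrete with a single-compartment sketch in which airway resistance R plays the role of a resistor and lung compliance C that of a capacitor, so that under a constant driving pressure P the volume follows dV/dt = (P - V/C) / R. This is a textbook simplification offered under stated assumptions, not the paper's model; the parameter values below are illustrative only.

```python
import math


def simulate_inspiration(resistance, compliance, pressure, duration, dt=0.001):
    """Euler-integrate dV/dt = (pressure - V / compliance) / resistance from V = 0.

    Units assumed: resistance in cmH2O*s/L, compliance in L/cmH2O,
    pressure in cmH2O, time in seconds, volume in litres.
    """
    volume = 0.0
    trace = [volume]
    for _ in range(int(round(duration / dt))):
        flow = (pressure - volume / compliance) / resistance
        volume += flow * dt
        trace.append(volume)
    return trace


def analytic_volume(resistance, compliance, pressure, t):
    """Closed-form RC charging curve, V(t) = C * P * (1 - exp(-t / (R * C)))."""
    return compliance * pressure * (1.0 - math.exp(-t / (resistance * compliance)))


# Illustrative (assumed) parameters: R = 10, C = 0.05, P = 10, 3 s of inspiration.
volumes = simulate_inspiration(10.0, 0.05, 10.0, 3.0)
```

The product R * C is the time constant of the compartment; with the assumed values it is 0.5 s, so the delivered volume has nearly plateaued at C * P after 3 s.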

DeepRNA-Twist: Language Model Guided RNA Torsion Angle Prediction with Attention-Inception Network


Abrar Rahman Abir, Md Toki Tahmid, Rafiqul Islam Rayan, M. Saifur Rahman
bioRxiv, 2024
RNA torsion and pseudo-torsion angles are critical in determining the three-dimensional conformation of RNA molecules, which in turn governs their biological functions. However, current methods are limited by the structural complexity and flexibility of RNA, as it can adopt multiple conformations, with experimental techniques being costly and computational approaches struggling to capture the intricate sequence dependencies needed for accurate predictions. To address these challenges, we introduce DeepRNA-Twist, a novel deep learning framework designed to predict RNA torsion and pseudo-torsion angles directly from sequence. DeepRNA-Twist utilizes RNA language model embeddings, which provide rich, context-aware feature representations of RNA sequences. Additionally, it introduces the 2A3IDC module (Attention Augmented Inception Inside Inception with Dilated CNN), combining inception networks with dilated convolutions and a multi-head attention mechanism. The dilated convolutions capture long-range dependencies in the sequence without requiring a large number of parameters, while the multi-head attention mechanism enhances the ability of the model to focus on both local and global structural features simultaneously. DeepRNA-Twist was rigorously evaluated on benchmark datasets, including RNA-Puzzles, CASP-RNA, and SPOT-RNA 1D, and demonstrated significant improvements over existing methods, achieving state-of-the-art accuracy. Source code is available at https://github.com/abrarrahmanabir/DeepRNA-Twist

BioLLMNet: Enhancing RNA-Interaction Prediction with a Specialized Cross-LLM Transformation Network


Md Toki Tahmid, Abrar Rahman Abir, Md. Shamsuzzoha Bayzid
bioRxiv, 2024
Existing computational methods for the prediction of RNA-related interactions often rely heavily on manually crafted features. Language model features for bio-sequences have gained significant popularity in proteomics and genomics. However, how language model features from different modalities should be combined during interaction prediction to extract the most representative features is yet to be explored. We introduce BioLLMNet, a novel framework that provides an effective combination approach for multi-modal bio-sequences. BioLLMNet transforms the feature spaces of different molecules' language model features and uses a learnable gating mechanism to fuse the features effectively. Rigorous evaluations show that BioLLMNet achieves state-of-the-art performance in RNA-protein, RNA-small molecule, and RNA-RNA interactions, outperforming existing methods in RNA-associated interaction prediction.

wQFM-TREE: Highly Accurate and Scalable Quartet-Based Species Tree Inference from Gene Trees


Abdur Rafi, Ahmed Mahir Sultan Rumi, Sheikh Azizul Hakim, Sohaib Sohaib, Md Toki Tahmid, Rabib Jahin Ibn Momin, Tanjeem Azwad Zaman, Rezwana Reaz, Md. Shamsuzzoha Bayzid
bioRxiv, 2024
Summary methods are becoming increasingly popular for species tree estimation from multi-locus data in the presence of gene tree discordance. ASTRAL, a leading method in this class, solves the Maximum Quartet Support Species Tree problem within a constrained solution space constructed from the input gene trees. In contrast, alternative heuristics such as wQFM and wQMC operate by taking a set of weighted quartets as input and employ a divide-and-conquer strategy to construct the species tree. Recent studies showed wQFM to be more accurate than ASTRAL and wQMC, though its scalability is hindered by the computational demands of explicitly generating and weighting Θ(n^4) quartets. Here, we introduce wQFM-TREE, a novel summary method that enhances wQFM by circumventing the need for explicit quartet generation and weighting, thereby enabling its application to large datasets. Unlike wQFM, wQFM-TREE can also handle polytomies. Extensive simulations under diverse and challenging model conditions, with hundreds or thousands of taxa and genes, consistently demonstrate that wQFM-TREE matches or improves upon the accuracy of ASTRAL. Specifically, wQFM-TREE outperformed ASTRAL in 25 of 27 model conditions analyzed in this study involving 200-1000 taxa, with statistically significant differences in 20 of these conditions. Moreover, we applied wQFM-TREE to re-analyze the green plant dataset from the One Thousand Plant Transcriptomes Initiative. Its remarkable accuracy and scalability position wQFM-TREE as a highly competitive alternative to leading methods in the field. Additionally, the algorithmic and …
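To see why explicit quartet generation limits scalability, the short sketch below counts the four-taxon subsets for a given number of taxa, together with the three possible resolved unrooted topologies per subset. The function name `quartet_counts` is a hypothetical helper for illustration and is not part of wQFM or wQFM-TREE.

```python
from math import comb


def quartet_counts(n_taxa):
    """Return (four-taxon subsets, candidate quartet topologies) for n_taxa taxa.

    Every set of four taxa admits three resolved unrooted topologies
    (ab|cd, ac|bd, ad|bc), so the second value is three times the first.
    """
    subsets = comb(n_taxa, 4)
    return subsets, 3 * subsets


counts_by_size = {n: quartet_counts(n) for n in (200, 1000)}
```

The count grows with the fourth power of the number of taxa, so moving from 200 to 1000 taxa multiplies the number of four-taxon subsets by a factor of several hundred.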

A Paradigm Shift in Mouza Map Vectorization: A Human-Machine Collaboration Approach


Mahir Shahriar Dhrubo, Samira Akter, Anwarul Bashir Shuaib, Md Toki Tahmid, Zahid Hasan, ABM Islam
arXiv, 2024
Efficient vectorization of hand-drawn cadastral maps, such as Mouza maps in Bangladesh, poses a significant challenge due to their complex structures. Current manual digitization methods are time-consuming and labor-intensive. Our study proposes a semi-automated approach to streamline the digitization process, saving both time and human resources. Our methodology focuses on separating the plot boundaries and plot identifiers and converting both of them into vectorized format. To accomplish full vectorization, Convolutional Neural Network (CNN) models are utilized for pre-processing and plot number detection, along with our smoothing algorithms based on the diversity of vector maps. The CNN models are trained with our own labeled dataset, generated from the maps, and the smoothing algorithms are derived from various observations of the maps' vector formats. Further human intervention remains essential for precision. We have evaluated our methods on several maps and provide both quantitative and qualitative results with a user study. The results demonstrate that our methodology significantly outperforms existing map digitization processes.

Connecting the Dots: Leveraging Spatio-Temporal Graph Neural Networks for Accurate Bangla Sign Language Recognition


Haz Sameen Shahgir, Khondker Salman Sayeed, Md Toki Tahmid, Tanjeem Azwad Zaman, Md Zarif Ul Alam
arXiv, 2023
Recent advances in Deep Learning and Computer Vision have been successfully leveraged to serve marginalized communities in various contexts. One such area is Sign Language - a primary means of communication for the deaf community. However, so far, the bulk of research efforts and investments have gone into American Sign Language, and research activity into low-resource sign languages - especially Bangla Sign Language - has lagged significantly. In this research paper, we present a new word-level Bangla Sign Language dataset - BdSL40 - consisting of 611 videos over 40 words, along with two different approaches for the classification of the BdSL40 dataset: one with a 3D Convolutional Neural Network model and another with a novel Graph Neural Network approach. This is the first study on word-level BdSL recognition, and the dataset was transcribed from Indian Sign Language (ISL) using the Bangla Sign Language Dictionary (1997). The proposed GNN model achieved an F1 score of 89%. The study highlights the significant lexical and semantic similarity between BdSL, West Bengal Sign Language, and ISL, and the lack of word-level datasets for BdSL in the literature. We release the dataset and source code to stimulate further research.