David Farr1, Iain J. Cruickshank2, Lynnette Hui Xian Ng2, Nico Manzonelli3, Nicholas Clark1, Kate Starbird1, and Jevin West1, 1University of Washington, 2Carnegie Mellon University, 3Cyber Fusion and Innovation Cell
Assessing classification confidence is critical for leveraging Large Language Models (LLMs) in automated labeling tasks, especially in the sensitive domains presented by Computational Social Science (CSS) tasks. In this paper, we apply five different Uncertainty Quantification (UQ) strategies to three CSS tasks: stance detection, ideology identification, and frame detection. We use three different LLMs to perform the classification tasks. To improve classification accuracy, we propose an ensemble-based UQ aggregation strategy. Our results demonstrate that our proposed UQ aggregation strategy improves upon existing methods and can significantly improve human-in-the-loop data annotation processes.
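The ensemble-style aggregation described above can be sketched in a few lines. The function name `aggregate_uq`, the mean-based combination, and the review threshold below are illustrative assumptions, not the paper's exact method:

```python
from statistics import mean

def aggregate_uq(label_scores):
    """Aggregate per-label confidence scores from several UQ strategies.

    `label_scores` maps each candidate label to a list of confidence
    scores (one per UQ strategy); the ensemble score is their mean.
    """
    ensemble = {label: mean(scores) for label, scores in label_scores.items()}
    best = max(ensemble, key=ensemble.get)
    return best, ensemble[best]

# Three hypothetical UQ strategies scoring a stance-detection example:
scores = {"favor": [0.82, 0.75, 0.90], "against": [0.10, 0.20, 0.05]}
label, confidence = aggregate_uq(scores)

# Low-confidence predictions can be routed to human annotators,
# which is how such a score would support human-in-the-loop labeling:
needs_review = confidence < 0.6
```

In a human-in-the-loop pipeline, only the items flagged by the threshold go to annotators, concentrating human effort where the ensemble is least certain.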
uncertainty quantification, large language models, stance detection, ideology identification, frame detection, ensemble models.
Thibault Rousset1, Taisei Kakibuchi2, Yusuke Sasaki2, and Yoshihide Nomura2, 1School of Computer Science, McGill University, 2Fujitsu Ltd.
This paper investigates the integration of technical vocabulary in merged language models. We explore the knowledge transfer mechanisms involved when combining a general-purpose language-specific model with a domain-specific model, focusing on the resulting model’s comprehension of technical jargon. Our experiments analyze the impact of this merging process on the target model’s proficiency in handling specialized terminology. We present a quantitative evaluation of the performance of the merged model, comparing it with that of the individual constituent models. The findings offer insights into the effectiveness of different model merging methods for enhancing domain-specific knowledge and highlight potential challenges and future directions in leveraging these methods for cross-lingual knowledge transfer in Natural Language Processing.
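One common merging method the abstract alludes to is simple linear interpolation of parameters (weight averaging). The dict-of-lists model representation and the `linear_merge` name below are hypothetical, a minimal sketch rather than any specific merging toolkit:

```python
def linear_merge(model_a, model_b, alpha=0.5):
    """Linearly interpolate two checkpoints' parameters.

    Both models are dicts mapping parameter names to flat lists of
    weights (a stand-in for real tensors); shared parameters are
    combined with mixing weight alpha.
    """
    merged = {}
    for name in model_a:
        wa, wb = model_a[name], model_b[name]
        merged[name] = [alpha * a + (1 - alpha) * b for a, b in zip(wa, wb)]
    return merged

general = {"embed": [0.2, 0.4], "ffn": [1.0, -1.0]}  # general-purpose model
domain  = {"embed": [0.6, 0.0], "ffn": [0.0,  1.0]}  # domain-specific model
merged = linear_merge(general, domain, alpha=0.5)
# merged["embed"] is approximately [0.4, 0.2]
```

Varying `alpha` trades off general-language fluency against domain-specific terminology, which is one of the knobs such merging experiments typically sweep.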
Large Language Models · Knowledge Transfer · Model Merging · Domain Adaptation · Natural Language Processing.
Gianni Jacucci, University of Trento, Department of Information Engineering and Computer Science, Italy
Current AI models, particularly large language models (LLMs), are predominantly grounded in positivist epistemology, treating knowledge as an external, objective entity derived from statistical patterns in data. However, this paradigm fails to capture "facts-in-the-conscience", the subjective, meaning-laden experiences central to the human sciences. In contrast, phenomenological hermeneutics and constructivism, as fostered by socio-technical research (16), provide a more fitting foundation for AI development, recognizing knowledge as an intentional, co-constructed process shaped by human interaction and community consensus. Phenomenology highlights the lived experience and intentionality necessary for meaning-making, while constructivism emphasizes the social negotiation of knowledge within communities of practice. This paper argues for an AI paradigm shift integrating second-order cybernetics, enabling recursive interaction between AI and human cognition. Such a shift would make AI not merely a tool for knowledge retrieval but a co-participant in epistemic evolution, supporting more trustworthy, context-sensitive, and meaning-aware AI systems within socio-technical frameworks.
AI epistemology, Large Language Models (LLMs), Consensus Domain, Human-AI Interaction, Structural Coupling.
Ahmad Mahmood1, Zainab Ahmad1, Iqra Ameer2, and Grigori Sidorov1, 1Instituto Politécnico Nacional (IPN), Centro de Investigación en Computación (CIC), Mexico City, Mexico, 2Division of Science and Engineering, The Pennsylvania State University, Abington, PA, USA
The development of medical question-answering (QA) systems has predominantly focused on high-resource languages, leaving a significant gap for low-resource languages like Urdu. This study proposes a novel corpus designed to advance medical QA research in Urdu, created by translating the benchmark MedQuAD corpus into Urdu using a generative AI-based translation technique. The proposed corpus is evaluated using three approaches: (i) Information Retrieval (IR), (ii) Cache-Augmented Generation (CAG), and (iii) Fine-Tuning (FT). We conducted two experiments, one on a 500-instance subset and another on the complete 3,152-question corpus, to assess retrieval effectiveness, response accuracy, and computational efficiency. Our results show that JinaAI embeddings outperformed other IR models, while fine-tuning OpenAI 4o mini achieved the highest response accuracy (BERTScore: 70.6%) but is computationally expensive. CAG eliminates retrieval latency but demands substantial resources. The findings suggest that IR is optimal for real-time QA, fine-tuning ensures accuracy, and CAG balances both. This research advances Urdu medical AI, bridging healthcare accessibility gaps.
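The IR approach compared above boils down to ranking answer embeddings by similarity to a query embedding. The sketch below uses pure-Python cosine similarity with toy vectors standing in for real Urdu QA embeddings; the `retrieve` function and the vectors are illustrative assumptions:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve(query_vec, doc_vecs, k=1):
    """Return indices of the k document embeddings most similar to the query."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]

# Toy 2-d embeddings standing in for encoded Urdu answers:
docs = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
query = [0.9, 0.1]
retrieve(query, docs)  # the first document is the closest match
```

Real systems replace the toy vectors with dense embeddings (e.g., from a model such as the JinaAI encoders evaluated here) and an approximate-nearest-neighbor index, but the ranking logic is the same.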
Information retrieval, retrieval-augmented generation, cache-augmented generation, fine-tuning, Urdu medical question-answering.
Nikitha Merilena Jonnada, University of the Cumberlands, USA
In this paper, the author discusses the importance of IoT, its security measures, and device protection. IoT devices have become a trend because they are easy for users to operate and understand. IoT is now a widely used technology across many industries, such as banking, agriculture, and health care, and it has simplified the user experience. Even without AI, IoT has been a good investment for many users, as its connectivity lets them control multiple devices from a single device, sometimes with a single click.
Artificial Intelligence (AI), Machine Learning (ML), Internet of Things (IoT), Security, Hacking, Risks.
Lakmali Karunarathne, York St John University, UK
Smart garbage bins that automatically open their doors when a person stands in front of them are an innovative product of the IT industry and its developers. An IR sensor detects the waste, and supporting sensors, such as a metal proximity sensor, a capacitive proximity sensor, and an inductive proximity sensor, identify its category. The system as a whole aims to provide the following services: identifying the bin category, disposing of the waste into the corresponding bin, sending notifications, and generating reports to raise users' awareness of their garbage management. The IoT product is paired with the SMART GARBAGE MASTER (SGM) mobile application, which interacts with the entire IoT system through the cloud to provide an effective and efficient service to users. The sensor data is sent to an Arduino, which decides whether the garbage is metal or non-metal.
Smart, garbage, segregation, plastic, paper, sensors, ultrasonic sensor, IR sensor, bin, level, percentage, IoT, Arduino, Cloud Databases.
Tanja Steigner and Mohammad Ikbal Hossain, Emporia State University, Kansas, USA
Bitcoin mining, often criticized for its substantial energy consumption, holds significant potential to drive energy innovation and sustainability. This paper reevaluates Bitcoin mining's environmental impact, focusing on its ability to utilize surplus and renewable energy sources. Mining operations absorb excess energy, such as curtailed wind and solar power, that would otherwise go to waste, contributing to grid efficiency and renewable energy integration. The increasing shift toward renewables, which now account for over 50% of mining's energy mix, underscores the industry's progress toward sustainability. Through the analysis of industry data, this paper highlights Bitcoin mining's dual role as both a flexible energy consumer and a catalyst for green energy investments. Despite challenges like e-waste and the industry's reliance on energy-intensive proof-of-work mechanisms, the findings demonstrate how targeted policies and technological advancements can transform Bitcoin mining into a force for environmental and economic benefit. The study emphasizes the need for collaborative efforts among stakeholders to unlock Bitcoin mining's full potential in supporting the global energy transition.
Bitcoin mining, renewable energy, grid stabilization, green energy investments, proof-of-work (PoW), carbon footprint reduction, e-waste management, decentralized energy systems
Aakaash Kurunth, Adithya S Gurikar, Tejas B, Sean Sougaijam and Kamatchi Priya L, PES University, Bengaluru, India
Spirulina platensis, a microalga known for its high nutritional value and sustainability, is widely used in food, pharmaceuticals, and bioenergy. Its growth depends on factors like temperature, irradiance, pH, and nutrients, but optimizing these conditions is challenging due to their complex interactions. To address this, we integrate predictive analytics with an intelligent recommendation system to optimize cultivation. We evaluate multiple regression models, including Stacking, XGBoost, CatBoost, Gradient Boosting Machine (GBM), Support Vector Regressor (SVR), and Neural Networks, to determine the most accurate predictor of Spirulina optical density. The best-performing model powers a hybrid recommendation engine that combines content-based filtering and rule-based logic. This system identifies optimal growth conditions and provides precise recommendations for farmers and researchers, enhancing efficiency in Spirulina cultivation. By leveraging machine learning, this approach ensures data-driven insights for maximizing yield and sustainability.
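The rule-based half of a hybrid recommendation engine like the one described might look as follows. The target ranges, factor names, and `recommend` function are hypothetical values chosen for illustration, not the system's actual thresholds:

```python
# Hypothetical target ranges for Spirulina growth conditions;
# a real system would derive thresholds from data and the ML predictor.
OPTIMAL = {
    "temperature_c": (30.0, 35.0),
    "ph": (9.0, 10.5),
    "irradiance_lux": (2500, 5000),
}

def recommend(readings):
    """Compare sensor readings to target ranges and suggest adjustments."""
    advice = []
    for factor, (lo, hi) in OPTIMAL.items():
        value = readings[factor]
        if value < lo:
            advice.append(f"increase {factor} (now {value}, target >= {lo})")
        elif value > hi:
            advice.append(f"decrease {factor} (now {value}, target <= {hi})")
    return advice or ["conditions within target ranges"]

recommend({"temperature_c": 27.5, "ph": 9.8, "irradiance_lux": 6000})
# -> suggests raising temperature and lowering irradiance
```

In the hybrid design, content-based filtering would first select conditions resembling historically high-yield cultivation runs, and rules like these would then translate the gap between current and target conditions into concrete actions.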
Spirulina Growth Prediction, Environmental Factors, Machine Learning, Sustainable Cultivation, Regression Models
Vaani Bansal, R Navaneeth Krishnan, Punith Anand, Aditya Kumar Sinha, and Prof. Sheela Devi, Department of Computer Science Engineering, PES University, Bangalore, India
Counterfeit medicines pose a serious threat to public health, particularly in parts of the pharmaceutical industry that lack proper regulatory mechanisms. Such counterfeit drugs might contain the wrong doses or even hazardous materials. Hence, they break the trust formed between the healthcare system and patients and expose patients to severe health risks. This project presents a complete solution integrating blockchain technology and machine learning to ensure drug authenticity and protect the pharmaceutical supply chain. A blockchain module built on Hyperledger Fabric provides a tamper-proof, decentralized ledger for medicine logistics tracking. Each medicine carries a unique QR code linked to its complete manufacturing and regulatory information, allowing customers and employees to check a medicine's authenticity simply by scanning the code. This encourages transparency and enables traceability, thus preventing counterfeit drugs from entering the supply chain. The blockchain infrastructure is protected by an XGBoost-based machine learning anomaly detection model. Trained on the NSL-KDD dataset, the model can identify and nullify malicious network activities such as unauthorized access attempts, thereby ensuring the reliability and security of the system. Combining these technologies yields an all-in-one, scalable solution for minimizing medicine counterfeiting: blockchain preserves the integrity and accessibility of data within the framework, while machine learning provides security, together forming a complete anti-counterfeiting regime. The system not only protects public health but also fosters a culture of trust and transparency in the pharmaceutical supply chain, making it a feasible approach for large-scale implementation in the industry.
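The tamper-evidence that a blockchain ledger gives the medicine-tracking module can be illustrated with a minimal hash-chained ledger in plain Python. This is not Hyperledger Fabric's API, and the record fields are invented; it only shows why modifying any stored record invalidates the chain:

```python
import hashlib
import json

def block_hash(record, prev_hash):
    """Hash a record together with the previous block's hash."""
    payload = json.dumps(record, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode()).hexdigest()

def append_block(chain, record):
    """Append a record, chaining it to the current ledger head."""
    prev = chain[-1]["hash"] if chain else "0" * 64
    chain.append({"record": record, "prev": prev,
                  "hash": block_hash(record, prev)})

def verify_chain(chain):
    """Recompute every hash; any tampered record breaks the chain."""
    prev = "0" * 64
    for block in chain:
        if block["prev"] != prev or block["hash"] != block_hash(block["record"], prev):
            return False
        prev = block["hash"]
    return True

# Hypothetical medicine events, keyed by the QR code printed on the pack:
ledger = []
append_block(ledger, {"qr": "MED-001", "batch": "B42", "maker": "AcmePharma"})
append_block(ledger, {"qr": "MED-001", "event": "shipped to distributor"})
assert verify_chain(ledger)

ledger[0]["record"]["maker"] = "Counterfeit Co."  # tampering is detected
assert not verify_chain(ledger)
```

A QR scan would then look up the code's event history on the ledger; a product with no valid chain of custody is flagged as suspect. Fabric adds distributed consensus and access control on top of this basic integrity property.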
Blockchain, Machine Learning, Pharmaceutical Supply Chain, Counterfeit Drugs, Hyperledger.
Samah Kansab1, Matthieu Hanania1, Francis Bordeleau1, and Ali Tizghadam2, 1École de technologie supérieure (ÉTS), Montréal, Canada, 2TELUS, Toronto, Canada
Context: DevOps integrates collaboration, automation, and continuous improvement in software development, enhancing agility and ensuring consistent software releases. GitLab’s Merge Request (MR) mechanism plays a critical role in this process by streamlining code submission and review. While extensive research has focused on code review metrics like time to complete reviews, MR data can offer broader insights into collaboration, productivity, and process optimization. Objectives: This study aims to leverage MR data to analyze multiple facets of the DevOps process, focusing on the impact of environmental changes (e.g., COVID-19) and process adaptations (e.g., migration to OpenShift technology). We also seek to identify patterns in branch management and examine how different metrics impact code review efficiency. Methods: We analyze a dataset of 26.7k MRs from 116 projects across four teams within a networking software solution company, focusing on metrics related to MR effort, productivity, and collaboration. The study compares the impact of process changes, environmental changes, and branch management strategies. Additionally, we apply machine learning techniques to examine code review processes, highlighting the distinct roles of bots and human reviewers. Results: Our analysis reveals that the pandemic led to increased review effort, although productivity levels remained stable. Remote work habits persisted, with up to 70% of weekly activities occurring outside standard hours. The migration to OpenShift showed a successful adaptation, with performance metrics stabilizing over time. Branch management on stable branches, especially for new releases, exhibited effective prioritization. Bots helped initiate reviews more quickly, but human reviewers were essential in reducing the overall time to complete reviews. Other factors, such as the number of commits and reviewer experience, also impact code review efficiency.
Conclusion: This research offers practical insights for practitioners, demonstrating the potential of MR data to analyze and improve different aspects such as productivity, effort, and overall efficiency in DevOps practices.
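The out-of-hours activity metric reported in the results could be computed from MR event timestamps along the following lines. The 9-to-5 window, the function name, and the sample events are assumptions for illustration, not the study's exact definition:

```python
from datetime import datetime

def out_of_hours_share(timestamps, start=9, end=17):
    """Fraction of events on weekends or outside the [start, end)
    working-hour window on weekdays."""
    def outside(ts):
        return ts.weekday() >= 5 or not (start <= ts.hour < end)
    flagged = sum(outside(t) for t in timestamps)
    return flagged / len(timestamps)

# Hypothetical MR event timestamps:
events = [
    datetime(2021, 3, 1, 10),  # Monday, within working hours
    datetime(2021, 3, 1, 22),  # Monday, late evening
    datetime(2021, 3, 6, 14),  # Saturday
    datetime(2021, 3, 2, 8),   # Tuesday, before working hours
]
out_of_hours_share(events)  # 0.75
```

Aggregating this ratio per week, as the study does with its MR dataset, makes shifts in remote-work habits (such as those around the pandemic) directly visible.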
Software process, DevOps, Merge request, GitLab, Code review.