ETRAN / IcETRAN Dashboard
Registrations, check-in, accreditations, certificates, and conference program

VII 1

Session details

08.06.2026. 09:00–11:00
Room: Hall 3
Section / Track: AI
Chair: Miljan Vučetić, Nemanja Ilić
Institution: Vlatacom Institute of High Technology, Belgrade, Serbia
  1. VII1.1
    An Analysis of the Performance of Models for Direct Serbian Speech-to-English Translation
    Milana Vitković, Svetlana Krunić, Siniša Suzić and Tijana Nosek
    ID: 0217
    Section / Track: AI
    RP, IEEE Xplore
    Keywords: automatic speech translation, automatic speech recognition, machine translation, whisper, seamlessm4t
    Abstract
    Automatic speech translation has become an important area
    of research with the development of deep learning and large
    multilingual models. In this study, we investigate
    Serbian-to-English speech translation using Whisper and
    SeamlessM4T models. We evaluate performance in three
    scenarios. The first is zero-shot speech translation. The
    second is a cascaded approach in which ASR models are
    fine-tuned on Serbian using the Južne Vesti and ParlaSpeech
    corpora and translation is performed with both proprietary
    (GPT-4o) and open-source models (Salamandra-7B-Instruct and
    EuroLLM-1.7B). The third is end-to-end adaptation, in which
    models are fine-tuned directly on a parallel
    Serbian–English corpus for speech-to-text translation.
    Evaluation on the Južne Vesti test set using BLEU, METEOR,
    and WER metrics shows that direct fine-tuning improves
    translation performance compared to zero-shot and cascaded
    approaches, while open-source translation models could
    provide a viable alternative when proprietary systems are
    unavailable. These results highlight the importance of
    task-specific adaptation and parallel speech–translation
    data in improving translation quality for low-resource
    languages like Serbian.
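
The abstract above reports WER among its metrics. As a rough illustration only, and not the authors' evaluation code, WER can be sketched as the word-level edit distance between a reference and a hypothesis, divided by the number of reference words. The function name and the whitespace tokenization are assumptions made for this sketch; real evaluations usually normalize case and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()  # assumption: plain whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.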
  2. VII1.2
    To slide or to snip? LLM fine-tuning for sentiment analysis of long Serbian movie reviews
    Aleksandra Todorović and Vuk Batanović
    ID: 1414
    Section / Track: AI
    RP, IEEE Xplore
    Keywords: Sentiment analysis, Serbian NLP, BERTić, Document-level classification, SerbMR
    Abstract
    We establish the first fine-tuned transformer results on
    the SerbMR dataset, evaluating strategies for long-form
    sentiment analysis in Serbian, a morphologically rich
    language. Using BERTić, a widely used large language model
    for Serbian and closely related languages, we compare two
    strategies for handling the limited context window:
    truncation (tail and head) and sliding-window-based
    aggregation (mean pooling, max pooling, majority voting,
    and a rule-based heuristic). Evaluation is performed across
    binary (2-class) and ternary (3-class) classification
    schemes. The fine-tuned BERTić achieves 93.76% accuracy on
    the binary task and 74.55% on the ternary task, an
    improvement of approximately 9 and 12.5 percentage points
    respectively over the best reported results of linear
    classifiers. The sliding window with mean pooling proves to
    be the strongest overall strategy, most notably on the
    harder ternary task, but its 45% runtime overhead offers
    only marginal gains over head truncation in the binary
    setting. The consistent advantage of head over tail
    truncation across all conditions suggests that evaluative
    language in Serbian movie reviews concentrates towards the
    end of the text.
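
The sliding-window strategy described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the window and stride handling, the function names, and the aggregation labels ("mean", "max", "vote") are hypothetical, the per-window class probabilities are assumed to come from some fine-tuned classifier, and the rule-based heuristic mentioned in the abstract is not reproduced.

```python
from collections import Counter


def sliding_windows(tokens, window, stride):
    """Split a token list into overlapping windows that cover every token."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if len(tokens) <= window:
        return [list(tokens)]
    starts = list(range(0, len(tokens) - window + 1, stride))
    if starts[-1] + window < len(tokens):
        # Add a final window aligned with the end so no tokens are dropped.
        starts.append(len(tokens) - window)
    return [list(tokens[s:s + window]) for s in starts]


def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def aggregate(window_probs, method):
    """Combine per-window class probabilities into one document label."""
    n_classes = len(window_probs[0])
    if method == "mean":
        scores = [sum(p[c] for p in window_probs) / len(window_probs)
                  for c in range(n_classes)]
        return argmax(scores)
    if method == "max":
        scores = [max(p[c] for p in window_probs) for c in range(n_classes)]
        return argmax(scores)
    if method == "vote":
        votes = Counter(argmax(p) for p in window_probs)
        return votes.most_common(1)[0][0]
    raise ValueError(f"unknown method: {method}")
```

The aggregation rules can disagree on the same document: one very confident window can dominate mean and max pooling while losing a majority vote.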
  3. VII1.3
    Fuse-T: Gated Residual Late Fusion of Text Semantics and Thread Topology for Unseen-Event Rumour Classification in Conversational Reply Graphs
    Aleksandar Stankovic
    ID: 0315
    Section / Track: AI
    RP, IEEE Xplore
    Keywords: rumour detection, early classification, multi-modal fusion, graph neural networks, GraphSAGE, RoBERTa, PHEME
    Abstract
    Rumours on social media spread rapidly during breaking
    events, while early evidence is often sparse, noisy, and
    highly event-specific. Text-only classifiers can overfit to
    keywords and writing style tied to particular events, while
    structure-only propagation models struggle when discussion
    trees are small. This paper presents Fuse-T, a gated
    residual late-fusion architecture for binary rumour
    classification on conversational reply graphs. Fuse-T uses
    a pre-trained language model as a stable semantic backbone
    and injects graph propagation cues as a controlled additive
    residual. A learned element-wise gate modulates the injected
    topology signal and is initialized near zero to avoid
    covariate shift on the text classifier. We evaluate Fuse-T
    on leave-one-event-out (LOEO) generalization across seven
    events from the PHEME collection. Across events, Fuse-T
    improves average Macro-F1 from 62.13% (text-only RoBERTa)
    and 62.42% (text-attributed GNN baseline) to 65.68%. In an
    early-detection setting using only the first 10 minutes of
    replies, Fuse-T retains robust performance (about 63%
    Macro-F1), while structure-only models degrade
    substantially.
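
The gated additive residual described in the abstract above might look roughly like the sketch below. The abstract does not give the exact parameterization, so several things here are assumptions: the gate is modeled as a sigmoid over learned per-dimension logits, "initialized near zero" is modeled by strongly negative initial logits, and the graph representation is assumed to be already projected to the text dimension. All names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_residual_fusion(text_vec, graph_vec, gate_logits):
    """Return text_vec + sigmoid(gate_logits) * graph_vec, element-wise.

    Assumption: graph_vec has already been projected to the same
    dimension as text_vec by some learned layer not shown here.
    """
    if not (len(text_vec) == len(graph_vec) == len(gate_logits)):
        raise ValueError("all three vectors must have the same length")
    return [t + sigmoid(g) * s
            for t, s, g in zip(text_vec, graph_vec, gate_logits)]


# Hypothetical initial value: sigmoid(-10) is about 4.5e-5, so at the start
# of training the fused vector is almost identical to the text vector.
INITIAL_GATE_LOGIT = -10.0
```

With the gate nearly closed, the text classifier initially sees its usual inputs, and the topology signal is admitted only as training moves the gate logits upward.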
  4. VII1.4
    Evaluation of VGG16-Based Transfer Learning Strategies for Pollen Classification from Reconstructed Digital Holograms
    Dimitrije Stefanović, Nikša Jakovljević and Marko Panić
    ID: 4187
    Section / Track: AI
    RP, IEEE Xplore
    Keywords: Machine learning, transfer learning, image classification, digital holography, pollen grains
    Abstract
    Reliable pollen classification is an essential step in
    estimating airborne pollen concentration, a key indicator
    for environmental monitoring and allergy forecasting. This
    study addresses the problem of classifying six pollen taxa
    from reconstructed digital holographic images, as a step
    toward potential improvement of automated concentration
    assessment systems. Four training strategies were
    evaluated. Three VGG models were trained end-to-end: one
    with randomly initialized weights, another initialized with
    frozen pretrained ImageNet weights, and a third fine-tuned
    entirely by allowing all layers to update during training.
    In the fourth strategy, a fine-tuned VGG was employed as a
    feature extractor and combined with a support vector
    machine (SVM) classifier. Despite strong
    morphological similarities among certain pollen taxa, all
    models showed promising results, with fine-tuned strategies
    achieving the best performance, demonstrating the
    capability of deep learning for accurate pollen
    classification applicable to real-time monitoring systems.
    The experiments confirmed a high degree of robustness and
    generalization, paving the way for the development of
    enhanced methodologies that can further improve the
    reliability of the classification and expand to a larger
    number of pollen taxa.
  5. VII1.5
    Automated Chemical Vulnerability Assessment of Canvas Paintings from XRF Spectral Imaging Using Deep Learning and Foundation Models
    Dimitrije Pesic, Janko Vukobratovic, Aleksandra Stojanovic, Giulia Ristori, Stefano Ridolfi, Maja Gajic-Kvascev and Goran Kvascev
    ID: 4492
    Section / Track: AI
    RP, IEEE Xplore
    Keywords: XRF, Spectral Imaging, Deep Learning
    Abstract
    This paper presents the design of an automated pipeline for
    chemical vulnerability assessment of canvas paintings from
    XRF spectral imaging. The design is based on the
    recognition of the materials used (from XRF data) and on the
    physico-chemical prediction of time- and condition-dependent
    degradation of these materials. A pipeline was developed
    that processes XRF spectra and, for each element,
    calculates the signal intensity by integrating within a
    window around the corresponding peak and subtracting the
    linearly interpolated background below the peak.
    Two-dimensional maps of the elements (calcium, titanium,
    iron, copper, and lead; single emission line) were created
    using the resulting peak intensities. Pearson correlation
    coefficients were calculated for two independent scans on
    detector 10264 to evaluate the reproducibility of the
    inputs. The coefficients for the relevant elements (Ca, Ti,
    Fe, Cu, Pb) achieve r > 0.98, confirming excellent
    quantitative reproducibility and that the instrumental
    setup remains stable over time. The system chains
    self-supervised denoising, physics-based element
    extraction, NMF decomposition, a literature-grounded CVI,
    and SAM-based region segmentation, requiring no
    expert-labeled training data. Thirteen segments were
    automatically identified by SAM, and six per-region mean
    CVI values for the highest-risk segments were labeled. The
    whole procedure typically requires extensive expert
    analysis, whereas with this pipeline the process takes
    under 70 seconds (on an Intel Core i5-12450H with 16 GB
    RAM).
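
The peak-integration and reproducibility steps in the abstract above can be illustrated with a short sketch. It is not the authors' code. It assumes the spectrum is a list of counts per channel, that the background is a straight line drawn between the counts at the two window edges, and that channel width is one unit. The function names are hypothetical.

```python
import math


def net_peak_intensity(spectrum, lo, hi):
    """Sum counts in channels lo..hi (inclusive) minus a linear background.

    Assumption: the background is the straight line joining the counts
    at the two window edges, spectrum[lo] and spectrum[hi].
    """
    if not 0 <= lo < hi < len(spectrum):
        raise ValueError("window must satisfy 0 <= lo < hi < len(spectrum)")
    n_channels = hi - lo + 1
    gross = sum(spectrum[lo:hi + 1])
    # A straight line sampled at n equally spaced points sums to
    # n times the mean of its two endpoint values.
    background = n_channels * (spectrum[lo] + spectrum[hi]) / 2
    return gross - background


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

Applying `net_peak_intensity` to every pixel gives one element map per scan, and `pearson_r` on the flattened maps of two scans gives a reproducibility figure of the kind the abstract reports.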
  6. VII1.6
    Robustness of Graph Neural Networks under Structural and Feature Corruptions
    Zorana Štaka and Marko Mišić
    ID: 7018
    Section / Track: AI
    RP, IEEE Xplore
    Keywords: GNN, robustness, graph corruptions, inference-time evaluation, node classification, homophily
    Abstract
    GNNs are successfully used in various tasks and domains
    involving graph data. However, their
    robustness under realistic, non-adversarial corruptions is
    underexplored. In this paper, we provide a systematic
    evaluation of GNNs under structural and feature corruptions
    at different levels of severity. To provide a more
    comprehensive evaluation, four models are used: GCN, GAT,
    GraphSAGE, and GIN, and four datasets: Cora, CiteSeer,
    PubMed, and OGBN-ArXiv. To ensure our findings are
    statistically meaningful, a rigorous statistical evaluation
    is conducted with multiple seeds, including Wilcoxon
    signed-rank tests for pairwise comparison,
    Benjamini-Hochberg false discovery rate control, and effect sizes.
    Results indicate that feature corruptions are more damaging
    to the model performance than structural ones under
    inference-time corruptions. Robustness of GNNs is driven
    more by corruption type and severity than by the model
    architecture. Also, homophily and calibration have
    statistically significant but limited correlation with
    predictive performance. Therefore, to properly evaluate
    GNNs' reliability, clean-test accuracy is insufficient;
    rather, the evaluation of model performance under diverse
    corruption settings using a statistically rigorous
    procedure is necessary.
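
The Benjamini-Hochberg step mentioned in the abstract above can be sketched in a few lines. This shows the standard step-up procedure, not the authors' analysis script. The function name and the default level of 0.05 are assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True where the null is rejected.

    Standard step-up rule: find the largest rank k (1-based, ascending
    p-values) with p_(k) <= (k / m) * alpha, then reject ranks 1..k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank  # keep the largest rank that passes
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected
```

Because the rule is step-up, a p-value that misses its own threshold is still rejected when a larger p-value further down the sorted list passes. This is why the loop keeps the last passing rank and does not stop at the first failure.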
  7. VII1.7
    Flow Matching Policy for Behavioral Cloning
    Mihailo Radović and Filip Marčić
    ID: 6154
    Section / Track: AI
    RP, IEEE Xplore
    Keywords: behavioral cloning, flow matching policy, Gaussian policy, diffusion policy
    Abstract
    Behavioral cloning (BC) is a foundational imitation
    learning paradigm, but many standard continuous-control BC
    baselines rely on unimodal Gaussian policies or other
    relatively low-expressivity action parameterizations.
    Consequently, they struggle to capture the complex,
    multi-modal strategies present in diverse offline datasets,
    such as those containing human, medium-quality, or mixed
    trajectories, leading to a significant performance gap. To
    address this limitation, we introduce the Flow Matching
    Policy (FMP), a highly expressive representation for
    continuous control BC. Our approach models the conditional
    action distribution as a continuous-time normalizing flow,
    learning an observation-conditioned velocity field to
    transport a simple base noise distribution into the
    empirical action distribution. Evaluations against strong
    Gaussian and diffusion policy baselines across standard
    continuous control benchmarks demonstrate that the FMP
    consistently achieves competitive or superior performance.
    These results suggest that continuous-time flow models are
    a promising alternative for capturing highly complex and
    varied behaviors from noisy data.
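
The two ingredients described in the abstract above, a regression target for the velocity field and an integration step that transports noise into an action, can be sketched as below. The abstract does not state which probability path is used, so a straight-line path between a noise sample and a dataset action is assumed here, along with plain Euler integration. The function names and the `velocity_fn(x, t, observation)` signature are hypothetical.

```python
def interpolate(noise, action, t):
    """Point at time t on the straight path from noise (t=0) to action (t=1)."""
    return [(1.0 - t) * z + t * a for z, a in zip(noise, action)]


def target_velocity(noise, action):
    """Velocity of the straight path: the regression target during training."""
    return [a - z for z, a in zip(noise, action)]


def sample_action(velocity_fn, observation, noise, steps=10):
    """Euler integration of dx/dt = velocity_fn(x, t, observation) on [0, 1]."""
    x = list(noise)
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt
        v = velocity_fn(x, t, observation)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x
```

In training, a network would be fitted so that its output at `interpolate(noise, action, t)` matches `target_velocity(noise, action)`. At inference, `sample_action` is called with the trained network as `velocity_fn`, and different noise draws can land on different action modes.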
  8. VII1.8
    Metaheuristic Optimization of Boosting and Hybrid Machine Learning Models for IoT Intrusion Detection: A Review
    Ninoslava Jankovic, Aleksandar Petrovic and Petar Spalevic
    ID: 8038
    Section / Track: AI
    RP, IEEE Xplore
    Keywords: IoT, cyber security, intrusion detection, AI, metaheuristics
    Abstract
    The rapid expansion of Internet of Things (IoT)
    infrastructures has significantly increased the attack
    surface of modern digital ecosystems. Due to constrained
    computational resources and heterogeneous device
    architectures, traditional intrusion detection mechanisms
    are often inadequate for IoT environments. Machine learning
    techniques have emerged as efficient solutions for
    intelligent intrusion detection systems, while
    metaheuristic optimization algorithms have demonstrated
    substantial improvements in feature selection and
    hyperparameter tuning processes.
    This paper provides a comprehensive review of
    metaheuristic-optimized machine learning approaches for IoT
    intrusion detection. Existing methods are systematically
    categorized based on optimization strategy, learning
    architecture, and application domain. Particular emphasis
    is placed on boosting-based models, deep learning
    frameworks, and hybrid multi-level optimization strategies.
    Furthermore, commonly used datasets, evaluation
    methodologies, and performance trends are analyzed to
    identify current research directions and limitations. The
    study highlights emerging trends, including domain-specific
    IoT applications and hybrid optimization frameworks, while
    identifying open challenges related to benchmarking,
    deployment efficiency, and generalization across
    heterogeneous IoT scenarios.
  9. VII1.9
    A comparative study of KAN and Neural ODE models for LR-DDoS attack detection in IoT networks
    Dušan Drinić, Marija Novičić and Goran Kvaščev
    ID: 5960
    Section / Track: AI
    RP, IEEE Xplore
    Keywords: Low-rate DDoS, neural networks, KAN, Neural ODE, IoT, datasets
    Abstract
    In recent years, cybersecurity has become a critical aspect
    of modern network systems, as Internet of Things (IoT)
    networks continue to grow rapidly across various
    application domains. While these advancements bring
    significant benefits in terms of connectivity and
    functionality, they also introduce increased security and
    privacy risks. One of the most challenging threats is the
    Low-Rate DDoS (LR-DDoS) attack, which can cause significant
    damage to target systems using minimal resources and
    traffic, typically representing only a small portion of
    total network activity, making detection highly difficult.
    In this paper, we propose and evaluate two models, a
    modified Kolmogorov–Arnold Network (KAN) and a Neural
    Ordinary Differential Equation (Neural ODE), under the same
    experimental conditions. Experiments are conducted on the
    CICIoT2023 dataset using a 5-fold cross-validation strategy
    and standard evaluation metrics including Accuracy,
    Precision, Recall, and F1-score. The results show that both
    models achieve high and stable performance, with accuracy
    above 96% and strong generalization capabilities. Overall,
    the study demonstrates the effectiveness of both approaches
    while highlighting their different trade-offs in detection
    performance and computational efficiency.
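
The evaluation protocol named in the abstract above (5-fold cross-validation with Accuracy, Precision, Recall, and F1-score) can be sketched as follows. This is a simplified illustration, not the authors' setup. It assumes a binary attack/benign labelling and contiguous folds without shuffling or stratification, and the function names are hypothetical.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    if not 1 < k <= n_samples:
        raise ValueError("need 1 < k <= n_samples")
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}
```

When attack traffic is a small share of the data, as the abstract notes for LR-DDoS, accuracy can stay high even if many attacks are missed, so recall and F1 on the attack class carry more information.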