About this project
Open
Overview: Seeking a skilled AI/ML freelancer with strong experience in encoder-decoder models, medical image captioning, and model optimization techniques. The goal is to fine-tune and extend an existing architecture to generate accurate and explainable captions for medical images.

Key Requirements:
- Dataset: Use the ROCO dataset (10,000+ medical images) for training and evaluation.
- Model: Fine-tune the MedICap model integrating:
  - SCST (Self-Critical Sequence Training)
  - Cross Entropy Loss
  - BERTScore
- Optimization: Implement Hybrid Harris Hawk Optimization (HHO) for feature selection to improve model efficiency.
- Explainability: Integrate Grad-CAM++ to generate visual explanations (heatmaps) and compare performance with and without explainability.

Evaluation Metrics:
- Captioning: BERTScore, BLEU-4, ROUGE-L, METEOR, CIDEr
- Efficiency: accuracy, feature selection time, computational time

Implementation Scope:
- End-to-end model training and fine-tuning
- Image upload interface
- Caption generation and Grad-CAM++ visualization integration

MedICap -> https://github.com/aehrc/imageclefmedical_caption_23

Deliverables:
- Fully trained and fine-tuned model
- Optimized and explainable pipeline
- Clean, modular code with documentation

Technical Skills

Deep Learning & Computer Vision
- Proficiency with encoder-decoder architectures (e.g., CNN-RNN, transformer-based models)
- Experience with image captioning tasks
- Understanding of SCST (Self-Critical Sequence Training)

Natural Language Processing
- Familiarity with BERTScore, BLEU, ROUGE, METEOR, CIDEr
- Handling of text generation tasks and NLP metrics

Optimization Algorithms
- Knowledge of metaheuristic optimization techniques, especially Harris Hawk Optimization (HHO)
- Experience applying optimization to feature selection

Model Explainability
- Expertise with Grad-CAM++ or similar explainability tools for vision models

Frameworks & Libraries
- PyTorch / Lightning
- Transformers (Hugging Face)
- OpenCV, Matplotlib (for visualizations)
- scikit-learn, NumPy, pandas

DevOps & Integration
- Python scripting and modular code structuring
- Experience creating web or script-based interfaces (e.g., Streamlit, Flask, or basic UI)
- Model deployment & integration (e.g., upload image → output caption + Grad-CAM++)
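For candidates who want a concrete picture of the Model requirement above, here is a rough sketch of how the cross-entropy and SCST terms might be combined. It is not taken from the MedICap repository. The function names, the `ce_weight` mixing parameter, and the use of a caption metric such as BERTScore F1 as the reward are all assumptions made for illustration.

```python
import math


def cross_entropy(reference_token_probs):
    """Mean negative log-likelihood the model assigns to the reference tokens."""
    return -sum(math.log(p) for p in reference_token_probs) / len(reference_token_probs)


def scst_loss(sample_logprobs, sample_reward, greedy_reward):
    """Self-critical loss: REINFORCE with the greedy caption's reward as baseline.

    sample_logprobs: log-probabilities of the tokens in the sampled caption.
    sample_reward / greedy_reward: metric scores of the sampled and greedy
    captions (assumed here to be something like BERTScore F1).
    """
    advantage = sample_reward - greedy_reward
    return -advantage * sum(sample_logprobs)


def mixed_loss(ce, scst, ce_weight=0.5):
    """Hypothetical weighted mix of the two terms; the weight is an assumption."""
    return ce_weight * ce + (1.0 - ce_weight) * scst
```

When the sampled caption scores higher than the greedy one, the advantage is positive and minimizing the loss raises the probability of the sampled tokens. In a real training loop these values would be tensors from an autograd framework such as PyTorch rather than plain floats.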
Project overview
Hey! We need to do the following:
- Dataset: ROCO (80k images).
- Model: Fine-tune MedICap using SCST + BERTScore + Cross Entropy Loss.
- Optimization: Apply Hybrid Harris Hawk Optimization (HHO) for feature selection.
- Explainability: Use Grad-CAM++ heatmaps and compare results with and without explainability.
- Evaluation: Compute BERTScore, BLEU-4, ROUGE-L, METEOR, CIDEr, and efficiency metrics (accuracy, feature selection time, computational time).
- Implementation: Train, fine-tune, and integrate into a frontend (image upload → caption generation + Grad-CAM++ visualization).
- Codebase: Using a sample repo from ImageCLEF 2023 for modifications.
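To make the feature selection step a bit more concrete, below is a minimal sketch of the kind of fitness function a metaheuristic such as HHO could minimize over binary feature masks. The HHO search itself is not shown. The function name, the `alpha` weighting, and the way the error rate is supplied are illustrative assumptions, not a required design.

```python
def selection_fitness(mask, error_rate, alpha=0.99):
    """Score a binary feature mask; lower is better.

    mask: sequence of 0/1 flags, one per candidate feature.
    error_rate: validation error (1 - accuracy) of a model trained on the
        selected features, computed elsewhere.
    alpha: assumed trade-off between error and the fraction of features kept.
    """
    if len(mask) == 0:
        raise ValueError("mask must contain at least one feature flag")
    kept = sum(1 for bit in mask if bit)
    if kept == 0:
        # An empty selection cannot be evaluated, so give it the worst score.
        return float("inf")
    return alpha * error_rate + (1.0 - alpha) * kept / len(mask)
```

Wrapping the search in `time.perf_counter()` calls would be one simple way to report the feature selection time listed under the efficiency metrics.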
Category IT & Programming
Subcategory Artificial Intelligence
Project size Small
Is this a project or a position? Project
Required availability As needed
Delivery term: April 13, 2025
Skills needed