Projects with this topic
These projects form an open-source suite of AI infrastructure tools built for modularity, security, and self-hosted deployment, allowing users to maintain full ownership and control of their systems and data. They deliver interoperable AI functions including automation, data retrieval, encryption, and identity management that can be applied across many different industries.
Recommend.Games blog: https://blog.recommend.games/
SIMNL is a Python tool that generates federatively consistent synthetic datasets for the three main Dutch government registers: BRP (persons), BAG (addresses), and HR (businesses). The generated data is GDPR-compliant (AVG), demographically accurate according to CBS figures, and contains correct cross-references between registers, which makes it well suited for testing applications that work with government data without using real personal data. With support for reproducible output via seed parameters and 30+ validated business rules, SIMNL provides realistic test data for development and integration environments.
Bahn-Vorhersage - The best Train Delay Prediction System.
Welcome to my portfolio repository. Here you can find a selection of projects developed during my Master's degree in Data Science, ranging from machine learning and statistical analysis to research-oriented applications.
Python SDK for the PaveDB /v1 API, with HTTP client support and local embedded provider discovery.
Fundamental theory and practice in Data Science (DS).
🧮 data analysis AI ML DL machine lear... deep learning data science data-enginee... artificial i... data-science data preproc... Python C C++ NumPy pandas mathematics Algorithm algorithms Data Enginee... big data scipy scikit-learn xgboost lightgbm catboost TensorFlow keras PyTorch matplotlib seaborn plotly nltk opencv dask linear-algebra calculus probability statistics Discrete Mat... R
Talk for Pythonistas. How to use Quarto + Python + Git to create reports
Machine learning-based fraud detection system using Logistic Regression, Random Forest, and a Voting Classifier, achieving ~99% accuracy on a synthetic financial dataset.
Academic Master 2 (second-year master's) project in Python on the analysis of air quality in France and Île-de-France. The project combines data scraping, environmental data processing, statistical analysis of atmospheric pollutant concentrations, and cartographic visualization of air quality indices. The analyses focus in particular on the pollutants NO₂, O₃, and PM10, with a temporal study of Airparif data from 2014 to 2017.
Keywords:
python data-science web-scraping data-analysis data-visualization air-quality environmental-data geospatial-analysis time-series pandas
A modular Clinical NLP Pipeline built to process and analyze unstructured medical text using both traditional machine learning and transformer-based approaches.
The project combines multiple components including OCR, text preprocessing, feature engineering, classification, named entity recognition, and visualization into a single end-to-end pipeline. It supports extracting clinical insights from raw documents and predicting medical categories using both TF-IDF + SVM and BERT-based models.
The system was designed and implemented as a structured Python project, with each stage separated into independent modules for scalability and maintainability.
Key Highlights
Built an end-to-end NLP pipeline for clinical text processing.
Implemented SVM (≈51% accuracy) and BERT (≈77% accuracy) models.
Integrated OCR for extracting text from medical documents.
Performed Named Entity Recognition (NER) on clinical data.
Designed modular architecture (src/) for clean code organization.
Exported outputs for visualization and dashboard integration.
A LARA python-django app for managing projects and experiments in lab automation systems and scientific laboratories.
Documents for the course Introduction to Biostatistics and Programming:
Spreadsheets, Biostatistics, Bioinformatics programs, Biomedical databases, Programming languages, R programming language
Personal explorations in ML and statistics for quant trading