PGDF

OSIRIS Legal-Fiscal AI Workflows

AI delivery for PGDF legal-fiscal operations, spanning production APIs, supervised and semi-supervised models, active learning, and early LLM exploration for document-heavy institutional workflows.

Data Scientist · May 2023 - May 2024

Stack

  • Python
  • SQL
  • FastAPI
  • Pytest
  • scikit-learn
  • XGBoost
  • LightGBM
  • DVC
  • MLflow
  • spaCy
  • Hugging Face Transformers
  • LangChain
  • Pandas
  • NumPy
  • Docker

Primary impact

Brought governed ML workflows and production APIs into legal-fiscal operations, while designing active-learning paths for longer-term model adaptation.

Outcomes

  • Production APIs connected model outputs to PGDF internal systems
  • Active-learning loop designed to reduce model drift over time
  • LLM exploration opened paths for future fiscal-text workflows

Context

OSIRIS was a research and development initiative supporting PGDF on legal-fiscal execution workflows. The goal was to automate internal steps, improve efficiency, and explore where machine learning and LLMs could reduce repetitive work in document-heavy institutional processes.

My role

  • Worked as Data Scientist on the initiative and translated business requirements into technical scope.
  • Developed RESTful APIs in Python and FastAPI to connect model outputs to PGDF systems.
  • Built and evaluated supervised, unsupervised, and semi-supervised models for fiscal classification and process optimization.
  • Designed active-learning and experimentation workflows for longer-term model adaptation.

Problem

Legal-fiscal operations combine messy text, changing procedures, and institutional systems that cannot tolerate brittle automation. The team needed machine learning that could improve internal flow without creating a hard-to-maintain research island.

That required practical model delivery, not just experimentation: reproducibility, data versioning, integration, and a plan for model behavior as the domain evolved.
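A reproducible delivery setup of this kind is typically expressed as a versioned pipeline. The sketch below is a hypothetical `dvc.yaml` fragment (stage names, scripts, and paths are illustrative, not from the project) showing how preprocessing and training stages can be pinned to their inputs and outputs:

```yaml
stages:
  preprocess:
    cmd: python src/preprocess.py data/raw data/processed
    deps:
      - src/preprocess.py
      - data/raw
    outs:
      - data/processed
  train:
    cmd: python src/train.py data/processed models/classifier.pkl
    deps:
      - src/train.py
      - data/processed
    outs:
      - models/classifier.pkl
    metrics:
      - metrics.json:
          cache: false
```

With stages declared this way, any model artifact can be traced back to the exact data and code that produced it, which is what keeps the work from becoming a research island.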

Architecture

The OSIRIS workflow was built around:

  • preprocessing and feature-engineering pipelines for legal-fiscal data
  • supervised, unsupervised, and semi-supervised model experiments
  • REST APIs for production integration
  • dataset and experiment versioning through DVC and MLflow
  • active-learning loops to keep the system current
  • exploratory LLM workflows for fiscal-text interpretation
  • continuous improvements to data pipelines and training frameworks

The system was designed to support both present delivery needs and future model evolution.
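The active-learning loop above can be sketched as entropy-based uncertainty sampling: score unlabeled documents by predictive entropy and route only the most ambiguous ones for manual labeling. This is a minimal stdlib-only sketch with placeholder probabilities; the batch size and document IDs are illustrative.

```python
import math

def entropy(probs):
    """Predictive entropy of a class-probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_for_labeling(predictions, batch_size=2):
    """Pick the most uncertain documents for manual review.

    predictions: list of (doc_id, class_probabilities) pairs.
    Returns the doc_ids with the highest predictive entropy.
    """
    ranked = sorted(predictions, key=lambda item: entropy(item[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:batch_size]]

preds = [
    ("doc-1", [0.98, 0.01, 0.01]),  # confident -> skip
    ("doc-2", [0.40, 0.35, 0.25]),  # ambiguous -> label
    ("doc-3", [0.55, 0.30, 0.15]),  # somewhat ambiguous -> label
    ("doc-4", [0.90, 0.05, 0.05]),  # confident -> skip
]
queue = select_for_labeling(preds)  # -> ["doc-2", "doc-3"]
```

Routing only high-entropy documents to reviewers is what keeps the relabeling burden low while the model tracks a changing legal-fiscal domain.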

Challenges

  • Legal-fiscal text changes over time, so static models decay quickly.
  • Production adoption depends on integration quality as much as model quality.
  • LLM exploration in institutional environments needs a clear boundary between useful experimentation and premature rollout.
  • Internal legal-fiscal workflows need automation that remains explainable and maintainable over time.

Solution

I treated the project as a workflow problem first. The solution combined governed ML delivery, API integration, and active-learning design so the models could improve without becoming operationally fragile.

In parallel, I evaluated how LLMs could support fiscal-text interpretation while keeping the work anchored in real deployment constraints. That created a better base for future expansion without overselling early experiments.

Impact

  • Deployed model-backed APIs into PGDF internal systems.
  • Designed an active-learning loop for continuous improvement with lower manual relabeling burden.
  • Opened practical LLM paths for legal-fiscal document analysis while keeping delivery grounded in operational reality.

Related writing

Writing that emerged from the same work.

Technical writing

LLM Evaluation in Production Starts With Explicit Failure Modes

Jul 2, 2025

Evaluation is most useful when it reflects the failures a system can actually produce in production: missing context, wrong retrieval, incorrect tool use, unstable outputs, and unhelpful responses.

  • LLM
  • Evaluation
  • Production AI
  • Quality

Technical writing

Scaling ML Pipelines Means Reducing Hidden Manual Work

May 19, 2025

ML pipelines usually fail to scale because they depend on undocumented manual steps around data preparation, retraining, packaging, and release coordination.

  • MLOps
  • Airflow
  • MLflow
  • CI/CD

Next step

Need the broader background behind this work?

The about page connects these case studies to the rest of my delivery history across courts, agencies, and AI platform work.