
Posts

Showing posts from March, 2025

MarkItDown with Ollama – Process images inside documents

Yesterday I showed how we can use MarkItDown to convert multiple document types to Markdown to make them easier to consume in an LLM context. I used a simple CV in PDF format as an example. But what if you have images inside your documents? No worries! MarkItDown allows you to process images inside documents as well. Although I couldn’t find a way to do this directly through the command line, it certainly is possible by writing some Python code. First make sure that you have the MarkItDown module installed locally: pip install 'markitdown[all]'. Now we can first try to recreate our example from yesterday through code. Remark: the latest version of MarkItDown requires Python 3.10. If that works as expected, we can further extend this code to include an LLM to extract image data. We’ll use Ollama in combination with LLaVA (Large Language and Vision Assistant), a model designed to combine language and vision capabilities, enabling it to process and understand...
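To sketch how this looks in code: my understanding is that MarkItDown accepts an OpenAI-compatible client through its llm_client and llm_model parameters, and that Ollama exposes an OpenAI-compatible API on localhost:11434. The file name and model name below are just examples.

# Sketch: let MarkItDown call a local LLaVA model (via Ollama) to describe images in a document
from openai import OpenAI
from markitdown import MarkItDown

# Ollama's OpenAI-compatible endpoint; the api_key value is a placeholder, the endpoint ignores it
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

md = MarkItDown(llm_client=client, llm_model="llava")  # llava = vision-capable model pulled in Ollama
result = md.convert("example.pdf")  # example input document containing images
print(result.text_content)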

Convert documents to Markdown to build a RAG solution

Context is key in building effective AI-enabled solutions. The most popular way to extend the pretrained knowledge of a Large Language Model is through RAG, or Retrieval-Augmented Generation. By augmenting LLMs with external data we ensure that outputs are not only coherent but also factually grounded and up-to-date. This makes it invaluable for applications like chatbots, personalized recommendations, content creation, and decision support systems. What is MarkItDown? For any RAG solution to function effectively, the quality and format of the input data are critical. This is where MarkItDown, a lightweight Python utility created by Microsoft, stands out. It specializes in converting various files into Markdown format, a token-efficient and LLM-friendly structure. From the documentation: MarkItDown is a lightweight Python utility for converting various files to Markdown for use with LLMs and related text analysis pipelines. To this end, it is most comparable to textract, ...
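To make this concrete, the basic conversion is only a few lines of Python; a minimal sketch, where cv.pdf stands in for whatever document you want to convert:

from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("cv.pdf")   # example input file
print(result.text_content)      # Markdown output, ready to chunk and embed for RAG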

VSCode - Change Python version

After installing the latest Python version on my local machine, I noticed that VSCode was still referring to an old(er) version. In this post I'll show how to fix this. Let's dive in! I installed a new Python version using the official installer: Download Python | Python.org. However, when I tried to run a Python program in VSCode, I noticed that an older version was still used when I looked at the output in the terminal:

& C:/Users/bawu/AppData/Local/Microsoft/WindowsApps/python3.9.exe d:/Projects/Test/MarkItDownImages/example.py
Traceback (most recent call last):
  File "d:\Projects\Test\MarkItDownImages\example.py", line 1, in <module>
    from markitdown import MarkItDown
ImportError: cannot import name 'MarkItDown' from 'markitdown' (C:\Users\bawu\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\markitdown\__init__.py)

To fix it, open...
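As a quick sanity check, independent of the fix itself, you can print the interpreter that actually executes your script from within VS Code; this is plain standard-library Python:

import sys

# Prints the full path of the Python executable and its version,
# so you can verify that VS Code picked the interpreter you expect.
print(sys.executable)
print(sys.version)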

Using Bolt.new locally using Bolt.diy and Ollama

Maybe you’ve heard about Bolt.new, the AI solution from StackBlitz that allows you to prompt, edit, and deploy full-stack web and mobile applications in a breeze. It uses an in-browser AI web development agent that leverages StackBlitz’s WebContainers to allow for full-stack application development. The application presents users with a simple, chat-based environment in which one prompts an agent to make code changes that are updated in real time in the WebContainers dev environment. I find it a great way to get a head start when building small(er) web applications. But what if, due to company policies or other reasons, you are not allowed to use Bolt online? In that case I have some good news for you, as the team from StackBlitz also created Bolt.diy, the open source version of Bolt.new, which allows you to choose the LLM that you use for each prompt. Installing Bolt.diy The easiest way to install Bolt.diy is through Docker. Start by cloning the repository locally: g...
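Roughly, the Docker route looks like the sketch below; the repository URL points to the stackblitz-labs project, and since I'm quoting from memory, check the bolt.diy README for the exact compose profile or build target to use:

git clone https://github.com/stackblitz-labs/bolt.diy.git
cd bolt.diy
# Assuming the repository ships a docker-compose file; see the README
# for the recommended profile before running this.
docker compose up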

Tackling Technical Debt - Where to start?

Every software project accumulates technical debt. Like financial debt, it compounds over time if left unaddressed, making future changes increasingly difficult and expensive. But knowing where to begin tackling technical debt can be overwhelming. As our time is limited, we have to choose wisely. I got inspired after watching Adam Tornhill's talk called Prioritizing Technical Debt as If Time & Money Matters. So before you continue reading this post, check out his great talk. Back? Ok, let’s first make sure that we agree on our definition of ‘technical debt’… Understanding Technical Debt Technical debt isn't just "bad code." It represents trade-offs made during development: shortcuts taken to meet deadlines, features implemented without a complete understanding of the requirements, or design decisions that made sense at the time but no longer fit current needs. Technical debt manifests in several ways: Code smells: duplicated code, overly complex methods, an...

Discontinuous improvement

One of the mantras I always preached to my teams was the concept of 'Continuous Improvement'. The idea is simple and appealing: we constantly seek incremental enhancements to our processes, products, and services. This approach, popularized by Japanese manufacturing methodologies like Kaizen, promises steady progress through small, ongoing adjustments rather than dramatic overhauls. However, while reading ‘Leadership is Language’ by L. David Marquet, I started to wonder: what if this widely accepted wisdom is fundamentally flawed? What if true improvement doesn't actually happen continuously at all? The stairway, not the ramp In his book, David explains that improvement doesn't occur as a smooth, uninterrupted climb upward. Rather, it happens in distinct, intentional batches - like climbing stairs instead of walking up a ramp. This is what he calls "discontinuous improvement," and understanding this concept can transform how your team operates. ...

VSCode - Expose a local API publicly using port forwarding

I’m currently working on building my own Copilot agent (more about this in another post). As part of the process, I needed to create an API and expose it publicly so it is accessible through a GitHub app. During local development and debugging I don't want to have to publish my API, so let's look at how we can use the VS Code Port Forwarding feature to expose a local API publicly. Port forwarding Port forwarding is a networking technique that redirects communication requests from one address and port combination to another. In the context of web development and VS Code, here's what it means: when you run a web application or API locally, it's typically only accessible from your own machine at addresses like localhost:3000 or 127.0.0.1:8080. Port forwarding creates a tunnel that takes requests coming to a publicly accessible address and forwards them to your local port. For example, if you have an API running locally on port 3000: Without port forw...
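To have something concrete to forward, any throwaway local API will do; here is a minimal sketch using only the Python standard library, listening on port 3000 to match the example above. Once it runs, the Ports view in VS Code can forward port 3000 to a public URL:

# Minimal throwaway API on port 3000 (standard library only)
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(b'{"message": "hello from my local API"}')

HTTPServer(("127.0.0.1", 3000), Handler).serve_forever()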

.NET Aspire Dashboard - The mystery of the hidden endpoint

One of the cool features of .NET Aspire is the Aspire Dashboard (also available standalone, by the way). It allows you to closely track various aspects of your app and its resources, including logs, traces, and environment configurations, in real time. After migrating an existing solution to .NET Aspire, I noticed that no URL was shown for some of the endpoints. I couldn’t find a direct reason why this was the case, but I found online that it could be related to the launchSettings.json files. Here is the launchSettings.json file for the ‘api’ project (where the endpoint URL is shown), and here is the launchSettings.json file for the ‘webapp’ project (where no endpoint is shown). With some trial and error, I found that the order of the profiles in the launchSettings.json file matters and that Aspire will use the first profile found. I switched the profiles for the ‘webapp’ project, and indeed, after doing that, the endpoint URL became visible on the dashboard: ...
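For illustration, here is a trimmed-down, hypothetical launchSettings.json in which the 'https' profile comes first and would therefore be the one Aspire picks up; profile names, ports, and settings are placeholders, not the actual files from the post:

{
  "profiles": {
    "https": {
      "commandName": "Project",
      "applicationUrl": "https://localhost:7123;http://localhost:5123"
    },
    "IIS Express": {
      "commandName": "IISExpress"
    }
  }
}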

You don’t need cloud

A recent survey by Flexera found that businesses waste an estimated 30% of their cloud spend on unused or unnecessary services. For small businesses with tight margins, this isn't just inefficient; it's potentially devastating. While tech giants promote cloud solutions as essential for every modern business, the reality for most small operations is quite different. Cloud is flexible, not cheap. Although you can build cost-effective solutions on a cloud platform, you have to be purposeful about this and design your solution specifically for it. The cost predictability problem One of the most frustrating aspects of cloud services is their variable pricing models. While providers advertise low entry prices, costs can quickly spiral as your usage increases:
Pay-per-use pricing makes monthly bills unpredictable
Resource provisioning often requires overestimation to prevent outages
Hidden costs for data transfer, API calls, and storage can shock you at month...

Future Tech 2025

Last week I had the opportunity to participate and speak at Future Tech 2025. And although it was the 7th edition, it was my first time there, so I didn't know what to expect. The Future Tech team was very kind and walked with me to the speaker room (yes, free 'stroopwafels'). When the opening was announced, I entered the main room together with 700+ other software developers, still unaware of what would happen next. And then Anjuli Jhakry and Dennis Vroegop took the stage… What started as a small introduction of the agenda evolved into a story about fear and ended with the key message that we all should confront and embrace our fears. All of this was brought in such a vulnerable and strong way (no, I will not spoil how they did it) that it will stick with me for the rest of my life. What a great intro and an exceptional example of storytelling. Kudos to both Anjuli and Dennis! After setting the bar so high, Roelant Dieben and his AI companion took the stage for his keynote to talk ...

Tweak your LLM models with Ollama – Using OpenWebUI

Yesterday I explained how we can create and upload our own language models in Ollama through the usage of a modelfile. I explained the modelfile format and the different building blocks that can be used to define and configure a model. Today I want to continue on my previous post by explaining how to use OpenWebUI instead of doing everything by hand.
Start by opening OpenWebUI (check out my previous post on how to get it up and running).
Click on the Workspace section on the left.
Click on the + button in the Models section on the right.
Start editing your modelfile (an example follows below).
Hit Save & Create at the bottom.
After saving the new model, you can immediately test it.
More information: Explore and test local models using Ollama and OpenWebUI, Models | Open WebUI
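To give an idea of what you might paste into that editor, here is a small example modelfile that starts from an existing base model; the base model name, parameter value, and system prompt are purely illustrative:

FROM phi4:latest
# Lower temperature for more deterministic answers (example value)
PARAMETER temperature 0.3
# Example system prompt baked into the model
SYSTEM """You are a concise assistant that answers in short bullet points."""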

Tweak your LLM models with Ollama

If you want to create and share your own model through Ollama or tweak an existing model, you need to understand the Ollama model file. The model file is the blueprint to create and share models with Ollama. Understanding the Ollama model file Let us first have a look at an existing model file as an example. For this you can use the following command: ollama show <modelname> --modelfile. Let’s give it a try:

ollama show phi4:latest --modelfile
# Modelfile generated by "ollama show"
# To build a new Modelfile based on this, replace FROM with:
# FROM phi4:latest
FROM C:\Users\bawu\.ollama\models\blobs\sha256-fd7b6731c33c57f61767612f56517460ec2d1e2e5a3f0163e0eb3d8d8cb5df20
TEMPLATE """{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 -}}
<|im_start|>{{ .Role }}<|im_sep|>
{{ .Content }}{{ if not $last }}<|im_end|>
{{ end }}
{{- if and...
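Once you have a modelfile of your own, for instance one that starts FROM phi4:latest and overrides a parameter or the system prompt, turning it into a named, runnable model takes two commands; a minimal sketch, where the model name my-phi4 and the file name Modelfile are arbitrary:

# Create a new model from a local modelfile
ollama create my-phi4 -f Modelfile
# Run the newly created model
ollama run my-phi4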

Running LLMs locally using LM Studio

As I like to experiment a lot with AI, I always have to be careful and keep my token usage under control. And although the token cost has decreased over time for most models, the expenses can go up quite fast. That is one of the reasons I like to use (Large) Language Models locally. There are multiple ways to run a model locally, but my preferred way so far was Ollama (together with OpenWebUI). I also experimented with Podman AI Lab, but I always returned to Ollama in the end. Recently a colleague introduced me to LM Studio, another tool to run and test LLMs locally. With LM Studio, you can:
Run LLMs offline on your local machine
Download and run models from Hugging Face
Integrate your own application with a local model using the LM Studio SDK or through the OpenAI endpoints (see the sketch below)
Use the built-in RAG support to chat with your local documents
More than enough reasons to give it a try… Getting started I downloaded the installer from the websi...
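To give an idea of the OpenAI-endpoint route mentioned above, here is a minimal sketch; it assumes LM Studio's local server is running on its default port 1234, and the model identifier is a placeholder for whatever model you loaded in LM Studio:

from openai import OpenAI

# LM Studio exposes an OpenAI-compatible server; the api_key value is a placeholder
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="local-model",  # placeholder: use the identifier of the model loaded in LM Studio
    messages=[{"role": "user", "content": "Summarize what RAG is in one sentence."}],
)
print(response.choices[0].message.content)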