Yesterday I showed how we can use MarkItDown to convert multiple document types to markdown so that they are easier to consume in an LLM context. I used a simple CV in PDF format as an example. But what if you have images inside your documents?

No worries! MarkItDown can process images inside documents as well. Although I couldn’t find a way to do this directly from the command line, it is certainly possible by writing some Python code.

First make sure that you have the MarkItDown module installed locally:

pip install 'markitdown[all]'

Now we can first try to recreate our example from yesterday in code.

Remark: The latest version of MarkItDown requires Python 3.10 or higher.

If that works as expected, we can extend this code to include an LLM to extract image data. We’ll use Ollama in combination with LLaVA (Large Language and Vision Assistant), a model designed to combine language and vision capabilities, enabling it to process and understand...
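
As a rough sketch of the two steps described above (plain conversion first, then conversion with a vision model attached), something like the following could work. It is not the exact code from this post. It assumes that Ollama is running locally with its OpenAI-compatible endpoint on the default port, that the `llava` model has been pulled, and that the `openai` package is installed. The helper names `build_converter` and `convert_to_markdown` are made up for illustration. How much image content actually gets described may depend on the document type and the MarkItDown version.

```python
import sys


def build_converter(use_llm: bool = False, model: str = "llava"):
    """Create a MarkItDown converter, optionally backed by a local vision model.

    Hypothetical helper for illustration; not part of MarkItDown itself.
    """
    # Imported lazily so this file can be loaded without markitdown installed.
    from markitdown import MarkItDown

    if not use_llm:
        # Plain conversion, equivalent to running the CLI on a single file.
        return MarkItDown()

    # Assumption: Ollama serves an OpenAI-compatible API at this local URL.
    # The client requires an API key, but Ollama ignores its value.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
    return MarkItDown(llm_client=client, llm_model=model)


def convert_to_markdown(path, converter=None) -> str:
    """Convert a single document to markdown and return the text."""
    if converter is None:
        converter = build_converter()
    result = converter.convert(str(path))
    return result.text_content


if __name__ == "__main__":
    # Usage: python convert.py <file> [<file> ...]
    llm_converter = None
    for name in sys.argv[1:]:
        if llm_converter is None:
            llm_converter = build_converter(use_llm=True)
        print(convert_to_markdown(name, converter=llm_converter))
```

Passing the converter in as a parameter keeps the conversion function independent of whether an LLM is attached, so the same call works for the plain example and for the image-aware one.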