
Posts

Showing posts from November, 2023

PowerBI–Load a parquet file from an Azure Data Lake Storage

In our Azure Data Lake Storage we have data stored in parquet files. Reading this data in PowerBI is not that hard. In this post I'll walk you through the steps to get this done. Start by opening PowerBI and click on Get data from another source. Choose Azure Data Lake Storage Gen2 from the list of available sources and click on Connect. Enter the URL of your Azure Data Lake Storage and click on OK. You now get a list of the files found in the data lake. We don't want to use these files directly but transform them first, so click on Transform Data. This will open up the Power Query editor. Click on Binary next to the parquet file we want to extract. This adds an extra step to our Power Query that parses the parquet file and extracts all the data. Click on Close & Apply to apply the changes to our query and start using the results. That's it! More information Azure Data Lake Storage Gen2 - Power Query | Microsoft Learn Analyze data in Az

NuGet 6.8–Package vulnerability notifications

Starting from 6.8, NuGet audits PackageReference packages and warns you if any have known vulnerabilities, similar to what npm does when you run npm install. This works when using dotnet restore: And also when using Visual Studio: Nice! More information Auditing package dependencies for security vulnerabilities | Microsoft Learn

Git–Discard local changes

Git offers a wide range of possibilities to discard local changes. In this post I'd like to share some of the available options. Before I dive into the details, it is important to make the distinction between untracked and tracked files. From the documentation: Git has something called the "staging area" or "index". This is an intermediate area where commits can be formatted and reviewed before completing the commit. Untracked files live in the git working directory but are not managed by git until you stage them. Tracked files Here are some options to discard changes in tracked files: Discard Changes in a Specific File: If you want to discard changes in a specific file, you can use the following command: git checkout -- filename This will replace the changes in the specified file with the last committed version of that file. Discard Changes in All Modified Files: To discard changes in all modified files in the workin

.NET 8–JSON Source Generator improvements

If you don't know what the JSON source generator is, I suggest first checking this older post before you continue reading. Still there? OK! Today I want to focus on some improvements in the JSON source generator that were introduced with the release of .NET 8. Although the JSON source generator is a great (performance) improvement, it couldn't handle some of the more recent language features like required members and init-only properties. If you tried to use these features in combination with source generators, you got the following warning before .NET 8: Starting from .NET 8, full support for required and init members has been added and the code above will just work:
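The excerpt above refers to code that isn't shown here. As a hedged illustration (the Person model and AppJsonContext names are mine, not necessarily the post's), a source-generated context for a type with required and init-only members looks roughly like this in .NET 8:

```csharp
using System.Text.Json;
using System.Text.Json.Serialization;

// Model using required and init-only members; before .NET 8 the JSON source
// generator warned that it could not handle these, from .NET 8 onwards it can.
public class Person
{
    public required string Name { get; init; }
    public int Age { get; init; }
}

// Source-generated serialization metadata for the model.
[JsonSerializable(typeof(Person))]
public partial class AppJsonContext : JsonSerializerContext
{
}

// Usage: serialization goes through the generated context instead of reflection.
// var json = JsonSerializer.Serialize(new Person { Name = "Alice", Age = 42 },
//                                     AppJsonContext.Default.Person);
```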

MassTransit–Quorum queues

Mirrored queues have been a feature in RabbitMQ for quite some time. When using mirrored queues, messages are replicated across multiple nodes, providing high availability in a RabbitMQ cluster. Each mirrored queue has a master and one or more mirrors, and messages are replicated from the master to its mirrors. Mirrored queues operate on synchronous replication, meaning that the master node waits for at least one mirror to acknowledge the receipt of a message before considering it successfully delivered. This impacts performance and can result in throughput issues due to the synchronous nature of replication. Certain failure scenarios can also result in mirrored queues confirming messages too early, potentially resulting in data loss. Quorum queues Quorum queues are a more recent addition to RabbitMQ, introduced to address some of the limitations posed by mirrored queues. They use a different replication model based on the Raft consensus algorithm. In this model, each queue is repli
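The excerpt cuts off before the configuration, but as a minimal sketch (the endpoint, message, and consumer names are made up), MassTransit lets you declare a RabbitMQ receive endpoint as a quorum queue roughly like this:

```csharp
using System;
using System.Threading.Tasks;
using MassTransit;
using Microsoft.Extensions.DependencyInjection;

var services = new ServiceCollection();

services.AddMassTransit(x =>
{
    x.AddConsumer<OrderSubmittedConsumer>();

    x.UsingRabbitMq((context, cfg) =>
    {
        cfg.Host("localhost");

        cfg.ReceiveEndpoint("order-service", e =>
        {
            // Declare the queue as a quorum queue instead of a classic
            // (optionally mirrored) queue.
            e.SetQuorumQueue();

            e.ConfigureConsumer<OrderSubmittedConsumer>(context);
        });
    });
});

// Hypothetical message and consumer used by the endpoint above.
public record OrderSubmitted(Guid OrderId);

public class OrderSubmittedConsumer : IConsumer<OrderSubmitted>
{
    public Task Consume(ConsumeContext<OrderSubmitted> context) => Task.CompletedTask;
}
```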

.NET 8 and C# 12–Overview

If you want to see how .NET and C# evolved over time, check out the updated overview created by nietras: Have a look at his post for more details and a PDF version of the image above.

Company vs company

In the English vocabulary, the word 'company' has two meanings: Company: an organization that sells goods or services in order to make money And Company: the fact of being with a person or people, or the person or people you are with I think it is no coincidence that the same word is used for both meanings. Origin The word "company" traces its origins back to the Latin "com-", meaning "together with", and "panis", meaning "bread". In its earliest usage, "company" referred to a group of people who shared meals together, highlighting the communal aspect of coming together around a common table. This initial meaning laid the foundation for the word's dual evolution, branching into both social and business contexts. Social Company: In its more informal sense, "company" refers to a gathering of individuals for social interaction or mutual enjoyment. The shared origin with

ADFS - MSIS5007: The caller authorization failed for caller identity

Our ASP.NET Core applications typically use WS-Federation with ADFS as our Identity Provider. After configuring a new application (Relying Party) in ADFS, the first authentication attempt failed with the following error message: Encountered error during federation passive request. A look at the event viewer gave us more details: Protocol Name: wsfed Relying Party: https://localhost/example/ Exception details: Microsoft.IdentityServer.RequestFailedException: MSIS7012: An error occurred while processing the request. Contact your administrator for details. ---> Microsoft.IdentityServer.Service.IssuancePipeline.CallerAuthorizationException: MSIS5007: The caller authorization failed for caller identity <domain>\<ADUser> for relying party trust https://localhost/example/ . at Microsoft.IdentityModel.Threading.AsyncResult.End(IAsyncResult result) at Microsoft.IdentityModel.Threading.TypedAsyncResult`1.End(IAsyncResult result) at Microsoft.Id

Find a subset from a set of values whose sum is closest to a specific value–C#

I got an interesting question from my girlfriend last week: Given I have a list of numbers, I want to select a subset of numbers whose sum comes closest to a specific (positive) value. Let me give a simplified example to explain what she was asking for: If our list is [12, 79, 99, 91, 81, 47] and the expected value is 150, it should return [12, 91, 47] as 12+91+47 is 150. If our list is [15, 79, 99, 6, 69, 82, 32] and the expected value is 150, it should return [69, 82] as 69+82 is 151, and there is no subset whose sum is 150. This turns out to be known as the Subset sum problem and is a computationally hard problem to solve. Luckily the list of numbers she needs to work with is quite small (about 50 numbers) and we can easily brute force this. Yesterday I explained how this problem can be solved in Excel, but what is the fun in that?! Let us have a look at how we can do this in C#. With some help from GitHub Copilot I came up with the following solution: Let us
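The excerpt cuts off before the code, but a minimal brute-force sketch (not necessarily the post's exact solution; the method name is mine) could look like this:

```csharp
using System;
using System.Collections.Generic;

class SubsetSum
{
    // Returns the (non-empty) subset whose sum is closest to the target.
    // Enumerates every subset via a bitmask, so the runtime grows as 2^n;
    // only practical for fairly short lists.
    static List<int> FindClosestSubset(IReadOnlyList<int> numbers, int target)
    {
        var best = new List<int>();
        var bestDiff = int.MaxValue;

        // Each bit of 'mask' decides whether the corresponding number is included.
        for (long mask = 1; mask < (1L << numbers.Count); mask++)
        {
            var subset = new List<int>();
            var sum = 0;
            for (var i = 0; i < numbers.Count; i++)
            {
                if ((mask & (1L << i)) != 0)
                {
                    subset.Add(numbers[i]);
                    sum += numbers[i];
                }
            }

            var diff = Math.Abs(target - sum);
            if (diff < bestDiff)
            {
                bestDiff = diff;
                best = subset;
            }
        }

        return best;
    }

    static void Main()
    {
        var result = FindClosestSubset(new[] { 15, 79, 99, 6, 69, 82, 32 }, 150);
        Console.WriteLine(string.Join(" + ", result)); // e.g. 69 + 82
    }
}
```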

Find a subset from a set of values whose sum is closest to a specific value–Excel

I got an interesting question from my girlfriend last week: Given I have a list of numbers, I want to select a subset of numbers whose sum comes closest to a specific (positive) value. Let me give a simplified example to explain what she was asking for: If our list is [12, 79, 99, 91, 81, 47] and the expected value is 150, it should return [12, 91, 47] as 12+91+47 is 150. If our list is [15, 79, 99, 6, 69, 82, 32] and the expected value is 150, it should return [69, 82] as 69+82 is 151, and there is no subset whose sum is 150. This turns out to be known as the Subset sum problem and is a computationally hard problem to solve. Luckily the list of numbers she needs to work with is quite small (about 50 numbers) and we can easily brute force this. Today I want to show you how we can tackle this problem in Excel using the Solver add-in. Activate the Solver Add-In: Go to the "File" tab. Click on "Options." In the Excel Opti

PowerBI–Limit the amount of imported data

For an Azure Fabric based data platform we are building, I'm evaluating PowerBI as a possible reporting solution. My knowledge of PowerBI is almost non-existent, so there are a lot of things I learned along the way. Yesterday I talked about the different data fetching strategies. Today I want to focus on how to limit the amount of data. I explained yesterday that PowerBI can handle large amounts of data. This is good news, but during development I want to work with a smaller dataset to limit the file size and gain some extra development speed. Limit the amount of data Let's find out how we can limit the amount of data in PowerBI. Start by opening the Query Editor view in PowerBI by selecting Transform data from the Home tab: This will load the Query Editor. Here we need to create a new parameter that can be used as a limit value. Click on Manage Parameters and choose New Parameter: We create a LimitRows parameter with a default value of 10: Now that we have

PowerBI– How to fetch your data?

For an Azure Fabric based data platform we are building, I'm evaluating PowerBI as a possible reporting solution. My knowledge of PowerBI is almost non-existent, so there are a lot of things I learned along the way. My first lesson learned: there are multiple ways to fetch data in PowerBI. I learned this lesson quite fast when I tried to connect to my first datasource. After selecting the 'Import data from database' option I immediately got to choose a Data Connectivity mode: The available options were: Import DirectQuery Let's find out what each of these options means and what exactly the difference is. Import mode This mode should be your first choice. As the name implies, in this mode the data is imported into the report. When querying the data everything is loaded into memory, which gives really fast query performance. This sounds like a good idea for really small datasets, but what if you have a lot of data? PowerBI makes it possible to work with big dataset

C#–Declaring attributes on positional record types

In C# 9 record types were introduced. A record in C# is a class or struct that provides special syntax and behavior for working with data models. What makes them different from a 'normal' class or struct is that they offer the following functionality: Value equality: Two record instances are equal if the types match and all property and field values match Immutability: You cannot change any property or field value after instantiation You have 2 ways to define a record type. One way is similar to what you already know when creating classes or structs: A second way is through positional parameters: Behind the scenes, the compiler does a lot of work for us as it creates: A public auto-implemented property for each positional parameter provided in the record declaration A primary constructor whose parameters match the positional parameters on the record declaration. A Deconstruct method with an out parameter for each positional parameter provided in the r
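The excerpt cuts off before reaching the topic of the title, but as a hedged illustration (the Person record and the choice of JSON attributes are mine, not necessarily the post's), an attribute target such as property: tells the compiler where an attribute placed on a positional parameter should end up:

```csharp
using System.Text.Json.Serialization;

// Without a target, the attribute is applied to the constructor parameter.
// The 'property:' target puts it on the compiler-generated auto-property instead.
public record Person(
    [property: JsonPropertyName("first_name")] string FirstName,
    [property: JsonPropertyName("last_name")] string LastName);
```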

.NET 8–Http Logging

In .NET 7 and before, the default request logging in ASP.NET Core is quite noisy, with multiple events emitted per request. That is one of the reasons why I use the Serilog.AspNetCore package. By adding the following line to my ASP.NET Core application I can reduce the number of events to 1 per request. The result looks like this in Seq: Starting from .NET 8 the HTTP logging middleware has several new capabilities and we no longer need Serilog to achieve the same result. By configuring the following 2 options, we can achieve this: HttpLoggingFields.Duration: When enabled, this emits a new log at the end of the request/response measuring the total time in milliseconds taken for processing. This has been added to the HttpLoggingFields.All set. HttpLoggingOptions.CombineLogs: When enabled, the middleware will consolidate all of its enabled logs for a request/response into one log at the end. This includes the request, request body, response, response body, an
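The excerpt cuts off before showing the configuration, but a minimal sketch of wiring this up in a .NET 8 minimal API (the endpoint is just a placeholder) could look like this:

```csharp
using Microsoft.AspNetCore.HttpLogging;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddHttpLogging(options =>
{
    // All includes the new Duration field in .NET 8.
    options.LoggingFields = HttpLoggingFields.All;

    // Consolidate everything into a single log entry per request/response.
    options.CombineLogs = true;
});

var app = builder.Build();

app.UseHttpLogging();

app.MapGet("/", () => "Hello world");

app.Run();
```

Note that the Microsoft.AspNetCore.HttpLogging category typically needs to be set to Information (or lower) in your logging configuration before these entries show up.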

Use the index, Luke!

While investigating some database related performance issues, I discovered the following website: This site explains everything you as a developer need to know about SQL indexing. Use The Index, Luke is the free web edition of SQL Performance Explained. It presents indexing in a vendor-agnostic fashion but also shares product-specific guidelines for DB2, MySQL, Oracle, PostgreSQL and SQL Server. Do you think you already know everything you need to know about SQL indexing? Confirm your skills by taking this short 3-minute test: 3-Minute Test: What do you know about SQL performance? I can't say that I did so great:

ADFS–ID4216 error

After creating a new claims rule, our ADFS instance started to return the following error message: The new rule we created just passes through an existing claim value: c:[Type == "urn:be:vlaanderen:acm:rrn"] => issue(claim = c); To explain why this error happened I have to give some extra context. Our ADFS instance is federated with another Identity Provider STS (IP-STS) and is acting as a resource STS (R-STS). The communication between the IP-STS and the R-STS (our ADFS instance) is done through the SAML 2.0 protocol and the tokens returned are also in SAML 2.0 format. However, the communication between the R-STS and the relying party (an ASP.NET Core application) is done through WS-Federation and the token format used there is SAML 1.1. SAML tokens have URI (ClaimType) rules that differ based on the version of the SAML token you intend to issue. AD FS 2.0 supports the WS-Federation, WS-Trust, and SAML 2.0 protocols. The WS-Federation protocol only supports SAML 1.1 tokens.

ADFS Claim rules

ADFS has the concept of claim rules which allow you to enumerate, add, delete, and modify claims. This is useful when you want to, for example, introduce extra claims (based on data in a database or AD) or transform incoming claims. Although the available documentation is already helpful, I still find it a challenge to write my own claim rules. So therefore this post… Claim rule components To start, a claim rule consists of 2 parts, separated by the "=>" operator: An optional(!) condition An issuance statement So both these rules are correct: Rule #1: => issue(type = "https://test/role", value = "employee"); Rule #2: c:[type == "https://test/employee", value == "true"] => issue(type = "https://test/role", value = "employee") The first rule will always generate a new outgoing claim of type https://test/role. The second rule will only generate a new outgoing claim of type https://t

Azure Pipelines - Nuget - Unable to get local issuer certificate

For an unknown reason some of our builds suddenly started to fail. A look at the build log showed us the following error message: ##[error]Error: unable to get local issuer certificate An error message we had seen before… The error occurred only on one of our build agents, when trying to download a newer NuGet version than the one currently available. We couldn't find a good solution, but as a workaround we manually downloaded the latest NuGet version and copied it to the tools folder at: <agent directory>\_work\_tool\NuGet