From the Programme Director

Before I joined UTwente as a professor in 2020, I worked at a company in Belgium called Raincode. My official job title was “analyst/developer”; I also reached the level of Chief Science Officer, but now is not the time to talk about executive functions. As a developer, I wrote code daily, mostly in C# and sometimes in other languages, including domain-specific languages that I created myself or helped create, such as Rascal, developed at CWI in Amsterdam from 2010 onward. As an analyst, I did some interesting work analysing entire codebases of very large companies that I am not allowed to name.

The biggest codebase I have seen with my own eyes, and touched with my own hands, had about 250 million lines of code. If you write a hundred lines of code a day (quite a respectable number, I might add), this would take you about 7000 years to accomplish, even if by some miracle you knew what you were doing. That codebase was a decades-old system of an entire bank, and they guarded it like it was their most valuable treasure, because it was! It was a Spanish bank, not the largest there, but the fourth largest. Sometimes I still wonder how much code the largest bank in Spain had back then…
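For readers who want to check the arithmetic behind the 7000 years, here is a minimal sketch. The two input figures come from the text; writing code 365 days a year with no weekends or holidays is an assumption made only for this illustration.

```python
# Figures quoted in the text.
LINES_OF_CODE = 250_000_000
LINES_PER_DAY = 100

# Assumption for this sketch: coding every single day of the year.
DAYS_PER_YEAR = 365


def years_to_write(total_lines: int, lines_per_day: int,
                   days_per_year: int = DAYS_PER_YEAR) -> float:
    """Return how many years it takes to write total_lines at a fixed daily pace."""
    days = total_lines / lines_per_day
    return days / days_per_year


years = years_to_write(LINES_OF_CODE, LINES_PER_DAY)
print(f"{years:.0f} years")  # prints "6849 years"
```

That is 2.5 million working days, or a little under 7000 years. With weekends off, the figure would be considerably higher.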

However, the biggest problem was not the size but the complexity. Diversity, so to speak. Usually, the codebase would be delivered to me on a flash drive, accompanied by all kinds of advice to be very careful with it, for it contained the “business logic” of the company, the dreaded crucial piece of algorithmics that, if stolen, would allow the competitors to do mean things to my customers. Obviously, I didn’t want that. At first, I opened those flash drives hoping to find a nice hierarchical structure, with packages, modules, and whatnot. Every time I expected any of that, I was gravely disappointed.

The code was delivered as a collection of files residing in one folder. The files would have no extension. There would be thousands, even hundreds of thousands of them, all next to one another, all looking similar until you took a closer look. A lot of that code was in COBOL (which is typical for the banking sector), and unless you have followed my MSc CS course on Software Evolution, you might not even have heard of that language. It wasn’t a problem, really: we had our own COBOL compiler, since it was in high demand, and Microsoft appreciated us having it (targeting their .NET and Azure) so much that they gave us, a tiny club of compiler nerds, the Gold Partner status. We could handle that. Occasionally I would find other languages, and we had to deal with those somehow as well.

However, a part of the codebase was inevitably written in assembler. Have you seen assembler? It is the lowest level programming language possible, besides perhaps pure machine code, and why would you even use it to make anything like banking software? Could it be performance? No, actually, that was not the main reason, as I found out by talking to very senior software engineers. The actual reason was that assembler was free — and when they started decades ago on their banking system, they basically could not afford a real compiler.

We bit the bullet and wrote our own assembler compiler. It sounds crazy to normies’ ears: an assembler is something that makes machine code from low-level programs, and a compiler is something that makes lower-level code from higher-level code. This is how it has been since the first compiler was conceived in 1958, before my parents were even born.

Our assembler compiler pretended that assembler was a high-level language, and translated each instruction into native .NET code or into some C# simulation code. Once I had completed it, it was a revolution, since it allowed our customers to preserve their old code exactly the way it had been running for generations of programmers, until the renovation activities were complete. (I never claimed it was a good idea to keep it forever, but being in VS/VS Code gives you opportunities you might not have had on older machines, and those help you in testing and replacing it further.) We received an award from Microsoft for technology renovation or some such; it was a huge round thing proudly displayed at the entrance of the company, and it is probably still displayed there to this day!
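To give a flavour of what “translating each instruction into simulation code” can mean, here is a deliberately tiny sketch in Python. It is not the Raincode design: the three-instruction toy language (`LOAD`, `ADD`, `SUB`), the register names, and all function names are hypothetical, invented for this illustration. The point is only that each instruction is translated once, ahead of execution, into a host-language function that simulates its effect on the machine state.

```python
from typing import Callable, Dict, List, Optional

Registers = Dict[str, int]
Step = Callable[[Registers], None]


def compile_instruction(line: str) -> Step:
    """Translate one toy instruction into a function that simulates it."""
    op, _, rest = line.strip().partition(" ")
    args = [a.strip() for a in rest.split(",")]

    if op == "LOAD":  # LOAD R1, 5  ->  R1 = 5
        reg, value = args[0], int(args[1])

        def step(regs: Registers) -> None:
            regs[reg] = value
    elif op == "ADD":  # ADD R1, R2  ->  R1 = R1 + R2
        dst, src = args

        def step(regs: Registers) -> None:
            regs[dst] = regs[dst] + regs[src]
    elif op == "SUB":  # SUB R1, R2  ->  R1 = R1 - R2
        dst, src = args

        def step(regs: Registers) -> None:
            regs[dst] = regs[dst] - regs[src]
    else:
        raise ValueError(f"unsupported instruction: {line!r}")
    return step


def compile_program(source: str) -> List[Step]:
    """Translate every non-empty line up front, before anything runs."""
    return [compile_instruction(l) for l in source.splitlines() if l.strip()]


def run(program: List[Step], regs: Optional[Registers] = None) -> Registers:
    """Execute the translated steps in order on a fresh copy of the registers."""
    state = dict(regs or {})
    for step in program:
        step(state)
    return state


PROGRAM = """
LOAD R1, 40
LOAD R2, 2
ADD R1, R2
"""

result = run(compile_program(PROGRAM))
print(result)  # {'R1': 42, 'R2': 2}
```

A real assembler has jumps, memory, condition codes, and hundreds of instructions, none of which this sketch attempts. It only shows the shift in perspective: the low-level program is treated as source code to be translated, not as something to be assembled for the original machine.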

I wrote some papers about its design and moved on. Now I use it occasionally as an example in my lectures — it made a difference for our customers back then, still does, and it can make some impact on today’s students as well. After all, computer science is all about changing the world for the better with the power of technology!
Collectively, as humankind, we have created a lot of legacy, and we can own up to it by facing the consequences and dealing with them using the same technology we claim to have within our grasp.

About Vadim

Dr. Vadim Zaytsev completed his second Master’s degree at UT in Telematics, a programme that has since been absorbed into the CS master as the “Internet Science & Technology” specialisation. He obtained his PhD at the Free University (VU) in Amsterdam. Besides teaching, he has been an active researcher and has published many academic papers on topics such as software engineering and modelling, modernisation and migration of legacy software, transformational and generative techniques, and software languages and grammars, earning the nickname “grammarware”. He worked for several years in industry, developing new compilers, some of which were recognised with Microsoft Tech awards.

Since 2020, Vadim has been back at UT as an associate professor of software evolution. In 2021, he also won the Inter-Actief Decentralised Educational Award (IDEA). Vadim has a cat named Viking.