The turn of the year is always a good time to look back on history. In this first blog post of the new year, I want to tell you the story of my very first “data warehouse” project. A not very serious blog post with some serious recommendations at the end.
Yesterday, the public Swiss radio station celebrated the 50th anniversary of the Swiss charts (“Schweizer Hitparade”). All day long, they played number-one hits from the last five decades. Listening to some of the songs from the late seventies and early eighties reminded me of my first software development project.
As a teenager, the weekly charts on the radio were an important regular event for me. I not only listened to all the new songs, but also wrote down the chart positions of the top 15 songs every week – very old-fashioned, with paper and pencil. I spent hours on statistical evaluations: Which band had the most songs in the charts? Which songs were in the top 3, 10 or 15 ranks, and for how many weeks? How were the songs distributed between English, German and Swiss German? As you can see, a lot of really important and useful information for a teenager like me.
In 1981, I bought my first computer: a Commodore PET 3032 with the gigantic memory capacity of 32 kilobytes! Because I spent so much time with this machine, I could never part with it. My old Commodore computer is still in our cellar, but I guess it does not run anymore.
The old hardware… and the young programmer
My first software project was a kind of “data warehouse” (of course, I did not know this term in 1981 – it was first introduced in 1988). I spent several weeks developing a BASIC program to enter and edit the Swiss chart rankings. The data was stored in a BASIC array in memory. So, I already worked with “in-memory databases” in the early eighties. But I was aware that data must be stored in a “non-volatile” form, and I even implemented a “Save” function that wrote all the data to an audio cassette.
The user interface was quite comfortable: After starting the program, the history data had to be read from cassette (this took several minutes). Then I was able to enter the results of the next chart edition. Songs that were already in the top 15 the previous week did not have to be entered again. I just had to type in the rank from the last edition, and the program copied (not referenced!) the title and singer/band name automatically. Wow! A kind of “slowly changing charts”…
When the radio show was over and I had entered all my new data, I had to rewind the audio cassette and press the shortcut command “S” to save the data (for young readers of my blog: the Commodore computer had neither a mouse nor a trackpad nor a touch screen – just a keyboard).
The BASIC program contained several statistical functions, for example an overall ranking for the current year, and some more stuff. I even implemented security functionality. Two passwords were required, one to edit and one to read the data (I can’t remember this detail, but I think the passwords were hard-coded in the program code).
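For readers who prefer code to prose, the copy-forward step described above could be sketched in modern Python roughly as follows. This is not the original BASIC program; the names `ChartEntry` and `carry_forward` and the placeholder songs are invented purely for illustration.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass
class ChartEntry:
    """One line of a weekly top-15 list (hypothetical structure)."""
    rank: int
    title: str
    artist: str


def carry_forward(previous_week: List[ChartEntry], last_rank: int, new_rank: int) -> ChartEntry:
    """Build this week's entry by copying title and artist from last week's entry."""
    for entry in previous_week:
        if entry.rank == last_rank:
            # replace() returns a new object: a copy, not a reference to the old entry.
            return replace(entry, rank=new_rank)
    raise ValueError(f"no entry with rank {last_rank} in the previous week")


# Placeholder data, not real chart positions.
last_week = [
    ChartEntry(1, "Song A", "Band X"),
    ChartEntry(2, "Song B", "Band Y"),
]

# "Song B" climbs from rank 2 to rank 1; only the two ranks have to be typed in.
this_week_top = carry_forward(last_week, last_rank=2, new_rank=1)
print(this_week_top)
```

Copying keeps every week self-contained, but it also means the same title and artist are stored once for every week a song stays in the charts, so the data grows faster than it would with references.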
After a short test phase with a small data set, my program was ready to use. I started to type in the chart rankings of 1979 (that was when I had started to track the rankings). The sheets of paper I used for this were a kind of “staging area” for my new program. After I had entered the data for the first few weeks, my Commodore displayed an error message I had never seen before: OUT OF MEMORY ERROR.
What happened? What I had not realized was that most of the 32 KB of memory was already occupied by the BASIC program I wrote – presumably hundreds of lines of “spaghetti code” with many GOTO commands. My nice concept of loading the whole history from cassette into memory was not very clever. The software project died before it ever went “into production”. Fortunately, that never happens in real projects nowadays – or does it?
Of course, this experience was quite disappointing for me. But at least I learned three important things:
- No matter how much memory your hardware has – it’s always too little.
- Software tests should always be done with a realistic amount of data, not only with small test sets.
- Software should contain the functions that are required, not everything that is technically possible.
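The second rule can be backed by a back-of-the-envelope capacity estimate before any data is loaded. The sketch below is one possible way to do this in Python; the function name `weeks_until_full` is made up, and the figures for free memory and bytes per entry are assumptions chosen only to show the arithmetic, not measurements from the original program. Only the 15 entries per week come from the story above.

```python
def weeks_until_full(free_bytes: int, entries_per_week: int, bytes_per_entry: int) -> int:
    """Return how many complete weeks of chart data fit into the free memory."""
    bytes_per_week = entries_per_week * bytes_per_entry
    if bytes_per_week <= 0:
        raise ValueError("entries_per_week and bytes_per_entry must be positive")
    return free_bytes // bytes_per_week


ENTRIES_PER_WEEK = 15    # top 15 songs per week, as in the story
FREE_BYTES = 8_000       # assumption: memory left after the program itself is loaded
BYTES_PER_ENTRY = 50     # assumption: rank plus copied title and artist strings

weeks = weeks_until_full(FREE_BYTES, ENTRIES_PER_WEEK, BYTES_PER_ENTRY)
print(f"Room for about {weeks} weeks of history")
```

With numbers like these, an estimate of a handful of weeks makes it obvious that several years of history will never fit, long before the first error message appears.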
Although technology has evolved since then, these three rules still apply – but they are disregarded in many projects. Perhaps I’m wrong, but could it be that my little software project was not the only one that failed because these rules were ignored?