At work I have a pretty big database that might or might not have a problem and the issue is, I have doubt in SQL to find out if I truly have something to worry about or if it was just a fluke.
In my database there are two tables. The first table, which we’ll call “chart” has a relationship with another much bigger table called “main”. The two tables have an odd interrelationship, in so far that many values from chart are ‘cached’ in main, but since both tables are allowed to vary apart from each other they can drift apart. So, for example, if you add something novel to chart, it isn’t completely reflected in main everywhere you’d expect it, because many of the cached values don’t get automatically updated from chart to main. There is a user utility that can do this task and it’s called “data sync” but I have elected to not use it and instead I have created a rather long SQL query that finds where there are mismatches between main and chart and uses the values in chart to fill in the cached empties in main. For years this has been a daily scheduled job and I’ve been living with the faith that my SQL code is correct and is doing it’s job.
Yesterday a coworker of mine complained to me that their report no longer showed what it should. We verified that there were indeed 47 records in main that should have been selected but only 1 was. I started to query the database and compare the values in chart and their places along the 47 “should have been” selected records and discovered to my deep chagrin that the values were not properly cached as I had faith that they would be. Of course I felt panic starting to gnaw away on the edge of my mind as I instantly had doubt that my SQL code was working properly.
The SQL code that “failed” looks like this:
select m.mainacctno, c.chart_acct
from main m inner join chart c
on m.mainrest=c.chart_code
where m.mainacctno!=c.chart_acct
Here the two columns should be equal, mainacctno and chart_acct for each record that is joined by mainrest=chart_code. The mainrest column is good as is the chart_code column. What I want to see is where, for each comparison, that mainacctno doesn’t equal whatever value is in chart_acct. So, for these 47 records there should be a value of ‘00000PSGI’ that exists in chart and not in main. My query should show these, but yesterday it did not.
Yesterday I had my back up against the wall and couldn’t really spend much time analyzing the problem so I forced the matter with a direct update command which put the proper value in each of the 47 records, essentially shoving ‘00000PSGI’ into each main record.
What bothers me is that when I use the != operator I expect it to behave as I have learned it to behave, to evaluate as true when operand A doesn’t equal operand B. Both columns, mainacctno and chart_acct are both of type char(30). I’ve tried using the other style of not-equal operator that is <> but that doesn’t display either.
I have 1759 records in chart and 1,414,844 records in main. My doubt can only be assuaged by manual comparison and I’m not looking forward to that task. I’ve even tried the “not like” operator to no avail. I wouldn’t be in this situation if I didn’t run afoul of those 47 records where a blank mainacctno apparently equaled a not-blank chart_acct.
What bothers me the most, I think, is that I now doubt the logical operators in SQL. With something so fundamental, everything built on that foundation is now subject to doubt. Does my database really have referential integrity? If logical comparison operators no longer behave in a logical fashion, is the stone that I think my database castle is made of sand?