Wednesday, March 7, 2012

Bug in CASE statement?

(SQL Server 2000, SP3a)
Hello all!
I'm trying to "seed" some random data in a table, and am having some trouble with the CASE
statement that I don't fully understand.
Consider the following:
select id,
[Rating] = case (abs(convert(int, convert(varbinary(4), newid())) % 5))
when 0 then NULL
when 1 then 'Poor'
when 2 then 'Fair'
when 3 then 'Good'
when 4 then 'Excellent'
else '?'
end
from sysobjects
I would expect this to return values in the range of 0-4, which would map to the various
strings that you see. However, I keep getting a high incidence of '?' strings returned.
When I use the exact same expression without the CASE statement:
select distinct Rating
from (
select Rating = abs(convert(int, convert(varbinary(4), newid())) % 5)
from sysobjects
) as q
I show that, indeed, the only values that are returned are 0-4.
This sure *seems* like a bug to me, but I suspect the problem lies in my understanding of
the CASE statement.
Any help would be much appreciated! :-)
John PetersonHi John
The only problem you are having with CASE is your terminology. CASE is an
expression, not a statement. But since you are using it as an expression,
within the SELECT statement, you do seem to understand it.
This is very strange behavior indeed. I think the problem is with NEWID().
You can establish that your understanding of the CASE expression by changing
NEWID() to something else, like id from the sysobjects table (and I am also
returning the value computed by the case expression for comparison
purposes):
select id,abs(convert(int, convert(varbinary(4), id)) % 5),
[Rating] = case (abs(convert(int, convert(varbinary(4), id)) % 5))
when 0 then NULL
when 1 then 'Poor'
when 2 then 'Fair'
when 3 then 'Good'
when 4 then 'Excellent'
else '?'
end
from sysobjects
The above seems to work as expected.
My guess is that NEWID() is being regenerated for EVERY comparison, so the
value is always changing. So your expression abs(convert(int,
convert(varbinary(4), id)) % 5) is evaluated and compared to 0, and if so,
NULL is returned, otherwise the expression is evaluted AGAIN (with a new
NEWID() ) and that is compared to 1, and if that one wasn't equal to one,
another NEWID() expression is compared to 2, etc. So it's hardly surprising
that few of the expressions will match the constant you're comparing too
(although a few will) and many will have no match and fall through to '?'
Why don't you tell us exactly what you're trying to do, and someone may have
another suggestion for you.
In the meantime, I'll try to find out what's happening with NEWID() here.
--
HTH
--
Kalen Delaney
SQL Server MVP
www.SolidQualityLearning.com
"John Peterson" <j0hnp@.comcast.net> wrote in message
news:eaQO0ZzkDHA.976@.tk2msftngp13.phx.gbl...
> (SQL Server 2000, SP3a)
> Hello all!
> I'm trying to "seed" some random data in a table, and am having some
trouble with the CASE
> statement that I don't fully understand.
> Consider the following:
>
> select id,
> [Rating] = case (abs(convert(int, convert(varbinary(4), newid())) %
5))
> when 0 then NULL
> when 1 then 'Poor'
> when 2 then 'Fair'
> when 3 then 'Good'
> when 4 then 'Excellent'
> else '?'
> end
> from sysobjects
>
> I would expect this to return values in the range of 0-4, which would map
to the various
> strings that you see. However, I keep getting a high incidence of '?'
strings returned.
>
> When I use the exact same expression without the CASE statement:
>
> select distinct Rating
> from (
> select Rating = abs(convert(int, convert(varbinary(4), newid())) %
5)
> from sysobjects
> ) as q
>
> I show that, indeed, the only values that are returned are 0-4.
> This sure *seems* like a bug to me, but I suspect the problem lies in my
understanding of
> the CASE statement.
> Any help would be much appreciated! :-)
> John Peterson
>|||Hello, Kalen! Please see inline:
"Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
news:ubXkerzkDHA.3312@.tk2msftngp13.phx.gbl...
> Hi John
> The only problem you are having with CASE is your terminology. CASE is an
> expression, not a statement. But since you are using it as an expression,
> within the SELECT statement, you do seem to understand it.
Thanks -- I always want to use the correct nomenclature. I bet if I would have looked in
BOL, I might have used the right term. :-)
> This is very strange behavior indeed. I think the problem is with NEWID().
> You can establish that your understanding of the CASE expression by changing
> NEWID() to something else, like id from the sysobjects table (and I am also
> returning the value computed by the case expression for comparison
> purposes):
> select id,abs(convert(int, convert(varbinary(4), id)) % 5),
> [Rating] = case (abs(convert(int, convert(varbinary(4), id)) % 5))
> when 0 then NULL
> when 1 then 'Poor'
> when 2 then 'Fair'
> when 3 then 'Good'
> when 4 then 'Excellent'
> else '?'
> end
> from sysobjects
> The above seems to work as expected.
Yeah, that's exactly what I wound up doing in my table, since the IDs were GUIDs, I didn't
necessarily need to use newid().
> My guess is that NEWID() is being regenerated for EVERY comparison, so the
> value is always changing. So your expression abs(convert(int,
> convert(varbinary(4), id)) % 5) is evaluated and compared to 0, and if so,
> NULL is returned, otherwise the expression is evaluted AGAIN (with a new
> NEWID() ) and that is compared to 1, and if that one wasn't equal to one,
> another NEWID() expression is compared to 2, etc. So it's hardly surprising
> that few of the expressions will match the constant you're comparing too
> (although a few will) and many will have no match and fall through to '?'
Oh, interesting! I guess that sort of makes sense if the "internals" of the execution are
always re-evaluating NEWID() instead of just having one persistent value.
> Why don't you tell us exactly what you're trying to do, and someone may have
> another suggestion for you.
> In the meantime, I'll try to find out what's happening with NEWID() here.
All I really wanted to do was generate some random data -- and I tried using the RAND()
function, but it would only evaluate RAND() *once* for every row in the result set (just
like GETDATE() does, I believe); as such, every value was the same. So, I switched to use
NEWID(), since that appears to generate a new value for each row in the result set.
At this point, I was able to use a variant of your suggestion (to use a static value
instead of NEWID()) to generate my random data. But, I thought it would be worth posting
this issue to try and find out more about it. :-)
> --
> HTH
> --
> Kalen Delaney
> SQL Server MVP
> www.SolidQualityLearning.com
>
> "John Peterson" <j0hnp@.comcast.net> wrote in message
> news:eaQO0ZzkDHA.976@.tk2msftngp13.phx.gbl...
> > (SQL Server 2000, SP3a)
> >
> > Hello all!
> >
> > I'm trying to "seed" some random data in a table, and am having some
> trouble with the CASE
> > statement that I don't fully understand.
> >
> > Consider the following:
> >
> >
> > select id,
> > [Rating] = case (abs(convert(int, convert(varbinary(4), newid())) %
> 5))
> > when 0 then NULL
> > when 1 then 'Poor'
> > when 2 then 'Fair'
> > when 3 then 'Good'
> > when 4 then 'Excellent'
> > else '?'
> > end
> > from sysobjects
> >
> >
> > I would expect this to return values in the range of 0-4, which would map
> to the various
> > strings that you see. However, I keep getting a high incidence of '?'
> strings returned.
> >
> >
> > When I use the exact same expression without the CASE statement:
> >
> >
> > select distinct Rating
> > from (
> > select Rating = abs(convert(int, convert(varbinary(4), newid())) %
> 5)
> > from sysobjects
> > ) as q
> >
> >
> > I show that, indeed, the only values that are returned are 0-4.
> >
> > This sure *seems* like a bug to me, but I suspect the problem lies in my
> understanding of
> > the CASE statement.
> >
> > Any help would be much appreciated! :-)
> >
> > John Peterson
> >
> >
>|||Yes, this is an useful distinction between RAND() and NEWID(). I knew
NEWID() regenerated for each row, but it sure seems strange that it would
reevaluate for every comparison within a CASE.
But it's probably good to know this!
--
HTH
--
Kalen Delaney
SQL Server MVP
www.SolidQualityLearning.com
"John Peterson" <j0hnp@.comcast.net> wrote in message
news:uBnY#xzkDHA.2312@.TK2MSFTNGP12.phx.gbl...
> Hello, Kalen! Please see inline:
> "Kalen Delaney" <replies@.public_newsgroups.com> wrote in message
> news:ubXkerzkDHA.3312@.tk2msftngp13.phx.gbl...
> > Hi John
> > The only problem you are having with CASE is your terminology. CASE is
an
> > expression, not a statement. But since you are using it as an
expression,
> > within the SELECT statement, you do seem to understand it.
> Thanks -- I always want to use the correct nomenclature. I bet if I would
have looked in
> BOL, I might have used the right term. :-)
>
> > This is very strange behavior indeed. I think the problem is with
NEWID().
> > You can establish that your understanding of the CASE expression by
changing
> > NEWID() to something else, like id from the sysobjects table (and I am
also
> > returning the value computed by the case expression for comparison
> > purposes):
> >
> > select id,abs(convert(int, convert(varbinary(4), id)) % 5),
> > [Rating] = case (abs(convert(int, convert(varbinary(4), id)) %
5))
> > when 0 then NULL
> > when 1 then 'Poor'
> > when 2 then 'Fair'
> > when 3 then 'Good'
> > when 4 then 'Excellent'
> > else '?'
> > end
> > from sysobjects
> >
> > The above seems to work as expected.
> Yeah, that's exactly what I wound up doing in my table, since the IDs were
GUIDs, I didn't
> necessarily need to use newid().
>
> > My guess is that NEWID() is being regenerated for EVERY comparison, so
the
> > value is always changing. So your expression abs(convert(int,
> > convert(varbinary(4), id)) % 5) is evaluated and compared to 0, and if
so,
> > NULL is returned, otherwise the expression is evaluted AGAIN (with a new
> > NEWID() ) and that is compared to 1, and if that one wasn't equal to
one,
> > another NEWID() expression is compared to 2, etc. So it's hardly
surprising
> > that few of the expressions will match the constant you're comparing too
> > (although a few will) and many will have no match and fall through to
'?'
> Oh, interesting! I guess that sort of makes sense if the "internals" of
the execution are
> always re-evaluating NEWID() instead of just having one persistent value.
>
> > Why don't you tell us exactly what you're trying to do, and someone may
have
> > another suggestion for you.
> > In the meantime, I'll try to find out what's happening with NEWID()
here.
> All I really wanted to do was generate some random data -- and I tried
using the RAND()
> function, but it would only evaluate RAND() *once* for every row in the
result set (just
> like GETDATE() does, I believe); as such, every value was the same. So, I
switched to use
> NEWID(), since that appears to generate a new value for each row in the
result set.
> At this point, I was able to use a variant of your suggestion (to use a
static value
> instead of NEWID()) to generate my random data. But, I thought it would
be worth posting
> this issue to try and find out more about it. :-)
>
> > --
> > HTH
> > --
> > Kalen Delaney
> > SQL Server MVP
> > www.SolidQualityLearning.com
> >
> >
> > "John Peterson" <j0hnp@.comcast.net> wrote in message
> > news:eaQO0ZzkDHA.976@.tk2msftngp13.phx.gbl...
> > > (SQL Server 2000, SP3a)
> > >
> > > Hello all!
> > >
> > > I'm trying to "seed" some random data in a table, and am having some
> > trouble with the CASE
> > > statement that I don't fully understand.
> > >
> > > Consider the following:
> > >
> > >
> > > select id,
> > > [Rating] = case (abs(convert(int, convert(varbinary(4),
newid())) %
> > 5))
> > > when 0 then NULL
> > > when 1 then 'Poor'
> > > when 2 then 'Fair'
> > > when 3 then 'Good'
> > > when 4 then 'Excellent'
> > > else '?'
> > > end
> > > from sysobjects
> > >
> > >
> > > I would expect this to return values in the range of 0-4, which would
map
> > to the various
> > > strings that you see. However, I keep getting a high incidence of '?'
> > strings returned.
> > >
> > >
> > > When I use the exact same expression without the CASE statement:
> > >
> > >
> > > select distinct Rating
> > > from (
> > > select Rating = abs(convert(int, convert(varbinary(4),
newid())) %
> > 5)
> > > from sysobjects
> > > ) as q
> > >
> > >
> > > I show that, indeed, the only values that are returned are 0-4.
> > >
> > > This sure *seems* like a bug to me, but I suspect the problem lies in
my
> > understanding of
> > > the CASE statement.
> > >
> > > Any help would be much appreciated! :-)
> > >
> > > John Peterson
> > >
> > >
> >
> >
>|||Hello
> select id,
> [Rating] = case (abs(convert(int, convert(varbinary(4), newid())) %
5))
> when 0 then NULL
> when 1 then 'Poor'
> when 2 then 'Fair'
> when 3 then 'Good'
> when 4 then 'Excellent'
> else '?'
> end
> from sysobjects
>
> I would expect this to return values in the range of 0-4, which would map
to the various
> strings that you see. However, I keep getting a high incidence of '?'
strings returned.
What version of SQL server are you using?
I've tested your query on my MSSQL 2000 EE + SP3 and it works fine.
Number of '?' is near the number of other variants.
Serge Shakhov|||That's just it -- there should be *no* '?' results. (I'm using SQL 2000 EE SP3a.)
"Serge Shakhov" <REMOVETHIS_ACETYLENE@.mail.ru> wrote in message
news:q0ilmb.039.ln@.proxyserver.ctd.mmk.chel.su...
> Hello
> > select id,
> > [Rating] = case (abs(convert(int, convert(varbinary(4), newid())) %
> 5))
> > when 0 then NULL
> > when 1 then 'Poor'
> > when 2 then 'Fair'
> > when 3 then 'Good'
> > when 4 then 'Excellent'
> > else '?'
> > end
> > from sysobjects
> >
> >
> > I would expect this to return values in the range of 0-4, which would map
> to the various
> > strings that you see. However, I keep getting a high incidence of '?'
> strings returned.
> What version of SQL server are you using?
> I've tested your query on my MSSQL 2000 EE + SP3 and it works fine.
> Number of '?' is near the number of other variants.
>
> Serge Shakhov
>

No comments:

Post a Comment