[personal profile] davidschroth
Behind the cut one will find what I've spent a few hours writing today.

I'll note that it should be incomprehensible to most computer geeks, let alone normal people:



Headline:
Mariner dumps missing information
Description:
SYMPTOMS:
The DUMPLIBS function UWAITSWLS falls over ungracefully when run against
Dorado-400 dumps after Exec CHG 80465 has been applied.
INTERNAL-CONDITIONS:
Unknown.
TECHNICAL-EXPLANATION:
When creating the PCT/BDT, the thread calling CREBDT will acquire a chunk
of virtual space large enough to hold both a maximum size PCT and a
maximum size BDT. This turns out to be 01052000 words. MSMBANKFIT will
turn this requested allocation into an allocation of 01060000 words
(allocations are made on page boundaries, and in multiples of the page
size). Before the integration of CHG 80465, the code that set up the
actual amount of space used would overstate the amount of virtual space
used (01052000 words, rather than the 01000100 or 01010000 words actually
used).
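The rounding described above can be checked with a few lines of arithmetic. This is only a sketch: the page size of 010000 (4096) words is an assumption inferred from the two figures quoted, not something the entry states, and `round_up_to_page` is a hypothetical helper rather than the MSMBANKFIT code.

```python
# Assumption: pages are 0o10000 (4096) words. The entry does not state the
# page size; this value is inferred from 01052000 rounding up to 01060000.
PAGE_WORDS = 0o10000

def round_up_to_page(words, page=PAGE_WORDS):
    """Round a requested word count up to a whole number of pages."""
    return -(-words // page) * page  # ceiling division, then scale back up

requested = 0o1052000                  # max-size PCT + max-size BDT
allocated = round_up_to_page(requested)
slack = allocated - requested          # words allocated but never requested

print(oct(allocated))  # 0o1060000
print(oct(slack))      # 0o6000
```

The 06000-word slack is the quantity that matters for the rest of the explanation.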
None of this had any effect on dump taking on pre-ONVERSION3E
systems - dumping was done on a real page basis, so the accuracy
(or lack thereof) of the accounting of the virtual space actually used
had no impact.
For ONVERSION3E systems, however, dumping is driven off of the data
structure that describes virtual memory allocations. Specifically,
ONVERSION3E dumps used a combination of BCLL (lower limit) and BCCR (actual
amount of virtual space being used) to determine what to dump. This use
of BCLL and BCCR for determining what to dump means that ONVERSION3E
dumping has been broken since day one. The PCT and BDT have always been
located within the allocated virtual space by working down from the
last allocated virtual address + 1. Subtracting the maximum size of the
BDT from that value gives the base address of the BDT. Subtracting the
highest PCT address + 1 from the BDT base address gives the base
address of the PCT. The ONVERSION3E dump code always started at the
first allocated
virtual address, and dumped BCCR blocks. Because MSMBANKFIT allocated
more virtual address space than was requested, the ONVERSION3E dumping
code started dumping BCCR blocks at an address before the start of the
PCT, and ended up not dumping some bits of the BDT.
This problem was masked (in most cases) before the integration of CHG
80465, as we used to drastically overstate the amount of virtual space
that was being used.
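A small model makes the masking visible. It is a sketch under stated assumptions, not the Exec's logic: all quantities are treated as words, the first allocated address is taken as 0, the live PCT/BDT data is treated as one contiguous run starting at the PCT base, and every function name is hypothetical.

```python
FIRST_ADDR = 0o0        # hypothetical first allocated virtual address
REQUESTED = 0o1052000   # max-size PCT + max-size BDT
ALLOCATED = 0o1060000   # after rounding up to whole pages
USED = 0o1010000        # one of the "actually used" figures quoted above

def structure_region(first_addr, allocated, requested, used):
    """Address range holding live PCT/BDT data, placed from the top down."""
    end_plus_1 = first_addr + allocated
    # BDT base = end_plus_1 - max BDT; PCT base = BDT base - max PCT.
    # The two maxima sum to `requested`, so:
    pct_base = end_plus_1 - requested
    # Assumption: the used words are contiguous from the PCT base.
    return pct_base, pct_base + used

def dump_window(first_addr, bccr):
    """Range the dump code covered: BCCR words from the first address."""
    return first_addr, first_addr + bccr

def words_missed(region, window):
    """Words of `region` that fall outside `window`."""
    covered = max(0, min(region[1], window[1]) - max(region[0], window[0]))
    return (region[1] - region[0]) - covered

region = structure_region(FIRST_ADDR, ALLOCATED, REQUESTED, USED)

# Before CHG 80465: BCCR overstated as the full request, so the gap is hidden.
print(oct(words_missed(region, dump_window(FIRST_ADDR, REQUESTED))))  # 0o0

# After CHG 80465: BCCR is accurate, and the top of the data is lost.
print(oct(words_missed(region, dump_window(FIRST_ADDR, USED))))       # 0o6000
```

In this model the number of words lost after CHG 80465 equals the slack between the allocated and requested sizes, because the window starts that many words too early.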
DEPENDENCIES:
This problem affects ONVERSION3E systems.
RESOLUTION-COMMENTS:
We'll resolve this problem by setting up BCLL for the PCT/BDT container
to the difference between the size allocated and the size requested.
This should cause ONVERSION3E dumps to start dumping at the correct
location.
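As a rough check of the resolution, the sketch below computes BCLL as allocated minus requested and confirms that a dump starting there lines up with the PCT base. It assumes the dump begins at the first allocated address plus BCLL and covers BCCR words. That reading of BCLL, the first address of 0, and the function names are assumptions made for illustration.

```python
FIRST_ADDR = 0o0        # hypothetical first allocated virtual address
REQUESTED = 0o1052000   # max-size PCT + max-size BDT
ALLOCATED = 0o1060000   # after rounding up to whole pages
USED = 0o1010000        # accurate BCCR after CHG 80465 (one quoted figure)

def bcll_for_container(allocated, requested):
    """Proposed BCLL: the slack between what was allocated and requested."""
    return allocated - requested

def dump_window(first_addr, bcll, bccr):
    """Assumed dump range: BCCR words starting BCLL words into the space."""
    start = first_addr + bcll
    return start, start + bccr

bcll = bcll_for_container(ALLOCATED, REQUESTED)
pct_base = FIRST_ADDR + ALLOCATED - REQUESTED   # top-down placement of the PCT
window = dump_window(FIRST_ADDR, bcll, USED)

print(oct(bcll))              # 0o6000
print(window[0] == pct_base)  # True
```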
In addition, we'll correct some minor problems that have crept into the
systems over the years:
1) CHG 80465 introduced a minor bug that could result in the Lower
Limit of the PCT being set to an incorrect value (note that the
Lower Limit of the PCT is *not* the same thing as BCLL, and really
isn't related to BCLL).
2) The error in calculating the Lower Limit of the PCT happened because
the code that causes the problem was hidden in a proc, and the author
missed it. There doesn't actually appear to be any reason
to have this code in a proc anymore, and there are similar procs
defined in MSMBDTHDL that are no longer used. This CHG removes
the proc definitions, and replaces the proc calls with the appropriate
code sequences.
3) CHG 80465 introduced fencepost errors in the code that checks the
number of BDs requested as input. This CHG corrects those
fencepost errors.
4) CHG 80465 modified the way BCCR was calculated for run/program level
PCT/BDTs, but neglected to do so for the application level PCT/BDT.
This has no practical effect, but code in MSMBDTHDL and INDA is
modified so that these calculations look the same for both application
level and run/program level.
5) The INITBDTP proc is cleaned up, and modified to generate a slightly
more efficient code sequence.
6) ONVERSION3E code sequences in COR$VT, DAMCOR, DASBI, IFSDBK, INDA,
MSMBDTHDL, MSMEX, MSMQBHANDLER, MSMWSM, ROUTNG, and TIPHVMGR that
manage the count of allocated pages are deleted. Given the current
state of the memory throttling code, this code is no longer needed.
7) The Mariner adapt added code to set the DLT_FREEDOM bit during the
process of taking a dump, but never put in code to clear the bit.
This CHG adds code to clear the bit when a domain is allocated.
8) Previous changes that modified the way BDTs were allocated and
initialized neglected to modify the code in TIPLOD that allocates
and initializes BDTs. Consequently, TIPLOD still forces FASTBNKBCPBD
into unnatural contortions to get it to allocate and initialize
BDTs. This CHG restructures TIPLOD to look like the other places
in the Exec that acquire and initialize BDTs.


And my co-workers wonder why my 10-line fixes end up as 1000-line fixes...

(This post left public in case my former co-worker swings by to take a look. Lewie so far steadfastly refuses to get even a free LJ account. BTW, Lewie, the package arrived today - thanks).

Date: 2007-12-14 09:56 pm (UTC)
From: [identity profile] dd-b.livejournal.com
That's what happens when you turn over rocks!

Do you ever find yourself wondering if the functionality is important, given how long it's been not working? Or at least wondering how severe problems escaped being reported for so long? I certainly do.

Date: 2007-12-14 10:39 pm (UTC)
From: [identity profile] davidschroth.livejournal.com
Do you ever find yourself wondering if the functionality is important, given how long it's been not working? Or at least wondering how severe problems escaped being reported for so long?

Yes, I certainly do. Especially since our documentation is usually good enough for us to figure out when the bug was inserted, and by whom.

In this case, though, I can easily figure out why the problem went unrecognized - the problem would only manifest under extreme resource utilization. I fix problems when I find them because I know that if I don't, some customer will find them, and expect them to get fixed yesterday.

There are after-the-fact benefits to experience...

Date: 2007-12-15 12:20 am (UTC)
From: [identity profile] s6b.livejournal.com
It is with great consternation that I admit I can actually understand much of what you wrote (or at least I think I do) despite not having worked on a 2200 for more than 10 years now.
Obviously, this is further evidence of an unhealthy fixation with 2200's and the absence of anything resembling a real life.
Nevertheless, my compliments to you on writing what seems to be a relatively lucid PLE (that's "Problem List Entry" for those not familiar with Unisys-speak).

With regard to your comment about your co-workers wondering why your 10-line fixes ended up as 1000-line fixes, as one of your former co-workers, let me say that not all of your co-workers ever wondered this and those who did usually didn't do so for long.
At some point, folks tend to be more concerned with the alligator population in their own part of the swamp than in others.
So long as your fixes don't increase that population in their part (which they usually don't), after a while they just shrug at the size of your fixes and say, "That's just David."
