Strongly architect handles so we can more easily detect bogus
handles. This switches us to a full 32 bits for all handles.
Handle the case of FC disks disappearing and then reappearing-
at least at the FC transport level.
Some better and finer control of debug and non-debug printouts.
and use those cleaned up definitions.
Use 2100 style firmware loading if the load address and
load size is less than 64k. Some apparently buggy ROMs
out there choke otherwise.
Clean up some WWNN derivations from WWPN.
Reintroduce more of a 'channel' concept in preparation for NP-IV support.
This gets rid of the chanA/chanB concept as the 2400 can have up to 128
virtual channels. Actually, with MID firmware you can also have the 2200
and 2300 support 'channels, but they do it with an FL-Port topology.
Because FC cards can now have 'channels', just about every support
function for fibre channel had to be redone to have a channel index
as well. Rototill isp_ioctl.h for channel stuff as well.
Pick up a lot of work about fabric management (hopefully better) and keep
work in place that will allow for dynamic attachment/detachment of devices
(if I can figure out how to make the midlayer support it).
Merge the target code with external trees. Eventually it might even
be sorted out on NetBSD.
Update some firmware stuff.
status correctly (which we never were doing before). Add an underrun
checker for 24XX. The process of sorting this out led to a whole bunch
of endian surprises that had to be dealt with. Fix NVRAM endian issues
for the 24XX as well.
Do a little 2200 related cleanup- in particular, turn off complaints about
not finding a fast posting handle when running with RIO enabled- we are
somehow getting duplicate completions in this case. If we ignore them and
don't complain, all is well, and we actually start averaging > 2 commands
completed per interrupt.
The major changes are:
+ 4Gb (24XX) card support
+ Rewritten fabric and loop evaluation code
+ New f/w sets
The 4Gb changes required major rototilling, which caused a rewrite of
fabric and loop eval code. The latter can now be set up to tune for
dynamic device arrival/departure if the framework is set up for it,
or to be firm about waiting for devices.
Testing has been principally on amd64, i386 and sparc64 and seems to
not have broken things for me.
Start doing the work necessary to support DAC (Dual Address Cycle)
environments. This allows for direct DMA to > 4GB memory from a PCI
card.
Lose STRNCAT over the side and use SNPRINTF instead.
load f/w images > 0x7fff words), set ISP_FW_ATTR_SCCLUN. We explicitly
don't believe we can find attributes if f/w is < 1.17.0, so we have to
set SCCLUN for the 1.15.37 f/w we're using manually- otherwise every
target will replicate itself across all 16 supported luns for non-SCCLUN
f/w.
Do a fallback on reading stuff from the fabric. Some devices/initiators
don't correctly register their type with the fabric nameserver. This
seems to be due to a misinterpretation of what TYPE should mean for a
CT_HDR. In any case, do a fallback to try and catch these misentered
entities.
Add some stuff for default framesize and throttle. Fix some buglets.
to check if we get a RQCS_DATA_UNDERRUN - if we're an FC card, we may
not have RQCS_RU set- if it isn't set, we just lost a DATA XFR IU in the
middle of the exchange. In this case, we have to bomb out the whole xfer.
We had been getting silent data corruption before.
3 bits are lun address modifiers.
Remove code that (incorrectly) thought it was asking the f/w to only
PLOGI if not already PLOGI'd. The current f/w documentation tells us
that we have this backwards.
are used- didn't make a difference, but hey...
Put in commented out GFF_ID code- for use in future attempts to search
the fabric- this probably has to go thru the management server path.
Don't whine about handles we can't find if these are aborted commands
(we know we can't find the handles because we destroy handles after
a successful mailbox abort- we don't wait for the F/W to decide whether
it wants to return a status IOCB after this happens).
to be trying to wriggle out of supporting this well. Instead, use
GID_FT to get a list of Port IDs and then use GPN_ID/GNN_ID to find the
port and node wwn. This should make working on fabrics a bit cleaner and
more stable.
This also caused some cleanup of SNS subcommand canonicalization so that
we can actually check for FS_ACC and FS_RJT, and if we get an FS_RJT,
print out the reason and explanation codes.
We'll keep the old GA_NXT method around if people want to uncomment a
controlling definition in ispvar.h.
This also had us clean up ISPASYNC_FABRICDEV to use a local lportdb argument
and to have the caller explicitly say that a device is at the end of the
fabric list.
Unconst pointer to f/w in the ispdv structure. Too many compilers get
unhappy over our walking the array. Make casts as appropriate so that
initialization in structure is still happy.
Limit length of fabric to 256. This will all go away soon.
Do a cleaner case of keeping multiple CPUs/threads from reading the
same response queue entries.
it worked- but I ran into a case with a 2204 where commands were being lost
right and left. Best be safe.
For target mode, or things called if we call isp_handle_other response- note
that we might have dropped locks by changing the output pointer so we bail
from the loop. It's the responsibility of the entity dropping the lock to
make sure that we let the f/w know we've read thus far into the response
queue (else we begin processing the same entries again- blech!).
Distinguish between 2312 and 2300 cards (they *are* different). Enable
RIO (Reduced Interrupt Operation) for the LVD cards (hey- I've seen
batched completions of the 30 commands at a time with this,....)...
If we get a Port Logout on local loop topologies, we have to force the
f/w to log back in. The easiest way (for us) to do this is to force
a LIP. This also will wake up the disk that probably just had a f/w crash.
Implement mailbox 'continuations'- this allows interrupts to re-drive
a mailbox command if it's one that just essentially repeats the previous
mailbox command (e.g., f/w download). This saves a boatload of sleep/wakeup
twitches.
If we're not a 2300 and we're about to return with a 'bogus interrupt'- check
the semaphore register to be non-zero at all and outgoing mailbox 0- this
seems to be where some of the lost ISP1080 commands came from.
firmware to delay completion of commands so that it can attempt to batch
a bunch of completions at once- either returning 16 bit handles in mailbox
registers, or in a resposne queue entry that has a whole wad of 16 bit handles.
Distinguish between 2300 and 2312 chipsets- if only because the revisions
on the chips have different meanings.
Add more instrumentation plus ISP_GET_STATS and ISP_CLR_STATS ioctls.
Run up the maximum number of response queue entities we'll look at
per interrupt.
If we haven't set HBA role yet, always return success from isp_fc_runstate.
the response queue. Instead of the ad hoc ISP_SWIZZLE_REQUEST, we now have
a complete set of inline functions in isp_inline.h. Each platform is
responsible for providing just one of a set of ISP_IOX_{GET,PUT}{8,16,32}
macros.
The reason this needs to be done is that we need to have a single set of
functions that will work correctly on multiple architectures for both little
and big endian machines. It also needs to work correctly in the case that
we have the request or response queues in memory that has to be treated
specially (e.g., have ddi_dma_sync called on it for Solaris after we update
it or before we read from it).
One thing that falls out of this is that we no longer build requests in the
request queue itself. Instead, we build the request locally (e.g., on the
stack) and then as part of the swizzling operation, copy it to the request
queue entry we've allocated. I thought long and hard about whether this was
too expensive a change to make as it in a lot of cases requires an extra
copy. On balance, the flexbility is worth it. With any luck, the entry that
we build locally stays in a processor writeback cache (after all, it's only
64 bytes) so that the cost of actually flushing it to the memory area that is
the shared queue with the PCI device is not all that expensive. We may examine
this again and try to get clever in the future to try and avoid copies.
Another change that falls out of this is that MEMORYBARRIER should be taken
a lot more seriously. The macro ISP_ADD_REQUEST does a MEMORYBARRIER on the
entry being added. But there had been many other places this had been missing.
It's now very important that it be done.
For NetBSD, it does a ddi_dmamap_sync as appropriate. This gets us out of
the explicit ddi_dmamap_sync on the whole response queue that we did for SBus
cards at each interrupt.
Set things up so that platforms that cannot have an SBus don't get a lot of
the SBus code checks (dead coded out).
Additional changes:
Fix a longstanding buglet of sorts. When we get an entry via isp_getrqentry,
the iptr value that gets returned is the value we intend to eventually plug
into the ISP registers as the entry *one past* the last one we've written-
*not* the current entry we're updating. All along we've been calling sync
functions on the wrong index value. Argh. The 'fix' here is to rename all
'iptr' variables as 'nxti' to remember that this is the 'next' pointer-
not the current pointer.
Devote a single bit to mboxbsy- and set aside bits for output mbox registers
that we need to pick up- we can have at least one command which does not
have any defined output registers (MBOX_EXECUTE_FIRMWARE).
Explicitly decode GetAllNext SNS Response back *as* a GetAllNext response.
Otherwise, we won't unswizzle it correctly.
Nuke some additional __P macros.
If we get a completion status of RQCS_QUEUE_FULL, it means
that the internal queues are full. Other QLogic boards set
the QFULL SCSI status. But *nooooooooooo*, not the 2300.
attributes of some variants of FC f/w (SCCLUN or not). Fake f/w
rev for SBus cards- the f/w versions we're using don't return
version in outgoing mailbox registers like they should.
in how interrupts are down- the 23XX has not only a different place to check
for an interrupt, but unlike all other QLogic cards, you have to read the
status as a 32 bit word- not 16 bit words. Rather than have device specific
functions as called from the core module (in isp_intr), it makes more sense
to have the platform/bus modules do the gruntwork of splitting out the
isr, semaphore register and the first outgoing mailbox register (if needed)
*prior* to calling isp_intr (if calling isp_intr is necessary at all).
Fix longstanding bug where we should have been checking
against Channel B's settings to see whether to apply tag
usage. Oops.
Some more 2300 support- macroize access to request queue in/out pointers.
Firmware crashes now handled in platform outer code via an isp_async call.
If we get a LIP, and we're on a private or public loop, kill off all
active commands as if they had been killed by a 'SCSI Bus Reset'. I've
seen data corruption on commands that complete 'normally' after a LIP.
Bad.
way of doing business- modulo some startup spasms and peculiarities
of the way kthreads are started (*after* configuration, weird) and
some strangeness with the freeze/thaw code, what now happens is
that any of Loop Down, LIP, Loop Reset or Port Datbase or Name
Server Database Changed ASYNC events cause the queues to freeze
for this channel. The arrival of a Loop UP is not relevant.
What *is* relevant is that the Port Datbase or Name Server Changed
async event indicate that it's okay to go and (re)evaluate the
state of the FC link and (re)probe local loop and fabric membership.
We have a kthread do this because it's *sooooo* much nicer to be
able to sleep while doing the 130-250 mailbox commands it'll take
to re-evaluate things.
When the state is well known again, we can unfreeze the channel
queues. Then, as commands start arriving, we simply can start them
or bounce them with XS_SELTIMEOUT (if the device in question has
gone away). Previously, we did lazy evaluation, which meant that
if a change occurred, we would wait until the very *next* command
to go rebuild stuff.
The reason this is not sensible is:
a) Even with sleeping, you can hang up your system because you might be
making some poor stat(2) call pay the price of re-evaluating the whole
fabric.
b) If we ever really want to get to dynamic attachment/detachment, we
should find out sooner, rather than later, where things get to.
Split off ispminphys_1020 from ispminphys- a 1020 has a 24 bit limit-
not anything newer.
Re-enable LIPs and Loop Resets as async events- this allows the outer
layer to set policy about them.
Roll platform major && minor. Remove bogus waitq (no longer used).
Remove callout entry in softc (no longer used). Define some shorthands
for channels. Clean up a variety of cruft left over from the
thorpej_scsipi changeover.