836 Commits

Author SHA1 Message Date
andvar 82bba4e936 fix various typos in comments. 2024-02-05 21:46:04 +00:00
mrg b0f7b0b0b7 don't test arrays against NULL.
found by GCC 12.
2023-08-01 07:58:41 +00:00
riastradh 42f4e13451 ata(4): Add ATA_DOWNGRADE_MODE to opt_ata.h.
This way adding it to kernel config will trigger recompilation.
2023-07-17 21:12:19 +00:00
mlelstv 554e7eba6a Sanitize capacity values. 2023-01-24 08:34:18 +00:00
andvar c0033c6ef6 s/retrys/retries/ in comments. 2022-07-05 19:21:26 +00:00
andvar ff23aff6ad fix various typos in comments, documentation and messages. 2022-05-31 08:43:13 +00:00
andvar 114b022676 fix various typos in comments. 2022-05-28 22:16:43 +00:00
riastradh 2d01044645 wd(4): Use d_cfdriver/devtounit to avoid open/detach races. 2022-03-28 12:39:37 +00:00
hannken ce218897d7 Lock vnode across VOP_OPEN. 2022-03-19 13:50:02 +00:00
perseant 82a3f34c53 Avoid an unaccounted extra channel freeze, if a reset is requested
more than once before the thread services the request.  Closes PR#56745.
2022-03-14 22:15:51 +00:00
andvar 1daa1a7b85 fix various typos in comments, mainly s/immediatly/immediately/,
as well as shared and recently fixed typos in OpenBSD code by Jonathan Grey.
2022-02-23 21:54:40 +00:00
andvar 7f4592413f fix various typos, mainly in comments. 2022-02-16 22:00:55 +00:00
riastradh 29f68bc1c1 wd(4): Fix bugs in softbadsect handling.
- Don't copyout kernel virtual addresses (of SLIST entries) that
  userland won't use anyway.
  => The structure still has space for this pointer; it's just always
     null when userland gets it now.

- Don't copyout under a lock.

- Stop and return error if copyout fails (unless we've already copied
  some out).

- Don't kmem_free under a lock.

XXX Unclear whether anyone actually uses WD_SOFTBADSECT or why --
it's always been disabled by default.  Maybe we should just remove
it?
2021-12-28 13:27:32 +00:00
skrll 06944c3496 Trailing whitespace 2021-11-12 06:51:04 +00:00
rin 94e57d87a7 PR kern/56403
Fix kernel freeze for wdc(4) variants with ATAC_CAP_NOIRQ:

(1) Change ata_xfer_ops:c_poll from void to int function. When it returns
    ATAPOLL_AGAIN, let ata_xfer_start() iterate itself again.

(2) Let wdc_ata_bio_poll() return ATAPOLL_AGAIN until ATA_ITSDONE is
    achieved.

A similar change has been made for mvsata(4) (see mvsata_bio_poll()),
and no functional changes for other devices.

This is how the drivers worked before jdolecek-ncq branch was merged.

Note that these changes are less likely to cause infinite recursion:

(1) wdc_ata_bio_intr() called from wdc_ata_bio_poll() asserts ATA_ITSDONE
    in its error handling paths via wdc_ata_bio_done().

(2) Return value from c_start (= wdc_ata_bio_start()) is checked in
    ata_xfer_start().

Therefore, errors encountered in ata_xfer_ops:c_poll and c_start routines
terminate the recursion for wdc(4). The situation is similar for mvsata(4).

Still, there is a possibility where ata_xfer_start() takes a long time to
finish a normal operation. This can result in a delayed response for lower
priority interrupts. But, I've never observed such a situation, even when
heavy thrashing takes place for swap partition in wd(4).

"Go ahead" by jdolecek@.
2021-10-05 08:01:05 +00:00
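
The control flow described in the commit above can be pictured with a small userland sketch. Everything here is a hypothetical stand-in: `sketch_xfer`, `sketch_bio_poll()`, `sketch_xfer_start()` and the `POLL_*` constants are illustration names, not the kernel's `ata_xfer_ops`, `wdc_ata_bio_poll()`, `ata_xfer_start()` or `ATAPOLL_AGAIN`. It only shows the idea of a poll hook that returns "again" until the transfer is done, with the start routine iterating rather than recursing.

```c
#include <assert.h>

/* Hypothetical result codes; the real ATAPOLL_* values are not shown in the log. */
enum sketch_poll_result { POLL_DONE = 0, POLL_AGAIN = 1 };

/* Hypothetical transfer: counts how many polls the device still needs. */
struct sketch_xfer {
	int remaining_polls;	/* polls left before the device is finished */
	int done;		/* set once the transfer has completed */
	int (*c_poll)(struct sketch_xfer *);
};

/* Poll hook: report POLL_AGAIN until the transfer is marked done. */
int
sketch_bio_poll(struct sketch_xfer *x)
{
	if (x->remaining_polls > 0)
		x->remaining_polls--;
	if (x->remaining_polls == 0) {
		x->done = 1;
		return POLL_DONE;
	}
	return POLL_AGAIN;
}

/* Start routine: loop on the hook's return value instead of recursing. */
int
sketch_xfer_start(struct sketch_xfer *x)
{
	int iterations = 0;
	int result;

	assert(x->c_poll != 0);
	do {
		result = x->c_poll(x);
		iterations++;
	} while (result == POLL_AGAIN);

	return iterations;	/* number of poll calls that were needed */
}
```

In this sketch the loop ends because the poll hook always makes progress toward `done`; the commit message makes the corresponding argument for the real drivers through their error paths.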
rin c2691c28ae Output missing '\n' for capability list when DMA support is not compiled in. 2021-08-29 23:49:32 +00:00
thorpej c7fb772b85 Merge thorpej-cfargs2. 2021-08-07 16:18:40 +00:00
thorpej 2685996b0e Merge thorpej-cfargs branch:
Simplify and make extensible the config_search() / config_found() /
config_attach() interfaces: rather than having different variants for
which arguments you want to pass along, just have a single call that
takes a variadic list of tag-value arguments.

Adjust all call sites:
- Simplify wherever possible; don't pass along arguments that aren't
  actually needed.
- Don't be explicit about what interface attribute is attaching if
  the device only has one.  (More simplification.)
- Add a config_probe() function to be used in indirect configuration
  situations, making it visibly easier to see when indirect config is
  in play, and allowing for future change in semantics.  (As of now,
  this is just a wrapper around config_match(), but that is an
  implementation detail.)

Remove unnecessary or redundant interface attributes where they're not
needed.

There are currently 5 "cfargs" defined:
- CFARG_SUBMATCH (submatch function for direct config)
- CFARG_SEARCH (search function for indirect config)
- CFARG_IATTR (interface attribute)
- CFARG_LOCATORS (locators array)
- CFARG_DEVHANDLE (devhandle_t - wraps OFW, ACPI, etc. handles)

...and a sentinel value CFARG_EOL.

Add some extra sanity checking to ensure that interface attributes
aren't ambiguous.

Use CFARG_DEVHANDLE in MI FDT, OFW, and ACPI code, and macppc and shark
ports to associate those device handles with device_t instance.  This
will trickle through to more places over time (need back-end for pre-OFW
Sun OBP; any others?).
2021-04-24 23:36:23 +00:00
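
The "variadic list of tag-value arguments" pattern from the commit above can be sketched in plain C with `<stdarg.h>`. The names below (`sketch_tag`, `sketch_args`, `sketch_parse()`) are hypothetical and only two tags are modelled; this is not the signature of `config_found()` or the definition of the `CFARG_*` constants, just a minimal illustration of a tag-value list ended by a sentinel.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical tags; TAG_EOL plays the role of the sentinel. */
enum sketch_tag { TAG_EOL = 0, TAG_IATTR = 1, TAG_LOCATORS = 2 };

/* Collected arguments; anything the caller did not pass stays NULL. */
struct sketch_args {
	const char *iattr;	/* interface attribute name */
	const int *locators;	/* locators array */
	int bad_tag;		/* nonzero if an unknown tag was seen */
};

/*
 * Walk (tag, value) pairs until TAG_EOL.  Each tag decides the type of
 * the value that follows it, so callers pass only what they need.
 */
struct sketch_args
sketch_parse(int first_tag, ...)
{
	struct sketch_args out = { NULL, NULL, 0 };
	va_list ap;
	int tag;

	va_start(ap, first_tag);
	for (tag = first_tag; tag != TAG_EOL; tag = va_arg(ap, int)) {
		if (tag == TAG_IATTR) {
			out.iattr = va_arg(ap, const char *);
		} else if (tag == TAG_LOCATORS) {
			out.locators = va_arg(ap, const int *);
		} else {
			/* Unknown tag: the value type is unknown, so stop. */
			out.bad_tag = 1;
			break;
		}
	}
	va_end(ap);
	return out;
}
```

A call such as `sketch_parse(TAG_IATTR, name, TAG_EOL)` with `const char *name = "ata"` fills in only `iattr`. The cost of this design is that the compiler cannot type-check the values, which is one reason a sentinel and strict tag handling matter.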
jmcneill e7c133d3b6 Add G3 and DevSleep definitions. This changes the mask used by
SControl_IPM_NONE from 0x3 to 0x7.
2020-12-27 15:15:45 +00:00
skrll c5d8f61215 Use designated initializers for struct ata_bustype 2020-12-25 08:55:40 +00:00
skrll a6974ca42d Add missing '\n' in debug 2020-12-23 08:17:01 +00:00
jmcneill 7d76b69f68 ata_timeout: restore spl in ATACH_RECOVERING path 2020-12-19 18:09:44 +00:00
riastradh 35a4cb626d autoconf: Blame devices holding up boot with config_pending.
Blame message requires `boot -x' (AB_DEBUG).

Fix ata so it doesn't mismatch config_pending_incr/decr devices.
2020-10-03 22:32:50 +00:00
jakllsch fd597c5490 fix typo that prevented bytes/physsect reporting from working 2020-09-28 12:47:49 +00:00
christos 09cc5f64e5 de-quadruplicate, remove unused argument 2020-09-27 16:58:11 +00:00
skrll f6e3e63266 KNF 2020-08-25 13:42:09 +00:00
jdolecek 985557c6a8 disable downgrade of ATA mode from DMA, as generally not relevant
any more - while it has been instrumental to inadvertently discover
driver bugs in PIO mode under QEMU recently, generally the switch
hurts more than it helps, so now only warn when DMA errors happen

code kept under ATA_DOWNGRADE_MODE ifdef, disabled by default
2020-05-25 19:05:30 +00:00
jdolecek 4a29b12536 make ata_downgrade_mode() static, it's not used anywhere else 2020-05-25 18:29:25 +00:00
jdolecek c39076d0d1 account for already transferred data (partially done I/O) when
retrying an xfer, to avoid reading/writing data from/to wrong offset,
and eventually beyond the end of data buffer

fixes data corruption under QEMU observed by Paul Ripke for emulated
IDE drives
2020-05-24 22:12:29 +00:00
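
The arithmetic behind "account for already transferred data" in the commit above might look roughly like the following. This is a sketch under assumptions: `sketch_bio`, `sketch_window` and `sketch_retry_window()` are hypothetical names, and the field layout is invented for illustration rather than taken from the kernel's `ata_bio` structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a partially completed request. */
struct sketch_bio {
	uint64_t blkno;		/* first sector of the original request */
	size_t bcount;		/* total bytes requested */
	size_t done;		/* bytes already transferred */
	size_t secsize;		/* bytes per sector */
	unsigned char *buf;	/* start of the data buffer */
};

/* What a retry should cover: only the part not yet transferred. */
struct sketch_window {
	uint64_t blkno;
	unsigned char *buf;
	size_t bcount;
};

struct sketch_window
sketch_retry_window(const struct sketch_bio *b)
{
	struct sketch_window w;

	assert(b->secsize != 0);
	assert(b->done <= b->bcount);
	assert(b->done % b->secsize == 0);	/* whole sectors only */

	/* Skip the sectors and the buffer bytes that are already done. */
	w.blkno = b->blkno + b->done / b->secsize;
	w.buf = b->buf + b->done;
	w.bcount = b->bcount - b->done;
	return w;
}
```

Retrying from the original `blkno` and `buf` with the remaining count would touch the wrong sectors, and retrying with the original count from the advanced position would run past the end of the buffer; advancing all three together avoids both.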
jdolecek 59f493e144 stop timeout handler while scheduling another part of partial I/O,
to avoid race between the timeout and I/O submission; the I/O
submission can sleep with xfer while waiting for the controller to
be ready once it gets to thread context, and timeout might cause
the xfer to be freed, leading to crashes due to use-after-free

this fixes another type of crash with slow devices under QEMU reported
by Paul Ripke - thanks a lot for the extensive debugging help
2020-05-21 09:11:33 +00:00
jdolecek 9b3441ff46 only start the timeout machinery once the I/O is completely set up
and successful, particularly after PIO write is finished

fixes crashes in case the setup is so slow that timeout is triggered
e.g. while still waiting in wdc_wait_for_unbusy() or shortly after, without
drive actually having a chance to complete the I/O, as seen in some
configurations under QEMU by Paul Ripke
2020-05-19 08:08:51 +00:00
jdolecek d28f932532 remove unused atacmd_tostatq() 2020-05-15 21:56:14 +00:00
jdolecek cd3efa77db whitespace (bad indent) 2020-05-15 16:58:28 +00:00
thorpej 5dedc14798 Back out changes to use a threadpool for now; it's causing trouble
for some folks on Thinkpads.
2020-05-02 19:09:56 +00:00
thorpej e9bf4cf256 Rather than creating a kthread-per-channel, use a threadpool and a
threadpool-job-per-channel for the in-thread-context work that needs
to be done (which is rare).

On one of my test systems, this results in the total number of LWPs
after multi-user boot dropping from 116 to 78.
2020-04-25 00:07:27 +00:00
jdolecek c61cfedcc1 fix use-after-free for ata xfer on bio submission found by KASAN
driver ata_bio hooks read parts of the xfer after ata_exec_xfer()
call in order to determine return value, change so that the hook
doesn't return any value - callers do not care already,
as all I/O requests are asynchronous

this problem was uncovered by recent change for wd(4) to not hold
wd mutex during ata_bio call, the interrupt for the xfer might
thus actually fire immediately

adjust also ata_exec_command driver hooks similarly - remove all
completion and waiting logic from drivers, upper layer ata code
using AT_WAIT/AT_POLL changed to call ata_wait_cmd() itself

PR kern/55169 by Nick Hudson
2020-04-13 10:49:34 +00:00
maxv babb6cb124 constify 2020-04-13 08:05:02 +00:00
jdolecek 22ba269296 drop wd lock in wdstart1() before calling the ata_bio hook; when called
from ata thread context, that can still need to sleep for wdc attachments
in wdcwait()
2020-04-07 13:22:05 +00:00
jdolecek ffdabc7e1f stop xfer timeouts during recovery, all xfers will be requeued anyway
this avoids race with the timeout routine when processing the xfers
for requeueing

should fix PR kern/54790 by Izumi Tsutsui
2020-04-04 22:30:02 +00:00
jdolecek 2f9d652e0f fix deadlock in wdcwait() when xfer timeout happens while the atabus
thread sleeps in wdcwait() - check current lwp rather than relying
on global ATACH_TH_RUN channel flag

should fix the hang part of the problem reported in
http://mail-index.netbsd.org/netbsd-users/2020/03/12/msg024249.html

thanks to Paul Ripke for providing extensive debugging info
2020-04-04 21:36:15 +00:00
riastradh ffcf681ee3 New ioctl DIOCGSECTORALIGN returns sector alignment parameters.
struct disk_sectoralign {
	/* First aligned sector number.  */
	uint32_t dsa_firstaligned;

	/* Number of sectors per aligned unit.  */
	uint32_t dsa_alignment;
};

- Teach wd(4) to get it from ATA.
- Teach cgd(4) to pass it through from the underlying disk.
- Teach dk(4) to pass it through with adjustments.
- Teach zpool (zfs) to take advantage of it.
  => XXX zpool doesn't seem to understand when the vdev's starting
     sector is misaligned.

Missing:

- ccd(4) and raidframe(4) support -- these should support _using_
  DIOCGSECTORALIGN to decide where to start putting ccd or raid
  stripes on disk, and these should perhaps _implement_
  DIOCGSECTORALIGN by reporting the stripe/interleave factor.

- sd(4) support -- I don't know any obvious way to get it from SCSI,
  but if any SCSI wizards know better than I, please feel free to
  teach sd(4) about it!

- any ld(4) attachments -- might be worth teaching the ld drivers for
  nvme and various raid controllers to get the aligned sector size

There's some duplicate logic here for now.  I'm doing it this way,
rather than gathering the logic into a new disklabel_sectoralign
function or something, so that this change is limited to adding a new
ioctl, without any new kernel symbols, in order to make it easy to
pull up to netbsd-9 without worrying about the module ABI.
2020-03-02 16:01:56 +00:00
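
One way to read "pass it through with adjustments" in the commit above is as modular arithmetic on the two fields of `struct disk_sectoralign`. The sketch below assumes that aligned sectors repeat every `dsa_alignment` sectors starting at `dsa_firstaligned`, and that a child device (such as a wedge) begins `offset` sectors into its parent. The helpers `sketch_is_aligned()` and `sketch_adjust_for_offset()` are hypothetical and are not the code used by dk(4).

```c
#include <assert.h>
#include <stdint.h>

/* Structure as given in the commit message. */
struct disk_sectoralign {
	/* First aligned sector number.  */
	uint32_t dsa_firstaligned;

	/* Number of sectors per aligned unit.  */
	uint32_t dsa_alignment;
};

/* Nonzero if `sector` starts an aligned unit (assumed periodic layout). */
int
sketch_is_aligned(const struct disk_sectoralign *dsa, uint64_t sector)
{
	assert(dsa->dsa_alignment != 0);
	return sector % dsa->dsa_alignment ==
	    dsa->dsa_firstaligned % dsa->dsa_alignment;
}

/*
 * Translate the parent's parameters for a child that starts `offset`
 * sectors into the parent: child sector w is parent sector offset + w.
 */
struct disk_sectoralign
sketch_adjust_for_offset(struct disk_sectoralign parent, uint64_t offset)
{
	struct disk_sectoralign child;
	uint64_t align = parent.dsa_alignment;
	uint64_t first, skew;

	assert(align != 0);
	first = parent.dsa_firstaligned % align;
	skew = offset % align;

	child.dsa_alignment = parent.dsa_alignment;
	/* Smallest w with (offset + w) congruent to first modulo align. */
	child.dsa_firstaligned = (uint32_t)((first + align - skew) % align);
	return child;
}
```

For example, with 8 sectors per aligned unit and a child starting at parent sector 63, the child's first aligned sector is 1, because parent sector 64 is the next aligned one.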
riastradh fcabfbde55 Add a flag to dk_dump for virtual disk devices.
If a disk is backed by a physical medium other than itself, such as
cgd(4), then it passes DK_DUMP_RECURSIVE to disable the recursion
detection for dk_dump.

If, however, a device represents a physical medium on its own, such
as wd(4), then it passes 0 instead.

With this, I can now dump to dk on cgd on dk on wd.
2020-03-01 03:21:54 +00:00
simonb f3ef58ddb7 Tidy quirk table and remove outdated quirk from the quirk format string. 2020-01-18 11:24:40 +00:00
simonb 2fb6b7b160 Revert kern/54790 and kern/54855 NCQ fix that penalised all Samsung
EVO 860 drives.

ok jdolecek@
2020-01-18 11:22:49 +00:00
ad bdf8ebffe6 Acquire kernel_lock in the bp->b_iodone callback. 2020-01-17 19:30:51 +00:00
jdolecek 0631449bd5 enable the BAD_NCQ quirk for all 860 EVO drives
XXX work-in-progress, it's not clear whether this is driver or controller
XXX problem
2020-01-14 21:08:06 +00:00
jdolecek eef4b266f0 disable NCQ by default for "Samsung SSD 860 EVO 1TB" and
"Samsung SSD 860 EVO 500GB" - these drives have known broken NCQ support
particularly when used with AMD SB710/750 chipsets, problems also occur
under Linux and Windows

https://eu.community.samsung.com/t5/Cameras-IT-Everything-Else/860-EVO-250GB-causing-freezes-on-AMD-system/td-p/575813
https://bugzilla.kernel.org/show_bug.cgi?id=201693

It seems there is no Samsung firmware update to fix this even.

Disable NCQ regardless of the controller, it's likely the same problem
exists with other controllers too.

This should fix PR kern/54790 and PR kern/54855
2020-01-13 21:20:17 +00:00
msaitoh a0403cde04 s/transfered/transferred/ 2019-12-27 09:41:48 +00:00
christos 22c0f21763 chuq does not like insomniac allocations so unlock-alloc-lock instead. 2019-10-21 18:58:57 +00:00
christos 7693ab4db6 Fix assert_sleepable() panic by allocating with NOSLEEP. The alternative is
to unlock and relock the channel, but seems more dangerous to do so.
2019-10-21 18:37:47 +00:00