[dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD
marc at merlins.org
Mon Jul 23 18:12:42 CEST 2012
On Mon, Jul 23, 2012 at 10:14:07AM +0200, Arno Wagner wrote:
> > > SID? That would be "unstable", with possible assorted problems.
> Ok, sorry for my confusion, what kernel/distro are you running?
Debian testing with pieces of unstable :)
That gives me cryptsetup 1.4.3
(but Debian unstable is often no more unstable than your released Fedora
Core or Ubuntu, in my experience)
Thanks for the answer and all your questions.
On Mon, Jul 23, 2012 at 12:46:38PM +0200, Milan Broz wrote:
> AES-NI helps a lot and because it is prioritised in crypto api,
> you are usually using it without any additional config if the hw supports it.
> (Also I see some patches which run XTS blocks in parallel on the crypto api list.)
Yes, the modules all worked perfectly and the correct one was prioritized.
> So no wonder that you get slow operation - in dd/hdparm usually only
> one cpu is processing the request. If CPU is fast enough, no problem.
> If not, you will see a slowdown. AES-NI will speed this up on that cpu core,
> but will not help run requests in parallel on multi-core systems.
Now that I've verified that dd is slow while filesystem operations are
almost 10x faster, you're clearly onto something here.
But, I'm confused, why does atop show that dd is only using 6% CPU?
Oooh, it's not that my CPU is fully used, it's just that my CPU is able to
decrypt as quickly as the data is coming in from a 100MB/s hard drive, but
not from a 500MB/s SSD, and the rest depends on how scheduling works when
the data is coming in faster than the CPU can decrypt those pages.
(edited: actually no, using 'null' encryption still gives 25MB/s).
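One way to separate "a single reader is the bottleneck" from "the device path is slow regardless" would be to read several disjoint regions at once and see whether the aggregate throughput scales. The following is only a sketch: parallel_read is a name made up for this example, and the 256MB default region size is an arbitrary assumption.

```shell
# Sketch: read N disjoint regions of a device concurrently.
# parallel_read is a hypothetical helper, not an existing tool.
parallel_read() {
    dev=$1          # e.g. /dev/mapper/cryptroot
    n=$2            # number of concurrent readers
    mb=${3:-256}    # MB read by each reader (arbitrary default)
    i=0
    while [ "$i" -lt "$n" ]; do
        # skip= counts bs-sized blocks, so each reader gets its own region
        dd if="$dev" of=/dev/null bs=1M skip=$((i * mb)) count="$mb" 2>/dev/null &
        i=$((i + 1))
    done
    wait    # returns once every background dd has finished
}
```

In bash, something like "time parallel_read /dev/mapper/cryptroot 1" followed by the same command with 4 readers (dropping the page cache in between) would show whether more readers help.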
> I do not like this dmcrypt mode and we tried to fix it. There is a bunch of patches
> from Mikulas Patocka which switches parallelization to use all available
> cpus (if not limited by parameter).
> In my tests it improved performance in some cases but not in all situations
> (there were some slowdowns which scare me).
> (You can see patches here http://people.redhat.com/mpatocka/patches/kernel/dm-crypt-paralelizace/)
> Unfortunately discussion stopped and device-mapper maintainer forgot about it.
> Well, your mails apparently served as a heads-up here, so I'll try to return
> to this. (Will post them to dm-devel directly this time.)
I appreciate your answer and your looking into this. Since I run recent
self-compiled kernel.org kernels, I can test patches as long as they're
reasonably certain not to turn my data into garbage :)
(I have backups, but it just took me too long to rebuild my laptop after my
last SSD crash).
On Mon, Jul 23, 2012 at 01:37:26PM +0200, Milan Broz wrote:
> Common distro env is nice but be sure you have proper crypto modules available.
> Also do not use Fedora rawhide, it has a kernel compiled with debug tools,
> not usable for testing.
Mmmh, I have one possible thing. I have a preempt kernel. Could that be it?
> You can try to start with this:
> 1) (this should not be a problem, but better check it)
> Check alignment of partitions. Is it aligned to SSD page size?
> (Aligning to 1MiB is always correct ;-)
> Paste fdisk -l -u /dev/sda
Disk /dev/sda: 512.1 GB, 512110190592 bytes
255 heads, 63 sectors/track, 62260 cylinders, total 1000215216 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x09aaf50a

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048      502047      250000   83  Linux
/dev/sda2          502048    52930847    26214400   83  Linux
/dev/sda3        52930848    73902367    10485760   82  Linux swap / Solaris
/dev/sda4        73902368  1000215215   463156424   83  Linux

I also used:
cryptsetup luksFormat --align-payload=8192
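
The 1MiB alignment question can be checked directly from the start sectors in the fdisk output above. This is a minimal sketch, assuming 512-byte sectors as reported by fdisk; mib_offset is a name invented for the example.

```shell
# Distance of a start sector from the previous 1MiB boundary, in sectors.
# 1MiB = 2048 sectors of 512 bytes. mib_offset is a hypothetical helper.
mib_offset() {
    echo $(( $1 % 2048 ))
}

# Start sectors of sda1..sda4, copied from the fdisk listing
for start in 2048 502048 52930848 73902368; do
    echo "start=$start offset_from_1MiB=$(mib_offset "$start") sectors"
done
```

With these values the remainder is 0 for sda1 and 288 sectors (144KiB) for sda2, sda3 and sda4, so those three are not on a 1MiB boundary, although 288 is still a multiple of 8 sectors (4KiB). If --align-payload is counted from the start of the partition, as seems to be the case, a payload offset of 8192 sectors would leave that remainder unchanged.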
> 2) try to switch io scheduler to "noop" or "deadline"
> (paste lsblk -t)
I tried both noop and deadline (never used cfq) and it didn't help.
gandalfthegreat:~# lsblk -t
NAME                 ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED    RQ-SIZE
sda                          0    512      0     512     512    0 deadline     128
├─sda1                       0    512      0     512     512    0 deadline     128
├─sda2                       0    512      0     512     512    0 deadline     128
├─sda3                       0    512      0     512     512    0 deadline     128
└─sda4                       0    512      0     512     512    0 deadline     128
  └─cryptroot (dm-0)         0    512      0     512     512    0              128
But just to make sure, I tried cfq, noop, and deadline and it didn't make a
difference.
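
For reference, the scheduler is switched per device through sysfs, and the kernel marks the active one with square brackets. A small sketch for reading it back follows; active_sched is a made-up helper name, and the sda path is simply the device used in this thread.

```shell
# Print the active I/O scheduler from a sysfs scheduler file.
# The kernel brackets the active one, e.g. "noop [deadline] cfq".
# active_sched is a hypothetical helper name.
active_sched() {
    sed -n 's/.*\[\(.*\)\].*/\1/p' "${1:-/sys/block/sda/queue/scheduler}"
}

# To switch (as root), write the scheduler name into the same file:
#   echo noop > /sys/block/sda/queue/scheduler
```

Running active_sched after each switch confirms that the change actually took effect before re-running the read test.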
> Also you can try to increase queue size.
Not sure which one it is:
gandalfthegreat:/sys/block/sda/queue# grep . *
scheduler:[noop] deadline cfq
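
The "queue size" is most likely nr_requests, which is what the RQ-SIZE column of lsblk -t reports (128 above); that reading is an assumption, and read_ahead_kb or max_sectors_kb may also be worth a look. A sketch that prints just those files, with show_queue being a name invented here:

```shell
# Show the queue tunables most likely meant by "queue size".
# show_queue is a hypothetical helper; pass the queue directory to inspect.
show_queue() {
    q=${1:-/sys/block/sda/queue}
    for f in nr_requests read_ahead_kb max_sectors_kb; do
        # Skip files that do not exist on this kernel
        [ -r "$q/$f" ] && echo "$f=$(cat "$q/$f")"
    done
    return 0
}

# To raise the request queue depth (as root), for example:
#   echo 512 > /sys/block/sda/queue/nr_requests
```

The value 512 is only an example, not a recommendation from this thread.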
> (Hard core version is to run blktrace and check if requests are not split
> unnecessarily :)
I'm not too sure how to read the output, but there it is:
gandalfthegreat:~# reset_cache ; dd if=/dev/mapper/cryptroot of=/dev/null bs=1M count=10; killall blktrace
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.514707 s, 20.4 MB/s
> 3) Let's test cipher_null (no encryption just fake-copy)
> (you need cryptsetup 1.4.3 for this test).
> create test LUKS device with null cipher: cryptsetup luksFormat -c null
gandalfthegreat:~# cryptsetup luksFormat --align-payload=8192 -c null /dev/sda2
Are you sure? (Type uppercase yes): YES
Enter LUKS passphrase:
gandalfthegreat:~# cryptsetup luksOpen /dev/sda2 test
Enter passphrase for /dev/sda2:
gandalfthegreat:~# hdparm -t -T /dev/mapper/test
Timing cached reads: 12932 MB in 2.00 seconds = 6471.89 MB/sec
Timing buffered disk reads: 76 MB in 3.04 seconds = 25.01 MB/sec
So it's not related to the kind of encryption.
> please note: cipher null means no encryption, just use dmcrypt layer,
> so do not use for valid data :-)
Yes, I figured :)
> 4) which aes module are you using? check lsmod, check /proc/crypto
> you should use either aes-ni or optimized modules (x86_64 etc)
Yep, I tried using both (aes-ni is default) and the result was the same.
I'll try rebuilding a non-preempt kernel just in case.
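
For anyone repeating step 4, the registered AES implementations and their priorities can be listed from /proc/crypto; the crypto API prefers the highest priority for a given algorithm name. This is a sketch only, assuming the usual "key : value" layout of that file, and aes_drivers is a hypothetical helper name.

```shell
# List AES-related entries from /proc/crypto, highest priority first.
# Output columns: priority, algorithm name, driver name.
# aes_drivers is a hypothetical helper name.
aes_drivers() {
    awk -F': ' '
        /^name/     { name = $2 }
        /^driver/   { driver = $2 }
        /^priority/ { if (name ~ /aes/) print $2, name, driver }
    ' "${1:-/proc/crypto}" | sort -rn
}
```

Calling aes_drivers with no argument reads /proc/crypto; a driver name containing "aesni" near the top would suggest the AES-NI module is the one being picked.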
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/