[dm-crypt] aes-xts-plain with aes_x86_64 makes my SSD 5x slower than my encrypted HD

Marc MERLIN marc at merlins.org
Mon Jul 23 18:12:42 CEST 2012


On Mon, Jul 23, 2012 at 10:14:07AM +0200, Arno Wagner wrote:
> > > SID? That would be "unstable", with possible assorted problems.
> 
> Ok, sorry for my confusion, what kernel/distro are you running?
 
Debian testing with pieces of unstable :)
That gives me cryptsetup 1.4.3
(but debian unstable is often not more unstable than your released fedora
core or ubuntu in my experience)


Hi Milan,

Thanks for the answer and all your questions.

On Mon, Jul 23, 2012 at 12:46:38PM +0200, Milan Broz wrote:
> AES-NI helps a lot and because it is prioritised in crypto api,
> you are usually using it without any additional config if hw supports it.
> (Also I see some patches which run XTS blocks in parallel on crypto api list.)

Yes, the modules all worked perfectly and the correct one was prioritized.

> So no wonder that you get slow operation - in dd/hdparm usually only
> one cpu is processing the request. If CPU is fast enough, no problem.
> If not you will see slowdown. AES-NI will speed up this on that cpu core,
> but will not help run request in parallel on multi-core systems.
 
Now that I've verified that dd is slow while filesystem operations are
almost 10x faster, you're obviously onto something here.
But I'm confused: why does atop show that dd is only using 6% CPU?

Oooh, it's not that my CPU is fully used, it's just that my CPU is able to
decrypt as quickly as the data is coming in for a 100MB/s hard drive, but
not for a 500MB/s SSD, and then it depends on how scheduling works when the
data is coming in faster than the CPU can decrypt those pages.
(edited: actually no, using 'null' encryption still gives 25MB/s).

> I do not like this dmcrypt mode and we tried to fix it. There is a bunch of patches
> from Mikulas Patocka which switches parallelization to use all available
> cpus (if not limited by parameter).
> In my tests it improved performance in some cases but not in all situations
> (there were some slow downs which scares me).
> (You can see patches here http://people.redhat.com/mpatocka/patches/kernel/dm-crypt-paralelizace/)
> 
> Unfortunately discussion stopped and device-mapper maintainer forgot about it.
> 
> Well, your mails apparently caused some heads-up here, so I'll try to return
> to this. (Will post them to dm-devel directly this time.)
 
I appreciate your answer and your looking into this. Since I run recent self
compiled kernel.org kernels, I can test patches as long as they're
reasonably certain not to turn my data into garbage :)
(I have backups, but it just took me too long to rebuild my laptop after my
last SSD crash).

On Mon, Jul 23, 2012 at 01:37:26PM +0200, Milan Broz wrote:
> Common distro env is nice but be sure you have proper crypto modules available.
> Also do not use Fedora rawhide, it has kernel compiled with debug tools
> not usable for testing.
 
Mmmh, I have one possible thing. I have a preempt kernel. Could that be it?
http://marc.merlins.org/tmp/config-3.4.4-amd64-preempt.txt

> You can try start with this:
> 
> 1) (this should be not problem but better check it)
> Check alignment of partitions. Is it aligned to SSD page size?
> (Aligning to 1MiB is always correct ;-)
> Paste fdisk -l -u /dev/sda
 
Disk /dev/sda: 512.1 GB, 512110190592 bytes
255 heads, 63 sectors/track, 62260 cylinders, total 1000215216 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x09aaf50a

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048      502047      250000   83  Linux
/dev/sda2          502048    52930847    26214400   83  Linux
/dev/sda3        52930848    73902367    10485760   82  Linux swap / Solaris
/dev/sda4        73902368  1000215215   463156424   83  Linux

I also used:
cryptsetup luksFormat --align-payload=8192
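
A quick way to check the 1MiB question against the fdisk output above is to
take each start sector modulo the boundary. This is only a sketch: it assumes
the 512-byte logical sectors reported above, and `misalign` is a made-up
helper name, not part of any tool.

```shell
#!/bin/sh
# Print how many bytes a partition start lies past the previous boundary.
#   $1: start sector as printed by `fdisk -l -u`
#   $2: boundary in bytes (a multiple of 512), e.g. 1048576 for 1MiB
# Assumption: 512-byte logical sectors, as in the fdisk output above.
misalign() {
    start_sector=$1
    boundary_bytes=$2
    # Work in sectors first so the intermediate value stays small.
    echo $(( (start_sector % (boundary_bytes / 512)) * 512 ))
}

# Start sectors of sda1..sda4 from the partition table above.
for start in 2048 502048 52930848 73902368; do
    echo "start=$start 1MiB-remainder=$(misalign "$start" 1048576) 4KiB-remainder=$(misalign "$start" 4096)"
done
```

With these start sectors, sda1 sits exactly on a 1MiB boundary, while sda2,
sda3 and sda4 each start 147456 bytes (144KiB) past one; all four are still
multiples of 4KiB. Whether the 144KiB offset matters depends on the SSD's page
and erase block sizes, which this output does not reveal.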

> 2) try to switch io scheduler to "noop" or "deadline"
> (paste lsblk -t)

I tried both noop and deadline (never used cfq) and it didn't help.
gandalfthegreat:~# lsblk -t
NAME                 ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED    RQ-SIZE
sda                          0    512      0     512     512    0 deadline     128
├─sda1                       0    512      0     512     512    0 deadline     128
├─sda2                       0    512      0     512     512    0 deadline     128
├─sda3                       0    512      0     512     512    0 deadline     128
└─sda4                       0    512      0     512     512    0 deadline     128
  └─cryptroot (dm-0)         0    512      0     512     512    0              128

But just to make sure, I tried cfq, noop, and deadline and it didn't make a
difference.
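
For anyone repeating the scheduler comparison, a minimal sketch follows. The
active scheduler is the bracketed entry in /sys/block/<dev>/queue/scheduler,
and writing a name to that file (as root) switches it. The helper names
`active_sched` and `try_schedulers`, the dd size, and the use of
/proc/sys/vm/drop_caches to empty the page cache are illustrative choices, not
what was actually run for this mail.

```shell
#!/bin/sh
# Print the active scheduler from a sysfs "scheduler" line such as
# "noop deadline [cfq]": the active entry is the one in square brackets.
active_sched() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Switch one device through each scheduler and time a raw read (needs root).
#   $1: device name without /dev/, e.g. sda
try_schedulers() {
    dev=$1
    for s in noop deadline cfq; do
        echo "$s" > "/sys/block/$dev/queue/scheduler" || return 1
        echo "now using: $(active_sched "$(cat "/sys/block/$dev/queue/scheduler")")"
        # Flush and drop the page cache so the read really hits the device.
        sync
        echo 3 > /proc/sys/vm/drop_caches
        # dd prints its summary on stderr; keep only the throughput line.
        dd if="/dev/$dev" of=/dev/null bs=1M count=100 2>&1 | tail -n 1
    done
}
```

Nothing runs until a function is called, e.g. `try_schedulers sda` as root.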

> Also you can try to increase queue size.

Not sure which one it is:

gandalfthegreat:/sys/block/sda/queue# grep . *
add_random:1
discard_granularity:512
discard_max_bytes:2147450880
discard_zeroes_data:0
hw_sector_size:512
iostats:1
logical_block_size:512
max_hw_sectors_kb:32767
max_integrity_segments:0
max_sectors_kb:512
max_segments:168
max_segment_size:65536
minimum_io_size:512
nomerges:0
nr_requests:128
optimal_io_size:0
physical_block_size:512
read_ahead_kb:128
rotational:0
rq_affinity:1
scheduler:[noop] deadline cfq 
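
Of the values listed above, the queue size is most likely nr_requests (the
number of requests the block layer queues for the device); read_ahead_kb is
the other knob that commonly affects sequential reads. A hedged sketch of
changing one and reading it back: `set_queue_param` is a hypothetical helper,
and the numbers in the usage comment are arbitrary starting points, not
recommendations.

```shell
#!/bin/sh
# Write a value to one tunable in a block queue directory and print what
# is reported afterwards.
#   $1: queue directory, normally /sys/block/<dev>/queue (root needed there)
#   $2: tunable name, e.g. nr_requests
#   $3: new value
set_queue_param() {
    queue_dir=$1
    name=$2
    value=$3
    echo "$value" > "$queue_dir/$name" || return 1
    echo "$name=$(cat "$queue_dir/$name")"
}

# Usage (as root):
#   set_queue_param /sys/block/sda/queue nr_requests 512
#   set_queue_param /sys/block/sda/queue read_ahead_kb 1024
```

The settings are not persistent and revert on reboot.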
 
> (Hard core version is to run blktrace and check if requests are not split
> unnecessarily :)

I'm not too sure how to read the output, but there it is:
http://marc.merlins.org/tmp/blktrace.txt
 
Generated with:
gandalfthegreat:~# reset_cache ; dd if=/dev/mapper/cryptroot of=/dev/null bs=1M count=10; killall blktrace
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.514707 s, 20.4 MB/s
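
The rate dd prints is simply bytes divided by seconds, in decimal megabytes.
A small sketch that reproduces the figure above (`mb_per_s` is an illustrative
helper name):

```shell
#!/bin/sh
# Throughput in MB/s (decimal megabytes, as dd reports it).
#   $1: bytes copied
#   $2: elapsed seconds
mb_per_s() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

mb_per_s 10485760 0.514707   # prints 20.4
```

At the 500MB/s the SSD is rated for, the same 10485760 bytes would take about
0.02 seconds, so this read is roughly 25 times slower than the raw device
should manage.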

> 3) Let's test cipher_null (no encryption just fake-copy)
> (you need cryptsetup 1.4.3 for this test).
> 
> create test LUKS device with null cipher: cryptsetup luksFormat -c null
 
gandalfthegreat:~# cryptsetup luksFormat --align-payload=8192 -c null /dev/sda2
Are you sure? (Type uppercase yes): YES
Enter LUKS passphrase: 
Verify passphrase: 
gandalfthegreat:~# cryptsetup luksOpen /dev/sda2 test
Enter passphrase for /dev/sda2: 
gandalfthegreat:~# hdparm -t -T /dev/mapper/test
/dev/mapper/test:
 Timing cached reads:   12932 MB in  2.00 seconds = 6471.89 MB/sec
 Timing buffered disk reads:  76 MB in  3.04 seconds =  25.01 MB/sec
gandalfthegreat:~# 

So it's not related to the kind of encryption.

> please note: cipher null means no encryption, just use dmcrypt layer,
> so do not use for valid data :-)
 
Yes, I figured :)

> 4) which aes module are you using? check lsmod, check /proc/crypto
> 
> you should use either aes-ni or optimized modules (x86_64 etc)

Yep, I tried using both (aes-ni is default) and the result was the same.
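
To see which implementation the crypto API would prefer, the priorities in
/proc/crypto can be listed per algorithm name; the entry with the highest
priority is the one that gets picked. A minimal sketch, where `crypto_drivers`
is a made-up helper that reads the /proc/crypto text on stdin:

```shell
#!/bin/sh
# Print "driver priority" for every /proc/crypto entry whose "name" field
# equals $1, highest priority first. Input: /proc/crypto text on stdin.
crypto_drivers() {
    awk -v want="$1" '
        $1 == "name"     { name = $3 }
        $1 == "driver"   { driver = $3 }
        $1 == "priority" && name == want { print driver, $3 }
    ' | sort -k2,2nr
}

# Usage on a live system:
#   crypto_drivers aes < /proc/crypto
#   crypto_drivers "xts(aes)" < /proc/crypto
```

The template entries such as xts(aes) only show up once something has
requested them, for example after the encrypted device has been opened.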

I'll try rebuilding a non preempt kernel just in case.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  

