b9dc86d8d6

When configuring QEMU cache modes for Nova instances, we use
'writethrough' when 'none' is not available.  That choice is wrong, and
it comes from a misunderstanding of how cache modes work.  For example,
the function disk_cachemode() in the libvirt driver assumes that the
'writethrough' and 'none' cache modes behave the same with respect to
host crash safety, which is not true.

The misunderstanding and complexity stem from not realizing that each
QEMU cache mode is shorthand for toggling *three* booleans.  Refer to
the cache mode table in the code comment (in
nova/virt/libvirt/driver.py).
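
To make the "three booleans" point concrete, here is a minimal sketch of
the mapping as a lookup table.  The flag values follow the cache mode
table in the QEMU documentation as I understand it; the names CacheFlags,
CACHE_MODES and same_flush_semantics are illustrative only and do not
exist in Nova or QEMU.

```python
from typing import NamedTuple


class CacheFlags(NamedTuple):
    """The three booleans a QEMU cache mode name stands for."""
    writeback: bool  # cache.writeback: writes may complete before reaching stable storage
    direct: bool     # cache.direct: bypass the host page cache (O_DIRECT)
    no_flush: bool   # cache.no-flush: ignore flush requests from the guest


# Illustrative table; values taken from the QEMU documentation.
CACHE_MODES = {
    "writeback":    CacheFlags(writeback=True,  direct=False, no_flush=False),
    "none":         CacheFlags(writeback=True,  direct=True,  no_flush=False),
    "writethrough": CacheFlags(writeback=False, direct=False, no_flush=False),
    "directsync":   CacheFlags(writeback=False, direct=True,  no_flush=False),
    "unsafe":       CacheFlags(writeback=True,  direct=False, no_flush=True),
}


def same_flush_semantics(mode_a, mode_b):
    """Return True if both modes agree on cache.writeback and cache.no-flush.

    These two flags decide whether data is only safe after a guest flush;
    cache.direct affects only the host page cache.
    """
    a, b = CACHE_MODES[mode_a], CACHE_MODES[mode_b]
    return (a.writeback, a.no_flush) == (b.writeback, b.no_flush)
```

Under this table, 'none' and 'writeback' differ only in cache.direct,
while 'none' and 'writethrough' differ in cache.writeback, which is the
flag that matters for host crash safety.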

As Kevin Wolf (thanks!), QEMU Block Layer maintainer, explains (I made
a couple of micro edits for clarity):

    The thing that makes 'writethrough' so safe against host crashes is
    that it never keeps data in a "write cache", but it calls fsync()
    after _every_ write.  This is also what makes it horribly slow.  But
    'cache=none' doesn't do this and therefore doesn't provide this kind
    of safety.  The guest OS must explicitly flush the cache in the
    right places to make sure data is safe on the disk.  And OSes do
    that.

    So if 'cache=none' is safe enough for you, then 'cache=writeback'
    should be safe enough for you, too -- because both of them have the
    boolean 'cache.writeback=on'.  The difference is only in
    'cache.direct', but 'cache.direct=on' only bypasses the host kernel
    page cache and data could still sit in other caches that could be
    present between QEMU and the disk (such as commonly a volatile write
    cache on the disk itself).

So use 'writeback' mode instead of the debilitatingly slow
'writethrough' for cases where the O_DIRECT-based 'none' is unsupported.
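
The fallback rule above can be sketched as follows.  This is not the
Nova implementation: supports_o_direct() and pick_cache_mode() are
hypothetical helpers, and the probe only checks whether open() accepts
O_DIRECT in the given directory.  A real check may also need to attempt
an aligned write, since some file systems accept the flag at open time
but fail later.

```python
import os
import tempfile


def supports_o_direct(dirpath):
    """Hypothetical probe: can a file in dirpath be opened with O_DIRECT?"""
    o_direct = getattr(os, "O_DIRECT", None)
    if o_direct is None:
        # The platform does not expose O_DIRECT at all.
        return False
    fd, path = tempfile.mkstemp(dir=dirpath)
    os.close(fd)
    try:
        fd = os.open(path, os.O_RDWR | o_direct)
    except OSError:
        # The file system rejected the O_DIRECT flag.
        return False
    else:
        os.close(fd)
        return True
    finally:
        # Always remove the scratch file, whatever the outcome.
        os.unlink(path)


def pick_cache_mode(dirpath, probe=supports_o_direct):
    """Prefer 'none' when O_DIRECT works; otherwise fall back to 'writeback'."""
    return "none" if probe(dirpath) else "writeback"
```

Passing the probe as a parameter keeps the decision testable without
depending on the file system of the machine running the test.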

Do the minimum required update to the `disk_cachemodes` config help
text.  (In a future patch, rewrite the cache modes documentation to fix
confusing fragments and outdated information.)

Closes-Bug: #1818847
Change-Id: Ibe236988af24a3b43508eec4efbe52a4ed05d45f
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
Looks-good-to-me'd-by: Kevin Wolf <kwolf@redhat.com>

---
fixes:
  - |
    Update the way QEMU cache mode is configured for Nova guests: If the
    file system hosting the directory with Nova instances is capable of
    Linux's O_DIRECT, use ``none``; otherwise fall back to ``writeback``
    cache mode.  This improves performance without compromising data
    integrity.  `Bug 1818847`_.

    Context: What makes ``writethrough`` so safe against host crashes is
    that it never keeps data in a "write cache", but it calls fsync()
    after *every* write.  This is also what makes it horribly slow.  But
    cache mode ``none`` doesn't do this and therefore doesn't provide
    this kind of safety.  The guest OS must explicitly flush the cache
    in the right places to make sure data is safe on the disk; and all
    modern OSes flush data as needed.  So if cache mode ``none`` is safe
    enough for you, then ``writeback`` should be safe enough too.

    .. _Bug 1818847: https://bugs.launchpad.net/nova/+bug/1818847