C* Summit 2013: How Not to Use Cassandra by Axel Liljencrantz
http://www.slideshare.net/planetcassandra/8-axel-liljencrantz-23204252
https://www.youtube.com/watch?v=0u-EKJBPrj8
<= How not to delete data =>
Key points:
Tombstones can only be dropped once all the non-tombstone values they shadow have been deleted.
A tombstone can only be dropped if all SSTables containing values for that row participate in the same compaction.
So for wide rows, minor compactions can almost never reclaim tombstones.
Additional notes:
Size tiered compaction
The default compaction strategy: it takes N SSTables of roughly equal size (4 by default) and merges them into a single new, larger SSTable.
Because of this, the larger an SSTable gets, the less likely it is to be compacted.
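As a sketch, the merge threshold described above is tunable per table in CQL; the table name `playlists` here is hypothetical, and `min_threshold` corresponds to the N SSTables (default 4) merged at a time:

```sql
-- Hypothetical table name; min_threshold is the N (default 4) noted above
ALTER TABLE playlists
  WITH compaction = {
    'class': 'SizeTieredCompactionStrategy',
    'min_threshold': 4
  };
```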
Leveled Compaction
SSTables are organized into levels whose sizes grow geometrically, 10x per level by default, e.g. L0 5MB, L1 50MB, L2 500MB.
L0 is 5MB by default.
When L0 reaches 5MB, it is compacted and the overflow is merged into L1.
When L1 exceeds 50MB, it is compacted and the part above 50MB is merged into L2, and so on.
In theory most rows live in a single SSTable (theoretically 90%, but in practice it may be only 50-80%).
Compared to size-tiered compaction, it is better suited when:
- low read latency is required
- the workload is read-heavy and write-light
- rows are wide or frequently updated
It is not suited when:
- the machines cannot keep up on I/O
- the workload is write-heavy and read-light
- data is never updated after being written
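To opt into leveled compaction as described above, it is set per table; the table name `playlists` is hypothetical, and `sstable_size_in_mb` matches the 5MB default from the notes:

```sql
-- Hypothetical table; sstable_size_in_mb is the 5MB unit mentioned above
ALTER TABLE playlists
  WITH compaction = {
    'class': 'LeveledCompactionStrategy',
    'sstable_size_in_mb': 5
  };
```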
= TTL:ed data =
Key points:
Overwritten data could theoretically bounce back.
If TTLed data overwrites another column's value, then when the TTLed data expires, the old value will reappear if it has not yet been removed by compaction.
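A minimal CQL sketch of this resurrection scenario (the table and column names are hypothetical):

```sql
-- Older, non-TTL write; ends up sitting in an old SSTable
INSERT INTO users (key, status) VALUES ('row1', 'online');

-- Later overwrite with a one-hour TTL; it shadows the old value
UPDATE users USING TTL 3600 SET status = 'away' WHERE key = 'row1';

-- Reads return 'away' for the next hour. After the TTL expires and the
-- expired cell is purged, 'online' can reappear until a compaction has
-- merged both versions away.
```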
Additional notes:
- TTLed data and compaction
- CASSANDRA-3442: TTL histogram for sstable metadata (for size-tiered compaction)
- CASSANDRA-4234: Add tombstone-removal compaction to LCS (Cassandra 1.2.0 b1)
Since Cassandra 1.2, Cassandra tracks the droppable time of tombstones for all TTLed/deleted columns and performs a standalone compaction on any SSTable whose ratio of droppable tombstones to total columns exceeds a threshold. The threshold defaults to 20% (0.2), and you can configure it by providing the compaction parameter tombstone_threshold when creating the column family.
The histogram looks like this: [figure not captured in these notes]
table options:
- tombstone_compaction_interval
- tombstone_threshold
Garbage-collectable means the data is at least past gc_grace.
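The two table options above can be set through the compaction map; the values shown are the defaults mentioned in the notes (0.2 ratio, one-day interval), and the table name is hypothetical:

```sql
ALTER TABLE playlists
  WITH compaction = {
    'class': 'SizeTieredCompactionStrategy',
    'tombstone_threshold': 0.2,            -- droppable-tombstone ratio
    'tombstone_compaction_interval': 86400 -- seconds between attempts
  };
```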
- CASSANDRA-5228: Drop entire sstables when all columns are expired (Cassandra 2.0 b1)
  a separate compaction strategy that doesn't bother merging sstables, just throws out expired ones
<= The playlist service =>
= Tombstone hell =
Key points:
They expected tombstones to be deleted after 30 days, but all tombstones from the past 1.5 years were still there.
Rows existed in 4+ SSTables, so tombstones were never deleted in minor compactions.
Solution 1: run a major compaction.
Solution 2: run repairs Monday through Friday, and a major compaction on Saturday and Sunday.
-> Don't use Cassandra to store queues
= Cassandra counters =
Distributed counters > works pretty well
Counters were added in Cassandra 0.8
http://www.datastax.com/dev/blog/whats-new-in-cassandra-0-8-part-2-counters
create a column family with default_validation_class=CounterColumnType
Java client (Astyanax): keyspace.prepareColumnMutation(CF_COUNTER1, rowKey, "CounterColumn1").incrementCounterColumn(1).execute();
cql: UPDATE counters SET c1 = c1 + 3, c2 = c2 - 4 WHERE key = 'row2';
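For context, the CQL statement above assumes a counter table shaped roughly like this (a sketch matching the column names in the example):

```sql
-- All non-key columns must be counters; they cannot mix with regular columns
CREATE TABLE counters (
  key text PRIMARY KEY,
  c1 counter,
  c2 counter
);
```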
- For each write, only one of the replicas has to perform a read, even with many replicas.
-- the read is part of the write; the client does not observe it.
- If an SSTable or disk is corrupted, the counter column family must be rebuilt (it cannot be repaired).
- Counter columns and non-counter columns cannot coexist in the same column family https://issues.apache.org/jira/browse/CASSANDRA-2614
- No TTL for counter columns.
- Removing a counter comes with caveats: if you just want to reset a counter, a quick incr-delete-incr sequence may cause the delete to be skipped; a safer approach is to subtract the current value.
Deletes are best reserved for permanently removing a counter.
- If a counter write times out, the client cannot tell whether the write actually succeeded (it cannot simply retry, because a duplicate write would cause double counting).
Counter++ (Cassandra 2.1 b1)
https://issues.apache.org/jira/browse/CASSANDRA-6504