C* Summit 2013: How Not to Use Cassandra by Axel Liljencrantz
http://www.slideshare.net/planetcassandra/8-axel-liljencrantz-23204252
https://www.youtube.com/watch?v=0u-EKJBPrj8
<= How not to delete data =>
Key points:
Tombstones can only be dropped once all the non-tombstone values they shadow have been deleted.
A tombstone can only be dropped if all SSTables containing values for that row participate in the same compaction.
So for wide rows, minor compactions can almost never reclaim tombstones.
Additional notes:
Size tiered compaction
The default compaction strategy: it takes N SSTables of roughly equal size (4 by default) and merges them into a single new, larger SSTable.
Because of this, the larger an SSTable gets, the less likely it is to be compacted.
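As a sketch, the merge threshold described above is tunable per table in CQL; the table name `playlists` here is hypothetical, and `min_threshold` corresponds to the N SSTables (default 4) merged at a time:

```sql
-- Hypothetical table name; min_threshold is the N (default 4) noted above
ALTER TABLE playlists
  WITH compaction = {
    'class': 'SizeTieredCompactionStrategy',
    'min_threshold': 4
  };
```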
Leveled Compaction
SSTables are organized into levels whose sizes grow geometrically, 10x per level by default, e.g. L0 5MB, L1 50MB, L2 500MB.
L0 is 5MB by default.
When L0 reaches 5MB, it is compacted and the overflow is merged into L1.
When L1 exceeds 50MB, it is compacted and the part above 50MB is merged into L2, and so on.
In theory most rows live in a single SSTable (theoretically 90%, but in practice it may be only 50-80%).
Compared to size-tiered compaction, it is better suited when:
- low read latency is required
- the workload is read-heavy and write-light
- rows are wide or frequently updated
It is not suited when:
- the machines cannot keep up on I/O
- the workload is write-heavy and read-light
- data is never updated after being written
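To opt into leveled compaction as described above, it is set per table; the table name `playlists` is hypothetical, and `sstable_size_in_mb` matches the 5MB default from the notes:

```sql
-- Hypothetical table; sstable_size_in_mb is the 5MB unit mentioned above
ALTER TABLE playlists
  WITH compaction = {
    'class': 'LeveledCompactionStrategy',
    'sstable_size_in_mb': 5
  };
```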
= TTL:ed data =
Key points:
Overwritten data could theoretically bounce back.
If TTLed data overwrites another column's value, then when the TTLed data expires, the old value will reappear if it has not yet been removed by compaction.
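A minimal CQL sketch of this resurrection scenario (the table and column names are hypothetical):

```sql
-- Older, non-TTL write; ends up sitting in an old SSTable
INSERT INTO users (key, status) VALUES ('row1', 'online');

-- Later overwrite with a one-hour TTL; it shadows the old value
UPDATE users USING TTL 3600 SET status = 'away' WHERE key = 'row1';

-- Reads return 'away' for the next hour. After the TTL expires and the
-- expired cell is purged, 'online' can reappear until a compaction has
-- merged both versions away.
```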
Additional notes:
- TTLed data and compaction
- CASSANDRA-3442: TTL histogram for sstable metadata (for size-tiered compaction)
- CASSANDRA-4234: Add tombstone-removal compaction to LCS (Cassandra 1.2.0 b1)
Since Cassandra 1.2, Cassandra tracks the droppable time of tombstones for all TTLed/deleted columns and performs a standalone compaction on any SSTable whose ratio of droppable tombstones to total columns exceeds a threshold. The threshold defaults to 20% (0.2), and you can configure it by providing the compaction parameter tombstone_threshold when creating the column family.
The histogram looks like this: [figure not captured in these notes]
table options:
- tombstone_compaction_interval
- tombstone_threshold
Garbage-collectable means the data is at least past gc_grace.
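The two table options above can be set through the compaction map; the values shown are the defaults mentioned in the notes (0.2 ratio, one-day interval), and the table name is hypothetical:

```sql
ALTER TABLE playlists
  WITH compaction = {
    'class': 'SizeTieredCompactionStrategy',
    'tombstone_threshold': 0.2,            -- droppable-tombstone ratio
    'tombstone_compaction_interval': 86400 -- seconds between attempts
  };
```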
- CASSANDRA-5228: Drop entire sstables when all columns are expired (Cassandra 2.0 b1)
  a separate compaction strategy that doesn't bother merging sstables, just throws out expired ones
<= The playlist service =>
= Tombstone hell =
Key points:
They expected tombstones to be deleted after 30 days, but all tombstones from the past 1.5 years were still there.
Rows existed in 4+ SSTables, so tombstones were never deleted in minor compactions.
Solution 1: run a major compaction.
Solution 2: run repairs Monday through Friday, and a major compaction on Saturday and Sunday.
-> Don't use Cassandra to store queues
= Cassandra counters =
Distributed counters > works pretty well
Counters were added in Cassandra 0.8
http://www.datastax.com/dev/blog/whats-new-in-cassandra-0-8-part-2-counters
create a column family with default_validation_class=CounterColumnType
Java client (Astyanax): keyspace.prepareColumnMutation(CF_COUNTER1, rowKey, "CounterColumn1").incrementCounterColumn(1).execute();
cql: UPDATE counters SET c1 = c1 + 3, c2 = c2 - 4 WHERE key = 'row2';
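For context, the CQL statement above assumes a counter table shaped roughly like this (a sketch matching the column names in the example):

```sql
-- All non-key columns must be counters; they cannot mix with regular columns
CREATE TABLE counters (
  key text PRIMARY KEY,
  c1 counter,
  c2 counter
);
```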
- For each write, only one of the replicas has to perform a read, even with many replicas.
-- the read is part of the write; the client does not observe it.
- If an SSTable or disk is corrupted, the counter column family must be rebuilt (it cannot be repaired).
- Counter columns and non-counter columns cannot coexist in the same column family https://issues.apache.org/jira/browse/CASSANDRA-2614
- No TTL for counter columns.
- Removing a counter comes with caveats: if you just want to reset a counter, a quick incr-delete-incr sequence may cause the delete to be skipped; a safer approach is to subtract the current value.
Deletes are best reserved for permanently removing a counter.
- If a counter write times out, the client cannot tell whether the write actually succeeded (it cannot simply retry, because a duplicate write would cause double counting).
Counter++ (Cassandra 2.1 b1)
https://issues.apache.org/jira/browse/CASSANDRA-6504