We encountered an exception when using the same index name in different keyspaces. This capability is not available until Cassandra 3.0, but we are running 1.2.
Cassandra 3.0
https://github.com/bitus-dt/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/config/CFMetaData.java#L877
https://github.com/bitus-dt/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/schema/KeyspaceMetadata.java#L110
Cassandra 1.2
https://github.com/bitus-dt/cassandra/blob/cassandra-1.2/src/java/org/apache/cassandra/config/CFMetaData.java#L1054
To use the same index name in different keyspaces without upgrading Cassandra, we forked the project and backported this feature to our own branch.
Branch: https://github.com/bitus-dt/cassandra/tree/cassandra-1.2
Commit: https://github.com/bitus-dt/cassandra/commit/85487ad55c253ac43d4980cf30b9cfeac256af0f
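For reference, a minimal sketch of the scenario that triggers the exception; the keyspace, table, and index names here are made up for illustration. On stock 1.2 the second CREATE INDEX is the statement that fails, while the patched branch (and 3.0+) accepts it:
$ ./cqlsh localhost
> CREATE KEYSPACE ks1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
> CREATE KEYSPACE ks2 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
> USE ks1;
> CREATE TABLE users (id uuid PRIMARY KEY, email text);
> CREATE INDEX email_idx ON users (email);   -- succeeds
> USE ks2;
> CREATE TABLE users (id uuid PRIMARY KEY, email text);
> CREATE INDEX email_idx ON users (email);   -- rejected by unpatched 1.2, accepted after the patch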
2015-11-01
Add thread pool reject counter to metrics-3.1.2 branch
Metrics supports a thread pool reject counter in 4.0, which is still under development (master). We are using 3.1.2 and want this feature, so we forked the project and backported it.
Branch: https://github.com/bitus-dt/metrics/tree/3.1.2-branch
Commit: https://github.com/bitus-dt/metrics/commit/d0c2c0d2e8afd691de82a51701dcd4d793a9761a
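To illustrate the idea only (this is a sketch, not the code in the commit above): a reject counter boils down to registering a Counter and bumping it from the pool's RejectedExecutionHandler, roughly like this:

import com.codahale.metrics.Counter;
import com.codahale.metrics.MetricRegistry;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RejectCounterSketch {
    public static void main(String[] args) {
        final MetricRegistry registry = new MetricRegistry();
        final Counter rejected = registry.counter("pool.rejected");   // hypothetical metric name

        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<Runnable>(1));

        // Wrap the existing handler so the original rejection policy is preserved.
        final RejectedExecutionHandler delegate = pool.getRejectedExecutionHandler();
        pool.setRejectedExecutionHandler(new RejectedExecutionHandler() {
            public void rejectedExecution(Runnable task, ThreadPoolExecutor executor) {
                rejected.inc();                               // count the rejection
                delegate.rejectedExecution(task, executor);   // then apply the original policy (throws)
            }
        });

        // Overload the pool (capacity: 1 running + 1 queued) to provoke rejections.
        for (int i = 0; i < 5; i++) {
            try {
                pool.execute(new Runnable() {
                    public void run() {
                        try { Thread.sleep(100); } catch (InterruptedException ignored) { }
                    }
                });
            } catch (RejectedExecutionException expected) {
                // the overflow tasks end up here
            }
        }
        System.out.println("rejected = " + rejected.getCount());
        pool.shutdown();
    }
}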
2015-07-25
Karaf + Log4j2
First of all, Log4j2 support in Karaf is provided by Pax Logging, which requires pax-logging-api 1.8.x.
Dependencies
- Apache Karaf 2.4.3 (the version for the example below)
- pax-logging-log4j2-1.8.3 (http://search.maven.org/#search|ga|1|pax-logging-log4j2)
- jackson-annotations-2.4.6
- jackson-core-2.4.6
- jackson-databind-2.4.6.1
- jackson-dataformat-yaml-2.4.6 (if you choose YAML as your Log4j2 configuration format, which I prefer because it is much more human-readable than XML or JSON)
- disruptor-3.3.2 (if using asynchronous loggers)
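The extra jars must be reachable under Karaf's system repository at the same Maven-style paths used in startup.properties below. A rough sketch of putting them in place (the download location is just an example):
$ cd $KARAF_HOME
$ mkdir -p system/com/fasterxml/jackson/core/jackson-annotations/2.4.6
$ cp ~/Downloads/jackson-annotations-2.4.6.jar system/com/fasterxml/jackson/core/jackson-annotations/2.4.6/
$ # ...repeat for jackson-core, jackson-databind, jackson-dataformat-yaml and disruptor,
$ #    mirroring the groupId/artifactId/version layout shown in startup.properties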
startup.properties
#org/ops4j/pax/logging/pax-logging-service/1.8.3/pax-logging-service-1.8.3.jar=8
org/ops4j/pax/logging/pax-logging-log4j2/1.8.3/pax-logging-log4j2-1.8.3.jar=8
# Add other libraries you need below
com/fasterxml/jackson/core/jackson-annotations/2.4.6/jackson-annotations-2.4.6.jar=35
com/fasterxml/jackson/core/jackson-core/2.4.6/jackson-core-2.4.6.jar=35
com/fasterxml/jackson/core/jackson-databind/2.4.6.1/jackson-databind-2.4.6.1.jar=35
com/fasterxml/jackson/dataformat/jackson-dataformat-yaml/2.4.6/jackson-dataformat-yaml-2.4.6.jar=35
com/lmax/disruptor/3.3.2/disruptor-3.3.2.jar=35
org.ops4j.pax.logging.cfg
org.ops4j.pax.logging.log4j2.config.file = ${karaf.etc}/log4j2.yaml
# Use asynchronous loggers for all loggers
#org.ops4j.pax.logging.log4j2.async = false
log4j2.yaml
Create your log4j2.yaml in ${karaf.etc}; you can then configure Log4j2 in it as you would in a normal Java application.
Also migrate the old Log4j setting log4j.rootLogger = INFO, osgi:VmLogAppender to its Log4j2 equivalent.
appenders:
  PaxOsgi:
    name: paxosgi
    filter: VmLogAppender
loggers:
  root:
    level: INFO
    AppenderRef:
      - ref: paxosgi
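If you want more than the VM log buffer, the snippet above can be extended with additional appenders. A hedged sketch (the RollingFile keys mirror Log4j2's XML element names, and the path, pattern, and size values are placeholders; check them against the Log4j2 and Pax Logging documentation for your versions):
appenders:
  PaxOsgi:
    name: paxosgi
    filter: VmLogAppender
  RollingFile:
    name: file
    fileName: ${sys:karaf.data}/log/karaf.log
    filePattern: ${sys:karaf.data}/log/karaf-%i.log.gz
    PatternLayout:
      pattern: "%d{ISO8601} | %-5.5p | %-16.16t | %c | %m%n"
    Policies:
      SizeBasedTriggeringPolicy:
        size: 8MB
loggers:
  root:
    level: INFO
    AppenderRef:
      - ref: paxosgi
      - ref: file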
2015-01-21
Multiple Cassandra Nodes in Different Data Centers
This walkthrough was done with Cassandra 1.0.12.
Token assignment
Since Cassandra 1.0.12 does not ship with a token generator, it is recommended to download https://raw.github.com/riptano/ComboAMI/2.2/tokentoolv2.py and generate a token assignment table like the one below (a sketch of running the script follows the table):
Node | hostname | IP Address | Token | Data Center | Rack |
---|---|---|---|---|---|
node0 | clouddb1.gc.net | 172.16.70.32 | 0 | DC1 | RAC1 |
node1 | clouddb2.gc.net | 172.16.70.41 | 56713727820156410577229101238628035242 | DC1 | RAC1 |
node2 | clouddb3.gc.net | 172.16.70.42 | 113427455640312821154458202477256070485 | DC1 | RAC1 |
node3 | clouddb4.gc.net | 172.16.70.43 | 28356863910078205288614550619314017621 | DC2 | RAC1 |
node4 | clouddb5.gc.net | 172.16.70.44 | 85070591730234615865843651857942052863 | DC2 | RAC1 |
node5 | clouddb6.gc.net | 172.16.70.45 | 141784319550391026443072753096570088106 | DC2 | RAC1 |
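For reference, a hedged sketch of producing such a table with the script above. As I recall it takes the node count of each data center as arguments and prints the tokens per DC, but verify this against the script's own usage output:
$ wget https://raw.github.com/riptano/ComboAMI/2.2/tokentoolv2.py
$ python tokentoolv2.py 3 3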
Edit cassandra.yaml
Following the token assignment table, fill in each node's token and hostname as initial_token and listen_address, for example:
On node0:
initial_token: 0
listen_address: clouddb1.gc.net
On node1:
initial_token: 56713727820156410577229101238628035242
listen_address: clouddb2.gc.net
And so on for the remaining nodes.
Snitch
The snitch is used to describe the cluster topology; the goal is to keep a single node failure from affecting all replicas.
The environment is divided into data centers and racks; the test environment in this article has DC1 and DC2, and every node is on the first rack (RAC1).
The original comment in cassandra.yaml reads:
# Set this to a class that implements
# IEndpointSnitch. The snitch has two functions:
# - it teaches Cassandra enough about your network topology to route
# requests efficiently
# - it allows Cassandra to spread replicas around your cluster to avoid
# correlated failures. It does this by grouping machines into
# "datacenters" and "racks." Cassandra will do its best not to have
# more than one replica on the same "rack" (which may not actually
# be a physical location)
Cassandra provides several kinds of snitch (we use PropertyFileSnitch here; the configuration follows the list below):
- SimpleSnitch
Treats Strategy order as proximity. This improves cache
locality when disabling read repair, which can further improve
throughput. Only appropriate for single-datacenter deployments.
- PropertyFileSnitch
Proximity is determined by rack and data center, which are explicitly configured in cassandra-topology.properties.
- RackInferringSnitch
Proximity is determined by rack and data center, which are
assumed to correspond to the 3rd and 2nd octet of each node's IP
address, respectively. Unless this happens to match your deployment
conventions (as it did Facebook's), this is best used as an example of
writing a custom Snitch class.
- Ec2Snitch
Appropriate for EC2 deployments in a single Region. Loads
Region and Availability Zone information from the EC2 API. The Region
is treated as the Datacenter, and the Availability Zone as the rack.
Only private IPs are used, so this will not work across multiple
Regions.
- Ec2MultiRegionSnitch
Uses public IPs as broadcast_address to allow cross-region
connectivity. (Thus, you should set seed addresses to the public IP
as well.) You will need to open the storage_port or ssl_storage_port on
the public IP firewall. (For intra-Region traffic, Cassandra will
switch to the private IP after establishing a connection.)
In node0~node5's conf/cassandra.yaml:
endpoint_snitch: PropertyFileSnitch
Also edit conf/cassandra-topology.properties to assign each node's data center and rack:
172.16.70.32=DC1:RAC1
172.16.70.41=DC1:RAC1
172.16.70.42=DC1:RAC1
172.16.70.43=DC2:RAC1
172.16.70.44=DC2:RAC1
172.16.70.45=DC2:RAC1
default=DC1:r1
Seed Node
Designate node0, node1, and node2 as seed nodes.
In node0~node5's conf/cassandra.yaml:
seed_provider:
- class_name: org.apache.cassandra.locator.SimpleSeedProvider
parameters:
- seeds: "clouddb1.gc.ubicloud.net,clouddb4.gc.ubicloud.net"
Start the services
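On each node, roughly the following (paths are relative to the Cassandra installation directory; add -f to keep the process in the foreground):
$ bin/cassandra
$ bin/nodetool -h localhost ring    # repeat until every node shows Up / Normal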
================================================
Postscript
Create a sample keyspace:
$ ./cqlsh localhost
> CREATE KEYSPACE sample WITH strategy_class = 'NetworkTopologyStrategy' AND strategy_options:DC1 = '3' and strategy_options:DC2 = '3';
The resulting ring:
$ ./nodetool -h self ring
Address DC Rack Status State Load Owns Token
169417178424467235000914166253263322299
node0 172.16.70.32 DC1 RAC1 Up Normal 93.18 KB 0.43% 0
node4 172.16.70.44 DC2 RAC1 Up Normal 74.67 KB 32.91% 55989722784154413846455963776007251813
node1 172.16.70.41 DC1 RAC1 Up Normal 97.89 KB 0.43% 56713727820156410577229101238628035242
node5 172.16.70.45 DC2 RAC1 Up Normal 81.01 KB 32.91% 112703450604310824423685065014635287055
node2 172.16.70.42 DC1 RAC1 Up Normal 97.66 KB 0.43% 113427455640312821154458202477256070484
node3 172.16.70.43 DC2 RAC1 Up Normal 81.01 KB 32.91% 169417178424467235000914166253263322299
$ ./nodetool -h self describering sample
TokenRange:
TokenRange(start_token:55989722784154413846455963776007251813, end_token:56713727820156410577229101238628035242, endpoints:[172.16.70.45, 172.16.70.43, 172.16.70.44, 172.16.70.41, 172.16.70.42, 172.16.70.32], rpc_endpoints:[0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0], endpoint_details:[EndpointDetails(host:172.16.70.45, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.43, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.44, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.41, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.42, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.32, datacenter:DC1, rack:RAC1)])
TokenRange(start_token:113427455640312821154458202477256070484, end_token:169417178424467235000914166253263322299, endpoints:[172.16.70.43, 172.16.70.44, 172.16.70.45, 172.16.70.32, 172.16.70.41, 172.16.70.42], rpc_endpoints:[0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0], endpoint_details:[EndpointDetails(host:172.16.70.43, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.44, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.45, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.32, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.41, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.42, datacenter:DC1, rack:RAC1)])
TokenRange(start_token:169417178424467235000914166253263322299, end_token:0, endpoints:[172.16.70.44, 172.16.70.45, 172.16.70.43, 172.16.70.32, 172.16.70.41, 172.16.70.42], rpc_endpoints:[0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0], endpoint_details:[EndpointDetails(host:172.16.70.44, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.45, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.43, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.32, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.41, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.42, datacenter:DC1, rack:RAC1)])
TokenRange(start_token:56713727820156410577229101238628035242, end_token:112703450604310824423685065014635287055, endpoints:[172.16.70.45, 172.16.70.43, 172.16.70.44, 172.16.70.42, 172.16.70.32, 172.16.70.41], rpc_endpoints:[0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0], endpoint_details:[EndpointDetails(host:172.16.70.45, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.43, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.44, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.42, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.32, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.41, datacenter:DC1, rack:RAC1)])
TokenRange(start_token:112703450604310824423685065014635287055, end_token:113427455640312821154458202477256070484, endpoints:[172.16.70.43, 172.16.70.44, 172.16.70.45, 172.16.70.42, 172.16.70.32, 172.16.70.41], rpc_endpoints:[0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0], endpoint_details:[EndpointDetails(host:172.16.70.43, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.44, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.45, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.42, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.32, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.41, datacenter:DC1, rack:RAC1)])
TokenRange(start_token:0, end_token:55989722784154413846455963776007251813, endpoints:[172.16.70.44, 172.16.70.45, 172.16.70.43, 172.16.70.41, 172.16.70.42, 172.16.70.32], rpc_endpoints:[0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0, 0.0.0.0], endpoint_details:[EndpointDetails(host:172.16.70.44, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.45, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.43, datacenter:DC2, rack:RAC1), EndpointDetails(host:172.16.70.41, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.42, datacenter:DC1, rack:RAC1), EndpointDetails(host:172.16.70.32, datacenter:DC1, rack:RAC1)])
The describering output shows that the ring is arranged as follows:
4 -> 1 -> 5 -> 2 -> 3 -> 0 -> 4
2015-01-14
About Cassandra compaction
Below are some notes from my recent look into compaction; corrections are welcome.
What does compaction do?
Compaction is the act of merging several SSTables into a single SSTable; while merging, it merges keys, combines columns, discards tombstones, and creates a new index in the merged SSTable.
Types of compaction
Compaction is generally divided into two kinds: minor compaction and major compaction.
Major compaction
When a major compaction runs on a column family, all of that column family's SSTables are merged into one.
Minor compaction
A minor compaction is a compaction that Cassandra decides to run at runtime based on the compaction strategy; this kind of compaction usually processes only a subset of the SSTables at a time.
Are major compaction and minor compaction really defined this way?
Cassandra itself does not seem to define minor compaction very precisely. The DataStax nodetool compact documentation, however, states: "You can specify a keyspace for compaction. If you do not specify a keyspace, the nodetool command uses the current keyspace. You can specify one or more tables for compaction. If you do not specify a table(s), compaction of all tables in the keyspace occurs. This is called a major compaction. If you do specify a table(s), compaction of the specified table(s) occurs. This is called a minor compaction." That description is questionable: tracing the source code (versions 0.8 through 2.1) shows that nodetool compact ultimately calls ColumnFamilyStore.forceMajorCompaction() whether or not a column family is specified, and in practice the column family's SSTables end up merged into one either way. This document therefore distinguishes major from minor compaction by behavior, which holds at least up to version 2.1 (later versions may differ).
Does Cassandra need a major compaction to purge tombstones?
In early Cassandra (before 0.6) that was indeed the case, which is why nodetool compact had to be run periodically; since CASSANDRA-1074, minor compactions can purge tombstones as well.
How does compaction happen?
Compaction happens automatically (so-called autoCompaction) and can also be forced manually.
AutoCompaction
Whether it triggers is decided by the column family's compaction options; essentially, autoCompaction only triggers minor compactions. For the available compaction options, refer to the column family configuration documentation for your Cassandra version.
Force compaction
A forced compaction can be triggered via the nodetool compact command or via JMX; a forced compaction runs a major compaction.
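For example, with nodetool (keyspace and column family names are placeholders):
$ ./nodetool -h localhost compact mykeyspace          # major-compacts every column family in the keyspace
$ ./nodetool -h localhost compact mykeyspace mycf     # major-compacts only the given column family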
After running a manual compaction, does autoCompaction stop happening?
No. Unless compaction is deliberately turned off, auto compaction keeps running. This question is asked a lot online, mainly because of a statement in a few older DataStax tuning documents (0.8, 1.0): "once you run a major compaction, automatic minor compactions are no longer triggered frequently forcing you to manually run major compactions on a routine basis. So while read performance will be good immediately following a major compaction, it will continually degrade until the next major compaction is manually invoked." What that passage really means is that after a manual major compaction the column family's SSTables are merged into one relatively large file. SizeTieredCompactionStrategy only triggers a minor compaction when several SSTables of similar size exist, so under that strategy the large SSTable may rarely qualify for minor compaction again, leaving another major compaction as the operator's only way to compact it. (Possibly because this wording caused so much confusion, DataStax revised the description in the 1.1 documentation.) (Since Cassandra 1.2, the sstablesplit command can split such a large SSTable into several smaller files.)
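A hedged sketch of the sstablesplit usage mentioned above (run it with Cassandra stopped on that node; the size option and data file path below are from memory and purely illustrative, so check sstablesplit --help):
$ bin/sstablesplit -s 50 /var/lib/cassandra/data/mykeyspace/mycf/mykeyspace-mycf-ic-1-Data.db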
Does major compaction only apply to SizeTieredCompactionStrategy and DateTieredCompactionStrategy?
Yes. Major compaction only acts on column families using these two strategies; running a major compaction against a LeveledCompactionStrategy column family has no effect.
How do I turn off autoCompaction?
- The column family JMX method disableAutoCompaction() can temporarily turn off autoCompaction at runtime, but there is no corresponding enable method, so the only way to turn it back on is to restart the node.
- In Cassandra 1.x, setting both min_compaction_threshold and max_compaction_threshold of SizeTieredCompactionStrategy to 0 stops autoCompaction for that column family.
- In 2.x, a column family's compaction options include an enabled flag that controls whether autoCompaction runs (see the sketches after this list).
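Hedged sketches of the last two options (option names as I remember them for those versions; keyspace and column family names are placeholders, so double-check against the documentation for your release):
Cassandra 1.x, via cassandra-cli:
[default@mykeyspace] update column family mycf with min_compaction_threshold = 0 and max_compaction_threshold = 0;
Cassandra 2.x, via cqlsh:
cqlsh> ALTER TABLE mykeyspace.mycf WITH compaction = { 'class': 'SizeTieredCompactionStrategy', 'enabled': 'false' };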