Elasticsearch Puppet Module
Table of Contents
- Module description - What the module does and why it is useful
- Setup - The basics of getting started with Elasticsearch
- Usage - Configuration options and additional functionality
- Advanced features - Extra information on advanced usage
- Reference - An under-the-hood peek at what the module is doing and how
- Limitations - OS compatibility, etc.
- Development - Guide for contributing to the module
- Support - When you need help with this module
Module description
This module sets up Elasticsearch instances with additional resources for plugins, templates, and more.
This module is actively tested against Elasticsearch 2.x and 5.x.
Setup
The module manages the following:
- Elasticsearch repository files.
- Elasticsearch package.
- Elasticsearch configuration file.
- Elasticsearch service.
- Elasticsearch plugins.
- Elasticsearch templates.
- Elasticsearch ingest pipelines.
- Elasticsearch index settings.
- Elasticsearch Shield/X-Pack users, roles, and certificates.
- Elasticsearch keystores.
Requirements
- The stdlib Puppet library.
- puppet/yum for yum version locking.
- richardc/datacat
- Augeas
- puppetlabs-java for Java installation (optional).
- puppetlabs-java_ks for Shield/X-Pack certificate management (optional).
Repository management
When using the repository management, the following module dependencies are required:
- Debian/Ubuntu: Puppetlabs/apt
- OpenSuSE/SLES: Darin/zypprepo
Beginning with Elasticsearch
Declare the top-level elasticsearch class (managing repositories) and set up an instance:
class { 'elasticsearch':
java_install => true,
manage_repo => true,
repo_version => '5.x',
}
elasticsearch::instance { 'es-01': }
Note: Elasticsearch 5.x requires a recent version of the JVM.
If you are on a recent version of your distribution of choice (such as Ubuntu 16.04 or CentOS 7), setting java_install => true will work out-of-the-box.
If you are on an earlier distribution, you may need to take additional measures to install Java 1.8.
Usage
Main class
Most top-level parameters in the elasticsearch class are set to reasonable defaults.
The following are some parameters that may be useful to override:
Install a specific version
class { 'elasticsearch':
version => '1.4.2'
}
Note: This will only work when using the repository.
Automatically restarting the service (default set to false)
By default, the module will not restart Elasticsearch when the configuration file, package, or plugins change. This can be overridden globally with the following option:
class { 'elasticsearch':
restart_on_change => true
}
Or controlled with the more granular options: restart_config_change, restart_package_change, and restart_plugin_change.
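As a minimal sketch of the granular options, the following restarts the service only when the configuration file changes. The parameter names come from the text above; the chosen values are illustrative, not recommended defaults.

```puppet
# Sketch: restart on configuration changes only (values are illustrative).
class { 'elasticsearch':
  restart_config_change  => true,
  restart_package_change => false,
  restart_plugin_change  => false,
}
```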
Automatic upgrades (default set to false)
class { 'elasticsearch':
autoupgrade => true
}
Removal/Decommissioning
class { 'elasticsearch':
ensure => 'absent'
}
Install everything but disable service(s) afterwards
class { 'elasticsearch':
status => 'disabled'
}
API Settings
Some resources, such as elasticsearch::template, require communicating with the Elasticsearch REST API.
By default, these API settings are set to:
class { 'elasticsearch':
api_protocol => 'http',
api_host => 'localhost',
api_port => 9200,
api_timeout => 10,
api_basic_auth_username => undef,
api_basic_auth_password => undef,
api_ca_file => undef,
api_ca_path => undef,
validate_tls => true,
}
Each of these can be set at the top-level elasticsearch class and inherited for each resource or overridden on a per-resource basis.
Dynamically Created Resources
This module supports managing all of its defined types through top-level parameters to better support Hiera and Puppet Enterprise.
For example, to manage an instance and index template directly from the elasticsearch class:
class { 'elasticsearch':
instances => {
'es-01' => {
'config' => {
'network.host' => '0.0.0.0'
}
}
},
templates => {
'logstash' => {
'content' => {
'template' => 'logstash-*',
'settings' => {
'number_of_replicas' => 0
}
}
}
}
}
Instances
This module works with the concept of instances. For the service to start, you need to specify at least one instance.
Quick setup
elasticsearch::instance { 'es-01': }
This will set up its own data directory and set the node name to $hostname-$instance_name.
Advanced options
Instance specific options can be given:
elasticsearch::instance { 'es-01':
config => { }, # Configuration hash
init_defaults => { }, # Init defaults hash
datadir => [ ], # Data directory
}
See Advanced features for more information.
Plugins
This module can help manage a variety of plugins.
Note that module_dir is where the plugin will install itself to and must match that published by the plugin author; it is not where you would like to install it yourself.
From an official repository
elasticsearch::plugin { 'lmenezes/elasticsearch-kopf':
instances => 'instance_name'
}
From a custom url
elasticsearch::plugin { 'jetty':
url => 'https://oss-es-plugins.s3.amazonaws.com/elasticsearch-jetty/elasticsearch-jetty-1.2.1.zip',
instances => 'instance_name'
}
Using a proxy
You can also use a proxy if required by setting the proxy_host and proxy_port options:
elasticsearch::plugin { 'lmenezes/elasticsearch-kopf':
instances => 'instance_name',
proxy_host => 'proxy.host.com',
proxy_port => 3128
}
Proxies that require usernames and passwords are similarly supported with the proxy_username and proxy_password parameters.
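A minimal sketch of an authenticated proxy, reusing the plugin from the example above. The credentials are placeholders and are not part of the module.

```puppet
# Sketch: plugin download through a proxy that requires authentication.
elasticsearch::plugin { 'lmenezes/elasticsearch-kopf':
  instances      => 'instance_name',
  proxy_host     => 'proxy.host.com',
  proxy_port     => 3128,
  proxy_username => 'proxyuser',     # placeholder credential
  proxy_password => 'proxypassword', # placeholder credential
}
```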
Plugin name formats that are supported include:
- elasticsearch/plugin/version (for official elasticsearch plugins downloaded from download.elastic.co)
- groupId/artifactId/version (for community plugins downloaded from maven central or OSS Sonatype)
- username/repository (for site plugins downloaded from github master)
Upgrading plugins
When you specify a certain plugin version, you can upgrade that plugin by specifying the new version.
elasticsearch::plugin { 'elasticsearch/elasticsearch-cloud-aws/2.1.1': }
And to upgrade, you would simply change it to
elasticsearch::plugin { 'elasticsearch/elasticsearch-cloud-aws/2.4.1': }
Please note that this does not work when you specify 'latest' as a version number.
ES 2.x official plugins
For the Elasticsearch commercial plugins, you can refer to them by their simple name.
See Plugin installation for more details.
Scripts
Installs scripts to be used by Elasticsearch. These scripts are shared across all defined instances on the same host.
elasticsearch::script { 'myscript':
ensure => 'present',
source => 'puppet:///path/to/my/script.groovy'
}
Script directories can also be recursively managed for large collections of scripts:
elasticsearch::script { 'myscripts_dir':
ensure => 'directory',
source => 'puppet:///path/to/myscripts_dir',
recurse => 'remote',
}
Templates
By default, templates use the top-level elasticsearch::api_* settings to communicate with Elasticsearch.
The following is an example of how to override these settings:
elasticsearch::template { 'templatename':
api_protocol => 'https',
api_host => $::ipaddress,
api_port => 9201,
api_timeout => 60,
api_basic_auth_username => 'admin',
api_basic_auth_password => 'adminpassword',
api_ca_file => '/etc/ssl/certs',
api_ca_path => '/etc/pki/certs',
validate_tls => false,
source => 'puppet:///path/to/template.json',
}
Add a new template using a file
This will install and/or replace the template in Elasticsearch:
elasticsearch::template { 'templatename':
source => 'puppet:///path/to/template.json',
}
Add a new template using content
This will install and/or replace the template in Elasticsearch:
elasticsearch::template { 'templatename':
content => {
'template' => "*",
'settings' => {
'number_of_replicas' => 0
}
}
}
Plain JSON strings are also supported.
elasticsearch::template { 'templatename':
content => '{"template":"*","settings":{"number_of_replicas":0}}'
}
Delete a template
elasticsearch::template { 'templatename':
ensure => 'absent'
}
Ingestion Pipelines
Pipelines behave similarly to templates in that their contents can be controlled over the Elasticsearch REST API with a custom Puppet resource.
API parameters follow the same rules as templates (those settings can either be controlled at the top-level in the elasticsearch class or set per-resource).
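As a sketch of a per-resource override, the following assumes that elasticsearch::pipeline accepts the same api_* parameter names shown for elasticsearch::template; the port and protocol values are illustrative.

```puppet
# Sketch: override API settings for one pipeline only.
# Assumes the api_* parameter names match those of elasticsearch::template.
elasticsearch::pipeline { 'addfoo':
  api_protocol => 'https',
  api_port     => 9201,
  content      => {
    'description' => 'Add the foo field',
    'processors'  => [{
      'set' => {
        'field' => 'foo',
        'value' => 'bar',
      },
    }],
  },
}
```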
Adding a new pipeline
This will install and/or replace an ingestion pipeline in Elasticsearch (ingestion settings are compared against the present configuration):
elasticsearch::pipeline { 'addfoo':
content => {
'description' => 'Add the foo field',
'processors' => [{
'set' => {
'field' => 'foo',
'value' => 'bar'
}
}]
}
}
Delete a pipeline
elasticsearch::pipeline { 'addfoo':
ensure => 'absent'
}
Index Settings
This module includes basic support for ensuring an index is present or absent with optional index settings. API access settings follow the pattern previously mentioned for templates.
Creating an index
At the time of this writing, only index settings are supported.
Note that some settings (such as number_of_shards) can only be set at index creation time.
elasticsearch::index { 'foo':
settings => {
'index' => {
'number_of_replicas' => 0
}
}
}
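Because number_of_shards can only be set at creation time, it has to be declared when the index is first created. A minimal sketch, with a hypothetical index name and an illustrative shard count:

```puppet
# Sketch: set creation-time and dynamic settings together.
# 'bar' and the shard count are illustrative values.
elasticsearch::index { 'bar':
  settings => {
    'index' => {
      'number_of_shards'   => 3,
      'number_of_replicas' => 0,
    },
  },
}
```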
Delete an index
elasticsearch::index { 'foo':
ensure => 'absent'
}
Bindings/Clients
Install a variety of clients/bindings:
Python
elasticsearch::python { 'rawes': }
Ruby
elasticsearch::ruby { 'elasticsearch': }
Connection Validator
This module offers a way to make sure an instance has been started and is up and running before performing the next action. This is done with the es_instance_conn_validator resource.
es_instance_conn_validator { 'myinstance' :
server => 'es.example.com',
port => '9200',
}
A common use would be, for example:
class { 'kibana4' :
require => Es_Instance_Conn_Validator['myinstance'],
}
Package installation
There are two different ways of installing Elasticsearch:
Repository
This option allows you to use an existing repository for package installation.
The repo_version corresponds to the major.minor version of Elasticsearch for versions before 2.x.
class { 'elasticsearch':
manage_repo => true,
repo_version => '1.4',
}
For 2.x versions of Elasticsearch, use repo_version => '2.x'.
class { 'elasticsearch':
manage_repo => true,
repo_version => '2.x',
}
For users who may wish to install via a local repository (for example, through a mirror), the repo_baseurl parameter is available:
class { 'elasticsearch':
manage_repo => true,
repo_baseurl => 'https://repo.local/yum'
}
Remote package source
When a repository is not available or preferred you can install the packages from a remote source:
http/https/ftp
class { 'elasticsearch':
package_url => 'https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.4.2.deb',
proxy_url => 'http://proxy.example.com:8080/',
}
Setting proxy_url to a location will enable download using the provided proxy server.
This parameter is also used by elasticsearch::plugin.
Setting the port in the proxy_url is mandatory.
proxy_url defaults to undef (proxy disabled).
puppet://
class { 'elasticsearch':
package_url => 'puppet:///path/to/elasticsearch-1.4.2.deb'
}
Local file
class { 'elasticsearch':
package_url => 'file:/path/to/elasticsearch-1.4.2.deb'
}
Java installation
Most sites will manage Java separately; however, this module can attempt to install Java as well. This is done by using the puppetlabs-java module.
class { 'elasticsearch':
java_install => true
}
Specify a particular Java package/version to be installed:
class { 'elasticsearch':
java_install => true,
java_package => 'packagename'
}
To configure Elasticsearch's memory usage, either change the init defaults for Elasticsearch 1.x/2.x (see the Hash representation example under Defaults File) or modify it globally in 5.x using jvm.options:
class { 'elasticsearch':
jvm_options => [
'-Xms4g',
'-Xmx4g'
]
}
jvm.options can also be controlled per-instance:
elasticsearch::instance { 'es-01':
jvm_options => [
'-Xms4g',
'-Xmx4g'
]
}
Service management
Currently only the basic SysV-style init and Systemd service providers are supported, but other systems could be implemented as necessary (pull requests welcome).
Defaults File
The defaults file (/etc/default/elasticsearch or /etc/sysconfig/elasticsearch) for the Elasticsearch service can be populated as necessary.
This can either be a static file resource or a simple key/value-style hash object, the latter being particularly well-suited to pulling out of a data source such as Hiera.
File source
class { 'elasticsearch':
init_defaults_file => 'puppet:///path/to/defaults'
}
Hash representation
$config_hash = {
'ES_HEAP_SIZE' => '30g',
}
class { 'elasticsearch':
init_defaults => $config_hash
}
Note: The init_defaults hash can be passed to the main class and to the instance.
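A minimal sketch of the per-instance form, using the init_defaults parameter of elasticsearch::instance; the heap size is an illustrative value.

```puppet
# Sketch: init defaults for a single instance (heap size is illustrative).
elasticsearch::instance { 'es-01':
  init_defaults => {
    'ES_HEAP_SIZE' => '4g',
  },
}
```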
Advanced features
X-Pack/Shield
X-Pack and Shield file-based users, roles, and certificates can be managed by this module.
Note: If you are planning to use these features, it is highly recommended you read the following documentation to understand the caveats and extent of the resources available to you.
Getting Started
Although this module can handle several types of Shield/X-Pack resources, you are expected to manage the plugin installation and versions for your deployment. For example, the following manifest will install Elasticsearch with a single instance running X-Pack:
class { 'elasticsearch':
java_install => true,
manage_repo => true,
repo_version => '5.x',
security_plugin => 'x-pack',
}
elasticsearch::instance { 'es-01': }
elasticsearch::plugin { 'x-pack': instances => 'es-01' }
The following manifest will do the same, but with Shield:
class { 'elasticsearch':
java_install => true,
manage_repo => true,
repo_version => '2.x',
security_plugin => 'shield',
}
elasticsearch::instance { 'es-01': }
Elasticsearch::Plugin { instances => ['es-01'], }
elasticsearch::plugin { 'license': }
elasticsearch::plugin { 'shield': }
The following examples assume the preceding resources are part of your Puppet manifest.
Roles
Roles in the file realm (the esusers realm in Shield) can be managed using the elasticsearch::role type.
For example, to create a role called myrole, you could use the following resource in X-Pack:
elasticsearch::role { 'myrole':
privileges => {
'cluster' => [ 'monitor' ],
'indices' => [{
'names' => [ '*' ],
'privileges' => [ 'read' ],
}]
}
}
And in Shield:
elasticsearch::role { 'myrole':
privileges => {
'cluster' => 'monitor',
'indices' => {
'*' => 'read'
}
}
}
This role would grant users access to cluster monitoring and read access to all indices.
See the Shield or X-Pack documentation for your version to determine what privileges to use and how to format them (the Puppet hash representation will simply be translated into YAML).
Note: The Puppet provider for esusers/users has fine-grained control over the roles.yml file and thus will leave the default roles Shield installs in-place.
If you would like to explicitly purge the default roles (leaving only roles managed by Puppet), you can do so by including the following in your manifest:
resources { 'elasticsearch_role':
purge => true,
}
Mappings
Associating mappings with a role for file-based management is done by passing an array of strings to the mappings parameter of the elasticsearch::role type.
For example, to define a role with mappings:
elasticsearch::role { 'logstash':
mappings => [
'cn=group,ou=devteam',
],
privileges => {
'cluster' => 'manage_index_templates',
'indices' => [{
'names' => ['logstash-*'],
'privileges' => [
'write',
'delete',
'create_index',
],
}],
},
}
Note: Observe the brackets around indices in the preceding role definition: it is an array of hashes, per the format in Shield 2.3.x. Follow the documentation to determine the correct formatting for your version of Shield or X-Pack.
If you'd like to keep the mappings file purged of entries not under Puppet's control, you should use the following resources declaration because mappings are a separate low-level type:
resources { 'elasticsearch_role_mapping':
purge => true,
}
Users
Users can be managed using the elasticsearch::user type.
For example, to create a user myuser with membership in myrole:
elasticsearch::user { 'myuser':
password => 'mypassword',
roles => ['myrole'],
}
The password parameter will also accept password hashes generated from the esusers/users utility and ensure the password is kept in-sync with the Shield users file for all Elasticsearch instances.
elasticsearch::user { 'myuser':
password => '$2a$10$IZMnq6DF4DtQ9c4sVovgDubCbdeH62XncmcyD1sZ4WClzFuAdqspy',
roles => ['myrole'],
}
Note: When using the esusers/users provider (the default for plaintext passwords), Puppet has no way to determine whether the given password is in-sync with the password hashed by Shield/X-Pack.
To work around this, the elasticsearch::user resource has been designed to accept refresh events in order to update password values.
This is not ideal, but allows you to instruct the resource to change the password when needed.
For example, to update the aforementioned user's password, you could include the following in your manifest:
notify { 'update password': } ~>
elasticsearch::user { 'myuser':
password => 'mynewpassword',
roles => ['myrole'],
}
Certificates
SSL/TLS can be enabled by providing an elasticsearch::instance type with paths to the certificate and private key files, and a password for the keystore.
elasticsearch::instance { 'es-01':
ssl => true,
ca_certificate => '/path/to/ca.pem',
certificate => '/path/to/cert.pem',
private_key => '/path/to/key.pem',
keystore_password => 'keystorepassword',
}
Note: Setting up a proper CA and certificate infrastructure is outside the scope of this documentation; see the aforementioned Shield or X-Pack guide for more information regarding the generation of these certificate files.
The module will set up a keystore file for the node to use and set the relevant options in elasticsearch.yml to enable TLS/SSL using the certificates and key provided.
System Keys
Shield/X-Pack system keys can be passed to the module, where they will be placed into individual instance configuration directories.
This can be set at the elasticsearch class and inherited across all instances:
class { 'elasticsearch':
system_key => 'puppet:///path/to/key',
}
Or set on a per-instance basis:
elasticsearch::instance { 'es-01':
system_key => '/local/path/to/key',
}
Package version pinning
The module supports pinning the package version to avoid accidental upgrades that are not done by Puppet. To enable this feature:
class { 'elasticsearch':
package_pin => true,
version => '1.5.2',
}
In this example we pin the package version to 1.5.2.
Data directories
There are several different ways of setting data directories for Elasticsearch.
In every case, the required configuration options are placed in the elasticsearch.yml file.
Default
By default we use:
/usr/share/elasticsearch/data/$instance_name
Which provides a data directory per instance.
Single global data directory
class { 'elasticsearch':
datadir => '/var/lib/elasticsearch-data'
}
Creates the following for each instance:
/var/lib/elasticsearch-data/$instance_name
Multiple Global data directories
class { 'elasticsearch':
datadir => [ '/var/lib/es-data1', '/var/lib/es-data2']
}
Creates the following for each instance:
/var/lib/es-data1/$instance_name and /var/lib/es-data2/$instance_name.
Single instance data directory
class { 'elasticsearch': }
elasticsearch::instance { 'es-01':
datadir => '/var/lib/es-data-es01'
}
Creates the following for this instance:
/var/lib/es-data-es01
Multiple instance data directories
class { 'elasticsearch': }
elasticsearch::instance { 'es-01':
datadir => ['/var/lib/es-data1-es01', '/var/lib/es-data2-es01']
}
Creates the following for this instance:
/var/lib/es-data1-es01 and /var/lib/es-data2-es01.
Shared global data directories
In some cases, you may want to share a top-level data directory among multiple instances.
class { 'elasticsearch':
datadir_instance_directories => false,
config => {
'node.max_local_storage_nodes' => 2
}
}
elasticsearch::instance { 'es-01': }
elasticsearch::instance { 'es-02': }
Will result in the following directories created by Elasticsearch at runtime:
/var/lib/elasticsearch/nodes/0
/var/lib/elasticsearch/nodes/1
See the Elasticsearch documentation for additional information regarding this configuration.
Main and instance configurations
The config option in both the main class and the instances can be configured to work together.
The options in the instance config hash will be merged with the ones from the main class and override any duplicates.
Simple merging
class { 'elasticsearch':
config => { 'cluster.name' => 'clustername' }
}
elasticsearch::instance { 'es-01':
config => { 'node.name' => 'nodename' }
}
elasticsearch::instance { 'es-02':
config => { 'node.name' => 'nodename2' }
}
This example merges the cluster.name together with the node.name option.
Overriding
When duplicate options are provided, the option in the instance config overrides the ones from the main class.
class { 'elasticsearch':
config => { 'cluster.name' => 'clustername' }
}
elasticsearch::instance { 'es-01':
config => { 'node.name' => 'nodename', 'cluster.name' => 'otherclustername' }
}
elasticsearch::instance { 'es-02':
config => { 'node.name' => 'nodename2' }
}
This will set the cluster name to otherclustername for the instance es-01 but will keep it as clustername for instance es-02.
Configuration writeup
The config hash can be written in two different ways:
Full hash writeup
Instead of writing the full hash representation:
class { 'elasticsearch':
config => {
'cluster' => {
'name' => 'ClusterName',
'routing' => {
'allocation' => {
'awareness' => {
'attributes' => 'rack'
}
}
}
}
}
}
Short hash writeup
class { 'elasticsearch':
config => {
'cluster' => {
'name' => 'ClusterName',
'routing.allocation.awareness.attributes' => 'rack'
}
}
}
Keystore Settings
Recent versions of Elasticsearch include the elasticsearch-keystore utility to create and manage the elasticsearch.keystore file, which can store sensitive values for certain settings.
The settings and values for this file can be controlled by this module.
Settings follow the behavior of the config parameter for the top-level Elasticsearch class and elasticsearch::instance defined types.
That is, you may define keystore settings globally, and all values will be merged with instance-specific settings for final inclusion in the elasticsearch.keystore file.
Note that each hash key is passed to the elasticsearch-keystore utility in a straightforward manner, so you should specify the hash passed to secrets in flattened form (that is, without full nested hash representation).
For example, to define cloud plugin credentials for all instances:
class { 'elasticsearch':
secrets => {
'cloud.aws.access_key' => 'AKIA....',
'cloud.aws.secret_key' => 'AKIA....',
}
}
Or, to instead control these settings for a single instance:
elasticsearch::instance { 'es-01':
secrets => {
'cloud.aws.access_key' => 'AKIA....',
'cloud.aws.secret_key' => 'AKIA....',
}
}
Purging Secrets
By default, if a secret setting exists on-disk that is not present in the secrets hash, this module will leave it intact.
If you prefer to keep only secrets in the keystore that are specified in the secrets hash, use the purge_secrets boolean parameter either on the elasticsearch class to set it globally or per-instance.
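A minimal sketch of global purging, combining the secrets hash from the earlier example with purge_secrets. The key value is the same elided placeholder used above.

```puppet
# Sketch: keep only the secrets declared here in each instance keystore.
class { 'elasticsearch':
  secrets       => {
    'cloud.aws.access_key' => 'AKIA....',
  },
  purge_secrets => true,
}
```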
Notifying Services
Any changes to keystore secrets will notify running Elasticsearch services by respecting the restart_on_change and restart_config_change parameters.
Reference
Class parameters are available in the auto-generated documentation pages. Autogenerated documentation for types, providers, and ruby helpers is also available on the same documentation site.
Limitations
This module is built upon and tested against the versions of Puppet listed in the metadata.json file (i.e. the listed compatible versions on the Puppet Forge).
The module has been tested on:
- Debian 7/8
- CentOS 6/7
- OracleLinux 6/7
- Ubuntu 14.04, 16.04
- OpenSuSE 42.x
- SLES 12
Other distros that have been reported to work:
- RHEL 6
- Scientific 6
Testing on other platforms has been light and cannot be guaranteed.
Development
Please see the CONTRIBUTING.md file for instructions regarding development environments and testing.
Support
Need help? Join us in #elasticsearch on Freenode IRC or on the discussion forum.