Configure volatile layer settings
You can configure volatile layer settings when you create a layer in a catalog.
Data in volatile layers is not encrypted due to the performance-oriented nature of this storage type.
You can specify the storage size of your layer. When selecting the storage capacity, take into consideration that layer storage will include both the data and metadata for your volatile layer.
The HERE platform enforces the maximum memory policy you selected once the storage is completely used. For more information, see Maximum memory policy.
The valid range of storage capacity values is from 100 MB to 21 GB. You can specify it in increments of 100 MB.
To help you calculate what storage size you need, the average metadata size of 10,000 partitions is around 2 MB. For example, if you have 100,000 partitions, you would include 20 MB of additional storage for metadata. For more information on data limits and cost considerations, see Storage and throughput limits.
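The sizing rule above can be sketched as a small calculation. This is only an illustration: the function name is hypothetical, and it assumes that 21 GB means 21,000 MB, which the text does not confirm.

```python
import math

MB_PER_10K_PARTITIONS = 2   # average metadata size quoted above
INCREMENT_MB = 100          # capacity is set in 100 MB steps
MIN_MB = 100
MAX_MB = 21_000             # assumption: 21 GB taken as 21,000 MB


def estimate_capacity_mb(data_mb: float, partitions: int) -> int:
    """Return the smallest valid capacity in MB that holds data plus metadata."""
    metadata_mb = partitions / 10_000 * MB_PER_10K_PARTITIONS
    total_mb = data_mb + metadata_mb
    # Round up to the next 100 MB step, never below the minimum.
    capacity = max(MIN_MB, math.ceil(total_mb / INCREMENT_MB) * INCREMENT_MB)
    if capacity > MAX_MB:
        raise ValueError(f"{capacity} MB exceeds the {MAX_MB} MB maximum")
    return capacity


# 450 MB of data plus 20 MB of metadata for 100,000 partitions
print(estimate_capacity_mb(450, 100_000))  # 500
```

Treat the result as a lower bound, because the metadata figure is an average.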
You can configure data redundancy for your layer. Two options are available:
single-instance: moderate availability. Only one instance stores your data and metadata. Write and read loads may impact each other's performance.
multi-instance: high availability. Three copies of the data and metadata are kept: one master copy and two redundant copies. This option also provides better performance because read load does not impact the performance of write operations.
multi-instance is highly recommended in a production environment. With single-instance, failure of the layer's underlying infrastructure may result in irrecoverable loss of data.
If you do not configure data redundancy, multi-instance is enabled by default.
If you select single-instance, know that all volatile layer data durability statements documented here are waived. For more information on data durability, see Data security and durability.
Maximum memory policy
When a volatile layer is full, the maximum memory policy determines what happens to new write operations. The following options are available:
FailOnWrite: The write operation fails and an error code is returned to the client.
ReplaceLessRecentlyUsedPartition: The volatile layer keeps track of when each partition was written and read. When no space is available in the layer, the partition that has not been accessed for the longest time will be automatically removed to create space for the new partition. Note that if removing one partition does not create enough space, several partitions may be removed.
If you do not specify a maximum memory policy, FailOnWrite is used by default.
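To make the difference between the two policies concrete, here is a toy in-memory model. It is not how the platform implements volatile layers: the class, its methods, and the use of MemoryError are all hypothetical, and sizes are in arbitrary units.

```python
from collections import OrderedDict


class ToyVolatileLayer:
    """Toy model of a fixed-size layer; partitions are ordered by last access."""

    def __init__(self, capacity: int, policy: str = "FailOnWrite"):
        self.capacity = capacity
        self.policy = policy
        self.partitions = OrderedDict()  # partition id -> size, oldest access first
        self.used = 0

    def read(self, key: str) -> int:
        self.partitions.move_to_end(key)  # a read counts as an access
        return self.partitions[key]

    def write(self, key: str, size: int) -> None:
        if size > self.capacity:
            raise ValueError("partition is larger than the whole layer")
        old_size = self.partitions.get(key, 0)
        if self.policy == "FailOnWrite" and self.used - old_size + size > self.capacity:
            raise MemoryError("layer is full")  # stands in for the error code
        if key in self.partitions:
            self.used -= self.partitions.pop(key)
        # Evict least recently accessed partitions until the new one fits.
        while self.used + size > self.capacity:
            _, evicted_size = self.partitions.popitem(last=False)
            self.used -= evicted_size
        self.partitions[key] = size
        self.used += size


layer = ToyVolatileLayer(capacity=10, policy="ReplaceLessRecentlyUsedPartition")
layer.write("a", 4)
layer.write("b", 4)
layer.read("a")      # "b" is now the least recently accessed partition
layer.write("c", 4)  # evicts "b"
print(list(layer.partitions))  # ['a', 'c']
```

Writing a partition of size 8 next would evict both "a" and "c", which mirrors the note that several partitions may be removed to make room.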
A volatile layer can be configured with a retention time value, also known as Time-To-Live or TTL. The TTL value defines how long data in partitions exists. After the retention time elapses, the data is removed. Specifying a TTL value is especially useful when the validity period of the partition data in the layer is known in advance. For example, if a layer stores traffic incidents that are known to expire within 24 hours, set the TTL to 24 hours. When you create a volatile layer, the default TTL is 60 minutes. You can modify this value during layer creation. The TTL value must be between 1 minute and 10,080 minutes (7 days). If you are using the API directly, this value is expressed in milliseconds (minimum: 60,000 ms; maximum: 604,800,000 ms).
TTL applies to partition data only. Partition metadata is not removed when the TTL time elapses. Metadata must be deleted explicitly using the Publish API.
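Because the API expects the TTL in milliseconds while the limits are stated in minutes, a small conversion helper can prevent unit mistakes. The helper below is a sketch with a hypothetical name. It only checks the range given above and does not call any platform API.

```python
MIN_TTL_MINUTES = 1
MAX_TTL_MINUTES = 10_080  # 7 days
MS_PER_MINUTE = 60_000


def ttl_minutes_to_ms(minutes: int) -> int:
    """Convert a TTL in minutes to milliseconds, enforcing the documented range."""
    if not MIN_TTL_MINUTES <= minutes <= MAX_TTL_MINUTES:
        raise ValueError(
            f"TTL must be between {MIN_TTL_MINUTES} and {MAX_TTL_MINUTES} minutes"
        )
    return minutes * MS_PER_MINUTE


# The 24-hour traffic incident example from the text
print(ttl_minutes_to_ms(24 * 60))  # 86400000
```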
The partitioning scheme determines how partitions in the layer are named. Use HERE Tile partitioning for map data and use generic partitioning for other kinds of data. For more information, see Partitions.
The content type specifies the media type to use to identify the kind of data in the layer.
The content encoding setting determines whether to use compression to reduce the size of data stored in the layer. To enable compression, specify gzip.
Compressing data optimizes storage size, transfer I/O, and read costs. However, compressing data results in extra CPU cycles for the actual compression. Consider both the benefits and costs of compression when deciding whether to enable compression for a layer.
Some formats, especially textual formats such as text, XML, JSON, and GeoJSON, have very good compression rates. Other data formats are already compressed, such as JPEG or PNG images, so compressing them again with gzip will not result in reduced size. Often, it will even increase the size of the payload. For general-purpose binary formats like Protobuf, compression rates depend on the actual content and message size. You are advised to test the compression rate on real data to verify whether compression is beneficial.
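One way to test the compression rate on real data is to gzip a few representative payloads and compare sizes. The sketch below uses Python's standard gzip module. The sample payloads are made up, and random bytes only stand in for data that is already compressed.

```python
import gzip
import os


def gzip_ratio(payload: bytes) -> float:
    """Compressed size divided by original size; below 1.0 means gzip helped."""
    if not payload:
        raise ValueError("payload must not be empty")
    return len(gzip.compress(payload)) / len(payload)


# Hypothetical samples: repetitive JSON-like text and incompressible bytes.
text_sample = b'{"type": "Feature", "properties": {"speed": 50}}\n' * 200
random_sample = os.urandom(10_000)

print(f"text:   {gzip_ratio(text_sample):.2f}")
print(f"random: {gzip_ratio(random_sample):.2f}")
```

A ratio at or above 1.0 on your real partitions suggests leaving compression disabled for the layer.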
Compression should not be used for Parquet. Compression breaks random access to blob data, which is necessary to efficiently read data in Parquet.
If the layer contains SDII data, note that the /layers/<layerID>/sdiimessagelist endpoint does not support compression. So if you enable compression for a layer containing SDII data, you must use the ingest API's generic endpoint (/layers/<layerID>) and all compression and decompression must be handled by your application.
If you are using the Data Client Library to read or write data from a compressed layer, compression and decompression are handled automatically.
If you are using the Data API to read or write data from a compressed layer, you must compress data before writing it to the layer. When reading data, the data you receive is in gzip format and you are responsible for decompressing it.
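When you call the Data API directly, your application performs the gzip step on both sides of the request. The helpers below sketch only that step with the standard library. The function names are hypothetical, and the HTTP calls themselves are left out because their details are not covered here.

```python
import gzip


def encode_for_upload(payload: bytes) -> bytes:
    """Compress a partition payload before writing it to a gzip-enabled layer."""
    return gzip.compress(payload)


def decode_after_download(body: bytes) -> bytes:
    """Decompress a partition body read from a gzip-enabled layer."""
    return gzip.decompress(body)


original = b'{"incident": "road closed"}'
stored = encode_for_upload(original)
print(decode_after_download(stored) == original)  # True
```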
Specifying a schema enables you to share data with others by defining how to consume the data. For more information, see Schemas.
The digest property specifies the algorithm used by the data publisher to generate a hash for each partition in the layer. By specifying a digest algorithm for the layer, you communicate to data consumers the algorithm to use to verify the integrity of the data they retrieve from the layer.
You can specify a digest algorithm when creating or updating a layer. If you specify "undefined", you can specify another digest algorithm after the layer is created. If you specify a digest algorithm, you cannot change it later.
When choosing a digest algorithm, consider the following:
- SHA-256 is recommended for applications where strong data security is required.
- MD5 and SHA-1 are acceptable when the purpose of applying a hash is to verify data integrity during transit.
Including a hash is optional, but if you intend to provide hashes for partitions in this layer you should specify the algorithm you will use.
The HERE platform does not verify that the algorithm you specify here is the one used to generate the actual hashes, so it is up to the data publisher to ensure that the algorithm specified here is the one used in the publishing process.
For more information about common algorithms, see Secure Hash Algorithms.
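As an illustration of how a publisher might generate a hash and how a consumer might verify it, the following sketch uses Python's hashlib. The function names are hypothetical, and the hexadecimal string form is an assumption: the text above does not say how hashes are encoded or which identifiers the API uses for each algorithm.

```python
import hashlib

# Maps the algorithm names used in this section to hashlib names.
_HASHLIB_NAMES = {"MD5": "md5", "SHA-1": "sha1", "SHA-256": "sha256"}


def partition_digest(data: bytes, algorithm: str = "SHA-256") -> str:
    """Return the hash of a partition payload as a lowercase hex string."""
    return hashlib.new(_HASHLIB_NAMES[algorithm], data).hexdigest()


def verify_partition(data: bytes, expected_hex: str, algorithm: str = "SHA-256") -> bool:
    """Check a retrieved payload against the hash supplied by the publisher."""
    return partition_digest(data, algorithm) == expected_hex.lower()


payload = b"partition payload"
published = partition_digest(payload)                 # publisher side
print(verify_partition(payload, published))           # True
print(verify_partition(payload + b"x", published))    # False
```

Both sides must use the same algorithm, which is why declaring it on the layer matters.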
Digest and CRC are two different fields. Digest is used for security to prevent human tampering. CRC is used for safety to prevent bit flips by computer hardware or network transportation. You can use both fields.
The crc property specifies the CRC algorithm used by the data publisher to generate a checksum for each partition in the layer. When you specify a CRC algorithm for the layer, you tell data consumers which algorithm to use so they can verify the integrity of the data they retrieve from the layer.
You can specify a CRC algorithm when creating or updating a layer. If you specify "undefined", you can specify another CRC algorithm after the layer is created. If you specify a CRC algorithm, you cannot change it later.
This CRC has the following properties:
- Padded with zeros to a fixed length of 8 characters
- Stored as a string
For example, if your calculated CRC is the uint32 value of 0x1234a, then the CRC that is actually stored for the partition is the string
Currently only one CRC algorithm is supported:
For more information about common algorithms, see Cyclic redundancy check.
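The padding rule described above can be sketched as follows. Two assumptions are made here that the text does not confirm: the string is lowercase hexadecimal, and zlib.crc32 is used purely as a stand-in for producing a uint32 value, not as the algorithm the layer requires. The function name is hypothetical.

```python
import zlib


def format_crc(value: int) -> str:
    """Render a uint32 checksum as a zero-padded, 8-character hex string.

    Assumption: lowercase hexadecimal is the expected string form.
    """
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("CRC must fit in an unsigned 32-bit integer")
    return format(value, "08x")


print(format_crc(0x1234A))  # 0001234a

# Stand-in checksum only; substitute the CRC algorithm configured for the layer.
print(format_crc(zlib.crc32(b"partition payload")))
```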
Including a checksum is optional, but if you intend to provide checksums for partitions in this layer, you should specify the algorithm you will use.
The HERE Workspace does not verify that the algorithm you specify here is the one used to generate the actual checksums, so it is up to the data publisher to ensure that the algorithm specified here is the one used in the publishing process.
Digest and CRC are two different fields. Digest is used for security reasons to prevent human tampering. CRC is used for safety reasons to prevent bit flips caused by computer hardware or network transportation. You can use both fields.
The geographic area that this layer covers. This setting controls which areas of the world are highlighted in the layer's coverage map in the platform portal.
Specify a list of countries and regions using the two-character ISO 3166-1 alpha-2 code. Optionally, you can add a two-character country subdivision code using the ISO 3166-2 codes for country subdivisions. For example, you can specify 'DE' for Germany, 'BR' for Brazil, or 'CN-HK' for Hong Kong.
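Before submitting a coverage list, you may want to check that each entry at least has the right shape. The sketch below is a format check only: it does not confirm that a code is actually assigned, and the function name is hypothetical. The pattern accepts one to three alphanumeric characters for the subdivision part, as ISO 3166-2 allows, so the platform may be stricter.

```python
import re

# Two uppercase letters, optionally followed by "-" and a subdivision code.
_COVERAGE_CODE = re.compile(r"[A-Z]{2}(-[A-Z0-9]{1,3})?")


def is_wellformed_coverage_code(code: str) -> bool:
    """Return True if the code looks like 'DE' or 'CN-HK' (shape check only)."""
    return _COVERAGE_CODE.fullmatch(code) is not None


for code in ["DE", "BR", "CN-HK", "de", "DEU"]:
    print(code, is_wellformed_coverage_code(code))
```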